Image: A piece of microfiche from the Department of Justice, circa 1990. Photo: Mr.TinDC

A renaissance is going on in information capture. This became clear as I completed recent research for the AIIM report, "State of the Intelligent Information Management Industry 2020." Why? Simply stated, it’s because a revolution is underway in both the volume and variety of information coming into organizations.

Our Growing Information Problem

On average, organizations expect the volume of information to grow from X to 4.5X over the next 2 to 3 years. That by itself would be complicated enough. But the other side of the problem is that 57% of that information will be of the pesky unstructured and semistructured variety — in other words, content.


According to IDC’s report “The Digitization of the World From Edge to Core,” digital content is being created in three primary locations: “the core (traditional and cloud data centers), the edge (enterprise-hardened infrastructure like cell towers and branch offices), and the endpoints (PCs, smart phones and IoT devices).” IDC predicts that this “Global Datasphere” will grow from 33 zettabytes (ZB) in 2018 to 175 ZB by 2025 and that endpoints and edge will play an increasingly important role in this growth.
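To put the growth figures above on a common footing, here is a minimal sketch that converts each one into an implied yearly growth rate. It assumes smooth year-over-year compounding, which neither AIIM nor IDC states, and the function name `annual_growth_rate` is made up for illustration.

```python
def annual_growth_rate(start: float, end: float, years: float) -> float:
    """Compound annual growth rate implied by growing from start to end.

    Assumes growth compounds evenly each year (an illustrative assumption,
    not something either report claims).
    """
    return (end / start) ** (1.0 / years) - 1.0


# IDC forecast: 33 ZB in 2018 to 175 ZB in 2025.
idc_rate = annual_growth_rate(33, 175, 2025 - 2018)

# AIIM survey: X to 4.5X over 2 to 3 years. X cancels out, so use 1.
aiim_fast = annual_growth_rate(1, 4.5, 2)  # growth arrives in 2 years
aiim_slow = annual_growth_rate(1, 4.5, 3)  # growth arrives in 3 years

print(f"IDC Global Datasphere: {idc_rate:.0%} per year")
print(f"AIIM survey, 3-year horizon: {aiim_slow:.0%} per year")
print(f"AIIM survey, 2-year horizon: {aiim_fast:.0%} per year")
```

Under that assumption, the IDC forecast works out to roughly 27% growth per year, while the AIIM survey expectation implies somewhere between about 65% and 112% per year, depending on whether the 4.5X arrives in three years or two.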


Tackling Information Chaos at Its Origin

This rising tide of information chaos has a profound impact on digital transformation efforts. Until organizations develop a strategy for managing, securing and protecting their information assets across the entire lifecycle of those assets, even the most well-intentioned of transformation efforts will prove frustrating. And that means thinking differently about how information comes into the organization in the first place and attacking the problem at its origin — i.e., information capture.

If it seems like we have been talking about “capture” forever ... well, we have. And that’s part of the problem. The old way of thinking about capture — essentially scanning a piece of paper and storing the resultant image somewhere — is no longer adequate to deal with the information tsunami facing organizations.

Leading organizations are rethinking the capture equation. And while traditional “document capture” clearly is still part of this equation, the real focus needs to shift to “intelligent capture.” 


From Document Capture to Intelligent Capture

What does “intelligent capture” look like? Here are five things to address as you think about modernizing your information management strategy.

Capture at the point of origination — This has been an objective for many organizations for a long time, but it is only now becoming a reality. IDC’s observations about endpoint and edge information sprawl make this critical.

Automated classification and categorization — The time has passed when organizations could afford the luxury of manually identifying and categorizing incoming information. Organizations must embrace AI and machine learning tools to remove the friction from the process of classifying incoming information and assigning relevant metadata.

Automated data extraction — The ability to use machine learning to train systems to identify and extract key metadata and process information from semi-structured and freeform documents is critical to automating the capture process.

An explosion of business input — My friend Harvey Spencer uses the term Capture 2.0 to encompass the broad range of business inputs organizations now face and the technologies needed to manage them. Per Harvey, “Capture 2.0 services enable organizations to interpret and understand incoming multichannel data and thus transforming them into information. Multiple channel inputs can be paper or electronic. Input media can be on the form of images (document, photographic) voice, video, text messages (SMS/chat/social media). Capture 2.0 technologies services transform data contained in these media into information. Capture 2.0 technologies include: OCR/ICR, image recognition, object recognition, voice recognition, NLP, semantic understanding, sentiment analysis, and more.”

Automate all those rote and manual micro-processes — Robotic Process Automation (RPA) is a key bridge technology in extending the life and functionality of legacy BPM and ECM systems. It also extends process automation functionality to a much larger percentage of knowledge workers than is traditionally possible.

In the AIIM research, I compared how digital transformation leaders and digital transformation followers differ in their adoption rates for a series of key capture-related technologies:

  • Multi-channel Capture: Leader adoption rate is 20 percentage points higher than that of Followers.
  • Robotic Process Automation: Leader adoption rate is 22 percentage points higher than that of Followers.
  • Data Recognition, Extraction and Standardization: Leader adoption rate is 24 percentage points higher than that of Followers.
  • Analytics, Machine Learning and AI: Leader adoption rate is 24 percentage points higher than that of Followers.

It’s time to revisit your capture investments and commit to the next generation of intelligent capture technologies.