Recently, we took a look at the importance of developing a metadata strategy for SharePoint, on the assumption that an enterprise has already decided how and what content will be taken into its environment. But what about the step before that? How are you going to get that content in the first place, and how will you manage the storage of business-critical information?

It seems like an obvious question, and something that should be settled before even deciding that SharePoint is the way your enterprise should be going. But according to Sean Baird, senior manager for product marketing with EMC, once you have decided on SharePoint, the process is only half complete, as the environment still needs to be populated.

Again, this appears to be an obvious statement, but in a recent paper entitled Getting More In And Out Of SharePoint, he points out that enterprises have not always considered which capture tool to use to get information into SharePoint, or what to use to move data to a place where it can be stored.

Data Deluge

The justification for deploying a capture technology with SharePoint, or another enterprise CMS, should be clear by now. The volume of data being used by enterprises has rocketed -- the amount of digital content created last year was an estimated 1.2 zettabytes, and is expected to grow to 35 zettabytes by 2020.
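To put those two figures in perspective, the short calculation below works out the growth they imply. It is only a rough sketch: the ten-year span is an assumption (the article says "last year" without naming it), and the variable names are illustrative.

```python
# Implied growth between the two figures quoted above.
# Assumption: "last year" is ten years before 2020; the article does not state the year.
start_zb = 1.2    # estimated zettabytes created last year
end_zb = 35.0     # projected zettabytes by 2020
years = 10        # assumed span between the two figures

growth_factor = end_zb / start_zb
annual_rate = growth_factor ** (1 / years) - 1  # compound annual growth rate

print(f"Overall growth: about {growth_factor:.0f}x")
print(f"Implied annual growth: about {annual_rate:.0%}")
```

Under that assumption, the figures imply roughly a 29-fold increase, or about 40% compound growth per year.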

Leaving aside the content that now resides in places such as YouTube, Facebook, Flickr and other social media sites, a huge proportion of enterprise content is still paper-based. While a good deal of content is in Office documents that can be ingested electronically into SharePoint, a good deal more exists only on paper.

And here lies one of the problems: While SharePoint was originally an enterprise collaboration tool, it has now, for many organizations, become their principal enterprise CMS and data store, so paper-based documents need to be fed into it as well.

Transforming Paper

There are, as we know, quite a number of scanners on the market that are more than capable of digitizing documents in such a way that they can be managed in SharePoint. However, there are not so many that can digitize documents in high-volume batches while also offering intelligent document classification.

With intelligent capture, Baird says, companies can ingest large numbers of documents easily and quickly, not just from a single point, but also from distributed points, which can mean locally distributed, or geographically distributed, on a large scale.

By using intelligent capture, organizations can take whatever paper-based information is needed, digitize it, send it into SharePoint and route that information across many different departments and into many different workflows and processes.

What is Intelligence?

We’ve looked at the capture part of the equation, so what is the intelligent part? While we talk about sending the information into SharePoint, it is not quite as simple as that.

Baird points out that, for the efficient ingestion of information, the capture software needs to be able to distinguish one document from another, read the data and take it in with the minimum of human intervention.

The kind of reading he is talking about applies to just about any kind of data you can think of, including handwriting, machine-printed text and barcodes, and it happens at a rate that manual processing could never achieve. Incoming documents that once took days or weeks to process are now delivered into SharePoint in a matter of minutes.

Managing SharePoint Information

So you have your information placed in SharePoint. But so does everyone else in the enterprise, along with all their videos, images, and structured and unstructured data.

The problem is that, with all this data now placed in SharePoint, performance may be affected, leading to longer search and retrieval times as well as long backup windows, and ultimately defeating one of the purposes of having such a system in the first place.

There is also generally other data that has been lingering in the system, as enterprise managers are nervous of dumping information that may be needed in the future for business processes, or even compliance issues.

The solution is to reduce the load on the SharePoint servers by moving out what can be moved, while, at the same time, giving users access to that information whenever they need it.

Clearly, though, Microsoft anticipated this with SharePoint 2010, which provides APIs that enable third-party vendors to develop solutions that can redirect data into more cost-effective locations.

Old information can clearly be relocated, and, according to Baird, industry averages put the number of SharePoint sites that are inactive or redundant at between 25% and 30% -- a considerable drain on storage that could be better used elsewhere.

Archiving for SharePoint

To deal with this, a suitable archiving or content preservation application should be considered. However, before deploying such a system, enterprises should ensure that the new solution can:

  • Improve operational efficiencies by moving inactive content to a more cost-effective tier of storage
  • Ensure existing retention and disposition policies are maintained in the archive
  • Provide access to archived content through SharePoint
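As a rough illustration of the three requirements above, the sketch below tags inactive sites for a cheaper storage tier while leaving their addresses and retention dates untouched. Everything in it is hypothetical: the SiteRecord type, the tier names, the sample sites and the one-year inactivity threshold are assumptions made for the example, not part of SharePoint or of any vendor's product.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class SiteRecord:
    # Hypothetical inventory record; not a SharePoint API type.
    url: str
    last_modified: date
    retention_until: Optional[date] = None  # end of existing retention policy, if any
    tier: str = "primary"


def archive_inactive(sites: List[SiteRecord], today: date,
                     inactive_days: int = 365) -> List[SiteRecord]:
    """Mark sites untouched for `inactive_days` as belonging to the archive tier.

    Only the storage tier changes. The url and retention_until fields are left
    as they are, so users can still reach the content at the same address and
    the existing retention policy carries over.
    """
    cutoff = today - timedelta(days=inactive_days)
    moved = []
    for site in sites:
        if site.tier == "primary" and site.last_modified < cutoff:
            site.tier = "archive"
            moved.append(site)
    return moved


# Made-up sample inventory, for illustration only.
sites = [
    SiteRecord("/sites/hr", date(2010, 11, 2)),
    SiteRecord("/sites/project-2007", date(2008, 1, 15),
               retention_until=date(2015, 1, 15)),
    SiteRecord("/sites/finance", date(2010, 12, 20)),
    SiteRecord("/sites/old-team", date(2009, 3, 1)),
]

moved = archive_inactive(sites, today=date(2011, 1, 31))
share = len(moved) / len(sites)
print(f"Archived {len(moved)} of {len(sites)} sites ({share:.0%})")
```

In a real deployment the inventory would come from the SharePoint farm itself and the move would be carried out by the archiving product; the point here is only that tiering, retention and access can be treated as separate concerns.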

Investing Up Front

While the argument for providing front-end capture as well as additional information storage is a compelling one, for CFOs it may not be clear at the outset, when costing for a SharePoint deployment is considered.

The importance of effective information capture that can batch load content, apply metadata and place it in an appropriate location cannot be overstated.

SharePoint does not come with document capture, nor does it come with storage policies; these are both things that enterprises have to provide for themselves.

However, to get the best out of your deployment, it would probably be best to budget for these elements from the word go. After all, if your enterprise is going to go as far as providing SharePoint, it should also be ready to provide the necessary tools to make that deployment work to the best of its capabilities.