Another attempt to track all that scary unstructured content lurking in the corners of every self-respecting enterprise content management system is the recently released, customizable Zoogma search engine.

If it actually does what it says it does, Zoogma targets buried market and document intelligence and integrates with other data sources to collect, analyze and produce the most relevant data in response to search queries.

It does this by finding clues in unstructured text and making those clues findable, its developer, Virginia-based Cormine Intelligent Data, says.

Unstructured Searching

If you’ve been following us over the past couple of months, you will be aware of the huge number of companies that have no idea what unstructured data is sitting on their systems, or that have no in-house policies at all on the storage of electronic data (26% of companies, according to one survey).

As anyone who has ever tried it will know, playing catch-up is very difficult, especially in industries that depend heavily on electronic (or even paper) records, such as legal, health or even marketing. And that’s without even thinking about the twin gremlins of compliance and eDiscovery.

It would be unfair to Zoogma to describe it as a solution to these particular issues alone, but it will certainly help. As it stands, however, many search engines are too linear to deal with combined horizontal and vertical searches.

NLP Searches

Using Natural Language Processing (NLP), Zoogma collects information from web scrapers, databases and other repositories, stores that information, analyzes it and delivers it through a web services interface.

NLP is a field of computer science and linguistics concerned with the interactions between computers and human languages. Natural language generation systems convert information from computer databases into readable human language.

“While keywords help you find what you know, Zoogma is specifically geared towards finding what you don’t know,” said Alex Emmermann, General Manager of Cormine Intelligent Data.

Data Analyzers

So how does it work? Zoogma’s analysis engine is built on three key analyzers: a categorizer, an entity miner and a near-duplicate detector. Working alongside a number of user-defined analyzers, it gives users accurate answers to broad questions.

The analyzers consist of different pieces of software that scan through data looking for very specific pieces of information.

They include:

  • Automated categorization, by which large collections of documents are tagged according to user-defined criteria. New documents entering a system are automatically tagged for content type.
  • Smart duplicate and similarity detection, which enables Zoogma to identify subject streams across documents as well as duplicated documents.
  • Named entity search for specific people, places and organizations, as well as for documents based on those entities. It also comes with a method for discovering and assigning “aliases” for entities that can be expressed in multiple ways, e.g. Dr. John Smith, Dr. J. Smith, Doctor Smith.
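
To make the three analyzers a little more concrete, here is a minimal Python sketch of how each idea could work in principle. It is not Zoogma's implementation, whose internals are not described here. The keyword table, the alias table, every function name and the 0.8 threshold are assumptions made purely for illustration. Categorization is reduced to keyword rules, near-duplicate detection compares word shingles using Jaccard similarity, and entity search is a plain alias lookup.

```python
import re

# Hypothetical user-defined criteria: category name -> trigger keywords.
CATEGORY_KEYWORDS = {
    "legal": {"contract", "litigation", "clause"},
    "health": {"patient", "diagnosis", "clinic"},
}

# Hypothetical alias table: each lowercase surface form maps to one
# canonical entity name.
ALIASES = {
    "dr. john smith": "Dr. John Smith",
    "dr. j. smith": "Dr. John Smith",
    "doctor smith": "Dr. John Smith",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def categorize(text):
    """Return the sorted categories whose keywords appear in the text."""
    tokens = set(tokenize(text))
    return sorted(
        name for name, keywords in CATEGORY_KEYWORDS.items() if tokens & keywords
    )


def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in the text."""
    tokens = tokenize(text)
    if len(tokens) < size:
        # Short texts become a single shingle; empty texts have none.
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + size]) for i in range(len(tokens) - size + 1)}


def similarity(a, b, size=3):
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty documents are treated as identical
    return len(sa & sb) / len(sa | sb)


def is_near_duplicate(a, b, threshold=0.8):
    """Flag two texts as near duplicates above an assumed threshold."""
    return similarity(a, b) >= threshold


def find_entities(text):
    """Return canonical names for every known alias found in the text."""
    lowered = text.lower()
    return sorted(
        {canonical for alias, canonical in ALIASES.items() if alias in lowered}
    )
```

With these tables, a document mentioning both "Doctor Smith" and "Dr. J. Smith" resolves to the single entity Dr. John Smith, and two texts that differ only in case or punctuation are flagged as near duplicates. A production system would presumably rely on trained classifiers and entity recognizers rather than hand-written tables.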

And if you’re concerned about integration with existing enterprise systems, Zoogma can plug in to many enterprise content management systems, although which ones has not been specified.

As it is a recent release, there is little feedback as yet to indicate whether Zoogma works or not. However, if it can control and search unstructured data, it might well be worth a trial run at least.