We have previously reported on the ways companies can begin to access information stored across multiple data storage platforms. In a continued effort to simplify the process, Index Engines has released Collection Engine for Data Domain, which allows organizations to better understand what is on their backup disks.

Indexing + Collection

Not only does the Collection Engine allow you to better understand your archived content, it also lets you collect what is useful so that it can be made available for compliance and litigation purposes. This saves time and money, reduces the volume of archived data, and cuts down the data that needs to be archived to tape for long-term retention.

Collection Engine for Data Domain makes it possible for companies to index backups as they happen. As a result, users can create policies whose rules are applied to each backup.

How Does it Work?

The Collection Engine integrates with the existing backup process and automatically extracts specific files and email into a repository used by the legal and compliance teams who need access to these business records. The process includes three key tasks: identification, collection and management. Here’s how it works:


As new backups are created, Index Engines will crawl the backup images via NFS or CIFS to perform full content and metadata indexing of the unstructured content within the backup image.


Within the Collection Engine, queries can be defined that represent the corporate data retention policies. A policy is based on file and email metadata (dates, users, locations, etc.) as well as content. Policies are defined and stored in the Collection Engine using its comprehensive search parameters. These searches can use high-level metadata such as user mailboxes, or be detailed queries based on file or email content, location and date ranges. Searches are saved as stored queries that run automatically each time a new backup is executed.


Extracted data is placed in the Index Engines Collection and written back as an image on the Data Domain storage system. This image represents the unstructured data that is required for long-term storage. The collection process runs as new backup images are generated, ensuring the Collection is always current.
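
The stored-query policies described above can be sketched in a few lines of Python. This is an illustration only, not Index Engines' actual API: the `IndexedItem`, `StoredQuery` and `collect` names, and all of their fields, are hypothetical and were chosen simply to show how a policy might combine metadata criteria (user, location, date range) with a content keyword.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class IndexedItem:
    """Hypothetical index record for one file or email in a backup image."""
    path: str
    owner: str
    modified: date
    text: str


@dataclass
class StoredQuery:
    """Hypothetical saved policy: every criterion that is set must match."""
    name: str
    owners: Optional[set] = None
    path_prefix: Optional[str] = None
    modified_from: Optional[date] = None
    modified_to: Optional[date] = None
    keyword: Optional[str] = None

    def matches(self, item: IndexedItem) -> bool:
        if self.owners is not None and item.owner not in self.owners:
            return False
        if self.path_prefix is not None and not item.path.startswith(self.path_prefix):
            return False
        if self.modified_from is not None and item.modified < self.modified_from:
            return False
        if self.modified_to is not None and item.modified > self.modified_to:
            return False
        if self.keyword is not None and self.keyword.lower() not in item.text.lower():
            return False
        return True


def collect(items, queries):
    """Return the items matched by at least one stored query."""
    return [item for item in items if any(q.matches(item) for q in queries)]
```

In this sketch, running `collect` against the index of each new backup image plays the role of the stored queries that run automatically after a backup; how the real product schedules and evaluates its queries is not described in the announcement.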

Most importantly, the Collection can be queried as needed through a browser-based interface. Users of the Collection can find and extract specific files and emails, and all extracted content is retrievable in its native format. In addition, expiration policies can be defined so that data in the Collection is expired on schedule, ensuring that only the required content is saved.
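
As a rough illustration of policy-based expiration, the sketch below splits a collection into kept and expired entries using a single age-based rule. The announcement does not say which expiration rules the product supports, so the `expire` function, the `(name, collected_on)` tuple layout and the retention-in-days parameter are all assumptions made for the example.

```python
from datetime import date, timedelta


def expire(collection, retention_days, today):
    """Split a collection into (kept, expired) by age.

    `collection` is a list of (name, collected_on) tuples; both the layout
    and the single age-based rule are assumptions made for illustration.
    """
    cutoff = today - timedelta(days=retention_days)
    kept = [entry for entry in collection if entry[1] >= cutoff]
    expired = [entry for entry in collection if entry[1] < cutoff]
    return kept, expired
```

Passing `today` explicitly rather than reading the system clock keeps the function easy to test and makes the outcome of a given policy run reproducible.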

Archiving Relevant Data

The Collection Engine from Index Engines allows organizations to extend the backup process to support the archiving of relevant records for legal and compliance purposes. By leveraging backups and turning their content into relevant business records, organizations of any size can govern information according to corporate policy.

Index Engines demonstrated the Collection Engine for EMC Data Domain at EMC World last week, and it will be available for shipment on July 1.