What Comet Landers Can Teach Us About Metadata

The recent success of the European Space Agency (ESA) in sending the Rosetta probe into orbit around a comet, and the more bittersweet but successful landing of the Philae lander, have made me nostalgic for the days when I worked with space scientists.

The Mars - Metadata Connection

Back in 2001, the wonderful Professor Colin Pillinger, then head of the Open University’s Planetary and Space Sciences Research Institute (PSSRI), asked me to help with the IT and networking for the build of both the Aseptic Assembly Facility (clean rooms) where the Beagle 2 Mars lander would be built, and the Lander Operations Planning Centre where its operations, experiments and data retrieval would be planned and controlled. Once the building was done, Pillinger asked me to join the Mars Express mission, Beagle 2 lander “instrument” team as the science data archiving manager. No disrespect to anyone else I have ever worked for or with, but I still regard it as the best job ever! Pillinger was also to play a pivotal role on the current Rosetta mission.

All very cool, you big show-off, I can hear you muttering, but what does this have to do with metadata?

To work in the science data archiving end of an ESA mission I had to learn the NASA Planetary Data System (PDS) standards. PDS is a massive archive of data from all planetary exploration missions. The PDS standards include standards for formatting your science data, formatting your documents and images and of course standards for the metadata included with your content, so that other scientists can easily find and work on your data too.

This was my introduction to “serious” use of metadata. Until then I had understood the term from an IT support guy's point of view. This was around 2002-2003, when I started working with the science teams, and I realized that -- of all things -- the "tags" in iPod and iTunes were spreading the idea of metadata to a much wider audience. The concept of “tagging” your MP3 files by genre, album name, artist name and more resonated with the younger members of the science teams when we discussed the metadata fields required on the TIFF files of any images sent back from the surface of Mars.

Unfortunately contact with the Beagle 2 lander was never established after its descent to Mars, so we received no science data to archive. But I was smitten and decided my career path lay in information management. In another role at the Open University I worked on an enterprise content management strategy, which was enabled via the procurement of the EMC Documentum suite of products. This returned me to working with large amounts of complex metadata on learning objects.

There are IEEE standards for Learning Object Metadata that should allow different Learning Management Systems (LMS) to manage and present objects from different authors or institutions. Interoperability was a key driver of the complex metadata required on videos and other interactive learning objects. Key to this was building a workflow in Documentum that meant that no single individual was responsible for filling out all 27 fields on a particular video.

Fast Forward: The Consumerization of Enterprise IT & Cloudfind

While the Rosetta science teams in place today will be working with complex metadata schemas and custom-built repositories, they will most likely feel the effects of the consumerization of IT trend. In the context of content management this has meant the continued and growing popularity of services like Box, Dropbox and Google Drive.

It was in this context that I had a discussion with Robert Curran, the Chief Marketing Officer of UK-based Cloudfind. The four-year-old company has had its product on the market in its original form for almost two years. It was originally offered as an add-on to Salesforce.com’s platform, purchased through its AppExchange, that automatically connects content in third-party storage repositories to the relevant record in Salesforce’s CRM.

The latest release from Cloudfind is the next stage in the product's evolution. It helps manage your content by sitting as a tag management and search layer connected to your Dropbox or Google Drive repository.

2014-26-Nov-Cawthorne-Image1.jpg

Cloudfind connected to my Google Drive, in Folder View

The system aims to simplify life for users familiar with tagging. It is fast, responsive and lightweight because it is not cracking open the content and doing a full text search. This might suit some scenarios better than others. For example, for lawyers working in a matter-centric fashion, where everything is tagged with a matter number, searching on that matter number will bring back all related documents. After scanning your documents, it's a good idea to use the drop-down menu in the top right-hand corner to “manage tags” to see what has been automatically added, and edit or remove as appropriate:

2014-26-Nov-Cawthorne-Image2.jpg

The Manage Tags view shows tags automatically created after scanning my Google Drive

Folder names are automatically used as tags, so for example, in my Cloudfind instance linked to my (not very heavily used) Google Drive, if I start to type in “presentations” I will be shown all files residing inside the presentations folder:

2014-26-Nov-Cawthorne-Image3.jpg

Searching on the topic of presentations shows documents in the presentations folder
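
To make the folder-as-tag idea concrete, here is a minimal sketch in Python. It is not Cloudfind's code or API: the function names, the sample file paths and the prefix-matching rule are all assumptions made up for illustration. It only shows the general technique of deriving tags from folder names and searching on those tags rather than on file contents.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def folder_tags(path):
    """Return the lower-cased folder names in a file path as a set of tags."""
    folders = PurePosixPath(path).parent.parts
    return {part.lower() for part in folders if part != "/"}


def build_tag_index(paths):
    """Map each tag to the set of file paths that carry it."""
    index = defaultdict(set)
    for path in paths:
        for tag in folder_tags(path):
            index[tag].add(path)
    return index


def search(index, typed):
    """Return the files whose tags start with the typed text.

    Only the tag index is consulted; file contents are never opened.
    """
    typed = typed.lower()
    hits = set()
    for tag, files in index.items():
        if tag.startswith(typed):
            hits |= files
    return sorted(hits)


def add_tag(index, path, tag):
    """Attach an extra, manually chosen tag to a file."""
    index[tag.lower()].add(path)


# Hypothetical Google Drive contents, invented for this example.
paths = [
    "Presentations/metadata-talk.pptx",
    "Presentations/2014/rosetta.pptx",
    "Invoices/nov.pdf",
    "readme.txt",
]
index = build_tag_index(paths)
matches = search(index, "pres")  # files under the Presentations folder
```

Because the lookup touches only a small index of tags, it stays quick however large the files are. The trade-off is that a file is found only if someone, or some folder name, has tagged it.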

Having found a couple of presentations, I may now decide that one of the documents should also be tagged with a different topic, so I can use the “pencil” edit icon to both create a new topic and add its tag to that document in a single action:

2014-26-Nov-Cawthorne-Image4.jpg

Creating a new topic, and adding it to a document in a single step

Simple Metadata Management for the Masses on Cloud Storage?

This product is an interesting add-on to cloud storage, providing a new level of metadata management for products that don’t have it built in. It’s certainly interesting to me, having worked with massive and complex metadata schemas, with many, many fields that needed to be filled in by multiple operators for a single asset. Let's face it -- even “rocket scientists” don’t want to have to become metadata specialists, not even to satisfy the PDS standards. The easier we can make metadata management, the better, and extending this concept to cloud storage is another step in making people’s work lives a little simpler.

I have to close out by offering everyone involved with the Rosetta mission and the Philae lander, and those at ESA and the OU in particular, my congratulations on a job very well done. If only Professor Pillinger were still here to see it. He would be so proud of his colleagues -- I know I am!

Title image by ESA ©2009 MPS for OSIRIS Team MPS/UPD/LAM/IAA/RSSD/INTA/UPM/DASP/IDA