SharePoint MOSS Web Content Management and CMIS Integration

Microsoft has played a major role in the development of the Content Management Interoperability Services (CMIS) specification that is currently in the hands of the OASIS working group. Committed as the team has been, its members (including Microsoft) have not provided much information or examples of how CMIS will work with Microsoft's SharePoint Server 2007 (MOSS). Until now.

A new MSDN article has arrived that demonstrates how one can integrate an external document repository with MOSS (or the free version of SharePoint, WSS) -- which is exactly what the CMIS spec is all about. Our interest piqued, we took a look and in the following article, we share what we found.

Microsoft definitely likes to be in the thick of things. With the most popular business collaboration tool around, it's no surprise they are deep in the process of developing a content interoperability specification.

With so many organizations having some kind of SharePoint implementation, business people and developers alike have been anxiously waiting for some kind of word on how SharePoint could implement CMIS.

Microsoft has now provided an example. It doesn't cover the full CMIS specification, but there's enough to give you a good idea of how it could work.

But before we get into the nuts and bolts of the example, let's take a step back and review some background.

A Quick Overview of MOSS and WSS

Now we are sure you know about SharePoint. SharePoint, or Microsoft Office SharePoint Server 2007 (MOSS), is Microsoft's version of an Enterprise Content Management platform. It offers document management, web content management, collaboration, search and business intelligence in one neat little package, designed for the small to medium-sized business.

MOSS is built upon Windows SharePoint Services (WSS). This is the core framework that comes free with any Microsoft Windows Server. WSS provides document management, collaboration, a centralized shared document repository, the core service of SharePoint Lists and a few other features.

SharePoint is currently the most deployed content management solution in the enterprise today. With well over US$ 1 billion in licenses sold, you know that any content management standard needs to work with this solution.

To learn more about MOSS, read our SharePoint 2007 Review - Six Pillars of MOSS.

What is CMIS? (briefly)

For those of you living in a dark room with no connection to the real world (internet world that is), CMIS stands for Content Management Interoperability Services. It is a proposed new content management standard aimed particularly at document management and all the good stuff that goes with it (like metadata).

The standard was brought to life by Microsoft, IBM, EMC, Oracle, Alfresco and Open Text. The purpose of the standard? To provide a simple web services interface that would enable developers to write applications that can talk to more than one content repository without having to know the specific details of each repository.

The specification uses a least common denominator approach, which means that only a basic set of operations for talking to a content repository is included. It defines bindings using two protocols: SOAP and REST/Atom.
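
To make the least common denominator idea concrete, here is a minimal sketch of client code written against a small shared contract rather than a specific vendor's API. The `Repository` interface, its three operations and the in-memory stand-in are our own illustrative assumptions, not the actual CMIS service definitions.

```python
from abc import ABC, abstractmethod


class Repository(ABC):
    """Hypothetical minimal contract; not the real CMIS service definitions."""

    @abstractmethod
    def get_document(self, doc_id: str) -> bytes: ...

    @abstractmethod
    def check_out(self, doc_id: str, user: str) -> None: ...

    @abstractmethod
    def check_in(self, doc_id: str, user: str, content: bytes) -> None: ...


class InMemoryRepository(Repository):
    """Stand-in for any vendor's repository behind the shared contract."""

    def __init__(self, docs: dict):
        self._docs = dict(docs)
        self._locks = {}

    def get_document(self, doc_id):
        return self._docs[doc_id]

    def check_out(self, doc_id, user):
        if doc_id in self._locks:
            raise PermissionError(f"{doc_id} is already checked out")
        self._locks[doc_id] = user

    def check_in(self, doc_id, user, content):
        if self._locks.get(doc_id) != user:
            raise PermissionError(f"{doc_id} is not checked out by {user}")
        self._docs[doc_id] = content
        del self._locks[doc_id]


def append_footer(repo: Repository, doc_id: str, user: str) -> None:
    """Client code that depends only on the shared contract."""
    repo.check_out(doc_id, user)
    repo.check_in(doc_id, user, repo.get_document(doc_id) + b"\n-- reviewed")
```

Because `append_footer` only knows about `Repository`, the same function would work unchanged against any other class that implements those three operations.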

Work on the CMIS standard is currently in the hands of an OASIS working group. The first official face-to-face meeting of the OASIS technical committee occurred at the end of January and things sounded pretty positive. A public review is expected sometime in late spring.

For more detailed info on CMIS, see the official coverage on the OASIS CMIS website.

A Demo of CMIS and SharePoint

The scenario described in the MSDN article outlines having a document library in SharePoint that integrates the contents of an external document library.

Components in the Example

There are several components involved in this scenario:

Figure: SharePoint Talking to a CMIS Repository -- Solution Architecture

  • External Repository:
    This is the external repository that actually stores the documents. It could be another content management system, but in this example it is set up as a File Directory structure that emulates a content management system.
  • Windows Communication Foundation (WCF) Web Services:
    The WCF Web Services are used to access the documents stored in the repository. Included is a set of basic operations that can be performed on the documents. These operations fall within the CMIS specification; examples include check in, check out, versioning and delete.
  • ASMX Web Services:
    The ASMX Web Services are internal to SharePoint and provide two capabilities. The first is to talk to the single sign-on services (SSO) within SharePoint. It's the SSO services that map the credentials of the SharePoint user (Windows credentials) to the credentials of the external repository. The second is related to the use of Silverlight in this example.

    The Silverlight UI talks to the ASMX Web Services, which then talk to the WCF Web Services. This is done to prevent Silverlight from making cross-domain calls.
  • Enterprise Search:
    Search is a component of the CMIS specification. A custom protocol handler enables SharePoint to index the external repository and supports searching for documents in that repository through the Search Center.
  • Custom Document Library:
    In this example, a custom document library has been created that points to the external library. For this example, it supports a single Silverlight web part for viewing and interacting with documents. Attached to this document library are a task assignment process and an approval workflow process.
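
The layering in the list above (Silverlight UI to ASMX to WCF) can be sketched roughly as follows. This is a toy model under our own assumptions: the class names, the `origin` strings and the simplified origin check are hypothetical and only illustrate why the UI goes through a same-domain service instead of calling the repository services directly.

```python
from dataclasses import dataclass


@dataclass
class RepositoryService:
    """Stands in for the WCF services hosted on the repository's own domain."""
    origin: str
    documents: dict

    def list_documents(self) -> list:
        return sorted(self.documents)


@dataclass
class SharePointFacade:
    """Stands in for the ASMX services, hosted on the SharePoint domain."""
    origin: str
    backend: RepositoryService

    def list_documents(self) -> list:
        # Server-to-server call: the browser's cross-domain rule does not apply.
        return self.backend.list_documents()


@dataclass
class BrowserClient:
    """Stands in for the Silverlight UI, loaded from the SharePoint domain."""
    origin: str

    def call(self, service, method: str):
        # Simplified stand-in for the plug-in's cross-domain restriction.
        if service.origin != self.origin:
            raise PermissionError(f"cross-domain call to {service.origin} blocked")
        return getattr(service, method)()
```

In this model, a `BrowserClient` on the SharePoint origin can call the facade, and the facade forwards the request; a direct call from the client to `RepositoryService` is refused.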

CMIS Implementation Overview

This example is very straightforward and easy to understand, which may be due in large part to the simplicity of the CMIS Specification.

Here's a high level overview of how it works:

Creating the Library and Connecting to the External Repository

First you create a new Document Library in SharePoint of the type External Library. This is a custom library that will ask you for the connection information of the external repository you want to connect to. So documents look like they are in SharePoint, but they are really in the external repository.

Figure: SharePoint -- Creating an External Data Library

For this example, the external repository happens to be a File Directory structure that includes all the information necessary to talk to the repository including user access permissions and content types.

Content types are stored in an XML file and metadata is defined for each content type. Metadata and user access privileges for a specific document are stored in an XML file that has the same name as the document.
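
The article does not reproduce the XML files here, so the layout below is purely hypothetical: the element and attribute names are our own invention. With that caveat, this sketch shows how a content types file and a per-document properties file of this kind could be read.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout of the shared content types file.
CONTENT_TYPES_XML = """
<contentTypes>
  <contentType name="Contract">
    <field name="Customer" type="string"/>
    <field name="ExpiryDate" type="date"/>
  </contentType>
</contentTypes>
"""

# Hypothetical layout of the properties file stored next to one document.
DOCUMENT_PROPERTIES_XML = """
<document contentType="Contract">
  <metadata>
    <value field="Customer">Contoso</value>
    <value field="ExpiryDate">2010-01-31</value>
  </metadata>
  <permissions>
    <user name="repo_alice" rights="read write"/>
    <user name="repo_bob" rights="read"/>
  </permissions>
</document>
"""


def load_content_types(xml_text: str) -> dict:
    """Return {content type name: {field name: field type}}."""
    root = ET.fromstring(xml_text)
    return {
        ct.get("name"): {f.get("name"): f.get("type") for f in ct.findall("field")}
        for ct in root.findall("contentType")
    }


def load_document_properties(xml_text: str) -> dict:
    """Return the content type, metadata values and per-user rights."""
    root = ET.fromstring(xml_text)
    return {
        "content_type": root.get("contentType"),
        "metadata": {v.get("field"): v.text for v in root.findall("metadata/value")},
        "permissions": {
            u.get("name"): set(u.get("rights").split())
            for u in root.findall("permissions/user")
        },
    }
```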

Accessing A Document

To retrieve a document from the library, your Windows credentials are passed to the SharePoint single sign-on (SSO) feature, which maps them to the credentials in the repository. Once authenticated, your credentials are passed to the WCF web services to use when an action is attempted on a document. Remember, your permissions for a document are located in a properties XML file for that document.
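
As a rough illustration of that flow, the sketch below maps a Windows account to a repository account and then checks the requested action against the rights recorded for the document. The account names, the mapping table and the rights table are all made up for the example; the real SSO service and properties files are not shown in this overview.

```python
# Hypothetical SSO table: Windows account -> external repository account.
SSO_MAPPINGS = {
    r"CORP\alice": "repo_alice",
    r"CORP\bob": "repo_bob",
}

# Hypothetical per-document rights, as they might be read from each
# document's properties XML file.
DOCUMENT_RIGHTS = {
    "plan.doc": {"repo_alice": {"read", "write"}, "repo_bob": {"read"}},
}


def map_credentials(windows_user: str) -> str:
    """Translate a Windows account into the repository account (SSO step)."""
    try:
        return SSO_MAPPINGS[windows_user]
    except KeyError:
        raise PermissionError(f"no SSO mapping for {windows_user}") from None


def is_allowed(windows_user: str, document: str, action: str) -> bool:
    """Check an action against the rights recorded for the document."""
    repo_user = map_credentials(windows_user)
    rights = DOCUMENT_RIGHTS.get(document, {})
    return action in rights.get(repo_user, set())
```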

Defining Additional Metadata for A Document

This example does something interesting. Remember that each document has a content type associated with it and a set of metadata. What if there was additional information you wanted to track on that document, but you didn't want to go through the effort of updating that external repository (you may well not have the ability to do that)?

You can add additional metadata to a content type directly in SharePoint and have SharePoint store that extra information in its own content repository. This additional information is created as a set of columns.

Figure: SharePoint -- Integrated Document Metadata

When a document is uploaded, the ASMX web services integrate the metadata defined for the content type from the external repository with the additional columns (metadata) defined directly in the SharePoint document library. The user sees one interface to add, view or modify the metadata, but underneath the surface, the information is stored in separate locations.

We know how it's stored in the external repository (within an XML properties file). But how is it stored within SharePoint?

When you upload or view a document that has additional metadata defined, a stub file is created within SharePoint behind the scenes and the extra metadata is attached to it.

Why don't you see these stub documents? Because the only interface that you are allowed to use is the Silverlight interface, and it only shows documents from the external repository.
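
The split storage described above can be sketched as follows: one merged view for the user, values routed back to the store that owns each field, and a hidden stub holding the SharePoint-only columns. The function names and the `StubStore` class are our own hypothetical stand-ins, not the example's actual code.

```python
def merge_for_display(external_metadata: dict, sharepoint_columns: dict) -> dict:
    """Build the single view the user sees from both storage locations."""
    merged = dict(external_metadata)
    merged.update(sharepoint_columns)
    return merged


def split_for_storage(form_values: dict, external_fields: set) -> tuple:
    """Route each edited value back to the store that owns the field."""
    external = {k: v for k, v in form_values.items() if k in external_fields}
    extra = {k: v for k, v in form_values.items() if k not in external_fields}
    return external, extra


class StubStore:
    """Stands in for the hidden stub items that carry the extra columns."""

    def __init__(self):
        self._stubs = {}

    def save(self, document_name: str, extra_columns: dict) -> None:
        # Create the stub on first use, then attach or update the columns.
        self._stubs.setdefault(document_name, {}).update(extra_columns)

    def load(self, document_name: str) -> dict:
        return dict(self._stubs.get(document_name, {}))
```

In this sketch, the fields defined by the external content type decide where a value goes; anything else is treated as a SharePoint column and lands on the stub.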

Assigning Tasks and Workflow

Along the same lines, if you want to create tasks and workflow processes that kick in whenever you add, update or delete a document, you do this within the SharePoint document library. 

Figure: SharePoint -- External Document Workflow

The workflow process starts by checking out the document from the external repository and making a copy directly in SharePoint. The workflow is then activated against the SharePoint version of the document. Once the workflow is completed (for example, the document has been approved), the workflow copies the document back to the external repository and checks it in.
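
Here is a minimal sketch of that round trip, assuming an in-memory repository and a plain dictionary standing in for the SharePoint library. The class, the function and the `workflow` callable are hypothetical; the real example uses SharePoint workflows and the WCF services.

```python
class ExternalRepository:
    """In-memory stand-in for the external repository."""

    def __init__(self, documents: dict):
        self.documents = dict(documents)
        self.checked_out = set()

    def check_out(self, name: str) -> bytes:
        content = self.documents[name]
        if name in self.checked_out:
            raise RuntimeError(f"{name} is already checked out")
        self.checked_out.add(name)
        return content

    def check_in(self, name: str, content: bytes) -> None:
        if name not in self.checked_out:
            raise RuntimeError(f"{name} is not checked out")
        self.documents[name] = content
        self.checked_out.discard(name)


def run_workflow(repo, sharepoint_library: dict, name: str, workflow) -> None:
    """Check out, work on a SharePoint copy, then copy back and check in."""
    # Step 1: check out and copy the document into SharePoint.
    sharepoint_library[name] = repo.check_out(name)
    # Step 2: run the workflow against the SharePoint copy only.
    sharepoint_library[name] = workflow(sharepoint_library[name])
    # Step 3: copy the result back and release the lock.
    repo.check_in(name, sharepoint_library[name])
```

While the workflow runs, the document stays checked out in the external repository, so nobody can change it there behind the workflow's back.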

Additional Capabilities

The example also discusses how a document can be copied from the external repository to another location (e.g., a SharePoint site) and how the repository can be searched using the custom protocol handler. One of the challenges they came across when implementing the search capability was how to restrict the view of documents based on the user's credentials.

Since the credentials are stored within the external repository, SharePoint can't use them to filter the search results. The article discusses some workarounds, all of which are either difficult to implement, time consuming or expensive (use MS Identity Lifecycle Management Server).
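
To show what filtering by the user's credentials would involve, here is a sketch of one conceivable approach: trimming the hits at query time by asking the external repository about each one. This is our own illustration and not necessarily one of the workarounds the MSDN article describes; the `READERS` table and both functions are hypothetical.

```python
def trim_results(hits: list, repo_user: str, can_read) -> list:
    """Keep only the hits the mapped repository user is allowed to read."""
    return [hit for hit in hits if can_read(repo_user, hit)]


# Hypothetical rights, standing in for a call to the external repository.
READERS = {
    "plan.doc": {"repo_alice", "repo_bob"},
    "salaries.xls": {"repo_alice"},
}


def can_read(repo_user: str, document: str) -> bool:
    """Answer a single permission question for one document."""
    return repo_user in READERS.get(document, set())
```

Note that this costs one permission lookup per hit. Against a real repository reached over web services, that overhead hints at why this kind of filtering is not a cheap fix.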

Final Thoughts

If you know Microsoft technologies well, you should have no problems following the article or downloading and installing the code. Included in the downloaded code is an executable that will create the File Directory structure that represents the External Library, so you don't have to manually create that one yourself.

Remember that this example does not demonstrate all the operations applicable to the CMIS specification, and it only implements the SOAP web services protocol. In a real world solution, all operations must be implemented and both protocols (SOAP and REST/Atom) must be supported.

While this is a good example of how SharePoint can integrate documents from another document library, it doesn't demonstrate how SharePoint content may be consumed by another document library or application.

It's possible that Microsoft chose to explain how to pull documents into SharePoint first because many organizations like to use SharePoint as the front end to more enterprise-level content management systems on the back end.

Many questions have been answered with this example, including the most important: can CMIS be implemented in SharePoint? But maybe even more important, many questions still remain. How will security be filtered in search queries? How will SharePoint expose access to its own internal repository? How can the data in external documents be pulled into SharePoint without the use of Silverlight? We hope these answers will become clear when the spring public review of the CMIS specification arrives.

Next: Get Your Fingers Dirty

That's the intro folks. We've broken down in mostly laypersons' terms how Microsoft intends you to wire up CMIS repos with SharePoint. And now if you're a thirsty SharePoint geek...yes, it's time for the next step.