The CMIS specification is still in its early stages, but the buzz around it is consistently loud. Most recently, the spec has piqued the interest of the Apache Software Foundation (ASF), as some saw a "reference implementation" as the ingredient missing from the CMIS recipe for success.

Meet Chemistry -- a recently proposed Apache Incubator project with the goal of creating a generic, open-source, Java-language implementation of CMIS.

CMSWire had an opportunity to talk to one of Chemistry's committers -- Day Software's David Nuescheler -- to get more insight into the project.

Background on Chemistry

If Chemistry’s proposal is accepted to the Apache Incubator (it is currently pending some administrative approvals), the project promises the following:

  • a high-level API
  • a low-level SPI
  • generic implementations of clients and servers for the AtomPub and SOAP bindings
  • sample backends to serve data from repositories (including JCR)
  • probably an unofficial TCK (Technology Compatibility Kit)

Check out the wiki for more information.

Who is Behind Chemistry?

In addition to Nuxeo and Day, Chemistry’s committers are affiliated with Alfresco and SourceSense (a European open source systems integrator strongly represented in the Apache community). Interestingly, as Nuescheler mentioned, one of the committers, Gabriele Columbro, has recently left SourceSense for its partner Alfresco.

Nuescheler also expressed hopes that Columbro will be able to continue contributing to the ASF and Chemistry from Alfresco, bringing the “true spirit of open source” to the ECM vendor. Not to mention that, with Columbro’s arrival, Alfresco got its first-ever Apache committer on board.

Magnolia and Hippo may also contribute to the effort, as both have expressed interest in Chemistry. As Jackrabbit users, they would get this CMIS implementation (which will sit on top of JCR and Jackrabbit by default) for free, so the motivation is quite clear here.

As far as the big CMIS guns go, there is not much movement. According to Nuescheler, Open Text is involved, but has made no code contributions yet. IBM has its own, different history of involvement with the ASF. Microsoft has recently submitted its first code to the ASF, so time will tell how the big companies will react to Chemistry.

Nuescheler did mention that one of the biggest stumbling blocks to contributing is that those big companies need to feel comfortable with the organization they contribute to. The ASF holds a “unique position” in this case.

Taking JCR-Based CMIS Implementation to the Next Level

A JCR-based CMIS implementation has been of interest to many since the proposed spec saw the light of day. Late last year, a CMIS sandbox in Apache Jackrabbit was initiated by… Nuescheler, of course, with both his Apache and Day interests at heart. The goal was to “allow any JCR implementation to be CMIS compliant automatically (once the specification is released ;).”

The code developed in the sandbox since then has become part of Chemistry’s code.

The second part of Chemistry’s code comes from Nuxeo and Florent Guillaume, who started working on a more generic CMIS implementation framework in a Mercurial source repository outside Apache. But the initiative needed the Incubator if it was ever to get a more controlled, yet less restrictive and less confusing, environment.

Apache Could Be a Good Playground for CMIS

What’s good about the Incubator, said Nuescheler, is that there’s a whole process that projects have to go through. If they’re not led and mentored well, they can either get stuck or be removed completely, as Apache does weed out those that don’t spark enough interest. Graduating from the Incubator “is a big thing for an Apache project.”

Nuescheler made a good point: as with any spec, he encourages early implementations, even though the CMIS spec is yet to be ratified and is still going through significant changes.

As happened with JCR and Jackrabbit in the past, the best way to go is to start implementations, label them alpha code (not production code) and let developers find any inconsistencies sooner rather than later, working collaboratively and reducing the number of implementations, wasted money and redundancies.

This is where the Incubator’s value lies, as it comes with Apache’s “warning label” attached to it. We could label Chemistry a lab rat, of sorts, used for testing the spec and discovering its bugs. It’s a neutral place not owned by any vendor, where the community is just getting its feet wet with the still-new CMIS spec.

Where’s The Chemistry Exactly?

The relationship between Chemistry and Jackrabbit is quite interesting. Every project in the Incubator has to have a sponsor: for Chemistry, it is Apache Jackrabbit.

One may wonder: what are Chemistry’s dependencies on JCR, Jackrabbit, or other Apache projects? Nuescheler said there aren’t really any, as the CMIS APIs are defined abstractly and can be easily wrapped on top of JCR. And there are examples of how closely the two can cuddle … So, is it all purely about CMIS after all?

According to the Chemistry wiki, the first supported back-end in Chemistry will be a JCR repository (but not Apache Jackrabbit, Apache’s reference implementation of JCR). Why JCR? It feels “almost natural” to standardize the supported API, added Nuescheler. Plus, there are hardly any other alternatives.

Commercial vendors have their APIs locked up. But, with Apache’s liberal licensing, Jackrabbit is a natural (if not the only) choice of JCR repository implementation. Nuescheler repeats: “Absolutely no ties to Jackrabbit, only to JCR -- but not as a dependency, just as an option.”

CMIS PlugFest starts tomorrow in Basel. One of the event's activities involves CMIS implementations by several vendors. Let's see what kind of chemistry comes out of that. At the very least, we will see many more chapters in the CMIS saga.