WCM Field Notes is a regular column written in collaboration with Jon Marks (@McBoof), Head of Development at LBi. This second issue looks at what Open Source really means, and suggests ways for you to sensibly include both open source and proprietary systems in your Content Management System selection exercise. 

There seems to be a lot of fear, uncertainty and doubt surrounding open source content management these days. Last week, I was fortunate enough to be asked to speak at a British Computer Society Open Source event but was rather surprised by the lack of agreement about Open Source Software (OSS).

Many attendees thought that Open Source and Open Standards are one and the same. During the panel debate, one of the delegates even said "I would always select Open Source because it allows me to develop using an agile methodology," ... which really got me growling.

So I'd like to use this post to clarify 3 somewhat interrelated concepts -- open standards, open source and open data. Once we've done that, I'll offer my thoughts on how to go about including OSS and proprietary options in your CMS Selection RFP on a nice, happy and level playing field.

[Editor's Note: Don't miss the previous edition of WCM Field Notes: The Skinny on JCR, CMIS and OSGi.]

What Does Open Really Mean? 

Concept: Open Standards 

Open standards are what make the Internet possible. The railway gauge (the width between the tracks of a railway) is the classic example used to explain the concept. Back in the day, train tracks were of different widths so people and cargo actually needed to change trains because the one they were on didn't fit on the next track.

Once the width was standardized, life became a whole lot easier for everyone. Open standards are the railway tracks of the web. I was planning to stretch this analogy further, with the software being the trains and the cargo being the data but my wise colleagues advised me not to. The analogy, er, quickly fell off the rails.

For a standard to be truly open, it should have been created in a transparent way and should be available for anyone to use. Some people believe that a standard cannot be truly open unless it comes with an open source reference implementation.

I don't buy this -- the standard is nothing to do with the source code, although having a reference implementation certainly helps. There are several bodies that work extremely hard to develop and foster open standards. The ones that most affect my world are the W3C, IETF, OASIS and the JCP Program.

We have many useful open standards. We have low level standards that are the plumbing of the Internet, such as TCP/IP, DNS and HTTP. We have standards that allow us to make web pages that work on multiple browsers and devices (XHTML, CSS) and perform clever interactions (XMLHttpRequest to support AJAX). We have accessibility standards (WCAG, WAI-ARIA) that ensure all users can access the pages. And we have semantic and classification standards (RDF, Dublin Core) that ensure the machines can understand and use the content too.

Higher up the chain, standards get more domain specific. There are many data format standards. If you judge a standard by its adoption (which is the best way), then XML was the most successful standard of the last decade.

There are standards for content syndication, for authenticating users across many applications, for the creation of "widgets", for portability across social networks and almost anything else you can think of.

SQL was a wildly successful standard that allowed us to store and access content. The Java Content Repository (JCR) standard is a well known content management specific standard, and Content Management Interoperability Services (CMIS) is a newly emerging one. I talked about the JCR and CMIS in the previous WCM Field Notes column. And I've created An Incomplete Directory of Open Standards for those that want a more complete list.

2010-01-WCMFN-OpenStandards.gif

Image Credit: Rob Cottingham

Concept: Open Source

True open source software is software that is licensed under specific open terms (free and redistributable) and developed using a particular open process, part of which includes full access to the source code, for anyone. That's really about it. A good definition of OSS can be found on the Open Source Initiative website.

My friend Justin Cormack came for a beer and a chat after the BCS event. He has written an excellent blog post helping to distill the essence of OSS, and its impact on content management. He says:

open source...started with developers, about more efficient ways of building, architecting and delivering software; in terms of influence on the end users it is still small.

This is very important. The fact that a product is open source should not matter much to anyone except the development teams. And maybe those signing the checks and the lawyers, but we'll talk about this later.

Justin also recommends reading the Cathedral and the Bazaar essay, written over ten years ago by Eric S. Raymond. This study analyzes how one successful open source project worked and explores the argument that "Given enough eyeballs, all bugs are shallow."

The essay wonderfully captures the spirit and power of OSS as a vehicle for software development. The Cathedral refers to organized, closed development and the Bazaar to the mayhem of true, open development. I've always wondered if Raymond picked a cathedral to imply some link to the religious debate between Grand Design versus Evolution -- the proprietary versus OSS debate can get pretty religious at times.


Eric S. Raymond Describing The Cathedral vs. The Bazaar

Good OSS will heartily embrace open standards. But so will good proprietary software. Be warned, though -- there is bad software of all types out there that ignores standards.

2010-01-WCMFN_OpenSource.gif

Image Credit: Rob Cottingham

Concept: Open Data

The open data philosophy believes that certain data should be free and available to anyone. There are justifications for this -- it could be because the research to produce the data was paid for by taxpayers, or because of the belief that you cannot put a copyright on facts. Or because openness is simply better. Some people -- like Hans Rosling of GapMinder.org -- even think that all educational materials paid for by public funds should be made open and accessible to all.

Data is not considered open if there are licenses preventing re-use of the data, if only certain individuals (for example, registered members on a web site) can get at it, or if the storage format makes it difficult to access.

So we have overlap with open source (licensing and copyright models) and open standards (storage and interchange formats), but it is a distinct concept.

Tim Berners-Lee, the man credited with founding the Web, is one of the loudest voices in favor of open data. In the video below he attempts to explain the concepts of open data and linked data to a non-technical audience -- and the choice of language is at times rather amusing. His flower analogy is, however, a powerful illustration.


Tim Berners-Lee: The Next Web of Open, Linked Data

The most visible open data project in the last 10 years has been the Human Genome project. You can get free access to this data (all 150GB of it) now from various sources. In fact, it is one of the popular Public Data Sets hosted on the Amazon Web Services platform.

The list of categories of datasets on the platform gives a good insight into the kind of data that has already been made open: Astronomy, Biology, Chemistry, Climate, Economics, Encyclopedic, Geographic and Mathematics.

A good example of an open data project is OpenStreetMap -- "a free editable map of the whole world". And the UK Government is planning to open up the UK Postcode data in 2010. Currently you have to pay to use this data. As more and more data becomes open, we'll see more clever and useful applications of it.

2010-01-WCMFN_OpenData.gif

Image Credit: Rob Cottingham

Evaluate Open Source Fairly

Now that we know what OSS really is, we need a way to decide if it is the right choice for us. In my daily work I see a lot of CMS shortlists. However, virtually all of these shortlists are either entirely composed of proprietary systems, or entirely of open source systems. The "to open source or not to open source" decision seems to have been made much earlier, sometimes subconsciously, and almost always for the wrong reasons.

In my view, the fact that a CMS has an open source badge does not really matter. CMS Watch have acknowledged this by removing the Open Source category in their report and including Open Source products alongside their proprietary peers.

The boundaries between these ideas have gotten rather blurry of late. Community open source platforms (such as WordPress or Drupal) have very different models from commercial open source platforms (such as Alfresco). Some vendors ship both an open and a proprietary version of their software. And sometimes the open source badge on a product is all about marketing.

Some companies have a bias against open source. Something which has been on my mind recently is the difference between a product that most companies would consider to be a "black box" (such as an operating system, web server or office productivity suite) and a product that most companies would build upon, such as a content management system. Many companies have no issues using Ubuntu, Apache Web Server or OpenOffice, but are scared of an open source CMS. I think the type of product does impact whether OSS is appropriate for you, but that is a conversation for another day.

There are two easy steps to make sure you consider all the options. Firstly, make sure you get responses from all kinds of vendors. And secondly, have sensible evaluation criteria.

Write Open RFPs

A good consultant can smell what an RFP is looking for very quickly. And, it turns out, most RFPs make it very clear if they are expecting an open source response or not. Of the ones that I see, many of the questions are extremely difficult to answer if you are proposing to use an open source solution. As someone responding to an RFP, it is normally safer not to suggest OSS unless you're sure that the client is open to it.

There seems to be a perception that systems integrators and agencies would prefer to push an expensive product because of the kick-backs they receive from the vendors. This isn't true -- kick-backs are few and far between. In fact, the opposite is often true. If the integrator knows you have a budget, they would rather gobble up as much as possible of that budget in services than see it go to a vendor in license costs. More importantly, though, all the good integrators want to suggest the product that will make their life (and so your life) easier. They don't have preconceptions.

As an aside, I have seen people use OSS as a way to avoid the procurement process completely. Because it is "free", no-one else needs to get involved and no RFP needs to be written. However, although these download-and-do-it-yourself projects sometimes work wonderfully, they also lead to many Shadow IT departments and maintenance nightmares, and can give OSS a bad reputation in the minds of IT departments.

Have Sensible Evaluation Criteria

The most important factors when picking a CMS should be whether it provides the required functionality, whether it can be extended to support future functionality and, of course, the Total Cost of Ownership (TCO). Let's look at the various aspects of an OSS product and see how we can best evaluate them.

Let's start with cost, which is (despite what people say) the main reason I see buyers going for open source. However, the cost of the license isn't the important thing. It's the TCO that matters. Be sure to factor in support and maintenance costs, and the cost of development you'll need on top of the product.

The cost of the CMS implementation is normally far greater than the cost of the license, so be sure not to focus on a $10K license fee saving that could mean ten times that cost in development. Many buyers are attracted to the idea that an OSS product means they don't need to pay a big lump sum up front to get the software. However, be aware that many of the proprietary vendors now offer a SaaS model which allows you to pay-as-you-go.
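
To make the arithmetic concrete, here is a minimal sketch of a TCO comparison. Everything in it is an assumption for illustration: the `CostProfile` fields, the three-year horizon and all of the figures are made up, and a real comparison would use your own quotes and estimates.

```python
from dataclasses import dataclass


@dataclass
class CostProfile:
    """Hypothetical cost inputs for one candidate CMS (all figures illustrative)."""
    name: str
    license_upfront: float      # one-off license fee
    implementation: float       # initial build and integration work
    annual_support: float       # support and maintenance per year
    annual_development: float   # ongoing custom development per year


def total_cost_of_ownership(profile: CostProfile, years: int) -> float:
    """Sum the one-off costs plus the recurring costs over the evaluation period."""
    one_off = profile.license_upfront + profile.implementation
    recurring = (profile.annual_support + profile.annual_development) * years
    return one_off + recurring


# Made-up numbers: Product B saves the $10K license fee but needs far more development.
product_a = CostProfile("Product A", 10_000, 100_000, 15_000, 20_000)
product_b = CostProfile("Product B", 0, 200_000, 10_000, 30_000)

for candidate in (product_a, product_b):
    print(candidate.name, total_cost_of_ownership(candidate, years=3))
```

With these invented figures, Product A comes to 215,000 over three years and Product B to 320,000, even though Product B has no license fee at all. The point is only that the license line is one input among several.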

I'm assuming you are wise enough to make sure that the product does what you need it to do. However, no CMS will do everything you need it to, and you'll certainly need to extend the product for future requirements you don't even know about yet. So the product needs to be extensible, and extensibility is not about changing the core product. A good product, open source or otherwise, will have extension points that allow you to make changes without changing core code. Your extensions should continue to work with new versions when they are released. The ability to create modules or plugins is part of this.

Have a look at the developer community for a list of existing plugins. Note that many proprietary systems have an excellent extension API and a vast array of plugins and modules -- many of which will actually be OSS.
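
For readers who want to see what an extension point looks like in practice, here is a rough sketch. It is not taken from any particular CMS: the `ExtensionRegistry` class, the `before_render` hook name and the `render_page` function are all hypothetical, and real products each have their own plugin APIs.

```python
from collections import defaultdict


class ExtensionRegistry:
    """Hypothetical registry of named hooks that plugins can attach handlers to."""

    def __init__(self):
        self._hooks = defaultdict(list)

    def register(self, hook, handler):
        """Attach a handler function to a named extension point."""
        self._hooks[hook].append(handler)

    def apply(self, hook, value):
        """Pass the value through every handler registered for the hook, in order."""
        for handler in self._hooks[hook]:
            value = handler(value)
        return value


registry = ExtensionRegistry()


def render_page(body):
    """Core product code: it exposes a hook and is never edited by customers."""
    return registry.apply("before_render", body)


def add_footer(body):
    """Customer extension: lives outside the core and plugs into the hook."""
    return body + " [footer]"


registry.register("before_render", add_footer)
print(render_page("Hello"))
```

The core function `render_page` stays untouched while the behaviour changes, which is what allows an extension to keep working when the core is upgraded, as long as the hook itself is kept stable.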

The way the software is developed should not affect your software choice, unless you plan to develop it further and commit back using your own in-house development teams. It is not important to you whether or not the open development process described earlier does, on average, produce better software. You're judging the end result of the process.

Ease of maintenance is important, and OSS sounds attractive here. "I'll never be at the mercy of a vendor who won't fix my bug", you cry. "I'll fix it myself!" However, you need to ask yourself whether being able to see and change the code really is an advantage for you. Would you be brave enough to change the core of a product to fix a bug? If you are developing and supporting the product in-house, then it may well be a big bonus. If you aren't, then this is somebody else's problem and you shouldn't worry about it. Ask yourself who will help you if you have a problem with the software.

Your product needs to be modular to ensure you can replace bits if needed without throwing the whole thing out. Open standards help promote modularity by necessity, as each standard influences different layers of the system.

Your product needs to interoperate with other systems you may already have, or systems you will get in the future. However, the fact that something is open source doesn't matter here. Just make sure it uses open standards.

You need to make sure you're not locking your data in a vault. Ensure that data isn't stored in a way that makes it impossible to get at or, at the very least, that the system provides an easy way to export the data into a standard format.
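
As a small illustration of what "an easy way to export the data into a standard format" can mean, here is a sketch that writes the same content out as JSON and as XML using only the Python standard library. The item structure (`id`, `title`, `body`) and the function names are assumptions made up for this example, not the export format of any real CMS.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical content items as they might come out of a CMS repository.
items = [
    {"id": "1", "title": "About us", "body": "We build websites."},
    {"id": "2", "title": "Contact", "body": "Email us any time."},
]


def export_json(content):
    """Serialize the content items as a JSON document."""
    return json.dumps({"items": content}, indent=2)


def export_xml(content):
    """Serialize the content items as a simple XML document."""
    root = ET.Element("items")
    for item in content:
        node = ET.SubElement(root, "item", id=item["id"])
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "body").text = item["body"]
    return ET.tostring(root, encoding="unicode")


print(export_json(items))
print(export_xml(items))
```

If a candidate product cannot give you something roughly this simple for your own content, whether out of the box or through a documented API, that is worth noting in the evaluation.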

The way the software is licensed probably will not have a big impact on your decision, unless there are particularly sensitive Intellectual Property issues. It is very rare to see a product picked or disqualified because of the license.

Finally, consider the risks of vendor lock-in. We've all heard many horror stories of customers being abused horribly by their proprietary vendor. But remember, choosing an OSS product doesn't guarantee you are safe. You can still be at the mercy of an implementer or community, if not a product vendor. Don't think that it is easy to move, for example, from Drupal to Plone. The best way to insulate yourself from this risk is, again, open standards and open (or at least easily portable) data.

The point of all of this is that you shouldn't evaluate your product on the fact that it is OSS or not. That doesn't really matter. Evaluate all the products on the dimensions outlined above and pick the one that comes out top.
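
One simple way to do this is a weighted scoring matrix. The sketch below is only an illustration: the dimension names are my shorthand for the criteria discussed above, and the weights and scores are invented. Note that "is it open source?" is deliberately not one of the dimensions.

```python
# Hypothetical weights per evaluation dimension (they sum to 1.0).
WEIGHTS = {
    "functionality": 0.30,
    "extensibility": 0.20,
    "tco": 0.20,
    "maintenance": 0.10,
    "interoperability": 0.10,
    "data_portability": 0.10,
}

# Made-up scores from 0 (poor) to 5 (excellent) for two imaginary products.
SCORES = {
    "Product A": {
        "functionality": 4, "extensibility": 3, "tco": 2,
        "maintenance": 4, "interoperability": 4, "data_portability": 3,
    },
    "Product B": {
        "functionality": 3, "extensibility": 5, "tco": 4,
        "maintenance": 3, "interoperability": 4, "data_portability": 5,
    },
}


def weighted_score(scores, weights):
    """Multiply each dimension's score by its weight and add the results."""
    return sum(weights[dimension] * scores[dimension] for dimension in weights)


def rank(candidates, weights):
    """Return the candidate names ordered from highest to lowest weighted score."""
    return sorted(
        candidates,
        key=lambda name: weighted_score(candidates[name], weights),
        reverse=True,
    )


for name in rank(SCORES, WEIGHTS):
    print(name, round(weighted_score(SCORES[name], WEIGHTS), 2))
```

With these invented numbers, Product B scores 3.9 and Product A scores 3.3. The weights are where your own priorities go, so agree on them before you start scoring vendors rather than after.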

In Summary

Open standards, open source and open data are three important, yet separate, concepts. Blurring the lines between them creates confusion and makes you susceptible to marketing hype from both sides of the Proprietary vs OSS war.

However, if you have a clear understanding of these concepts, it becomes a lot easier to see through the FUD, compare software from all different walks of life and give yourself a chance to meet the content management product of your dreams.

Finally, the three cartoons used above were taken from the excellent Social Signal blog, and have an open license. Openness is good.
