2011 is not only said to be the year of the mobile, but also a key time for cloud computing. As we look to leverage cloud environments for cost savings, among other things, we need to understand how different products and solutions support the cloud. This article focuses on enterprise content management and business process management in a cloud environment, with a specific look at some of the players in these markets.
The Knowledge Economy & the Cloud
We all agree that we are working in a knowledge economy and organizations are striving to create an empowered workforce, with the in-house knowledge capital being critical for a company's operations and performance, not to mention innovation and competitive advantage. Organizations are focusing on how to improve business operations, bring down costs of servicing customers and reduce time-to-market for new products and services.
Workflow automation, straight-through processing, paperless environment, service oriented architecture and multi-channel integration and distribution are a few key initiatives that companies pursue to meet their goals and objectives. Social media and networking is a paradigm shift that is changing the way an organization interacts with its ecosystems of customers, suppliers, partners and employees.
While the social paradigm and its adoption are very critical to today’s business landscape and for the next generation customer interaction model, we will not be focusing on this phenomenon in this article. We will focus on the cloud computing paradigm, especially on how BPM and ECM tools and technologies can be better leveraged through a cloud operating model and architecture.
A lot of progress is being made on how the cloud paradigm can be leveraged to create, store, manage and deliver content and information to users, primarily through Enterprise Content Management (ECM) tools that are available in the market (both proprietary and open source). This helps in the digitization of information required to execute business processes and transactions.
Adding Business Process Management (BPM) capabilities on top of content management and digitization will help work move to users and process participants. It will automate work assignment through skills-based routing, make the relevant content available at each process step, track pending work, and monitor process performance and user productivity.
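To make skills-based routing concrete, here is a minimal sketch in Python. It assumes a simple policy (send each work item to the qualified participant with the fewest pending items); the class, function and participant names are ours for illustration and do not come from any BPM product.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Participant:
    name: str
    skills: set[str]
    pending: list[str] = field(default_factory=list)  # work items waiting for this user


def route(task_id: str, required_skill: str, participants: list[Participant]) -> Participant:
    """Assign the task to the qualified participant with the shortest queue."""
    qualified = [p for p in participants if required_skill in p.skills]
    if not qualified:
        raise LookupError(f"no participant has skill {required_skill!r}")
    chosen = min(qualified, key=lambda p: len(p.pending))
    chosen.pending.append(task_id)  # the pending list doubles as work tracking
    return chosen


# Hypothetical team: both can handle claims, only one can underwrite.
team = [
    Participant("asha", {"claims", "underwriting"}),
    Participant("ben", {"claims"}),
]
first = route("claim-001", "claims", team)
second = route("claim-002", "claims", team)
```

The same pending lists that drive the routing decision also give a crude view of outstanding work per user, which is the starting point for the tracking and monitoring mentioned above.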
ECM & BPM in the Cloud
Solution architectures that combine the capabilities of BPM and ECM are not new, but delivering these solutions through a cloud-based platform is a rapidly evolving practice that is gaining adoption in the marketplace. Of the many BPM and ECM vendors who have been in the market for a long time, quite a few claim that their products are cloud-enabled and cloud-ready. These vendors, and the systems integrators who provide services around these products, are aggressively promoting their cloud-based offerings to customers, with varying levels of success. We have been evaluating some of these products: Alfresco, Nuxeo, KnowledgeTree and SpringCM in the ECM space, and Appian (Cloud BPM) and Pega (SmartPaaS) in the BPM space.
This article focuses on some of the important aspects of cloud architecture and attempts to evaluate how these ECM and BPM product vendors are aligned with core cloud architectural principles required to run their software on private and public clouds.
It is expected that readers will have a basic understanding of different ECM/BPM products and cloud computing architecture to appreciate our observations and viewpoints.
Let us look at the cloud architectural features that customers and cloud services providers (we also refer to them as systems integrators (SI)) will be looking for in the cloud version of these ECM and BPM products.
Single Tenancy and Multi-tenancy
SIs will be looking for a platform (PaaS -- Platform as a Service) for the following reasons:
- They can build innovative business solutions (SaaS -- Software as a Service) rapidly and sell them to their customer base, especially small and medium businesses. Ideally, they would like to build once and then be able to deploy to multiple customers (also called tenants in the cloud world) with minimal customization.
- They would prefer to have flexible configuration options to onboard new tenants quickly and efficiently. This ensures that they achieve economies of scale and therefore provide a competitive, pay-as-you-go pricing model.
- These SIs will be interested in the maximum reuse of the application layer/logic across the tenants.
Customers may not be too concerned about the application logic layer (mostly driven by PaaS), as long as it is transparent to them and their data is isolated. Customers will be more interested in ensuring that they pay only for what they actually use (both compute and storage), and this is usually driven by the Infrastructure-as-a-Service (IaaS) layer.
In a single instance, multi-tenant environment, IaaS and PaaS layers are shared across multiple tenants and the single instance of the software/application logic can serve multiple customers/tenants. But, in a single instance and single tenant scenario, the IaaS and PaaS layers are tenant specific. As the compute power is not shared, economies of scale are not leveraged much and the tenant ends up paying more per transaction. The cost of operating a cloud instance that serves a single tenant will be more than offering the same instance to multiple tenants.
The question then arises -- what is the difference between an on-premise/hosted application and a single instance/single tenant cloud application? Because of the inherently elastic nature of the IaaS layer, you can scale the IaaS resources (VMs, CPU cycles, storage, memory, etc.) required to run the cloud application up or down on demand, almost instantly. This is not possible if the application is running on on-premise hardware, since this hardware is a sunk cost and the customer, not the cloud/IaaS provider, is responsible for resource utilization. Metering and billing come with a true cloud provider offering, which is not the case for an on-premise application or an ASP delivery model. So it is clear that vendors need to build their products in such a way that the PaaS/application layer inherently supports multi-tenancy when running on the IaaS layer. This will help maximize reuse and optimize costs.
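As a rough illustration of what metering and billing mean in practice, the sketch below prices a handful of usage records per tenant. The unit prices, resource names and tenant names are made up for the example; real IaaS price lists and metering granularity differ.

```python
from collections import defaultdict

# Hypothetical unit prices; real IaaS price lists differ.
RATES = {"vm_hours": 0.10, "storage_gb_month": 0.15}


def invoice(usage_records):
    """Sum metered usage per tenant and price it with RATES."""
    totals = defaultdict(float)
    for tenant, resource, quantity in usage_records:
        totals[tenant] += quantity * RATES[resource]
    return {tenant: round(amount, 2) for tenant, amount in totals.items()}


# Each record is (tenant, metered resource, quantity consumed).
records = [
    ("acme", "vm_hours", 200),
    ("acme", "storage_gb_month", 50),
    ("globex", "vm_hours", 20),
]
bills = invoice(records)
```

A tenant that scales down simply generates fewer usage records and a smaller bill, which is exactly what sunk-cost on-premise hardware cannot offer.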
Some ECM products like Alfresco and SpringCM, and BPM products like Pega support multi-tenancy, a critical component for multi-company cloud implementations, as it maximizes use of hardware, application platform and common logic. Data isolation is managed by application layer logic for Alfresco, SpringCM and Pega.
- The multi-tenancy (MT) features of Alfresco allow it to be configured to run as a true single-instance, multi-tenant environment. A single instance of Alfresco (installed either on a single server or across a cluster of servers) can host multiple independent tenants. The Alfresco instance is logically partitioned such that it appears to each tenant as if they are accessing a completely separate instance of Alfresco. Client-facing interfaces (like Alfresco Share, Explorer, etc.) are aware of the tenant’s domain, authentication and user context and can store and retrieve tenant-specific data accordingly.
- The Pega SmartPaaS cloud offering inherently supports multi-tenant architecture with respect to access, security, and the execution model. Independent business applications (rules and business process) can be built using the PaaS layer and SaaS applications are segregated by access and organizational rules, but can share non-specific sets of rules to provide common functionality.
- SpringCM supports true single instance, multi-tenant architecture in which the cloud service provider can provision and maintain one instance of the platform that can be used by all customers/tenants.
- Appian (a BPM cloud provider) inherently follows the single instance/single tenant model. No application components or process information are shared across tenants. This works well for customers who are apprehensive about a shared data model and its potential security threats, and for those who would prefer that the business process/platform layer (PaaS) not be shared with other tenants.
- KnowledgeTree (an ECM cloud provider) used to follow a single instance/single tenant architectural strategy until recently. They currently support multi-tenancy.
- Nuxeo, a document management product, has partial support for multi-tenancy. The Nuxeo (version 5.1) architecture cleanly isolates the repository using domains and workspaces, though it does not isolate user management. At the time of writing, an effort is under way to make this product multi-tenant in all respects.
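The idea of tenants being segregated while still sharing a common set of rules, as described for Pega above, can be sketched with a simple layered lookup. This is only an illustration of the concept and not Pega's actual rule resolution mechanism; the rule names, values and tenants are invented.

```python
from collections import ChainMap

# Rules shared by every tenant (names and values are made up for illustration).
COMMON_RULES = {"sla_hours": 48, "approval_levels": 1}

# Tenant-specific overrides, kept separate per tenant.
TENANT_RULES = {
    "acme": {"sla_hours": 24},
    "globex": {},
}


def rules_for(tenant: str) -> ChainMap:
    """Look up a rule in the tenant's own set first, then fall back to the common set."""
    return ChainMap(TENANT_RULES[tenant], COMMON_RULES)
```

Here `rules_for("acme")["sla_hours"]` resolves to the tenant's own value, while `rules_for("globex")["sla_hours"]` falls back to the shared one, so common functionality is written once and reused across tenants.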
Let us take a closer look at how ECM software vendors like Alfresco, SpringCM and Nuxeo (partially) enable their single instance cloud platforms to work in a multi-tenant environment. Whenever a tenant is provisioned in the instance, the platform creates an independent domain (@xyz) so that each tenant is domain-aware and can be identified by its domain name. Domain information is primarily stored in the schema/database.
Based on the user’s authentication (user1@xyz), a dispatcher sends the tenant-specific, context aware request to the PaaS layer, where the PaaS common logic handles the request and stores/retrieves tenant specific data. The whole process may not be as simple as described though.
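A minimal sketch of such a dispatcher is shown below. It assumes logins of the form user@domain and a small in-memory tenant registry; the registry contents, field names and storage paths are hypothetical and are not taken from Alfresco, SpringCM or Nuxeo.

```python
# Hypothetical tenant registry; a real platform would keep this in its database.
TENANTS = {"xyz": {"store": "/data/xyz"}, "abc": {"store": "/data/abc"}}


def tenant_of(username: str) -> str:
    """Extract the tenant domain from a login such as 'user1@xyz'."""
    user, sep, domain = username.rpartition("@")
    if not sep or not user or not domain:
        raise ValueError(f"expected user@domain, got {username!r}")
    return domain


def dispatch(username: str, action: str) -> dict:
    """Attach the tenant context to a request before the shared logic handles it."""
    domain = tenant_of(username)
    if domain not in TENANTS:
        raise LookupError(f"unknown tenant {domain!r}")
    return {"user": username, "tenant": domain, "action": action, **TENANTS[domain]}
```

For example, `dispatch("user1@xyz", "read")` yields a request that carries the tenant name and its store location, so the common PaaS logic never has to guess whose data it is touching.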
The following diagram depicts the conceptual architecture of an ECM/BPM multi-tenant cloud enabled solution at very high level:
Cloud Infrastructure (IaaS) support
Alfresco, Nuxeo and KnowledgeTree each have a cloud version (Amazon Machine Image -- AMI) running on Amazon EC2. Customers, end-users, application developers and service providers can sign up for an Amazon EC2 account (including storage space on S3 or EBS) and run their own Alfresco, Nuxeo or KnowledgeTree cloud instances with minimal effort. Amazon provides options for operating systems (Linux, Windows) as well as databases (Open SQL, Oracle, etc.). Customers can also engage cloud infrastructure providers like RightScale to set up a robust production environment in EC2 very quickly and manage/monitor it efficiently.
The SpringCM cloud platform operates on hundreds of virtual machines in its data centers.
Pega’s SmartPaaS cloud offering is housed in a data center providing best-in-class physical and network security with excellent redundancy, scalability and reliability. SmartPaaS Platform-as-a-Service (PaaS) is developed using IBM middleware on Amazon Elastic Compute Cloud with the support of the Amazon Web Services (AWS) on-demand technology infrastructure, providing instant scalability and elasticity. Capgemini’s global Cloud Computing Center of Excellence is also involved in this effort.
Appian’s cloud offering (Appian Cloud BPM) comes in standard and premium editions, both of which run on its own Amazon EC2 or Rackspace cloud hosting environments. Both editions offer localized hosting in the United States, European Union or Asia/Pacific regions. Appian also provides free 30-day trials, offering the complete BPM Suite solution hosted in an Amazon EC2 environment.
How tenant data is stored and isolated is the most critical and sensitive piece of the puzzle, and it strongly influences an organization’s decision to move to the cloud. Let us explore the possible strategies that can be adopted by any cloud vendor.
Separate database and separate schema for each tenant
In this model, each tenant gets its own separate schema and database. This looks very attractive from a customer perspective as it offers better security and enables more customization, but it comes at a higher cost: each tenant needs its own database infrastructure in the IaaS layer, and the cloud application/service provider cannot capitalize on the economies of scale that would be realized if one database schema/instance were to support multiple tenants. Some customers might feel that their sensitive data should not reside alongside other tenants’ data. Appian and KnowledgeTree support this type of data architecture.
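The sketch below illustrates the separate-database model, using in-memory SQLite databases as stand-ins for dedicated database servers. The table, tenant and document names are invented, and this is not how Appian or KnowledgeTree implement it; it only shows that no query can ever reach another tenant's data.

```python
import sqlite3

# One private database per tenant, created lazily.
_connections: dict[str, sqlite3.Connection] = {}


def database_for(tenant: str) -> sqlite3.Connection:
    """Return the tenant's private database, creating it on first use."""
    if tenant not in _connections:
        conn = sqlite3.connect(":memory:")  # stands in for a dedicated database server
        conn.execute("CREATE TABLE documents (title TEXT)")
        _connections[tenant] = conn
    return _connections[tenant]


database_for("acme").execute("INSERT INTO documents VALUES ('acme-contract')")
database_for("globex").execute("INSERT INTO documents VALUES ('globex-invoice')")
```

The price of this isolation is visible even in the sketch: every new tenant means another database to create, back up and pay for.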
Shared database and separate schema
Every tenant shares the same database but gets its own schema, so tenant-specific data is stored in the tenant's own schema within the shared database. SpringCM and Pega SmartPaaS support this data architecture. Pega's approach is unusual in that it also has a common schema, so that all tenants can share common, reusable rules across the application.
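As a rough sketch of this layout, the example below uses SQLite, where `ATTACH DATABASE` provides named schemas on a single connection and stands in for `CREATE SCHEMA` on a server database. The schema, table and tenant names are assumptions for illustration, and the common schema only loosely mirrors the shared-rules idea described for Pega.

```python
import sqlite3

# One shared connection; autocommit mode so ATTACH never runs inside a transaction.
db = sqlite3.connect(":memory:", isolation_level=None)


def add_schema(name: str) -> None:
    """Attach a named schema; SQLite's ATTACH stands in for CREATE SCHEMA here."""
    if not name.isidentifier():  # schema names cannot be bound as parameters
        raise ValueError(f"unsafe schema name {name!r}")
    db.execute(f"ATTACH DATABASE ':memory:' AS {name}")


# A common schema holds rules that every tenant can read.
add_schema("common")
db.execute("CREATE TABLE common.rules (name TEXT, value TEXT)")
db.execute("INSERT INTO common.rules VALUES ('sla_hours', '48')")

# Each tenant gets its own schema with an identical table layout.
for tenant in ("acme", "globex"):
    add_schema(tenant)
    db.execute(f"CREATE TABLE {tenant}.cases (subject TEXT)")

db.execute("INSERT INTO acme.cases VALUES ('renewal')")
```

Tenant data stays in `acme.cases` or `globex.cases`, while `common.rules` is reachable from the same connection, so one database serves everyone without mixing rows.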
Shared database and shared schema
All tenants' data is stored in a single shared schema, in a single database. In Alfresco’s multi-tenant environment, content is stored in a tenant-specific file store, but metadata and similar information is stored in a single shared schema running in a relational database. There is a greater economy of scale, but less security, as all tenants' data shares the same schema and database. Data separation is an application-level responsibility. Nuxeo (an ECM product) also supports this kind of data architecture.
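The sketch below shows what "application-level responsibility" means in the shared-schema model: every row carries a tenant identifier and every query must filter on it. It uses an in-memory SQLite table with invented column names and data, and is not the actual Alfresco or Nuxeo schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# A single table holds metadata for all tenants, distinguished by tenant_id.
db.execute("CREATE TABLE metadata (tenant_id TEXT NOT NULL, doc TEXT, author TEXT)")
db.executemany(
    "INSERT INTO metadata VALUES (?, ?, ?)",
    [("acme", "contract.pdf", "asha"), ("globex", "invoice.pdf", "ben")],
)


def docs_for(tenant_id: str) -> list[str]:
    """Every query must carry the tenant filter; the database will not enforce it."""
    rows = db.execute(
        "SELECT doc FROM metadata WHERE tenant_id = ? ORDER BY doc", (tenant_id,)
    )
    return [doc for (doc,) in rows]
```

A single forgotten `WHERE tenant_id = ?` would expose one tenant's rows to another, which is why this model is the cheapest to run and the hardest to get right.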
Configuration & Development
A software product running on the cloud should be configurable enough, and should be able to provide a platform (PaaS) so that business applications (SaaS) can be built on top of it quickly and easily.
- Appian provides pre-built application and process templates that may be configured and deployed to manage its customers’ immediate needs. Custom smart services/components can be built, but can only be deployed by Appian’s product support organization.
- Pega provides a thin client interface for configuration and development. No separate IDE needs to be installed to develop a custom solution.
- The multi-tenant features of Alfresco depend on the Dynamic Model features. This provides tenants with the ability to customize their Alfresco environment, including models, workflows and the web client UI. The core CMS supports the template/dynamic model to build a SaaS application.
- Nuxeo does not provide any templates or similar facilities/utilities that help in rapidly developing a SaaS application. Tenant-specific customization may therefore not be that easy with Nuxeo.
- The excellent, template-based configuration/development capability of SpringCM helps to build custom SaaS applications very rapidly. Lots of standard templates are already available. Developers can very quickly configure and customize the appearance of SpringCM to include branding and logo, access control on folders and documents, and configure certain behaviors of features.
- KnowledgeTree also provides an XML template to configure customer-specific needs such as metadata. Customization in its multi-tenant environment is quite difficult.
How quickly a customer/tenant can be on-boarded in a multi-tenant environment depends on whether the platform has enough tenant management and administration capabilities like provisioning a tenant, importing data from one tenant to another, ease of user management, etc.
- The Appian product support group brings up the instance according to the tenant's specific needs. The running instance is then monitored proactively in the premium edition; proactive monitoring is also available for the standard edition.
- Pega SmartPaaS administrators can add and configure servers, application server containers, and databases using the administration console. New tenants can be added and managed very easily using the portal-based PaaS administration console that comes with powerful wizards.
- Alfresco provides an administration utility to quickly create a tenant, import tenant data, manage users, etc.
- Nuxeo does not have any administration capability for managing a tenant instance, which is to be expected since it does not inherently support multi-tenancy.
- SpringCM does have extensive administration capabilities for tenant management.
- In a single instance/multi-tenant KnowledgeTree environment, a SaaS application can be very quickly provisioned.
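The kind of tenant administration described above (provisioning a tenant, then managing its users) can be sketched as a small in-memory registry. The class and method names are ours for illustration and do not correspond to the administration tools of any of the products discussed.

```python
from dataclasses import dataclass, field


@dataclass
class Tenant:
    domain: str
    admin: str
    users: set = field(default_factory=set)


class TenantRegistry:
    """In-memory stand-in for a platform's tenant administration console."""

    def __init__(self):
        self._tenants = {}

    def provision(self, domain: str, admin: str) -> Tenant:
        """Create a tenant with its own domain and a first administrator login."""
        if domain in self._tenants:
            raise ValueError(f"tenant {domain!r} already exists")
        tenant = Tenant(domain, f"{admin}@{domain}")
        tenant.users.add(tenant.admin)
        self._tenants[domain] = tenant
        return tenant

    def add_user(self, domain: str, user: str) -> str:
        """Register a user under the tenant and return the domain-qualified login."""
        login = f"{user}@{domain}"
        self._tenants[domain].users.add(login)
        return login
```

With something like this in place, onboarding becomes `registry.provision("xyz", "admin")` followed by `registry.add_user("xyz", "user1")`, rather than a new software installation per customer.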
There will always be a need to consume ECM or BPM services from an external, on-premise application or from another cloud application. In the case of BPM, services exposed from an on-premise system will need to be consumed by the BPM cloud application. These are some strategies that can be followed to make this happen:
- Appian Cloud BPM provides a number of integration adapters, including SOAP, Java, JMS, SFTP. Network communication can either be through a secure VPN or secure web service call over HTTPS.
- Data maintained in on-premise, back-end systems can be integrated with Pega Cloud through interfaces such as SOAP web services, .NET, JMS and MQ messaging over a secure VPN.
- Alfresco provides SOAP, REST and CMIS APIs to integrate with on-premise and other external cloud applications and providers.
- SpringCM offers several pre-built adapters to interface with other platforms like SharePoint and Salesforce.com. SpringCM exposes its functionalities through web services too, so external applications can consume these exposed web services.
- KnowledgeTree’s functional capabilities can be made available using REST and SOAP Web Service APIs.
- External systems (both on-premise and cloud) can consume the SOAP web services or CMIS services exposed by Nuxeo.
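To give a feel for what consuming such a service over HTTPS involves, the sketch below prepares (but does not send) an authenticated REST request for a document's metadata. The host name, `/documents` path, query parameter and credentials are all hypothetical; they are not the actual API of Alfresco, SpringCM, KnowledgeTree or Nuxeo, and a real integration would follow the vendor's documented REST, SOAP or CMIS bindings.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_document_request(base_url: str, tenant_user: str, password: str, path: str) -> Request:
    """Prepare (but do not send) an authenticated HTTPS GET for a document's metadata."""
    if not base_url.startswith("https://"):
        raise ValueError("refusing to send credentials over a non-HTTPS URL")
    # Hypothetical endpoint layout: <base>/documents?path=<repository path>
    url = f"{base_url}/documents?{urlencode({'path': path})}"
    token = base64.b64encode(f"{tenant_user}:{password}".encode()).decode()
    return Request(url, headers={"Authorization": f"Basic {token}", "Accept": "application/json"})


req = build_document_request("https://ecm.example.com/api", "user1@xyz", "secret", "/contracts/2011")
```

Passing `req` to `urllib.request.urlopen` would perform the call; note that the domain-qualified login travels with the request, so the cloud instance can resolve the tenant on its side.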
Authentication & Single Sign-on (SSO)
SSO might be required while integrating on-premise and other cloud-based applications with the ECM and/or BPM cloud instances.
- SSO through Appian Cloud BPM is enabled via SAML integration or via direct integration with your authentication provider behind your firewall using a secure VPN connection. To enable SSO integration on a particular Appian Cloud BPM site, either SAML authentication must be enabled on the identity management servers or a secure VPN must be configured between your network and the Appian Cloud environment.
- Identity management solutions like LDAP and others can be integrated with Pega SmartPaaS platform without any coding effort.
- Alfresco does not yet support LDAP integration out-of-the-box in its multi-tenant implementation.
- SpringCM’s cloud platform supports SAML 2.0 for implementing SSO.
- KnowledgeTree’s cloud offering supports integration with Microsoft Active Directory via LDAP and also leverages OneLogin's open-source SAML toolkit to implement web-based SSO.
Alfresco and Nuxeo are open source products whose core technology is Java. KnowledgeTree is also an open source product, but its core technology is PHP. SpringCM is a proprietary product built on .NET. Appian Cloud BPM and Pega SmartPaaS are both proprietary products whose core technology is Java.
In this article we have tried to paint a picture of cloud computing, specifically as it relates to BPM and ECM. We have brought out the key architectural considerations and factors to consider while building ECM and BPM applications on the cloud. We have compared and contrasted a few leading software products and platforms (PaaS), how they have approached their cloud architecture and some of the pros and cons attached to their architectural strategy.
While both ECM and BPM product vendors have taken the plunge into the cloud arena, they are at different levels of maturity and need to overcome challenges that are unique to the space they operate in. Two critical aspects that determine the success of a cloud application are business need and adoption. These solutions need to be adopted by a critical mass before they become mainstream and are embraced by a larger group, which in turn makes the business model viable for software vendors and cloud service providers. Mainstream adoption and healthy revenue potential will attract more investment to evolve and mature these cloud-based offerings.
ECM Cloud Adoption
The ECM cloud products and offerings have gained more adoption than BPM products and therefore offer a better feature/function set and capabilities. Adding collaboration and social networking capabilities to the base ECM offerings has made them even more attractive to customers and users. Since ECM is not core to the business and does not require high-volume, critical transactional capability, there is less pressure on integration and, more importantly, these solutions can run mostly as independent business applications with minimal integration.
Various industries and sectors have adopted ECM offerings on the cloud. A large UK-based insurer won the SharePoint 2010 Innovator Award for building an intranet solution on the cloud by incorporating content management, collaboration and workflow into their intranet application. Claims First Notice of Loss (FNOL), policy servicing, service center management, and agent marketing management and licensing are a few of the areas where insurance ECM solutions on the cloud would help in reducing the cost of operations and improving service levels. Combined with BPM/workflow capability, these solutions will extend content management capabilities and increase service automation and productivity.
BPM Cloud Adoption
Adoption of BPM has also increased in the recent past, though at a slower pace than ECM. BPM, unlike ECM, is all about process automation and transactional capability. Volumes, mission criticality and integration needs are therefore much higher than for ECM applications.
BPM is used to automate, streamline and optimize core business processes and transactions and many of these end-to-end processes traverse across multiple internal and external applications, including legacy applications. Integration and performance are key and current versions of cloud-based BPM offerings do not have robust integration built into them. They are still evolving.
While large enterprises are still looking at how BPM cloud applications will fit in their enterprise architecture and meet the stringent security, integration and performance requirements, Small and Medium businesses (SMBs) are key candidates for BPM cloud offerings and solutions. Their integration needs are less and volumes handled are lower. Their business processes are simpler and their budgets lighter. All these factors are ideal for a cloud-based BPM strategy.
Cloud-based BPM is also being used to automate and support non-core and support processes. Customers are also using the cloud-based offerings of BPM vendors for their proof-of-concepts, development and testing, while the production environment is still on-premise. This helps reduce the total cost of ownership.
As technical and delivery models for cloud computing evolve and start to reach a scale of global adoption to satisfy enterprise needs, the success of a cloud computing initiative is essentially dependent on the business needs being addressed, the pains that need to be eliminated and the tangible ROI this cloud investment will bring to the organization. There are multiple critical success factors to consider, such as the nature of the operations and processes that are to be cloud-enabled (e.g. sales force automation, self-service management, claims processing, online B2C sales, etc.), the sensitivity of the data and information, the regulatory requirements from a compliance standpoint, the cost arbitrage that will be gained by moving these processes and applications on-demand and the volumes that need to be handled.
Within the organization implementing cloud-based applications, user buy-in is critical to the success of a cloud initiative. Buy-in from key stakeholders and end users is essential to kick-start and, more importantly, sustain the initiative.