Cloudera challenges enterprises to ask bigger questions, and why not? This is the age of Big Data, after all.

Until now, though, users in highly regulated industries such as finance, healthcare and government couldn’t use all of their data to get answers, because they couldn’t bring it into Hadoop for Big Data crunching without breaking compliance rules.

Today they can.

Sentry - Fine-Grained Authorization for Hadoop

Cloudera is introducing Sentry, a new open source project under the Apache license that delivers the industry’s first fine-grained authorization framework for Hadoop. In the simplest possible terms, Sentry controls who gets to see and use which information. In a financial services firm, for example, some workers and applications might be allowed to access database columns containing the last four digits of credit card numbers, while others won’t even know that information is present in those columns.

Until now, role-based access control (RBAC) didn't exist for Hadoop, which has been a big problem for enterprises and organizations that wanted to leverage their Big Data assets but couldn’t do so without violating PCI (credit card compliance), SOX and HIPAA rules.
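
To make the idea concrete, here is a minimal Python sketch of role-based, column-level access control along the lines of the credit card example. It is not Sentry's API or policy format; the `Authorizer` class, the role names, and the `payments` table are all hypothetical and exist only to illustrate how privileges attach to roles rather than to individual users.

```python
from collections import defaultdict


class Authorizer:
    """Toy role-based access control at column granularity.

    Hypothetical illustration only; this is not how Sentry is implemented.
    """

    def __init__(self):
        # role name -> set of (table, column) pairs the role may read
        self.role_privileges = defaultdict(set)
        # user name -> set of role names assigned to that user
        self.user_roles = defaultdict(set)

    def grant(self, role, table, column):
        """Allow every holder of `role` to read `table.column`."""
        self.role_privileges[role].add((table, column))

    def assign(self, user, role):
        """Give `user` the named role."""
        self.user_roles[user].add(role)

    def can_read(self, user, table, column):
        """True if any of the user's roles carries the privilege."""
        return any(
            (table, column) in self.role_privileges[role]
            for role in self.user_roles.get(user, ())
        )

    def visible_row(self, user, table, row):
        """Return only the columns the user may see; the rest are omitted."""
        return {c: v for c, v in row.items() if self.can_read(user, table, c)}


# Hypothetical roles, users, and data for the example.
auth = Authorizer()
auth.grant("support", "payments", "customer")
auth.grant("fraud_analyst", "payments", "customer")
auth.grant("fraud_analyst", "payments", "card_last4")
auth.assign("alice", "fraud_analyst")
auth.assign("bob", "support")

row = {"customer": "A. Smith", "card_last4": "4242"}
alice_view = auth.visible_row("alice", "payments", row)
bob_view = auth.visible_row("bob", "payments", row)
```

In this sketch, unauthorized columns are dropped from the result rather than masked, so a user without the privilege sees no sign that the column exists.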


Prior to Sentry, administrators in highly regulated industries had to choose between two bad options: leave their data unprotected or lock users out. The lockout, it should be noted, had to be done at the file level, which also cut users off from non-sensitive data, because sensitive and non-sensitive data are often stored in the same file.

Sentry works at the data level. Provided the right permissions are in place, that opens up not only certain less sensitive data but also the non-sensitive data that was previously inaccessible because of where it was stored.

And while administering all this newly unleashed data might seem burdensome, the team at Cloudera built in multi-tenant administration so that central administrators can delegate management to lower-level admins.

Enterprises Require Tight Information Security

Sentry brings with it tremendous opportunity, says Justin Erickson, Cloudera’s director of product management. It will allow more sensitive information to be stored in Hadoop, it will allow more users to work with the data, and it will give birth to new Big Data use cases in finance, investing, insurance, healthcare, government and other industries.

It should be noted that the team at Cloudera built Sentry as an open source project and is releasing it under the Apache 2.0 license so that it can be integrated with other frameworks across the Hadoop stack. Erickson says that it will be submitted for incubation with the Apache Software Foundation as soon as it is ready, so that the greater community can contribute to its development and work together to further advance the security of the Hadoop platform for enterprise applications.

What the introduction of Sentry means to enterprises and organizations is that Hadoop can now be made more secure and that its power can be unleashed without (as much) risk.

Erickson said that he hasn't yet spoken to an enterprise that hasn't been looking for this.

Sentry is immediately available for free download as an add-on for CDH 4.3. It can be used in conjunction with Hive and Impala 1.1 and is supported as part of the base Cloudera Enterprise subscription.