It’s likely that neither big data vendors nor developers slept well last night because they were all probably thinking about Kinesis, Amazon’s fully managed service for real-time processing of streaming data at massive scale (like that generated by Twitter and the Internet of Things). It was introduced in Las Vegas yesterday at Amazon’s re:Invent Conference and brought on the same kind of enthusiasm among developers as the first iPhone brought to consumers.
Twitter Cheers, Analysts Rave, Competitors ...
Not only did the crowd of 9,000+ Cloud enthusiasts who watched Amazon CTO Werner Vogels unveil it at the Sands Convention Center go wild, but so did Twitter.
Analysts raved as well.
“Amazon Web Services (AWS) pretty much has the big data workload taken care of,” said Wikibon analyst Jeff Kelly on theCUBE. Kinesis handles the real-time processing of streaming data, Redshift handles traditional data warehousing, Elastic MapReduce (EMR) handles open source Hadoop, and DynamoDB serves as the database.
theCUBE’s host, John Furrier, asked, “If they close the loop with Kinesis, who can take on Amazon? Who can provide social on top of the infrastructure?”
The answer? No other vendor can, at least at this point in time.
When it comes to big data processing on the Cloud, only Microsoft with Azure, Google Cloud and maybe IBM come close, and that’s overstating it.
As for how Amazon’s ability to process big data streams in real time affects the vendors we think of when we hear the word “Hadoop,” we reached out to a good number of them, and what we got back was, pretty much, silence. One did slip us some slides, on the condition that we not reveal its identity (or publish the slides, of course).
Is There a Downside to Processing Big Data on Amazon’s Cloud?
The issues the slides brought up were around EMR, not Kinesis, but they included some relevant points about AWS’s vulnerabilities when it comes to issues like:
- Security -- AWS’ policy reads “Because you’re building systems on top of the AWS cloud infrastructure, the security responsibilities will be shared: AWS manages the underlying infrastructure but you must secure anything you put on the infrastructure.”
- Gravity -- Data needs to be moved to the Cloud, then extracted and made available to applications, analytics, the data warehouse ….
- Limited Hadoop Components -- Amazon’s focus is on EMR.
- Costs -- While AWS is great for transient projects, the slides explain, on-premises is less expensive when data is required to stay in the cloud or cluster. The per-minute costs of EMR in this scenario are infinite.
Will Established Enterprises Leap onto Amazon’s Cloud?
Will these concerns keep enterprises from moving to Amazon? In some cases, yes. There are those who simply don’t trust the cloud and, in those cases, the move may not happen until the next generation of leadership steps in.
But there’s no doubt that developers will push the issue, make good arguments for their cases, and win permission to dip their toes in the water on projects where data isn’t sensitive. And if these “experiments” go well (and security isn’t breached) and management discovers that it can get better results faster and cheaper using AWS (and that their “king maker” developers are turned on by their work), they might move more and more of their workloads onto Amazon’s cloud until it seems the only sensible thing to do.
The trick for everyone in the big data space is to catch up with Amazon before that happens, and that won’t be easy to do.
Finally, because it’s Friday, here’s a related funny from Twitter.