If you were looking for all the news that’s fit to print, you’ve come to the wrong place. We read (almost) everything related to big data, including nearly 100 press releases and articles each week, and then bring you the news we think is most remarkable.
Trifacta Narrows the Gap Between Big Data and Insight
Big Data is messy; you don’t need a data scientist to tell you that. It’s made up of structured, unstructured and semi-structured data, and before you can work with it, it needs to be cleaned up. Traditionally this has required hours of painstaking manual coding, creating a costly lag between information and insight, an expense that companies in this data-driven economy can’t afford to bear.
And now they won’t have to, provided they work with Trifacta, that is. Trifacta is a data transformation platform that leverages machine learning and visualization to create “predictive interactions.”
Here is how it works: Trifacta learns as data and business analysts highlight the data they’re interested in. It then generates previews of code that users can keep or discard depending on their usefulness, and it eventually returns the data in usable form.
An impressive group of data scientists, including Jeff Hammerbacher, and industry partners such as Pivotal, Cloudera and Tableau have given Trifacta rave reviews for helping enterprises and organizations reap value out of data faster.
LinkedIn Acquires Bright
A good chunk of LinkedIn’s revenue comes from connecting jobseekers with the companies that need their skills. And though LinkedIn has some of the smartest algorithms in the industry to make that happen, it seems that there’s something it hadn’t yet thought of or built.
That something is the technology behind job hunting site Bright.com. Bright uses an algorithm that calculates a “Bright Score,” which rates, on a 100-point scale, a job applicant’s probability of scoring an initial interview and then making it to the second round for a given position.
Bright created the algorithm(s) over an 18-month period during which it conducted what it calls the largest scientific resume study in the history of the industry. During the study, a team of engineers and data scientists studied the backgrounds of 8.6 million job seekers, 2.6 million resumes and 15 million job openings. They then enlisted over 100 recruiters to tell them why certain resumes made the shortlist and why others didn’t. This is how the Bright algorithm was trained.
With LinkedIn’s acquisition of Bright, its bucketful of data (it probably has more than Bright does) and some of the world’s smartest data scientists, look for the company to bring even greater value to employers and jobseekers.
MemSQL Gives Larry Ellison Another Reason to Worry
Pivotal boss Paul Maritz often talks not only about big data but about “fast data” as well. And there’s a new generation of databases that are knocking the socks off anything we’ve seen before. Though MongoDB and DataStax’s Cassandra are among them, there’s now MemSQL to consider. Earlier this week the company unveiled MemSQL 3.0, which supports both column-oriented and row-oriented data stores. In lighter geek-speak, the new in-memory database (suffice it to say that it’s lightning fast) can process analytical and transactional data in real time, thereby making it possible for one to inform the other.
Does this mean that companies like Oracle will see their customers flee in short order? We don’t think so. But in an economy where information is the raw material, newer companies will likely see little reason to go to (more expensive) second-generation technologies. And the more established enterprises, which will eventually need to compete with the upstarts, will need to invest in faster, next-generation solutions to remain viable.
CNN, Twitter and Dataminr Partner to Redefine Breaking News
If you’ve ever been on Twitter when something newsworthy and remarkable occurs, you know that there’s hardly a better source for information. And while some folks might be interested in discovering what kind of trouble Justin Bieber is getting into at any particular moment in time, others might want to stay up to date on the protests in Kiev or the Olympics in Sochi.
But what about the news that’s being made in the moment, that ABC, NBC and many of us won’t know about for hours or even days? It will be quietly broken on Twitter. And not by a single tweet, but by many single tweets that most people will never see or take note of.
Dataminr sees these tweets. Its technology takes in Twitter’s firehose of data and finds patterns long before they become trends.
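To make the idea of spotting a pattern before it becomes a trend more concrete, here is a minimal sketch in Python. It is not Dataminr’s method, which the announcement does not describe; it simply assumes we already have per-minute counts of tweets mentioning some term and flags any minute whose count jumps well above the recent baseline. The function name `detect_bursts`, the `history` and `threshold` parameters and the sample numbers are all hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def detect_bursts(counts, history=5, threshold=3.0):
    """Return indices of windows whose count spikes above the trailing baseline.

    Illustrative only: a simple z-score test against the previous
    `history` windows, not any vendor's actual algorithm.
    """
    window = deque(maxlen=history)  # the most recent `history` counts
    bursts = []
    for i, count in enumerate(counts):
        if len(window) == history:
            baseline = mean(window)
            # Fall back to 1.0 when the history is perfectly flat,
            # so we never divide by zero.
            spread = pstdev(window) or 1.0
            if (count - baseline) / spread >= threshold:
                bursts.append(i)
        window.append(count)
    return bursts


# Hypothetical per-minute mention counts for a single term.
mentions_per_minute = [2, 3, 2, 4, 3, 2, 3, 25, 40, 38]
print(detect_bursts(mentions_per_minute))
```

With these made-up numbers, the sketch flags the minutes where mentions first surge and stops flagging once the higher volume has become the new baseline, which is roughly the difference between a pattern and an established trend.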
Up until now, the company’s technology has been used primarily by governments and by financial institutions, the latter so that they can “trade before it trends.”
Twitter, CNN and Dataminr have now announced that they have partnered to uncover news in real time as it’s occurring. They plan to sell the solution to newsrooms.
Title image by gkrphoto (Shutterstock)