If you haven’t yet heard of graph databases, get ready. They’re the next hot ticket in a world consumed by big data, analytics and the Internet of Things.
They do things other databases do not do well, like helping us discover insights through relationships between people, places or things.
They don’t so much crunch data as help the world make sense of it. “It’s an amazing concept,” said Philip Rathle, vice president of products at Neo Technology, the commercial company behind the open source graph database Neo4j.
And he doesn’t seem to be the only one who thinks so. The graph database has the highest rate of growth of any kind of database in the world.
Unlike traditional databases, which squeeze data into tables, graph databases work much the same way as the human brain, and they process data similarly as well. They use nodes (which can be a person, a place, a business, a device … just about anything) and the relationships each node has to … whatever.
Companies like eBay, Amazon, LinkedIn, Facebook and Netflix use them to figure out what you might want to buy, who you might know, what movie you might be interested in and so on.
A node might be someone like Bob, and his relationships might be with Bill, his frat brothers and other college chums; Phil, the annoying guy who leaves smelly pastrami sandwiches in the fridge at the office; the Plaid Pussycat boutique, where he has a frequent shopper’s card; Smashburger, where he eats; Miss Barnes, his kid’s kindergarten teacher; the address (gleaned from his mobile) that he ends up at on Thursday nights at 6 PM; the Runkeeper route he runs. We could go on…
A graph could then map out Bob and Bill’s relationships, other Smashburger customers, similar running routes, other runners who run similar routes and so on.
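To make the Bob example concrete, here is a minimal sketch in plain Python. It is not Neo4j's API, just an in-memory illustration of nodes and relationships. The relationship types (FRIEND_OF, WORKS_WITH, EATS_AT) and the customers Alice and Carol are assumptions invented for the example.

```python
from collections import defaultdict

# Each relationship is (start node, relationship type, end node).
# Relationship types and the extra customers are illustrative assumptions.
RELATIONSHIPS = [
    ("Bob", "FRIEND_OF", "Bill"),
    ("Bob", "WORKS_WITH", "Phil"),
    ("Bob", "EATS_AT", "Smashburger"),
    ("Alice", "EATS_AT", "Smashburger"),  # hypothetical customer
    ("Carol", "EATS_AT", "Smashburger"),  # hypothetical customer
]


def build_index(relationships):
    """Index relationships by start node and by end node for fast lookups."""
    outgoing = defaultdict(list)
    incoming = defaultdict(list)
    for start, rel_type, end in relationships:
        outgoing[start].append((rel_type, end))
        incoming[end].append((rel_type, start))
    return outgoing, incoming


def others_sharing(person, rel_type, outgoing, incoming):
    """Find others with the same kind of relationship to the same node."""
    shared = set()
    for out_type, target in outgoing[person]:
        if out_type != rel_type:
            continue
        # Walk back from the shared node to everyone else connected to it.
        for in_type, other in incoming[target]:
            if in_type == rel_type and other != person:
                shared.add(other)
    return sorted(shared)


outgoing, incoming = build_index(RELATIONSHIPS)
fellow_customers = others_sharing("Bob", "EATS_AT", outgoing, incoming)
print(fellow_customers)  # ['Alice', 'Carol']
```

The query never scans a table of all customers. It starts at Bob, follows one relationship out to Smashburger and one back in, which is the kind of hop-by-hop traversal a graph database is built around.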
Needless to say, the relationships and insights that could be gleaned from graph databases are both endless and potentially valuable. They look at causalities via person-to-person connections (social graphs), patterns of behavior, the steps a person might take before they buy something on the web and more.
Detecting Fraud and More
Rathle gave us a few examples of how businesses are using Neo4j that may not readily come to mind.
For example, one Neo Technology customer uses Neo4j to detect fraud. It turns out that thieves, troublemakers and ordinary shoppers don’t move the mouse the same way, which gives retailers a chance to stop the bad guys or to get more information before they part with the goods.
A company like eBay uses graph databases, in select cities, to put purchases into a customer’s hands before an Amazon drone (if one even existed) could. This is made possible by eBay knowing what folks who live in a city like New York tend to buy, storing it locally and having relationships with nearby couriers who have availability.
Couldn’t this have been done with a traditional database? Not fast enough, says Rathle. In fact, he said, eBay initially tried to use a SQL database to accomplish this, but the required query ran to about 700 lines of code. The same query in Neo4j needs ten.
Not only that, but Neo4j, in general, requires one-hundredth as many lines of code and runs as much as 1,000 times faster.
Too Much Data for a Table
In another case, a Global 100 manufacturer used Neo4j to get a handle on sales compensation. Imagine thousands of sales reps, each of whom sells different product categories and product lines, has a different sales territory, reports to a different sales manager and has a different commission schedule.
“Try shoving all of that into a table,” said Rathle. And though he doesn’t doubt that it can be done, the relationships are certainly easier to map and discover using a graph database.
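As a rough illustration of why this kind of data fits a graph, the sketch below models a couple of sales reps as nodes with typed relationships, again in plain Python rather than Neo4j. Every rep, manager, territory, category and schedule name is invented; nothing here reflects the manufacturer's actual data.

```python
# Each relationship is (start node, relationship type, end node).
# All names and relationship types are hypothetical, for illustration only.
EDGES = [
    ("rep:ana", "SELLS", "category:pumps"),
    ("rep:ana", "COVERS", "territory:west"),
    ("rep:ana", "REPORTS_TO", "manager:lee"),
    ("rep:ana", "PAID_BY", "schedule:A"),
    ("rep:raj", "SELLS", "category:pumps"),
    ("rep:raj", "SELLS", "category:valves"),
    ("rep:raj", "COVERS", "territory:east"),
    ("rep:raj", "REPORTS_TO", "manager:kim"),
    ("rep:raj", "PAID_BY", "schedule:B"),
]


def targets(node, rel_type, edges):
    """Nodes that `node` points to through relationships of `rel_type`."""
    return {end for start, rel, end in edges if start == node and rel == rel_type}


def reps_matching(criteria, edges):
    """Reps that have every (relationship type -> target) pair in `criteria`."""
    reps = sorted({start for start, _, _ in edges if start.startswith("rep:")})
    return [
        rep
        for rep in reps
        if all(
            target in targets(rep, rel_type, edges)
            for rel_type, target in criteria.items()
        )
    ]


pump_reps = reps_matching({"SELLS": "category:pumps"}, EDGES)
east_pump_reps = reps_matching(
    {"SELLS": "category:pumps", "COVERS": "territory:east"}, EDGES
)
print(pump_reps)       # ['rep:ana', 'rep:raj']
print(east_pump_reps)  # ['rep:raj']
```

Adding a new dimension, such as a product line or a second manager, only means adding more relationships. No columns have to be added and no join tables redesigned, which is the point Rathle is making.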
Why Did We Wait So Long?
While the concept behind graph databases has been around for some time, the technology to make them easy to use has not. The first line of code for Neo4j, an open source project, was written in 2000.
Neo Technology, which provides an enterprise-grade edition of Neo4j, wasn’t founded until 2010. Neo4j 2.0, which makes the database easier to use, was released in January.
Loading Data into Neo4j is Fast and Easy
Today Neo Technology announces the availability of Neo4j 2.1, which makes it easier to load data into Neo4j. It accomplishes this by providing easy methods for mapping tabular data from CSV files into Neo4j, and the import runs up to 100 times faster than previously existing methods.
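The idea of mapping tabular data into a graph can be sketched in a few lines of Python. This is not Neo4j's import mechanism, only an illustration of the mapping step: each CSV row becomes two nodes and one relationship between them. The column names, the rows and the EATS_AT relationship type are all assumptions.

```python
import csv
import io

# Hypothetical CSV content: column names and rows are invented for illustration.
CSV_TEXT = """person,restaurant
Bob,Smashburger
Alice,Smashburger
Bob,Taco Stand
"""


def csv_to_graph(text):
    """Turn each CSV row into two labeled nodes and one relationship."""
    nodes = set()
    relationships = []
    for row in csv.DictReader(io.StringIO(text)):
        person = ("Person", row["person"])
        restaurant = ("Restaurant", row["restaurant"])
        # A set keeps each node once, even if it appears in many rows.
        nodes.update([person, restaurant])
        relationships.append((person, "EATS_AT", restaurant))
    return nodes, relationships


nodes, relationships = csv_to_graph(CSV_TEXT)
print(len(nodes), len(relationships))  # 4 3
```

Three rows produce only four nodes because Bob and Smashburger each appear twice in the file but exist once in the graph. The repetition in the table turns into extra relationships rather than duplicate records.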
Anybody Else Out There?
You’d think, given all the insights that graph databases can help us glean, that Neo Technology would have a lot of competition, but that’s not the case.
“Our biggest competitor is people not knowing about us,” said Rathle.
We have a feeling that’s about to change. So do the researchers at Gartner, by the way. Graph databases were at the entry point or “Innovation Trigger” stage on Gartner’s big data hype cycle in 2013.
Jumping on Board
Even with the lack of hype, Neo Technology already has a rather impressive customer list. There’s Adobe, CareerBuilder, Cisco, Glassdoor, First Data, eHarmony, HP, TechCrunch and many, many others.
Title image by Palis Michalis / Shutterstock.