For over a decade the Semantic Web has been maligned, misconstrued and misunderstood. It’s been overhyped by its supporters while its critics have hung the albatross of artificial intelligence around its neck. Even its successes have been understated, often coming with little fanfare and without the mindshare and hype surrounding other trends such as Web 2.0, NoSQL or Big Data.
So I wouldn’t fault you in the slightest if you were surprised, confused or downright skeptical when I claim that the Semantic Web is emerging as the technology of choice for tackling some of today’s most pressing challenges in enterprise information management. This article is the first in a series that will introduce and explain Semantic Web technologies and their role in enterprise information management today.
The World Wide Web Today: A Web of Documents
The World Wide Web as we know it today is a Web of linked documents, full of content intended to be displayed for humans. The information within these documents (web pages, videos, images, etc.) is completely opaque to computers, and so almost impossible to leverage in any automated way.
So for all its pervasiveness and success, today’s Web does not realize the extent of Sir Tim Berners-Lee’s vision for a fully Semantic Web. Alongside Jim Hendler and Ora Lassila, Berners-Lee introduced the Semantic Web to the world in a 2001 article in Scientific American. This article set forth a world where software agents scurry across the Web, discovering and consuming structured content and semantic relationships that allow them to automate many aspects of our lives, such as scheduling doctor appointments by coordinating our own calendars, our doctors’ availability, and our health insurance requirements. The Semantic Web, then, is a Web of linked data.
The Semantic Web: A Web of Linked Data
This is a great vision. And since 1999, hundreds of passionate and talented individuals from academia, government and industry have banded together under the auspices of the World Wide Web Consortium (the W3C) to advance the Semantic Web project. They’ve made a lot of progress.
Today there are hundreds of data sets across the Web linked together at the data element level by millions of relationships, altogether comprising hundreds of billions of individual pieces of data. It’s all connected data, and it can be browsed, searched and queried independently of the usual hyperlink structure of the (document-based) Web. And while we don’t yet have agents running most aspects of our lives, the foundation is in place for realizing the Semantic Web vision (Siri, I’m looking at you).
Billions of pieces of interlinked data didn’t pop up on the Web overnight. Over the past 15 years, the W3C has produced nearly a dozen technology standards for defining the Semantic Web. These standards include a flexible data model for structured content on the Web; schema, taxonomy and ontology languages for describing this data; a distributed query language for accessing the data; rules languages for drawing conclusions from the data; markup languages for embedding data inside HTML; and more.
All told, the W3C Semantic Web standards make up a family of technologies, designed to work well together and designed to cope with the extreme diversity, decentralization and ever-changing nature of data on the Web.
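To make the idea of a flexible data model plus a query language a little more concrete, here is a minimal sketch in plain Python. It is loosely modeled on the subject-predicate-object statements that the W3C data model is built on, but it is a toy and not an implementation of any of the standards. Every name in it (the `ex:` identifiers, `match`, `doctors_covering`) is an assumption made up for illustration, echoing the doctor-appointment scenario above.

```python
from typing import Iterable, Iterator, Optional, Set, Tuple

# A statement is a (subject, predicate, object) triple of strings.
Triple = Tuple[str, str, str]

# Hypothetical facts; every "ex:" name is invented for this sketch.
TRIPLES: Set[Triple] = {
    ("ex:alice", "ex:insuredBy", "ex:acmeHealth"),
    ("ex:alice", "ex:hasDoctor", "ex:drSmith"),
    ("ex:alice", "ex:hasDoctor", "ex:drJones"),
    ("ex:drSmith", "ex:acceptsInsurance", "ex:acmeHealth"),
    ("ex:drJones", "ex:acceptsInsurance", "ex:otherHealth"),
}


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples that fit the pattern; None acts as a wildcard."""
    for triple in triples:
        if s is not None and triple[0] != s:
            continue
        if p is not None and triple[1] != p:
            continue
        if o is not None and triple[2] != o:
            continue
        yield triple


def doctors_covering(patient: str) -> list:
    """Return the patient's doctors who accept one of the patient's insurers."""
    insurers = {o for _, _, o in match(TRIPLES, s=patient, p="ex:insuredBy")}
    doctors = {o for _, _, o in match(TRIPLES, s=patient, p="ex:hasDoctor")}
    covered = {
        doctor
        for doctor in doctors
        for insurer in insurers
        if any(match(TRIPLES, s=doctor, p="ex:acceptsInsurance", o=insurer))
    }
    return sorted(covered)


print(doctors_covering("ex:alice"))
```

The point of the sketch is that the data carries no fixed table layout: a new kind of fact is just another triple, and a query is a pattern over triples that follows links from one statement to the next.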
Enterprises Need the Semantic Web
Which brings us to the challenges of modern-day enterprise information management. Consider some enterprise information trends over the past five years:
- Key enterprise data assets are no longer confined to predictably structured transactional databases and data warehouses. Decision makers are relying on data buried in spreadsheets, in emails, in Access databases and in documents.
- Companies are increasingly drawing on data from outside their organization. They’re pulling together information from supply chain partners, from customers, from social media sites, from websites and public web databases.
- The information needed today is different from the information needed tomorrow, next week or next month. Changes come from everywhere: internal strategy changes, competitive pressures, new regulatory requirements.
All of these trends demand an information management approach that can link together data from any source and in any format without requiring extensive upfront planning. This is exactly what Semantic Web technologies were designed for, with one difference: if these technologies work at the scale of the Web, they’ll work even better inside large enterprises, which are, after all, a tiny fraction of the size and complexity of the Web.
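As a rough illustration of linking data from different sources without upfront planning, the following Python sketch turns records from two differently shaped sources into simple statements and merges them. It is a toy under stated assumptions, not how any particular product works: the sample records, the `ex:product/` identifier scheme, the `sheet` and `partner` labels and the `to_statements` helper are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A statement is a (subject, property, value) tuple of strings.
Statement = Tuple[str, str, str]

# Hypothetical sample data: a spreadsheet export and a partner feed
# that happen to share a "sku" field but nothing else.
spreadsheet_rows: List[Dict[str, str]] = [{"sku": "42", "price": "9.99"}]
partner_feed: List[Dict[str, str]] = [{"sku": "42", "lead_time_days": "14"}]


def to_statements(
    records: List[Dict[str, str]], key: str, source: str
) -> Set[Statement]:
    """Turn each record into statements about a shared product identifier."""
    statements: Set[Statement] = set()
    for record in records:
        subject = f"ex:product/{record[key]}"
        for field, value in record.items():
            if field != key:
                statements.add((subject, f"{source}:{field}", value))
    return statements


# Merging is a plain set union; no shared schema was designed first.
merged = to_statements(spreadsheet_rows, "sku", "sheet") | to_statements(
    partner_feed, "sku", "partner"
)

# Everything known about product 42, regardless of where it came from.
facts = {prop: value for subj, prop, value in merged if subj == "ex:product/42"}
print(facts)
```

If the partner feed gains a new field next month, it simply shows up as one more statement about the same product, with no change to the merge step.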