For over a decade the Semantic Web has been maligned, misconstrued and misunderstood. It’s been overhyped by its supporters while its critics have hung the albatross of artificial intelligence around its neck. Even its successes have been understated, often coming with little fanfare and without the mindshare and hype surrounding other trends such as Web 2.0, NoSQL or Big Data.
So I wouldn’t fault you in the slightest if you were surprised, confused or downright skeptical when I claim that the Semantic Web is emerging as the technology of choice for tackling some of today’s most pressing challenges in enterprise information management. This article is the first in a series that will introduce and explain Semantic Web technologies and their role in enterprise information management today.
The World Wide Web Today: A Web of Documents
The World Wide Web as we know it today is a Web of linked documents, full of content intended to be displayed for humans. The information within these documents (web pages, videos, images, etc.) is completely opaque to computers, and so almost impossible to leverage in any automated way.
So for all its pervasiveness and success, today’s Web does not realize the full extent of Sir Tim Berners-Lee’s vision for a Semantic Web. Alongside Jim Hendler and Ora Lassila, Berners-Lee introduced the Semantic Web to the world in a 2001 article in Scientific American. This article envisioned a world where software agents scurry across the Web, discovering and consuming structured content and semantic relationships that allow them to automate many aspects of our lives, such as scheduling doctor appointments by coordinating our own calendars, our doctors’ availability, and our health insurance requirements. The Semantic Web, then, is a Web of linked data.
The Semantic Web: A Web of Linked Data
This is a great vision. And since 1999, hundreds of passionate and talented individuals from academia, government and industry have banded together under the auspices of the World Wide Web Consortium (the W3C) to advance the Semantic Web project. They’ve made a lot of progress.
Today there are hundreds of data sets across the Web linked together at the data element level by millions of relationships, altogether comprising hundreds of billions of individual pieces of data. It’s all connected data, and it can be browsed, searched and queried independently of the usual hyperlink structure of the (document-based) Web. And while we don’t yet have agents running most aspects of our lives, the foundation is in place for realizing the Semantic Web vision (Siri, I’m looking at you).
Billions of pieces of interlinked data didn’t pop up on the Web overnight. Over the past 15 years, the W3C has produced nearly a dozen technology standards that define the Semantic Web. These standards include a flexible data model for structured content on the Web; schema, taxonomy and ontology languages for describing this data; a distributed query language for accessing the data; rules languages for drawing conclusions from the data; markup languages for sticking data inside HTML; and more.
All told, the W3C Semantic Web standards make up a family of technologies, designed to work well together and designed to cope with the extreme diversity, decentralization and ever-changing nature of data on the Web.
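To make the idea of a Web of linked data a little more concrete, here is a minimal sketch of data expressed as individual statements that link to one another, plus a tiny query over those links. It is a toy in plain Python, not an implementation of any W3C standard or library. The `http://example.org/` identifiers, the property names, and the `match` and `doctors_covered_for` helpers are all hypothetical, loosely modeled on the doctor-appointment scenario above.

```python
# Toy illustration only: plain Python, not an RDF library or a W3C API.
# Every identifier under http://example.org/ is hypothetical.

EX = "http://example.org/"

# Each piece of data is one (subject, predicate, object) statement.
triples = {
    (EX + "alice", EX + "hasDoctor", EX + "drSmith"),
    (EX + "drSmith", EX + "worksAt", EX + "cityClinic"),
    (EX + "drSmith", EX + "accepts", EX + "acmeHealthPlan"),
    (EX + "alice", EX + "insuredBy", EX + "acmeHealthPlan"),
}


def match(pattern, data):
    """Return every statement matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in data
        if all(p is None or p == v for p, v in zip(pattern, t))
    )


def doctors_covered_for(patient, data):
    """Follow links across statements: the patient's doctors who accept
    at least one of the patient's insurance plans."""
    plans = {o for _, _, o in match((patient, EX + "insuredBy", None), data)}
    doctors = {o for _, _, o in match((patient, EX + "hasDoctor", None), data)}
    return sorted(
        d for d in doctors
        if any((d, EX + "accepts", plan) in data for plan in plans)
    )


print(doctors_covered_for(EX + "alice", triples))
```

Because every statement stands on its own and refers to things by global identifiers, statements published by different sources (a patient's records, a clinic, an insurer) could be merged into one set and queried together, which is the property the paragraph above describes.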
Enterprises Need the Semantic Web
Which brings us to the challenges of modern-day enterprise information management. Consider some enterprise information trends over the past five years: