For over a decade the Semantic Web has been maligned, misconstrued and misunderstood. It’s been overhyped by its supporters while its critics have hung the albatross of artificial intelligence around its neck. Even its successes have been understated, often coming with little fanfare and without the mindshare and hype surrounding other trends such as Web 2.0, NoSQL or Big Data.
So I wouldn’t fault you in the slightest if you were surprised, confused or downright skeptical when I claim that the Semantic Web is emerging as the technology of choice for tackling some of today’s most pressing challenges in enterprise information management. This article is the first in a series that will introduce and explain Semantic Web technologies and their role in enterprise information management today.
The World Wide Web Today: A Web of Documents
The World Wide Web as we know it today is a Web of linked documents, full of content intended to be displayed for humans. The information within these documents (web pages, videos, images, etc.) is completely opaque to computers, and so almost impossible to leverage in any automated way.
So for all its pervasiveness and success, it does not realize the extent of Sir Tim Berners-Lee’s vision for a fully Semantic Web. Alongside Jim Hendler and Ora Lassila, Berners-Lee introduced the Semantic Web to the world in a 2001 article in Scientific American. This article set forth a world where software agents scurry across the Web, discovering and consuming structured content and semantic relationships that allow them to automate many aspects of our lives, such as scheduling doctor appointments by coordinating our own calendars, our doctors’ availabilities, and our health insurance requirements. The Semantic Web, then, is a Web of linked data.
The Semantic Web: A Web of Linked Data
This is a great vision. And since 1999, hundreds of passionate and talented individuals from academia, government and industry have banded together under the auspices of the World Wide Web Consortium (the W3C) to advance the Semantic Web project. They’ve made a lot of progress.
Today there are hundreds of data sets across the Web linked together at the data element level by millions of relationships, altogether comprising hundreds of billions of individual pieces of data. It’s all connected data, and it can be browsed, searched and queried independently of the usual hyperlink structure of the (document-based) Web. And while we don’t yet have agents running most aspects of our lives, the foundation is in place for realizing the Semantic Web vision (Siri, I’m looking at you).
Billions of pieces of interlinked data didn’t pop up on the Web overnight. Over the past 15 years, the W3C has produced nearly a dozen technology standards for defining the Semantic Web. These standards include a flexible data model for structured content on the Web; schema, taxonomy and ontology languages for describing this data; a distributed query language for accessing the data; rules languages for drawing conclusions from the data; markup languages for sticking data inside HTML; and more.
All told, the W3C Semantic Web standards make up a family of technologies, designed to work well together and designed to cope with the extreme diversity, decentralization and ever-changing nature of data on the Web.
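To make the ideas of a data model and a query language a little more concrete, here is a minimal Python sketch. It is an illustration only: it models data as subject-predicate-object statements (the shape used by the W3C's RDF data model) with plain tuples, and the `match` helper is a hypothetical stand-in for a pattern query, not the W3C query language itself. The `ex:` identifiers and the facts are made up for the example.

```python
from typing import Iterable, Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# A tiny "graph": every fact is one (subject, predicate, object) statement.
# All identifiers below are hypothetical examples.
graph: Set[Triple] = {
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", "ex:worksFor", "ex:acme"),
    ("ex:acme", "ex:locatedIn", "ex:boston"),
}


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for t in triples:
        if (s is None or t[0] == s) and (p is None or t[1] == p) and (o is None or t[2] == o):
            yield t


# "Who works for ex:acme?" expressed as a pattern with an open subject.
employees = sorted(subj for subj, _, _ in match(graph, p="ex:worksFor", o="ex:acme"))
print(employees)  # ['ex:alice', 'ex:bob']
```

Real Semantic Web tools add globally unique identifiers, typed values and a full query language on top of this shape, but the underlying idea of small, uniform statements is the same.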
Enterprises Need the Semantic Web
Which brings us to the challenges of modern-day enterprise information management. Consider some enterprise information trends over the past five years:
- Key enterprise data assets are no longer confined to predictably structured transactional databases and data warehouses. Decision makers are relying on data buried in spreadsheets, in emails, in Access databases and in documents.
- Companies are increasingly drawing on data from outside their organization. They’re pulling together information from supply chain partners, from customers, from social media sites, from websites and public web databases.
- The information needed today is different from the information needed tomorrow, next week or next month. Changes come from everywhere: internal strategy changes, competitive pressures, new regulatory requirements.
All of these trends demand an information management approach that can link together data from any source and in any format without requiring extensive upfront planning. This is exactly what Semantic Web technologies were designed for, with one difference: they were built for the Web itself. If these technologies work at the scale of the Web, they’ll work even better inside large enterprises, which are, after all, a tiny fraction of the size and complexity of the Web.
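As a rough illustration of why this approach needs little upfront planning, the Python sketch below merges facts from two sources with a simple set union. The sources, identifiers and property names are all hypothetical, and plain tuples stand in for a real Semantic Web data store. The point is only that when every fact is a self-describing statement keyed by a shared identifier, adding a new source does not require redesigning a schema first.

```python
# Facts exported from a hypothetical internal spreadsheet.
spreadsheet_facts = {
    ("ex:supplier42", "ex:name", "Example Parts Ltd"),
    ("ex:supplier42", "ex:contractEnds", "2025-06-30"),
}

# Facts gathered from a hypothetical public web database.
web_facts = {
    ("ex:supplier42", "ex:headquarters", "ex:rotterdam"),
    ("ex:supplier42", "ex:name", "Example Parts Ltd"),  # same fact, seen twice
}

# Merging is just a union: duplicate statements collapse automatically.
merged = spreadsheet_facts | web_facts


def describe(graph, subject):
    """Collect every property known about a subject, whatever its source."""
    props = {}
    for s, p, o in graph:
        if s == subject:
            props.setdefault(p, set()).add(o)
    return props


profile = describe(merged, "ex:supplier42")
print(sorted(profile))  # ['ex:contractEnds', 'ex:headquarters', 'ex:name']
```

A third source with properties nobody anticipated could be merged the same way, which is the kind of flexibility the trends above call for.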
The Semantic Web in Use
So how are enterprises using Semantic Web technologies today?
- In the pharmaceutical industry, R&D and informatics groups are employing Semantic Web technologies to incrementally construct extensive knowledge bases that link together information from basic research to drug discovery and development to clinical trials to post-approval sales, marketing and pharmacovigilance.
The flexibility of the Semantic Web technology stack means that pharma companies can make gradual investments to link together more and more information. They end up using that information to answer questions as diverse as finding new off-label indications for existing drugs, understanding why physicians in one part of the world are less likely to prescribe a particular class of drugs, and identifying the next high-value druggable targets to focus R&D around.
- Publishers use Semantic Web technologies to classify content and dynamically link together related pieces of content in ways that would be prohibitively expensive to do manually or with conventional technologies. By representing key topics semantically, publishers build up metadata graphs that are used to give users a rich experience when browsing and searching from one piece of content to the next.
- Government agencies are embracing semantic technologies for everything from identifying key personnel resources to providing public data sets in a standard, open way that can easily be consumed by citizens, non-profits, watchdog groups, journalists, etc.
- Financial services companies, many of which have been establishing Chief Data Officers in recognition of the key value of data to their businesses, have increasingly been turning to semantics as the cheapest and most effective way to provide an intelligent access layer on top of their information assets. Whether this takes the form of semantic MDM systems, SOA governance projects, semantic metadata repositories, or IT asset management systems, the goal is the same: to find a way for people across the business to get access to the key bits of information that they need at any given time, even if those bits are scattered in data silos across the company.
In future articles I’ll look at how Semantic Web technologies enable such a broad spectrum of use cases, and we’ll also look at some of these particular uses in more detail.