To adapt our architecture to the complexity of the many different Olympic events and disciplines, we made one significant change: we added a MarkLogic XML database. In the words of Senior Technical Architect David Rogers:
"Fundamental to this approach was the use of MarkLogic to store and retrieve data. MarkLogic is an XML database which uses XQuery to store and retrieve data. Given the timescales, this project would not have been achievable using a SQL database, which would have pushed the design towards more complete modelling of the data. Using MarkLogic, we could write a complete XML document, and retrieve that document either by reference to its location, a URI, or using XQuery to define criteria about its contents."
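To make the two retrieval styles in that quote concrete, here is a minimal Python sketch. It is not the MarkLogic API and uses no XQuery: the `XmlDocumentStore` class, the URIs and the sample event documents are all hypothetical, and a plain in-memory dictionary stands in for the database. It only illustrates the idea of writing a whole XML document and then fetching it either by its URI or by criteria about its contents.

```python
import xml.etree.ElementTree as ET


class XmlDocumentStore:
    """Toy in-memory stand-in for an XML document database (hypothetical)."""

    def __init__(self):
        self._docs = {}  # maps URI -> parsed XML root element

    def put(self, uri, xml_text):
        # Store the complete document as written; no relational schema needed.
        self._docs[uri] = ET.fromstring(xml_text)

    def get(self, uri):
        # Retrieval by location: look the document up by its URI.
        return self._docs[uri]

    def find(self, path, text):
        # Retrieval by content: URIs of documents in which some element
        # matched by `path` has exactly the given text.
        return sorted(
            uri for uri, root in self._docs.items()
            if any(el.text == text for el in root.iterfind(path))
        )


# Hypothetical sample documents, invented for illustration only.
store = XmlDocumentStore()
store.put("/events/100m-final.xml",
          "<event><sport>Athletics</sport><name>Men's 100m Final</name></event>")
store.put("/events/team-sprint.xml",
          "<event><sport>Cycling</sport><name>Men's Team Sprint</name></event>")

print(store.get("/events/100m-final.xml").findtext("name"))  # by URI
print(store.find("sport", "Cycling"))                        # by content
```

The point of the sketch is the design trade-off Rogers describes: the documents are stored whole, so nothing about their internal structure has to be modelled up front, and queries are expressed against whatever structure the documents happen to have.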
LF: Are there other uses of Semantic Web technology not related to content publishing that are being explored within the BBC?
BBC: We are currently exploring various other uses of Semantic Web technologies within BBC R&D. In particular, we’re looking at ways in which Linked Data can be used to help search and discovery of archive content. We have been working on automatically identifying the topics and the contributors for BBC programmes from their content, using a combination of Linked Data, signal processing, speech-to-text and Named Entity Recognition technologies. We have been talking about this work in various places, such as the Linked Data on the Web workshop and WWW’2012. The automatically generated links from programmes to entities described in the Linked Data cloud might be incorrect in places, so we are also exploring how users can validate or correct those links, and how this feedback can be taken into account within our automated interlinking workflow. We are planning to write in more detail about our experiments in that space on the BBC R&D blog in the next couple of weeks.
LF: What are your plans going forwards?
BBC: We are currently annotating quite a lot of our content with Linked Data URIs to drive a number of aggregations on our site, but we are making little use of the connections between all these URIs. So far, we have only been using those connections in our automated tagging tools, to disambiguate between candidate identifiers. There is a big opportunity in using them for storytelling purposes: using paths in that graph of data to help tell stories around our content. It becomes even more of an opportunity if we start describing the content of individual programmes in more detail, such as describing the narrative structure of dramas, for example. We started some investigation in that area in our Mythology Engine project, but there is much more that could be done.
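As a rough illustration of what "using paths in that graph of data" could mean in practice, the Python sketch below runs a breadth-first search over a handful of triples to find the shortest chain of connections between two programmes. Everything in it is an assumption made for the example: the identifiers, the predicates and the `build_graph` and `shortest_path` helpers are hypothetical and are not taken from the BBC's systems or data.

```python
from collections import deque

# Hypothetical (subject, predicate, object) triples; not real BBC data.
TRIPLES = [
    ("programme:A", "features", "person:X"),
    ("person:X", "memberOf", "team:T"),
    ("team:T", "competedIn", "event:E"),
    ("programme:B", "about", "event:E"),
]


def build_graph(triples):
    # Index each triple in both directions so a path can follow a link
    # forwards or backwards; "^" marks a predicate traversed in reverse.
    graph = {}
    for s, p, o in triples:
        graph.setdefault(s, []).append((p, o))
        graph.setdefault(o, []).append(("^" + p, s))
    return graph


def shortest_path(graph, start, goal):
    # Breadth-first search; returns a list of (from, predicate, to) steps,
    # an empty list if start == goal, or None if no path exists.
    queue = deque([start])
    came_from = {start: None}
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for pred, nxt in graph.get(node, []):
            if nxt not in came_from:
                came_from[nxt] = (node, pred)
                queue.append(nxt)
    if goal not in came_from:
        return None
    steps = []
    node = goal
    while came_from[node] is not None:
        prev, pred = came_from[node]
        steps.append((prev, pred, node))
        node = prev
    return list(reversed(steps))


graph = build_graph(TRIPLES)
path = shortest_path(graph, "programme:A", "programme:B")
for src, pred, dst in path:
    print(f"{src} --{pred}--> {dst}")
```

Each step in the returned path is a candidate "beat" in a story linking the two programmes; a real system would presumably need to rank or filter paths, which this sketch does not attempt.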
I think there are several lessons to learn from the BBC’s experience with Semantic Web technologies:
- Embracing these technologies was an evolutionary process; it started with a general philosophy, rolled out incrementally, and ended up providing a significant strategic advantage.
- The BBC invested a great deal of energy in being able to clearly articulate the vision and the value of the Semantic Web approach on their various blogs, and in doing so sought to engage a much larger community beyond the BBC.
- Semantic Web technologies are not an end in themselves. While they play a crucial role in what the BBC has accomplished with dynamic site publishing, there are many other technologies (such as XML, Silverlight and standard HTTP) that need to come together for this application.
My thanks to Yves, Michael, and Olivier for taking the time to contribute their experiences for us all.
Editor's Note: To read the beginning of Lee's series on the Semantic Web, see The Semantic Web and the Modern Enterprise.
About the Author
Lee Feigenbaum (@LeeFeigenbaum), co-founder and VP of Marketing for Cambridge Semantics, is a leading expert in Semantic Web technologies and their applicability to enterprise IT challenges. He is passionate about helping information professionals use semantic technology to solve enterprise information management challenges.
Lee is the editor-in-chief and a core contributor to Semantic University, a free online resource for learning about Semantic Web technologies.