To adapt our architecture to the complexity of the many Olympic events and disciplines, we made one significant change: we added a MarkLogic XML database. In the words of Senior Technical Architect David Rogers:
"Fundamental to this approach was the use of MarkLogic to store and retrieve data. MarkLogic is an XML database which uses XQuery to store and retrieve data. Given the timescales, this project would not have been achievable using a SQL database, which would have pushed the design towards more complete modelling of the data. Using MarkLogic, we could write a complete XML document, and retrieve that document either by reference to its location, a URI, or using XQuery to define criteria about its contents."
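To make the two retrieval styles in that quote concrete, here is a minimal sketch in Python. It is not MarkLogic's API and it does not use XQuery: the `DocumentStore` class, its methods, the URIs and the sample documents are all hypothetical, and the content query is a simple tag/value match from the standard library. It only illustrates the idea of writing a whole XML document and then fetching it either by URI or by criteria about its contents.

```python
import xml.etree.ElementTree as ET


class DocumentStore:
    """Toy in-memory stand-in for an XML document database (hypothetical)."""

    def __init__(self):
        self._docs = {}  # URI -> parsed XML root element

    def put(self, uri, xml_text):
        # Store the complete document as-is; no up-front data modelling.
        self._docs[uri] = ET.fromstring(xml_text)

    def get(self, uri):
        # Retrieval by location: look the document up by its URI.
        return self._docs[uri]

    def find(self, tag, value):
        # Retrieval by content: URIs of documents whose direct child
        # element <tag> has exactly the given text.
        return sorted(
            uri for uri, root in self._docs.items()
            if root.findtext(tag) == value
        )


store = DocumentStore()
store.put("/events/athletics/100m.xml",
          "<event><sport>Athletics</sport><name>100m Final</name></event>")
store.put("/events/swimming/200m-free.xml",
          "<event><sport>Swimming</sport><name>200m Freestyle</name></event>")

by_uri = store.get("/events/athletics/100m.xml").findtext("name")
by_content = store.find("sport", "Swimming")
```

In this sketch neither document needed a schema before it was stored, which is the property the quote contrasts with a SQL design.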
LF: Are there other uses of Semantic Web technology not related to content publishing that are being explored within the BBC?
BBC: We are currently exploring various other uses of Semantic Web technologies within BBC R&D. In particular we’re looking at ways in which Linked Data can be used to help search and discovery of archive content. We have been working on automatically identifying the topics and the contributors for BBC programmes from their content, using a combination of Linked Data, signal processing, speech-to-text and Named Entity Recognition technologies, which we have been talking about in various places, such as the Linked Data on the Web workshop and at WWW’2012. The automatically generated links from programmes to entities described in the Linked Data cloud might be incorrect in places, so we are also exploring how users can validate or correct those links, and how this feedback can be taken into account within our automated interlinking workflow. We are planning to write in more detail about our experiments in that space on the BBC R&D blog in the next couple of weeks.
LF: What are your plans going forwards?
BBC: We are currently annotating quite a lot of our content with Linked Data URIs to drive a number of aggregations on our site, but we are making little use of the connections between all these URIs. So far, we have only been using those in our automated tagging tools, to disambiguate between candidate identifiers. There is a big opportunity in using those connections for storytelling purposes: using paths in that graph of data to help tell stories around our content. It becomes even more of an opportunity if we start describing the content of individual programmes in more detail, such as describing the narrative structure of dramas, for example. We started some investigation in that area in our Mythology Engine project, but there is much more that could be done.
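The idea of "using paths in that graph of data" can be illustrated with a small sketch. This is an assumption-laden example and not the BBC's implementation: the identifiers and links below are invented placeholders, and breadth-first search is simply one straightforward way to find a shortest chain of connections between two annotated items.

```python
from collections import deque

# Hypothetical (subject, predicate, object) links; not real BBC data.
TRIPLES = [
    ("programme:A", "features", "person:X"),
    ("person:X", "memberOf", "org:Y"),
    ("org:Y", "basedIn", "place:Z"),
    ("programme:B", "about", "place:Z"),
]


def build_graph(triples):
    # Treat links as undirected so a path can follow them in either direction.
    graph = {}
    for s, p, o in triples:
        graph.setdefault(s, []).append((p, o))
        graph.setdefault(o, []).append((p, s))
    return graph


def find_path(graph, start, goal):
    # Breadth-first search: returns a shortest list of nodes, or None.
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _predicate, neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(TRIPLES)
path = find_path(graph, "programme:A", "programme:B")
```

Here the two placeholder programmes share no tag directly, yet the path through a person, an organisation and a place connects them, which is the kind of link a story could be built around.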
I think there are several lessons to learn from the BBC’s experience with Semantic Web technologies:
- Embracing these technologies was an evolutionary process; it started with a general philosophy, rolled out incrementally, and ended up providing a significant strategic advantage.
- The BBC invested a great deal of energy in being able to clearly articulate the vision and the value of the Semantic Web approach on their various blogs, and in doing so sought to engage a much larger community beyond the BBC.
- Semantic Web technologies are not an end in themselves. While they play a crucial role in what the BBC has accomplished with dynamic site publishing, there are many other technologies (such as XML, Silverlight and standard HTTP) that need to come together for this application.
My thanks to Yves, Michael, and Olivier for taking the time to contribute their experiences for us all.
Editor's Note: To read the beginning of Lee's series on the semantic web, see The Semantic Web and the Modern Enterprise.
About the Author
Lee Feigenbaum (@LeeFeigenbaum), co-founder and VP of Marketing for Cambridge Semantics, is a leading expert in Semantic Web technologies and their applicability to enterprise IT challenges. He is passionate about helping information professionals use semantic technology to solve enterprise information management challenges.
Lee is the editor-in-chief and a core contributor to Semantic University, a free online resource for learning about Semantic Web technologies.