- 2010: Published the World Cup website using a BigOWLIM triple store [LF: a triple store is a database that stores RDF data]. News articles were tagged with entities in the triple store, and inference was used to propagate those tags to all relevant entities through the graph.
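The propagation step can be illustrated with a toy graph: tagging an article with one entity implies tags for related entities reachable from it. All identifiers below are hypothetical, and the hand-rolled traversal stands in for what the BBC actually did with OWL inference in the triple store.

```python
# Toy sketch of tag propagation: tagging an article with one entity
# propagates the tag to related entities by walking the graph.
# Hypothetical data; the real system used RDF and inference in a triple store.

# Directed edges: entity -> entities the tag should propagate to.
graph = {
    "player:messi": ["team:argentina"],
    "team:argentina": ["group:B"],
    "group:B": [],
}

def propagate_tags(seed_entities, graph):
    """Return the seed tags plus every entity reachable from them."""
    tags, stack = set(), list(seed_entities)
    while stack:
        entity = stack.pop()
        if entity not in tags:
            tags.add(entity)
            stack.extend(graph.get(entity, []))
    return tags

# An article tagged only with the player ends up tagged
# with the team and the group as well.
article_tags = propagate_tags(["player:messi"], graph)
print(sorted(article_tags))
# ['group:B', 'player:messi', 'team:argentina']
```

This is why a single editorial tag can surface an article on many aggregation pages at once.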
- 2011: Rolled out the World Cup approach across the whole of BBC Sport.
- 2012: Rolled out the Olympics site using the same model as BBC Sport.
LF: Could you describe the main use cases of Semantic Web technologies at the BBC? Would you characterize these use cases as “dynamic content publishing”?
BBC: Our use of Linked Data breaks down into three areas:
[LF: the term Linked Data refers to a specific set of best practices for working with Semantic Web (RDF) data; the term Linked Open Data refers to Linked Data that is freely available on the Web.]
- Publishing Linked Data: to make our content more findable (e.g. by search engines) and more linkable (e.g. via social media or by other Linked Data publishers using the same vocabularies and identifiers);
- Consuming Linked Data: to “borrow” additional context for our content where we don’t have existing data and want to cut content by specific domains (music, nature, food, sport). The Linked Open Data that we use also helps give us additional links between domains.
- Managing data internally as Linked Data: to maximize the use we get out of editorial input by propagating editorially added links across data graphs; to make more links between otherwise siloed sites.
It’s not really accurate to call our use cases “dynamic content publishing.” Our actual content (TV and radio programmes and news articles) is still fairly static. The Linked Data / Domain Driven Design approach is less about dynamic content and more about dynamic context and dynamic aggregations around that content, which let us maximize our content’s exposure by placing it in different contexts (wildlife, music, food, football, etc.).
Because bbc.co.uk has content in so many domains, it’s like a microcosm of the web. One of our goals with this work is to move from a set of siloed sites to a coherent service, which we can only do if our content is well described and interlinked. Finally, by using domain-native URL keys we can generate more inbound link density and make our content more findable on search engines.
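The interlinking described above can be sketched with two hypothetical silos whose pages share entity identifiers; in practice these would be RDF resources sharing URIs, not Python dicts.

```python
# Sketch: shared Linked Data identifiers let otherwise siloed sites interlink.
# The page names and entity identifiers here are invented for illustration.

nature_pages = {"nature/red-kite": {"dbpedia:Red_kite"}}
news_articles = {"news/raptor-recovery": {"dbpedia:Red_kite", "dbpedia:Wales"}}

def cross_links(site_a, site_b):
    """Pair up pages from two silos that share at least one entity identifier."""
    return [
        (page_a, page_b)
        for page_a, tags_a in site_a.items()
        for page_b, tags_b in site_b.items()
        if tags_a & tags_b  # non-empty intersection of identifiers
    ]

print(cross_links(nature_pages, news_articles))
# [('nature/red-kite', 'news/raptor-recovery')]
```

Because both silos tag against the same identifiers, the links between them fall out of a join rather than editorial effort.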
LF: How did the BBC produce these sites before the Semantic Web approach?
BBC: By hand. There was a lot of hand-rolling of microsites around specific items, and many aggregations were maintained by editorial hands. The Semantic Web approach meant that we could provide many more aggregations and many more routes to content at lower cost.
LF: Have you been able to measure any results from these efforts?
BBC: For the Olympics, the Dynamic Semantic Publishing (DSP) architecture allowed us to offer a single page for every country (200+), every athlete (10,000+), every discipline/event (400-500) and every venue. All of these pages were complete with aggregated relevant stats and news.
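The aggregation behind "a page for every athlete, country and venue" amounts to inverting a content-to-tags mapping so that each entity gets its own index of relevant items. The data below is invented for illustration; the real DSP stack queried a triple store rather than a dict.

```python
# Sketch: invert content -> tags into entity -> content, so every tagged
# entity automatically gets an index page. Hypothetical item names.
from collections import defaultdict

content_tags = {
    "news/bolt-wins-100m": {"athlete:usain-bolt", "country:jamaica"},
    "stats/100m-final": {"athlete:usain-bolt", "venue:olympic-stadium"},
}

def build_entity_index(content_tags):
    """Map every tagged entity to the content items about it."""
    index = defaultdict(list)
    for item, tags in content_tags.items():
        for tag in sorted(tags):
            index[tag].append(item)
    return index

index = build_entity_index(content_tags)
print(sorted(index["athlete:usain-bolt"]))
# ['news/bolt-wins-100m', 'stats/100m-final']
```

Every new tagged item enriches the relevant entity pages with no extra editorial work, which is what made thousands of pages feasible.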
[LF: A blog entry about the 2010 World Cup site indicates that it had over 700 pages for teams, groups and players. This was an amount of content that would never even have been considered without the automated Semantic Web approach. The same blog entry puts this into perspective: “The World Cup site had more index pages than the rest of the [hand-edited] BBC Sport site in its entirety.”]
LF: Were there particular things that you learned from the World Cup and were able to change for the Olympics?
BBC: The World Cup site worked. Everything that we learned at a relatively small scale from the World Cup site could be applied to the Olympics, which was an order of magnitude more complex.