As XML becomes the primary representation format in more content management systems, the discussion tends to revolve around how to transform that XML into a human-readable format. But what about the machines? Who is looking out for them when it comes time to consume data in a custom XML format?

Affectionately known as "griddle," the "Gleaning Resource Descriptions from Dialects of Languages" (GRDDL) recommendation from the W3C provides a way for an XML document to declare that it contains gleanable data, along with a link to an algorithm (usually written in XSLT) for gleaning RDF data from the document. For computers across the globe, this means they no longer depend on a software developer to hand-write a transformation for each XML format they consume. A GRDDL-aware agent can retrieve an XML document over HTTP and determine: 1) whether the document contains any gleanable data, and 2) how to glean that data.

If your brows are furrowed in consternation, let's go through an example to help explain what is admittedly a pretty fuzzy concept. A pharmaceutical company uses a clinical research data management system that employs XML as the primary representation format for its trial records. The developers responsible for the system would like to use a content management system to automatically replicate the XML documents as RDF graphs and take advantage of RDF's rich querying capabilities. The downside of this approach is the cost of storing the same content in two formats and keeping the copies synchronized. The upside is that the content can be queried both as XML and as RDF. Furthermore, RDF makes it possible to query the data using a standard pharmaceutical OWL ontology.
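As a rough illustration of the two checks a GRDDL-aware agent makes, the sketch below looks for the `grddl:transformation` attribute on the root element of an XML document and resolves the transformation links it finds. The attribute and its namespace come from the GRDDL recommendation; the `trialRecord` vocabulary, the `example.org` URLs, and the `find_transformations` helper are all hypothetical names invented for this example. It only covers the root-attribute mechanism and does not fetch or run the transformation.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Namespace defined by the GRDDL recommendation.
GRDDL_NS = "http://www.w3.org/2003/g/data-view#"


def find_transformations(xml_text, base_url=""):
    """Return the transformation IRIs declared on the root element.

    An empty list means the document declares no gleanable data
    through this mechanism. (Hypothetical helper, not a library API.)
    """
    root = ET.fromstring(xml_text)
    # The attribute value is a whitespace-separated list of IRIs.
    value = root.get(f"{{{GRDDL_NS}}}transformation", "")
    # Relative links are resolved against the document's own URL.
    return [urljoin(base_url, iri) for iri in value.split()]


# Hypothetical trial record; element names and URLs are assumptions.
SAMPLE = """<trialRecord xmlns="http://example.org/trial#"
    xmlns:grddl="http://www.w3.org/2003/g/data-view#"
    grddl:transformation="trial-to-rdf.xsl">
  <subject id="s-001"/>
</trialRecord>"""

transforms = find_transformations(SAMPLE, "http://example.org/records/42.xml")
print(transforms)
```

A real agent would go on to retrieve each listed transformation and apply it with an XSLT processor to produce RDF; that step is left out here because it depends on which processor is available.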
GRDDL can solve the problem of maintaining the same data in two formats: by associating a GRDDL profile with the XML vocabulary, a computer-based trial record, or any XML-based clinical research data, can be queried semantically without keeping a separate RDF copy. A web application could therefore be built around a GRDDL-aware agent that retrieves trial records from a remote server. These records are source documents associated with transforms that extract the trial data as RDF, expressed in a universally supported vocabulary for trial records. An RDF representation of the trial data makes it possible to map the pharmaceutical data to a unified ontology, which alleviates the difficulty of working with multiple XML vocabularies across a domain such as pharmaceutical data.

Although that is a somewhat specific example, organizations worldwide are excited about this recommendation from the W3C. One such organization is Creative Commons, which is looking to GRDDL "to make it easier to process data from diverse formats in an interoperable fashion, when that is appropriate." If your organization is into "mashups" and is looking for a standard to follow, take a look at GRDDL and see if it meets your needs.

What is your take on yet another W3C recommendation? Does GRDDL have a future outside of academia? Share your opinion in the comments.