W3C Illuminates Differences, Rationale Behind HTML5

HTML5 is the fifth major revision of the core language of the Web. In a recently published document, the W3C details the differences between HTML4 and HTML5 and explains the thinking behind the changes.

What is HTML5?

HTML5 is a revision of the HTML standard under development at the W3C. HTML5 will eventually replace HTML4 and XHTML1, since it defines a single language that can be written in either HTML or XML syntax.

There are separate conformance requirements for authors and user agents. Developers of web browsers and other web clients must support older elements and attributes for backward compatibility. Web page authors work with a slightly simplified language, with some elements and attributes dropped in favor of CSS, and they will no longer encounter the term "deprecated," since older code is still supported.

Syntax Differences

The W3C outlines many differences in its recent document. HTML5 syntax is compatible with both HTML4 and XHTML1, except for "the more esoteric SGML features of HTML4." Most of these features, such as processing instructions and shorthand markup, were never supported by most user agents. Other syntax differences include:

  • Most HTML documents will be served with the text/html media type
  • Parsing rules, including error handling, that user agents must use with the text/html media type
  • A text/html-sandboxed media type for HTML syntax documents where you're hosting untrusted content
  • XML syntax documents must be served with an XML media type such as application/xml, and their elements must be placed in the XHTML namespace, following the rules of the XML specifications
  • The HTML syntax of HTML5 now has native support for Internationalized Resource Identifiers (IRIs), provided the document encoding is UTF-8 or UTF-16
  • The lang attribute can now hold either a valid language identifier or an empty string, mirroring xml:lang in XML
  • The HTML syntax of HTML5 requires a DOCTYPE declaration, <!DOCTYPE html>, "to ensure that the browser renders the page in standards mode," but the XML syntax doesn't, since XML is always rendered in this mode
  • The HTML syntax of HTML5 allows MathML and SVG elements inside a document
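
To make a few of these points concrete, here is a minimal Python sketch that feeds a small HTML-syntax page to the standard library's html.parser and records its DOCTYPE, its lang value, and any embedded SVG or MathML elements. The sample page and the names SyntaxProbe, probe, and has_html5_doctype are assumptions made up for this illustration; they are not taken from the W3C document, and the check is far looser than a real conformance checker.

```python
from html.parser import HTMLParser

# Hypothetical sample page: short DOCTYPE, lang attribute, inline SVG.
SAMPLE_PAGE = """<!DOCTYPE html>
<html lang="en">
<head><meta charset="UTF-8"><title>Demo</title></head>
<body>
<svg width="20" height="20"><circle cx="10" cy="10" r="8"/></svg>
</body>
</html>"""


class SyntaxProbe(HTMLParser):
    """Records the DOCTYPE, the root lang value, and svg/math start tags."""

    def __init__(self):
        super().__init__()
        self.doctype = None
        self.lang = None
        self.embedded = []

    def handle_decl(self, decl):
        # decl is the text between "<!" and ">", e.g. "DOCTYPE html".
        self.doctype = decl

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased from HTMLParser.
        if tag == "html":
            self.lang = dict(attrs).get("lang")
        elif tag in ("svg", "math"):
            self.embedded.append(tag)


def probe(page):
    """Parse the page and return the populated probe."""
    parser = SyntaxProbe()
    parser.feed(page)
    parser.close()
    return parser


def has_html5_doctype(page):
    """True if the page declares the short, case-insensitive HTML5 DOCTYPE."""
    doctype = probe(page).doctype
    return doctype is not None and doctype.strip().lower() == "doctype html"
```

Calling has_html5_doctype(SAMPLE_PAGE) reports whether the short DOCTYPE is present, while probe(SAMPLE_PAGE).embedded lists the SVG or MathML elements found. A longer HTML4-style DOCTYPE does not match, which is a deliberate simplification of this sketch.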

Character Encoding Differences

The HTML syntax of HTML5 offers three options for declaring the character encoding, according to the HTML5 differences from HTML4 document:

  1. At the transport level, for example by using the HTTP Content-Type header
  2. Starting the file with a Unicode Byte Order Mark (BOM) character, which provides a signature for the type of encoding used
  3. Including a meta element with a charset attribute (for example, <meta charset="UTF-8">) within the first 512 bytes of the document, which is significantly shorter than the syntax previously required

Authors working in XML syntax will use the rules already set in the XML specifications.
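
The three HTML-syntax options above can be sketched in Python as a small lookup that returns the first encoding declaration it finds. This is an illustration only: the function name declared_encoding, the regular expressions, and the order of the checks (which simply follows the list above) are assumptions, and the sketch does not reproduce the encoding-detection algorithm that browsers actually implement. The 512-byte limit is the figure quoted from the W3C document.

```python
import codecs
import re

# BOM signatures this sketch recognizes (UTF-32 is deliberately left out).
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# charset parameter in a Content-Type header value.
CHARSET_IN_HEADER = re.compile(r'charset\s*=\s*"?([\w.:-]+)', re.IGNORECASE)
# charset inside a meta element, in either the short or the older long form.
META_CHARSET = re.compile(
    rb"""<meta[^>]*?charset\s*=\s*["']?([\w.:-]+)""", re.IGNORECASE
)


def declared_encoding(body, content_type=None, prescan_bytes=512):
    """Return (encoding, source) for the first declaration found.

    body is the raw document as bytes; content_type is the value of the
    HTTP Content-Type header, if any. Returns (None, None) when nothing
    is declared.
    """
    # Option 1: transport level, e.g. "text/html; charset=UTF-8".
    if content_type:
        match = CHARSET_IN_HEADER.search(content_type)
        if match:
            return match.group(1).lower(), "http-header"
    # Option 2: a Byte Order Mark at the very start of the file.
    for bom, name in BOMS:
        if body.startswith(bom):
            return name, "bom"
    # Option 3: a meta element within the first prescan_bytes bytes.
    match = META_CHARSET.search(body[:prescan_bytes])
    if match:
        return match.group(1).decode("ascii").lower(), "meta"
    return None, None
```

For example, declared_encoding(b'<!DOCTYPE html><meta charset="UTF-8">') finds the declaration in the meta element, while the same element placed after 600 bytes of padding is ignored because it falls outside the scanned prefix.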

Changes in Elements

The changes to the language between HTML4 and HTML5 fall into three groups: new, changed, and removed elements.

The changes are too numerous to list in full here; the complete list is in the HTML5 differences from HTML4 document. Some of the more interesting new elements (subject to change) are:
