©Liddy Nevile

Acknowledgements

 

These notes are taken from notes prepared for the IMS White Paper Version 1.0 by Liddy Nevile, with editorial help from other IMS Accessibility Working Group members.

See now: http://www.imsproject.org/accessibility/

XML (Extensible Mark-up Language)

As the popularity of the Web and the complexity of what is presented over it have grown, so have the limitations of HTML. XML, or Extensible Mark-up Language, was developed by the XML Working Group of the W3C as the next-generation mark-up language, and the W3C Recommendation for XML 1.0 was published in February 1998. Significantly, XML was developed with W3C's accessibility goals in mind, and accessibility guidelines for it are currently being developed. XML is much more flexible than HTML because it is a metalanguage: a language (or a set of syntax rules and guidelines) used to define and create new mark-up languages. HTML is used for both formatting and displaying data, combining the structure and display of a document. XML separates structure and display, and the semantics of an XML document are defined by the applications that process it or by stylesheets.

A public working draft of the W3C XML Accessibility Guidelines is available at: http://www.w3.org/TR/xmlgl.

In the accessibility context, XML may not yet be supported by enough user agents to make it useful, but one set of tags defined within XML is well established and recommended: the 'XHTML' tags. Well-formed HTML, the legacy mark-up language, can easily be translated into XHTML, which refers the user agent to standardised, online specifications of how to interpret it. XHTML can be used safely because the first part of an XHTML resource informs the user agent that this is XHTML and then defaults to HTML where XHTML is not used. This redundancy comes at a very small cost and offers an enormous opportunity for user agents that are designed to increase accessibility using XML.

Standardization for Accessibility: Validation and Well-formedness

XML-based mark-up languages, including author-created ones, depend on online DTDs or schemas that provide the rules for their interpretation. This means that the tags, including those in XHTML, must be well-formed and validated. This additional strictness, which HTML does not require, imposes a discipline on content developers, but it is not burdensome. Code validators are available from W3C, and both content and the associated style sheets can be validated.
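The difference between HTML's leniency and XML's strictness can be illustrated with a small well-formedness check. This is only a sketch using Python's standard xml.etree.ElementTree parser; the function name is_well_formed and the two sample strings are assumptions made for illustration. It tests well-formedness only and does not validate against a DTD or schema, which needs a validating parser or the W3C validators mentioned above.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the string parses as well-formed XML (no DTD validation)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# HTML-style markup (hypothetical sample): unquoted attribute value, unclosed <br>.
legacy_html = "<p lang=fr>Bonjour<br></p>"

# The same content written to XML rules: quoted attribute, self-closed <br />.
xhtml_style = '<p lang="fr">Bonjour<br /></p>'

print(is_well_formed(legacy_html))
print(is_well_formed(xhtml_style))
```

A browser would typically display both fragments, but an XML parser rejects the first one outright, which is the discipline the paragraph above refers to.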

XML Mark-up and Style Sheets

Another primary difference between HTML and XML is that HTML is used for both formatting and displaying data, with the structure and display of a document combined, whilst XML separates structure and display. The XML metalanguage is used to create languages that describe and structure data or that format data. The semantics of an XML document are, therefore, defined by the applications that process it and/or by stylesheets (XSLT), which render the XML file appropriately for differing output formats.

Although an HTML file can reference CSS (Cascading Style Sheets), thereby separating structure and display to some extent, the stylesheet can only alter the display and formatting of all or part of the contents of the HTML file to which it applies. Thus, for example, <H1> tags in the HTML file can be defined to appear as bold, 15 point and blue. CSS is, therefore, capable only of altering the formatting and display of selected aspects of the contents of the HTML file; the whole HTML file will still be displayed in the browser.

XSLT stylesheets are, by contrast, a much more powerful tool: they can select particular parts of an XML file to display in the desired format, and can also add (within the XSLT file itself) images, text or video, for example. Thus, authors have the freedom to write a data file once, including for example text in English, Spanish and French; each language can then be selected separately from this file by the stylesheet and displayed together with any language-specific images, audio or video that are appropriate. This obviates the need to author separate files for each language, as the use of HTML alone would dictate.
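The idea of selecting one language from a single multilingual data file can be sketched as follows. Python's standard library has no XSLT processor, so this uses xml.etree.ElementTree as a stand-in for the selection step an XSLT stylesheet would perform. The element names lesson and para, the sample text and the function select_language are all hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# The xml:lang attribute lives in the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# One data file holding the same paragraph in three languages (hypothetical content).
DATA = """<lesson>
  <para xml:lang="en">Welcome to the course.</para>
  <para xml:lang="es">Bienvenido al curso.</para>
  <para xml:lang="fr">Bienvenue au cours.</para>
</lesson>"""


def select_language(xml_text: str, lang: str) -> list:
    """Return the text of every <para> whose xml:lang matches the requested language."""
    root = ET.fromstring(xml_text)
    return [p.text for p in root.iter("para") if p.get(XML_LANG) == lang]


for paragraph in select_language(DATA, "es"):
    print(paragraph)
```

In an XSLT stylesheet the same selection would usually be written as a template matching on the xml:lang attribute; the point here is only that the data file is authored once and filtered at display time.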

From the accessibility perspective, XML features such as

a) separation of data (within the XML file itself);

b) structure (as defined by the XSLT stylesheet) and

c) validation (by the DTD or schema)

together present a powerful combination that is highly suited to the provision of content presented according to differing accessibility requirements.

Overview of XML Qualities

There are many other advantages to using XML, including:

Customisable and Flexible

For advanced users, XML has significantly fewer limitations than HTML. While the tags and attributes within HTML are restricted to a pre-defined set, XML allows author-defined tags. It is therefore fully customisable: header and paragraph tags can be replaced by tags that describe the information they contain in a more intelligent and intelligible manner. An author can thus tailor a data file so that each set of tags meaningfully describes the type of data it holds.

This is relevant to accessibility because files can be marked up with tags that clearly differentiate, within a single data file, content intended for different audiences: for example, blind users, visually impaired users and users without a visual impairment.

For example:

From a biological sciences course:

<A_robustus_intro>

<audio_description> This page contains information relating to the fossils of Australopithecus robustus. It shows an image of a skull specimen discovered in 1950 by Robert Broom, showing a slight sagittal crest and large zygomatic arches that project forwards, hiding the sunken nasal area.</audio_description>

<large_print> The above image shows the skull of Australopithecus robustus, discovered in 1950 by Robert Broom. Look at the distinctive features and note them. </large_print>

<standard>The above image shows the skull of Australopithecus robustus, discovered in 1950 by Robert Broom. Look at the distinctive features and note them.</standard>

</A_robustus_intro>

The appropriate version could then be selected for display via the XSLT (Extensible Stylesheet Language Transformations) stylesheet.

The above brief example illustrates that the capacity of XML to include author-tailored tags increases both the flexibility and readability of the file. This also facilitates the editing process.
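As a rough sketch of how one version might be picked out of such a file, the following uses Python's standard xml.etree.ElementTree in place of an XSLT processor. It assumes the element names are written with underscores (audio_description, large_print), since XML names cannot contain spaces, and it shortens the sample text. The function render_variant and its fallback behaviour are hypothetical and not part of the original notes.

```python
import xml.etree.ElementTree as ET

# Shortened version of the biological sciences example, with underscore tag names.
DOC = """<A_robustus_intro>
  <audio_description>This page contains information relating to the fossils of Australopithecus robustus.</audio_description>
  <large_print>The above image shows the skull of Australopithecus robustus.</large_print>
  <standard>The above image shows the skull of Australopithecus robustus.</standard>
</A_robustus_intro>"""


def render_variant(xml_text: str, variant: str, fallback: str = "standard") -> str:
    """Return the text for the requested variant, or the fallback variant if it is absent."""
    root = ET.fromstring(xml_text)
    node = root.find(variant)
    if node is None:
        # Assumed behaviour: fall back to the standard text when no tailored version exists.
        node = root.find(fallback)
    if node is None or node.text is None:
        return ""
    return node.text.strip()


print(render_variant(DOC, "audio_description"))
print(render_variant(DOC, "large_print"))
```

Because each variant sits in its own descriptively named element, the selection logic stays short, and an author editing the file can see at a glance which text serves which audience.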

 

The latest version of the XML Accessibility Guidelines is available at http://www.w3.org/TR/xmlgl

It says:

"This document explains how to design accessible applications using XML, the Extensible Markup Language. Compared to the HTML or MathML languages, XML is one level up: it is a meta syntax used to describe these languages, as well as new ones. As a meta syntax, XML provides no intrinsic guarantee of device independence or textual alternate support. It is essential, therefore, that XML formats and tools designers are provided with guidelines that explain how to include basic accessibility features - such as those present in HTML, SMIL, and SVG - in all their new developments."

and

"Introduction

XML (Extensible Markup Language) is a meta-syntax, used to create new languages. It can be seen as a simplification of SGML (Standard Generalized Markup Language), designed to promote a wider acceptance in Web markets, but serving the same functionality of extensibility and new language design.

HTML (HyperText Markup Language), on the other hand, is one particular application of SGML, which covers one set of needs ("simple" hypertext documents) and one set of element and attributes.

For instance, in HTML, authors can write elements like:

<title>XML and Accessibility</title>
<address lang=fr>Daniel Dardailler</address>
<h1>Background</h1>

and they can only use elements (title, h1, etc) defined by the HTML specification (which defines about a hundred), and their attributes.

In SGML and XML, authors can define their own set of elements, and end up with documents like:

<menu>New England Restaurant</menu>
<appetizer>Clam Chowder
<photo url="clam.jpg">A large creamy bowl of clam showder, with bread crumbs on top</photo>
</appetizer>

which may fit more closely the needs of their information system.

Within W3C, the HTML language is now being recast as XML - this is called XHTML - including a modularization of HTML to suit the needs of a larger community (mobile users, Web TV, etc). XML is therefore not to be seen as a replacement of HTML, but as a new building layer on top of which HTML is to be placed, next to other languages designed by W3C, such as MathML (for representing mathematical formula), SMIL (for synchronizing multimedia), SVG (for scalable graphics), etc., and other new languages designed by other organizations (such a OpenEBook, XML-EDI, etc.). Furthermore, it is important to understand that XML is not only a User Interface technology (like HTML), but can and is often used in protocol communication, to serialize and encode data to be sent from one machine to another."

The XML Accessibility Guidelines only deal with what is called data-oriented use of XML, defined as below:

data-oriented:

Tagsets for: User Interface (UI)--oriented structural textual rendering, such as Docbook, HTML, MenuML, OEB, etc.; specialized rendering - for example MathML, Scalable Vector Graphics (SVG), MusicML, Synchronized Multimedia Integration Language (SMIL); or any generic data storage format. An informal definition is 'anything for which the question "is there a textual equivalent of all rich media data bits?" makes sense'. Data-centric schemata include both the interaction and behavioral aspects of an XML application.

It goes on to explain that HTML has been worked on long enough for it to be clear how to make it accessible, but that this is not so easy with XML.

See more information about XHTML.


Last updated: 8 March 2002