This site is at beta test stage! Comments are welcome. Contributions are sought and will be published with acknowledgement.

 

©Liddy Nevile

Portable Document Format (PDF)

Note that Google provides text versions as alternatives to all PDFs.

The following is extracted from the Melb-WAG email list, on which discussion of PDFs took place recently in Melbourne. The archive is available at http://yahoogroups.com/melb-wag/

CCN:

Summary:

No, don't use PDF alone without checking that the conversion to HTML provided (which can be done by linking to the Adobe site - they have an online converter) also meets accessibility requirements. This is in the process of changing, but it hasn't yet, and I don't think it will for at least another year or so. (It depends on how fast people upgrade their browsers. I would guess it will be fine about the time when people are no longer using IE 6.0 or Netscape 6.1.)

(Excruciating) detail:

I don't think this changes that frequently - I have been saying the same thing for a couple of years (ever since Adobe started including accessibility features in PDF) about whether or not it is reasonable to rely on PDF. I haven't looked around for a formal consensus anywhere (I don't have many ways of finding one about anything), but when I talk to people to see if there is something I should know, I haven't come across many who say that PDF is a fine format to use. There are some, of course, but my experience is that among people who can make a case for it on any kind of accessibility basis they are a tiny minority, and more to the point their arguments rely on assumptions like "everyone with a disability uses JAWS", which is not very convincing.

My humble opinion is that PDF is not magically accessible, but that it is possible to make PDFs accessible to people who have the right software. Most programs for generating PDF don't do this - it is possible with the latest Acrobat software from Adobe (but almost impossible, for example, with Word using the Adobe-supplied PDF conversion).

Given that access technology that can use PDFs is not yet available on all platforms, and is still new, so not all users have it, I recommend not using PDF. I don't think that it meets the requirements (which are, however, a matter of opinion to some extent) of WCAG checkpoints such as

6.3 Ensure that pages are usable when scripts, applets, or other programmatic objects are turned off or not supported. If this is not possible, provide equivalent information on an alternative accessible page. [Priority 1] (PDF is a programmatic object - it is in fact a program, unlike HTML. But I think that is deeper than one needs to look).

The reasoning behind this has changed though:

With version 4 of Acrobat (I think version 1.3 of PDF), Adobe provided a conversion service to HTML. This was the best way to find out if a given PDF document was accessible - run it through the conversion, and if the result was accessible, then the PDF itself was too, as far as PDF 4 could be. (Of course this meant that there was an HTML document available. Whether you posted it as an HTML document, or whether you provided a URI to what is now called a Web Service, which produced that document on the fly, is pretty immaterial.)

With version 5.0 of Acrobat (I think version 1.4 of PDF) they made Acrobat itself compatible with some screen readers.

So more and more people will have access to PDF directly, and in the meantime there is the HTML conversion. It is still easier for most people to check whether the converted HTML meets accessibility requirements than to check whether the PDF does (I imagine they will keep working on reducing this gap, as they have been). At some point in the future, this will mean that there is no reason not to just publish a PDF and not think about the HTML version that might be generated.

In the meantime, another problem is that the HTML conversion only handles ASCII, which covers English, Bahasa Melayu, Bahasa Indonesia and Latin, but not European or Asian languages in general, and only some African languages. This is a flaw, and the claim that characters such as umlauts cannot be represented in text-based formats is untrue - it is even possible to produce a text-based XHTML page in Klingon. (Star Trek fans have a lot to answer for. Hi Jonathan!)
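
The point about umlauts is easy to check. The sketch below is not part of the original discussion; it assumes Python 3 and uses a made-up helper name, `to_ascii_markup`. It shows that even an ASCII-only channel can carry any character, because HTML and XHTML accept numeric character references.

```python
def to_ascii_markup(text: str) -> str:
    """Return text as pure ASCII.

    Non-ASCII characters are rewritten as numeric character references
    (for example, u-umlaut becomes &#252;), which HTML and XHTML user
    agents decode back to the original character.
    """
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    # to_ascii_markup is a hypothetical helper for illustration only.
    print(to_ascii_markup("Müller"))  # M&#252;ller
```

Note that this only deals with the character set; markup-significant characters such as `&` and `<` would still need escaping separately.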

Jason:

Structured (tagged) PDF can only be produced if the necessary structure exists in the first place - that is, the document must already be stored in a format that captures the structural elements. This could be, for example, an XML-based format or a word processor format with proper use of style sheets.

Given these requirements, it is just as easy to convert the document to valid HTML as it is to produce structured PDF output; so there shouldn't be extra work involved in creating both versions (after all, the conversion processes can be completely automated).
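
As a rough illustration of how little work the automated step is once the structure exists, here is a minimal sketch in Python. The source vocabulary (`report`, `title`, `section`, `heading`, `para`) and the mapping to HTML elements are invented for this example; a real conversion would follow whatever structured format the document is actually stored in.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured source: the element names are invented here.
SOURCE = """<report>
  <title>Annual summary</title>
  <section>
    <heading>Findings</heading>
    <para>Structure survives conversion.</para>
  </section>
</report>"""

# Assumed mapping from source elements to HTML elements.
TAG_MAP = {"title": "h1", "heading": "h2", "para": "p"}


def to_html(xml_text: str) -> str:
    """Walk the source tree in document order and emit HTML body markup."""
    root = ET.fromstring(xml_text)
    body = ET.Element("body")
    for node in root.iter():
        html_tag = TAG_MAP.get(node.tag)
        if html_tag is not None:
            # Copy the text of each mapped element into its HTML counterpart.
            ET.SubElement(body, html_tag).text = (node.text or "").strip()
    return ET.tostring(body, encoding="unicode")


if __name__ == "__main__":
    print(to_html(SOURCE))
```

The same tree walk could just as well feed a structured PDF generator, which is the point: the effort lies in having the structure, not in the output format.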

On the other hand if the starting point is an unstructured document (no structural elements; a highly presentational format is used, etc.), then the work involved is that of adding structure to the document in the first place. Under those circumstances it will be the same amount of work to produce structured PDF as to produce high-quality HTML. The existence of structured PDF doesn't diminish the problem which arises from authoring tools and authoring practices that don't lead to the inclusion of proper, structural representations of the document.

Incidentally, if anyone wants to modify one of the freely available tools such as XPDF (http://www.foolabs.com/xpdf/) to extract structured PDF files, for example by converting them to XML, there is a very good and worthwhile programming project there that needs to be undertaken. To my knowledge, nobody has volunteered to work on this kind of project yet.

CCN:

Well, structured PDF can be fairly easily edited in some Adobe tools - so it is possible to take a document from Word, export it as PDF, and clean it up in the same way it is possible to do so for HTML, and it is (more or less - it depends mostly on your personal work flow and tool familiarity) the same amount of work.

The problem is that not all users can use PDF even if it is accessible to some assistive technologies (not all assistive technologies have yet implemented ways to work with Acrobat), and using the Adobe conversion to HTML doesn't guarantee accessible HTML (although for well-structured PDF it is generally pretty good).

So my baseline recommendation is that PDF isn't yet sufficiently widely implemented to rely on it working, in much the same way as I wouldn't recommend people rely on SVG unless they are experts in how to use it (it can be made to work in all kinds of browsers, including Lynx, Netscape 2, etc., for many cases, but it isn't all that easy, and for many people it is currently still faster to make a PNG or JPG and produce equivalent alternatives).

cheers

Chaals

Rob Pedlow:

This has been a subject I have spent some time on lately.

My view is:

1.) For users with the latest technology, i.e. Acrobat Reader 5.0 and JAWS 3.71+, PDFs that have been created correctly are quite accessible.

2.) As a matter of pragmatic reality, the percentage of people with disabilities who have the latest tech is probably quite small. Hence, in my view, PDFs are likely in practice, even when correctly created, to be inaccessible for a significant percentage of users with adaptive tech, and ideally one should still have an HTML version. Some corporate owners, however, have a strong desire to use PDFs. In this case the PDFs need to be made following the Adobe accessibility guidelines and using Acrobat writer v5.0+.

Jonathan:

And don't forget that PDFs are delivered as one large, indigestible chunk, so they trip over electric fences. That is, rural users may have their downloads interrupted when an electric fence kicks in and momentarily disrupts their phone service.

Electric fences turn on and off on a periodic basis. If you have ever telephoned someone in the country and heard a regular 'tick, tick, tick' on the line, it will be their electric fence (or ASIS).

Larry:

Here are some further observations, not so much W3C guidelines as public policy/usability issues, i.e. Acrobat is neither fully accessible nor fully usable.

1) PDFs, especially larger complex files, eat bandwidth and take time to download for many users still on copper - remember, most people have dial-up accounts, and for people in rural and regional areas, bandwidth issues are critical (the need for the right version of Adobe is another disincentive, i.e. another big download, plus skills to install).

I can't remember the site (maybe on useit.com) but someone observed that a huge number of people just stop at the gate when it says download in pdf, and there is some alleged statistic around the place demonstrating this.

So there is a tradeoff: the internal convenience of PDF, which excludes many people, against the investment in HTML, which may enlarge the customer/user base. This might explain why on some of the big academic databases there is a choice of PDF/HTML/text (this is an expensive option of course, paid for by university subscribers).

2) Design for device and platform independence through a high level of skill in the language. Designers should be encouraged to think of the future -- encourage better use of HTML, XHTML, XML etc., which allows files to be served up in multiple ways (including XML > PDF on-the-fly conversion, I believe - but it would be good if someone could provide an example of a site which shows how you can serve up XHTML or PDF as you want). See http://www.w3.org/TR/REC-CSS2/media.html under 7.3 for the different media types -- can someone provide a better explanation on this list of the XML conversion process and the use of media style sheets? All I know is that I can get at least 3 working on my own pages (and I could attach speech, but no devices are working yet).

3) A lot of people don't like reading PDFs on screen (Adobe is not user-friendly) - e.g. full screen mode gives you a tiny page, not a full screen at the font size you want - idiotic! - and they DON'T want to print off the files either when they are paying for paper and ink.

4) As Charles says, the accessible pdf=screenreader issue is a furphy - there is no simple equation and anything except the most simple file converts badly - to show people how - just get some site you know has pdfs on Google, and convert the page automatically, and people will get the message.
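
On the media style sheet question in point 2, a small sketch may help. Section 7.3 of the CSS2 specification names media types such as `screen`, `print` and `aural`, and one XHTML page can link a separate style sheet for each, so the same content is presented differently per device. The code below is only an illustration: the `.css` file names are hypothetical, and Python is used simply to generate the `<link>` elements that would go in the page head.

```python
from xml.sax.saxutils import quoteattr

# Media types as named in CSS2 section 7.3; the file names are hypothetical.
STYLESHEETS = {
    "screen": "screen.css",
    "print": "print.css",
    "aural": "speech.css",
}


def stylesheet_links(sheets: dict) -> list:
    """Build one XHTML <link> element per media type."""
    return [
        '<link rel="stylesheet" type="text/css" '
        f"media={quoteattr(media)} href={quoteattr(href)} />"
        for media, href in sheets.items()
    ]


if __name__ == "__main__":
    for link in stylesheet_links(STYLESHEETS):
        print(link)
```

A browser that honours media types picks the sheet matching its output device and ignores the others, so no separate copy of the document is needed for print or speech.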


Last updated: 8 March 2002