Previously, I'd wrestled with Docbook markup for documentation. The toolchain I used produced HTML 4.01, a different standard from the rest of the web site, which is based on XHTML. As part of a drive to improve the consistency of the web site, which involved writing a dead link checker and an XHTML checker in Python, I found that these HTML documentation pages were a pain.
This was for two reasons. Firstly, I'd had to complicate the
generation of pages for the web site due to the two different
standards used, and secondly, the Python XML parser I was using to
validate pages, xml.sax, could not handle HTML, since it is
not well-formed XML. The HTML parser offered by Python would only
identify very gross errors in the HTML code, which made it fairly
useless for checking syntax. I could, of course, have used the
callbacks offered by the HtmlParser object to write the validation
myself, but this would have taken more time than I was prepared to
spend.
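To make the well-formedness point concrete, here is a minimal sketch of the kind of check xml.sax allows. The function name check_well_formed and the two sample strings are my own illustrations, not taken from the actual checker, and the sketch is written for current Python 3 rather than the Python of the time.

```python
import xml.sax
from xml.sax.handler import ContentHandler


def check_well_formed(markup):
    """Return (True, None) if markup parses as XML, else (False, reason).

    check_well_formed is a hypothetical helper name used for illustration.
    """
    try:
        # A bare ContentHandler ignores every event; only parse errors matter.
        xml.sax.parseString(markup, ContentHandler())
    except xml.sax.SAXParseException as err:
        return False, "line %d: %s" % (err.getLineNumber(), err.getMessage())
    return True, None


if __name__ == "__main__":
    xhtml = b"<p>one<br/>two</p>"  # empty element is closed: well-formed
    html = b"<p>one<br>two</p>"    # HTML-style <br> is never closed
    print(check_well_formed(xhtml))
    print(check_well_formed(html))
```

The second sample is perfectly ordinary HTML 4.01, yet the parser rejects it, which is why pages in that format could not go through the same check as the XHTML ones.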
So, was there an XSL stylesheet to transform Docbook to XHTML? Of
course there was; the Docbook project on Sourceforge provides a slew
of tools to handle Docbook format. I already had a copy of xsltproc
on my Debian box to perform the translation.
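For anyone scripting the translation, the xsltproc call might be wrapped roughly as follows. This is only a sketch: the helper names and the file names local.xsl, manual.xml and html are hypothetical, and the only xsltproc option relied on is -o, which takes a directory (with a trailing slash) when the stylesheet writes multiple chunked files. The directory must already exist.

```python
import subprocess


def build_xsltproc_command(stylesheet, source, output_dir):
    """Build the argument list for one xsltproc run (hypothetical helper)."""
    # With a chunking stylesheet, a trailing slash makes -o name a directory.
    if not output_dir.endswith("/"):
        output_dir += "/"
    return ["xsltproc", "-o", output_dir, stylesheet, source]


def transform(stylesheet, source, output_dir):
    """Run xsltproc and raise CalledProcessError if it reports failure."""
    command = build_xsltproc_command(stylesheet, source, output_dir)
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_xsltproc_command("local.xsl", "manual.xml", "html")))
```

Keeping the command construction separate from the call that runs it makes the arguments easy to inspect without xsltproc being installed.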
As there were a couple of parameters I needed to set, I created a simple stylesheet customisation layer. The first parameter setting causes function synopsis elements to be rendered in ANSI C format, rather than the default K&R. The second includes the standard hydrus CSS stylesheet in the generated XHTML pages. The local layer is shown below.
<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:import href="/home/mark/doc/docbook-xsl-1.69.1/xhtml/chunk.xsl"/>
  <xsl:param name="funcsynopsis.style" select="'ansi'"/>
  <xsl:param name="html.stylesheet" select="'/styles/style.css'"/>
</xsl:stylesheet>
There was one problem: the XHTML output for the chapter describing
the B Tree test harness, bt, was not valid XHTML. This was caused by
my use of the cmdsynopsis tag within the term tag of a variablelist
element. The Docbook to XHTML stylesheet converted a variablelist
into the following XHTML tags:
<dl>
  <dt>term entry</dt>
  <dd>description</dd>
</dl>
The stylesheet placed <div> and <p> elements around the body of the
command synopsis. XHTML, however, does not permit these tags within
a <dt> element.
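A small script can pinpoint this kind of content model violation in generated pages. The sketch below is an assumption of how one might do it with the standard ElementTree module, not the validation actually used here; the function name and the sample markup (including the bt -f file command text) are made up for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"
BLOCK_TAGS = {XHTML_NS + "div", XHTML_NS + "p"}

# Hypothetical fragment imitating a command synopsis inside a term.
SAMPLE = """<dl xmlns="http://www.w3.org/1999/xhtml">
  <dt><div><p>bt -f file</p></div></dt>
  <dd>description</dd>
</dl>"""


def block_elements_in_dt(xhtml_text):
    """Return the local names of <div>/<p> elements nested inside any <dt>."""
    root = ET.fromstring(xhtml_text)
    found = []
    for dt in root.iter(XHTML_NS + "dt"):
        # iter() walks the <dt> itself and all of its descendants.
        for node in dt.iter():
            if node is not dt and node.tag in BLOCK_TAGS:
                found.append(node.tag[len(XHTML_NS):])
    return found


if __name__ == "__main__":
    print(block_elements_in_dt(SAMPLE))
```

This only looks for the two elements named above; a full check against the XHTML DTD needs a validating parser.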
As a quick workaround, I modified synop.xsl to eliminate the
generation of <div> and <p> elements around <cmdsynopsis>. To
see if this was a defect in the XHTML stylesheets, I asked the
question on the docbook-apps mailing list. The consensus was that
it would always be possible to generate illegal XHTML from Docbook,
as Docbook has a less constrained content model than XHTML. The
parameter variablelist.as.table was brought to my attention, which
causes variablelists to be rendered as tables.
By adding the line:
<xsl:param name="variablelist.as.table" select="1"/>
to the local customisation layer, I ended up with a tabular
presentation. Unfortunately, the visual effect was much less
attractive, as the longer commands were made somewhat unreadable
since they were folded to fit with the automatically generated
column widths. I decided to explicitly rework the command list as a
table, using spanspecs
to produce the layout I desired.
You can view the effect of the rework. The benefit of this change is
that I no longer required the patch I'd applied to synop.xsl.
Now I was able to move on to writing the Python scripts to perform automatic validation of the web site (see the next journal entry).