DocBook XML/SGML Processing Using OpenJade

Saqib Ali

1. Introduction

Some Acronyms:
The objective of this document is to set up OpenJade to convert DocBook 3.1 and 4.1 Standard Generalized Markup Language (SGML) and Extensible Markup Language (XML) documents to HyperText Markup Language (HTML), Rich Text Format (RTF), and Portable Document Format (PDF).

1.1. Copyright and License

This document is Copyright 2001 by Saqib Ali. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is available at http://www.gnu.org/copyleft/fdl.html.

1.2. Credits

All praise is due to Allah, The Lord of the Worlds. All credits go to Allah. Any mistake in this document is my own fault. Additionally, I would like to acknowledge the following people for their valuable contributions to this document:
1.3. What is DocBook?

DocBook is a document type definition (DTD). DocBook describes the types of structure and formats to use in technical documents. It is commonly used because of its simplicity and completeness. A DTD defines the syntax of a document - essentially it is a 'rule book' that describes the sets of tags and attributes that will be used to describe specific kinds of content. So DocBook is a "rule book" that is used for writing documents. Every tag that is used in writing the document must be defined very specifically and formally in the DTD.

1.4. What is DSSSL?

A Document Style Semantics and Specification Language (DSSSL) stylesheet defines how to convert a Standard Generalized Markup Language (SGML) document into a human-readable viewing format such as HTML, RTF and PDF.

1.5. What do we need?

The tools needed to set up OpenJade for converting SGML and XML are:
1.6. Assumptions

This document assumes that you have the following already installed on your system.
2. Requirements

You'll have to download and compile only one package (OpenJade). This HOWTO will explain the compilation process, but you should be familiar with installing from source code. Most of the packages that we need are located at The Linux Documentation Project (TLDP) website.

2.1. Pre-requirements

Create a directory /tmp/downloads. We will use this directory to store the downloaded source code.

2.2. OpenJade

OpenJade will be used to process DocBook documents. OpenJade can be downloaded from http://openjade.sourceforge.net/. At the time of writing this document OpenJade 1.3.1 was available. Download the openjade-1.3.x.tar.gz file.

2.3. DocBook DTDs

All the DocBook DTDs are available from The Linux Documentation Project website at http://www.tldp.org/authors/index.html#resources. Please download DocBook SGML v4.1, DocBook SGML v3.1, and DocBook XML v4.1.2.
2.4. ISO Entities

The Linux Documentation Project has packaged all the Entities into one big tar file and placed it at http://www.tldp.org/authors/tools/entities.tar.gz for the convenience of the users. Thanks to TLDP for this.

2.5. Norman Walsh's DSSSL

Norman Walsh's DSSSL can be downloaded from the DocBook project website at http://sourceforge.net/project/showfiles.php?group_id=21935. At the time of writing this document docbook-dsssl-1.7.6 was available.

2.6. LDP customized DSL stylesheets

LDP DSL is a customized style sheet used by The Linux Documentation Project (TLDP). It is an extension to Norman Walsh's DSSSL. It adds things like a background and a Table of Contents. It can be downloaded from http://www.tldp.org/authors/tools/ldp.dsl. ldp.dsl requires Norman Walsh's DSSSL.

2.7. HTMLDOC (Optional)

HTMLDOC can be used for converting the HTML to PDF. If you would like to produce PDF documents, please download HTMLDOC from http://www.easysw.com/htmldoc/software.php

2.8. Norman Walsh's XSL (Optional)

This is not necessary. But if you would like to serve DocBook 4.1.2 XML content using Tomcat + Cocoon, you will need Norman Walsh's XSL Style Sheets. The Style Sheets are available for download at http://sourceforge.net/projects/docbook/. Please download the package called docbook-xsl.
2.9. LDP Customized XSL (Optional)

Also download the LDP Customized XSL from http://my.core.com/~dhorton/docbook/tldp-xsl/

3. Installing Processing Tools - OpenJade

In this section we will install all the tools in the appropriate directories. All the tools go in the /usr/local/dbtools/ directory. Create this directory using the following command:
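As a minimal sketch of this step, the tools directory could be created as shown here. The DBTOOLS variable and the -p flag are conveniences assumed for this example and are not part of the original instructions; creating anything under /usr/local normally requires root privileges.

```shell
# Sketch only: DBTOOLS is a helper variable introduced for this example.
# It defaults to the directory named in this HOWTO.
DBTOOLS="${DBTOOLS:-/usr/local/dbtools}"

# -p avoids an error if the directory already exists.
# Writing under /usr/local usually requires root (or sudo).
mkdir -p "$DBTOOLS" || echo "could not create $DBTOOLS (try again as root)" >&2
```

Keeping the path in one variable makes it easier to reuse the same location in the later installation steps.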
3.1. Installing OpenJade

This process is the easy part, but also the most time-consuming one. Keep in mind that OpenJade takes a long time to compile. To install OpenJade, complete the following steps:
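The steps follow the usual unpack, configure, make, install pattern for source packages. The sketch below assumes version 1.3.1, a tarball already saved in /tmp/downloads, and an install prefix of /usr/local/dbtools/openjade-1.3.1; the prefix and the build_openjade helper are assumptions made for illustration, so adjust them to your setup.

```shell
# Sketch only: VERSION, PREFIX and build_openjade are assumptions
# introduced for this example.
VERSION=1.3.1
PREFIX="/usr/local/dbtools/openjade-$VERSION"

# The function body runs in a subshell, so the caller's working
# directory is untouched and "set -e" stops at the first failing step.
build_openjade() (
    set -e
    cd /tmp/downloads
    tar -xzf "openjade-$VERSION.tar.gz"
    cd "openjade-$VERSION"
    ./configure --prefix="$PREFIX"
    make                 # this is the slow part
    make install         # needs write access to $PREFIX
)

# Only attempt the build when the tarball has actually been downloaded.
if [ -f "/tmp/downloads/openjade-$VERSION.tar.gz" ]; then
    build_openjade
else
    echo "openjade-$VERSION.tar.gz not found in /tmp/downloads" >&2
fi
```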
3.2. Installing Norman Walsh's DSSSL

In this step we will install Norman Walsh's DSSSL in an appropriate place. The DSSSL does not have to be compiled.

3.3. Installing DocBook DTDs

In this section we will install the DocBook DTDs.

3.4. Installing the ISO Entities

In this section we will install the ISO entities that we downloaded from the LDP website. First we install the ISO Entities for the 3.1 SGML DTD.
Next we install the ISO Entities for the 4.1 SGML DTD.
3.5. Installing the LDP DSL

Finally we install the customised LDP stylesheet.

3.6. Installing HTMLDOC

This step is optional. It is only required if you want to produce PDF documents from HTML. Change back to the downloads directory.
Untar the source code for HTMLDOC.
Run configure to set the installation location.
At the time of writing this document HTMLDOC ver 1.8.20-1 was available. This version had a little problem in the fonts Makefile. It would complain while installing the fonts, because the correct fonts were not available on the system. Here is the error you will get while running make install:
To fix this installation issue, please edit fonts/Makefile and comment out the lines with references to ZapfChancery and ZapfDingbats fonts. Then execute the install:
4. Using OpenJade

In this section we will use OpenJade to convert DocBook SGML/XML documents to HTML, RTF, and PDF.

4.1. Processing SGML

4.1.1. Setting the SGML_CATALOG_FILES Environment Variable for SGML

The SGML_CATALOG_FILES variable must be set to point to the appropriate catalog files. To set the variable, use the following command for the Bourne shell:
Use the following command for the C shell:
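SGML_CATALOG_FILES holds a colon-separated list of catalog files. The sketch below shows the Bourne shell form, with the C shell form in a comment. The individual catalog paths are hypothetical and depend on where the DTDs and stylesheets were unpacked under /usr/local/dbtools, so treat them as placeholders and not as the exact list used by this HOWTO.

```shell
# Sketch only: every path below is an assumed layout, not a verified one.
DBTOOLS=/usr/local/dbtools

# Build the colon-separated catalog list one entry at a time.
SGML_CATALOG_FILES="$DBTOOLS/openjade-1.3.1/dsssl/catalog"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:$DBTOOLS/docbook-dsssl-1.7.6/catalog"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:$DBTOOLS/dtd/sgml/4.1/docbook.cat"

# Bourne shell: assign, then export so child processes can see it.
export SGML_CATALOG_FILES

# C shell equivalent (not executed here):
#   setenv SGML_CATALOG_FILES "<the same colon-separated list>"
```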
4.1.2. SGML to HTML

To convert from SGML to HTML, use the following command:
To create a non-chunked (all in one) output:
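A possible form of both conversions is sketched here. It assumes that ldp.dsl was installed at /usr/local/dbtools/ldp.dsl and that the input file is called mydoc.sgml; both names are hypothetical. The options used are openjade's -t (output type), -d (stylesheet, with #html selecting the HTML style specification) and -V (define a stylesheet variable, here nochunks).

```shell
# Sketch only: DSL and DOC are assumed names for illustration.
DSL=/usr/local/dbtools/ldp.dsl
DOC=mydoc.sgml

if command -v openjade >/dev/null 2>&1 && [ -f "$DOC" ]; then
    # Chunked output: one HTML file per section, written to the
    # current directory.
    openjade -t sgml -d "$DSL#html" "$DOC"

    # Non-chunked output: a single HTML document on standard output,
    # redirected to mydoc.html.
    openjade -t sgml -V nochunks -d "$DSL#html" "$DOC" > "${DOC%.sgml}.html"
else
    echo "skipping: openjade or $DOC not available" >&2
fi
```

SGML_CATALOG_FILES must already be set in the same shell, otherwise openjade cannot resolve the DocBook DTD.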
4.2. Processing XML

You can download a sample DocBook 4.1.2 XML file from http://www.xml-dev.com:8080/cocoon/mount/docbook/openjade.xml

4.2.1. Setting the SGML_CATALOG_FILES Environment Variable for XML

The SGML_CATALOG_FILES variable must be set to point to the appropriate catalog files. To set the variable, use the following command for the Bourne shell:
Use the following command for the C shell:
4.3. HTML to PDF (optional)

To convert HTML to PDF we must use HTMLDOC. First create non-chunked HTML output of the SGML:
Then run HTMLDOC to produce PDF.
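A minimal sketch of the HTMLDOC step follows, assuming the non-chunked HTML was saved as mydoc.html (a hypothetical name). It uses htmldoc's --webpage mode and the -f option to name the output file; the options that suit a particular document may differ.

```shell
# Sketch only: HTML_IN and PDF_OUT are assumed file names.
HTML_IN=mydoc.html
PDF_OUT="${HTML_IN%.html}.pdf"

if command -v htmldoc >/dev/null 2>&1 && [ -f "$HTML_IN" ]; then
    # --webpage treats the input as one continuous web page;
    # -f names the output file (the .pdf suffix selects PDF output).
    htmldoc --webpage -f "$PDF_OUT" "$HTML_IN"
else
    echo "skipping: htmldoc or $HTML_IN not available" >&2
fi
```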
5. Serving DocBook 4.1.2 XML

There are 3 ways to serve DocBook 4.1.2 XML from a web server:
Using an application server like Cocoon is the best option.
In this section we will see how to serve DocBook 4.1.2 XML content using Tomcat + Cocoon.

5.1. Tomcat + Cocoon

Tomcat is a Java Servlet container. For more information please visit http://jakarta.apache.org/tomcat/index.html. Apache Cocoon is an XML publishing framework. For more information please visit http://xml.apache.org/cocoon/index.html. This HOWTO will not go into the details of setting up Tomcat + Cocoon, since that is already explained in the document http://xml.apache.org/cocoon/installing/index.html. Setting up Tomcat + Cocoon is an easy process and should take less than five minutes. Once you have Cocoon + Tomcat set up and working, please follow the next sections to serve DocBook 4.1.2 XML content.
At the very least, make sure you're using the latest JRE from Sun (at this writing, 1.4.2). Also consider upgrading the Xalan XSLT processor to the latest release. At this writing, the latest Sun JRE, 1.4.2, is bundled with Xalan 2.4.1, while Xalan itself is up to version 2.5.1. To check the version currently installed, type
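One way to do this, sketched here as an assumption since the exact command is not preserved in this text, is the EnvironmentCheck utility class that ships with Xalan-Java. It prints the Xalan and XML parser versions found on the classpath. On JREs where the class is not visible under this name, the command simply fails and the fallback message is shown.

```shell
# Sketch only: assumes the Xalan-Java classes are visible to "java".
XALAN_CHECK=org.apache.xalan.xslt.EnvironmentCheck

if command -v java >/dev/null 2>&1; then
    # Prints a report of the XML/XSLT components found on the classpath.
    java "$XALAN_CHECK" || echo "could not run $XALAN_CHECK (is Xalan on the classpath?)" >&2
else
    echo "java not found on PATH" >&2
fi
```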
For more info, visit http://xml.apache.org/xalan-j/faq.html.

5.2. Installing Norman Walsh's XSL

In this step we will install Norman Walsh's XSL under the /usr/local/dbtools/ directory. Change to the /tmp/downloads directory and untar the docbook-xsl file.
To install docbook-xsl, move the files to the /usr/local/dbtools directory.

Next install the LDP XSL.

5.3. Installing LDP XSL

Untar tldp-xsl-xxxxx.tar.gz and then copy all the files to the /usr/local/dbtools/docbook-xsl/html directory.
5.4. Setting up sitemap.xmap

$COCOON_HOME points to the Cocoon Web Application Directory. This directory is typically /usr/local/jakarta-tomcat-4.1.9/webapps/cocoon/. Create a directory named docbook under $COCOON_HOME/mount. This is where we will put all our DocBook XML 4.1.2 content.
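The directory step can be sketched as follows. The default value given to COCOON_HOME is the typical path mentioned above and is only an assumption; set the variable to match your own Tomcat installation before running the command.

```shell
# Sketch only: the default below is the "typical" location named in
# this HOWTO; override COCOON_HOME if your installation differs.
COCOON_HOME="${COCOON_HOME:-/usr/local/jakarta-tomcat-4.1.9/webapps/cocoon}"

# -p also creates the mount directory if it does not exist yet.
mkdir -p "$COCOON_HOME/mount/docbook" \
    || echo "could not create $COCOON_HOME/mount/docbook" >&2
```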
Create a file named sitemap.xmap in $COCOON_HOME/mount/docbook with the following content:
5.5. Accessing DocBook 4.1.2 XML Content from a Web Browser

Place a DocBook 4.1.2 XML file in the $COCOON_HOME/mount/docbook/ directory. A sample file is available from http://www.xml-dev.com:8080/cocoon/mount/docbook/openjade.xml. Now you can access the document using a browser at http://localhost:8080/cocoon/mount/sample.html (HTML) or http://localhost:8080/cocoon/mount/sample.pdf (PDF).

6. Further Information

This section has some pointers to related resources on the Internet. If you would like to suggest additional resources for this section, please email me at <saqib@seagate.com>. Thanks.

6.1. News groups

Some of the news groups of interest are:
6.2. Mailing Lists

Here are some relevant mailing lists.
6.4. Web Sites
6.5. XML Authoring / Modeling Applications