README for Arbortext Catalog Classes

(Debian GNU/Linux user version)

Mark Johnson, 5 Jul 2001

This document mostly consists of user-oriented excerpts from the original README document.

These Java classes implement the OASIS Entity Management Catalog format as well as an XML Catalog format for resolving XML public identifiers into accessible files or resources on a user's system or throughout the Web. These definitions can easily be incorporated into most Java-based XML processors, thereby giving the users of these processors all the benefits of public identifier use.

For more information, see also the Standard Deviations from Norm column "If You Can Name It, You Can Claim It!"

Installation

apt-get install arbortext-catalog

Use

This section provides a very brief overview of the classes. For more complete information, see the API Documentation.

Sample Applications

The sample applications demonstrate some of the features of Catalogs. Each of the examples that follows assumes that your current working directory is the directory where you unpacked the catalog distribution.

The catalog application

The catalog program parses one or more Catalog files and performs a single lookup based on catalog keywords. Running catalog with no arguments will display a summary of the following usage information.

Usage: catalog [options] command

The catalog program parses one or more Catalog files and performs a
single lookup of a public or system identifier. Running catalog with
no arguments will display a summary of this usage information.

Options:

 -h                 Print this help message
 -E                 Show usage examples
 -g                 Use GNU gij java interpreter

 -c <catalogfile>   Can be repeated to load several catalogs
 -d <debuglevel>    Parsing verbosity, an integer in [0-3]
 -p <parserClass>   Name of a parser class for reading Cowan XML Catalogs
 -s                 Load system catalogs & give them a higher search precedence
                     than catalogs specified via -c <catalogfile>

  Note: to use the -p option, the relevant class files
        needed by <parserClass> must be in your CLASSPATH

Commands take one of the following forms:

  document
  doctype  name publicid systemid
  entity   name publicid systemid
  notation name publicid systemid
  public        publicid systemid
  system                 systemid

 Arguments are positional, use the string "null" to indicate a null
 value.

'catalog' usage examples:
=========================

 --Input: catalog -s public "-//OASIS//DTD DocBook MathML Module V1.0//EN"

 --Output:
   Loading system catalogs.
   Set debug to: 0
   Resolving public:
	   Public: -//OASIS//DTD DocBook MathML Module V1.0//EN
	   System: null

   Resolved: file:/usr/share/sgml/docbook/custom/mathml/1.0/dbmathml.dtd
 ---------------------------------------------------------------------   

 --Input: catalog -s -d 0 \
          system "http://www.oasis-open.org/docbook/xml/mathml/1.0/dbmathml.dtd"

 --Output:
   Loading system catalogs.
   Set debug to: 0
   Resolving system:
	   Public: null
	   System: http://www.oasis-open.org/docbook/xml/mathml/1.0/dbmathml.dtd

   Resolved: file:/usr/share/sgml/docbook/custom/mathml/1.0/dbmathml.dtd
 --------------------------------------------------------------------- 

 --Input: catalog -c http://oasis-open.org/docbook/xml/4.1.2/docbook.cat \
                  public "-//OASIS//DTD DocBook XML V4.1.2//EN"
 --Output:
   Ignoring system catalogs.
   Set debug to: 0
   Adding catalog: http://oasis-open.org/docbook/xml/4.1.2/docbook.cat
   Resolving public:
	   Public: -//OASIS//DTD DocBook XML V4.1.2//EN
	   System: null

   Resolved: http://oasis-open.org/docbook/xml/4.1.2/docbookx.dtd

Example 1. Using catalog with the example files

In the following example, catalog loads the OASIS Catalog file test/catalog, looks up the requested public identifier, and displays the resulting system identifier.

 $ catalog -c test/catalog public "-//Arbortext//TEXT Test Public Identifier//EN"

Example 2. Using catalog with system files
 $ catalog -d 0 -c /etc/sgml/catalog public "-//OASIS//DTD DocBook XML V4.1.2//EN"
		Ignoring system catalogs.
		Set debug to: 0
		Adding catalog: /etc/sgml/catalog
		Resolving PUBLIC:
		Public: -//OASIS//DTD DocBook XML V4.1.2//EN
		System: null

		Resolved: file:/usr/share/sgml/docbook/dtd/xml/4.1.2/docbookx.dtd

There are a number of options that you can pass to the catalog program:

catalog command line options
Option           Example                                  Description
-c catalogfile   -c test/catalog                          Load the specified catalog file.
-d debuglevel    -d 1                                     Set the debug level; the default debug level is 3.
-p parserClass   -p org.apache.xerces.parsers.SAXParser   Select the SAX Parser to use to parse XML Catalog files.
-s               -s                                       Load system catalogs.

Running catalog with no arguments will display a summary of this usage information.

Note: to use the -p option, the class files for the parser class that you select must be on your CLASSPATH. In the example above, that means the Xerces parser from the Apache XML Project. You can use any SAX-compliant parser with the Catalog files.
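
A quick way to confirm that a parser class is visible before passing it to -p is to try loading it. The following is a minimal sketch using only the standard Java class loader; the class name ParserClassCheck is hypothetical and is not part of this package.

```java
// Hypothetical helper, not part of the Arbortext catalog package.
class ParserClassCheck {

    // Returns true if the named class can be found on the current CLASSPATH.
    // The class is located but not initialized.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ParserClassCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // Found, but a dependency is missing or incompatible.
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the parser class used in the option table above.
        String name = args.length > 0 ? args[0] : "org.apache.xerces.parsers.SAXParser";
        System.out.println(name + (isLoadable(name) ? " is" : " is NOT") + " on the CLASSPATH");
    }
}
```

Run it with the same CLASSPATH that you intend to use for catalog, passing the parser class name as the only argument.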

The eresolve application

The eresolve application demonstrates the use of a CatalogEntityResolver class as a SAX entityResolver hook.

 $ eresolve -c test/catalog test/test.xml

Example: Using the eresolve Command
 $ eresolve -d 2 -c test/catalog test/test.xml
Set debug to 2
Adding catalog: test/catalog
Loading catalog: test/catalog
Parsing test/test.xml
Resolved: -//Arbortext//TEXT Test Public Identifier//EN
        file:/N:/viewstores/nwalsh_saffron/Epic/src/xml/catalog/test/testpub.xml

Resolved: urn:x-arbortext:test-system-identifier
        file:/N:/viewstores/nwalsh_saffron/Epic/src/xml/catalog/test/testsys.xml

Done parsing test/test.xml

Brief descriptions of the sample files:

catalog

This is a Catalog with a few simple entries:

OVERRIDE  YES
PUBLIC    "-//Arbortext//TEXT Test Public Identifier//EN" "testpub.xml"

SYSTEM    "urn:x-arbortext:test-system-identifier"        "testsys.xml"

OVERRIDE NO
PUBLIC "-//Arbortext//TEXT Test Override//EN" "override.xml"
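
To make the structure of these entries concrete, here is a minimal sketch of reading them in Java. It is an illustration only, not the parser shipped with this package: the class CatalogLineParser is hypothetical, it handles only the three keywords shown above with one entry per line, and the initial OVERRIDE setting of NO is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; handles only OVERRIDE, PUBLIC and SYSTEM lines.
class CatalogLineParser {

    static final class Entry {
        final String type;       // "PUBLIC" or "SYSTEM"
        final String id;         // identifier to match
        final String uri;        // resource it maps to
        final boolean override;  // OVERRIDE setting in effect when the entry was read

        Entry(String type, String id, String uri, boolean override) {
            this.type = type;
            this.id = id;
            this.uri = uri;
            this.override = override;
        }
    }

    private static final Pattern ENTRY =
        Pattern.compile("^\\s*(PUBLIC|SYSTEM)\\s+\"([^\"]*)\"\\s+\"([^\"]*)\"\\s*$");
    private static final Pattern OVERRIDE =
        Pattern.compile("^\\s*OVERRIDE\\s+(YES|NO)\\s*$");

    static List<Entry> parse(String text) {
        List<Entry> entries = new ArrayList<Entry>();
        boolean override = false; // assumed initial setting
        for (String line : text.split("\\r?\\n")) {
            Matcher o = OVERRIDE.matcher(line);
            if (o.matches()) {
                // OVERRIDE applies to every entry that follows it.
                override = o.group(1).equals("YES");
                continue;
            }
            Matcher e = ENTRY.matcher(line);
            if (e.matches()) {
                entries.add(new Entry(e.group(1), e.group(2), e.group(3), override));
            }
            // Anything else (blank lines, other keywords) is ignored here.
        }
        return entries;
    }

    public static void main(String[] args) {
        String sample =
            "OVERRIDE  YES\n"
          + "PUBLIC    \"-//Arbortext//TEXT Test Public Identifier//EN\" \"testpub.xml\"\n"
          + "\n"
          + "SYSTEM    \"urn:x-arbortext:test-system-identifier\"        \"testsys.xml\"\n"
          + "\n"
          + "OVERRIDE NO\n"
          + "PUBLIC \"-//Arbortext//TEXT Test Override//EN\" \"override.xml\"\n";
        for (Entry e : parse(sample)) {
            System.out.println(e.type + " " + e.id + " -> " + e.uri + " (override=" + e.override + ")");
        }
    }
}
```

The point of the sketch is that OVERRIDE is not an entry of its own but a setting that is recorded on each PUBLIC or SYSTEM entry read after it.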
		    
test.xml

This is a test document that contains several external entities:

<!DOCTYPE test [
<!ENTITY testpub PUBLIC "-//Arbortext//TEXT Test Public Identifier//EN"
"bogus-system-identifier.xml">
<!ENTITY testsys SYSTEM "urn:x-arbortext:test-system-identifier">
<!ENTITY testovr PUBLIC "-//Arbortext//TEXT Test Override//EN"
"testovr.xml">
]
>
<test>
&testpub;
&testsys;
&testovr;
</test>

This XML document demonstrates several Catalog features:

If parsed without a catalog, the parse will fail since bogus-system-identifier.xml won't be found (and neither would the URN, unless you happen to have some other URN resolution mechanism running).

If parsed with the included catalog, the following substitutions will be made:

  • &testpub; will be replaced with the contents of testpub.xml, due to the mapping provided by the first PUBLIC entry in the catalog.

  • &testsys; will be replaced with the contents of testsys.xml, due to the mapping provided by the SYSTEM entry in the catalog.

  • &testovr; will be replaced with the contents of testovr.xml, the system identifier given in its entity declaration. The mapping provided by the second PUBLIC entry in the catalog is not used, because the entity declaration supplies a system identifier and the matching PUBLIC entry appears where OVERRIDE is NO.
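
The three substitutions above follow from a simple precedence rule, which can be sketched in Java as follows. This is a simplified illustration of the behaviour described in this README, not the lookup code of the catalog classes; OverrideSketch and its methods are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the OVERRIDE rule; not this package's API.
class OverrideSketch {
    private final Map<String, String> systemEntries = new LinkedHashMap<String, String>();
    private final Map<String, String> publicEntries = new LinkedHashMap<String, String>();
    private final Map<String, Boolean> publicOverride = new LinkedHashMap<String, Boolean>();

    void addSystem(String systemId, String uri) {
        systemEntries.put(systemId, uri);
    }

    void addPublic(String publicId, String uri, boolean override) {
        publicEntries.put(publicId, uri);
        publicOverride.put(publicId, Boolean.valueOf(override));
    }

    // Returns the resource to load for an entity declared with the given
    // identifiers; either argument may be null.
    String resolve(String publicId, String systemId) {
        // 1. A SYSTEM entry matching the declared system identifier wins.
        if (systemId != null && systemEntries.containsKey(systemId)) {
            return systemEntries.get(systemId);
        }
        // 2. A PUBLIC entry applies if no system identifier was declared,
        //    or if the entry was read while OVERRIDE was YES.
        if (publicId != null && publicEntries.containsKey(publicId)) {
            boolean override = publicOverride.get(publicId).booleanValue();
            if (systemId == null || override) {
                return publicEntries.get(publicId);
            }
        }
        // 3. Otherwise fall back to the declared system identifier.
        return systemId;
    }

    public static void main(String[] args) {
        OverrideSketch c = new OverrideSketch();
        c.addPublic("-//Arbortext//TEXT Test Public Identifier//EN", "testpub.xml", true);
        c.addSystem("urn:x-arbortext:test-system-identifier", "testsys.xml");
        c.addPublic("-//Arbortext//TEXT Test Override//EN", "override.xml", false);

        System.out.println(c.resolve("-//Arbortext//TEXT Test Public Identifier//EN",
                                     "bogus-system-identifier.xml"));
        System.out.println(c.resolve(null, "urn:x-arbortext:test-system-identifier"));
        System.out.println(c.resolve("-//Arbortext//TEXT Test Override//EN", "testovr.xml"));
    }
}
```

With the entries from the sample catalog, the program prints testpub.xml, testsys.xml and testovr.xml, one per line, matching the three cases listed above.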

In this example, the system catalog path is set to test/catalog and the XML Parser is asked to parse test/test.xml. In the course of this parsing, it will encounter entities which need to be resolved. The SAX entityResolver hook will use the catalog to locate appropriate resources. If you attempt to parse test/test.xml without a catalog, the parse will fail.

This example program uses the Xerces parser from the Apache XML Project, and those classes must be available in order to run eresolve. You can add the Catalog functionality to any SAX-compliant parser, but for the purpose of this example we have explicitly chosen the Xerces parser.
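
To show where the entityResolver hook sits, here is a minimal sketch that uses only the standard org.xml.sax interfaces and the JDK's default SAX parser. MapEntityResolver is a hypothetical stand-in backed by two in-memory maps; it is not the CatalogEntityResolver class from this package, and it ignores OVERRIDE for brevity.

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for a catalog-backed resolver.
class MapEntityResolver implements EntityResolver {
    private final Map<String, String> publicMap = new HashMap<String, String>();
    private final Map<String, String> systemMap = new HashMap<String, String>();

    void addPublic(String publicId, String uri) { publicMap.put(publicId, uri); }
    void addSystem(String systemId, String uri) { systemMap.put(systemId, uri); }

    // Called by the SAX parser for each external entity. Returning null
    // tells the parser to open the declared system identifier itself.
    public InputSource resolveEntity(String publicId, String systemId) {
        String uri = null;
        if (systemId != null) {
            uri = systemMap.get(systemId);
        }
        if (uri == null && publicId != null) {
            uri = publicMap.get(publicId);
        }
        return uri == null ? null : new InputSource(uri);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java MapEntityResolver <document.xml>");
            return;
        }
        MapEntityResolver resolver = new MapEntityResolver();
        // Mappings taken from the sample catalog; the paths are assumptions.
        resolver.addPublic("-//Arbortext//TEXT Test Public Identifier//EN", "test/testpub.xml");
        resolver.addSystem("urn:x-arbortext:test-system-identifier", "test/testsys.xml");

        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setEntityResolver(resolver);            // install the hook
        reader.setContentHandler(new DefaultHandler());
        reader.parse(args[0]);                         // for example test/test.xml
    }
}
```

Any SAX parser that honours setEntityResolver can be wired up the same way; only the object passed to it changes.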

Copyright

This code is placed in the public domain.

Warranty

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL ARBORTEXT OR ANY OTHER CONTRIBUTOR BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.