org.apache.xml.resolver
Class Catalog

java.lang.Object
  extended by org.apache.xml.resolver.Catalog
Direct Known Subclasses:
Resolver

public class Catalog
extends java.lang.Object

Represents OASIS Open Catalog files.

This class implements the semantics of OASIS Open Catalog files (defined by OASIS Technical Resolution 9401:1997 (Amendment 2 to TR 9401)).

The primary purpose of the Catalog is to associate resources in the document with local system identifiers. Some entities (document types, XML entities, and notations) have names and all of them can have either public or system identifiers or both. (In XML, only a notation can have a public identifier without a system identifier, but the methods implemented in this class obey the Catalog semantics from the SGML days when system identifiers were optional.)

The system identifiers returned by the resolution methods in this class are valid, i.e. usable by, and in fact constructed by, the java.net.URL class. Unfortunately, java.net.URL seems to behave in somewhat non-standard ways, and the system identifiers returned may not be directly usable in a browser or filesystem context.

This class recognizes all of the Catalog entries defined in TR9401:1997.

Note that BASE entries are treated as described by RFC2396. In particular, this has the counter-intuitive property that after a BASE entry identifying "http://example.com/a/b/c" as the base URI, the relative URI "foo" is resolved to the absolute URI "http://example.com/a/b/foo". If you do not want the final component of the path to be discarded (as a filename would be in a URI for a resource), you must provide the trailing slash: "http://example.com/a/b/c/".
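
The resolution rule described above can be reproduced with the JDK's java.net.URI class. The following is a minimal sketch of the rule only; it does not use the Catalog class, and the class name BaseResolutionDemo is hypothetical.

```java
import java.net.URI;

final class BaseResolutionDemo {
    /** Resolves a relative reference against a base URI using java.net.URI. */
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        // No trailing slash: the final path component "c" is discarded.
        System.out.println(resolve("http://example.com/a/b/c", "foo"));
        // prints http://example.com/a/b/foo

        // Trailing slash: "c" is kept as part of the path.
        System.out.println(resolve("http://example.com/a/b/c/", "foo"));
        // prints http://example.com/a/b/c/foo
    }
}
```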

Note that subordinate catalogs (all catalogs except the first, including CATALOG and DELEGATE* catalogs) are only loaded if and when they are required.

This class relies on classes which implement the CatalogReader interface to actually load catalog files. This allows the catalog semantics to be implemented for TR9401 text-based catalogs, XML catalogs, or any number of other storage formats.

Additional catalogs may also be loaded with the parseCatalog(java.lang.String) method.

Change Log:

2.0

Rewrite to use CatalogReaders.

1.1

Allow quoted components in xml.catalog.files so that URLs containing colons can be used on Unix. The string passed to xml.catalog.files can now have the form:

 unquoted-path-with-no-sep-chars:"double-quoted path with or without sep chars":'single-quoted path with or without sep chars'
 

(Where ":" is the separator character in this example.)

If an unquoted path contains an embedded double or single quote character, no special processing is performed on that character. No path can contain separator characters, double quotes, and single quotes simultaneously.
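
The quoting rules above can be sketched as a small splitter. This is an illustration of the described syntax under stated assumptions, not the parser used by this class: the class name CatalogPathSplitter and its handling of an unterminated quote (taking the rest of the string) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class CatalogPathSplitter {
    /** Splits a path list on sep, honouring components wrapped in '...' or "...". */
    static List<String> split(String list, char sep) {
        List<String> paths = new ArrayList<>();
        int i = 0;
        while (i < list.length()) {
            char c = list.charAt(i);
            if (c == sep) {            // skip separators between components
                i++;
                continue;
            }
            int end;
            if (c == '"' || c == '\'') {
                // Quoted component: runs to the matching quote, may contain sep.
                end = list.indexOf(c, i + 1);
                if (end < 0) end = list.length();   // assumption: unterminated quote takes the rest
                paths.add(list.substring(Math.min(i + 1, end), end));
            } else {
                // Unquoted component: runs to the next separator; embedded quotes are kept as-is.
                end = list.indexOf(sep, i);
                if (end < 0) end = list.length();
                paths.add(list.substring(i, end));
            }
            i = end + 1;
        }
        return paths;
    }

    public static void main(String[] args) {
        System.out.println(split("a.cat:\"http://example.com/b.cat\":'c d.cat'", ':'));
        // prints [a.cat, http://example.com/b.cat, c d.cat]
    }
}
```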

Fix bug in calculation of BASE entries: if a catalog contains multiple BASE entries, each is relative to the preceding base, not the default base URI of the catalog.

1.0.1

Fixed a bug in the calculation of the list of subordinate catalogs. This bug caused an infinite loop where parsing would alternately process two catalogs indefinitely.

Version:
1.0

Derived from public domain code originally published by Arbortext, Inc.

Author:
Norman Walsh Norman.Walsh@Sun.COM
See Also:
CatalogReader, CatalogEntry

Field Summary
static int BASE
          The BASE Catalog Entry type.
static int CATALOG
          The CATALOG Catalog Entry type.
static int DELEGATE_PUBLIC
          The DELEGATE_PUBLIC Catalog Entry type.
static int DELEGATE_SYSTEM
          The DELEGATE_SYSTEM Catalog Entry type.
static int DELEGATE_URI
          The DELEGATE_URI Catalog Entry type.
static int DOCTYPE
          The DOCTYPE Catalog Entry type.
static int DOCUMENT
          The DOCUMENT Catalog Entry type.
static int DTDDECL
          The DTDDECL Catalog Entry type.
static int ENTITY
          The ENTITY Catalog Entry type.
static int LINKTYPE
          The LINKTYPE Catalog Entry type.
static int NOTATION
          The NOTATION Catalog Entry type.
static int OVERRIDE
          The OVERRIDE Catalog Entry type.
static int PUBLIC
          The PUBLIC Catalog Entry type.
static int REWRITE_SYSTEM
          The REWRITE_SYSTEM Catalog Entry type.
static int REWRITE_URI
          The REWRITE_URI Catalog Entry type.
static int SGMLDECL
          The SGMLDECL Catalog Entry type.
static int SYSTEM
          The SYSTEM Catalog Entry type.
static int SYSTEM_SUFFIX
          The SYSTEM_SUFFIX Catalog Entry type.
static int URI
          The URI Catalog Entry type.
static int URI_SUFFIX
          The URI_SUFFIX Catalog Entry type.
 
Constructor Summary
Catalog()
          Constructs an empty Catalog.
Catalog(CatalogManager manager)
          Constructs an empty Catalog with a specific CatalogManager.
 
Method Summary
 void addEntry(CatalogEntry entry)
          Clean up and process a Catalog entry.
 void addReader(java.lang.String mimeType, CatalogReader reader)
          Add a new CatalogReader to the Catalog.
 CatalogManager getCatalogManager()
          Return the CatalogManager used by this catalog.
 java.lang.String getCurrentBase()
          Returns the current base URI.
 java.lang.String getDefaultOverride()
          Returns the default override setting associated with this catalog.
 void loadSystemCatalogs()
          Load the system catalog files.
 void parseAllCatalogs()
          Parse all subordinate catalogs.
 void parseCatalog(java.lang.String fileName)
          Parse a catalog file, augmenting internal data structures.
 void parseCatalog(java.lang.String mimeType, java.io.InputStream is)
          Parse a catalog file, augmenting internal data structures.
 void parseCatalog(java.net.URL aUrl)
          Parse a catalog document, augmenting internal data structures.
 java.lang.String resolveDoctype(java.lang.String entityName, java.lang.String publicId, java.lang.String systemId)
          Return the applicable DOCTYPE system identifier.
 java.lang.String resolveDocument()
          Return the applicable DOCUMENT entry.
 java.lang.String resolveEntity(java.lang.String entityName, java.lang.String publicId, java.lang.String systemId)
          Return the applicable ENTITY system identifier.
 java.lang.String resolveNotation(java.lang.String notationName, java.lang.String publicId, java.lang.String systemId)
          Return the applicable NOTATION system identifier.
 java.lang.String resolvePublic(java.lang.String publicId, java.lang.String systemId)
          Return the applicable PUBLIC or SYSTEM identifier.
 java.lang.String resolveSystem(java.lang.String systemId)
          Return the applicable SYSTEM system identifier.
 java.lang.String resolveURI(java.lang.String uri)
          Return the applicable URI.
 void setCatalogManager(CatalogManager manager)
          Establish the CatalogManager used by this catalog.
 void setupReaders()
          Set up readers.
 void unknownEntry(java.util.Vector strings)
          Handle unknown CatalogEntry types.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

BASE

public static final int BASE
The BASE Catalog Entry type.


CATALOG

public static final int CATALOG
The CATALOG Catalog Entry type.


DOCUMENT

public static final int DOCUMENT
The DOCUMENT Catalog Entry type.


OVERRIDE

public static final int OVERRIDE
The OVERRIDE Catalog Entry type.


SGMLDECL

public static final int SGMLDECL
The SGMLDECL Catalog Entry type.


DELEGATE_PUBLIC

public static final int DELEGATE_PUBLIC
The DELEGATE_PUBLIC Catalog Entry type.


DELEGATE_SYSTEM

public static final int DELEGATE_SYSTEM
The DELEGATE_SYSTEM Catalog Entry type.


DELEGATE_URI

public static final int DELEGATE_URI
The DELEGATE_URI Catalog Entry type.


DOCTYPE

public static final int DOCTYPE
The DOCTYPE Catalog Entry type.


DTDDECL

public static final int DTDDECL
The DTDDECL Catalog Entry type.


ENTITY

public static final int ENTITY
The ENTITY Catalog Entry type.


LINKTYPE

public static final int LINKTYPE
The LINKTYPE Catalog Entry type.


NOTATION

public static final int NOTATION
The NOTATION Catalog Entry type.


PUBLIC

public static final int PUBLIC
The PUBLIC Catalog Entry type.


SYSTEM

public static final int SYSTEM
The SYSTEM Catalog Entry type.


URI

public static final int URI
The URI Catalog Entry type.


REWRITE_SYSTEM

public static final int REWRITE_SYSTEM
The REWRITE_SYSTEM Catalog Entry type.


REWRITE_URI

public static final int REWRITE_URI
The REWRITE_URI Catalog Entry type.


SYSTEM_SUFFIX

public static final int SYSTEM_SUFFIX
The SYSTEM_SUFFIX Catalog Entry type.


URI_SUFFIX

public static final int URI_SUFFIX
The URI_SUFFIX Catalog Entry type.

Constructor Detail

Catalog

public Catalog()
Constructs an empty Catalog.

The constructor interrogates the relevant system properties using the default (static) CatalogManager and initializes the catalog data structures.


Catalog

public Catalog(CatalogManager manager)
Constructs an empty Catalog with a specific CatalogManager.

The constructor interrogates the relevant system properties using the specified CatalogManager and initializes the catalog data structures.

Method Detail

getCatalogManager

public CatalogManager getCatalogManager()
Return the CatalogManager used by this catalog.


setCatalogManager

public void setCatalogManager(CatalogManager manager)
Establish the CatalogManager used by this catalog.


setupReaders

public void setupReaders()
Set up readers.


addReader

public void addReader(java.lang.String mimeType,
                      CatalogReader reader)
Add a new CatalogReader to the Catalog.

This method allows you to add a new CatalogReader to the catalog. The reader will be associated with the specified mimeType. You can only have one reader per mimeType.

In the absence of a mimeType (e.g., when reading a catalog directly from a file on the local system), the readers are attempted in the order that you add them to the Catalog.

Note that subordinate catalogs (created by CATALOG or DELEGATE* entries) get a copy of the set of readers present in the primary catalog when they are created. Readers added subsequently will not be available. For this reason, it is best to add all of the readers before the first call to parse a catalog.

Parameters:
mimeType - The MIME type associated with this reader.
reader - The CatalogReader to use.
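
The reader bookkeeping described for addReader can be sketched with an insertion-ordered map. This is only a model of the documented behaviour: ReaderRegistrySketch is a hypothetical class, readers are represented by plain names rather than CatalogReader instances, and keeping the original position when a MIME type is registered twice is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ReaderRegistrySketch {
    // Insertion-ordered map: at most one reader (here just a name) per MIME type.
    private final Map<String, String> readers = new LinkedHashMap<>();

    void addReader(String mimeType, String readerName) {
        readers.put(mimeType, readerName);
    }

    /** Readers in the order they would be tried when no MIME type is known. */
    List<String> tryOrder() {
        return new ArrayList<>(readers.values());
    }

    /** A subordinate registry gets a snapshot; later additions to the parent are not seen. */
    ReaderRegistrySketch copyForSubordinate() {
        ReaderRegistrySketch copy = new ReaderRegistrySketch();
        copy.readers.putAll(readers);
        return copy;
    }

    public static void main(String[] args) {
        ReaderRegistrySketch primary = new ReaderRegistrySketch();
        primary.addReader("application/xml", "xmlReader");
        ReaderRegistrySketch subordinate = primary.copyForSubordinate();
        primary.addReader("text/plain", "textReader"); // added too late for the subordinate
        System.out.println(primary.tryOrder());     // prints [xmlReader, textReader]
        System.out.println(subordinate.tryOrder()); // prints [xmlReader]
    }
}
```

The snapshot semantics are the reason the text recommends adding all readers before the first call to parse a catalog.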

getCurrentBase

public java.lang.String getCurrentBase()
Returns the current base URI.


getDefaultOverride

public java.lang.String getDefaultOverride()
Returns the default override setting associated with this catalog.

All catalog files loaded by this catalog will have the initial override setting specified by this default.


loadSystemCatalogs

public void loadSystemCatalogs()
                        throws java.net.MalformedURLException,
                               java.io.IOException
Load the system catalog files.

The method adds all of the catalogs specified in the xml.catalog.files property to the Catalog list.

Throws:
java.net.MalformedURLException - One of the system catalogs is identified with a filename that is not a valid URL.
java.io.IOException - One of the system catalogs cannot be read.

parseCatalog

public void parseCatalog(java.lang.String fileName)
                  throws java.net.MalformedURLException,
                         java.io.IOException
Parse a catalog file, augmenting internal data structures.

Parameters:
fileName - The filename of the catalog file to process
Throws:
java.net.MalformedURLException - The fileName cannot be turned into a valid URL.
java.io.IOException - Error reading catalog file.

parseCatalog

public void parseCatalog(java.lang.String mimeType,
                         java.io.InputStream is)
                  throws java.io.IOException,
                         CatalogException
Parse a catalog file, augmenting internal data structures.

Catalogs retrieved over the net may have an associated MIME type. The MIME type can be used to select an appropriate reader.

Parameters:
mimeType - The MIME type of the catalog file.
is - The InputStream from which the catalog should be read
Throws:
CatalogException - Failed to load catalog mimeType.
java.io.IOException - Error reading catalog file.

parseCatalog

public void parseCatalog(java.net.URL aUrl)
                  throws java.io.IOException
Parse a catalog document, augmenting internal data structures.

This method supports catalog files stored in jar files, e.g., "jar:file:///path/to/filename.jar!/path/to/catalog.xml". That URI doesn't survive transmogrification through the URI processing that parseCatalog(String) performs, and passing it as an input stream doesn't set the base URI appropriately.

Written by Stefan Wachter (2002-09-26)

Parameters:
aUrl - The URL of the catalog document to process
Throws:
java.io.IOException - Error reading catalog file.
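
To see why a URL object is a convenient carrier for a jar-based catalog, the sketch below shows how java.net.URL itself resolves a relative reference against a jar: URL, which is what a base URI is needed for. It illustrates JDK behaviour only and says nothing about this class's internals; JarCatalogUrlDemo and the relative path "dtd/doc.dtd" are hypothetical, and the jar file does not need to exist because no connection is opened.

```java
import java.net.MalformedURLException;
import java.net.URL;

final class JarCatalogUrlDemo {
    /** Resolves a relative reference against a catalog URL used as the base. */
    static String resolveAgainst(String catalogUrl, String relative) {
        try {
            // The URL constructors are deprecated in recent JDKs but still available.
            URL base = new URL(catalogUrl);
            return new URL(base, relative).toString();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) throws MalformedURLException {
        String catalog = "jar:file:///path/to/filename.jar!/path/to/catalog.xml";
        System.out.println(new URL(catalog).getProtocol()); // prints jar
        // The reference is resolved inside the jar, next to catalog.xml.
        System.out.println(resolveAgainst(catalog, "dtd/doc.dtd"));
    }
}
```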

addEntry

public void addEntry(CatalogEntry entry)
Clean up and process a Catalog entry.

This method processes each Catalog entry, changing mapped relative system identifiers into absolute ones (based on the current base URI), and maintaining other information about the current catalog.

Parameters:
entry - The CatalogEntry to process.
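
The absolutization step can be sketched with java.net.URI. This is a minimal model, assuming that each BASE entry is resolved against the preceding base (as stated in the change log for version 1.1); CurrentBaseSketch and its method names are hypothetical and are not part of this class.

```java
import java.net.URI;

final class CurrentBaseSketch {
    private URI base;

    /** Starts with the catalog's own URI as the default base. */
    CurrentBaseSketch(String catalogUri) {
        base = URI.create(catalogUri);
    }

    /** A BASE entry is resolved against the preceding base, then becomes the current base. */
    void applyBase(String newBase) {
        base = base.resolve(newBase);
    }

    /** Turns a possibly relative system identifier into an absolute one. */
    String absolutize(String systemId) {
        return base.resolve(systemId).toString();
    }

    public static void main(String[] args) {
        CurrentBaseSketch c = new CurrentBaseSketch("http://example.com/catalogs/catalog.xml");
        System.out.println(c.absolutize("dtd/a.dtd")); // prints http://example.com/catalogs/dtd/a.dtd
        c.applyBase("sgml/");
        c.applyBase("v2/");                            // relative to .../catalogs/sgml/
        System.out.println(c.absolutize("c.dtd"));     // prints http://example.com/catalogs/sgml/v2/c.dtd
    }
}
```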

unknownEntry

public void unknownEntry(java.util.Vector strings)
Handle unknown CatalogEntry types.

This method exists to allow subclasses to deal with unknown entry types.


parseAllCatalogs

public void parseAllCatalogs()
                      throws java.net.MalformedURLException,
                             java.io.IOException
Parse all subordinate catalogs.

This method recursively parses all of the subordinate catalogs. If this method does not throw an exception, you can be confident that no subsequent call to any resolve*() method will either, with two possible exceptions:

  1. Delegated catalogs are re-parsed each time they are needed (because a variable list of them may be needed in each case, depending on the length of the matching partial public identifier).

    But they are parsed by this method, so as long as they don't change or disappear while the program is running, they shouldn't generate errors later if they don't generate errors now.

  2. If you add new catalogs with parseCatalog, they won't be loaded until they are needed or until you call parseAllCatalogs again.

On the other hand, if you don't call this method, you may successfully parse documents without having to load all possible catalogs.
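
Item 1 above notes that the list of delegated catalogs varies with the matching partial public identifier. The sketch below shows one way such a list could be computed. It is not this class's implementation: DelegateMatchSketch is hypothetical, delegate entries are modelled as a prefix-to-catalog map, and ordering the matches longest prefix first is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DelegateMatchSketch {
    /** Returns the catalogs whose prefix matches publicId, longest prefix first. */
    static List<String> delegatesFor(Map<String, String> delegates, String publicId) {
        List<Map.Entry<String, String>> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : delegates.entrySet()) {
            if (publicId.startsWith(e.getKey())) {
                hits.add(e);
            }
        }
        // Assumption: more specific (longer) prefixes are consulted first.
        hits.sort(Comparator.comparingInt(
                (Map.Entry<String, String> e) -> e.getKey().length()).reversed());
        List<String> catalogs = new ArrayList<>();
        for (Map.Entry<String, String> e : hits) {
            catalogs.add(e.getValue());
        }
        return catalogs;
    }

    public static void main(String[] args) {
        Map<String, String> delegates = new LinkedHashMap<>();
        delegates.put("-//OASIS//", "oasis.cat");
        delegates.put("-//OASIS//DTD DocBook", "docbook.cat");
        delegates.put("-//W3C//", "w3c.cat");
        System.out.println(delegatesFor(delegates, "-//OASIS//DTD DocBook XML V4.5//EN"));
        // prints [docbook.cat, oasis.cat]
    }
}
```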

Throws:
java.net.MalformedURLException - The filename (URL) for a subordinate or delegated catalog is not a valid URL.
java.io.IOException - Error reading some subordinate or delegated catalog file.

resolveDoctype

public java.lang.String resolveDoctype(java.lang.String entityName,
                                       java.lang.String publicId,
                                       java.lang.String systemId)
                                throws java.net.MalformedURLException,
                                       java.io.IOException
Return the applicable DOCTYPE system identifier.

Parameters:
entityName - The name of the entity (element) for which a doctype is required.
publicId - The nominal public identifier for the doctype (as provided in the source document).
systemId - The nominal system identifier for the doctype (as provided in the source document).
Returns:
The system identifier to use for the doctype.
Throws:
java.net.MalformedURLException - The formal system identifier of a subordinate catalog cannot be turned into a valid URL.
java.io.IOException - Error reading subordinate catalog file.

resolveDocument

public java.lang.String resolveDocument()
                                 throws java.net.MalformedURLException,
                                        java.io.IOException
Return the applicable DOCUMENT entry.

Returns:
The system identifier to use for the document.
Throws:
java.net.MalformedURLException - The formal system identifier of a subordinate catalog cannot be turned into a valid URL.
java.io.IOException - Error reading subordinate catalog file.

resolveEntity

public java.lang.String resolveEntity(java.lang.String entityName,
                                      java.lang.String publicId,
                                      java.lang.String systemId)
                               throws java.net.MalformedURLException,
                                      java.io.IOException
Return the applicable ENTITY system identifier.

Parameters:
entityName - The name of the entity for which a system identifier is required.
publicId - The nominal public identifier for the entity (as provided in the source document).
systemId - The nominal system identifier for the entity (as provided in the source document).
Returns:
The system identifier to use for the entity.
Throws:
java.net.MalformedURLException - The formal system identifier of a subordinate catalog cannot be turned into a valid URL.
java.io.IOException - Error reading subordinate catalog file.

resolveNotation

public java.lang.String resolveNotation(java.lang.String notationName,
                                        java.lang.String publicId,
                                        java.lang.String systemId)
                                 throws java.net.MalformedURLException,
                                        java.io.IOException
Return the applicable NOTATION system identifier.

Parameters:
notationName - The name of the notation for which a system identifier is required.
publicId - The nominal public identifier for the notation (as provided in the source document).
systemId - The nominal system identifier for the notation (as provided in the source document).
Returns:
The system identifier to use for the notation.
Throws:
java.net.MalformedURLException - The formal system identifier of a subordinate catalog cannot be turned into a valid URL.
java.io.IOException - Error reading subordinate catalog file.

resolvePublic

public java.lang.String resolvePublic(java.lang.String publicId,
                                      java.lang.String systemId)
                               throws java.net.MalformedURLException,
                                      java.io.IOException
Return the applicable PUBLIC or SYSTEM identifier.

This method searches the Catalog and returns the system identifier specified for the given system or public identifiers. If no appropriate PUBLIC or SYSTEM entry is found in the Catalog, null is returned.

Parameters:
publicId - The public identifier to locate in the catalog. Public identifiers are normalized before comparison.
systemId - The nominal system identifier for the entity in question (as provided in the source document).
Returns:
The system identifier to use. Note that the nominal system identifier is not returned if a match is not found in the catalog; instead, null is returned to indicate that no match was found.
Throws:
java.net.MalformedURLException - The formal system identifier of a subordinate catalog cannot be turned into a valid URL.
java.io.IOException - Error reading subordinate catalog file.
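
Because resolvePublic returns null rather than the nominal system identifier, callers usually need their own fallback. The sketch below models that pattern with a plain map. PublicIdLookupSketch is hypothetical, and the normalization shown (collapsing whitespace runs to a single space and trimming) is an assumption about what "normalized" means here, not taken from this class.

```java
import java.util.HashMap;
import java.util.Map;

final class PublicIdLookupSketch {
    /** Assumed normalization: collapse whitespace runs to one space and trim both ends. */
    static String normalize(String publicId) {
        return publicId.replaceAll("[ \\t\\r\\n]+", " ").trim();
    }

    /** Returns the mapped system identifier, or null when no entry matches. */
    static String resolvePublic(Map<String, String> publicEntries, String publicId) {
        return publicEntries.get(normalize(publicId));
    }

    /** Caller-side fallback to the nominal system identifier when the lookup returns null. */
    static String resolveOrFallback(Map<String, String> publicEntries,
                                    String publicId, String systemId) {
        String mapped = resolvePublic(publicEntries, publicId);
        return mapped != null ? mapped : systemId;
    }

    public static void main(String[] args) {
        Map<String, String> entries = new HashMap<>();
        entries.put("-//OASIS//DTD DocBook XML//EN", "file:///usr/share/xml/docbook.dtd");
        System.out.println(resolvePublic(entries, "  -//OASIS//DTD   DocBook\n XML//EN "));
        // prints file:///usr/share/xml/docbook.dtd
        System.out.println(resolveOrFallback(entries, "-//Unknown//EN", "http://example.com/u.dtd"));
        // prints http://example.com/u.dtd
    }
}
```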

resolveSystem

public java.lang.String resolveSystem(java.lang.String systemId)
                               throws java.net.MalformedURLException,
                                      java.io.IOException
Return the applicable SYSTEM system identifier.

If a SYSTEM entry exists in the Catalog for the system ID specified, return the mapped value.

On Windows-based operating systems, the comparison between the system identifier provided and the SYSTEM entries in the Catalog is case-insensitive.

Parameters:
systemId - The system ID to locate in the catalog.
Returns:
The resolved system identifier.
Throws:
java.net.MalformedURLException - The formal system identifier of a subordinate catalog cannot be turned into a valid URL.
java.io.IOException - Error reading subordinate catalog file.
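
The platform-dependent comparison described for resolveSystem might look like the following. This is a sketch only: SystemIdMatchSketch is hypothetical, and detecting Windows through the os.name system property is an assumption rather than a statement about how this class does it.

```java
import java.util.Locale;

final class SystemIdMatchSketch {
    /** Assumption: Windows is detected from the os.name system property. */
    static boolean isWindows() {
        return System.getProperty("os.name", "").toLowerCase(Locale.ROOT).contains("windows");
    }

    /** Compares a SYSTEM entry's identifier with the requested one. */
    static boolean matches(String entryId, String systemId, boolean ignoreCase) {
        return ignoreCase ? entryId.equalsIgnoreCase(systemId) : entryId.equals(systemId);
    }

    public static void main(String[] args) {
        String entry = "file:///C:/DTD/Doc.dtd";
        String requested = "file:///c:/dtd/doc.dtd";
        // The result depends on the platform the program runs on.
        System.out.println(matches(entry, requested, isWindows()));
    }
}
```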

resolveURI

public java.lang.String resolveURI(java.lang.String uri)
                            throws java.net.MalformedURLException,
                                   java.io.IOException
Return the applicable URI.

If a URI entry exists in the Catalog for the URI specified, return the mapped value.

URI comparison is case sensitive.

Parameters:
uri - The URI to locate in the catalog.
Returns:
The resolved URI.
Throws:
java.net.MalformedURLException - The system identifier of a subordinate catalog cannot be turned into a valid URL.
java.io.IOException - Error reading subordinate catalog file.