org.cyberneko.html
Class HTMLConfiguration
ParserConfigurationSettings
org.cyberneko.html.HTMLConfiguration
- XMLPullParserConfiguration
public class HTMLConfiguration
extends ParserConfigurationSettings
implements XMLPullParserConfiguration
An XNI-based parser configuration for parsing HTML documents.
This configuration can be used directly, or in conjunction with
any XNI-based tools, such as the Xerces2 implementation.
This configuration recognizes the following features:
- http://cyberneko.org/html/features/augmentations
- http://cyberneko.org/html/features/report-errors
- http://cyberneko.org/html/features/report-errors/simple
- http://cyberneko.org/html/features/balance-tags
- and the features supported by the scanner and tag balancer components.
This configuration recognizes the following properties:
- http://cyberneko.org/html/properties/names/elems
- http://cyberneko.org/html/properties/names/attrs
- http://cyberneko.org/html/properties/filters
- http://cyberneko.org/html/properties/error-reporter
- and the properties supported by the scanner and tag balancer.
For complete usage information, refer to the documentation.
$Id: HTMLConfiguration.java,v 1.9 2005/02/14 03:56:54 andyc Exp $
protected void | addComponent(HTMLComponent component) - Adds a component.
void | cleanup() - If the application decides to terminate parsing before the XML document is fully parsed, the application should call this method to free any resource allocated during parsing.
XMLDTDContentModelHandler | getDTDContentModelHandler() - Returns the DTD content model handler.
XMLDTDHandler | getDTDHandler() - Returns the DTD handler.
XMLDocumentHandler | getDocumentHandler() - Returns the document handler.
XMLEntityResolver | getEntityResolver() - Returns the entity resolver.
XMLErrorHandler | getErrorHandler() - Returns the error handler.
Locale | getLocale() - Returns the locale.
void | parse(XMLInputSource source) - Parses a document.
boolean | parse(boolean complete) - Parses the document in a pull parsing fashion.
void | pushInputSource(XMLInputSource inputSource) - Pushes an input source onto the current entity stack.
protected void | reset() - Resets the parser configuration.
void | setDTDContentModelHandler(XMLDTDContentModelHandler handler) - Sets the DTD content model handler.
void | setDTDHandler(XMLDTDHandler handler) - Sets the DTD handler.
void | setDocumentHandler(XMLDocumentHandler handler) - Sets the document handler.
void | setEntityResolver(XMLEntityResolver resolver) - Sets the entity resolver.
void | setErrorHandler(XMLErrorHandler handler) - Sets the error handler.
void | setFeature(String featureId, boolean state) - Sets a feature.
void | setInputSource(XMLInputSource inputSource) - Sets the input source for the document to parse.
void | setLocale(Locale locale) - Sets the locale.
void | setProperty(String propertyId, Object value) - Sets a property.
AUGMENTATIONS
protected static final String AUGMENTATIONS
Include infoset augmentations.
BALANCE_TAGS
protected static final String BALANCE_TAGS
Balance tags.
ERROR_DOMAIN
protected static final String ERROR_DOMAIN
Error domain.
ERROR_REPORTER
protected static final String ERROR_REPORTER
Error reporter.
FILTERS
protected static final String FILTERS
Pipeline filters.
NAMESPACES
protected static final String NAMESPACES
Namespaces.
NAMES_ATTRS
protected static final String NAMES_ATTRS
Modify HTML attribute names: { "upper", "lower", "default" }.
NAMES_ELEMS
protected static final String NAMES_ELEMS
Modify HTML element names: { "upper", "lower", "default" }.
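The effect of the three values listed for NAMES_ATTRS and NAMES_ELEMS can be sketched with a small stand-alone helper. This is an illustration only, not code from this class: the class name NameCaseSketch and the method modifyName are hypothetical, and the sketch assumes that "default" means the name is left exactly as it appears in the document, which this page does not state.

```java
import java.util.Locale;

/** Hypothetical helper illustrating the "upper" / "lower" / "default" settings. */
class NameCaseSketch {

    /**
     * Returns the name as it might be reported for the given setting.
     * ASSUMPTION: "default" leaves the name unchanged.
     */
    static String modifyName(String name, String setting) {
        switch (setting) {
            case "upper":
                return name.toUpperCase(Locale.ENGLISH);
            case "lower":
                return name.toLowerCase(Locale.ENGLISH);
            case "default":
                return name; // assumed: no modification
            default:
                throw new IllegalArgumentException("unknown names setting: " + setting);
        }
    }

    public static void main(String[] args) {
        System.out.println(modifyName("Body", "upper"));
        System.out.println(modifyName("Body", "lower"));
        System.out.println(modifyName("Body", "default"));
    }
}
```

A fixed locale is used for the case conversion so that the result does not depend on the platform's default locale.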
REPORT_ERRORS
protected static final String REPORT_ERRORS
Report errors.
SIMPLE_ERROR_FORMAT
protected static final String SIMPLE_ERROR_FORMAT
Simple report format.
XERCES_2_0_0
protected static boolean XERCES_2_0_0
Parser version is Xerces 2.0.0.
XERCES_2_0_1
protected static boolean XERCES_2_0_1
Parser version is Xerces 2.0.1.
XML4J_4_0_x
protected static boolean XML4J_4_0_x
Parser version is XML4J 4.0.x.
fCloseStream
protected boolean fCloseStream
True if the stream was opened by the parser and must therefore be
closed manually when parsing terminates.
fDTDContentModelHandler
protected XMLDTDContentModelHandler fDTDContentModelHandler
DTD content model handler.
fDTDHandler
protected XMLDTDHandler fDTDHandler
DTD handler.
fDocumentHandler
protected XMLDocumentHandler fDocumentHandler
Document handler.
fDocumentScanner
protected HTMLScanner fDocumentScanner
Document scanner.
fEntityResolver
protected XMLEntityResolver fEntityResolver
Entity resolver.
fErrorHandler
protected XMLErrorHandler fErrorHandler
Error handler.
fHTMLComponents
protected Vector fHTMLComponents
Components.
fLocale
protected Locale fLocale
Locale.
fNamespaceBinder
protected NamespaceBinder fNamespaceBinder
Namespace binder.
HTMLConfiguration
public HTMLConfiguration()
Default constructor.
addComponent
protected void addComponent(HTMLComponent component)
Adds a component.
cleanup
public void cleanup()
If the application decides to terminate parsing before the XML document
is fully parsed, the application should call this method to free any
resources allocated during parsing, for example by closing all opened streams.
getDTDContentModelHandler
public XMLDTDContentModelHandler getDTDContentModelHandler()
Returns the DTD content model handler.
getDTDHandler
public XMLDTDHandler getDTDHandler()
Returns the DTD handler.
getDocumentHandler
public XMLDocumentHandler getDocumentHandler()
Returns the document handler.
getEntityResolver
public XMLEntityResolver getEntityResolver()
Returns the entity resolver.
getErrorHandler
public XMLErrorHandler getErrorHandler()
Returns the error handler.
getLocale
public Locale getLocale()
Returns the locale.
parse
public void parse(XMLInputSource source)
throws XNIException,
IOException
Parses a document.
parse
public boolean parse(boolean complete)
throws XNIException,
IOException
Parses the document in a pull parsing fashion.
complete
- True if the pull parser should parse the remaining document completely.
Returns:
- True if there is more of the document to parse.
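The contract of parse(boolean complete) can be illustrated with a toy pull parser that uses only the standard library. This is a sketch under stated assumptions, not the behavior of HTMLConfiguration itself: the class PullParseSketch, its events list, and the idea that one call with complete set to false consumes exactly one chunk are all hypothetical. How much input a real call consumes is not specified on this page.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy pull parser; a stand-in used only to illustrate the parse(boolean) contract. */
class PullParseSketch {
    private final String[] chunks;
    private int next = 0;
    final List<String> events = new ArrayList<>();

    PullParseSketch(String... chunks) {
        this.chunks = chunks;
    }

    /**
     * If complete is true, consumes everything that remains; otherwise
     * consumes at most one chunk (an assumption made for this sketch).
     * Returns true while there is more input to parse.
     */
    boolean parse(boolean complete) {
        do {
            if (next < chunks.length) {
                events.add(chunks[next++]);
            }
        } while (complete && next < chunks.length);
        return next < chunks.length;
    }

    public static void main(String[] args) {
        PullParseSketch parser = new PullParseSketch("<p>", "text", "</p>");
        // Pull one step at a time; the caller may stop between steps.
        while (parser.parse(false)) {
            System.out.println("so far: " + parser.events);
        }
        System.out.println("done: " + parser.events);
    }
}
```

With the real configuration, the analogous loop would presumably call setInputSource once and then parse(false) repeatedly until it returns false, calling cleanup() if the loop is abandoned early.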
pushInputSource
public void pushInputSource(XMLInputSource inputSource)
Pushes an input source onto the current entity stack. This
enables the scanner to transparently scan new content (e.g.
the output written by an embedded script). At the end of the
current entity, the scanner returns where it left off at the
time this entity source was pushed.
Hint:
To use this feature to insert the output of <SCRIPT> tags, remember
to buffer the entire output of the processed instructions before
pushing a new input source. Otherwise, events may appear out of sequence.
inputSource
- The new input source to start scanning.
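The stack behavior described for pushInputSource can be modeled in a few lines of standard-library Java. This is a simplified sketch, not the scanner's actual implementation: the class EntityStackSketch, the '|' marker standing in for a script element, and the string "XY" standing in for buffered script output are all hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy scanner with a stack of character sources; not NekoHTML's implementation. */
class EntityStackSketch {
    private final Deque<Reader> stack = new ArrayDeque<>();

    EntityStackSketch(Reader document) {
        stack.push(document);
    }

    /** New content is scanned first; the interrupted source resumes afterwards. */
    void pushInputSource(Reader source) {
        stack.push(source);
    }

    /** Returns the next character, or -1 once every source is exhausted. */
    int read() throws IOException {
        while (!stack.isEmpty()) {
            int c = stack.peek().read();
            if (c != -1) {
                return c;
            }
            stack.pop().close(); // finished entity: resume the one beneath it
        }
        return -1;
    }

    /** Scans "ab|cd", injecting fully buffered output at the '|' marker. */
    static String demo() {
        EntityStackSketch scanner = new EntityStackSketch(new StringReader("ab|cd"));
        StringBuilder out = new StringBuilder();
        try {
            int c;
            while ((c = scanner.read()) != -1) {
                if (c == '|') {
                    // Buffer the whole output first, then push it in one piece.
                    String buffered = "XY";
                    scanner.pushInputSource(new StringReader(buffered));
                } else {
                    out.append((char) c);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this model demo() returns "abXYcd": the pushed content is scanned in full before the original source continues from the point where it was interrupted. Pushing the output only after it has been buffered completely keeps it contiguous, which is the concern raised in the hint above.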
reset
protected void reset()
throws XMLConfigurationException
Resets the parser configuration.
setDTDContentModelHandler
public void setDTDContentModelHandler(XMLDTDContentModelHandler handler)
Sets the DTD content model handler.
setDTDHandler
public void setDTDHandler(XMLDTDHandler handler)
Sets the DTD handler.
setDocumentHandler
public void setDocumentHandler(XMLDocumentHandler handler)
Sets the document handler.
setEntityResolver
public void setEntityResolver(XMLEntityResolver resolver)
Sets the entity resolver.
setErrorHandler
public void setErrorHandler(XMLErrorHandler handler)
Sets the error handler.
setFeature
public void setFeature(String featureId,
boolean state)
throws XMLConfigurationException
Sets a feature.
setInputSource
public void setInputSource(XMLInputSource inputSource)
throws XMLConfigurationException,
IOException
Sets the input source for the document to parse.
inputSource
- The document's input source.
setLocale
public void setLocale(Locale locale)
Sets the locale.
setProperty
public void setProperty(String propertyId,
Object value)
throws XMLConfigurationException
Sets a property.
(C) Copyright 2002-2005, Andy Clark. All rights reserved.