This class runs through a PDF content stream, executes certain operations,
and provides a callback interface for clients that want to act on the stream.
See the PDFTextStripper class for an example of how to use this class.
getColorSpaces
public Map getColorSpaces()
getCurrentPage
public PDPage getCurrentPage()
Get the current page that is being processed.
- The page being processed.
getFonts
public Map getFonts()
getGraphicsStack
public Stack getGraphicsStack()
- Returns the graphicsStack.
getGraphicsState
public PDGraphicsState getGraphicsState()
- Returns the graphicsState.
getGraphicsStates
public Map getGraphicsStates()
- Returns the graphicsStates.
getTextLineMatrix
public Matrix getTextLineMatrix()
- Returns the textLineMatrix.
getTextMatrix
public Matrix getTextMatrix()
getXObjects
public Map getXObjects()
processOperator
public void processOperator(String operation,
List arguments)
throws IOException
This is used to handle an operation.
operation
- The operation to perform.
arguments
- The list of arguments.
processOperator
protected void processOperator(PDFOperator operator,
List arguments)
throws IOException
This is used to handle an operation.
operator
- The operation to perform.
arguments
- The list of arguments.
processStream
public void processStream(PDPage aPage,
PDResources resources,
COSStream cosStream)
throws IOException
This will process the contents of the stream.
aPage
- The page.
resources
- The location to retrieve resources.
cosStream
- The stream to execute.
processSubStream
public void processSubStream(PDPage aPage,
PDResources resources,
COSStream cosStream)
throws IOException
Process a sub stream of the current stream.
aPage
- The page used for drawing.
resources
- The resources used when processing the stream.
cosStream
- The stream to process.
registerOperatorProcessor
public void registerOperatorProcessor(String operator,
OperatorProcessor op)
Register a custom operator processor with the engine.
operator
- The operator as a string.
op
- Processor instance.
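The registration described above can be pictured as a map from operator name to handler. The following is a minimal, self-contained sketch of that idea, not the library's implementation: OperatorDispatchSketch, Handler, register, and process are hypothetical stand-ins for the engine, OperatorProcessor, registerOperatorProcessor, and processOperator, and recording unregistered operators in a log is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OperatorDispatchSketch {
    /** Hypothetical stand-in for an operator processor; not the library type. */
    interface Handler {
        void handle(String operator, List<Object> arguments);
    }

    // Operator name (for example "Tj") mapped to the handler registered for it.
    private final Map<String, Handler> handlers = new HashMap<>();
    final List<String> log = new ArrayList<>();

    // Registering the same operator twice replaces the earlier handler.
    void register(String operator, Handler handler) {
        handlers.put(operator, handler);
    }

    // Look up the handler by name and hand it the arguments.
    void process(String operator, List<Object> arguments) {
        Handler handler = handlers.get(operator);
        if (handler != null) {
            handler.handle(operator, arguments);
        } else {
            // Assumption: unknown operators are noted and skipped.
            log.add("unhandled:" + operator);
        }
    }

    public static void main(String[] args) {
        OperatorDispatchSketch engine = new OperatorDispatchSketch();
        engine.register("Tj", (op, a) -> engine.log.add("show:" + a.get(0)));
        engine.process("Tj", Arrays.<Object>asList("Hello"));
        engine.process("re", Arrays.<Object>asList(0, 0, 10, 10));
        System.out.println(engine.log); // prints [show:Hello, unhandled:re]
    }
}
```

Keying the lookup on the operator string keeps the engine itself free of per-operator logic, so a client can add or replace behavior for one operator without subclassing.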
resetEngine
public void resetEngine()
This method must be called between documents. The
PDFStreamEngine caches information for a document between pages,
and this method releases that cached information. It only needs
to be called when starting to process a new document.
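Why the reset matters can be shown with a small, self-contained sketch. It assumes a single per-document cache keyed by resource name; EngineCacheSketch, resource, and loads are hypothetical names, and the actual contents of the engine's cache are not described here.

```java
import java.util.HashMap;
import java.util.Map;

class EngineCacheSketch {
    // Hypothetical per-document cache: resource name -> loaded resource.
    private final Map<String, Object> cache = new HashMap<>();
    int loads = 0; // counts how often a resource had to be loaded

    // Return the cached resource, loading it on first use.
    Object resource(String name) {
        Object cached = cache.get(name);
        if (cached == null) {
            loads++;
            cached = new Object(); // stand-in for an expensive load
            cache.put(name, cached);
        }
        return cached;
    }

    // Drop everything cached for the previous document.
    void resetEngine() {
        cache.clear();
    }

    public static void main(String[] args) {
        EngineCacheSketch engine = new EngineCacheSketch();
        engine.resource("F1"); // document A, page 1: loaded
        engine.resource("F1"); // document A, page 2: served from the cache
        engine.resetEngine();  // new document: stale entries dropped
        engine.resource("F1"); // document B, page 1: loaded again
        System.out.println(engine.loads); // prints 2
    }
}
```

Without the reset, the name "F1" in the second document would resolve to the object cached for the first document, even though the two documents may define it differently.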
setColorSpaces
public void setColorSpaces(Map value)
value
- The colorSpaces to set.
setFonts
public void setFonts(Map value)
value
- The fonts to set.
setGraphicsStack
public void setGraphicsStack(Stack value)
value
- The graphicsStack to set.
setGraphicsState
public void setGraphicsState(PDGraphicsState value)
value
- The graphicsState to set.
setGraphicsStates
public void setGraphicsStates(Map value)
value
- The graphicsStates to set.
setTextLineMatrix
public void setTextLineMatrix(Matrix value)
value
- The textLineMatrix to set.
setTextMatrix
public void setTextMatrix(Matrix value)
value
- The textMatrix to set.
showCharacter
protected void showCharacter(TextPosition text)
A method provided as an event interface to allow a subclass to perform
some specific functionality when a character needs to be displayed.
text
- The character to be displayed.
showString
public void showString(byte[] string)
throws IOException
Override this method to perform an action when a
string is being shown.
string
- The string to display.
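The override pattern described above might look like the following sketch. It is self-contained and does not use the library: Engine is a hypothetical stand-in whose showString does nothing by default and omits the IOException, run is an invented substitute for stream processing, and decoding the bytes as Latin-1 is an assumption (real PDF strings are interpreted through the current font's encoding).

```java
import java.nio.charset.StandardCharsets;

class ShowStringSketch {
    /** Hypothetical stand-in for the engine; not the library class. */
    static class Engine {
        // Hook: the stand-in default does nothing.
        public void showString(byte[] string) {
        }

        // Stand-in for stream processing: calls the hook once per string.
        void run(byte[]... strings) {
            for (byte[] s : strings) {
                showString(s);
            }
        }
    }

    /** Subclass that collects every shown string into one buffer. */
    static class Collector extends Engine {
        final StringBuilder text = new StringBuilder();

        @Override
        public void showString(byte[] string) {
            // Assumption: bytes are Latin-1; a real engine would consult the font.
            text.append(new String(string, StandardCharsets.ISO_8859_1));
        }
    }

    public static void main(String[] args) {
        Collector collector = new Collector();
        collector.run("Hello, ".getBytes(StandardCharsets.ISO_8859_1),
                      "PDF".getBytes(StandardCharsets.ISO_8859_1));
        System.out.println(collector.text); // prints Hello, PDF
    }
}
```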