org.pdfbox.pdfparser

Class BaseParser

Known Direct Subclasses:
PDFObjectStreamParser, PDFParser, PDFStreamParser

public abstract class BaseParser
extends java.lang.Object

This class contains the parsing logic that is shared by its subclasses, such as PDFParser and PDFStreamParser.
Version:
$Revision: 1.59 $
Author:
Ben Litchfield

Field Summary

static String
DEF
This is a string constant that will be used for comparisons.
static byte[]
ENDSTREAM
This is a byte array that will be used for comparisons.
protected PushBackInputStream
pdfSource
This is the stream that will be read from.

Constructor Summary

BaseParser(InputStream input)
Constructor.
BaseParser(byte[] input)
Constructor.

Method Summary

void
addXref(PDFXref xref)
This will add an xref.
List
getXrefs()
This will get all of the xrefs.
protected boolean
isClosing()
This will tell if the next character is a closing bracket (close of a PDF array).
protected boolean
isClosing(int c)
This will tell if the given character is a closing bracket (close of a PDF array).
protected boolean
isEOL()
This will tell if the next byte to be read is an end of line byte.
protected boolean
isEOL(int c)
This will tell if the given byte is an end of line byte.
protected boolean
isEndOfName(char ch)
Determine if a character terminates a PDF name.
protected boolean
isWhitespace()
This will tell if the next byte is whitespace or not.
protected boolean
isWhitespace(int c)
This will tell if the given byte is whitespace or not.
protected COSBoolean
parseBoolean()
This will parse a boolean object from the stream.
protected COSArray
parseCOSArray()
This will parse a PDF array object.
protected COSDictionary
parseCOSDictionary()
This will parse a PDF dictionary.
protected COSName
parseCOSName()
This will parse a PDF name from the stream.
protected COSStream
parseCOSStream(COSDictionary dic, RandomAccess file)
This will read a COSStream from the input stream.
protected COSString
parseCOSString()
This will parse a PDF string.
protected COSBase
parseDirObject()
This will parse a direct object from the stream.
protected String
readExpectedString(String theString)
This will read the string that is expected to come next in the stream.
protected int
readInt()
This will read an integer from the stream.
protected String
readLine()
This will read bytes until the end of line marker occurs.
protected String
readString()
This will read the next string from the stream.
protected String
readString(int length)
This will read the next string from the stream up to a certain length.
void
setDocument(COSDocument doc)
Set the document for this stream.
protected void
skipSpaces()
This will skip all spaces and comments that are present.

Field Details

DEF

public static final String DEF
This is a string constant that will be used for comparisons.

ENDSTREAM

public static final byte[] ENDSTREAM
This is a byte array that will be used for comparisons.

pdfSource

protected PushBackInputStream pdfSource
This is the stream that will be read from.
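
A pushback stream lets a parser look at the next byte without consuming it, which is presumably what the no-argument isClosing(), isEOL() and isWhitespace() checks rely on. The following is a minimal sketch of that idea, not the class's actual code. It assumes java.io.PushbackInputStream as a stand-in for the PushBackInputStream type above; PeekSketch and peek are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Illustrative only: PeekSketch and peek() are hypothetical names, not PDFBox API.
final class PeekSketch {
    // Returns the next byte without consuming it, or -1 at end of stream.
    static int peek(PushbackInputStream in) throws IOException {
        int c = in.read();
        if (c != -1) {
            in.unread(c); // push the byte back so the next read() returns it again
        }
        return c;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "]".getBytes(StandardCharsets.US_ASCII);
        PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(data));
        System.out.println((char) peek(in));   // prints ]
        System.out.println((char) in.read());  // prints ] again: peek() did not consume it
        System.out.println(peek(in));          // prints -1: end of stream
    }
}
```

Only one byte is ever pushed back at a time, so the default pushback buffer size of one byte is sufficient for this pattern.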

Constructor Details

BaseParser

public BaseParser(InputStream input)
            throws IOException
Constructor.
Parameters:
input - The input stream to read the data from.

BaseParser

protected BaseParser(byte[] input)
            throws IOException
Constructor.
Parameters:
input - The array to read the data from.

Method Details

addXref

public void addXref(PDFXref xref)
This will add an xref.
Parameters:
xref - The xref to add.

getXrefs

public List getXrefs()
This will get all of the xrefs.
Returns:
A list of all xrefs.

isClosing

protected boolean isClosing()
            throws IOException
This will tell if the next character is a closing bracket (close of a PDF array).
Returns:
true if the next byte is ']', false otherwise.

isClosing

protected boolean isClosing(int c)
This will tell if the given character is a closing bracket (close of a PDF array).
Parameters:
c - The character to check against the closing bracket.
Returns:
true if c is ']', false otherwise.

isEOL

protected boolean isEOL()
            throws IOException
This will tell if the next byte to be read is an end of line byte.
Returns:
true if the next byte is 0x0A or 0x0D.

isEOL

protected boolean isEOL(int c)
This will tell if the given byte is an end of line byte.
Parameters:
c - The character to check against end of line
Returns:
true if c is 0x0A or 0x0D.
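
The two isEOL variants can be pictured as one test on a byte value plus a peek at the stream. The sketch below is an illustration under assumptions, not the actual implementation: EolSketch is a hypothetical name, and java.io.PushbackInputStream stands in for the pdfSource field.

```java
import java.io.IOException;
import java.io.PushbackInputStream;

// Illustrative only: EolSketch is a hypothetical name, not PDFBox API.
final class EolSketch {
    // True if c is line feed (0x0A) or carriage return (0x0D).
    static boolean isEOL(int c) {
        return c == 0x0A || c == 0x0D;
    }

    // Peeks at the next byte of the stream and applies the same test.
    static boolean isEOL(PushbackInputStream in) throws IOException {
        int c = in.read();
        if (c != -1) {
            in.unread(c); // leave the byte in the stream for the caller
        }
        return isEOL(c);
    }
}
```

Because the stream variant pushes the byte back, calling it repeatedly does not advance the read position.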

isEndOfName

protected boolean isEndOfName(char ch)
Determine if a character terminates a PDF name.
Parameters:
ch - The character
Returns:
true if the character terminates a PDF name, otherwise false.
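
The page does not list which characters terminate a name. As a hedged illustration, the sketch below follows the PDF syntax rule that a name token ends at a whitespace or delimiter character; the exact set that BaseParser checks may differ. NameEndSketch is a hypothetical name.

```java
// Illustrative only: NameEndSketch is a hypothetical name, and the character
// set is taken from the PDF syntax rules, not from the BaseParser source.
final class NameEndSketch {
    static boolean isEndOfName(char ch) {
        switch (ch) {
            // PDF whitespace: NUL, tab, line feed, form feed, carriage return, space
            case 0: case 9: case 10: case 12: case 13: case 32:
            // PDF delimiters
            case '(': case ')': case '<': case '>':
            case '[': case ']': case '{': case '}':
            case '/': case '%':
                return true;
            default:
                return false;
        }
    }
}
```

Under this rule, in the input /Type/Page the second slash ends the name Type and starts the name Page.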

isWhitespace

protected boolean isWhitespace()
            throws IOException
This will tell if the next byte is whitespace or not.
Returns:
true if the next byte in the stream is a whitespace character.

isWhitespace

protected boolean isWhitespace(int c)
This will tell if the given byte is whitespace or not.
Parameters:
c - The character to check against whitespace
Returns:
true if c is a whitespace character.
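
The page does not say which bytes count as whitespace. The sketch below assumes the six whitespace characters defined by the PDF syntax (NUL, tab, line feed, form feed, carriage return, space); whether BaseParser uses exactly this set is an assumption. WhitespaceSketch is a hypothetical name.

```java
// Illustrative only: WhitespaceSketch is a hypothetical name, and the set of
// bytes is the PDF specification's whitespace set, assumed rather than confirmed.
final class WhitespaceSketch {
    static boolean isWhitespace(int c) {
        return c == 0x00   // NUL
            || c == 0x09   // horizontal tab
            || c == 0x0A   // line feed
            || c == 0x0C   // form feed
            || c == 0x0D   // carriage return
            || c == 0x20;  // space
    }
}
```

The end-of-stream value -1 is not whitespace under this definition, so loops that skip whitespace terminate at end of input.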

parseBoolean

protected COSBoolean parseBoolean()
            throws IOException
This will parse a boolean object from the stream.
Returns:
The parsed boolean object.
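
A PDF boolean is written as the keyword true or false. The following sketch shows one way such a keyword could be read from a stream. It is an illustration only: BooleanSketch is a hypothetical name, a plain Java boolean stands in for COSBoolean, and the error handling is an assumption.

```java
import java.io.IOException;
import java.io.PushbackInputStream;

// Illustrative only: BooleanSketch is a hypothetical name, and a plain Java
// boolean stands in for the COSBoolean that the real method returns.
final class BooleanSketch {
    static boolean parseBoolean(PushbackInputStream in) throws IOException {
        int first = in.read();
        String expected;
        if (first == 't') {
            expected = "true";
        } else if (first == 'f') {
            expected = "false";
        } else {
            throw new IOException("Expected 'true' or 'false' but found byte " + first);
        }
        // The first character already matched; check the rest of the keyword.
        for (int i = 1; i < expected.length(); i++) {
            int c = in.read();
            if (c != expected.charAt(i)) {
                throw new IOException("Expected '" + expected + "' but the keyword was malformed");
            }
        }
        return first == 't';
    }
}
```

The sketch consumes exactly the keyword and leaves the byte after it unread.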

parseCOSArray

protected COSArray parseCOSArray()
            throws IOException
This will parse a PDF array object.
Returns:
The parsed PDF array.

parseCOSDictionary

protected COSDictionary parseCOSDictionary()
            throws IOException
This will parse a PDF dictionary.
Returns:
The parsed dictionary.

parseCOSName

protected COSName parseCOSName()
            throws IOException
This will parse a PDF name from the stream.
Returns:
The parsed PDF name.

parseCOSStream

protected COSStream parseCOSStream(COSDictionary dic,
                                   RandomAccess file)
            throws IOException
This will read a COSStream from the input stream.
Parameters:
dic - The dictionary that goes with this stream.
file - The file to write the stream to when reading.
Returns:
The parsed pdf stream.

parseCOSString

protected COSString parseCOSString()
            throws IOException
This will parse a PDF string.
Returns:
The parsed PDF string.

parseDirObject

protected COSBase parseDirObject()
            throws IOException
This will parse a direct object from the stream.
Returns:
The parsed object.

readExpectedString

protected String readExpectedString(String theString)
            throws IOException
This will read the string that is expected to come next in the stream.
Parameters:
theString - The next expected string in the stream.
Returns:
The characters between the current position and the end of the line.

readInt

protected int readInt()
            throws IOException
This will read an integer from the stream.
Returns:
The integer that was read from the stream.
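
Reading an integer from a byte stream typically means skipping leading whitespace, collecting digits, and pushing back the byte that ended the number. The sketch below illustrates that pattern under assumptions: ReadIntSketch is a hypothetical name, the whitespace set is the one from the PDF syntax, and the sign handling and error message are not taken from the BaseParser source.

```java
import java.io.IOException;
import java.io.PushbackInputStream;

// Illustrative only: ReadIntSketch is a hypothetical name, not PDFBox API.
final class ReadIntSketch {
    static int readInt(PushbackInputStream in) throws IOException {
        int c = in.read();
        // Skip leading PDF whitespace (NUL, tab, LF, FF, CR, space).
        while (c == 0 || c == 9 || c == 10 || c == 12 || c == 13 || c == 32) {
            c = in.read();
        }
        StringBuilder digits = new StringBuilder();
        if (c == '-') {
            digits.append('-');
            c = in.read();
        } else if (c == '+') {
            c = in.read(); // an explicit plus sign carries no information
        }
        while (c >= '0' && c <= '9') {
            digits.append((char) c);
            c = in.read();
        }
        if (c != -1) {
            in.unread(c); // the terminating byte belongs to the next token
        }
        try {
            return Integer.parseInt(digits.toString());
        } catch (NumberFormatException e) {
            throw new IOException("Expected an integer but read '" + digits + "'");
        }
    }
}
```

For an object header such as "12 0 obj", two successive calls would yield the object number and the generation number, leaving the stream positioned before " obj".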

readLine

protected String readLine()
            throws IOException
This will read bytes until the end of line marker occurs.
Returns:
The characters between the current position and the end of the line.
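
As a rough illustration of reading up to an end of line marker, the sketch below collects bytes until it sees 0x0A or 0x0D. Treating a carriage return followed by a line feed as a single marker is an assumption of this sketch, not something this page states. ReadLineSketch is a hypothetical name.

```java
import java.io.IOException;
import java.io.PushbackInputStream;

// Illustrative only: ReadLineSketch is a hypothetical name, not PDFBox API.
final class ReadLineSketch {
    static String readLine(PushbackInputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int c = in.read();
        while (c != -1 && c != 0x0A && c != 0x0D) {
            line.append((char) c);
            c = in.read();
        }
        // Assumption: CR LF counts as one end of line marker, so a line feed
        // directly after a carriage return is consumed as well.
        if (c == 0x0D) {
            int next = in.read();
            if (next != 0x0A && next != -1) {
                in.unread(next); // not part of the marker: give it back
            }
        }
        return line.toString();
    }
}
```

The returned string excludes the end of line bytes, and an empty string is returned at end of stream.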

readString

protected String readString()
            throws IOException
This will read the next string from the stream.
Returns:
The string that was read from the stream.

readString

protected String readString(int length)
            throws IOException
This will read the next string from the stream up to a certain length.
Parameters:
length - The length to stop reading at.
Returns:
The string that was read from the stream, between 0 and length characters long.
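
The length cap can be pictured as a second stopping condition next to the usual token terminator. The sketch below is a simplified illustration: it assumes the stream is already positioned at the start of the token and that only whitespace ends a token, which may be narrower than what BaseParser checks. ReadStringSketch is a hypothetical name.

```java
import java.io.IOException;
import java.io.PushbackInputStream;

// Illustrative only: ReadStringSketch is a hypothetical name, not PDFBox API.
final class ReadStringSketch {
    private static boolean isWhitespace(int c) {
        return c == 0 || c == 9 || c == 10 || c == 12 || c == 13 || c == 32;
    }

    // Reads characters until whitespace, end of stream, or 'length' characters.
    static String readString(PushbackInputStream in, int length) throws IOException {
        StringBuilder token = new StringBuilder();
        int c = in.read();
        while (c != -1 && !isWhitespace(c) && token.length() < length) {
            token.append((char) c);
            c = in.read();
        }
        if (c != -1) {
            in.unread(c); // the byte that stopped the loop was not consumed
        }
        return token.toString();
    }
}
```

When the cap is reached first, the remaining characters of the token stay in the stream for the next read.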

setDocument

public void setDocument(COSDocument doc)
Set the document for this stream.
Parameters:
doc - The current document.

skipSpaces

protected void skipSpaces()
            throws IOException
This will skip all spaces and comments that are present.
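
In PDF syntax a comment starts with a percent sign and runs to the end of the line. The sketch below shows how whitespace and comments could be skipped together so that the stream ends up positioned at the next token. It is an illustration under assumptions, not the actual implementation; SkipSpacesSketch is a hypothetical name and the whitespace set is the one from the PDF syntax.

```java
import java.io.IOException;
import java.io.PushbackInputStream;

// Illustrative only: SkipSpacesSketch is a hypothetical name, not PDFBox API.
final class SkipSpacesSketch {
    private static boolean isWhitespace(int c) {
        return c == 0 || c == 9 || c == 10 || c == 12 || c == 13 || c == 32;
    }

    static void skipSpaces(PushbackInputStream in) throws IOException {
        int c = in.read();
        while (isWhitespace(c) || c == '%') {
            if (c == '%') {
                // Comment: discard everything up to the end of the line.
                c = in.read();
                while (c != -1 && c != 0x0A && c != 0x0D) {
                    c = in.read();
                }
            } else {
                c = in.read();
            }
        }
        if (c != -1) {
            in.unread(c); // first byte of the next token stays in the stream
        }
    }
}
```

The end of line byte that closes a comment is itself whitespace, so the outer loop skips it on the next iteration.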