org.pdfbox.util

Class PDFHighlighter


public class PDFHighlighter
extends PDFTextStripper

Highlights words in a PDF document by generating an XML highlight file.
Version:
$Revision: 1.7 $
Authors:
slagraulet (slagraulet@cardiweb.com)
Ben Litchfield
See Also:
Adobe Highlight File Format

Field Summary

Fields inherited from class org.pdfbox.util.PDFTextStripper

charactersByArticle, output

Constructor Summary

PDFHighlighter()
Default constructor.

Method Summary

protected void
endPage(PDPage pdPage)
void
generateXMLHighlight(PDDocument pdDocument, String highlightWord, Writer xmlOutput)
Generate an XML highlight string for a single word, based on the PDF.
void
generateXMLHighlight(PDDocument pdDocument, String[] sWords, Writer xmlOutput)
Generate an XML highlight string for several words, based on the PDF.
static void
main(String[] args)
Command line application.

Methods inherited from class org.pdfbox.util.PDFTextStripper

endDocument, endPage, endParagraph, flushText, getCharactersByArticle, getCurrentPageNo, getEndBookmark, getEndPage, getLineSeparator, getOutput, getPageSeparator, getStartBookmark, getStartPage, getText, getText, getWordSeparator, processPage, processPages, setEndBookmark, setEndPage, setLineSeparator, setPageSeparator, setShouldSeparateByBeads, setSortByPosition, setStartBookmark, setStartPage, setSuppressDuplicateOverlappingText, setWordSeparator, shouldSeparateByBeads, shouldSortByPosition, shouldSuppressDuplicateOverlappingText, showCharacter, startDocument, startPage, startParagraph, writeCharacters, writeText, writeText

Methods inherited from class org.pdfbox.util.PDFStreamEngine

getColorSpaces, getCurrentPage, getFonts, getGraphicsStack, getGraphicsState, getGraphicsStates, getResources, getTextLineMatrix, getTextMatrix, getXObjects, processOperator, processOperator, processStream, processSubStream, registerOperatorProcessor, resetEngine, setColorSpaces, setFonts, setGraphicsStack, setGraphicsState, setGraphicsStates, setTextLineMatrix, setTextMatrix, showCharacter, showString

Constructor Details

PDFHighlighter

public PDFHighlighter()
            throws IOException
Default constructor.

Method Details

endPage

protected void endPage(PDPage pdPage)
            throws IOException
Overrides:
endPage in class PDFTextStripper

generateXMLHighlight

public void generateXMLHighlight(PDDocument pdDocument,
                                 String highlightWord,
                                 Writer xmlOutput)
            throws IOException
Generate an XML highlight string for a single word, based on the PDF.
Parameters:
pdDocument - The PDF to find words in.
highlightWord - The word to search for.
xmlOutput - The writer that receives the resulting XML.

generateXMLHighlight

public void generateXMLHighlight(PDDocument pdDocument,
                                 String[] sWords,
                                 Writer xmlOutput)
            throws IOException
Generate an XML highlight string for several words, based on the PDF.
Parameters:
pdDocument - The PDF to find words in.
sWords - The words to search for.
xmlOutput - The writer that receives the resulting XML.
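
As a rough illustration of the kind of work generateXMLHighlight performs, the following self-contained sketch searches plain text for a set of words and writes one location entry per match to a Writer. It does not use PDFBox. The class HighlightSketch, the method writeMatches, the case-insensitive matching, and the layout of each loc entry are all assumptions made for this example, and the actual output of PDFHighlighter may differ.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HighlightSketch {

    /**
     * Hypothetical helper: writes one entry per occurrence of each word
     * in pageText and returns the number of entries written.
     * The entry layout (pg, pos, len) is illustrative only.
     */
    public static int writeMatches(int pageIndex, String pageText,
                                   String[] words, Writer out)
            throws IOException {
        int count = 0;
        for (String word : words) {
            // Pattern.quote makes the word match literally, even if it
            // contains regular-expression metacharacters.
            Pattern pattern =
                Pattern.compile(Pattern.quote(word), Pattern.CASE_INSENSITIVE);
            Matcher matcher = pattern.matcher(pageText);
            while (matcher.find()) {
                int length = matcher.end() - matcher.start();
                out.write("<loc pg=" + pageIndex
                        + " pos=" + matcher.start()
                        + " len=" + length + ">\n");
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // A StringWriter keeps the generated entries in memory.
        StringWriter xml = new StringWriter();
        int matches = writeMatches(0, "The cat sat on the mat.",
                                   new String[] {"the", "cat"}, xml);
        System.out.println(matches + " matches");
        System.out.print(xml);
    }
}
```

Passing a StringWriter, as main does here, keeps the result in memory. Any other Writer, such as a FileWriter, could be substituted to write the entries to a file instead.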

main

public static void main(String[] args)
            throws IOException
Command line application.
Parameters:
args - The command line arguments to the application.