Class Tesseract

java.lang.Object
net.sourceforge.tess4j.Tesseract
All Implemented Interfaces:
ITesseract

public class Tesseract extends Object implements ITesseract
An object layer on top of TessAPI that provides character recognition support for common image formats and for multi-page TIFF images, beyond the uncompressed, binary TIFF format supported by the Tesseract OCR engine. The extended capabilities are provided by the Java Advanced Imaging Image I/O Tools.

Support for PDF documents is available through PDFBox.

Any program that uses the library must ensure that the required libraries (the .jar files for jna and jai-imageio) are on both its compile-time and run-time classpaths.
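A missing dependency usually surfaces only when the first OCR call fails, so it can help to probe the classpath at startup. The following is a minimal sketch using only the JDK. The helper name `ClasspathCheck` is hypothetical, and `com.sun.jna.Native` is used here as a representative JNA class; a comparable probe for jai-imageio would need a class name taken from the jar version actually in use.

```java
public final class ClasspathCheck {
    private ClasspathCheck() {}

    /** Returns true if the named class can be located, without initializing it. */
    public static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names to probe; extend this list to match your own dependency set.
        String[] required = {
            "net.sourceforge.tess4j.Tesseract",
            "com.sun.jna.Native"
        };
        for (String name : required) {
            System.out.println(name + (isPresent(name) ? " found" : " MISSING"));
        }
    }
}
```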
  • Constructor Details

    • Tesseract

      public Tesseract()
  • Method Details

    • getAPI

      protected TessAPI getAPI()
      Returns the TessAPI object.
      Returns:
      the TessAPI instance
    • getHandle

      protected ITessAPI.TessBaseAPI getHandle()
      Returns the API handle.
      Returns:
      the TessBaseAPI handle
    • setDatapath

      public void setDatapath(String datapath)
      Sets path to tessdata.
      Specified by:
      setDatapath in interface ITesseract
      Parameters:
      datapath - the tessdata path to set
    • setLanguage

      public void setLanguage(String language)
      Sets language for OCR.
      Specified by:
      setLanguage in interface ITesseract
      Parameters:
      language - the language code, which follows the ISO 639-3 standard (for example, "eng").
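Tesseract accepts several languages in one string when the codes are joined with "+", and each code needs a matching .traineddata file under the tessdata path. The sketch below builds such a string; `LanguageSpec` is a hypothetical helper, not part of Tess4J, and it deliberately does not validate codes because trained-data names such as script or orientation models do not all follow ISO 639-3.

```java
import java.util.List;

public final class LanguageSpec {
    private LanguageSpec() {}

    /** Joins language codes into the "a+b+c" form, rejecting empty input. */
    public static String join(List<String> codes) {
        if (codes.isEmpty()) {
            throw new IllegalArgumentException("at least one language code is required");
        }
        for (String code : codes) {
            if (code == null || code.trim().isEmpty()) {
                throw new IllegalArgumentException("blank language code");
            }
        }
        return String.join("+", codes);
    }
}
```

The result would then be passed on, for example `tesseract.setLanguage(LanguageSpec.join(List.of("eng", "deu")))`, where `tesseract` is assumed to be a configured Tesseract instance.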
    • setOcrEngineMode

      public void setOcrEngineMode(int ocrEngineMode)
      Sets OCR engine mode.
      Specified by:
      setOcrEngineMode in interface ITesseract
      Parameters:
      ocrEngineMode - the OcrEngineMode to set
    • setPageSegMode

      public void setPageSegMode(int mode)
      Sets page segmentation mode.
      Specified by:
      setPageSegMode in interface ITesseract
      Parameters:
      mode - the page segmentation mode to set
    • setHocr

      public void setHocr(boolean hocr)
      Deprecated.
      Use setVariable("tessedit_create_hocr", "1") instead.
      Enables hOCR output.
      Parameters:
      hocr - true to enable hOCR output
    • setTessVariable

      @Deprecated public void setTessVariable(String key, String value)
      Sets the value of one of Tesseract's internal parameters.
      Specified by:
      setTessVariable in interface ITesseract
      Parameters:
      key - variable name, e.g., tessedit_create_hocr, tessedit_char_whitelist, etc.
      value - value for corresponding variable, e.g., "1", "0", "0123456789", etc.
    • setVariable

      public void setVariable(String key, String value)
      Sets the value of one of Tesseract's internal parameters.
      Specified by:
      setVariable in interface ITesseract
      Parameters:
      key - variable name, e.g., tessedit_create_hocr, tessedit_char_whitelist, etc.
      value - value for corresponding variable, e.g., "1", "0", "0123456789", etc.
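Because both key and value are plain strings, a group of parameters can be kept in a map and applied in one pass. This is a sketch under that assumption: `VariableSettings` is a hypothetical helper, and the setter is passed in as a `BiConsumer` so the block compiles without Tess4J on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

public final class VariableSettings {
    private VariableSettings() {}

    /** Converts a boolean to the "1"/"0" strings used for on/off parameters. */
    public static String flag(boolean enabled) {
        return enabled ? "1" : "0";
    }

    /** Example settings: restrict output to digits and request hOCR output. */
    public static Map<String, String> digitsOnlyWithHocr() {
        Map<String, String> vars = new LinkedHashMap<>(); // keeps insertion order
        vars.put("tessedit_char_whitelist", "0123456789");
        vars.put("tessedit_create_hocr", flag(true));
        return vars;
    }

    /** Feeds every key/value pair to the given setter, in map order. */
    public static void apply(Map<String, String> vars, BiConsumer<String, String> setter) {
        vars.forEach(setter);
    }
}
```

With a Tesseract instance named `tesseract` (an assumption), the call would be `VariableSettings.apply(VariableSettings.digitsOnlyWithHocr(), tesseract::setVariable)`.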
    • setConfigs

      public void setConfigs(List<String> configs)
      Sets configs to be passed to Tesseract's Init method.
      Specified by:
      setConfigs in interface ITesseract
      Parameters:
      configs - list of config filenames, e.g., "digits", "bazaar", "quiet"
    • doOCR

      public String doOCR(File imageFile) throws TesseractException
      Performs OCR operation.
      Specified by:
      doOCR in interface ITesseract
      Parameters:
      imageFile - an image file
      Returns:
      the recognized text
      Throws:
      TesseractException
    • doOCR

      public String doOCR(File inputFile, Rectangle rect) throws TesseractException
      Performs OCR operation.
      Specified by:
      doOCR in interface ITesseract
      Parameters:
      inputFile - an image file
      rect - the bounding rectangle that defines the region of the image to be recognized. A null rectangle, or one of zero dimension, selects the whole image.
      Returns:
      the recognized text
      Throws:
      TesseractException
    • doOCR

      public String doOCR(BufferedImage bi) throws TesseractException
      Performs OCR operation.
      Specified by:
      doOCR in interface ITesseract
      Parameters:
      bi - a buffered image
      Returns:
      the recognized text
      Throws:
      TesseractException
    • doOCR

      public String doOCR(BufferedImage bi, Rectangle rect) throws TesseractException
      Performs OCR operation.
      Specified by:
      doOCR in interface ITesseract
      Parameters:
      bi - a buffered image
      rect - the bounding rectangle that defines the region of the image to be recognized. A null rectangle, or one of zero dimension, selects the whole image.
      Returns:
      the recognized text
      Throws:
      TesseractException
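The rect convention shared by the doOCR overloads can be made explicit in caller code. The sketch below resolves the region that would effectively be recognized; `RegionResolver` is a hypothetical helper, and it assumes that "zero dimension" corresponds to `java.awt.Rectangle.isEmpty()`, which is true when the width or height is not positive.

```java
import java.awt.Rectangle;

public final class RegionResolver {
    private RegionResolver() {}

    /**
     * Returns the region to recognize: the whole image when rect is null
     * or empty, otherwise a copy of rect itself (no clipping is applied).
     */
    public static Rectangle effectiveRegion(int imageWidth, int imageHeight, Rectangle rect) {
        if (rect == null || rect.isEmpty()) {
            return new Rectangle(0, 0, imageWidth, imageHeight);
        }
        return new Rectangle(rect);
    }
}
```

A caller that wants the full page can therefore pass `null` for rect instead of constructing a rectangle that matches the image size.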
    • doOCR

      public String doOCR(List<IIOImage> imageList, Rectangle rect) throws TesseractException
      Performs OCR operation.
      Specified by:
      doOCR in interface ITesseract
      Parameters:
      imageList - a list of IIOImage objects
      rect - the bounding rectangle that defines the region of the image to be recognized. A null rectangle, or one of zero dimension, selects the whole image.
      Returns:
      the recognized text
      Throws:
      TesseractException
    • doOCR

      public String doOCR(List<IIOImage> imageList, String filename, Rectangle rect) throws TesseractException
      Performs OCR operation.
      Specified by:
      doOCR in interface ITesseract
      Parameters:
      imageList - a list of IIOImage objects
      filename - input file name. Needed only for training and reading a UNLV zone file.
      rect - the bounding rectangle that defines the region of the image to be recognized. A null rectangle, or one of zero dimension, selects the whole image.
      Returns:
      the recognized text
      Throws:
      TesseractException
    • doOCR

      public String doOCR(int xsize, int ysize, ByteBuffer buf, Rectangle rect, int bpp) throws TesseractException
      Performs OCR operation on raw pixel data. This corresponds to using Tesseract's SetImage, (optionally) SetRectangle, and one or more of the Get*Text functions.
      Specified by:
      doOCR in interface ITesseract
      Parameters:
      xsize - width of image
      ysize - height of image
      buf - pixel data
      rect - the bounding rectangle that defines the region of the image to be recognized. A null rectangle, or one of zero dimension, selects the whole image.
      bpp - bits per pixel (the bit depth of the image): 1 for binary bitmap, 8 for gray, and 24 for color RGB.
      Returns:
      the recognized text
      Throws:
      TesseractException
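To call the raw-buffer overload, the pixel data has to be laid out to match the declared bpp. The following sketch converts a BufferedImage into an 8-bpp grayscale buffer, one byte per pixel in row-major order. `GrayBuffer` is a hypothetical helper; the luminance weights and the choice of a direct buffer are assumptions of this example, not something the documentation above prescribes.

```java
import java.awt.image.BufferedImage;
import java.nio.ByteBuffer;

public final class GrayBuffer {
    private GrayBuffer() {}

    /** Packs the image as 8-bit gray, one byte per pixel, rows top to bottom. */
    public static ByteBuffer toGray8(BufferedImage image) {
        int w = image.getWidth();
        int h = image.getHeight();
        ByteBuffer buf = ByteBuffer.allocateDirect(w * h);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = image.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer approximation of a common luminance weighting.
                int gray = (r * 299 + g * 587 + b * 114) / 1000;
                buf.put(y * w + x, (byte) gray); // absolute put; position stays at 0
            }
        }
        return buf;
    }
}
```

The buffer could then be handed over as `tesseract.doOCR(image.getWidth(), image.getHeight(), buf, null, 8)`, where `tesseract` is assumed to be a configured Tesseract instance.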
    • doOCR

      public String doOCR(int xsize, int ysize, ByteBuffer buf, String filename, Rectangle rect, int bpp) throws TesseractException
      Performs OCR operation on raw pixel data. This corresponds to using Tesseract's SetImage, (optionally) SetRectangle, and one or more of the Get*Text functions.
      Specified by:
      doOCR in interface ITesseract
      Parameters:
      xsize - width of image
      ysize - height of image
      buf - pixel data
      filename - input file name. Needed only for training and reading a UNLV zone file.
      rect - the bounding rectangle that defines the region of the image to be recognized. A null rectangle, or one of zero dimension, selects the whole image.
      bpp - bits per pixel (the bit depth of the image): 1 for binary bitmap, 8 for gray, and 24 for color RGB.
      Returns:
      the recognized text
      Throws:
      TesseractException
    • init

      protected void init()
      Initializes Tesseract engine.
    • setVariables

      protected void setVariables()
      Sets Tesseract's internal parameters.
    • setImage

      protected void setImage(RenderedImage image, Rectangle rect) throws IOException
      Sets image to be processed.
      Parameters:
      image - a rendered image
      rect - region of interest
      Throws:
      IOException
    • setImage

      protected void setImage(int xsize, int ysize, ByteBuffer buf, Rectangle rect, int bpp)
      Sets image to be processed.
      Parameters:
      xsize - width of image
      ysize - height of image
      buf - pixel data
      rect - the bounding rectangle that defines the region of the image to be recognized. A null rectangle, or one of zero dimension, selects the whole image.
      bpp - bits per pixel (the bit depth of the image): 1 for binary bitmap, 8 for gray, and 24 for color RGB.
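The relationship between xsize, ysize, bpp, and the size of buf can be checked before the data reaches native code. This sketch computes the expected byte count, assuming each row is padded only up to the next whole byte and carries no further alignment; `PixelBufferSize` is a hypothetical helper, and the actual stride handling inside the library may differ.

```java
public final class PixelBufferSize {
    private PixelBufferSize() {}

    /** Bytes needed for one row: xsize * bpp bits, rounded up to whole bytes. */
    public static int bytesPerLine(int xsize, int bpp) {
        return (xsize * bpp + 7) / 8;
    }

    /** Total bytes expected in the pixel buffer for the whole image. */
    public static long requiredBytes(int xsize, int ysize, int bpp) {
        return (long) bytesPerLine(xsize, bpp) * ysize;
    }
}
```

A caller might compare `buf.capacity()` with `requiredBytes(xsize, ysize, bpp)` and reject the input early if the buffer is too small.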
    • getOCRText

      protected String getOCRText(String filename, int pageNum)
      Gets recognized text.
      Parameters:
      filename - input file name. Needed only for reading a UNLV zone file.
      pageNum - page number; needed for hocr paging.
      Returns:
      the recognized text
    • createDocuments

      public void createDocuments(String filename, String outputbase, List<ITesseract.RenderedFormat> formats) throws TesseractException
      Creates documents for the given renderers.
      Specified by:
      createDocuments in interface ITesseract
      Parameters:
      filename - input image
      outputbase - output filename without extension
      formats - the renderer formats to produce
      Throws:
      TesseractException
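Since outputbase is given without an extension, a common pattern is to derive it from the input file name. The sketch below strips the final extension while leaving directory names and dot-files untouched; `OutputBase` is a hypothetical helper, and the expectation that each renderer appends its own extension is inferred from the parameter description rather than stated by it.

```java
public final class OutputBase {
    private OutputBase() {}

    /** Returns filename without its last extension, if it has one. */
    public static String fromInput(String filename) {
        int sep = Math.max(filename.lastIndexOf('/'), filename.lastIndexOf('\\'));
        int dot = filename.lastIndexOf('.');
        // Treat the dot as an extension separator only when it lies inside the
        // last path segment and is not that segment's first character.
        if (dot > sep + 1) {
            return filename.substring(0, dot);
        }
        return filename;
    }
}
```

For an input of "scan.tif" this yields "scan", which could be passed as the outputbase argument of createDocuments.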
    • createDocuments

      public void createDocuments(String[] filenames, String[] outputbases, List<ITesseract.RenderedFormat> formats) throws TesseractException
      Creates documents for the given renderers.
      Specified by:
      createDocuments in interface ITesseract
      Parameters:
      filenames - array of input files
      outputbases - array of output filenames without extension
      formats - the renderer formats to produce
      Throws:
      TesseractException
    • getSegmentedRegions

      public List<Rectangle> getSegmentedRegions(BufferedImage bi, int pageIteratorLevel) throws TesseractException
      Gets segmented regions at the specified page iterator level.
      Specified by:
      getSegmentedRegions in interface ITesseract
      Parameters:
      bi - input buffered image
      pageIteratorLevel - one of the TessPageIteratorLevel constants
      Returns:
      list of Rectangle
      Throws:
      TesseractException
    • getWords

      public List<Word> getWords(BufferedImage bi, int pageIteratorLevel)
      Gets recognized words at the specified page iterator level.
      Specified by:
      getWords in interface ITesseract
      Parameters:
      bi - input buffered image
      pageIteratorLevel - one of the TessPageIteratorLevel constants
      Returns:
      list of Word
    • createDocumentsWithResults

      public OCRResult createDocumentsWithResults(BufferedImage bi, String filename, String outputbase, List<ITesseract.RenderedFormat> formats, int pageIteratorLevel) throws TesseractException
      Creates documents with OCR result for given renderers at specified page iterator level.
      Specified by:
      createDocumentsWithResults in interface ITesseract
      Parameters:
      bi - input buffered image
      filename - filename (optional)
      outputbase - output filename without extension
      formats - the renderer formats to produce
      pageIteratorLevel - one of the TessPageIteratorLevel constants
      Returns:
      OCR result
      Throws:
      TesseractException
    • createDocumentsWithResults

      public List<OCRResult> createDocumentsWithResults(BufferedImage[] bis, String[] filenames, String[] outputbases, List<ITesseract.RenderedFormat> formats, int pageIteratorLevel) throws TesseractException
      Creates documents with OCR results for given renderers at specified page iterator level.
      Specified by:
      createDocumentsWithResults in interface ITesseract
      Parameters:
      bis - array of input buffered images
      filenames - array of filenames
      outputbases - array of output filenames without extension
      formats - the renderer formats to produce
      pageIteratorLevel - one of the TessPageIteratorLevel constants
      Returns:
      list of OCR results
      Throws:
      TesseractException
    • createDocumentsWithResults

      public OCRResult createDocumentsWithResults(String filename, String outputbase, List<ITesseract.RenderedFormat> formats, int pageIteratorLevel) throws TesseractException
      Creates documents with OCR result for given renderers at specified page iterator level.
      Specified by:
      createDocumentsWithResults in interface ITesseract
      Parameters:
      filename - input file
      outputbase - output filename without extension
      formats - the renderer formats to produce
      pageIteratorLevel - one of the TessPageIteratorLevel constants
      Returns:
      OCR result
      Throws:
      TesseractException
    • createDocumentsWithResults

      public List<OCRResult> createDocumentsWithResults(String[] filenames, String[] outputbases, List<ITesseract.RenderedFormat> formats, int pageIteratorLevel) throws TesseractException
      Creates documents with OCR results for given renderers at specified page iterator level.
      Specified by:
      createDocumentsWithResults in interface ITesseract
      Parameters:
      filenames - array of input files
      outputbases - array of output filenames without extension
      formats - the renderer formats to produce
      pageIteratorLevel - one of the TessPageIteratorLevel constants
      Returns:
      list of OCR results
      Throws:
      TesseractException
    • dispose

      protected void dispose()
      Releases all of the native resources used by this instance.