Class Tesseract

java.lang.Object
net.sourceforge.tess4j.Tesseract
All Implemented Interfaces:
ITesseract

public class Tesseract extends Object implements ITesseract
An object layer on top of TessAPI that provides character recognition support for common image formats and for multi-page TIFF images, beyond the uncompressed, binary TIFF format supported by the Tesseract OCR engine. The extended capabilities are provided by the Java Advanced Imaging Image I/O Tools.

Support for PDF documents is available through PDFBox.

Any program that uses the library must ensure that the required libraries (the .jar files for jna and jai-imageio) are on both its compile-time and run-time classpaths.
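
As a rough way to verify the classpath requirement at startup, the sketch below checks whether a class can be located without initializing it. The helper name `isOnClasspath` is hypothetical, and the probed class names (`com.sun.jna.Native` for JNA, `net.sourceforge.tess4j.Tesseract` for this library) are only examples; a class that jai-imageio provides could be probed the same way.

```java
class ClasspathCheck {
    /** Returns true if the named class can be found by this class's class loader. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: locate the class without running its static
            // initializers, so no native library is loaded as a side effect.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Example class names; adjust to the libraries actually required.
        String[] required = { "com.sun.jna.Native", "net.sourceforge.tess4j.Tesseract" };
        for (String name : required) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

A check like this only reports that a class is visible; it does not confirm that the native Tesseract library itself can be loaded.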
  • Constructor Details

    • Tesseract

      public Tesseract()
  • Method Details

    • getAPI

      protected TessAPI getAPI()
      Returns TessAPI object.
      Returns:
      api
    • getHandle

      protected ITessAPI.TessBaseAPI getHandle()
      Returns API handle.
      Returns:
      handle
    • setDatapath

      public void setDatapath(String datapath)
      Sets path to tessdata.
      Specified by:
      setDatapath in interface ITesseract
      Parameters:
      datapath - the tessdata path to set
    • setLanguage

      public void setLanguage(String language)
      Sets language for OCR.
      Specified by:
      setLanguage in interface ITesseract
      Parameters:
      language - the language code, which follows the ISO 639-3 standard.
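
One possible way to derive a three-letter code from a `java.util.Locale` is sketched below. The helper `toLanguageArgument` is hypothetical. Two assumptions apply: `Locale.getISO3Language()` returns ISO 639-2/T codes, which coincide with ISO 639-3 for many but not all languages, and joining several codes with `+` follows the Tesseract engine's convention for loading more than one language, which this page does not itself document. Some trained-data names (for example script-specific ones) do not follow this pattern at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class LanguageCodes {
    /** Joins the three-letter language codes of the given locales with '+'. */
    static String toLanguageArgument(Locale... locales) {
        List<String> codes = new ArrayList<>();
        for (Locale locale : locales) {
            // Three-letter ISO 639-2/T code; usually, but not always,
            // identical to the ISO 639-3 code.
            codes.add(locale.getISO3Language());
        }
        return String.join("+", codes);
    }

    public static void main(String[] args) {
        System.out.println(toLanguageArgument(Locale.ENGLISH));
        System.out.println(toLanguageArgument(Locale.ENGLISH, Locale.GERMAN));
    }
}
```

The resulting string would then be passed to `setLanguage`, provided a matching trained-data file exists under the tessdata path.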
    • setOcrEngineMode

      public void setOcrEngineMode(int ocrEngineMode)
      Sets OCR engine mode.
      Specified by:
      setOcrEngineMode in interface ITesseract
      Parameters:
      ocrEngineMode - the OcrEngineMode to set
    • setPageSegMode

      public void setPageSegMode(int mode)
      Sets page segmentation mode.
      Specified by:
      setPageSegMode in interface ITesseract
      Parameters:
      mode - the page segmentation mode to set
    • setVariable

      public void setVariable(String key, String value)
      Sets the value of one of Tesseract's internal parameters.
      Specified by:
      setVariable in interface ITesseract
      Parameters:
      key - variable name, e.g., tessedit_create_hocr, tessedit_char_whitelist, etc.
      value - value for corresponding variable, e.g., "1", "0", "0123456789", etc.
    • setConfigs

      public void setConfigs(List<String> configs)
      Sets configs to be passed to Tesseract's Init method.
      Specified by:
      setConfigs in interface ITesseract
      Parameters:
      configs - list of config filenames, e.g., "digits", "bazaar", "quiet"
    • doOCR

      public String doOCR(File inputFile, List<Rectangle> rects) throws TesseractException
      Performs OCR operation.
      Specified by:
      doOCR in interface ITesseract
      Parameters:
      inputFile - an image file
      rects - list of bounding rectangles that define the regions of the image to be recognized. A rectangle of zero dimension or null indicates the whole image.
      Returns:
      the recognized text
      Throws:
      TesseractException
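
Because a rectangle of zero dimension is read as "the whole image", a region that falls outside the image could silently widen the recognition area once it is clipped. The sketch below, using only `java.awt.Rectangle`, clips each region to the image bounds and drops any that end up empty. The helper name `clipToImage` is hypothetical, and how `doOCR` treats an empty list is not stated here, so the caller should decide what to do in that case.

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.List;

class RegionLists {
    /** Clips each region to the image bounds and drops regions that end up empty. */
    static List<Rectangle> clipToImage(List<Rectangle> regions, int imageWidth, int imageHeight) {
        Rectangle bounds = new Rectangle(0, 0, imageWidth, imageHeight);
        List<Rectangle> clipped = new ArrayList<>();
        for (Rectangle region : regions) {
            Rectangle r = region.intersection(bounds);
            // An empty rectangle would be read as "whole image", so skip it.
            if (!r.isEmpty()) {
                clipped.add(r);
            }
        }
        return clipped;
    }

    public static void main(String[] args) {
        List<Rectangle> regions = new ArrayList<>();
        regions.add(new Rectangle(-10, -10, 50, 50)); // partly outside the image
        regions.add(new Rectangle(200, 200, 10, 10)); // entirely outside the image
        System.out.println(clipToImage(regions, 100, 100));
    }
}
```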
    • doOCR

      public String doOCR(List<IIOImage> imageList, String filename, List<List<Rectangle>> roiss) throws TesseractException
      Performs OCR operation.
      Specified by:
      doOCR in interface ITesseract
      Parameters:
      imageList - a list of IIOImage objects
      filename - input file name. Needed only for training and reading a UNLV zone file.
      roiss - list of lists of bounding rectangles that define the regions of the images to be recognized. A rectangle of zero dimension or null indicates the whole image.
      Returns:
      the recognized text
      Throws:
      TesseractException
    • doOCR

      public String doOCR(int xsize, int ysize, ByteBuffer buf, int bpp, String filename, List<Rectangle> rects) throws TesseractException
      Performs OCR operation. Uses SetImage, (optionally) SetRectangle, and one or more of the Get*Text functions.
      Specified by:
      doOCR in interface ITesseract
      Parameters:
      xsize - width of image
      ysize - height of image
      buf - pixel data
      bpp - bits per pixel, represents the bit depth of the image, with 1 for binary bitmap, 8 for gray, and 24 for color RGB.
      filename - input file name. Needed only for training and reading a UNLV zone file.
      rects - list of bounding rectangles that define the regions of the image to be recognized. A rectangle of zero dimension or null indicates the whole image.
      Returns:
      the recognized text
      Throws:
      TesseractException
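
To make the `xsize`, `ysize`, `buf`, and `bpp` parameters concrete, the sketch below packs a `BufferedImage` into an 8-bits-per-pixel gray buffer using only the JDK. The helper `toGray8` is hypothetical. It assumes that the method expects rows stored top to bottom with no padding between them and that a direct buffer is acceptable; neither detail is stated on this page.

```java
import java.awt.image.BufferedImage;
import java.nio.ByteBuffer;

class GrayBuffer {
    /** Packs an image into one gray byte per pixel, row by row, top to bottom. */
    static ByteBuffer toGray8(BufferedImage image) {
        int width = image.getWidth();
        int height = image.getHeight();
        ByteBuffer buf = ByteBuffer.allocateDirect(width * height);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int rgb = image.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer approximation of the common luma weights 0.299, 0.587, 0.114.
                buf.put((byte) ((r * 299 + g * 587 + b * 114) / 1000));
            }
        }
        buf.flip(); // reset position so a reader starts at the first pixel
        return buf;
    }

    public static void main(String[] args) {
        BufferedImage image = new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB);
        image.setRGB(0, 0, 0xFFFFFF); // one white pixel; the rest stay black
        ByteBuffer buf = toGray8(image);
        int xsize = image.getWidth();
        int ysize = image.getHeight();
        int bpp = 8; // gray, per the parameter description
        System.out.println(xsize + "x" + ysize + " at " + bpp + " bpp, " + buf.remaining() + " bytes");
    }
}
```

With these values, a call would take the form `doOCR(xsize, ysize, buf, bpp, filename, rects)`, where `filename` and `rects` follow the parameter descriptions above.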
    • init

      protected void init()
      Initializes Tesseract engine.
    • setVariables

      protected void setVariables()
      Sets Tesseract's internal parameters.
    • setImage

      protected void setImage(RenderedImage image) throws IOException
      Parameters:
      image - a rendered image
      Throws:
      IOException
    • setImage

      protected void setImage(int xsize, int ysize, ByteBuffer buf, int bpp)
      Sets image to be processed.
      Parameters:
      xsize - width of image
      ysize - height of image
      buf - pixel data
      bpp - bits per pixel, represents the bit depth of the image, with 1 for binary bitmap, 8 for gray, and 24 for color RGB.
    • setROI

      protected void setROI(Rectangle rect)
      Sets region of interest.
      Parameters:
      rect - region of interest
    • getOCRText

      protected String getOCRText(String filename, int pageNum)
      Gets recognized text.
      Parameters:
      filename - input file name. Needed only for reading a UNLV zone file.
      pageNum - page number; needed for hOCR paging.
      Returns:
      the recognized text
    • createDocuments

      public void createDocuments(String[] filenames, String[] outputbases, List<ITesseract.RenderedFormat> formats) throws TesseractException
      Creates documents for the given renderers.
      Specified by:
      createDocuments in interface ITesseract
      Parameters:
      filenames - array of input files
      outputbases - array of output filenames without extension
      formats - types of renderer
      Throws:
      TesseractException
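
Since `outputbases` holds output filenames without an extension, one option is to derive each entry from the matching input filename. The sketch below is an illustration only: the helper names `stripExtension` and `toOutputBases` are hypothetical, and it assumes the two arrays are meant to be parallel (entry i of one corresponds to entry i of the other), which the parameter descriptions suggest but do not state outright.

```java
import java.util.Arrays;

class OutputBases {
    /** Strips the last extension from the file-name part of a path, if there is one. */
    static String stripExtension(String filename) {
        int sep = Math.max(filename.lastIndexOf('/'), filename.lastIndexOf('\\'));
        int dot = filename.lastIndexOf('.');
        // Treat the dot as an extension separator only if it lies inside the
        // file name and is not its first character (as in ".hidden").
        return dot > sep + 1 ? filename.substring(0, dot) : filename;
    }

    /** Builds an array of output bases parallel to the given input filenames. */
    static String[] toOutputBases(String[] filenames) {
        String[] bases = new String[filenames.length];
        for (int i = 0; i < filenames.length; i++) {
            bases[i] = stripExtension(filenames[i]);
        }
        return bases;
    }

    public static void main(String[] args) {
        String[] filenames = { "scan.tif", "/data/in/page.1.png" };
        System.out.println(Arrays.toString(toOutputBases(filenames)));
    }
}
```

Writing output next to the input is a design choice of this sketch; a different output directory can be prepended to each base instead.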
    • getSegmentedRegions

      public List<Rectangle> getSegmentedRegions(BufferedImage bi, int pageIteratorLevel) throws TesseractException
      Gets segmented regions at specified page iterator level.
      Specified by:
      getSegmentedRegions in interface ITesseract
      Parameters:
      bi - input buffered image
      pageIteratorLevel - TessPageIteratorLevel enum
      Returns:
      list of Rectangle
      Throws:
      TesseractException
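
The order of the returned rectangles is not specified on this page. If a caller needs a predictable top-to-bottom, left-to-right order, a simple post-processing step such as the following sketch can be applied to the returned list. The helper `sortTopToBottom` is hypothetical, and the comparison by `y` then `x` is a naive ordering that ignores columns and skewed lines.

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ReadingOrder {
    /** Returns a copy of the regions sorted by top edge, then by left edge. */
    static List<Rectangle> sortTopToBottom(List<Rectangle> regions) {
        List<Rectangle> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparingInt((Rectangle r) -> r.y)
                .thenComparingInt(r -> r.x));
        return sorted;
    }

    public static void main(String[] args) {
        List<Rectangle> regions = new ArrayList<>();
        regions.add(new Rectangle(50, 40, 30, 10));
        regions.add(new Rectangle(10, 40, 30, 10));
        regions.add(new Rectangle(10, 5, 80, 10));
        for (Rectangle r : sortTopToBottom(regions)) {
            System.out.println(r);
        }
    }
}
```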
    • getWords

      public List<Word> getWords(List<BufferedImage> biList, int pageIteratorLevel)
      Gets recognized words at specified page iterator level.
      Specified by:
      getWords in interface ITesseract
      Parameters:
      biList - list of input buffered images
      pageIteratorLevel - TessPageIteratorLevel enum
      Returns:
      list of Word
    • createDocumentsWithResults

      public OCRResult createDocumentsWithResults(BufferedImage bi, String filename, String outputbase, List<ITesseract.RenderedFormat> formats, int pageIteratorLevel) throws TesseractException
      Creates documents with OCR result for given renderers at specified page iterator level.
      Specified by:
      createDocumentsWithResults in interface ITesseract
      Parameters:
      bi - input buffered image
      filename - filename (optional)
      outputbase - output filename without extension
      formats - types of renderer
      pageIteratorLevel - TessPageIteratorLevel enum
      Returns:
      OCR result
      Throws:
      TesseractException
    • createDocumentsWithResults

      public List<OCRResult> createDocumentsWithResults(BufferedImage[] bis, String[] filenames, String[] outputbases, List<ITesseract.RenderedFormat> formats, int pageIteratorLevel) throws TesseractException
      Creates documents with OCR results for given renderers at specified page iterator level.
      Specified by:
      createDocumentsWithResults in interface ITesseract
      Parameters:
      bis - array of input buffered images
      filenames - array of filenames
      outputbases - array of output filenames without extension
      formats - types of renderer
      pageIteratorLevel - TessPageIteratorLevel enum
      Returns:
      list of OCR results
      Throws:
      TesseractException
    • createDocumentsWithResults

      public OCRResult createDocumentsWithResults(String filename, String outputbase, List<ITesseract.RenderedFormat> formats, int pageIteratorLevel) throws TesseractException
      Creates documents with OCR result for given renderers at specified page iterator level.
      Specified by:
      createDocumentsWithResults in interface ITesseract
      Parameters:
      filename - input file
      outputbase - output filename without extension
      formats - types of renderer
      pageIteratorLevel - TessPageIteratorLevel enum
      Returns:
      OCR result
      Throws:
      TesseractException
    • createDocumentsWithResults

      public List<OCRResult> createDocumentsWithResults(String[] filenames, String[] outputbases, List<ITesseract.RenderedFormat> formats, int pageIteratorLevel) throws TesseractException
      Creates documents with OCR results for given renderers at specified page iterator level.
      Specified by:
      createDocumentsWithResults in interface ITesseract
      Parameters:
      filenames - array of input files
      outputbases - array of output filenames without extension
      formats - types of renderer
      pageIteratorLevel - TessPageIteratorLevel enum
      Returns:
      list of OCR results
      Throws:
      TesseractException
    • getOSD

      public OSDResult getOSD(File imageFile)
      Gets the detected orientation of the input image and apparent script (alphabet).
      Specified by:
      getOSD in interface ITesseract
      Parameters:
      imageFile - an image file
      Returns:
      image orientation and script name
    • getOSD

      public OSDResult getOSD(BufferedImage bi)
      Gets the detected orientation of the input image and apparent script (alphabet).
      Specified by:
      getOSD in interface ITesseract
      Parameters:
      bi - a buffered image
      Returns:
      image orientation and script name
    • dispose

      protected void dispose()
      Releases all of the native resources used by this instance.