public interface ITesseract
Modifier and Type | Interface and Description |
---|---|
static class |
ITesseract.RenderedFormat
Rendered formats supported by Tesseract.
|
Modifier and Type | Field and Description |
---|---|
static java.lang.String |
htmlBeginTag |
static java.lang.String |
htmlEndTag |
Modifier and Type | Method and Description |
---|---|
void |
createDocuments(java.lang.String[] filenames,
java.lang.String[] outputbases,
java.util.List<ITesseract.RenderedFormat> formats)
Creates documents for given renderers.
|
void |
createDocuments(java.lang.String filename,
java.lang.String outputbase,
java.util.List<ITesseract.RenderedFormat> formats)
Creates documents for given renderers.
|
java.lang.String |
doOCR(java.awt.image.BufferedImage bi)
Performs OCR operation.
|
java.lang.String |
doOCR(java.awt.image.BufferedImage bi,
java.awt.Rectangle rect)
Performs OCR operation.
|
java.lang.String |
doOCR(java.io.File imageFile)
Performs OCR operation.
|
java.lang.String |
doOCR(java.io.File imageFile,
java.awt.Rectangle rect)
Performs OCR operation.
|
java.lang.String |
doOCR(int xsize,
int ysize,
java.nio.ByteBuffer buf,
java.awt.Rectangle rect,
int bpp)
Performs OCR operation.
|
java.lang.String |
doOCR(int xsize,
int ysize,
java.nio.ByteBuffer buf,
java.lang.String filename,
java.awt.Rectangle rect,
int bpp)
Performs OCR operation.
|
java.lang.String |
doOCR(java.util.List<javax.imageio.IIOImage> imageList,
java.awt.Rectangle rect)
Performs OCR operation.
|
java.lang.String |
doOCR(java.util.List<javax.imageio.IIOImage> imageList,
java.lang.String filename,
java.awt.Rectangle rect)
Performs OCR operation.
|
void |
setConfigs(java.util.List<java.lang.String> configs)
Sets configs to be passed to Tesseract's
Init method. |
void |
setDatapath(java.lang.String datapath)
Sets tessdata path.
|
void |
setLanguage(java.lang.String language)
Sets language for OCR.
|
void |
setOcrEngineMode(int ocrEngineMode)
Sets OCR engine mode.
|
void |
setPageSegMode(int mode)
Sets page segmentation mode.
|
void |
setTessVariable(java.lang.String key,
java.lang.String value)
Sets the value of Tesseract's internal parameter.
|
static final java.lang.String htmlBeginTag
static final java.lang.String htmlEndTag
java.lang.String doOCR(java.io.File imageFile) throws TesseractException
imageFile
- an image fileTesseractException
java.lang.String doOCR(java.io.File imageFile, java.awt.Rectangle rect) throws TesseractException
imageFile
- an image filerect
- the bounding rectangle defines the region of the image to be
recognized. A rectangle of zero dimension or null
indicates
the whole image.TesseractException
java.lang.String doOCR(java.awt.image.BufferedImage bi) throws TesseractException
bi
- a buffered imageTesseractException
java.lang.String doOCR(java.awt.image.BufferedImage bi, java.awt.Rectangle rect) throws TesseractException
bi
- a buffered imagerect
- the bounding rectangle defines the region of the image to be
recognized. A rectangle of zero dimension or null
indicates
the whole image.TesseractException
java.lang.String doOCR(java.util.List<javax.imageio.IIOImage> imageList, java.awt.Rectangle rect) throws TesseractException
imageList
- a list of IIOImage
objectsrect
- the bounding rectangle defines the region of the image to be
recognized. A rectangle of zero dimension or null
indicates
the whole image.TesseractException
java.lang.String doOCR(java.util.List<javax.imageio.IIOImage> imageList, java.lang.String filename, java.awt.Rectangle rect) throws TesseractException
imageList
- a list of IIOImage
objectsfilename
- input file name. Needed only for training and reading a
UNLV zone file.rect
- the bounding rectangle defines the region of the image to be
recognized. A rectangle of zero dimension or null
indicates
the whole image.TesseractException
java.lang.String doOCR(int xsize, int ysize, java.nio.ByteBuffer buf, java.awt.Rectangle rect, int bpp) throws TesseractException
SetImage
, (optionally)
SetRectangle
, and one or more of the Get*Text
functions.xsize
- width of imageysize
- height of imagebuf
- pixel datarect
- the bounding rectangle defines the region of the image to be
recognized. A rectangle of zero dimension or null
indicates
the whole image.bpp
- bits per pixel, represents the bit depth of the image, with 1
for binary bitmap, 8 for gray, and 24 for color RGB.TesseractException
java.lang.String doOCR(int xsize, int ysize, java.nio.ByteBuffer buf, java.lang.String filename, java.awt.Rectangle rect, int bpp) throws TesseractException
SetImage
, (optionally)
SetRectangle
, and one or more of the Get*Text
functions.xsize
- width of imageysize
- height of imagebuf
- pixel datafilename
- input file name. Needed only for training and reading a
UNLV zone file.rect
- the bounding rectangle defines the region of the image to be
recognized. A rectangle of zero dimension or null
indicates
the whole image.bpp
- bits per pixel, represents the bit depth of the image, with 1
for binary bitmap, 8 for gray, and 24 for color RGB.TesseractException
void setDatapath(java.lang.String datapath)
datapath
- the tessdata path to setvoid setLanguage(java.lang.String language)
language
- the language code, which follows ISO 639-3 standard.void setOcrEngineMode(int ocrEngineMode)
ocrEngineMode
- the OcrEngineMode to setvoid setPageSegMode(int mode)
mode
- the page segmentation mode to setvoid setTessVariable(java.lang.String key, java.lang.String value)
key
- variable name, e.g., tessedit_create_hocr
,
tessedit_char_whitelist
, etc.value
- value for corresponding variable, e.g., "1", "0",
"0123456789", etc.void setConfigs(java.util.List<java.lang.String> configs)
Init
method.configs
- list of config filenames, e.g., "digits", "bazaar",
"quiet"void createDocuments(java.lang.String filename, java.lang.String outputbase, java.util.List<ITesseract.RenderedFormat> formats) throws TesseractException
filename
- input imageoutputbase
- output filename without extensionformats
- types of renderersTesseractException
void createDocuments(java.lang.String[] filenames, java.lang.String[] outputbases, java.util.List<ITesseract.RenderedFormat> formats) throws TesseractException
filenames
- array of input filesoutputbases
- array of output filenames without extensionformats
- types of renderersTesseractException