Package | Description |
---|---|
net.sourceforge.tess4j |
Modifier and Type | Method and Description |
---|---|
TessAPI.TessBaseAPI |
TessAPI.TessBaseAPICreate()
Creates an instance of the base class for all Tesseract APIs.
|
Modifier and Type | Method and Description |
---|---|
int |
TessAPI.TessBaseAPIAdaptToWordStr(TessAPI.TessBaseAPI handle,
int mode,
java.lang.String wordstr)
Applies the given word to the adaptive classifier if possible.
|
com.sun.jna.ptr.IntByReference |
TessAPI.TessBaseAPIAllWordConfidences(TessAPI.TessBaseAPI handle)
Returns all word confidences (between 0 and 100) in an array, terminated
by -1.
|
TessAPI.TessPageIterator |
TessAPI.TessBaseAPIAnalyseLayout(TessAPI.TessBaseAPI handle)
Runs page layout analysis in the mode set by SetPageSegMode.
|
void |
TessAPI.TessBaseAPIClear(TessAPI.TessBaseAPI handle)
Frees recognition results and any stored image data without freeing
recognition data that would be time-consuming to reload.
|
void |
TessAPI.TessBaseAPIClearAdaptiveClassifier(TessAPI.TessBaseAPI handle)
Call between pages or documents to free memory and discard adaptive
data.
|
void |
TessAPI.TessBaseAPIDelete(TessAPI.TessBaseAPI handle)
Disposes of the TesseractAPI instance.
|
void |
TessAPI.TessBaseAPIEnd(TessAPI.TessBaseAPI handle)
Closes down Tesseract and frees all memory.
|
com.sun.jna.ptr.PointerByReference |
TessAPI.TessBaseAPIGetAvailableLanguagesAsVector(TessAPI.TessBaseAPI handle)
Returns the available languages as a vector of STRINGs.
|
int |
TessAPI.TessBaseAPIGetBoolVariable(TessAPI.TessBaseAPI handle,
java.lang.String name,
java.nio.IntBuffer value) |
java.lang.String |
TessAPI.TessBaseAPIGetBoxText(TessAPI.TessBaseAPI handle,
int page_number)
The recognized text is returned as a char* which is coded in the same
format as a box file used in training.
|
int |
TessAPI.TessBaseAPIGetDoubleVariable(TessAPI.TessBaseAPI handle,
java.lang.String name,
java.nio.DoubleBuffer value) |
java.lang.String |
TessAPI.TessBaseAPIGetHOCRText(TessAPI.TessBaseAPI handle,
int page_number)
Makes an HTML-formatted string with hOCR markup from the internal data
structures.
|
java.lang.String |
TessAPI.TessBaseAPIGetInitLanguagesAsString(TessAPI.TessBaseAPI handle)
Returns the languages string used in the last valid initialization.
|
int |
TessAPI.TessBaseAPIGetIntVariable(TessAPI.TessBaseAPI handle,
java.lang.String name,
java.nio.IntBuffer value)
Returns true (1) if the parameter was found among Tesseract parameters.
|
TessAPI.TessResultIterator |
TessAPI.TessBaseAPIGetIterator(TessAPI.TessBaseAPI handle)
Get a reading-order iterator to the results of LayoutAnalysis and/or
Recognize.
|
com.sun.jna.ptr.PointerByReference |
TessAPI.TessBaseAPIGetLoadedLanguagesAsVector(TessAPI.TessBaseAPI handle)
Returns the loaded languages as a vector of STRINGs.
|
int |
TessAPI.TessBaseAPIGetPageSegMode(TessAPI.TessBaseAPI handle)
Return the current page segmentation mode.
|
java.lang.String |
TessAPI.TessBaseAPIGetStringVariable(TessAPI.TessBaseAPI handle,
java.lang.String name) |
int |
TessAPI.TessBaseAPIGetTextDirection(TessAPI.TessBaseAPI handle,
java.nio.IntBuffer out_offset,
java.nio.FloatBuffer out_slope) |
java.lang.String |
TessAPI.TessBaseAPIGetUnichar(TessAPI.TessBaseAPI handle,
int unichar_id)
This method returns the string form of the specified unichar.
|
java.lang.String |
TessAPI.TessBaseAPIGetUNLVText(TessAPI.TessBaseAPI handle)
The recognized text is returned as a char* encoded as UNLV-format
Latin-1 with specific reject and suspect codes; it must be freed with the
delete[] operator.
|
java.lang.String |
TessAPI.TessBaseAPIGetUTF8Text(TessAPI.TessBaseAPI handle)
The recognized text is returned as a char* encoded as UTF-8; it
must be freed with the delete[] operator.
|
int |
TessAPI.TessBaseAPIInit1(TessAPI.TessBaseAPI handle,
java.lang.String datapath,
java.lang.String language,
int oem,
com.sun.jna.ptr.PointerByReference configs,
int configs_size)
Instances are now mostly thread-safe and totally independent, but some
global parameters remain.
|
int |
TessAPI.TessBaseAPIInit2(TessAPI.TessBaseAPI handle,
java.lang.String datapath,
java.lang.String language,
int oem) |
int |
TessAPI.TessBaseAPIInit3(TessAPI.TessBaseAPI handle,
java.lang.String datapath,
java.lang.String language) |
void |
TessAPI.TessBaseAPIInitForAnalysePage(TessAPI.TessBaseAPI handle)
Initializes only for page layout analysis.
|
int |
TessAPI.TessBaseAPIInitLangMod(TessAPI.TessBaseAPI handle,
java.lang.String datapath,
java.lang.String language)
Initializes only the language model component of Tesseract.
|
int |
TessAPI.TessBaseAPIIsValidWord(TessAPI.TessBaseAPI handle,
java.lang.String word)
Check whether a word is valid according to Tesseract's language model.
|
int |
TessAPI.TessBaseAPIMeanTextConf(TessAPI.TessBaseAPI handle)
Returns the (average) confidence value between 0 and 100.
|
void |
TessAPI.TessBaseAPIPrintVariablesToFile(TessAPI.TessBaseAPI handle,
java.lang.String filename)
Print Tesseract parameters to the given file.
Note: Must not be the first method called after the instance is created. |
java.lang.String |
TessAPI.TessBaseAPIProcessPages(TessAPI.TessBaseAPI handle,
java.lang.String filename,
java.lang.String retry_config,
int timeout_millisec)
Recognizes all the pages in the named file, as a multi-page TIFF or list
of filenames, or single image, and gets the appropriate kind of text
according to parameters:
tessedit_create_boxfile,
tessedit_make_boxes_from_boxes,
tessedit_write_unlv,
tessedit_create_hocr. |
void |
TessAPI.TessBaseAPIReadConfigFile(TessAPI.TessBaseAPI handle,
java.lang.String filename,
int init_only)
Reads a "config" file containing a set of parameter-value pairs.
|
int |
TessAPI.TessBaseAPIRecognize(TessAPI.TessBaseAPI handle,
TessAPI.ETEXT_DESC monitor)
Recognize the image from SetAndThresholdImage, generating Tesseract
internal structures.
|
int |
TessAPI.TessBaseAPIRecognizeForChopTest(TessAPI.TessBaseAPI handle,
TessAPI.ETEXT_DESC monitor)
Variant on Recognize used for testing chopper.
|
java.lang.String |
TessAPI.TessBaseAPIRect(TessAPI.TessBaseAPI handle,
java.nio.ByteBuffer imagedata,
int bytes_per_pixel,
int bytes_per_line,
int left,
int top,
int width,
int height)
Recognize a rectangle from an image and return the result as a string.
|
void |
TessAPI.TessBaseAPISetImage(TessAPI.TessBaseAPI handle,
java.nio.ByteBuffer imagedata,
int width,
int height,
int bytes_per_pixel,
int bytes_per_line)
Provide an image for Tesseract to recognize.
|
void |
TessAPI.TessBaseAPISetInputName(TessAPI.TessBaseAPI handle,
java.lang.String name)
Set the name of the input file.
|
void |
TessAPI.TessBaseAPISetOutputName(TessAPI.TessBaseAPI handle,
java.lang.String name)
Set the name of the bonus output files.
|
void |
TessAPI.TessBaseAPISetPageSegMode(TessAPI.TessBaseAPI handle,
int mode)
Set the current page segmentation mode.
|
void |
TessAPI.TessBaseAPISetRectangle(TessAPI.TessBaseAPI handle,
int left,
int top,
int width,
int height)
Restrict recognition to a sub-rectangle of the image.
|
void |
TessAPI.TessBaseAPISetSourceResolution(TessAPI.TessBaseAPI handle,
int ppi)
Set the resolution of the source image in pixels per inch so font size
information can be calculated in results.
|
int |
TessAPI.TessBaseAPISetVariable(TessAPI.TessBaseAPI handle,
java.lang.String name,
java.lang.String value)
Set the value of an internal "parameter." Supply the name of the
parameter and the value as a string, just as you would in a config file.
|
java.lang.String |
TessAPI.TessBaseGetInitLanguagesAsString(TessAPI.TessBaseAPI handle)
Returns the languages string used in the last valid initialization.
|
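
The description of TessAPI.TessBaseAPIAllWordConfidences says the result is an array of confidences (0 to 100) terminated by -1. The following is a minimal sketch of how such a sentinel-terminated array could be trimmed to its valid entries. `WordConfidences` and `untilSentinel` are hypothetical names, not part of tess4j, and the sketch works on a plain `int[]` so that it stays self-contained; it assumes the native values have already been copied into a Java array.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of tess4j. */
final class WordConfidences {
    private WordConfidences() {}

    /**
     * Returns the values that precede the first -1 in {@code raw}.
     * If no -1 is present, a copy of the whole array is returned.
     */
    static int[] untilSentinel(int[] raw) {
        int n = 0;
        // Advance until the -1 terminator (or the end of the buffer).
        while (n < raw.length && raw[n] != -1) {
            n++;
        }
        return Arrays.copyOf(raw, n);
    }
}
```

With tess4j, the values would presumably be read from the returned IntByReference through its JNA pointer, for example by calling `getPointer().getInt(offset)` at successive 4-byte offsets until -1 is seen. That wiring is an assumption here and is left out of the sketch.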