Common interface for every entity across the hierarchy of recognized text. An entity may contain other smaller entities, or may be an atom.
Public Method Summary
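abstract Rect | getBoundingBox() | Axis-aligned bounding box containing the text.
abstract List<? extends Text> | getComponents() | Smaller components that comprise this entity, if any.
abstract Point[] | getCornerPoints() | Four corner points in clockwise direction starting with top-left.
abstract String | getLanguage() | Prevailing language in the text, if any.
abstract String | getValue() | Retrieve the recognized text as a string.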
Public Methods
public abstract Rect getBoundingBox ()
Axis-aligned bounding box containing the text. The bounding box may extend past the image boundary.
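Because the box may extend past the image boundary, a caller that crops or draws with it may want to clamp it first. The following is a minimal sketch under stated assumptions: `Box` is a plain-Java stand-in for `Rect`, and `clampToImage` is a hypothetical helper. Neither is part of this API.

```java
/** Stand-in for a rectangle type such as Rect; hypothetical, for illustration only. */
final class Box {
    final int left, top, right, bottom;

    Box(int left, int top, int right, int bottom) {
        this.left = left;
        this.top = top;
        this.right = right;
        this.bottom = bottom;
    }
}

final class BoundingBoxClamp {
    private BoundingBoxClamp() {}

    /**
     * Returns the part of the box that lies inside an image of the given size,
     * or null if the box does not overlap the image at all.
     */
    static Box clampToImage(Box box, int imageWidth, int imageHeight) {
        int left = Math.max(0, box.left);
        int top = Math.max(0, box.top);
        int right = Math.min(imageWidth, box.right);
        int bottom = Math.min(imageHeight, box.bottom);
        if (left >= right || top >= bottom) {
            return null; // nothing of the box is visible
        }
        return new Box(left, top, right, bottom);
    }
}
```

On Android, the same effect could likely be achieved with the rectangle type's own intersection support; the stand-in is used here only to keep the sketch self-contained.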
public abstract List <? extends Text > getComponents ()
Smaller components that comprise this entity, if any. If this entity is an atom, an empty list is returned. TextBlock is at the top of the Text hierarchy. A TextBlock contains Line objects, which contain Elements. Elements are atoms. We may decide to add character-level objects in later versions.
For example, a client could draw bounding boxes for recognized text in different colors for paragraphs, lines, and words (and for individual characters, should character-level objects be added) by repeatedly traversing down the tree with this method.
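The traversal described above can be sketched as a depth-first walk. This is an illustration under stated assumptions, not the library's code: `TextNode` is a hypothetical stand-in that mirrors only `getComponents()` and `getValue()`, and `TextTreeWalker` with its `walk` and `node` helpers is invented for the example. A real client would use the depth to choose a color and draw each entity's bounding box instead of recording strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Stand-in mirroring two methods of the Text interface; hypothetical, for illustration only. */
interface TextNode {
    List<? extends TextNode> getComponents();
    String getValue();
}

final class TextTreeWalker {
    private TextTreeWalker() {}

    /** Visits the node and all its descendants depth-first, recording "depth:value" for each. */
    static List<String> walk(TextNode root) {
        List<String> out = new ArrayList<>();
        walk(root, 0, out);
        return out;
    }

    private static void walk(TextNode node, int depth, List<String> out) {
        // Depth 0 would be a block, 1 a line, 2 an element (word).
        out.add(depth + ":" + node.getValue());
        // Atoms return an empty list, so the recursion ends there.
        for (TextNode child : node.getComponents()) {
            walk(child, depth + 1, out);
        }
    }

    /** Builds a fixed in-memory node so the walk can be tried without a detector. */
    static TextNode node(final String value, TextNode... children) {
        final List<TextNode> kids = Collections.unmodifiableList(Arrays.asList(children));
        return new TextNode() {
            @Override public List<? extends TextNode> getComponents() { return kids; }
            @Override public String getValue() { return value; }
        };
    }
}
```

Recursing until `getComponents()` is empty, rather than hard-coding three levels, means the walk would keep working if character-level objects are added later.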
public abstract Point[] getCornerPoints ()
Four corner points in clockwise order, starting with the top-left. Because of possible perspective distortion, the points do not necessarily form a rectangle. Parts of the region may lie outside the image.
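Since the four points need not form an axis-aligned rectangle, a caller may want to check for that case or estimate how far the text is rotated. The sketch below assumes image coordinates with y increasing downward and the clockwise, top-left-first ordering described above. `Pt` is a plain-Java stand-in for `Point`, and `CornerGeometry` is a hypothetical helper class, not part of this API.

```java
/** Stand-in for a 2D integer point type such as Point; hypothetical, for illustration only. */
final class Pt {
    final int x, y;

    Pt(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

final class CornerGeometry {
    private CornerGeometry() {}

    /**
     * Angle in degrees of the top edge (top-left to top-right).
     * Returns 0 for upright text; with y growing downward, positive values
     * mean the text is rotated clockwise on screen.
     */
    static double topEdgeAngleDegrees(Pt[] corners) {
        requireFour(corners);
        Pt topLeft = corners[0];
        Pt topRight = corners[1];
        return Math.toDegrees(Math.atan2(topRight.y - topLeft.y, topRight.x - topLeft.x));
    }

    /** True when the points (top-left, top-right, bottom-right, bottom-left) form an axis-aligned rectangle. */
    static boolean isAxisAligned(Pt[] c) {
        requireFour(c);
        return c[0].y == c[1].y   // top edge is horizontal
            && c[1].x == c[2].x   // right edge is vertical
            && c[2].y == c[3].y   // bottom edge is horizontal
            && c[3].x == c[0].x;  // left edge is vertical
    }

    private static void requireFour(Pt[] corners) {
        if (corners == null || corners.length != 4) {
            throw new IllegalArgumentException("expected four corner points");
        }
    }
}
```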
public abstract String getLanguage ()
Prevailing language in the text, if any. The value is a BCP 47 language tag (e.g. "en" or "sr-Latn-BA"), or "und" if the language could not be determined.
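A caller will usually want to treat "und" separately from a real tag. A minimal sketch using the standard `java.util.Locale.forLanguageTag` method follows; `LanguageTags.toLocale` is a hypothetical helper, and treating a null or empty string as "undetermined" is an assumption of this example rather than documented behavior.

```java
import java.util.Locale;
import java.util.Optional;

final class LanguageTags {
    private LanguageTags() {}

    /**
     * Converts a BCP 47 tag, as returned by getLanguage(), into a Locale.
     * Returns an empty Optional for "und"; null and "" are also treated as
     * undetermined (an assumption made for this sketch).
     */
    static Optional<Locale> toLocale(String tag) {
        if (tag == null || tag.isEmpty() || "und".equals(tag)) {
            return Optional.empty();
        }
        return Optional.of(Locale.forLanguageTag(tag));
    }
}
```

For the tag "sr-Latn-BA", the resulting `Locale` exposes the language, script, and region separately through `getLanguage()`, `getScript()`, and `getCountry()`.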
public abstract String getValue ()
Returns the recognized text as a string, in reading order for the language. For Latin-script languages, this is top to bottom within a TextBlock and left to right within a Line.