Interface to TensorFlow Lite model interpreter, excluding experimental methods.
An InterpreterApi instance encapsulates a pre-trained TensorFlow Lite model, in which operations are executed for model inference.
For example, if a model takes only one input and returns only one output:
// create(...) is a static factory that takes the model and an options object.
try (InterpreterApi interpreter =
    InterpreterApi.create(file_of_a_tensorflowlite_model, new InterpreterApi.Options())) {
  interpreter.run(input, output);
}
If a model takes multiple inputs or outputs:
Object[] inputs = {input0, input1, ...};
Map<Integer, Object> map_of_indices_to_outputs = new HashMap<>();
// Float tensor, shape 3x2x4: 24 floats of 4 bytes each, in native byte order.
FloatBuffer ith_output =
    ByteBuffer.allocateDirect(3 * 2 * 4 * Float.BYTES)
        .order(ByteOrder.nativeOrder())
        .asFloatBuffer();
map_of_indices_to_outputs.put(i, ith_output);
try (InterpreterApi interpreter =
    InterpreterApi.create(file_of_a_tensorflowlite_model, new InterpreterApi.Options())) {
  interpreter.runForMultipleInputsOutputs(inputs, map_of_indices_to_outputs);
}
If a model takes or produces string tensors:
String[] input = {"foo", "bar"}; // Input tensor shape is [2].
String[][] output = new String[3][2]; // Output tensor shape is [3, 2].
try (InterpreterApi interpreter =
    InterpreterApi.create(file_of_a_tensorflowlite_model, new InterpreterApi.Options())) {
  // Single input and single output, so run(...) is the matching call.
  interpreter.run(input, output);
}
Note that there is a distinction between shape [] and shape [1]. For scalar string tensor outputs:
String[] input = {"foo"}; // Input tensor shape is [1].
ByteBuffer outputBuffer = ByteBuffer.allocate(OUTPUT_BYTES_SIZE); // Output tensor shape is [].
try (InterpreterApi interpreter =
    InterpreterApi.create(file_of_a_tensorflowlite_model, new InterpreterApi.Options())) {
  // Single input and single output, so run(...) is the matching call.
  interpreter.run(input, outputBuffer);
}
byte[] outputBytes = new byte[outputBuffer.remaining()];
outputBuffer.get(outputBytes);
// Below, the `charset` can be StandardCharsets.UTF_8.
String output = new String(outputBytes, charset);
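The byte-to-string step above depends on the buffer's position and limit: remaining() only counts the bytes between them. The following is a minimal sketch using only java.nio, with no TensorFlow Lite calls; the class and method names (BufferStrings, decodeRemaining) are hypothetical, and it assumes the buffer's position and limit already delimit the string bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the TensorFlow Lite API.
final class BufferStrings {
  /** Decodes the bytes between position and limit without moving the caller's position. */
  static String decodeRemaining(ByteBuffer buffer, Charset charset) {
    ByteBuffer view = buffer.duplicate(); // independent position and limit, shared content
    byte[] bytes = new byte[view.remaining()];
    view.get(bytes);
    return new String(bytes, charset);
  }

  public static void main(String[] args) {
    ByteBuffer buffer = ByteBuffer.allocate(16);
    buffer.put("foo".getBytes(StandardCharsets.UTF_8));
    buffer.flip(); // limit = 3, position = 0: the readable region is exactly the written bytes
    System.out.println(decodeRemaining(buffer, StandardCharsets.UTF_8));
  }
}
```

Working on a duplicate keeps the original buffer's position untouched, which may matter if the same buffer is reused for a later call.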
The order of inputs and outputs is determined when converting a TensorFlow model to a TensorFlow Lite model with Toco, as are the default shapes of the inputs.
When inputs are provided as (multi-dimensional) arrays, the corresponding input tensor(s) will be implicitly resized according to that array's shape. When inputs are provided as Buffer types, no implicit resizing is done; the caller must either ensure that the Buffer byte size matches that of the corresponding tensor, or first resize the tensor via resizeInput(int, int[]). Tensor shape and type information can be obtained via the Tensor class, available via getInputTensor(int) and getOutputTensor(int).
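Because Buffer inputs are not resized implicitly, it can help to derive the buffer size from the tensor shape instead of hard-coding it. The sketch below uses only java.nio; TensorBuffers, numElements, and allocateFloatBuffer are hypothetical names introduced for illustration, and the shape array is assumed to come from something like the Tensor shape information mentioned above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical helper, not part of the TensorFlow Lite API.
final class TensorBuffers {
  /** Number of elements for a shape; a scalar (shape []) has one element. */
  static int numElements(int[] shape) {
    int count = 1;
    for (int dim : shape) {
      if (dim < 0) {
        throw new IllegalArgumentException("Unknown dimension: " + dim);
      }
      count = Math.multiplyExact(count, dim); // fails loudly on int overflow
    }
    return count;
  }

  /** Allocates a direct, native-order float buffer with one slot per tensor element. */
  static FloatBuffer allocateFloatBuffer(int[] shape) {
    int bytes = Math.multiplyExact(numElements(shape), Float.BYTES);
    return ByteBuffer.allocateDirect(bytes).order(ByteOrder.nativeOrder()).asFloatBuffer();
  }

  public static void main(String[] args) {
    FloatBuffer buffer = allocateFloatBuffer(new int[] {1, 4, 4, 3});
    System.out.println(buffer.capacity() + " floats, direct=" + buffer.isDirect());
  }
}
```

FloatBuffer has no direct allocator of its own, so the sketch goes through ByteBuffer.allocateDirect and takes a float view of it.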
WARNING: InterpreterApi instances are not thread-safe.
WARNING: An InterpreterApi instance owns resources that must be explicitly freed by invoking close().
The TFLite library is built against NDK API 19. It may work for Android API levels below 19, but is not guaranteed.
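One way to respect both warnings above is to funnel every call through a single lock and make closing idempotent. The sketch below is an illustration only: InferenceRunner is a hypothetical stand-in interface (not InterpreterApi itself), and LockedRunner is an assumed wrapper name, so the pattern can be shown without depending on the TensorFlow Lite library.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for an interpreter: runs inference and owns resources. */
interface InferenceRunner extends AutoCloseable {
  void run(Object input, Object output);

  @Override
  void close();
}

/** Serializes all access to a non-thread-safe runner behind one lock. */
final class LockedRunner implements InferenceRunner {
  private final ReentrantLock lock = new ReentrantLock();
  private final InferenceRunner delegate;
  private boolean closed;

  LockedRunner(InferenceRunner delegate) {
    this.delegate = delegate;
  }

  @Override
  public void run(Object input, Object output) {
    lock.lock();
    try {
      if (closed) {
        throw new IllegalStateException("runner is closed");
      }
      delegate.run(input, output);
    } finally {
      lock.unlock();
    }
  }

  @Override
  public void close() {
    lock.lock();
    try {
      if (!closed) { // release the underlying resources exactly once
        closed = true;
        delegate.close();
      }
    } finally {
      lock.unlock();
    }
  }
}
```

A single lock keeps the sketch simple; an application with heavy concurrent load might instead prefer one interpreter instance per worker thread.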
Public Methods
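| Modifier and type | Method | Description |
|---|---|---|
| abstract void | allocateTensors() | Explicitly updates allocations for all tensors, if necessary. |
| abstract void | close() | Release resources associated with the InterpreterApi instance. |
| static InterpreterApi | create(File modelFile, InterpreterApi.Options options) | Constructs an InterpreterApi instance, using the specified model and options. |
| static InterpreterApi | create(ByteBuffer byteBuffer, InterpreterApi.Options options) | Constructs an InterpreterApi instance, using the specified model and options. |
| abstract int | getInputIndex(String opName) | Gets index of an input given the op name of the input. |
| abstract Tensor | getInputTensor(int inputIndex) | Gets the Tensor associated with the provided input index. |
| abstract int | getInputTensorCount() | Gets the number of input tensors. |
| abstract Long | getLastNativeInferenceDurationNanoseconds() | Returns native inference timing. |
| abstract int | getOutputIndex(String opName) | Gets index of an output given the op name of the output. |
| abstract Tensor | getOutputTensor(int outputIndex) | Gets the Tensor associated with the provided output index. |
| abstract int | getOutputTensorCount() | Gets the number of output Tensors. |
| abstract void | resizeInput(int idx, int[] dims, boolean strict) | Resizes idx-th input of the native model to the given dims. |
| abstract void | resizeInput(int idx, int[] dims) | Resizes idx-th input of the native model to the given dims. |
| abstract void | run(Object input, Object output) | Runs model inference if the model takes only one input, and provides only one output. |
| abstract void | runForMultipleInputsOutputs(Object[] inputs, Map<Integer, Object> outputs) | Runs model inference if the model takes multiple inputs, or returns multiple outputs. |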
Public Methods
public abstract void allocateTensors ()
Explicitly updates allocations for all tensors, if necessary.
This will propagate shapes and memory allocations for dependent tensors using the input tensor shape(s) as given.
Note: This call is *purely optional*. Tensor allocation will occur automatically during execution if any input tensors have been resized. This call is most useful in determining the shapes for any output tensors before executing the graph, e.g.,
interpreter.resizeInput(0, new int[]{1, 4, 4, 3});
interpreter.allocateTensors();
FloatBuffer input = FloatBuffer.allocate(interpreter.getInputTensor(0).numElements());
// Populate inputs...
FloatBuffer output = FloatBuffer.allocate(interpreter.getOutputTensor(0).numElements());
interpreter.run(input, output);
// Process outputs...
Note: Some graphs have dynamically shaped outputs, in which case the output shape may not fully propagate until inference is executed.
Throws
public abstract void close ()
Releases resources associated with the InterpreterApi instance.
public static InterpreterApi create ( File modelFile, InterpreterApi.Options options)
Constructs an InterpreterApi instance, using the specified model and options. The model will be loaded from a file.
Parameters
| Parameter | Description |
|---|---|
| modelFile | A file containing a pre-trained TF Lite model. |
| options | A set of options for customizing interpreter behavior. |
Throws
If modelFile does not encode a valid TensorFlow Lite model.
public static InterpreterApi create ( ByteBuffer byteBuffer, InterpreterApi.Options options)
Constructs an InterpreterApi instance, using the specified model and options. The model will be read from a ByteBuffer.
Parameters
| Parameter | Description |
|---|---|
| byteBuffer | A pre-trained TF Lite model, in binary serialized form. The ByteBuffer should not be modified after the construction of an InterpreterApi instance. The ByteBuffer can be either a MappedByteBuffer that memory-maps a model file, or a direct ByteBuffer of nativeOrder() that contains the bytes content of a model. |
| options | A set of options for customizing interpreter behavior. |
Throws
If byteBuffer is neither a MappedByteBuffer nor a direct ByteBuffer of nativeOrder.
public abstract int getInputIndex ( String opName)
Gets index of an input given the op name of the input.
Parameters
Throws
If opName does not match any input in the model used to initialize the interpreter.
public abstract Tensor getInputTensor (int inputIndex)
Gets the Tensor associated with the provided input index.
Parameters
Throws
If inputIndex is negative or is not smaller than the number of model inputs.
public abstract int getInputTensorCount ()
Gets the number of input tensors.
public abstract Long getLastNativeInferenceDurationNanoseconds ()
Returns native inference timing.
Throws
public abstract int getOutputIndex ( String opName)
Gets index of an output given the op name of the output.
Parameters
Throws
If opName does not match any output in the model used to initialize the interpreter.
public abstract Tensor getOutputTensor (int outputIndex)
Gets the Tensor associated with the provided output index.
Note: Output tensor details (e.g., shape) may not be fully populated until after inference is executed. If you need updated details *before* running inference (e.g., after resizing an input tensor, which may invalidate output tensor shapes), use allocateTensors() to explicitly trigger allocation and shape propagation. Note that, for graphs with output shapes that are dependent on input *values*, the output shape may not be fully determined until running inference.
Parameters
Throws
If outputIndex is negative or is not smaller than the number of model outputs.
public abstract int getOutputTensorCount ()
Gets the number of output Tensors.
public abstract void resizeInput (int idx, int[] dims, boolean strict)
Resizes idx-th input of the native model to the given dims.
When `strict` is true, only unknown dimensions can be resized. Unknown dimensions are indicated as `-1` in the array returned by `Tensor.shapeSignature()`.
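The strict rule above can be expressed as a small check over a shape signature. This is an illustration of the rule as stated, not the library's implementation; StrictResize and isAllowed are hypothetical names, and requiring the two arrays to have the same rank is an added assumption.

```java
// Hypothetical helper, not part of the TensorFlow Lite API.
final class StrictResize {
  /**
   * Returns true if every dimension that differs from the signature is unknown (-1)
   * in the signature. Assumes (hypothetically) that the rank must not change.
   */
  static boolean isAllowed(int[] shapeSignature, int[] dims) {
    if (shapeSignature.length != dims.length) {
      return false;
    }
    for (int i = 0; i < dims.length; i++) {
      boolean unknown = shapeSignature[i] == -1;
      if (!unknown && shapeSignature[i] != dims[i]) {
        return false; // a fixed dimension would change
      }
    }
    return true;
  }

  public static void main(String[] args) {
    int[] signature = {-1, 224, 224, 3}; // only the batch dimension is unknown
    System.out.println(isAllowed(signature, new int[] {4, 224, 224, 3}));
    System.out.println(isAllowed(signature, new int[] {1, 256, 256, 3}));
  }
}
```

In this example only the first request would pass, since the second one tries to change two fixed dimensions.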
Parameters
| Parameter |
|---|
| idx |
| dims |
| strict |
Throws
If idx is negative or is not smaller than the number of model inputs, or if an error occurs when resizing the idx-th input. An error also occurs when attempting to resize a tensor with fixed dimensions when `strict` is true.
public abstract void resizeInput (int idx, int[] dims)
Resizes idx-th input of the native model to the given dims.
Parameters
| Parameter |
|---|
| idx |
| dims |
Throws
If idx is negative or is not smaller than the number of model inputs, or if an error occurs when resizing the idx-th input.
public abstract void run ( Object input, Object output)
Runs model inference if the model takes only one input, and provides only one output.
Warning: The API is more efficient if a Buffer (preferably direct, but not required) is used as the input/output data type. Please consider using Buffer to feed and fetch primitive data for better performance. The following concrete Buffer types are supported:
- ByteBuffer - compatible with any underlying primitive Tensor type.
- FloatBuffer - compatible with float Tensors.
- IntBuffer - compatible with int32 Tensors.
- LongBuffer - compatible with int64 Tensors.
Note that boolean types are only supported as arrays, not Buffers, or as scalar inputs.
Parameters
| Parameter | Description |
|---|---|
| input | An array or multidimensional array, or a Buffer of primitive types including int, float, long, and byte. Buffer is the preferred way to pass large input data for primitive types, whereas string types require using the (multi-dimensional) array input path. When a Buffer is used, its content should remain unchanged until model inference is done, and the caller must ensure that the Buffer is at the appropriate read position. A null value is allowed only if the caller is using a Delegate that allows buffer handle interop, and such a buffer has been bound to the input Tensor. |
| output | A multidimensional array of output data, or a Buffer of primitive types including int, float, long, and byte. When a Buffer is used, the caller must ensure that it is set to the appropriate write position. A null value is allowed, and is useful for certain cases, e.g., if the caller is using a Delegate that allows buffer handle interop, and such a buffer has been bound to the output Tensor (see also Interpreter.Options#setAllowBufferHandleOutput(boolean)), or if the graph has dynamically shaped outputs and the caller must query the output Tensor shape after inference has been invoked, fetching the data directly from the output tensor (via Tensor.asReadOnlyBuffer()). |
Throws
| Exception | Condition |
|---|---|
| IllegalArgumentException | if input is null or empty, or if an error occurs when running inference. |
| IllegalArgumentException | (EXPERIMENTAL, subject to change) if the inference is interrupted by setCancelled(true). |
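The read-position requirement for Buffer inputs is easy to miss: after put(...) the position sits at the end of the written data, so a reader starting there would see nothing. The sketch below uses only java.nio to show one way to prepare a buffer; BufferPositions and fillForReading are hypothetical names, and it assumes the reader should start at element 0.

```java
import java.nio.FloatBuffer;

// Hypothetical helper, not part of the TensorFlow Lite API.
final class BufferPositions {
  /** Copies values into the buffer and resets the position so a reader starts at element 0. */
  static FloatBuffer fillForReading(FloatBuffer buffer, float[] values) {
    buffer.clear();     // position = 0, limit = capacity
    buffer.put(values); // position advances to values.length
    buffer.rewind();    // position back to 0, limit unchanged
    return buffer;
  }

  public static void main(String[] args) {
    FloatBuffer input = FloatBuffer.allocate(4);
    fillForReading(input, new float[] {1f, 2f, 3f, 4f});
    System.out.println("position=" + input.position() + ", remaining=" + input.remaining());
  }
}
```

rewind() is used instead of flip() so the limit stays at the full capacity, which fits the case where the buffer is sized to the whole tensor.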
public abstract void runForMultipleInputsOutputs ( Object[] inputs, Map < Integer , Object > outputs)
Runs model inference if the model takes multiple inputs, or returns multiple outputs.
Warning: The API is more efficient if Buffers (preferably direct, but not required) are used as the input/output data types. Please consider using Buffer to feed and fetch primitive data for better performance. The following concrete Buffer types are supported:
- ByteBuffer - compatible with any underlying primitive Tensor type.
- FloatBuffer - compatible with float Tensors.
- IntBuffer - compatible with int32 Tensors.
- LongBuffer - compatible with int64 Tensors.
Note that boolean types are only supported as arrays, not Buffers, or as scalar inputs.
Note: null values for individual elements of inputs and outputs are allowed only if the caller is using a Delegate that allows buffer handle interop, and such a buffer has been bound to the corresponding input or output Tensor(s).
Parameters
| Parameter | Description |
|---|---|
| inputs | An array of input data. The inputs should be in the same order as inputs of the model. Each input can be an array or multidimensional array, or a Buffer of primitive types including int, float, long, and byte. Buffer is the preferred way to pass large input data, whereas string types require using the (multi-dimensional) array input path. When a Buffer is used, its content should remain unchanged until model inference is done, and the caller must ensure that the Buffer is at the appropriate read position. |
| outputs | A map from output indices to multidimensional arrays of output data or Buffers of primitive types including int, float, long, and byte. It only needs to keep entries for the outputs to be used. When a Buffer is used, the caller must ensure that it is set to the appropriate write position. The map may be empty for cases where either buffer handles are used for output tensor data, or cases where the outputs are dynamically shaped and the caller must query the output Tensor shape after inference has been invoked, fetching the data directly from the output tensor (via Tensor.asReadOnlyBuffer()). |
Throws
If inputs is null or empty, if outputs is null, or if an error occurs when running inference.
