Interpreter interface for running TensorFlow Lite models.
tf.lite.Interpreter(
    model_path=None,
    model_content=None,
    experimental_delegates=None,
    num_threads=None,
    experimental_op_resolver_type=tf.lite.experimental.OpResolverType.AUTO,
    experimental_preserve_all_tensors=False,
    experimental_disable_delegate_clustering=False,
    experimental_default_delegate_latest_features=False
)
Models obtained from TFLiteConverter can be run in Python with Interpreter.
As an example, let's generate a simple Keras model and convert it to TFLite (TFLiteConverter also supports other input formats with from_saved_model and from_concrete_functions):
    import numpy as np
    import tensorflow as tf

    # Tiny training set for y = 2x.
    x = np.array([[1.], [2.]])
    y = np.array([[2.], [4.]])
    model = tf.keras.models.Sequential([
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(units=1, input_shape=[1])
    ])
    model.compile(optimizer='sgd', loss='mean_squared_error')
    model.fit(x, y, epochs=1)
    # Convert the trained Keras model to a serialized TFLite model.
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    tflite_model = converter.convert()
tflite_model can be saved to a file and loaded later, or loaded directly into the Interpreter. Since TensorFlow Lite pre-plans tensor allocations to optimize inference, the user needs to call allocate_tensors() before any inference.
    interpreter = tf.lite.Interpreter(model_content=tflite_model)
    interpreter.allocate_tensors()  # Needed before execution!
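The remark that tflite_model can be saved to a file and loaded later could be sketched as follows. This assumes the converted model is a bytes object, as returned by converter.convert(); the helper names save_tflite_model and load_tflite_bytes, the file name, and the placeholder bytes are hypothetical and exist only so the sketch runs without TensorFlow.

```python
import os
import tempfile
from pathlib import Path


def save_tflite_model(model_bytes, path):
    """Write the serialized model to disk and return the path as a string."""
    path = Path(path)
    path.write_bytes(model_bytes)
    return str(path)


def load_tflite_bytes(path):
    """Read a serialized model back into memory."""
    return Path(path).read_bytes()


# Placeholder bytes (hypothetical); in practice this would be the
# `tflite_model` produced by converter.convert().
example_bytes = b"placeholder model bytes"
saved_path = save_tflite_model(
    example_bytes, os.path.join(tempfile.gettempdir(), "example_model.tflite"))
restored = load_tflite_bytes(saved_path)
```

The saved file can then be handed to the interpreter through the model_path argument, or the restored bytes through model_content.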
Sample execution:
    output = interpreter.get_output_details()[0]  # Model has single output.
    input = interpreter.get_input_details()[0]  # Model has single input.
    input_data = tf.constant(1., shape=[1, 1])
    interpreter.set_tensor(input['index'], input_data)
    interpreter.invoke()
    interpreter.get_tensor(output['index']).shape  # (1, 1)
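The set_tensor / invoke / get_tensor cycle shown above can be wrapped in a small helper. The sketch below is written against any object that exposes those methods the way tf.lite.Interpreter does; the function name run_single_io_model is hypothetical, and it assumes a model with exactly one input and one output on which allocate_tensors() has already been called.

```python
def run_single_io_model(interpreter, input_data):
    """Run one inference on a model with a single input and a single output.

    `interpreter` is expected to behave like tf.lite.Interpreter, and
    allocate_tensors() must already have been called on it.
    """
    input_details = interpreter.get_input_details()[0]
    output_details = interpreter.get_output_details()[0]
    # Copy the input into the interpreter, run the model, read the result.
    interpreter.set_tensor(input_details["index"], input_data)
    interpreter.invoke()
    # get_tensor() returns a copy, so the result stays valid after later calls.
    return interpreter.get_tensor(output_details["index"])
```

With the interpreter built earlier, a call such as run_single_io_model(interpreter, input_data) would replace the last three lines of the sample execution.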
Use get_signature_runner() for a more user-friendly inference API.
Args
with tf.control_dependencies()) since the TF Lite converter will drop control dependencies by default. Most users shouldn't set this flag to True if they don't insert explicit control dependencies or if the graph already executes in the expected order. For automatically inserted control dependencies (with tf.Variable, tf.Print, etc.), the user doesn't need to set this flag to True since they are respected by default. Note that this flag is currently experimental, and it might be removed or updated if the TF Lite converter stops dropping such control dependencies in the model. Default is False.
Methods
allocate_tensors
allocate_tensors()
get_input_details
get_input_details()
Gets model input tensor details.
- name: The tensor name.
- index: The tensor index in the interpreter.
- shape: The shape of the tensor.
- shape_signature: Same as shape for models with known/fixed shapes. If any dimension sizes are unknown, they are indicated with -1.
- dtype: The numpy data type (such as np.int32 or np.uint8).
- quantization: Deprecated, use quantization_parameters. This field only works for per-tensor quantization, whereas quantization_parameters works in all cases.
- quantization_parameters: A dictionary of parameters used to quantize the tensor:
  - scales: List of scales (one if per-tensor quantization).
  - zero_points: List of zero_points (one if per-tensor quantization).
  - quantized_dimension: Specifies the dimension of per-axis quantization, in the case of multiple scales/zero_points.
- sparsity_parameters: A dictionary of parameters used to encode a sparse tensor. This is empty if the tensor is dense.
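To illustrate how the quantization_parameters fields listed above might be used, the sketch below converts quantized integer values back to real numbers with the usual affine mapping real = scale * (quantized - zero_point). It handles only the per-tensor case (a single scale and zero point) and uses plain Python lists; the function name dequantize is hypothetical, and treating an empty scales list as "not quantized" is an assumption.

```python
def dequantize(quantized_values, quantization_parameters):
    """Map quantized integers back to real values (per-tensor case only)."""
    scales = list(quantization_parameters["scales"])
    zero_points = list(quantization_parameters["zero_points"])
    if not scales:
        # Assumption: no scales means the tensor is not quantized.
        return [float(q) for q in quantized_values]
    if len(scales) != 1 or len(zero_points) != 1:
        # Per-axis quantization needs quantized_dimension; out of scope here.
        raise NotImplementedError("per-axis quantization is not handled")
    scale, zero_point = scales[0], zero_points[0]
    return [scale * (q - zero_point) for q in quantized_values]


# Hypothetical parameters for a uint8 tensor.
params = {"scales": [0.5], "zero_points": [128], "quantized_dimension": 0}
real_values = dequantize([0, 128, 255], params)  # [-64.0, 0.0, 63.5]
```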
get_output_details
get_output_details()
Gets model output tensor details.
Each item is a dictionary with the same fields as those described for get_input_details().
get_signature_list
get_signature_list()
Gets the list of SignatureDefs in the model.
Example:
    signatures = interpreter.get_signature_list()
    print(signatures)
    # {
    #   'add': {'inputs': ['x', 'y'], 'outputs': ['output_0']}
    # }
Then, using the names in the signature list, you can get a callable from get_signature_runner().
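Because the signature list names every input, it can be used to check keyword arguments before calling a signature runner. The following is a minimal sketch that works on a dictionary shaped like the printed example above; the helper name check_signature_inputs is hypothetical.

```python
def check_signature_inputs(signatures, signature_key, provided_names):
    """Return (missing, unexpected) input names for one signature."""
    if signature_key not in signatures:
        raise KeyError(f"Unknown signature: {signature_key!r}")
    expected = set(signatures[signature_key]["inputs"])
    provided = set(provided_names)
    return sorted(expected - provided), sorted(provided - expected)


# Same structure as the output of get_signature_list() shown above.
signatures = {"add": {"inputs": ["x", "y"], "outputs": ["output_0"]}}
missing, unexpected = check_signature_inputs(signatures, "add", ["x", "z"])
```

Here missing lists the inputs that still need a value and unexpected lists names the signature does not define.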
get_signature_runner
get_signature_runner(signature_key=None)
Gets callable for inference of specific SignatureDef.
Example usage:
    interpreter = tf.lite.Interpreter(model_content=tflite_model)
    interpreter.allocate_tensors()
    fn = interpreter.get_signature_runner('div_with_remainder')
    output = fn(x=np.array([3]), y=np.array([2]))
    print(output)
    # {
    #   'quotient': array([1.], dtype=float32)
    #   'remainder': array([1.], dtype=float32)
    # }
None can be passed for signature_key if the model has a single Signature only.
All names used are these specific SignatureDef names.
Args
signature_key
Raises
ValueError
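The rule that None may be passed for signature_key only when the model has a single signature can be expressed as a small lookup over the signature list. This is a sketch of that rule, not the library's implementation; the function name resolve_signature_key and the error messages are hypothetical.

```python
def resolve_signature_key(signatures, signature_key=None):
    """Pick the signature to run, defaulting only when there is exactly one."""
    if signature_key is None:
        if len(signatures) != 1:
            raise ValueError(
                "signature_key is required when the model does not have "
                "exactly one signature")
        return next(iter(signatures))
    if signature_key not in signatures:
        raise ValueError(f"Unknown signature_key: {signature_key!r}")
    return signature_key


single = {"add": {"inputs": ["x", "y"], "outputs": ["output_0"]}}
default_key = resolve_signature_key(single)  # "add"
```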
get_tensor
get_tensor(tensor_index, subgraph_index=0)
Gets the value of the output tensor (get a copy).
If you wish to avoid the copy, use tensor(). This function cannot be used to read intermediate results.
Args
tensor_index
subgraph_index
get_tensor_details
get_tensor_details()
Gets tensor details for every tensor with valid tensor details.
Tensors where required information about the tensor is not found are not added to the list. This includes temporary tensors without a name.
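Since the result is a list of per-tensor dictionaries, a name-to-index lookup is often handy. The sketch below assumes each entry carries the name and index fields described for get_input_details(); the helper name tensor_index_by_name and the sample tensor names are hypothetical.

```python
def tensor_index_by_name(tensor_details):
    """Build a {name: index} lookup from a list of tensor detail dicts."""
    return {detail["name"]: detail["index"] for detail in tensor_details}


# Hypothetical entries shaped like the items of get_tensor_details().
sample_details = [
    {"name": "serving_default_input:0", "index": 0},
    {"name": "dense/BiasAdd", "index": 3},
]
lookup = tensor_index_by_name(sample_details)
```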
invoke
invoke()
Invoke the interpreter.
Be sure to set the input sizes, allocate tensors and fill values before calling this. Also, note that this function releases the GIL so heavy computation can be done in the background while the Python interpreter continues. No other function on this object should be called while the invoke() call has not finished.
Raises
ValueError
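Because invoke() releases the GIL and no other method should be called on the object until it returns, code that shares one interpreter between threads needs to serialize access. One way to do that, sketched here with a standard threading.Lock, is to wrap the whole set/invoke/get cycle; the class name SerializedInterpreter and its run method are hypothetical, and the wrapped object only needs to behave like tf.lite.Interpreter.

```python
import threading


class SerializedInterpreter:
    """Guards an interpreter-like object so that calls never overlap."""

    def __init__(self, interpreter):
        self._interpreter = interpreter
        self._lock = threading.Lock()

    def run(self, input_index, input_data, output_index):
        # Hold the lock for the full cycle so that no other thread touches
        # the interpreter while invoke() is still running.
        with self._lock:
            self._interpreter.set_tensor(input_index, input_data)
            self._interpreter.invoke()
            return self._interpreter.get_tensor(output_index)
```

Each thread then calls run() instead of touching the interpreter directly.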
reset_all_variables
reset_all_variables()
resize_tensor_input
resize_tensor_input(input_index, tensor_size, strict=False)
Resizes an input tensor.
Args
input_index
tensor_size
strict: Only unknown dimensions can be resized when strict is True. Unknown dimensions are indicated as -1 in the shape_signature attribute of a given tensor. (default False)
Raises
ValueError
Usage:
    interpreter = Interpreter(model_content=tflite_model)
    interpreter.resize_tensor_input(0, [num_test_images, 224, 224, 3])
    interpreter.allocate_tensors()
    interpreter.set_tensor(0, test_images)
    interpreter.invoke()
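The strict rule for resize_tensor_input can be checked ahead of time by comparing the requested size with the tensor's shape_signature: with strict=True only dimensions marked -1 may change. The following is a sketch of that comparison, not the library's own check; the function name is_strict_resize_allowed is hypothetical, and rejecting a rank mismatch is an assumption.

```python
def is_strict_resize_allowed(shape_signature, new_shape):
    """Return True if new_shape only changes dimensions that are -1."""
    if len(shape_signature) != len(new_shape):
        # Assumption: a different rank is never a valid strict resize.
        return False
    return all(sig == -1 or sig == new
               for sig, new in zip(shape_signature, new_shape))


# Batch dimension is unknown, image dimensions are fixed.
signature = [-1, 224, 224, 3]
batch_ok = is_strict_resize_allowed(signature, [8, 224, 224, 3])      # True
image_bad = is_strict_resize_allowed(signature, [8, 128, 128, 3])     # False
```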
set_tensor
set_tensor(tensor_index, value)
Sets the value of the input tensor.
Note this copies data in value.
If you want to avoid copying, you can use the tensor() function to get a numpy buffer pointing to the input buffer in the tflite interpreter.
Args
tensor_index
value
Raises
ValueError
tensor
tensor(tensor_index)
Returns function that gives a numpy view of the current tensor buffer.
This allows reading and writing to these tensors without copies. This more closely mirrors the C++ Interpreter class interface's tensor() member, hence the name. Be careful not to hold these output references through calls to allocate_tensors() and invoke(). This function cannot be used to read intermediate results.
Usage:
    interpreter.allocate_tensors()
    input = interpreter.tensor(interpreter.get_input_details()[0]["index"])
    output = interpreter.tensor(interpreter.get_output_details()[0]["index"])
    for i in range(10):
        input().fill(3.)
        interpreter.invoke()
        print("inference %s" % output())
Notice how this function avoids making a numpy array directly. This is because it is important to not hold actual numpy views to the data longer than necessary. If you do, then the interpreter can no longer be invoked, because it is possible the interpreter would resize and invalidate the referenced tensors. The NumPy API doesn't allow any mutability of the underlying buffers.
WRONG:
    input = interpreter.tensor(interpreter.get_input_details()[0]["index"])()
    output = interpreter.tensor(interpreter.get_output_details()[0]["index"])()
    interpreter.allocate_tensors()  # This will throw RuntimeError
    for i in range(10):
        input.fill(3.)
        interpreter.invoke()  # this will throw RuntimeError since input, output
Args
tensor_index
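The hazard of holding a view across a call that may resize the underlying buffer is not specific to TensorFlow Lite. As an analogy only, the standard-library sketch below shows the same pattern with a bytearray and a memoryview: while a view is alive the buffer refuses to be resized, and once the view is released the resize goes through. Nothing here calls the interpreter; the variable names are illustrative.

```python
# Stands in for a tensor buffer owned by the interpreter.
buffer = bytearray(4)
# A long-lived view, comparable to keeping interpreter.tensor(i)() around.
view = memoryview(buffer)

try:
    buffer.extend(b"\x00\x00")  # resizing while a view is alive is refused
    resize_blocked = False
except BufferError:
    resize_blocked = True

# Dropping the view is what the callable pattern does after each use.
view.release()
buffer.extend(b"\x00\x00")  # now the resize succeeds
```

Calling the function returned by tensor() each time, as in the first usage example, keeps every view short-lived in the same way.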


