LiteRT benchmark tools measure and calculate statistics for the following important performance metrics:
- Initialization time
- Inference time of warmup state
- Inference time of steady state
- Memory usage during initialization time
- Overall memory usage
The CompiledModel benchmark tool is provided as a C++ binary, benchmark_model. You can execute this tool from a shell command line on Android, Linux, macOS, Windows, and embedded devices with GPU acceleration enabled.
Download prebuilt benchmark binaries
Download the nightly prebuilt command-line binaries from the following links:
Build benchmark binary from source
You can build the benchmark binary from source.
bazel build -c opt //litert/tools:benchmark_model
To build with the Android NDK toolchain, first set up the build environment by following this guide, or use the Docker image as described in this guide.
bazel build -c opt --config=android_arm64 \
  //litert/tools:benchmark_model
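After an Android build, the binary usually has to be copied to a device before it can run. The following is a minimal sketch of that step, not an official procedure. It assumes the standard Bazel output location bazel-bin/litert/tools/benchmark_model (inferred from the target label), a device reachable through adb, and /data/local/tmp as a writable device directory. None of these are stated in this guide, and the function name deploy_benchmark is made up for illustration.

```shell
# Sketch: copy a locally built benchmark_model to an Android device with adb.
# Assumed, not stated in this guide: the Bazel output path, the device
# directory, and that adb is on PATH with a device attached.
# Usage: deploy_benchmark [BINARY] [DEVICE_DIR]
deploy_benchmark() {
  binary="${1:-bazel-bin/litert/tools/benchmark_model}"
  device_dir="${2:-/data/local/tmp}"
  # Push the binary, then mark it executable on the device.
  adb push "$binary" "$device_dir/" &&
    adb shell chmod +x "$device_dir/$(basename "$binary")"
}
```

Under the same assumptions, the model file could be pushed the same way and the tool started with adb shell, passing the on-device model path to --graph.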
Run benchmark
To run benchmarks, execute the binary from the shell.
path/to/downloaded_or_built/benchmark_model \
  --graph=your_model.tflite \
  --num_threads=4
More parameter options can be found in the source code of benchmark_model.
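Because thread count often affects inference time, it can help to repeat the run with several --num_threads values. The sketch below uses only the two flags shown above. The helper name sweep_threads and its argument order are hypothetical, chosen for illustration.

```shell
# Sketch: run the benchmark once per thread count to compare timings.
# Usage: sweep_threads BINARY MODEL THREADS...
sweep_threads() {
  bin="$1"
  model="$2"
  shift 2
  for n in "$@"; do
    echo "== num_threads=$n =="
    # Stop at the first failing run so errors are not buried in later output.
    "$bin" --graph="$model" --num_threads="$n" || return 1
  done
}
```

For example, sweep_threads path/to/downloaded_or_built/benchmark_model your_model.tflite 1 2 4 would run the tool three times, printing a header line before each run.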
Benchmark GPU acceleration
The prebuilt binaries include the LiteRT GPU Accelerator, which supports the following backends:
- Android: OpenCL
- Linux: OpenCL and WebGPU (backed by Vulkan)
- macOS: Metal
- Windows: WebGPU (backed by Direct3D)
To use the GPU Accelerator, pass the flag --use_gpu=true.
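To judge what the GPU Accelerator gains for a given model, one option is to run the same model twice, once without the flag and once with it. The sketch below assumes that omitting --use_gpu gives a CPU run, which this guide does not state explicitly. The helper name compare_cpu_gpu is hypothetical.

```shell
# Sketch: benchmark the same model without and with the GPU Accelerator.
# Assumption: a run without --use_gpu executes on the CPU.
# Usage: compare_cpu_gpu BINARY MODEL
compare_cpu_gpu() {
  bin="$1"
  model="$2"
  echo "== CPU =="
  "$bin" --graph="$model" || return 1
  echo "== GPU =="
  "$bin" --graph="$model" --use_gpu=true
}
```

The two reports are printed one after the other, so their timings can be compared directly.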
Profile model ops
The benchmark_model binary also lets you profile model ops and get the execution time of each operator. To do this, pass the flag --use_profiler=true to benchmark_model during invocation.
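Per-operator reports can be long, so saving them to a file makes them easier to inspect. The following is a minimal sketch that redirects a profiling run into a log file. The helper name profile_model and the default file name profile.log are hypothetical. Both stdout and stderr are captured because this guide does not say which stream the report is written to.

```shell
# Sketch: run benchmark_model with the profiler and save the report to a file.
# Usage: profile_model BINARY MODEL [LOG_FILE]
profile_model() {
  bin="$1"
  model="$2"
  log="${3:-profile.log}"
  # Capture both streams; the stream used for the report is an unknown here.
  "$bin" --graph="$model" --use_profiler=true > "$log" 2>&1
}
```

The function returns the exit status of the benchmark run, so a caller can check for failure before reading the log.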

