Benchmark CompiledModel API

LiteRT benchmark tools measure and calculate statistics for the following important performance metrics:

Initialization time
Inference time of warmup state
Inference time of steady state
Memory usage during initialization time
Overall memory usage

The CompiledModel benchmark tool is provided as a C++ binary, benchmark_model. You can run it from a shell command line on Android, Linux, macOS, Windows, and embedded devices with GPU acceleration enabled.

Download prebuilt benchmark binaries

Download the nightly prebuilt command-line binaries from the following links:

Build benchmark binary from source

You can build the benchmark binary from source.

    bazel build -c opt //litert/tools:benchmark_model

To build with the Android NDK toolchain, first set up the build environment by following this guide, or use the Docker image as described in this guide.

    bazel build -c opt --config=android_arm64 \
      //litert/tools:benchmark_model
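A binary built for Android has to be copied to a device before it can run. The sketch below shows one common way to do that with adb; it is an illustration under assumptions, not a documented LiteRT workflow. The bazel-bin/litert/tools/benchmark_model output path follows Bazel's usual convention, /data/local/tmp is a conventional writable directory on Android devices, and deploy_and_run, run_or_print, and DRY_RUN are hypothetical helper names introduced here.

```shell
# Assumed locations; adjust to your setup.
ADB="${ADB:-adb}"
LOCAL_BIN="${LOCAL_BIN:-bazel-bin/litert/tools/benchmark_model}"
MODEL="${MODEL:-your_model.tflite}"
DEVICE_DIR="${DEVICE_DIR:-/data/local/tmp}"

# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run_or_print() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Push the binary and the model, mark the binary executable, then benchmark.
deploy_and_run() {
  model_name="$(basename "$MODEL")"
  run_or_print "$ADB" push "$LOCAL_BIN" "$DEVICE_DIR/benchmark_model"
  run_or_print "$ADB" push "$MODEL" "$DEVICE_DIR/$model_name"
  run_or_print "$ADB" shell chmod +x "$DEVICE_DIR/benchmark_model"
  run_or_print "$ADB" shell "$DEVICE_DIR/benchmark_model" \
    "--graph=$DEVICE_DIR/$model_name" --num_threads=4
}
```

With a device connected, calling deploy_and_run performs the four steps in order; DRY_RUN=1 deploy_and_run only prints the adb commands so they can be reviewed first.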

Run benchmark

To run benchmarks, execute the binary from the shell.

    path/to/downloaded_or_built/benchmark_model \
      --graph=your_model.tflite \
      --num_threads=4
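Because the thread count is an ordinary flag, it is easy to repeat the run with several values and compare the reported timings. The following is a minimal sketch of such a sweep. Only --graph and --num_threads come from the command above; run_thread_sweep, run_or_print, and DRY_RUN are hypothetical names used for illustration, and the binary and model paths are placeholders.

```shell
# Placeholders; point these at your binary and model.
BENCHMARK_BIN="${BENCHMARK_BIN:-path/to/downloaded_or_built/benchmark_model}"
MODEL="${MODEL:-your_model.tflite}"

# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run_or_print() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Run the benchmark once for each thread count passed as an argument.
run_thread_sweep() {
  for threads in "$@"; do
    run_or_print "$BENCHMARK_BIN" "--graph=$MODEL" "--num_threads=$threads"
  done
}
```

For example, run_thread_sweep 1 2 4 runs three benchmarks in sequence, while DRY_RUN=1 run_thread_sweep 1 2 4 prints the three command lines without executing them.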

More parameter options can be found in the source code of benchmark_model.

Benchmark GPU acceleration

The prebuilt binaries include the LiteRT GPU Accelerator, which supports the following backends on each platform:

Android: OpenCL
Linux: OpenCL and WebGPU (backed by Vulkan)
macOS: Metal
Windows: WebGPU (backed by Direct3D)

To use the GPU Accelerator, pass the flag --use_gpu=true.
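A back-to-back CPU and GPU run on the same model is a simple way to see what the accelerator buys you. The sketch below assumes the CPU run is the one without --use_gpu, as in the earlier run command; compare_cpu_gpu, run_or_print, and DRY_RUN are hypothetical names, and the paths are placeholders.

```shell
# Placeholders; point these at your binary and model.
BENCHMARK_BIN="${BENCHMARK_BIN:-path/to/downloaded_or_built/benchmark_model}"
MODEL="${MODEL:-your_model.tflite}"

# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run_or_print() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Benchmark the same model without and then with the GPU Accelerator.
compare_cpu_gpu() {
  echo "== CPU =="
  run_or_print "$BENCHMARK_BIN" "--graph=$MODEL"
  echo "== GPU =="
  run_or_print "$BENCHMARK_BIN" "--graph=$MODEL" --use_gpu=true
}
```

Calling compare_cpu_gpu prints each report under its own header so the two sets of timings can be read side by side.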

Profile model ops

The benchmark_model binary also lets you profile model ops and get the execution time of each operator. To do this, pass the flag --use_profiler=true to benchmark_model during invocation.
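Per-operator output can be long, so it is often convenient to keep a copy on disk. The sketch below adds --use_profiler=true to a basic run and saves the output with tee. It redirects both stdout and stderr because which stream the tool writes its report to is not assumed here. profile_model, DRY_RUN, and the profile.log default are hypothetical names, and the paths are placeholders.

```shell
# Placeholders; point these at your binary and model.
BENCHMARK_BIN="${BENCHMARK_BIN:-path/to/downloaded_or_built/benchmark_model}"
MODEL="${MODEL:-your_model.tflite}"

# Run with the op profiler enabled and keep a copy of the output.
# $1 is the log file name (defaults to profile.log).
profile_model() {
  log_file="${1:-profile.log}"
  set -- "$BENCHMARK_BIN" "--graph=$MODEL" --use_profiler=true
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"                      # print the command instead of running it
  else
    "$@" 2>&1 | tee "$log_file"    # show the output and save it to the log
  fi
}
```

For example, profile_model gpu_off.log writes the profiler output to gpu_off.log while still showing it in the terminal.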