In this Colab notebook, you'll learn how to use the TensorFlow Lite Model Maker library to train a custom object detection model capable of detecting salads within images on a mobile device.
The Model Maker library uses transfer learning to simplify the process of training a TensorFlow Lite model using a custom dataset. Retraining a TensorFlow Lite model with your own custom dataset reduces the amount of training data required and shortens the training time.
You'll use the publicly available Salads dataset, which was created from the Open Images Dataset V4.
Each image in the dataset contains objects labeled as one of the following classes:
Baked Good
Cheese
Salad
Seafood
Tomato
The dataset contains the bounding boxes specifying where each object is located, together with the object's label.
Prerequisites
Install the required packages
Start by installing the required packages, including the Model Maker package from the GitHub repo and the pycocotools library you'll use for evaluation.
Here you'll use the same dataset as the AutoML quickstart.
The Salads dataset is available at: gs://cloud-ml-data/img/openimage/csv/salads_ml_use.csv.
It contains 175 images for training, 25 images for validation, and 25 images for testing. The dataset has five classes: Salad, Seafood, Tomato, Baked goods, Cheese.
Each row corresponds to an object localized inside a larger image, with each object specifically designated as test, train, or validation data. You'll learn more about what that means in a later stage in this notebook.
The three lines included here indicate three distinct objects located inside the same image available at gs://cloud-ml-data/img/openimage/3/2520/3916261642_0a504acd60_o.jpg.
Each row has a different label: Salad, Seafood, Tomato, etc.
Bounding boxes are specified for each image using the top left and bottom right vertices.
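To make the row layout concrete, here is a minimal sketch that parses one row with Python's standard csv module. The column order used here (split, image path, label, x_min, y_min, two empty columns, x_max, y_max, two empty columns) and the TRAINING keyword are assumptions about the CSV format, and the coordinate values are invented for illustration; only the image path comes from the text above.

```python
import csv
import io

# Illustrative row: the coordinate values are made up for this sketch.
# Assumed column order: split, image path, label, x_min, y_min, (empty),
# (empty), x_max, y_max, (empty), (empty), with coordinates relative to
# the image width and height.
SAMPLE_ROW = (
    "TRAINING,"
    "gs://cloud-ml-data/img/openimage/3/2520/3916261642_0a504acd60_o.jpg,"
    "Salad,0.0,0.1,,,0.9,0.95,,\n"
)


def parse_row(row):
    """Return the split name, image path, label, and box of one CSV row."""
    return {
        "split": row[0],
        "image_path": row[1],
        "label": row[2],
        # Top-left vertex (x_min, y_min), bottom-right vertex (x_max, y_max).
        "box": (float(row[3]), float(row[4]), float(row[7]), float(row[8])),
    }


parsed = [parse_row(r) for r in csv.reader(io.StringIO(SAMPLE_ROW))]
print(parsed[0]["label"], parsed[0]["box"])
```

Because the coordinates are relative, the same row stays valid if the image is resized.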
If you want to know more about how to prepare your own CSV file and the minimum requirements for creating a valid dataset, see the Preparing your training data guide.
If you are new to Google Cloud, you may wonder what the gs:// URL means. These are URLs of files stored on Google Cloud Storage (GCS). If you make your files on GCS public or authenticate your client, Model Maker can read those files just as it reads local files.
However, you don't need to keep your images on Google Cloud to use Model Maker. You can use a local path in your CSV file and Model Maker will just work.
Quickstart
There are six steps to training an object detection model:
Step 1. Choose an object detection model architecture.
This tutorial uses the EfficientDet-Lite0 model. EfficientDet-Lite[0-4] are a family of mobile/IoT-friendly object detection models derived from the EfficientDet architecture.
Here is how the EfficientDet-Lite models compare with one another.
| Model architecture | Size (MB)* | Latency (ms)** | Average Precision*** |
|---|---|---|---|
| EfficientDet-Lite0 | 4.4 | 37 | 25.69% |
| EfficientDet-Lite1 | 5.8 | 49 | 30.55% |
| EfficientDet-Lite2 | 7.2 | 69 | 33.97% |
| EfficientDet-Lite3 | 11.4 | 116 | 37.70% |
| EfficientDet-Lite4 | 19.9 | 260 | 41.96% |

* Size of the integer quantized models.
** Latency measured on Pixel 4 using 4 threads on CPU.
*** Average Precision is the mAP (mean Average Precision) on the COCO 2017 validation dataset.
spec = model_spec.get('efficientdet_lite0')
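The comparison figures above can also drive the choice of architecture. The sketch below picks the most accurate model that fits a latency budget. The helper pick_model is hypothetical and not part of Model Maker, and the spec names for Lite1 to Lite3 are assumed to follow the same naming pattern as 'efficientdet_lite0'.

```python
# Figures copied from the comparison table above.
# Tuple layout: (spec name, size in MB, latency in ms, COCO mAP in percent).
MODELS = [
    ("efficientdet_lite0", 4.4, 37, 25.69),
    ("efficientdet_lite1", 5.8, 49, 30.55),
    ("efficientdet_lite2", 7.2, 69, 33.97),
    ("efficientdet_lite3", 11.4, 116, 37.70),
    ("efficientdet_lite4", 19.9, 260, 41.96),
]


def pick_model(max_latency_ms):
    """Return the most accurate spec name within the budget, or None."""
    candidates = [m for m in MODELS if m[2] <= max_latency_ms]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m[3])[0]


print(pick_model(100))
```

The latencies were measured on a Pixel 4, so treat the budget as a rough guide on other devices.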
Step 2. Load the dataset.
Model Maker takes input data in CSV format. Use the object_detector.DataLoader.from_csv method to load the dataset and split it into training, validation, and test images.
Training images: These images are used to train the object detection model to recognize salad ingredients.
Validation images: These are images that the model didn't see during the training process. You'll use them to decide when you should stop the training, to avoid overfitting.
Test images: These images are used to evaluate the final model performance.
You can load the CSV file directly from Google Cloud Storage, but you don't need to keep your images on Google Cloud to use Model Maker. You can specify a local CSV file on your computer, and Model Maker will work just fine.
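Because each CSV row describes one object rather than one image, the number of rows in a split is usually larger than the number of images. The sketch below counts distinct images per split using only the standard library. It illustrates the idea and is not Model Maker's loader; the split keywords, local paths, and coordinates are placeholders.

```python
import csv
import io
from collections import defaultdict

# Hypothetical four-row CSV: the first two rows annotate the same image.
CSV_TEXT = """TRAINING,images/a.jpg,Salad,0.0,0.1,,,0.9,0.9,,
TRAINING,images/a.jpg,Tomato,0.2,0.2,,,0.5,0.6,,
VALIDATION,images/b.jpg,Cheese,0.1,0.3,,,0.4,0.8,,
TEST,images/c.jpg,Seafood,0.3,0.3,,,0.7,0.9,,
"""


def count_images_per_split(csv_text):
    """Count the distinct image paths that appear under each split name."""
    images = defaultdict(set)
    for row in csv.reader(io.StringIO(csv_text)):
        if row:  # skip blank lines
            images[row[0]].add(row[1])
    return {split: len(paths) for split, paths in images.items()}


counts = count_images_per_split(CSV_TEXT)
print(counts)
```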
Step 3. Train the TensorFlow model with the training data.
The EfficientDet-Lite0 model uses epochs = 50 by default, which means it will go through the training dataset 50 times. You can look at the validation accuracy during training and stop early to avoid overfitting.
Set batch_size = 8 here, so it takes 21 steps to go through the 175 images in the training dataset.
Set train_whole_model=True to fine-tune the whole model instead of just training the head layer to improve accuracy. The trade-off is that it may take longer to train the model.
model = object_detector.create(train_data, model_spec=spec, batch_size=8, train_whole_model=True, validation_data=validation_data)
Step 4. Evaluate the model with the test data.
After training the object detection model using the images in the training dataset, use the remaining 25 images in the test dataset to evaluate how the model performs against new data it has never seen before.
As the default batch size is 64, it will take 1 step to go through the 25 images in the test dataset.
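The step counts quoted for training and evaluation follow from simple arithmetic, sketched below. The rounding rules are assumptions chosen to match the numbers in the text: training is assumed to drop the final incomplete batch, while evaluation is assumed to process every image.

```python
import math


def training_steps(num_images, batch_size):
    # Assumes the final incomplete batch is dropped (floor division).
    return num_images // batch_size


def evaluation_steps(num_images, batch_size):
    # Assumes a partial batch still counts as one step (ceiling division).
    return math.ceil(num_images / batch_size)


print(training_steps(175, 8))    # 21
print(evaluation_steps(25, 64))  # 1
```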
Step 5. Export as a TensorFlow Lite model.
Export the trained object detection model to the TensorFlow Lite format by specifying which folder you want to export the quantized model to. The default post-training quantization technique is full integer quantization.
model.export(export_dir='.')
Step 6. Evaluate the TensorFlow Lite model.
Several factors can affect the model accuracy when exporting to TFLite:
Quantization helps shrink the model size by 4 times at the expense of some accuracy drop.
The original TensorFlow model uses per-class non-max suppression (NMS) for post-processing, while the TFLite model uses global NMS that's much faster but less accurate.
Keras outputs a maximum of 100 detections, while TFLite outputs a maximum of 25 detections.
Therefore you'll have to evaluate the exported TFLite model and compare its accuracy with the original TensorFlow model.
model.evaluate_tflite('model.tflite', test_data)
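To see why global NMS can lose detections that per-class NMS keeps, consider the toy sketch below. It uses a generic greedy NMS written for illustration, not the implementation used by either the TensorFlow model or the TFLite model, and the boxes, scores, and 0.5 IoU threshold are invented.

```python
def iou(a, b):
    """Intersection over union of boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score, drop heavy overlaps."""
    kept = []
    for det in sorted(detections, key=lambda d: d["score"], reverse=True):
        if all(iou(det["box"], k["box"]) <= iou_threshold for k in kept):
            kept.append(det)
    return kept


def per_class_nms(detections, iou_threshold=0.5):
    """Run NMS separately for each label, then merge the survivors."""
    kept = []
    for label in sorted({d["label"] for d in detections}):
        same_class = [d for d in detections if d["label"] == label]
        kept.extend(nms(same_class, iou_threshold))
    return kept


# Hypothetical detections: a tomato sitting on top of a salad.
detections = [
    {"label": "Salad", "score": 0.9, "box": (0.10, 0.10, 0.60, 0.60)},
    {"label": "Tomato", "score": 0.8, "box": (0.12, 0.12, 0.62, 0.62)},
    {"label": "Salad", "score": 0.6, "box": (0.11, 0.09, 0.59, 0.61)},
]

print("global:", [d["label"] for d in nms(detections)])
print("per-class:", [d["label"] for d in per_class_nms(detections)])
```

In this toy case global NMS discards the Tomato box because it overlaps a higher-scoring Salad box, while per-class NMS keeps one box for each label.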
You can download the TensorFlow Lite model file using the left sidebar of Colab. Right-click on the model.tflite file and choose Download to download it to your local computer.
This model can be integrated into an Android or an iOS app using the ObjectDetector API of the TensorFlow Lite Task Library.
You can test the trained TFLite model using images from the internet.
Replace the INPUT_IMAGE_URL below with your desired input image.
Adjust the DETECTION_THRESHOLD to change the sensitivity of the model. A lower threshold means the model will pick up more objects, but there will also be more false detections. A higher threshold means the model will only pick up objects that it has confidently detected.
Although running the model in Python currently requires some boilerplate code, integrating the model into a mobile app only requires a few lines of code.
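The effect of the detection threshold can be illustrated without running a model. The sketch below filters a hypothetical list of scored detections at several thresholds; the labels and scores are invented, and filter_detections is an illustrative helper rather than part of any library.

```python
def filter_detections(detections, threshold):
    """Keep detections whose score is at least the threshold."""
    return [d for d in detections if d["score"] >= threshold]


# Hypothetical raw detections for a single image.
RAW_DETECTIONS = [
    {"label": "Salad", "score": 0.92},
    {"label": "Tomato", "score": 0.55},
    {"label": "Cheese", "score": 0.31},
    {"label": "Seafood", "score": 0.12},
]

for threshold in (0.1, 0.3, 0.5, 0.9):
    kept = filter_detections(RAW_DETECTIONS, threshold)
    print(threshold, [d["label"] for d in kept])
```

Lowering the threshold lengthens the list, which is where the extra false detections come from.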
Load the trained TFLite model and define some visualization functions
import cv2
import numpy as np
import tensorflow as tf

from PIL import Image

model_path = 'model.tflite'

# Load the labels into a list
classes = ['???'] * model.model_spec.config.num_classes
label_map = model.model_spec.config.label_map
for label_id, label_name in label_map.as_dict().items():
    classes[label_id - 1] = label_name

# Define a list of colors for visualization
COLORS = np.random.randint(0, 255, size=(len(classes), 3), dtype=np.uint8)


def preprocess_image(image_path, input_size):
    """Preprocess the input image to feed to the TFLite model"""
    img = tf.io.read_file(image_path)
    img = tf.io.decode_image(img, channels=3)
    img = tf.image.convert_image_dtype(img, tf.uint8)
    original_image = img
    resized_img = tf.image.resize(img, input_size)
    resized_img = resized_img[tf.newaxis, :]
    resized_img = tf.cast(resized_img, dtype=tf.uint8)
    return resized_img, original_image


def detect_objects(interpreter, image, threshold):
    """Returns a list of detection results, each a dictionary of object info."""
    signature_fn = interpreter.get_signature_runner()

    # Feed the input image to the model
    output = signature_fn(images=image)

    # Get all outputs from the model
    count = int(np.squeeze(output['output_0']))
    scores = np.squeeze(output['output_1'])
    classes = np.squeeze(output['output_2'])
    boxes = np.squeeze(output['output_3'])

    results = []
    for i in range(count):
        if scores[i] >= threshold:
            result = {
                'bounding_box': boxes[i],
                'class_id': classes[i],
                'score': scores[i]
            }
            results.append(result)
    return results


def run_odt_and_draw_results(image_path, interpreter, threshold=0.5):
    """Run object detection on the input image and draw the detection results"""
    # Load the input shape required by the model
    _, input_height, input_width, _ = interpreter.get_input_details()[0]['shape']

    # Load the input image and preprocess it
    preprocessed_image, original_image = preprocess_image(
        image_path,
        (input_height, input_width)
    )

    # Run object detection on the input image
    results = detect_objects(interpreter, preprocessed_image, threshold=threshold)

    # Plot the detection results on the input image
    original_image_np = original_image.numpy().astype(np.uint8)
    for obj in results:
        # Convert the object bounding box from relative coordinates to absolute
        # coordinates based on the original image resolution
        ymin, xmin, ymax, xmax = obj['bounding_box']
        xmin = int(xmin * original_image_np.shape[1])
        xmax = int(xmax * original_image_np.shape[1])
        ymin = int(ymin * original_image_np.shape[0])
        ymax = int(ymax * original_image_np.shape[0])

        # Find the class index of the current object
        class_id = int(obj['class_id'])

        # Draw the bounding box and label on the image
        color = [int(c) for c in COLORS[class_id]]
        cv2.rectangle(original_image_np, (xmin, ymin), (xmax, ymax), color, 2)
        # Make adjustments to make the label visible for all objects
        y = ymin - 15 if ymin - 15 > 15 else ymin + 15
        label = "{}: {:.0f}%".format(classes[class_id], obj['score'] * 100)
        cv2.putText(original_image_np, label, (xmin, y),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)

    # Return the final image
    original_uint8 = original_image_np.astype(np.uint8)
    return original_uint8
Run object detection and show the detection results
The EdgeTPU has 8MB of SRAM for caching model parameters. This means that for models larger than 8MB, inference time increases because model parameters must be transferred over. One way to avoid this is Model Pipelining: splitting the model into segments that can each have a dedicated EdgeTPU. This can significantly improve latency.
The table below can be used as a reference for the number of Edge TPUs to use. The larger models will not compile for a single TPU because the intermediate tensors can't fit in on-chip memory.
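As a rough first check against the 8MB cache figure quoted above, the sketch below flags which EfficientDet-Lite variants have a quantized size above that limit. It reuses the integer-quantized sizes from the comparison table earlier in this tutorial. This is only an approximation made under that assumption: the size of a compiled Edge TPU model may differ, and it says nothing about intermediate tensors, so it cannot tell you how many Edge TPUs a model needs.

```python
EDGE_TPU_CACHE_MB = 8.0  # SRAM figure quoted in the text above

# Integer-quantized model sizes in MB, from the earlier comparison table.
MODEL_SIZES_MB = {
    "EfficientDet-Lite0": 4.4,
    "EfficientDet-Lite1": 5.8,
    "EfficientDet-Lite2": 7.2,
    "EfficientDet-Lite3": 11.4,
    "EfficientDet-Lite4": 19.9,
}


def exceeds_cache(size_mb, cache_mb=EDGE_TPU_CACHE_MB):
    """Return True when the model size is larger than the parameter cache."""
    return size_mb > cache_mb


over_limit = [name for name, size in MODEL_SIZES_MB.items() if exceeds_cache(size)]
print(over_limit)
```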
With the model(s) compiled, they can now be run on EdgeTPU(s) for object detection. First, download the compiled TensorFlow Lite model file using the left sidebar of Colab. Right-click on the model_edgetpu.tflite file and choose Download to download it to your local computer.
Now you can run the model in your preferred manner. Examples of detection include:
This section covers advanced usage topics like adjusting the model and the training hyperparameters.
Load the dataset
Load your own data
You can upload your own dataset to work through this tutorial. Upload your dataset by using the left sidebar in Colab.
If you prefer not to upload your dataset to the cloud, you can also run the library locally by following the guide.
Load your data with a different data format
The Model Maker library also supports the object_detector.DataLoader.from_pascal_voc method to load data in PASCAL VOC format. makesense.ai and LabelImg are tools that can annotate images and save the annotations as XML files in the PASCAL VOC data format.
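For orientation, a PASCAL VOC annotation is an XML file with one object element per labeled box. The sketch below parses a small hand-written annotation with the standard library to show the structure. The file name, image size, and coordinates are placeholders, and read_voc_objects is an illustrative helper, not part of Model Maker.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation: file name, size, and box values are placeholders.
VOC_XML = """<annotation>
  <filename>salad_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>Salad</name>
    <bndbox><xmin>32</xmin><ymin>48</ymin><xmax>320</xmax><ymax>360</ymax></bndbox>
  </object>
  <object>
    <name>Tomato</name>
    <bndbox><xmin>400</xmin><ymin>100</ymin><xmax>480</xmax><ymax>180</ymax></bndbox>
  </object>
</annotation>"""


def read_voc_objects(xml_text):
    """Return (label, xmin, ymin, xmax, ymax) in pixels for each object."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        coords = tuple(
            int(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append((obj.find("name").text,) + coords)
    return objects


voc_objects = read_voc_objects(VOC_XML)
print(voc_objects)
```

Unlike the CSV format, these coordinates are absolute pixel values.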
The model and training pipeline parameters you can adjust are:
model_dir: The location to save the model checkpoint files. If not set, a temporary directory will be used.
steps_per_execution: Number of steps per training execution.
moving_average_decay: Float. The decay to use for maintaining moving averages of the trained parameters.
var_freeze_expr: The regular expression matching the name prefixes of the variables to be frozen, meaning they remain the same during training. More specifically, the codebase uses re.match(var_freeze_expr, variable_name) to select the variables to be frozen.
tflite_max_detections: integer, 25 by default. The max number of output detections in the TFLite model.
strategy: A string specifying which distribution strategy to use. Accepted values are 'tpu', 'gpus', None. 'tpu' means to use TPUStrategy. 'gpus' means to use MirroredStrategy for multiple GPUs. If None, use the TF default with OneDeviceStrategy.
tpu: The Cloud TPU to use for training. This should be either the name used when creating the Cloud TPU, or a grpc://ip.address.of.tpu:8470 url.
use_xla: Use XLA even if strategy is not tpu. If strategy is tpu, always use XLA, and this flag has no effect.
For instance, you can set var_freeze_expr='efficientnet', which freezes the variables with the name prefix efficientnet (default is '(efficientnet|fpn_cells|resample_p6)'). This allows the model to freeze untrainable variables and keep their values the same through training.
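Since the frozen variables are selected with re.match, which anchors the pattern at the start of the name, you can preview the effect of an expression on a list of names. In the sketch below, the variable names are hypothetical and chosen only to exercise the two patterns; real names in the model may differ.

```python
import re

DEFAULT_EXPR = "(efficientnet|fpn_cells|resample_p6)"
CUSTOM_EXPR = "efficientnet"

# Hypothetical variable names, not taken from the actual model.
VARIABLE_NAMES = [
    "efficientnet-lite0/stem/conv2d/kernel",
    "fpn_cells/cell_0/fnode0/conv/kernel",
    "resample_p6/conv2d/kernel",
    "box_net/box-predict/kernel",
]


def frozen_variables(var_freeze_expr, names):
    """Return the names whose prefix matches the expression via re.match."""
    return [name for name in names if re.match(var_freeze_expr, name)]


print(frozen_variables(DEFAULT_EXPR, VARIABLE_NAMES))
print(frozen_variables(CUSTOM_EXPR, VARIABLE_NAMES))
```

With the narrower expression, only names starting with efficientnet match, so the other variables would be left trainable.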
You can change the model architecture by changing the model_spec. For instance, change the model_spec to the EfficientDet-Lite4 model.
spec=model_spec.get('efficientdet_lite4')
Tune the training hyperparameters
The create function is the driver function that the Model Maker library uses to create models. The model_spec parameter defines the model specification. The object_detector.EfficientDetSpec class is currently supported. The create function consists of the following steps:
Creates the model for the object detection according to model_spec.
Trains the model. The default epochs and the default batch size are set by the epochs and batch_size variables in the model_spec object.
You can also tune the training hyperparameters like epochs and batch_size that affect the model accuracy. For instance,
epochs: Integer, 50 by default. More epochs could achieve better accuracy, but may lead to overfitting.
batch_size: Integer, 64 by default. The number of samples to use in one training step.
train_whole_model: Boolean, False by default. If true, train the whole model. Otherwise, only train the layers that do not match var_freeze_expr.
For example, you can train with fewer epochs and only the head layer. You can increase the number of epochs for better results.
The export formats can be one or a list of the following:
ExportFormat.TFLITE
ExportFormat.LABEL
ExportFormat.SAVED_MODEL
By default, it exports only the TensorFlow Lite model file containing the model metadata so that you can later use it in an on-device ML application. The label file is embedded in the metadata.
In many on-device ML applications, the model size is an important factor. Therefore, it is recommended that you quantize the model to make it smaller and potentially run faster. For EfficientDet-Lite models, full integer quantization is used to quantize the model by default. Please refer to Post-training quantization for more detail.
model.export(export_dir='.')
You can also choose to export other files related to the model for better examination. For instance, exporting both the saved model and the label file as follows:
Customize Post-training quantization on the TensorFlow Lite model
Post-training quantization is a conversion technique that can reduce model size and inference latency on both CPUs and hardware accelerators, with a small loss in model accuracy. Thus, it's widely used to optimize the model.
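The size and accuracy trade-off can be seen in a toy example. The sketch below applies a simple affine int8 quantization to a handful of made-up weights and measures the round-trip error. It illustrates the general idea only and is not the scheme the TensorFlow Lite converter applies; the weight values and helper names are hypothetical.

```python
import struct


def quantize_int8(values):
    """Affine-quantize floats to int8; return (quantized, scale, zero_point)."""
    lo = min(min(values), 0.0)  # keep 0.0 inside the representable range
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0
    zero_point = round(-128 - lo / scale)
    quantized = [
        max(-128, min(127, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Map int8 values back to approximate floats."""
    return [(q - zero_point) * scale for q in quantized]


weights = [-1.0, -0.25, 0.0, 0.4, 1.5]  # made-up example weights
quantized, scale, zero_point = quantize_int8(weights)
restored = dequantize(quantized, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# A float32 value takes 4 bytes and an int8 value takes 1 byte.
size_ratio = struct.calcsize("<f") // struct.calcsize("<b")
print(quantized, max_error, size_ratio)
```

Each weight shrinks from four bytes to one, and in this example the round-trip error stays within half a quantization step.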
The Model Maker library applies a default post-training quantization technique when exporting the model. If you want to customize post-training quantization, Model Maker also supports multiple post-training quantization options using QuantizationConfig. Take float16 quantization as an example. First, define the quantization config.
config=QuantizationConfig.for_float16()
Then export the TensorFlow Lite model with this configuration.
Last updated 2026-05-28 UTC.