Label images with ML Kit on iOS

Page Summary

ML Kit's image labeling API lets you identify objects in images using a pre-trained model that recognizes over 400 labels.
To use this API, you need to include the GoogleMLKit/ImageLabeling pod, create a VisionImage object from your image, and then process it with an ImageLabeler instance.
Results are provided as an array of ImageLabel objects, each containing the label's text, confidence score, and index.
For real-time applications, leverage the synchronous results(in:) API and manage video frame processing efficiently to maintain optimal frame rates.

You can use ML Kit to label objects recognized in an image. The default model provided with ML Kit supports 400+ different labels.

Try it out

Play around with the sample app to see an example usage of this API.

Before you begin

Include the following ML Kit pods in your Podfile:
```
pod 'GoogleMLKit/ImageLabeling', '8.0.0'
```
After you install or update your project's Pods, open your Xcode project using its .xcworkspace. ML Kit is supported in Xcode version 12.4 or greater.

Now you are ready to label images.

1. Prepare the input image

Create a VisionImage object using a UIImage or a CMSampleBuffer.

If you use a UIImage, follow these steps:

Create a VisionImage object with the UIImage. Make sure to specify the correct .orientation.

Swift

// `image` is the UIImage you want to label.
let visionImage = VisionImage(image: image)
visionImage.orientation = image.imageOrientation

Objective-C

MLKVisionImage *visionImage = [[MLKVisionImage alloc] initWithImage:image];
visionImage.orientation = image.imageOrientation;

If you use a CMSampleBuffer, follow these steps:

Specify the orientation of the image data contained in the CMSampleBuffer.

To get the image orientation:

Swift

func imageOrientation(
  deviceOrientation: UIDeviceOrientation,
  cameraPosition: AVCaptureDevice.Position
) -> UIImage.Orientation {
  switch deviceOrientation {
  case .portrait:
    return cameraPosition == .front ? .leftMirrored : .right
  case .landscapeLeft:
    return cameraPosition == .front ? .downMirrored : .up
  case .portraitUpsideDown:
    return cameraPosition == .front ? .rightMirrored : .left
  case .landscapeRight:
    return cameraPosition == .front ? .upMirrored : .down
  case .faceDown, .faceUp, .unknown:
    return .up
  }
}

Objective-C

- (UIImageOrientation)
    imageOrientationFromDeviceOrientation:(UIDeviceOrientation)deviceOrientation
                           cameraPosition:(AVCaptureDevicePosition)cameraPosition {
  switch (deviceOrientation) {
    case UIDeviceOrientationPortrait:
      return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationLeftMirrored
                                                            : UIImageOrientationRight;
    case UIDeviceOrientationLandscapeLeft:
      return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationDownMirrored
                                                            : UIImageOrientationUp;
    case UIDeviceOrientationPortraitUpsideDown:
      return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationRightMirrored
                                                            : UIImageOrientationLeft;
    case UIDeviceOrientationLandscapeRight:
      return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationUpMirrored
                                                            : UIImageOrientationDown;
    case UIDeviceOrientationUnknown:
    case UIDeviceOrientationFaceUp:
    case UIDeviceOrientationFaceDown:
      return UIImageOrientationUp;
  }
}

Create a VisionImage object using the CMSampleBuffer object and orientation:

Swift

let image = VisionImage(buffer: sampleBuffer)
image.orientation = imageOrientation(
  deviceOrientation: UIDevice.current.orientation,
  cameraPosition: cameraPosition)

Objective-C

  
MLKVisionImage *image = [[MLKVisionImage alloc] initWithBuffer:sampleBuffer];
image.orientation =
    [self imageOrientationFromDeviceOrientation:UIDevice.currentDevice.orientation
                                 cameraPosition:cameraPosition];

2. Configure and run the image labeler

To label objects in an image, pass the VisionImage object to the ImageLabeler's processImage() method.

First, get an instance of ImageLabeler.

Swift

let labeler = ImageLabeler.imageLabeler()

// Or, to set the minimum confidence required:
// let options = ImageLabelerOptions()
// options.confidenceThreshold = 0.7
// let labeler = ImageLabeler.imageLabeler(options: options)

Objective-C

MLKImageLabeler *labeler = [MLKImageLabeler imageLabeler];

// Or, to set the minimum confidence required:
// MLKImageLabelerOptions *options =
//         [[MLKImageLabelerOptions alloc] init];
// options.confidenceThreshold = 0.7;
// MLKImageLabeler *labeler =
//         [MLKImageLabeler imageLabelerWithOptions:options];

Then, pass the image to the processImage() method:

Swift

labeler.process(image) { labels, error in
  guard error == nil, let labels = labels else { return }

  // Task succeeded.
  // ...
}

Objective-C

[labeler processImage:image
           completion:^(NSArray *_Nullable labels,
                        NSError *_Nullable error) {
  if (error != nil) { return; }

  // Task succeeded.
  // ...
}];

3. Get information about labeled objects

If image labeling succeeds, the completion handler receives an array of ImageLabel objects. Each ImageLabel object represents something that was labeled in the image. The base model supports 400+ different labels. You can get each label's text description, index among all labels supported by the model, and the confidence score of the match. For example:

Swift

for label in labels {
  let labelText = label.text
  let confidence = label.confidence
  let index = label.index
}

Objective-C

for (MLKImageLabel *label in labels) {
  NSString *labelText = label.text;
  float confidence = label.confidence;
  NSInteger index = label.index;
}
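
Once the labels arrive, a common next step is to keep only the confident matches and sort them for display. The sketch below illustrates one way to do that. It uses a stand-in `LabelResult` struct (hypothetical, not an ML Kit type) that mirrors the `text`, `confidence`, and `index` properties of `ImageLabel`, so it runs without the SDK; in an app you could apply the same filter directly to the `ImageLabel` array. The function names, the default threshold and limit, and the sample labels are all assumptions made for illustration.

```swift
import Foundation

/// Hypothetical stand-in that mirrors the three ImageLabel properties used above.
struct LabelResult {
    let text: String
    let confidence: Float
    let index: Int
}

/// Keeps labels at or above `minConfidence`, sorts them from most to least
/// confident, and returns at most `limit` of them.
func topLabels(_ labels: [LabelResult],
               minConfidence: Float = 0.7,
               limit: Int = 3) -> [LabelResult] {
    return Array(
        labels
            .filter { $0.confidence >= minConfidence }
            .sorted { $0.confidence > $1.confidence }
            .prefix(limit)
    )
}

/// Formats one label as "<text> (<rounded percent>%)".
func displayString(for label: LabelResult) -> String {
    let percent = Int((label.confidence * 100).rounded())
    return "\(label.text) (\(percent)%)"
}

// Made-up sample data, not real model output.
let sample = [
    LabelResult(text: "Dog", confidence: 0.92, index: 1),
    LabelResult(text: "Grass", confidence: 0.55, index: 2),
    LabelResult(text: "Pet", confidence: 0.81, index: 3),
]

let summary = topLabels(sample).map(displayString(for:)).joined(separator: ", ")
print(summary)  // prints "Dog (92%), Pet (81%)"
```

Filtering after the fact like this is an alternative to setting `confidenceThreshold` on the labeler options; the post-filter lets you change the cutoff per screen without creating a new labeler.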

Tips to improve real-time performance

If you want to label images in a real-time application, follow these guidelines to achieve the best framerates:

For processing video frames, use the results(in:) synchronous API of the image labeler. Call this method from the AVCaptureVideoDataOutputSampleBufferDelegate's captureOutput(_, didOutput:from:) function to synchronously get results from the given video frame. Keep AVCaptureVideoDataOutput's alwaysDiscardsLateVideoFrames as true to throttle calls to the image labeler. If a new video frame becomes available while the image labeler is running, it will be dropped.

If you use the output of the image labeler to overlay graphics on the input image, first get the result from ML Kit, then render the image and overlay in a single step. By doing so, you render to the display surface only once for each processed input frame. See the updatePreviewOverlayViewWithLastFrame in the ML Kit quickstart sample for an example.
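
The two guidelines above can be sketched roughly as follows. This is a minimal illustration rather than the quickstart's implementation: the class name `FrameLabeler`, the `onOverlayText` callback, and the helpers `makeVideoOutput(queue:)` and `overlayText(for:)` are hypothetical, the module names `MLKitImageLabeling` and `MLKitVision` are assumed to be the ones exposed by the pod, and the orientation is hard-coded to `.right`, which assumes a back camera with the device held in portrait. The ML Kit part is wrapped in `canImport` checks so the file still compiles where the SDK is not installed.

```swift
import Foundation

/// Pure helper (hypothetical): builds the overlay string for one processed frame.
func overlayText(for labels: [(text: String, confidence: Float)]) -> String {
    return labels
        .map { "\($0.text) \(Int(($0.confidence * 100).rounded()))%" }
        .joined(separator: "\n")
}

// Guarded so this file also compiles where ML Kit or UIKit is unavailable.
#if canImport(MLKitImageLabeling) && canImport(MLKitVision) && canImport(UIKit)
import AVFoundation
import MLKitImageLabeling
import MLKitVision
import UIKit

/// Hypothetical sample-buffer delegate that labels each delivered frame synchronously.
final class FrameLabeler: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    private let labeler = ImageLabeler.imageLabeler()

    /// Called on the main queue with the text to draw over the frame.
    var onOverlayText: ((String) -> Void)?

    /// Creates a video output that drops frames arriving while labeling is in progress.
    func makeVideoOutput(queue: DispatchQueue) -> AVCaptureVideoDataOutput {
        let output = AVCaptureVideoDataOutput()
        output.alwaysDiscardsLateVideoFrames = true
        output.setSampleBufferDelegate(self, queue: queue)
        return output
    }

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        let image = VisionImage(buffer: sampleBuffer)
        // Assumption: back camera, portrait device. Compute this from the
        // device orientation and camera position in a real app.
        image.orientation = .right

        do {
            // Synchronous call; blocks this capture queue until labeling finishes.
            let labels = try labeler.results(in: image)
            let text = overlayText(
                for: labels.map { (text: $0.text, confidence: $0.confidence) })
            // Hand the finished result to the UI once per processed frame.
            DispatchQueue.main.async { [weak self] in
                self?.onOverlayText?(text)
            }
        } catch {
            print("Image labeling failed: \(error.localizedDescription)")
        }
    }
}
#endif
```

Because `results(in:)` blocks the capture queue and late frames are discarded, at most one frame is being labeled at any time, which is what keeps the labeler from falling behind the camera.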
