Label images with ML Kit on iOS

You can use ML Kit to label objects recognized in an image. The default model provided with ML Kit supports 400+ different labels.

Try it out

  • Play around with the sample app to see an example usage of this API.

Before you begin

  1. Include the following ML Kit pods in your Podfile:
    pod 'GoogleMLKit/ImageLabeling', '8.0.0'
  2. After you install or update your project's Pods, open your Xcode project using its .xcworkspace . ML Kit is supported in Xcode version 12.4 or greater.

Now you are ready to label images.

1. Prepare the input image

Create a VisionImage object using a UIImage or a CMSampleBuffer .

If you use a UIImage , follow these steps:

  • Create a VisionImage object with the UIImage . Make sure to specify the correct .orientation .

    Swift

    let image = VisionImage(image: UIImage)
    visionImage.orientation = image.imageOrientation

    Objective-C

     MLKVisionImage 
      
     * 
     visionImage 
      
     = 
      
     [[ 
     MLKVisionImage 
      
     alloc 
     ] 
      
     initWithImage 
     : 
     image 
     ]; 
     visionImage 
     . 
     orientation 
      
     = 
      
     image 
     . 
     imageOrientation 
     ; 
    

If you use a CMSampleBuffer , follow these steps:

  • Specify the orientation of the image data contained in the CMSampleBuffer .

    To get the image orientation:

    Swift

     func 
      
     imageOrientation 
     ( 
      
     deviceOrientation 
     : 
      
     UIDeviceOrientation 
     , 
      
     cameraPosition 
     : 
      
     AVCaptureDevice 
     . 
     Position 
     ) 
      
     -> 
      
     UIImage 
     . 
     Orientation 
      
     { 
      
     switch 
      
     deviceOrientation 
      
     { 
      
     case 
      
     . 
     portrait 
     : 
      
     return 
      
     cameraPosition 
      
     == 
      
     . 
     front 
      
     ? 
      
     . 
     leftMirrored 
      
     : 
      
     . 
     right 
      
     case 
      
     . 
     landscapeLeft 
     : 
      
     return 
      
     cameraPosition 
      
     == 
      
     . 
     front 
      
     ? 
      
     . 
     downMirrored 
      
     : 
      
     . 
     up 
      
     case 
      
     . 
     portraitUpsideDown 
     : 
      
     return 
      
     cameraPosition 
      
     == 
      
     . 
     front 
      
     ? 
      
     . 
     rightMirrored 
      
     : 
      
     . 
     left 
      
     case 
      
     . 
     landscapeRight 
     : 
      
     return 
      
     cameraPosition 
      
     == 
      
     . 
     front 
      
     ? 
      
     . 
     upMirrored 
      
     : 
      
     . 
     down 
      
     case 
      
     . 
     faceDown 
     , 
      
     . 
     faceUp 
     , 
      
     . 
     unknown 
     : 
      
     return 
      
     . 
     up 
      
     } 
     } 
      
    

    Objective-C

     - 
      
     ( 
     UIImageOrientation 
     ) 
      
     imageOrientationFromDeviceOrientation 
     :( 
     UIDeviceOrientation 
     ) 
     deviceOrientation 
      
     cameraPosition 
     :( 
     AVCaptureDevicePosition 
     ) 
     cameraPosition 
      
     { 
      
     switch 
      
     (deviceOrientation) 
      
     { 
      
     case 
      
     UIDeviceOrientationPortrait 
     : 
      
     return 
      
     cameraPosition 
      
     == 
      
     AVCaptureDevicePositionFront 
      
     ? 
      
     UIImageOrientationLeftMirrored 
      
     : 
      
     UIImageOrientationRight 
     ; 
      
     case 
      
     UIDeviceOrientationLandscapeLeft 
     : 
      
     return 
      
     cameraPosition 
      
     == 
      
     AVCaptureDevicePositionFront 
      
     ? 
      
     UIImageOrientationDownMirrored 
      
     : 
      
     UIImageOrientationUp 
     ; 
      
     case 
      
     UIDeviceOrientationPortraitUpsideDown 
     : 
      
     return 
      
     cameraPosition 
      
     == 
      
     AVCaptureDevicePositionFront 
      
     ? 
      
     UIImageOrientationRightMirrored 
      
     : 
      
     UIImageOrientationLeft 
     ; 
      
     case 
      
     UIDeviceOrientationLandscapeRight 
     : 
      
     return 
      
     cameraPosition 
      
     == 
      
     AVCaptureDevicePositionFront 
      
     ? 
      
     UIImageOrientationUpMirrored 
      
     : 
      
     UIImageOrientationDown 
     ; 
      
     case 
      
     UIDeviceOrientationUnknown 
     : 
      
     case 
      
     UIDeviceOrientationFaceUp 
     : 
      
     case 
      
     UIDeviceOrientationFaceDown 
     : 
      
     return 
      
     UIImageOrientationUp 
     ; 
      
     } 
     } 
      
    
  • Create a VisionImage object using the CMSampleBuffer object and orientation:

    Swift

     let 
      
     image 
      
     = 
      
     VisionImage 
     ( 
     buffer 
     : 
      
     sampleBuffer 
     ) 
     image 
     . 
     orientation 
      
     = 
      
     imageOrientation 
     ( 
      
     deviceOrientation 
     : 
      
     UIDevice 
     . 
     current 
     . 
     orientation 
     , 
      
     cameraPosition 
     : 
      
     cameraPosition 
     ) 
    

    Objective-C

      
     MLKVisionImage 
      
     * 
     image 
      
     = 
      
     [[ 
     MLKVisionImage 
      
     alloc 
     ] 
      
     initWithBuffer 
     : 
     sampleBuffer 
     ]; 
      
     image 
     . 
     orientation 
      
     = 
      
     [ 
     self 
      
     imageOrientationFromDeviceOrientation 
     : 
     UIDevice 
     . 
     currentDevice 
     . 
     orientation 
      
     cameraPosition 
     : 
     cameraPosition 
     ]; 
    

2. Configure and run the image labeler

To label objects in an image, pass the VisionImage object to the ImageLabeler 's processImage() method.
  1. First, get an instance of ImageLabeler .

Swift

 let 
  
 labeler 
  
 = 
  
 ImageLabeler 
 . 
 imageLabeler 
 () 
 // Or, to set the minimum confidence required: 
 // let options = ImageLabelerOptions() 
 // options.confidenceThreshold = 0.7 
 // let labeler = ImageLabeler.imageLabeler(options: options) 

Objective-C

 MLKImageLabeler 
  
 * 
 labeler 
  
 = 
  
 [ 
 MLKImageLabeler 
  
 imageLabeler 
 ]; 
 // Or, to set the minimum confidence required: 
 // MLKImageLabelerOptions *options = 
 //         [[MLKImageLabelerOptions alloc] init]; 
 // options.confidenceThreshold = 0.7; 
 // MLKImageLabeler *labeler = 
 //         [MLKImageLabeler imageLabelerWithOptions:options]; 
  1. Then, pass the image to the processImage() method:

Swift

 labeler 
 . 
 process 
 ( 
 image 
 ) 
  
 { 
  
 labels 
 , 
  
 error 
  
 in 
  
 guard 
  
 error 
  
 == 
  
 nil 
 , 
  
 let 
  
 labels 
  
 = 
  
 labels 
  
 else 
  
 { 
  
 return 
  
 } 
  
 // Task succeeded. 
  
 // ... 
 } 

Objective-C

 [ 
 labeler 
  
 processImage 
 : 
 image 
 completion 
 : 
 ^ 
 ( 
 NSArray 
   
 * 
 _Nullable 
  
 labels 
 , 
  
 NSError 
  
 * 
 _Nullable 
  
 error 
 ) 
  
 { 
  
 if 
  
 ( 
 error 
  
 != 
  
 nil 
 ) 
  
 { 
  
 return 
 ; 
  
 } 
  
 // Task succeeded. 
  
 // ... 
 }]; 
 

3. Get information about labeled objects

If image labeling succeeds, the completion handler receives an array of ImageLabel objects. Each ImageLabel object represents something that was labeled in the image. The base model supports 400+ different labels . You can get each label's text description, index among all labels supported by the model, and the confidence score of the match. For example:

Swift

 for 
  
 label 
  
 in 
  
 labels 
  
 { 
  
 let 
  
 labelText 
  
 = 
  
 label 
 . 
 text 
  
 let 
  
 confidence 
  
 = 
  
 label 
 . 
 confidence 
  
 let 
  
 index 
  
 = 
  
 label 
 . 
 index 
 } 

Objective-C

 for 
  
 ( 
 MLKImageLabel 
  
 * 
 label 
  
 in 
  
 labels 
 ) 
  
 { 
  
 NSString 
  
 * 
 labelText 
  
 = 
  
 label 
 . 
 text 
 ; 
  
 float 
  
 confidence 
  
 = 
  
 label 
 . 
 confidence 
 ; 
  
 NSInteger 
  
 index 
  
 = 
  
 label 
 . 
 index 
 ; 
 } 

Tips to improve real-time performance

If you want to label images in a real-time application, follow these guidelines to achieve the best framerates:

Design a Mobile Site
View Site in Mobile | Classic
Share by: