AI-generated Key Takeaways
- ML Kit's image labeling API lets you identify objects in images using a pre-trained model that recognizes over 400 labels.
- To use this API, you need to include the GoogleMLKit/ImageLabeling pod, create a VisionImage object from your image, and then process it with an ImageLabeler instance.
- Results are provided as an array of ImageLabel objects, each containing the label's text, confidence score, and index.
- For real-time applications, use the synchronous results(in:) API and manage video frame processing efficiently to maintain good frame rates.
You can use ML Kit to label objects recognized in an image. The default model provided with ML Kit supports 400+ different labels.
Try it out
- Play around with the sample app to see an example usage of this API.
Before you begin
- Include the following ML Kit pods in your Podfile: pod 'GoogleMLKit/ImageLabeling', '8.0.0' 
- After you install or update your project's Pods, open your Xcode project using its .xcworkspace. ML Kit is supported in Xcode version 12.4 or greater.
Now you are ready to label images.
1. Prepare the input image
Create a VisionImage object using a UIImage or a CMSampleBuffer.
If you use a UIImage, follow these steps:
- Create a VisionImage object with the UIImage. Make sure to specify the correct .orientation.
Swift
let visionImage = VisionImage(image: image)
visionImage.orientation = image.imageOrientation
Objective-C
MLKVisionImage *visionImage = [[MLKVisionImage alloc] initWithImage:image];
visionImage.orientation = image.imageOrientation;
If you use a CMSampleBuffer, follow these steps:
- Specify the orientation of the image data contained in the CMSampleBuffer. To get the image orientation:
Swift
func imageOrientation(
  deviceOrientation: UIDeviceOrientation,
  cameraPosition: AVCaptureDevice.Position
) -> UIImage.Orientation {
  switch deviceOrientation {
  case .portrait:
    return cameraPosition == .front ? .leftMirrored : .right
  case .landscapeLeft:
    return cameraPosition == .front ? .downMirrored : .up
  case .portraitUpsideDown:
    return cameraPosition == .front ? .rightMirrored : .left
  case .landscapeRight:
    return cameraPosition == .front ? .upMirrored : .down
  case .faceDown, .faceUp, .unknown:
    return .up
  }
}
Objective-C
- (UIImageOrientation)
    imageOrientationFromDeviceOrientation:(UIDeviceOrientation)deviceOrientation
                           cameraPosition:(AVCaptureDevicePosition)cameraPosition {
  switch (deviceOrientation) {
    case UIDeviceOrientationPortrait:
      return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationLeftMirrored
                                                            : UIImageOrientationRight;
    case UIDeviceOrientationLandscapeLeft:
      return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationDownMirrored
                                                            : UIImageOrientationUp;
    case UIDeviceOrientationPortraitUpsideDown:
      return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationRightMirrored
                                                            : UIImageOrientationLeft;
    case UIDeviceOrientationLandscapeRight:
      return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationUpMirrored
                                                            : UIImageOrientationDown;
    case UIDeviceOrientationUnknown:
    case UIDeviceOrientationFaceUp:
    case UIDeviceOrientationFaceDown:
      return UIImageOrientationUp;
  }
}
- Create a VisionImage object using the CMSampleBuffer object and orientation:
Swift
let image = VisionImage(buffer: sampleBuffer)
image.orientation = imageOrientation(
  deviceOrientation: UIDevice.current.orientation,
  cameraPosition: cameraPosition)
Objective-C
MLKVisionImage *image = [[MLKVisionImage alloc] initWithBuffer:sampleBuffer];
image.orientation =
    [self imageOrientationFromDeviceOrientation:UIDevice.currentDevice.orientation
                                 cameraPosition:cameraPosition];
2. Configure and run the image labeler
To label objects in an image, pass the VisionImage object to the ImageLabeler's processImage() method.
- First, get an instance of ImageLabeler.
Swift
let labeler = ImageLabeler.imageLabeler()

// Or, to set the minimum confidence required:
// let options = ImageLabelerOptions()
// options.confidenceThreshold = 0.7
// let labeler = ImageLabeler.imageLabeler(options: options)
Objective-C
MLKImageLabeler *labeler = [MLKImageLabeler imageLabeler];

// Or, to set the minimum confidence required:
// MLKImageLabelerOptions *options =
//     [[MLKImageLabelerOptions alloc] init];
// options.confidenceThreshold = 0.7;
// MLKImageLabeler *labeler =
//     [MLKImageLabeler imageLabelerWithOptions:options];
- Then, pass the image to the processImage() method:
Swift
labeler.process(image) { labels, error in
  guard error == nil, let labels = labels else { return }

  // Task succeeded.
  // ...
}
Objective-C
[labeler processImage:image
           completion:^(NSArray *_Nullable labels,
                        NSError *_Nullable error) {
  if (error != nil) { return; }

  // Task succeeded.
  // ...
}];
3. Get information about labeled objects
If image labeling succeeds, the completion handler receives an array of ImageLabel objects. Each ImageLabel object represents something that was labeled in the image. The base model supports 400+ different labels.
You can get each label's text description, index among all labels supported by
the model, and the confidence score of the match. For example:
Swift
for label in labels {
  let labelText = label.text
  let confidence = label.confidence
  let index = label.index
}
Objective-C
for (MLKImageLabel *label in labels) {
  NSString *labelText = label.text;
  float confidence = label.confidence;
  NSInteger index = label.index;
}
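Once you have the text, confidence, and index of each label, a common next step is to keep only the most confident matches. The following is a minimal sketch of that idea, not part of the ML Kit API: LabelResult is a hypothetical stand-in for ImageLabel so the example runs without the SDK, topLabels is an illustrative helper, and the sample texts, scores, and indices are made up.

```swift
import Foundation

// Hypothetical stand-in for ML Kit's ImageLabel (assumption: only the three
// properties described above are needed), so this runs without the SDK.
struct LabelResult {
  let text: String
  let confidence: Float
  let index: Int
}

/// Keeps labels whose confidence is at least `minConfidence`,
/// sorts them from most to least confident, and returns at most `limit` of them.
func topLabels(_ labels: [LabelResult], minConfidence: Float, limit: Int) -> [LabelResult] {
  return Array(
    labels
      .filter { $0.confidence >= minConfidence }
      .sorted { $0.confidence > $1.confidence }
      .prefix(limit)
  )
}

// Made-up sample data for illustration only.
let sample = [
  LabelResult(text: "Dog", confidence: 0.92, index: 0),
  LabelResult(text: "Grass", confidence: 0.55, index: 1),
  LabelResult(text: "Pet", confidence: 0.81, index: 2),
]

for label in topLabels(sample, minConfidence: 0.7, limit: 2) {
  print("\(label.text) (index \(label.index)): \(label.confidence)")
}
```

In a real app you would map each ImageLabel into your own model inside the completion handler, or set confidenceThreshold on the labeler options so that low-confidence labels are never returned in the first place.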
Tips to improve real-time performance
If you want to label images in a real-time application, follow these guidelines to achieve the best framerates:
- For processing video frames, use the results(in:) synchronous API of the image labeler. Call this method from the AVCaptureVideoDataOutputSampleBufferDelegate's captureOutput(_, didOutput:from:) function to synchronously get results from the given video frame. Keep AVCaptureVideoDataOutput's alwaysDiscardsLateVideoFrames as true to throttle calls to the image labeler. If a new video frame becomes available while the image labeler is running, it will be dropped.
- If you use the output of the image labeler to overlay graphics on the input image, first get the result from ML Kit, then render the image and overlay in a single step. By doing so, you render to the display surface only once for each processed input frame. See the updatePreviewOverlayViewWithLastFrame in the ML Kit quickstart sample for an example.
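The two tips above could be combined roughly as follows. This is a sketch under stated assumptions rather than the quickstart's implementation: FrameLabeler, makeVideoOutput, and onLabels are hypothetical names, the fixed .right orientation assumes the back camera held in portrait, and the code needs an iOS project with the ML Kit pod installed to build.

```swift
import AVFoundation
import MLKitImageLabeling
import MLKitVision
import UIKit

// Hypothetical helper class; the name and structure are illustrative only.
final class FrameLabeler: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
  private let labeler = ImageLabeler.imageLabeler()

  /// Called on the main queue with the labels for each processed frame,
  /// so the frame and its overlay can be rendered in a single step.
  var onLabels: (([ImageLabel]) -> Void)?

  /// Builds a video output that drops frames arriving while labeling is in progress.
  func makeVideoOutput(queue: DispatchQueue) -> AVCaptureVideoDataOutput {
    let output = AVCaptureVideoDataOutput()
    output.alwaysDiscardsLateVideoFrames = true
    output.setSampleBufferDelegate(self, queue: queue)
    return output
  }

  func captureOutput(
    _ output: AVCaptureOutput,
    didOutput sampleBuffer: CMSampleBuffer,
    from connection: AVCaptureConnection
  ) {
    let image = VisionImage(buffer: sampleBuffer)
    // Assumption: back camera in portrait. Derive this from the device
    // orientation and camera position in a real app.
    image.orientation = .right

    do {
      // Synchronous call: blocks this delegate queue until labeling finishes.
      let labels = try labeler.results(in: image)
      DispatchQueue.main.async { [weak self] in
        self?.onLabels?(labels)
      }
    } catch {
      print("Image labeling failed: \(error.localizedDescription)")
    }
  }
}
```

Because results(in:) blocks the delegate queue, pass a dedicated serial queue to makeVideoOutput rather than the main queue, so that labeling never stalls the UI.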


