Action recognition

Action recognition identifies different actions in video clips, such as walking or dancing. An action does not need to span the entire duration of the video.

Using an AutoML model

Before you begin

For background on creating an AutoML model, see the Vertex AI Beginner's guide. For instructions on creating your AutoML model, see Video data in "Develop and use ML models" in the Vertex AI documentation.

Use your AutoML model

The following code sample demonstrates how to perform action recognition with your AutoML model by using the streaming client library.

Python

To authenticate to Video Intelligence, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
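Before running the sample, you can optionally confirm that Application Default Credentials are discoverable from your environment. The snippet below is a minimal sketch, assuming you have already run gcloud auth application-default login or pointed the GOOGLE_APPLICATION_CREDENTIALS environment variable at a service account key file.

# A minimal check that Application Default Credentials resolve correctly.
# Assumes credentials were configured beforehand (for example with
# `gcloud auth application-default login` or GOOGLE_APPLICATION_CREDENTIALS).
import google.auth

credentials, project_id = google.auth.default()
print("Authenticating with project:", project_id)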

import io

from google.cloud import videointelligence_v1p3beta1 as videointelligence

# path = 'path_to_file'
# project_id = 'project_id'
# model_id = 'automl_action_recognition_model_id'

client = videointelligence.StreamingVideoIntelligenceServiceClient()

model_path = "projects/{}/locations/us-central1/models/{}".format(
    project_id, model_id
)

automl_config = videointelligence.StreamingAutomlActionRecognitionConfig(
    model_name=model_path
)

video_config = videointelligence.StreamingVideoConfig(
    feature=videointelligence.StreamingFeature.STREAMING_AUTOML_ACTION_RECOGNITION,
    automl_action_recognition_config=automl_config,
)

# config_request should be the first in the stream of requests.
config_request = videointelligence.StreamingAnnotateVideoRequest(
    video_config=video_config
)

# Set the chunk size to 5MB (recommended less than 10MB).
chunk_size = 5 * 1024 * 1024


def stream_generator():
    yield config_request
    # Load file content.
    # Note: Input videos must have supported video codecs. See
    # https://cloud.google.com/video-intelligence/docs/streaming/streaming#supported_video_codecs
    # for more details.
    with io.open(path, "rb") as video_file:
        while True:
            data = video_file.read(chunk_size)
            if not data:
                break
            yield videointelligence.StreamingAnnotateVideoRequest(input_content=data)


requests = stream_generator()

# streaming_annotate_video returns a generator.
# The default timeout is about 300 seconds.
# To process longer videos it should be set to
# larger than the length (in seconds) of the video.
responses = client.streaming_annotate_video(requests, timeout=900)

# Each response corresponds to about 1 second of video.
for response in responses:
    # Check for errors.
    if response.error.message:
        print(response.error.message)
        break

    for label in response.annotation_results.label_annotations:
        for frame in label.frames:
            print(
                "At {:3d}s segment, {:5.1%} {}".format(
                    frame.time_offset.seconds,
                    frame.confidence,
                    label.entity.entity_id,
                )
            )
 