The Google Meet Media API lets your app join a Google Meet conference and consume real-time media streams.
Clients use WebRTC to communicate with Meet servers. The provided reference clients (C++, TypeScript) demonstrate recommended practices, and you are encouraged to build directly upon them. However, you may also build fully custom WebRTC clients that adhere to the Meet Media API's technical requirements.
This page outlines key WebRTC concepts required for a successful Meet Media API session.
Offer-answer signaling
WebRTC is a peer-to-peer (P2P) framework, where peers communicate by signaling each other. To begin a session, the initiating peer sends an SDP offer to a remote peer. This offer includes the following important details:
Media descriptions for audio and video
Media descriptions indicate what's communicated during P2P sessions. Three types of descriptions exist: audio, video, and data.
To indicate n audio streams, the offerer includes n audio media descriptions in the offer. The same is true for video. However, there will be at most one data media description.
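To make this concrete, here's a small sketch (the helper name is hypothetical, not part of any API) that counts the media descriptions in an SDP string. Each media description begins with an `m=<type>` line:

```javascript
// Hypothetical helper: count media descriptions by type in an SDP string.
function countMediaDescriptions(sdp) {
  const counts = { audio: 0, video: 0, application: 0 };
  for (const line of sdp.split('\n')) {
    const match = line.match(/^m=(audio|video|application)\b/);
    if (match) counts[match[1]] += 1;
  }
  return counts;
}

// An offer with three audio streams, one video stream, and a data channel:
const offer = [
  'v=0',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111',
  'm=video 9 UDP/TLS/RTP/SAVPF 96',
  'm=application 9 UDP/DTLS/SCTP webrtc-datachannel',
].join('\n');

console.log(countMediaDescriptions(offer));
// { audio: 3, video: 1, application: 1 }
```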
Directionality
Each audio or video description describes an individual Secure Real-time Transport Protocol (SRTP) stream, governed by RFC 3711. These streams are bi-directional, allowing two peers to send and receive media across the same connection.
Because of this, each media description (in both the offer and answer) contains one of three attributes describing how the stream should be used:
- sendonly: Only sends media from the offering peer. The remote peer won't send media on this stream.
- recvonly: Only receives media from the remote peer. The offering peer won't send media on this stream.
- sendrecv: Both peers may send and receive on this stream.
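As a sketch of how these attributes appear in practice, the following hypothetical helper reads the direction of each media description from an SDP string (per RFC 3264, a section with no direction attribute defaults to sendrecv):

```javascript
// Hypothetical helper: report the direction attribute of each media
// description in an SDP string.
function mediaDirections(sdp) {
  const directions = [];
  let current = null;
  for (const line of sdp.split('\n')) {
    if (line.startsWith('m=')) {
      if (current) directions.push(current);
      // Default direction is sendrecv when no attribute is present.
      current = { media: line.split(' ')[0].slice(2), direction: 'sendrecv' };
    } else if (current && /^a=(sendonly|recvonly|sendrecv|inactive)$/.test(line)) {
      current.direction = line.slice(2);
    }
  }
  if (current) directions.push(current);
  return directions;
}

const sdp = [
  'v=0',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111', 'a=recvonly',
  'm=video 9 UDP/TLS/RTP/SAVPF 96', 'a=recvonly',
].join('\n');
console.log(mediaDirections(sdp)); // each section reports direction 'recvonly'
```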
Codecs
Each media description also specifies the codecs a peer supports. In the case of the Meet Media API, client offers are rejected unless they support (at least) the codecs specified in the technical requirements.
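Codecs are advertised through a=rtpmap lines. A hedged sketch of such a check (the helper is hypothetical; the required codec names come from the technical requirements discussed below):

```javascript
// Hypothetical check: verify an offer advertises a set of required codecs
// via its a=rtpmap lines.
function advertisesCodecs(sdp, required) {
  const names = new Set();
  for (const line of sdp.split('\n')) {
    const m = line.match(/^a=rtpmap:\d+ ([^\/]+)\//);
    if (m) names.add(m[1].toLowerCase());
  }
  return required.every((codec) => names.has(codec.toLowerCase()));
}

const sdp = [
  'm=audio 9 UDP/TLS/RTP/SAVPF 111', 'a=rtpmap:111 opus/48000/2',
  'm=video 9 UDP/TLS/RTP/SAVPF 96 98 45', 'a=rtpmap:96 VP8/90000',
  'a=rtpmap:98 VP9/90000', 'a=rtpmap:45 AV1/90000',
].join('\n');
console.log(advertisesCodecs(sdp, ['opus', 'VP8', 'VP9', 'AV1'])); // true
```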
DTLS handshake
SRTP streams are secured by an initial Datagram Transport Layer Security ("DTLS", RFC 9147) handshake between the peers.
DTLS is traditionally a client-to-server protocol; during the signaling process, one peer agrees to act as the server while the other acts as the client.
Because each SRTP stream might have its own dedicated DTLS connection, each media description specifies one of three attributes to indicate the peer's role in the DTLS handshake:
- a=setup:actpass: The offering peer defers to the choice of the remote peer.
- a=setup:active: This peer acts as the client.
- a=setup:passive: This peer acts as the server.
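A small sketch (hypothetical helper) that collects the a=setup role declared in each media description, which is useful when verifying an offer before sending it:

```javascript
// Hypothetical helper: collect the a=setup role declared in each media
// description of an SDP string.
function setupRoles(sdp) {
  return sdp.split('\n')
    .filter((line) => line.startsWith('a=setup:'))
    .map((line) => line.slice('a=setup:'.length));
}

const offer = [
  'm=audio 9 UDP/TLS/RTP/SAVPF 111', 'a=setup:actpass',
  'm=video 9 UDP/TLS/RTP/SAVPF 96', 'a=setup:actpass',
].join('\n');
console.log(setupRoles(offer)); // [ 'actpass', 'actpass' ]
```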
Application media descriptions
Data channels (RFC 8831) are an abstraction of the Stream Control Transmission Protocol ("SCTP", RFC 9260).
To open data channels during the initial signaling phase, the offer must contain an application media description. Unlike audio and video descriptions, application descriptions don't specify direction or codecs.
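A quick sketch (hypothetical helper) of what that means at the SDP level, checking that an offer negotiates data channels:

```javascript
// Hypothetical check: confirm the offer negotiates data channels by
// including an application media description for webrtc-datachannel.
function offersDataChannels(sdp) {
  return sdp.split('\n').some(
    (line) => line.startsWith('m=application') && line.includes('webrtc-datachannel'));
}

console.log(offersDataChannels('m=application 9 UDP/DTLS/SCTP webrtc-datachannel')); // true
console.log(offersDataChannels('m=audio 9 UDP/TLS/RTP/SAVPF 111')); // false
```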
ICE candidates
A peer's Interactive Connectivity Establishment ("ICE", RFC 8445) candidates are a list of routes that a remote peer may use to establish a connection.
The Cartesian product of the two peers' lists, known as the candidate pairs, represents the potential routes between the two peers. These pairs are tested to determine the optimal route.
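An illustrative sketch of the pairing step (the candidate names are hypothetical; real ICE agents do this internally):

```javascript
// Candidate pairs are the Cartesian product of the local and remote
// candidate lists: every local route is paired with every remote route.
function candidatePairs(localCandidates, remoteCandidates) {
  const pairs = [];
  for (const local of localCandidates) {
    for (const remote of remoteCandidates) {
      pairs.push({ local, remote });
    }
  }
  return pairs;
}

// Two local candidates and two remote candidates yield four pairs to test.
const pairs = candidatePairs(['host-a', 'srflx-a'], ['host-b', 'relay-b']);
console.log(pairs.length); // 4
```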
Signal through the Meet REST API
Use the Meet REST API to perform this offer-answer signaling. Your app provides an SDP offer to the connectActiveConference() method and receives an SDP answer in return.
Client libraries for Java, C#, Node.js, and Python are available to call this method.
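Beneath the client libraries, this is a plain REST call. Here's a minimal Node.js sketch of building that request, assuming the v2 endpoint path and a request body of the form `{ offer: ... }`; consult the Meet REST API reference for the authoritative method signature, and note that the space name and token below are hypothetical:

```javascript
// Hypothetical sketch: build the HTTP request for connectActiveConference.
// The endpoint shape and body field are assumptions; verify them against
// the Meet REST API reference documentation.
function buildConnectRequest(spaceName, sdpOffer, accessToken) {
  return {
    url: `https://meet.googleapis.com/v2/${spaceName}:connectActiveConference`,
    options: {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${accessToken}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ offer: sdpOffer }),
    },
  };
}

// Usage (hypothetical space name and token):
const req = buildConnectRequest('spaces/abc123', 'v=0\r\n...', 'ya29.example-token');
// fetch(req.url, req.options) would then return a response containing the SDP answer.
```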
Example connection flow
Here's an offer with an audio media description:
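An illustrative fragment of such an audio media description (the m= line is drawn from the full example at the end of this page; the attribute lines are representative, not exhaustive):

```
m=audio 59905 UDP/TLS/RTP/SAVPF 111 63 9 0 8 13 110 126
a=recvonly
a=setup:actpass
a=rtpmap:111 opus/48000/2
```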
The remote peer responds with an SDP answer containing the same number of media description lines. Each line indicates what media, if any, the remote peer sends back to the offering client across the SRTP streams. The remote peer might also reject specific streams from the offerer by setting that media description entry to recvonly.
For the Meet Media API, clients always send the SDP offer to initiate a connection. Meet is never the initiator.
This behavior is managed internally by the reference clients (C++, TypeScript), but developers of custom clients can use WebRTC's PeerConnectionInterface to generate an offer.
To connect to Meet, the offer must adhere to specific requirements:
- The client must always act as the client in the DTLS handshake, so every media description in the offer must specify either a=setup:actpass or a=setup:active.
- Each media description line must support all required codecs for that media type:
  - Audio: Opus
  - Video: VP8, VP9, AV1
- To receive audio, the offer must include exactly 3 receive-only audio media descriptions. You can do this by setting transceivers on the peer connection object.
C++

```cpp
// ...
rtc::scoped_refptr<webrtc::PeerConnectionInterface> peer_connection;

for (int i = 0; i < 3; ++i) {
  webrtc::RtpTransceiverInit audio_init;
  audio_init.direction = webrtc::RtpTransceiverDirection::kRecvOnly;
  audio_init.stream_ids = {absl::StrCat("audio_stream_", i)};
  webrtc::RTCErrorOr<rtc::scoped_refptr<webrtc::RtpTransceiverInterface>>
      audio_result = peer_connection->AddTransceiver(
          cricket::MediaType::MEDIA_TYPE_AUDIO, audio_init);
  if (!audio_result.ok()) {
    return absl::InternalError(
        absl::StrCat("Failed to add audio transceiver: ",
                     audio_result.error().message()));
  }
}
```
JavaScript

```javascript
pc = new RTCPeerConnection();

// Configure client to receive audio from Meet servers.
pc.addTransceiver('audio', {'direction':'recvonly'});
pc.addTransceiver('audio', {'direction':'recvonly'});
pc.addTransceiver('audio', {'direction':'recvonly'});
```
- To receive video, the offer must include 1–3 receive-only video media descriptions. You can do this by setting transceivers on the peer connection object.
C++

```cpp
// ...
rtc::scoped_refptr<webrtc::PeerConnectionInterface> peer_connection;

for (uint32_t i = 0; i < configurations.receiving_video_stream_count; ++i) {
  webrtc::RtpTransceiverInit video_init;
  video_init.direction = webrtc::RtpTransceiverDirection::kRecvOnly;
  video_init.stream_ids = {absl::StrCat("video_stream_", i)};
  webrtc::RTCErrorOr<rtc::scoped_refptr<webrtc::RtpTransceiverInterface>>
      video_result = peer_connection->AddTransceiver(
          cricket::MediaType::MEDIA_TYPE_VIDEO, video_init);
  if (!video_result.ok()) {
    return absl::InternalError(
        absl::StrCat("Failed to add video transceiver: ",
                     video_result.error().message()));
  }
}
```
JavaScript

```javascript
pc = new RTCPeerConnection();

// Configure client to receive video from Meet servers.
pc.addTransceiver('video', {'direction':'recvonly'});
pc.addTransceiver('video', {'direction':'recvonly'});
pc.addTransceiver('video', {'direction':'recvonly'});
```
- The offer must always include data channels. At minimum, the session-control and media-stats channels should always be open. All data channels must be ordered.

C++

```cpp
// ...
// All data channels must be ordered.
constexpr webrtc::DataChannelInit kDataChannelConfig = {.ordered = true};

rtc::scoped_refptr<webrtc::PeerConnectionInterface> peer_connection;

// Signal session-control data channel.
webrtc::RTCErrorOr<rtc::scoped_refptr<webrtc::DataChannelInterface>>
    session_create_result = peer_connection->CreateDataChannelOrError(
        "session-control", &kDataChannelConfig);
if (!session_create_result.ok()) {
  return absl::InternalError(
      absl::StrCat("Failed to create data channel session-control: ",
                   session_create_result.error().message()));
}

// Signal media-stats data channel.
webrtc::RTCErrorOr<rtc::scoped_refptr<webrtc::DataChannelInterface>>
    stats_create_result = peer_connection->CreateDataChannelOrError(
        "media-stats", &kDataChannelConfig);
if (!stats_create_result.ok()) {
  return absl::InternalError(
      absl::StrCat("Failed to create data channel media-stats: ",
                   stats_create_result.error().message()));
}
```
JavaScript

```javascript
// ...
pc = new RTCPeerConnection();

// All data channels must be ordered.
const dataChannelConfig = {ordered: true};

// Signal session-control data channel.
sessionControlChannel = pc.createDataChannel('session-control', dataChannelConfig);
sessionControlChannel.onopen = () => console.log("data channel is now open");
sessionControlChannel.onclose = () => console.log("data channel is now closed");
sessionControlChannel.onmessage = async (e) => {
  console.log("data channel message", e.data);
};

// Signal media-stats data channel.
mediaStatsChannel = pc.createDataChannel('media-stats', dataChannelConfig);
mediaStatsChannel.onopen = () => console.log("data channel is now open");
mediaStatsChannel.onclose = () => console.log("data channel is now closed");
mediaStatsChannel.onmessage = async (e) => {
  console.log("data channel message", e.data);
};
```
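The requirements above can be sanity-checked against a generated offer's SDP before signaling. This hedged sketch (hypothetical helper, not part of the API) verifies the audio, video, data channel, and DTLS-role requirements:

```javascript
// Hypothetical sanity check that a generated offer meets the requirements:
// exactly 3 audio descriptions, 1-3 video descriptions, an application
// media description, and only actpass/active DTLS roles.
function checkOffer(sdp) {
  const lines = sdp.split('\n');
  const audio = lines.filter((l) => l.startsWith('m=audio')).length;
  const video = lines.filter((l) => l.startsWith('m=video')).length;
  const hasData = lines.some((l) => l.startsWith('m=application'));
  const setupOk = lines
    .filter((l) => l.startsWith('a=setup:'))
    .every((l) => l === 'a=setup:actpass' || l === 'a=setup:active');
  return audio === 3 && video >= 1 && video <= 3 && hasData && setupOk;
}

// An offer matching the requirements (abbreviated, illustrative lines):
const goodOffer = [
  'm=audio 9 UDP/TLS/RTP/SAVPF 111', 'a=recvonly', 'a=setup:actpass',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111', 'a=recvonly', 'a=setup:actpass',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111', 'a=recvonly', 'a=setup:actpass',
  'm=video 9 UDP/TLS/RTP/SAVPF 96', 'a=recvonly', 'a=setup:actpass',
  'm=application 9 UDP/DTLS/SCTP webrtc-datachannel', 'a=setup:actpass',
].join('\n');
console.log(checkOffer(goodOffer)); // true
```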
Example SDP offer and answer
Here's an abbreviated example of a valid SDP offer and matching SDP answer, showing the session and media description lines. This offer negotiates a Meet Media API session with audio and a single video stream.
Observe that there are three audio media descriptions, one video media description, and the required application media description.
| Client SDP offer | Meet Media API SDP answer |
|---|---|
| v=0 | v=0 |
| m=audio 59905 UDP/TLS/RTP/SAVPF 111 63 9 0 8 13 110 126 | m=audio 19306 UDP/TLS/RTP/SAVPF 111 |
| m=audio 9 UDP/TLS/RTP/SAVPF 111 63 9 0 8 13 110 126 | m=audio 9 UDP/TLS/RTP/SAVPF 111 |
| m=audio 9 UDP/TLS/RTP/SAVPF 111 63 9 0 8 13 110 126 | m=audio 9 UDP/TLS/RTP/SAVPF 111 |
| m=application 9 UDP/DTLS/SCTP webrtc-datachannel | m=application 9 DTLS/SCTP 5000 |
| m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 35 36 37 38 102 103 104 105 106 107 108 109 127 125 39 40 41 42 43 44 45 46 47 48 112 113 114 115 116 117 118 49 | m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 |