TensorAudio

public class TensorAudio

Defines a ring buffer and some utility functions to prepare the input audio samples.

It maintains a ring buffer to hold input audio data. Clients can feed input audio data via the `load` methods and access the aggregated audio samples via the `getTensorBuffer` method.

Note that this class can only handle input audio in Float (in AudioFormat.ENCODING_PCM_FLOAT) or Short (in AudioFormat.ENCODING_PCM_16BIT). Internally it converts and stores all the audio samples in PCM Float encoding.
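The conversion from 16-bit PCM to PCM Float described in the note above can be sketched as follows. This is only an illustration, not the library's implementation: the helper class `Pcm16ToFloat` is hypothetical, and dividing by `Short.MAX_VALUE` is an assumed scale factor that this page does not state.

```java
/** Hypothetical helper for illustration; not part of the TensorAudio API. */
final class Pcm16ToFloat {
    private Pcm16ToFloat() {}

    /**
     * Converts 16-bit PCM samples to float samples in approximately [-1, 1].
     * Assumption: each sample is scaled by Short.MAX_VALUE (32767).
     */
    static float[] convert(short[] src) {
        float[] out = new float[src.length];
        for (int i = 0; i < src.length; i++) {
            // The short is widened before the float division, so nothing overflows.
            out[i] = src[i] / (float) Short.MAX_VALUE;
        }
        return out;
    }
}
```

With this assumed scale, `Short.MAX_VALUE` maps to exactly 1.0 while `Short.MIN_VALUE` maps to a value marginally below -1.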

Typical usage in Kotlin

val tensor = TensorAudio.create(format, modelInputLength)
tensor.load(newData)
interpreter.run(tensor.getTensorBuffer(), outputBuffer)

Another sample usage with AudioRecord

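val tensor = TensorAudio.create(format, modelInputLength)
// scheduleAtFixedRate here is the kotlin.concurrent extension on java.util.Timer.
Timer().scheduleAtFixedRate(delay, period) {
    // load(AudioRecord) is non-blocking, so it is safe to poll on a timer.
    tensor.load(audioRecord)
    interpreter.run(tensor.getTensorBuffer(), outputBuffer)
}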

Nested Classes

class	TensorAudio.TensorAudioFormat	Wraps a few constants describing the format of the incoming audio samples, namely number of channels and the sample rate.

Public Methods

static TensorAudio	create (AudioFormat format, int sampleCounts) Creates a `TensorAudio` instance with a ring buffer whose size is `sampleCounts` * `format.getChannelCount()` .
static TensorAudio	create (TensorAudio.TensorAudioFormat format, int sampleCounts) Creates a `TensorAudio` instance with a ring buffer whose size is `sampleCounts` * `format.getChannels()`.
TensorAudio.TensorAudioFormat	getFormat ()
TensorBuffer	getTensorBuffer () Returns a float `TensorBuffer` holding all the available audio samples in `AudioFormat.ENCODING_PCM_FLOAT`, i.e. values are in the range of [-1, 1].
void	load (short[] src) Converts the input audio samples `src` to ENCODING_PCM_FLOAT, then stores it in the ring buffer.
void	load (float[] src, int offsetInFloat, int sizeInFloat) Stores the input audio samples `src` in the ring buffer.
void	load (short[] src, int offsetInShort, int sizeInShort) Converts the input audio samples `src` to ENCODING_PCM_FLOAT, then stores it in the ring buffer.
int	load (AudioRecord record) Loads latest data from the `AudioRecord` in a non-blocking way.
void	load (float[] src) Stores the input audio samples `src` in the ring buffer.

Inherited Methods

From class java.lang.Object

boolean	equals ( Object arg0)
final Class <?>	getClass ()
int	hashCode ()
final void	notify ()
final void	notifyAll ()
String	toString ()
final void	wait (long arg0, int arg1)
final void	wait (long arg0)
final void	wait ()

Public Methods

public static TensorAudio create (AudioFormat format, int sampleCounts)

Creates a TensorAudio instance with a ring buffer whose size is sampleCounts * format.getChannelCount() .

Parameters

format	the `AudioFormat` required by the TFLite model. It defines the number of channels and sample rate.
sampleCounts	the number of samples to be fed into the model
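To make the buffer sizing concrete, here is a minimal sketch of a buffer with `sampleCounts * channels` slots that keeps only the most recent values. The class `FloatRing` and its methods are hypothetical stand-ins, not the library's internal class; the sketch assumes that the oldest values are dropped first and that unfilled slots read as zero, neither of which this page confirms.

```java
/** Hypothetical stand-in for the internal ring buffer; for illustration only. */
final class FloatRing {
    // Always holds the most recent `capacity` values, oldest first.
    private final float[] data;

    FloatRing(int sampleCounts, int channels) {
        // Same sizing rule as documented for create(): sampleCounts * channel count.
        data = new float[sampleCounts * channels];
    }

    int capacity() {
        return data.length;
    }

    /** Appends newData, discarding the oldest values once the buffer is full. */
    void load(float[] newData) {
        int cap = data.length;
        if (newData.length >= cap) {
            // Only the last `cap` values fit.
            System.arraycopy(newData, newData.length - cap, data, 0, cap);
        } else {
            int keep = cap - newData.length;
            // Shift the surviving values to the front, then append the new ones.
            System.arraycopy(data, newData.length, data, 0, keep);
            System.arraycopy(newData, 0, data, keep, newData.length);
        }
    }

    /** Returns a copy of the current contents, oldest value first. */
    float[] snapshot() {
        return data.clone();
    }
}
```

For example, a stereo model input of 2 samples gives a capacity of 4; loading 3 values and then 2 more leaves the last 4 values in the buffer.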

public static TensorAudio create ( TensorAudio.TensorAudioFormat format, int sampleCounts)

Creates a TensorAudio instance with a ring buffer whose size is sampleCounts * format.getChannels().

Parameters

format	the expected `TensorAudio.TensorAudioFormat` of audio data loaded into this class.
sampleCounts	the number of samples to be fed into the model

public TensorAudio.TensorAudioFormat getFormat ()

public TensorBuffer getTensorBuffer ()

Returns a float TensorBuffer holding all the available audio samples in AudioFormat.ENCODING_PCM_FLOAT i.e. values are in the range of [-1, 1].

public void load (short[] src)

Converts the input audio samples src to ENCODING_PCM_FLOAT, then stores it in the ring buffer.

Parameters

src	input audio samples in `AudioFormat.ENCODING_PCM_16BIT`. For multi-channel input, the array is interleaved.
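As an illustration of the interleaved layout expected for multi-channel input, the sketch below merges two per-channel arrays into one array in which the samples of each frame sit next to each other. The helper `Interleave.stereo` is hypothetical and not part of the library.

```java
/** Hypothetical helper for illustration; not part of the TensorAudio API. */
final class Interleave {
    private Interleave() {}

    /** Builds an interleaved stereo array: L0, R0, L1, R1, ... */
    static short[] stereo(short[] left, short[] right) {
        if (left.length != right.length) {
            throw new IllegalArgumentException("Channels must have the same length");
        }
        short[] out = new short[left.length * 2];
        for (int i = 0; i < left.length; i++) {
            out[2 * i] = left[i];       // channel 0 sample of frame i
            out[2 * i + 1] = right[i];  // channel 1 sample of frame i
        }
        return out;
    }
}
```

An array produced this way has the shape that `load(short[] src)` is documented to accept for a two-channel format.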

public void load (float[] src, int offsetInFloat, int sizeInFloat)

Stores the input audio samples src in the ring buffer.

Parameters

src	input audio samples in `AudioFormat.ENCODING_PCM_FLOAT` . For multi-channel input, the array is interleaved.
offsetInFloat	starting position in the `src` array
sizeInFloat	the number of float values to be copied

Throws

IllegalArgumentException	for incompatible audio format or incorrect input size
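The exact conditions that trigger this exception are not listed here. The sketch below shows the kind of argument checks a caller could run before invoking `load(src, offsetInFloat, sizeInFloat)`. Both checks are assumptions: that the requested range must lie inside `src`, and that the size must be a whole number of frames (a multiple of the channel count). The class `LoadArgs` is hypothetical.

```java
/** Hypothetical pre-flight check for illustration; not part of the TensorAudio API. */
final class LoadArgs {
    private LoadArgs() {}

    /**
     * Throws IllegalArgumentException if [offset, offset + size) is not inside
     * an array of length srcLength, or if size is not a multiple of channels.
     */
    static void check(int srcLength, int offset, int size, int channels) {
        if (channels <= 0) {
            throw new IllegalArgumentException("channels must be positive");
        }
        // Written as offset > srcLength - size to avoid int overflow in offset + size.
        if (offset < 0 || size < 0 || offset > srcLength - size) {
            throw new IllegalArgumentException("Range is outside the source array");
        }
        if (size % channels != 0) {
            // Assumed rule: input must contain complete frames.
            throw new IllegalArgumentException("Size must be a multiple of the channel count");
        }
    }
}
```

Calling `LoadArgs.check(src.length, offsetInFloat, sizeInFloat, channels)` first lets a caller fail early with a specific message.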

public void load (short[] src, int offsetInShort, int sizeInShort)

Converts the input audio samples src to ENCODING_PCM_FLOAT, then stores it in the ring buffer.

Parameters

src	input audio samples in `AudioFormat.ENCODING_PCM_16BIT` . For multi-channel input, the array is interleaved.
offsetInShort	starting position in the src array
sizeInShort	the number of short values to be copied

Throws

IllegalArgumentException	if the source array can't be copied

public int load (AudioRecord record)

Loads the latest data from the AudioRecord in a non-blocking way. Only ENCODING_PCM_16BIT and ENCODING_PCM_FLOAT are supported.

Parameters

record	an instance of `AudioRecord`

Returns

The number of captured audio values, which equals channelCount * sampleCount. If there was no new data in the AudioRecord or an error occurred, this method returns 0.

Throws

IllegalArgumentException	for unsupported audio encoding format
IllegalStateException	if reading from AudioRecord failed

public void load (float[] src)

Stores the input audio samples src in the ring buffer.

Parameters

src	input audio samples in `AudioFormat.ENCODING_PCM_FLOAT`. For multi-channel input, the array is interleaved.
