Defines a ring buffer and some utility functions to prepare the input audio samples.
It maintains a ring buffer to hold input audio data. Clients can feed input audio data via the `load` methods and access the aggregated audio samples via the `getTensorBuffer` method.
Note that this class can only handle input audio in Float (in AudioFormat.ENCODING_PCM_FLOAT) or Short (in AudioFormat.ENCODING_PCM_16BIT). Internally it converts and stores all the audio samples in PCM Float encoding.
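The internal conversion from 16-bit PCM to float can be pictured with a small standalone sketch. This is not the library's code: the class name `Pcm16ToFloatSketch` is hypothetical, and the scale factor of 32768 is an assumption (a common convention that maps the 16-bit range onto [-1, 1)); the exact factor used by TensorAudio is not stated here.

```java
// Hypothetical helper for illustration only; not part of the TensorAudio API.
final class Pcm16ToFloatSketch {
    // Assumed scale factor: 32768 maps the signed 16-bit range onto [-1, 1).
    private static final float SCALE = 32768f;

    // Converts `size` values of `src`, starting at `offset`, to float samples.
    static float[] convert(short[] src, int offset, int size) {
        if (offset < 0 || size < 0 || offset + size > src.length) {
            throw new IllegalArgumentException("offset/size out of bounds");
        }
        float[] out = new float[size];
        for (int i = 0; i < size; i++) {
            out[i] = src[offset + i] / SCALE;
        }
        return out;
    }

    public static void main(String[] args) {
        // Silence, half scale, and the two extremes of the 16-bit range.
        float[] out = convert(new short[] {0, 16384, -32768, 32767}, 0, 4);
        for (float v : out) {
            System.out.println(v);
        }
    }
}
```

Float input needs no such step, since it is already in the target encoding.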
Typical usage in Kotlin
val tensor = TensorAudio.create(format, modelInputLength)
tensor.load(newData)
interpreter.run(tensor.getTensorBuffer(), outputBuffer)
Another sample usage with AudioRecord
val tensor = TensorAudio.create(format, modelInputLength)
Timer().scheduleAtFixedRate(delay, period) {
    tensor.load(audioRecord)
    interpreter.run(tensor.getTensorBuffer(), outputBuffer)
}
Nested Classes
Public Methods
| static TensorAudio | create (AudioFormat format, int sampleCounts) Creates a TensorAudio instance with a ring buffer whose size is sampleCounts * format.getChannelCount(). |
| static TensorAudio | create (TensorAudio.TensorAudioFormat format, int sampleCounts) Creates a TensorAudio instance with a ring buffer whose size is sampleCounts * format.getChannels(). |
| TensorAudio.TensorAudioFormat | getFormat () |
| TensorBuffer | getTensorBuffer () Returns a float TensorBuffer holding all the available audio samples in AudioFormat.ENCODING_PCM_FLOAT, i.e. values are in the range of [-1, 1]. |
| void | load (short[] src) Converts the input audio samples src to ENCODING_PCM_FLOAT, then stores them in the ring buffer. |
| void | load (float[] src, int offsetInFloat, int sizeInFloat) Stores the input audio samples src in the ring buffer. |
| void | load (short[] src, int offsetInShort, int sizeInShort) Converts the input audio samples src to ENCODING_PCM_FLOAT, then stores them in the ring buffer. |
| int | load (AudioRecord record) Loads the latest data from the AudioRecord in a non-blocking way. |
| void | load (float[] src) Stores the input audio samples src in the ring buffer. |
Inherited Methods
Public Methods
public static TensorAudio create (AudioFormat format, int sampleCounts)
Creates a TensorAudio instance with a ring buffer whose size is sampleCounts * format.getChannelCount().
Parameters
| format | the AudioFormat required by the TFLite model. It defines the number of channels and sample rate. |
|---|---|
| sampleCounts | the number of samples to be fed into the model |
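To make the buffer sizing concrete, here is a minimal sketch of a fixed-capacity buffer that always holds the most recent `sampleCounts * channelCount` values. It is an illustration under stated assumptions, not the library's internal class: `FloatRingSketch` and its methods are hypothetical names, and it uses a simple shift-and-append window instead of wrapping read/write indices.

```java
import java.util.Arrays;

// Hypothetical illustration; not the buffer class used inside TensorAudio.
final class FloatRingSketch {
    // Holds the latest `capacity` values, oldest first. Starts out as zeros.
    private final float[] data;

    FloatRingSketch(int sampleCounts, int channelCount) {
        // One slot per sample per channel, matching the documented size.
        this.data = new float[sampleCounts * channelCount];
    }

    int capacity() {
        return data.length;
    }

    // Appends `src`, discarding the oldest values once the buffer is full.
    void load(float[] src) {
        int cap = data.length;
        if (src.length >= cap) {
            // The input alone fills the buffer: keep only its newest values.
            System.arraycopy(src, src.length - cap, data, 0, cap);
        } else {
            // Shift the surviving values to the front, then append the input.
            System.arraycopy(data, src.length, data, 0, cap - src.length);
            System.arraycopy(src, 0, data, cap - src.length, src.length);
        }
    }

    // Returns a copy of the current contents, oldest value first.
    float[] snapshot() {
        return Arrays.copyOf(data, data.length);
    }

    public static void main(String[] args) {
        FloatRingSketch ring = new FloatRingSketch(2, 2); // 2 samples, stereo
        ring.load(new float[] {1f, 2f, 3f});
        ring.load(new float[] {4f, 5f});
        System.out.println(Arrays.toString(ring.snapshot()));
    }
}
```

With this layout, a model that expects a fixed-length input can always read the whole buffer, regardless of how many values each individual `load` call supplied.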
public static TensorAudio create (TensorAudio.TensorAudioFormat format, int sampleCounts)
Creates a TensorAudio instance with a ring buffer whose size is sampleCounts * format.getChannels().
Parameters
| format | the expected TensorAudio.TensorAudioFormat of audio data loaded into this class. |
|---|---|
| sampleCounts | the number of samples to be fed into the model |
public TensorAudio.TensorAudioFormat getFormat ()
public TensorBuffer getTensorBuffer ()
Returns a float TensorBuffer holding all the available audio samples in AudioFormat.ENCODING_PCM_FLOAT, i.e. values are in the range of [-1, 1].
public void load (short[] src)
Converts the input audio samples src to ENCODING_PCM_FLOAT, then stores them in the ring buffer.
Parameters
| src | input audio samples in AudioFormat.ENCODING_PCM_16BIT. For multi-channel input, the array is interleaved. |
|---|---|
public void load (float[] src, int offsetInFloat, int sizeInFloat)
Stores the input audio samples src in the ring buffer.
Parameters
| src | input audio samples in AudioFormat.ENCODING_PCM_FLOAT. For multi-channel input, the array is interleaved. |
|---|---|
| offsetInFloat | starting position in the src array |
| sizeInFloat | the number of float values to be copied |
Throws
public void load (short[] src, int offsetInShort, int sizeInShort)
Converts the input audio samples src to ENCODING_PCM_FLOAT, then stores them in the ring buffer.
Parameters
| src | input audio samples in AudioFormat.ENCODING_PCM_16BIT. For multi-channel input, the array is interleaved. |
|---|---|
| offsetInShort | starting position in the src array |
| sizeInShort | the number of short values to be copied |
Throws
public int load (AudioRecord record)
Loads the latest data from the AudioRecord in a non-blocking way. Only ENCODING_PCM_16BIT and ENCODING_PCM_FLOAT are supported.
Parameters
| record | the AudioRecord to read from |
|---|---|
Returns
- the number of captured audio values, counted as channelCount * sampleCount. If there was no new data in the AudioRecord or an error occurred, this method will return 0.
Throws
| IllegalArgumentException | for unsupported audio encoding format |
|---|---|
| IllegalStateException | if reading from AudioRecord failed |
public void load (float[] src)
Stores the input audio samples src in the ring buffer.
Parameters
| src | input audio samples in AudioFormat.ENCODING_PCM_FLOAT. For multi-channel input, the array is interleaved. |
|---|---|
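Interleaved layout means that the values of one frame (one sample per channel) sit next to each other in the array. The sketch below builds such an array for stereo input. It is only an illustration: `InterleaveSketch` is a hypothetical helper, and the left-then-right channel order is an assumption.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of the TensorAudio API.
final class InterleaveSketch {
    // Builds [L0, R0, L1, R1, ...] from two per-channel arrays of equal length.
    static float[] interleave(float[] left, float[] right) {
        if (left.length != right.length) {
            throw new IllegalArgumentException("channels must have equal length");
        }
        float[] out = new float[left.length * 2];
        for (int i = 0; i < left.length; i++) {
            out[2 * i] = left[i];      // assumed: first channel comes first
            out[2 * i + 1] = right[i]; // second channel follows in the same frame
        }
        return out;
    }

    public static void main(String[] args) {
        float[] stereo = interleave(new float[] {0.1f, 0.2f}, new float[] {-0.1f, -0.2f});
        System.out.println(Arrays.toString(stereo));
    }
}
```

An array like this holds two values per frame, so offsets and sizes passed to the `load` overloads count individual float or short values, not frames.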
