Audio classification is a common use case of Machine Learning to classify the sound types. For example, it can identify the bird species by their songs.
The Task Library AudioClassifier
API can be used to deploy your custom audio
classifiers or pretrained ones into your mobile app.
Key features of the AudioClassifier API
-
Input audio processing, e.g. converting PCM 16 bit encoding to PCM Float encoding and the manipulation of the audio ring buffer.
-
Label map locale.
-
Supporting Multi-head classification model.
-
Supporting both single-label and multi-label classification.
-
Score threshold to filter results.
-
Top-k classification results.
-
Label allowlist and denylist.
Supported audio classifier models
The following models are guaranteed to be compatible with the AudioClassifier
API.
-
Models created by TensorFlow Lite Model Maker for Audio Classification .
-
The pretrained audio event classification models on TensorFlow Hub .
-
Custom models that meet the model compatibility requirements .
Run inference in Java
See the Audio Classification reference
app
for an example using AudioClassifier
in an Android app.
Step 1: Import Gradle dependency and other settings
Copy the .tflite
model file to the assets directory of the Android module
where the model will be run. Specify that the file should not be compressed, and
add the TensorFlow Lite library to the module’s build.gradle
file:
android
{
// Other settings
// Specify that the tflite file should not be compressed when building the APK package.
aaptOptions
{
noCompress
"tflite"
}
}
dependencies
{
// Other dependencies
// Import the Audio Task Library dependency
implementation
'
org
.
tensorflow
:
tensorflow
-
lite
-
task
-
audio
:
0.4.4
'
// Import the GPU delegate plugin Library for GPU inference
implementation
'
org
.
tensorflow
:
tensorflow
-
lite
-
gpu
-
delegate
-
plugin
:
0.4.4
'
}
Step 2: Using the model
// Initialization
AudioClassifierOptions
options
=
AudioClassifierOptions
.
builder
()
.
setBaseOptions
(
BaseOptions
.
builder
().
useGpu
().
build
())
.
setMaxResults
(
1
)
.
build
();
AudioClassifier
classifier
=
AudioClassifier
.
createFromFileAndOptions
(
context
,
modelFile
,
options
);
// Start recording
AudioRecord
record
=
classifier
.
createAudioRecord
();
record
.
startRecording
();
// Load latest audio samples
TensorAudio
audioTensor
=
classifier
.
createInputTensorAudio
();
audioTensor
.
load
(
record
);
// Run inference
List<Classifications>
results
=
audioClassifier
.
classify
(
audioTensor
);
See the source code and
javadoc
for more options to configure AudioClassifier
.
Run inference in iOS
Step 1: Install the dependencies
The Task Library supports installation using CocoaPods. Make sure that CocoaPods is installed on your system. Please see the CocoaPods installation guide for instructions.
Please see the CocoaPods guide for details on adding pods to an Xcode project.
Add the TensorFlowLiteTaskAudio
pod in the Podfile.
target 'MyAppWithTaskAPI' do
use_frameworks!
pod 'TensorFlowLiteTaskAudio'
end
Make sure that the .tflite
model you will be using for inference is present in
your app bundle.
Step 2: Using the model
Swift
// Imports
import
TensorFlowLiteTaskAudio
import
AVFoundation
// Initialization
guard
let
modelPath
=
Bundle
.
main
.
path
(
forResource
:
"sound_classification"
,
ofType
:
"tflite"
)
else
{
return
}
let
options
=
AudioClassifierOptions
(
modelPath
:
modelPath
)
// Configure any additional options:
// options.classificationOptions.maxResults = 3
let
classifier
=
try
AudioClassifier
.
classifier
(
options
:
options
)
// Create Audio Tensor to hold the input audio samples which are to be classified.
// Created Audio Tensor has audio format matching the requirements of the audio classifier.
// For more details, please see:
// https://github.com/tensorflow/tflite-support/blob/master/tensorflow_lite_support/ios/task/audio/core/audio_tensor/sources/TFLAudioTensor.h
let
audioTensor
=
classifier
.
createInputAudioTensor
()
// Create Audio Record to record the incoming audio samples from the on-device microphone.
// Created Audio Record has audio format matching the requirements of the audio classifier.
// For more details, please see:
https
:
//github.com/tensorflow/tflite-support/blob/master/tensorflow_lite_support/ios/task/audio/core/audio_record/sources/TFLAudioRecord.h
let
audioRecord
=
try
classifier
.
createAudioRecord
()
// Request record permissions from AVAudioSession before invoking audioRecord.startRecording().
AVAudioSession
.
sharedInstance
().
requestRecordPermission
{
granted
in
if
granted
{
DispatchQueue
.
main
.
async
{
// Start recording the incoming audio samples from the on-device microphone.
try
audioRecord
.
startRecording
()
// Load the samples currently held by the audio record buffer into the audio tensor.
try
audioTensor
.
load
(
audioRecord
:
audioRecord
)
// Run inference
let
classificationResult
=
try
classifier
.
classify
(
audioTensor
:
audioTensor
)
}
}
}
Objective-C
// Imports
#import <TensorFlowLiteTaskAudio/TensorFlowLiteTaskAudio.h>
#import <AVFoundation/AVFoundation.h>
// Initialization
NSString
*
modelPath
=
[[
NSBundle
mainBundle
]
pathForResource
:
@"sound_classification"
ofType
:
@"tflite"
];
TFLAudioClassifierOptions
*
options
=
[[
TFLAudioClassifierOptions
alloc
]
initWithModelPath
:
modelPath
];
// Configure any additional options:
// options.classificationOptions.maxResults = 3;
TFLAudioClassifier
*
classifier
=
[
TFLAudioClassifier
audioClassifierWithOptions
:
options
error
:
nil
];
// Create Audio Tensor to hold the input audio samples which are to be classified.
// Created Audio Tensor has audio format matching the requirements of the audio classifier.
// For more details, please see:
// https://github.com/tensorflow/tflite-support/blob/master/tensorflow_lite_support/ios/task/audio/core/audio_tensor/sources/TFLAudioTensor.h
TFLAudioTensor
*
audioTensor
=
[
classifier
createInputAudioTensor
];
// Create Audio Record to record the incoming audio samples from the on-device microphone.
// Created Audio Record has audio format matching the requirements of the audio classifier.
// For more details, please see:
https
:
//github.com/tensorflow/tflite-support/blob/master/tensorflow_lite_support/ios/task/audio/core/audio_record/sources/TFLAudioRecord.h
TFLAudioRecord
*
audioRecord
=
[
classifier
createAudioRecordWithError
:
nil
];
// Request record permissions from AVAudioSession before invoking -[TFLAudioRecord startRecordingWithError:].
[[
AVAudioSession
sharedInstance
]
requestRecordPermission
:^
(
BOOL
granted
)
{
if
(
granted
)
{
dispatch_async
(
dispatch_get_main_queue
(),
^
{
// Start recording the incoming audio samples from the on-device microphone.
[
audioRecord
startRecordingWithError
:
nil
];
// Load the samples currently held by the audio record buffer into the audio tensor.
[
audioTensor
loadAudioRecord
:
audioRecord
withError
:
nil
];
// Run inference
TFLClassificationResult
*
classificationResult
=
[
classifier
classifyWithAudioTensor
:
audioTensor
error
:
nil
];
});
}
}];
See the source
code
for more options to configure TFLAudioClassifier
.
Run inference in Python
Step 1: Install the pip package
pip install tflite-support
- Linux: Run
sudo apt-get update && apt-get install libportaudio2 - Mac and Windows: PortAudio is installed automatically when installing the
tflite-supportpip package.
Step 2: Using the model
# Imports
from
tflite_support.task
import
audio
from
tflite_support.task
import
core
from
tflite_support.task
import
processor
# Initialization
base_options
=
core
.
BaseOptions
(
file_name
=
model_path
)
classification_options
=
processor
.
ClassificationOptions
(
max_results
=
2
)
options
=
audio
.
AudioClassifierOptions
(
base_options
=
base_options
,
classification_options
=
classification_options
)
classifier
=
audio
.
AudioClassifier
.
create_from_options
(
options
)
# Alternatively, you can create an audio classifier in the following manner:
# classifier = audio.AudioClassifier.create_from_file(model_path)
# Run inference
audio_file
=
audio
.
TensorAudio
.
create_from_wav_file
(
audio_path
,
classifier
.
required_input_buffer_size
)
audio_result
=
classifier
.
classify
(
audio_file
)
See the source
code
for more options to configure AudioClassifier
.
Run inference in C++
// Initialization
AudioClassifierOptions
options
;
options
.
mutable_base_options
()
-
> mutable_model_file
()
-
> set_file_name
(
model_path
);
std
::
unique_ptr<AudioClassifier>
audio_classifier
=
AudioClassifier
::
CreateFromOptions
(
options
).
value
();
// Create input audio buffer from your `audio_data` and `audio_format`.
// See more information here: tensorflow_lite_support/cc/task/audio/core/audio_buffer.h
int
input_size
=
audio_classifier
-
> GetRequiredInputBufferSize
();
const
std
::
unique_ptr<AudioBuffer>
audio_buffer
=
AudioBuffer
::
Create
(
audio_data
,
input_size
,
audio_format
).
value
();
// Run inference
const
ClassificationResult
result
=
audio_classifier
-
> Classify
(
*
audio_buffer
).
value
();
See the source
code
for more options to configure AudioClassifier
.
Model compatibility requirements
The AudioClassifier
API expects a TFLite model with mandatory TFLite Model
Metadata
. See examples of creating
metadata for audio classifiers using the TensorFlow Lite Metadata Writer
API
.
The compatible audio classifier models should meet the following requirements:
-
Input audio tensor (kTfLiteFloat32)
- audio clip of size
[batch x samples]. - batch inference is not supported (
batchis required to be 1). - for multi-channel models, the channels need to be interleaved.
- audio clip of size
-
Output score tensor (kTfLiteFloat32)
-
[1 x N]array withNrepresents the class number. - optional (but recommended) label map(s) as AssociatedFile-s with type
TENSOR_AXIS_LABELS, containing one label per line. The first such
AssociatedFile (if any) is used to fill the
labelfield (named asclass_namein C++) of the results. Thedisplay_namefield is filled from the AssociatedFile (if any) whose locale matches thedisplay_names_localefield of theAudioClassifierOptionsused at creation time ("en" by default, i.e. English). If none of these are available, only theindexfield of the results will be filled.
-

