You can use ML Kit to identify the language of a string of text. You can get the string's most likely language or get confidence scores for all of the string's possible languages.
ML Kit recognizes text in 103 different languages in their native scripts. In addition, romanized text can be recognized for Arabic, Bulgarian, Chinese, Greek, Hindi, Japanese, and Russian.
Before you begin
- If you have not already added Firebase to your app, do so by following the steps in the getting started guide.
- Include the ML Kit libraries in your Podfile:
    pod 'Firebase/MLNaturalLanguage', '6.25.0'
    pod 'Firebase/MLNLLanguageID', '6.25.0'
  After you install or update your project's Pods, open your Xcode project using its .xcworkspace file.
- In your app, import Firebase:
Swift
import Firebase
Objective-C
@import Firebase;
Identify the language of a string
To identify the language of a string, get an instance of LanguageIdentification, and then pass the string to the identifyLanguage(for:) method.
For example:
Swift
    let languageId = NaturalLanguage.naturalLanguage().languageIdentification()

    languageId.identifyLanguage(for: text) { (languageCode, error) in
      if let error = error {
        print("Failed with error: \(error)")
        return
      }
      if let languageCode = languageCode, languageCode != "und" {
        print("Identified Language: \(languageCode)")
      } else {
        print("No language was identified")
      }
    }
Objective-C
    FIRNaturalLanguage *naturalLanguage = [FIRNaturalLanguage naturalLanguage];
    FIRLanguageIdentification *languageId = [naturalLanguage languageIdentification];

    [languageId identifyLanguageForText:text
                             completion:^(NSString * _Nullable languageCode,
                                          NSError * _Nullable error) {
      if (error != nil) {
        NSLog(@"Failed with error: %@", error.localizedDescription);
        return;
      }
      if (languageCode != nil && ![languageCode isEqualToString:@"und"]) {
        NSLog(@"Identified Language: %@", languageCode);
      } else {
        NSLog(@"No language was identified");
      }
    }];
If the call succeeds, a BCP-47 language code is passed to the completion handler, indicating the language of the text. See the complete list of supported languages. If no language could be confidently detected, the code und (undetermined) is passed.
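Once the completion handler has a code, an app often wants a human-readable name for it. The following is a minimal sketch of one way to do that with Foundation's Locale. The helper displayName(forIdentifiedCode:in:) is hypothetical and not part of ML Kit; it assumes that both a missing code and und should be treated as "no language identified".

```swift
import Foundation

/// Hypothetical helper (not an ML Kit API): turns the code passed to the
/// completion handler into a display name.
/// Returns nil when the code is missing or is "und" (undetermined).
func displayName(forIdentifiedCode code: String?,
                 in locale: Locale = Locale(identifier: "en_US")) -> String? {
    guard let code = code, code != "und" else { return nil }
    // Fall back to the raw BCP-47 code if the locale has no name for it.
    return locale.localizedString(forLanguageCode: code) ?? code
}

// Usage inside the completion handler might look like:
// let label = displayName(forIdentifiedCode: languageCode) ?? "Unknown language"
```

Falling back to the raw code keeps the result usable for languages the device's locale data cannot name.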
By default, ML Kit returns a non-und value only when it identifies the language with a confidence value of at least 0.5. You can change this threshold by passing a LanguageIdentificationOptions object to languageIdentification(options:):
Swift
    let options = LanguageIdentificationOptions(confidenceThreshold: 0.4)
    let languageId = NaturalLanguage.naturalLanguage().languageIdentification(options: options)
Objective-C
    FIRNaturalLanguage *naturalLanguage = [FIRNaturalLanguage naturalLanguage];
    FIRLanguageIdentificationOptions *options =
        [[FIRLanguageIdentificationOptions alloc] initWithConfidenceThreshold:0.4];
    FIRLanguageIdentification *languageId =
        [naturalLanguage languageIdentificationWithOptions:options];
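To make the effect of the threshold concrete, here is a small stand-alone model of the rule described above. It is only an illustration of the documented behavior, not ML Kit's implementation; the Candidate type and the resolvedCode(for:threshold:) function are hypothetical names introduced for this sketch.

```swift
/// Hypothetical stand-in for the best (language, confidence) guess.
struct Candidate {
    let languageCode: String
    let confidence: Float
}

/// Models the documented rule: report the language only when its confidence
/// is at least the threshold (0.5 by default), otherwise report "und".
func resolvedCode(for best: Candidate?, threshold: Float = 0.5) -> String {
    guard let best = best, best.confidence >= threshold else { return "und" }
    return best.languageCode
}

let guess = Candidate(languageCode: "es", confidence: 0.45)
let withDefault = resolvedCode(for: guess)                  // below 0.5
let withLowered = resolvedCode(for: guess, threshold: 0.4)  // meets 0.4
```

With the default threshold the 0.45 guess is reported as und; lowering the threshold to 0.4 lets the same guess through as es.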
Get the possible languages of a string
To get the confidence values of a string's most likely languages, get an instance of LanguageIdentification, and then pass the string to the identifyPossibleLanguages(for:) method.
For example:
Swift
    let languageId = NaturalLanguage.naturalLanguage().languageIdentification()

    languageId.identifyPossibleLanguages(for: text) { (identifiedLanguages, error) in
      if let error = error {
        print("Failed with error: \(error)")
        return
      }
      guard let identifiedLanguages = identifiedLanguages,
        !identifiedLanguages.isEmpty,
        identifiedLanguages[0].languageCode != "und"
      else {
        print("No language was identified")
        return
      }

      print("Identified Languages:\n" +
        identifiedLanguages.map {
          String(format: "(%@, %.2f)", $0.languageCode, $0.confidence)
        }.joined(separator: "\n"))
    }
Objective-C
    FIRNaturalLanguage *naturalLanguage = [FIRNaturalLanguage naturalLanguage];
    FIRLanguageIdentification *languageId = [naturalLanguage languageIdentification];

    [languageId identifyPossibleLanguagesForText:text
                                      completion:^(NSArray<FIRIdentifiedLanguage *> * _Nonnull identifiedLanguages,
                                                   NSError * _Nullable error) {
      if (error != nil) {
        NSLog(@"Failed with error: %@", error.localizedDescription);
        return;
      }
      if (identifiedLanguages.count == 1
          && [identifiedLanguages[0].languageCode isEqualToString:@"und"]) {
        NSLog(@"No language was identified");
        return;
      }
      NSMutableString *outputText = [NSMutableString stringWithFormat:@"Identified Languages:"];
      for (FIRIdentifiedLanguage *language in identifiedLanguages) {
        [outputText appendFormat:@"\n(%@, %.2f)", language.languageCode, language.confidence];
      }
      // Log through a format string so any '%' in the text is not interpreted.
      NSLog(@"%@", outputText);
    }];
If the call succeeds, a list of IdentifiedLanguage objects is passed to the completion handler. From each object, you can get the language's BCP-47 code and the confidence that the string is in that language. See the complete list of supported languages. Note that these values indicate the confidence that the entire string is in the given language; ML Kit doesn't identify multiple languages in a single string.
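If you only need a single answer from that list, one option is to pick the entry with the highest confidence. The sketch below uses a hypothetical ScoredLanguage struct as a stand-in for IdentifiedLanguage so that it runs without Firebase; mostLikely(_:) is likewise an illustrative name, not an ML Kit API.

```swift
/// Hypothetical stand-in for IdentifiedLanguage (code plus confidence).
struct ScoredLanguage {
    let languageCode: String
    let confidence: Float
}

/// Returns the highest-confidence entry, ignoring the "und" placeholder.
/// Returns nil when nothing usable was identified.
func mostLikely(_ languages: [ScoredLanguage]) -> ScoredLanguage? {
    return languages
        .filter { $0.languageCode != "und" }
        .max { $0.confidence < $1.confidence }
}

let sample = [
    ScoredLanguage(languageCode: "nl", confidence: 0.2),
    ScoredLanguage(languageCode: "en", confidence: 0.7),
]
let top = mostLikely(sample)
```

Taking the maximum explicitly avoids relying on the order in which the entries arrive, which this page does not specify.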
By default, ML Kit returns only languages with confidence values of at least 0.01. You can change this threshold by passing a LanguageIdentificationOptions object to languageIdentification(options:):
Swift
    let options = LanguageIdentificationOptions(confidenceThreshold: 0.4)
    let languageId = NaturalLanguage.naturalLanguage().languageIdentification(options: options)
Objective-C
    FIRNaturalLanguage *naturalLanguage = [FIRNaturalLanguage naturalLanguage];
    FIRLanguageIdentificationOptions *options =
        [[FIRLanguageIdentificationOptions alloc] initWithConfidenceThreshold:0.4];
    FIRLanguageIdentification *languageId =
        [naturalLanguage languageIdentificationWithOptions:options];
If no language meets this threshold, the list will have one item, with the value und.
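The list behavior described above can be modeled in a few lines. This is a sketch of the documented rule only, not ML Kit's implementation: the Scored type and languagesMeeting(threshold:from:) are hypothetical, and the confidence of 0 attached to the und entry is a placeholder, since this page does not say what value that entry carries.

```swift
/// Hypothetical stand-in for an identified language and its confidence.
struct Scored {
    let languageCode: String
    let confidence: Float
}

/// Keeps only candidates at or above the threshold. If none qualify,
/// returns a single "und" entry, mirroring the documented fallback.
func languagesMeeting(threshold: Float, from candidates: [Scored]) -> [Scored] {
    let kept = candidates.filter { $0.confidence >= threshold }
    if kept.isEmpty {
        // Placeholder confidence: the real value is not documented here.
        return [Scored(languageCode: "und", confidence: 0)]
    }
    return kept
}

let candidates = [
    Scored(languageCode: "en", confidence: 0.62),
    Scored(languageCode: "nl", confidence: 0.30),
    Scored(languageCode: "af", confidence: 0.05),
]
let strict = languagesMeeting(threshold: 0.4, from: candidates)
let none = languagesMeeting(threshold: 0.9, from: candidates)
```

Here strict contains only en, while none collapses to the single und entry because no candidate reaches 0.9.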