Analyze documents (like PDFs) using the Gemini API
You can ask a Gemini model to analyze document files
(like PDFs and plain-text files) that you provide
either inline (base64-encoded) or via URL. When you use Firebase AI Logic,
you can make this request directly from your app.
With this capability, you can do things like:
Analyze diagrams, charts, and tables inside documents
Extract information into structured output formats (see the sketch after this list)
Answer questions about visual and text contents in documents
Summarize documents
Transcribe document content (for example, into HTML), preserving layouts and
formatting, for use in downstream applications (such as in RAG pipelines)
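For instance, structured extraction usually means asking the model to return JSON. Here is a minimal Kotlin sketch of that use case, assuming the generationConfig builder and its responseMimeType option in the Firebase AI Logic Kotlin SDK; treat it as an illustration rather than a definitive recipe:

// A minimal sketch of structured extraction: ask the model to return JSON
// instead of prose. Assumes the Firebase AI Logic Kotlin SDK's
// `generationConfig` builder and its `responseMimeType` option.
val structuredModel = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash",
    generationConfig = generationConfig {
        responseMimeType = "application/json" // Request JSON output
    }
)

// Describe the JSON shape you want in the prompt itself.
// Call from a coroutine scope; `generateContent` is a suspend function.
val response = structuredModel.generateContent(
    "Extract the title, authors, and publication year of this document as JSON."
)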
Click your Gemini API provider to view provider-specific content
and code on this page.
If you haven't already, complete the getting started guide, which describes how to
set up your Firebase project, connect your app to Firebase, add the SDK,
initialize the backend service for your chosen Gemini API provider, and
create a GenerativeModel instance.
For testing and iterating on your prompts, and even
getting a generated code snippet, we recommend using Google AI Studio.
Need a sample PDF file?
You can use this publicly available file with a MIME type of application/pdf:
https://storage.googleapis.com/cloud-samples-data/generative-ai/pdf/2403.05530.pdf
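If you want to fetch that sample programmatically, here's a minimal Kotlin sketch (the function name is our own); in a real Android app, run this off the main thread:

import java.net.URL

// Download the publicly available sample PDF into memory as raw bytes,
// ready to pass as inline data with the "application/pdf" MIME type.
fun fetchSamplePdf(): ByteArray =
    URL("https://storage.googleapis.com/cloud-samples-data/generative-ai/pdf/2403.05530.pdf")
        .readBytes()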
Generate text from PDF files (base64-encoded)
Before trying this sample, complete the Before you begin section of this guide
to set up your project and app. In that section, you'll also click a button for
your chosen Gemini API provider so that you see provider-specific content
on this page.
You can ask a Gemini model to generate text by prompting with text and PDFs,
providing each input file's mimeType and the file itself.
Find requirements and recommendations for input files later on this page.
Swift
You can call generateContent() to generate text from multimodal input of text and PDFs.
import FirebaseAI

// Initialize the Gemini Developer API backend service
let ai = FirebaseAI.firebaseAI(backend: .googleAI())

// Create a `GenerativeModel` instance with a model that supports your use case
let model = ai.generativeModel(modelName: "gemini-2.5-flash")

// Provide the PDF as `Data` with the appropriate MIME type
let pdf = try InlineDataPart(data: Data(contentsOf: pdfURL), mimeType: "application/pdf")

// Provide a text prompt to include with the PDF file
let prompt = "Summarize the important results in this report."

// To generate text output, call `generateContent` with the PDF file and text prompt
let response = try await model.generateContent(pdf, prompt)

// Print the generated text, handling the case where it might be nil
print(response.text ?? "No text in response.")
Kotlin
You can call generateContent() to generate text from multimodal input of text and PDFs.
For Kotlin, the methods in this SDK are suspend functions and need to be called
from a coroutine scope.
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
val generativeModel = Firebase.ai(backend = GenerativeBackend.googleAI())
    .generativeModel("gemini-2.5-flash")

val contentResolver = applicationContext.contentResolver

// Provide the URI for the PDF file you want to send to the model
val inputStream = contentResolver.openInputStream(pdfUri)

// Check if the PDF file loaded successfully
if (inputStream != null) {
    inputStream.use { stream ->
        // Provide a prompt that includes the PDF file specified above and text
        val prompt = content {
            inlineData(
                bytes = stream.readBytes(),
                mimeType = "application/pdf" // Specify the appropriate PDF file MIME type
            )
            text("Summarize the important results in this report.")
        }

        // To generate text output, call `generateContent` with the prompt
        val response = generativeModel.generateContent(prompt)

        // Log the generated text, handling the case where it might be null
        Log.d(TAG, response.text ?: "")
    }
} else {
    Log.e(TAG, "Error getting input stream for file.")
    // Handle the error appropriately
}
Java
You can call generateContent() to generate text from multimodal input of text and PDFs.
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel("gemini-2.5-flash");

// Use the GenerativeModelFutures Java compatibility layer which offers
// support for ListenableFuture and Publisher APIs
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

ContentResolver resolver = getApplicationContext().getContentResolver();

// Provide the URI for the PDF file you want to send to the model
try (InputStream stream = resolver.openInputStream(pdfUri)) {
    if (stream != null) {
        byte[] pdfBytes = stream.readAllBytes();

        // Provide a prompt that includes the PDF file specified above and text
        Content prompt = new Content.Builder()
                .addInlineData(pdfBytes, "application/pdf") // Specify the appropriate PDF file MIME type
                .addText("Summarize the important results in this report.")
                .build();

        // To generate text output, call `generateContent` with the prompt
        ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
        Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
            @Override
            public void onSuccess(GenerateContentResponse result) {
                String text = result.getText();
                Log.d(TAG, (text == null) ? "" : text);
            }

            @Override
            public void onFailure(Throwable t) {
                Log.e(TAG, "Failed to generate a response", t);
            }
        }, executor);
    } else {
        Log.e(TAG, "Error getting input stream for file.");
        // Handle the error appropriately
    }
} catch (IOException e) {
    Log.e(TAG, "Failed to read the pdf file", e);
}
Web
You can call generateContent() to generate text from multimodal input of text and PDFs.
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, { model: "gemini-2.5-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    // Strip the data URL prefix, keeping only the base64 payload
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the PDF file
  const prompt = "Summarize the important results in this report.";

  // Prepare PDF file for input
  const fileInputEl = document.querySelector("input[type=file]");
  const pdfPart = await fileToGenerativePart(fileInputEl.files[0]);

  // To generate text output, call `generateContent` with the text and PDF file
  const result = await model.generateContent([prompt, pdfPart]);

  // Log the generated text, handling the case where it might be undefined
  console.log(result.response.text() ?? "No text in response.");
}

run();
Dart
You can call generateContent() to generate text from multimodal input of text and PDFs.
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

// Initialize FirebaseApp
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
final model = FirebaseAI.googleAI().generativeModel(model: 'gemini-2.5-flash');

// Provide a text prompt to include with the PDF file
final prompt = TextPart("Summarize the important results in this report.");

// Prepare the PDF file for input
final doc = await File('document0.pdf').readAsBytes();

// Provide the PDF file as `Data` with the appropriate PDF file MIME type
final docPart = InlineDataPart('application/pdf', doc);

// To generate text output, call `generateContent` with the text and PDF file
final response = await model.generateContent([
  Content.multi([prompt, docPart])
]);

// Print the generated text
print(response.text);
Unity
You can call GenerateContentAsync() to generate text from multimodal input of text and PDFs.
using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
var ai = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI());

// Create a `GenerativeModel` instance with a model that supports your use case
var model = ai.GetGenerativeModel(modelName: "gemini-2.5-flash");

// Provide a text prompt to include with the PDF file
var prompt = ModelContent.Text("Summarize the important results in this report.");

// Provide the PDF file as `data` with the appropriate PDF file MIME type
var doc = ModelContent.InlineData("application/pdf",
    System.IO.File.ReadAllBytes(System.IO.Path.Combine(
        UnityEngine.Application.streamingAssetsPath, "document0.pdf")));

// To generate text output, call `GenerateContentAsync` with the text and PDF file
var response = await model.GenerateContentAsync(new[] { prompt, doc });

// Print the generated text
UnityEngine.Debug.Log(response.Text ?? "No text in response.");
Learn how to choose a model appropriate for your use case and app.
Stream the response
Before trying this sample, complete the Before you begin section of this guide
to set up your project and app. In that section, you'll also click a button for
your chosen Gemini API provider so that you see provider-specific content
on this page.
Instead of waiting for the entire result from the model, you can use streaming
to handle partial results and achieve faster interactions.
To stream the response, call generateContentStream().
Swift
You can call generateContentStream() to stream generated text from multimodal input of text and PDFs.
import FirebaseAI

// Initialize the Gemini Developer API backend service
let ai = FirebaseAI.firebaseAI(backend: .googleAI())

// Create a `GenerativeModel` instance with a model that supports your use case
let model = ai.generativeModel(modelName: "gemini-2.5-flash")

// Provide the PDF as `Data` with the appropriate MIME type
let pdf = try InlineDataPart(data: Data(contentsOf: pdfURL), mimeType: "application/pdf")

// Provide a text prompt to include with the PDF file
let prompt = "Summarize the important results in this report."

// To stream generated text output, call `generateContentStream` with the PDF file and text prompt
let contentStream = try model.generateContentStream(pdf, prompt)

// Print the generated text, handling the case where it might be nil
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}
Kotlin
You can call generateContentStream() to stream generated text from multimodal input of text and PDFs.
For Kotlin, the methods in this SDK are suspend functions and need to be called
from a coroutine scope.
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
val generativeModel = Firebase.ai(backend = GenerativeBackend.googleAI())
    .generativeModel("gemini-2.5-flash")

val contentResolver = applicationContext.contentResolver

// Provide the URI for the PDF you want to send to the model
val inputStream = contentResolver.openInputStream(pdfUri)

// Check if the PDF file loaded successfully
if (inputStream != null) {
    inputStream.use { stream ->
        // Provide a prompt that includes the PDF file specified above and text
        val prompt = content {
            inlineData(
                bytes = stream.readBytes(),
                mimeType = "application/pdf" // Specify the appropriate PDF file MIME type
            )
            text("Summarize the important results in this report.")
        }

        // To stream generated text output, call `generateContentStream` with the prompt
        var fullResponse = ""
        generativeModel.generateContentStream(prompt).collect { chunk ->
            // Log the generated text, handling the case where it might be null
            val chunkText = chunk.text ?: ""
            Log.d(TAG, chunkText)
            fullResponse += chunkText
        }
    }
} else {
    Log.e(TAG, "Error getting input stream for file.")
    // Handle the error appropriately
}
Java
You can call generateContentStream() to stream generated text from multimodal input of text and PDFs.
For Java, the streaming methods in this SDK return a Publisher type from the Reactive Streams library.
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel("gemini-2.5-flash");

// Use the GenerativeModelFutures Java compatibility layer which offers
// support for ListenableFuture and Publisher APIs
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

ContentResolver resolver = getApplicationContext().getContentResolver();

// Provide the URI for the PDF file you want to send to the model
try (InputStream stream = resolver.openInputStream(pdfUri)) {
    if (stream != null) {
        byte[] pdfBytes = stream.readAllBytes();

        // Provide a prompt that includes the PDF file specified above and text
        Content prompt = new Content.Builder()
                .addInlineData(pdfBytes, "application/pdf") // Specify the appropriate PDF file MIME type
                .addText("Summarize the important results in this report.")
                .build();

        // To stream generated text output, call `generateContentStream` with the prompt
        Publisher<GenerateContentResponse> streamingResponse =
                model.generateContentStream(prompt);

        StringBuilder fullResponse = new StringBuilder();
        streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
            @Override
            public void onNext(GenerateContentResponse generateContentResponse) {
                String chunk = generateContentResponse.getText();
                String text = (chunk == null) ? "" : chunk;
                Log.d(TAG, text);
                fullResponse.append(text);
            }

            @Override
            public void onComplete() {
                Log.d(TAG, fullResponse.toString());
            }

            @Override
            public void onError(Throwable t) {
                Log.e(TAG, "Failed to generate a response", t);
            }

            @Override
            public void onSubscribe(Subscription s) {
                // Request items so the publisher starts emitting chunks
                s.request(Long.MAX_VALUE);
            }
        });
    } else {
        Log.e(TAG, "Error getting input stream for file.");
        // Handle the error appropriately
    }
} catch (IOException e) {
    Log.e(TAG, "Failed to read the pdf file", e);
}
Web
You can call generateContentStream() to stream generated text from multimodal input of text and PDFs.
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, { model: "gemini-2.5-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    // Strip the data URL prefix, keeping only the base64 payload
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the PDF file
  const prompt = "Summarize the important results in this report.";

  // Prepare PDF file for input
  const fileInputEl = document.querySelector("input[type=file]");
  const pdfPart = await fileToGenerativePart(fileInputEl.files[0]);

  // To stream generated text output, call `generateContentStream` with the text and PDF file
  const result = await model.generateContentStream([prompt, pdfPart]);

  // Log the generated text
  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();
Dart
You can call generateContentStream() to stream generated text from multimodal input of text and PDFs.
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

// Initialize FirebaseApp
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
final model = FirebaseAI.googleAI().generativeModel(model: 'gemini-2.5-flash');

// Provide a text prompt to include with the PDF file
final prompt = TextPart("Summarize the important results in this report.");

// Prepare the PDF file for input
final doc = await File('document0.pdf').readAsBytes();

// Provide the PDF file as `Data` with the appropriate PDF file MIME type
final docPart = InlineDataPart('application/pdf', doc);

// To stream generated text output, call `generateContentStream` with the text and PDF file
final response = model.generateContentStream([
  Content.multi([prompt, docPart])
]);

// Print the generated text
await for (final chunk in response) {
  print(chunk.text);
}
Unity
You can call GenerateContentStreamAsync() to stream generated text from multimodal input of text and PDFs.

using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
var ai = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI());

// Create a `GenerativeModel` instance with a model that supports your use case
var model = ai.GetGenerativeModel(modelName: "gemini-2.5-flash");

// Provide a text prompt to include with the PDF file
var prompt = ModelContent.Text("Summarize the important results in this report.");

// Provide the PDF file as `data` with the appropriate PDF file MIME type
var doc = ModelContent.InlineData("application/pdf",
    System.IO.File.ReadAllBytes(System.IO.Path.Combine(
        UnityEngine.Application.streamingAssetsPath, "document0.pdf")));

// To stream generated text output, call `GenerateContentStreamAsync` with the text and PDF file
var responseStream = model.GenerateContentStreamAsync(new[] { prompt, doc });

// Print the generated text
await foreach (var response in responseStream) {
  if (!string.IsNullOrWhiteSpace(response.Text)) {
    UnityEngine.Debug.Log(response.Text);
  }
}
Learn how to choose a model appropriate for your use case and app.
Requirements and recommendations for input documents
Note that a file provided as inline data is encoded to base64 in transit, which
increases the size of the request. You get an HTTP 413 error if a request is
too large.
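Base64 encoding emits 4 output bytes for every 3 input bytes, so inline data grows to roughly 4/3 of its raw size. A small Kotlin sketch (the helper name is our own) that estimates the encoded size before you attach a file inline:

import java.io.File

// Estimate the base64-encoded size of a payload: every 3 raw bytes become
// a 4-byte output group, rounded up to a whole group.
fun estimateBase64Size(rawBytes: Long): Long = ((rawBytes + 2) / 3) * 4

fun main() {
    val pdf = File("document0.pdf")
    println("Raw: ${pdf.length()} bytes; " +
        "~${estimateBase64Size(pdf.length())} bytes after base64 encoding")
}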
See "Supported input files and requirements" page to learn detailed information
about the following:
Gemini multimodal models support the following document MIME types:
PDF - application/pdf
Text - text/plain
Limits per request
PDFs are treated as images, so a single page of a PDF is treated as one
image. The number of pages allowed in a prompt is limited to the number of
images the Gemini multimodal models can support.
Maximum files per request: 3,000
Maximum pages per file: 1,000
Maximum size per file: 50 MB
What else can you do?
Learn how to count tokens before sending long prompts to the model (see the sketch after this list).
Set up Cloud Storage for Firebase so that you can include large files in your
multimodal requests and have a more managed solution for providing files in
prompts. Files can include images, PDFs, video, and audio.
Start thinking about preparing for production (see the production checklist).
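As an illustration of the token-count check in the first item above, here's a minimal Kotlin sketch, assuming the SDK's countTokens() method and a response with a totalTokens field:

// Count tokens before sending a long prompt. Call from a coroutine scope;
// `countTokens` is a suspend function in the Kotlin SDK.
val prompt = content {
    text("Summarize the important results in this report.")
}

val tokenResponse = generativeModel.countTokens(prompt)
Log.d(TAG, "Total tokens in prompt: ${tokenResponse.totalTokens}")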