
Gemini Models Overview

Gemini is a family of multimodal generative AI models developed by Google DeepMind, designed for state-of-the-art performance across text, code, images, audio, and video understanding.

Model Family

The Gemini family includes several model variants optimized for different use cases:

Gemini 3.1 Pro

Gemini 3.1 Pro is Google’s latest flagship model with enhanced capabilities:
  • Enhanced stability & grounding: Improved factuality and reduced repetitive response patterns
  • Advanced coding & agentic capabilities: Significant improvements in software engineering and agentic performance
  • Efficiency & reasoning modes: Improved token efficiency, with a Medium thinking level that balances reasoning depth against latency
  • Core quality improvements: Advanced reasoning, instruction following, and creative writing

Gemini 3 Flash

Optimized for speed and efficiency while maintaining high-quality output:
  • Fast response times for real-time applications
  • Cost-effective for high-volume workloads
  • Supports multimodal inputs (text, images, video, audio)
  • Excellent for chat, code generation, and content analysis

Gemini 2.5 Models

Previous generation models with strong performance:
  • Gemini 2.5 Pro: Balanced performance for complex reasoning
  • Gemini 2.5 Flash: Speed-optimized variant
  • Gemini 2.5 Flash Lite: Ultra-lightweight for edge deployments

Key Capabilities

Multimodal Understanding

Process and understand text, images, video, audio, and PDFs in a single unified model

Advanced Reasoning

Dynamic thinking with configurable reasoning depth for complex problem-solving

Code Generation

Generate, execute, and debug Python code with built-in code execution
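
As a minimal sketch, enabling the built-in code-execution tool looks like the following. It assumes an authenticated `client` (see Getting Started below) and valid project credentials, so it is not runnable standalone:

```python
from google.genai.types import GenerateContentConfig, Tool, ToolCodeExecution

# Let the model write and run Python to answer the question.
response = client.models.generate_content(
    model="gemini-3.1-pro-preview",
    contents="What is the sum of the first 50 prime numbers? Compute it.",
    config=GenerateContentConfig(
        tools=[Tool(code_execution=ToolCodeExecution())]
    ),
)

# The response interleaves generated code and its execution output.
for part in response.candidates[0].content.parts:
    if part.executable_code:
        print(part.executable_code.code)
    if part.code_execution_result:
        print(part.code_execution_result.output)
```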

Function Calling

Integrate with external APIs and tools through structured function declarations
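
For illustration, a function declaration is an OpenAPI-style JSON schema describing a tool's name and parameters. The `get_weather` tool below is hypothetical, but the overall shape matches what the SDK's `FunctionDeclaration` expects:

```python
# Hypothetical tool: a weather lookup the model can request.
get_weather_declaration = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "City name, e.g. 'Paris'",
            },
        },
        "required": ["city"],
    },
}
```

With the Python SDK you can also pass a plain Python function in `tools=[...]` and let automatic function calling derive this schema from its signature and docstring.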

Grounding

Connect to real-time data via Google Search or custom data sources
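
A sketch of grounding with Google Search, assuming an authenticated `client` (credentials required, so not runnable standalone):

```python
from google.genai.types import GenerateContentConfig, GoogleSearch, Tool

# Ground the answer in live Google Search results.
response = client.models.generate_content(
    model="gemini-3.1-pro-preview",
    contents="Who won the most recent Formula 1 drivers' championship?",
    config=GenerateContentConfig(tools=[Tool(google_search=GoogleSearch())]),
)
print(response.text)
```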

Long Context

Process up to millions of tokens with context caching for cost optimization
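
As a sketch of context caching, you can cache a large input once and reference it from later requests, paying the full input cost only at cache-creation time. This assumes an authenticated `client`; the placeholder document text is hypothetical:

```python
from google.genai.types import CreateCachedContentConfig, GenerateContentConfig

# Cache a large document once, then reuse it across many requests.
cache = client.caches.create(
    model="gemini-3.1-pro-preview",
    config=CreateCachedContentConfig(
        contents=["<large document text here>"],
        ttl="3600s",  # keep the cache for one hour
    ),
)

response = client.models.generate_content(
    model="gemini-3.1-pro-preview",
    contents="Summarize the cached document in three bullet points.",
    config=GenerateContentConfig(cached_content=cache.name),
)
```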

Available Models

| Model ID | Context Window | Strengths | Best For |
| --- | --- | --- | --- |
| gemini-3.1-pro-preview | 2M tokens | Advanced reasoning, coding | Complex tasks, agents |
| gemini-3-flash-preview | 1M tokens | Speed, efficiency | Real-time apps, high volume |
| gemini-2.5-pro | 2M tokens | Balanced performance | General purpose |
| gemini-2.5-flash | 1M tokens | Fast responses | Production workloads |
| gemini-2.5-flash-lite | 1M tokens | Lightweight | Edge deployment |

Getting Started

1. Install the SDK

Install the Google Gen AI SDK for Python:
pip install --upgrade google-genai

2. Set up authentication

Authenticate your environment and make sure the Vertex AI API is enabled for your project. In a Colab notebook:
from google.colab import auth

auth.authenticate_user()
Outside Colab, run `gcloud auth application-default login` instead.

3. Initialize the client

Create a client with your project details:
from google import genai

client = genai.Client(
    vertexai=True,
    project="your-project-id",
    location="global"
)

4. Generate content

Send your first request:
response = client.models.generate_content(
    model="gemini-3.1-pro-preview",
    contents="Explain how AI works in simple terms."
)
print(response.text)
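
For interactive applications, the same request can be streamed so tokens print as they are generated rather than after the full reply. This uses the same `client` as above and requires valid credentials:

```python
# Stream tokens as they are generated instead of waiting for the full reply.
for chunk in client.models.generate_content_stream(
    model="gemini-3.1-pro-preview",
    contents="Explain how AI works in simple terms.",
):
    print(chunk.text, end="")
```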

Model Configuration

Control model behavior with generation parameters:
from google.genai.types import GenerateContentConfig, ThinkingConfig, ThinkingLevel

response = client.models.generate_content(
    model="gemini-3.1-pro-preview",
    contents="How does AI work?",
    config=GenerateContentConfig(
        temperature=1.0,          # Creativity (0.0-2.0)
        top_p=0.95,              # Nucleus sampling
        max_output_tokens=8000,  # Response length limit
        thinking_config=ThinkingConfig(
            thinking_level=ThinkingLevel.LOW  # Reasoning depth
        )
    )
)
For Gemini 3 models, we recommend keeping temperature at the default value of 1.0 as the reasoning capabilities are optimized for this setting.

Thinking Levels

Gemini 3.1 Pro introduces granular control over reasoning depth:
  • Low (1-1K tokens): Minimizes latency for simple tasks
  • Medium (1K-16K tokens): Balances reasoning and latency
  • High (16K-32K tokens): Maximizes reasoning depth (default)
from google.genai.types import GenerateContentConfig, ThinkingConfig, ThinkingLevel

# For faster responses on simple queries
config = GenerateContentConfig(
    thinking_config=ThinkingConfig(
        thinking_level=ThinkingLevel.LOW
    )
)

# For complex reasoning tasks
config = GenerateContentConfig(
    thinking_config=ThinkingConfig(
        thinking_level=ThinkingLevel.HIGH
    )
)

Media Resolution Control

Gemini 3 allows fine-grained control over image and video processing:
from google.genai.types import (
    FileData,
    Part,
    PartMediaResolution,
    PartMediaResolutionLevel,
)

response = client.models.generate_content(
    model="gemini-3.1-pro-preview",
    contents=[
        Part(
            file_data=FileData(
                file_uri="gs://path/to/image.jpg",
                mime_type="image/jpeg"
            ),
            media_resolution=PartMediaResolution(
                level=PartMediaResolutionLevel.MEDIA_RESOLUTION_HIGH
            )
        ),
        "Analyze this image in detail."
    ]
)

Safety Settings

Configure content filtering for responsible AI:
from google.genai.types import SafetySetting, HarmCategory, HarmBlockThreshold

safety_settings = [
    SafetySetting(
        category=HarmCategory.HARM_CATEGORY_HARASSMENT,
        threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE
    ),
    SafetySetting(
        category=HarmCategory.HARM_CATEGORY_HATE_SPEECH,
        threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE
    )
]

response = client.models.generate_content(
    model="gemini-3.1-pro-preview",
    contents="Your prompt here",
    config=GenerateContentConfig(safety_settings=safety_settings)
)

Next Steps

Getting Started

Learn basic text generation and model configuration

Multimodal Inputs

Process images, video, audio, and documents

Function Calling

Connect Gemini to external tools and APIs

Grounding

Ground responses in Google Search or custom data
