Gemini 3 Pro Image Preview (Nano Banana Pro)

Overview

Gemini 3 Pro Image Preview (also known as Nano Banana Pro) is designed for professional asset production and complex instructions. The model features a "Thinking" reasoning step that refines composition before generation, real-world grounding via Google Search, and image output at up to 4K resolution.

Best for: complex graphic design, high-fidelity product mockups, accurate text rendering, and data visualizations requiring real-world grounding.

Model Variants

Model	Resolution	Credits / Image	Description
`gemini-3-pro-image-preview`	1K (1024px)	8	Professional quality with Thinking
`gemini-3-pro-image-preview-2k`	2K (2048px)	8	High-res professional output
`gemini-3-pro-image-preview-4k`	4K (4096px)	16	Ultra-high-res studio quality

The Nano Banana Pro 1K and 2K variants are priced identically, making 2K the better default choice when higher resolution is needed without additional cost. The 4K variant doubles the credit cost but delivers studio-grade output suitable for large-format prints.

Capabilities

Feature	Support
Text to Image	✅ Supported
Image Editing	✅ Supported
Batch Generation	✅ Up to 9 images per request
Max Input Images	5 (high fidelity), up to 14 total
Thinking	✅ Supported (default on)
Search Grounding	✅ Supported

Supported Aspect Ratios

1:1 · 16:9 · 9:16 · 4:3 · 3:4 · 2:3 · 3:2 · 4:5 · 5:4

Pricing

All prices are per image via the NanoBanana API and run roughly 40% below the official per-image price.

Variant	Our Price	Official Price	Savings
1K (1024px)	~$0.08	~$0.134	~40%
2K (2048px)	~$0.08	~$0.134	~40%
4K (4096px)	~$0.16	~$0.268	~40%

Since the Nano Banana Pro 1K and 2K variants share the same price, there is no reason to use 1K unless you specifically need smaller file sizes or faster response times.
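To make the pricing arithmetic concrete, here is a minimal sketch that estimates the cost of a job from the credit and price figures in the tables above. The helper name `estimate_cost` is hypothetical, and the dollar amounts are the approximate values listed on this page, so treat the result as an estimate rather than a billing figure.

```python
# Credits and approximate USD prices copied from the tables on this page.
CREDITS_PER_IMAGE = {
    "gemini-3-pro-image-preview": 8,
    "gemini-3-pro-image-preview-2k": 8,
    "gemini-3-pro-image-preview-4k": 16,
}
APPROX_USD_PER_IMAGE = {
    "gemini-3-pro-image-preview": 0.08,
    "gemini-3-pro-image-preview-2k": 0.08,
    "gemini-3-pro-image-preview-4k": 0.16,
}


def estimate_cost(model: str, images: int) -> tuple[int, float]:
    """Return (credits, approximate USD) for generating `images` images.

    Hypothetical helper for illustration; it is not part of the API.
    """
    if model not in CREDITS_PER_IMAGE:
        raise ValueError(f"unknown model: {model}")
    if images < 1:
        raise ValueError("images must be at least 1")
    credits = CREDITS_PER_IMAGE[model] * images
    usd = round(APPROX_USD_PER_IMAGE[model] * images, 2)
    return credits, usd


if __name__ == "__main__":
    # Compare a full batch of 9 images across the three variants.
    for model in CREDITS_PER_IMAGE:
        credits, usd = estimate_cost(model, 9)
        print(f"{model}: {credits} credits, about ${usd:.2f}")
```

For a batch of 9 images, the 1K and 2K variants both come to 72 credits, while the 4K variant comes to 144.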

Advanced Features

Nano Banana Pro Thinking Mode

Nano Banana Pro includes a built-in reasoning step called "Thinking" that plans the image composition before rendering. The Thinking step analyzes the prompt for:

Spatial relationships: Where objects should be placed relative to each other
Lighting consistency: Ensuring light sources and shadows align correctly
Text layout: Planning where text appears in the composition to avoid overlap
Style coherence: Matching artistic styles across all elements of the image

Thinking mode is enabled by default. It adds a small amount of latency but generally produces more accurate and detailed results, particularly for complex prompts.

Search Grounding

When grounding is enabled, the model can incorporate real-world knowledge from Google Search into the generation process. This is particularly valuable for:

Real locations: Generate accurate depictions of landmarks, cities, and natural sites
Current events: Create images that reference recent happenings with visual accuracy
Product accuracy: Generate images of real products with correct branding, colors, and proportions
Historical accuracy: Produce period-appropriate imagery with correct clothing, architecture, and artifacts

Multi-Image Input

You can upload up to 14 reference images per request, up to 5 of which are treated as high-fidelity references. This enables workflows such as:

Combining elements from multiple source images into a single composition
Maintaining character consistency across a series of generated images
Transferring artistic styles from reference artwork to new compositions
Recreating product layouts with different backgrounds or settings
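The limits above can be checked on the client before sending a request. The sketch below builds a request body with the `ref_images` parameter and enforces the 14-image cap. This page does not specify the element format of `ref_images`, so passing image URLs is an assumption, as are the helper names and the example.com URLs.

```python
MAX_REF_IMAGES = 14     # total cap stated on this page
MAX_HIGH_FIDELITY = 5   # high-fidelity cap stated on this page


def build_reference_payload(prompt, ref_images, model="gemini-3-pro-image-preview"):
    """Build a request body that includes reference images.

    Assumption: `ref_images` accepts a list of image URLs. The element
    format is not documented here, so confirm it against the API reference.
    """
    if not ref_images:
        raise ValueError("at least one reference image is required")
    if len(ref_images) > MAX_REF_IMAGES:
        raise ValueError(f"at most {MAX_REF_IMAGES} reference images are allowed")
    return {"prompt": prompt, "model": model, "ref_images": list(ref_images)}


def beyond_high_fidelity(ref_images) -> int:
    """Count how many references exceed the high-fidelity budget of 5."""
    return max(0, len(ref_images) - MAX_HIGH_FIDELITY)


if __name__ == "__main__":
    # Hypothetical URLs for illustration only.
    refs = [f"https://example.com/ref-{i}.png" for i in range(7)]
    payload = build_reference_payload("Place the product on a beach at sunset", refs)
    print(len(payload["ref_images"]), "references,",
          beyond_high_fidelity(refs), "beyond the high-fidelity budget")
```

Counting the references beyond the first 5 is only a planning aid; how the service chooses which images receive high-fidelity treatment is not stated on this page.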

Best Practices

When to Use This Model

Choose Nano Banana Pro over the Flash tier when your project requires:

Precise text rendering: Posters, infographics, or any image containing readable text
Complex compositions: Scenes with multiple subjects, specific spatial arrangements, or intricate detail
Factual accuracy: Images that must reflect real-world information (locations, products, data)
Studio-quality output: Marketing assets, professional presentations, or client-facing deliverables

Prompt Tips for Professional Results

Describe lighting explicitly: Specify "soft diffused northern light" or "dramatic side lighting with deep shadows" rather than relying on the model to guess.
Include material descriptors: Phrases like "brushed aluminum surface," "matte ceramic finish," or "glossy magazine print" help the model produce realistic textures.
Reference composition styles: Mention specific photography or art styles such as "product photography on seamless background," "editorial fashion spread layout," or "flat lay arrangement."
Use negative descriptions sparingly: While you can instruct the model to avoid certain elements, positive descriptions of what you want tend to produce more reliable results.
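One way to apply these tips consistently is to assemble prompts from named parts. The following is a minimal sketch; `build_prompt` and its argument names are hypothetical conveniences, and the API itself only receives the final string in `prompt`.

```python
def build_prompt(subject, materials=(), lighting=None, composition=None):
    """Join a subject with explicit material, lighting, and composition cues.

    Hypothetical helper: it only concatenates positive descriptions,
    following the tips above, and skips any part that is not provided.
    """
    parts = [subject.strip()]
    parts.extend(m.strip() for m in materials)
    if lighting:
        parts.append(lighting.strip())
    if composition:
        parts.append(composition.strip())
    return ", ".join(parts)


if __name__ == "__main__":
    print(build_prompt(
        "A sleek perfume bottle",
        materials=["brushed aluminum surface", "matte ceramic finish"],
        lighting="soft diffused northern light",
        composition="product photography on seamless background",
    ))
```

Keeping each cue in its own argument makes it easy to vary one element, such as the lighting, while holding the rest of the prompt fixed.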

Use Cases

Professional asset production — Studio-quality images for commercial advertising campaigns
Complex graphic design — Follow intricate multi-step instructions accurately for posters and packaging
Accurate text in images — Precise text rendering for advertisements, infographics, and social media graphics
Product mockups — High-fidelity commercial photography with accurate logo and branding integration
Data-driven visualizations — Generate charts, diagrams, and infographics grounded with real data from Google Search
Style transfer — Mix and blend artistic styles within a single composition using reference images
Editorial content — Magazine covers, book illustrations, and feature article headers with professional polish
Architecture visualization — Render interior and exterior design concepts with realistic lighting and materials

Quick Start

Replace YOUR_API_KEY with your actual API key. Don't have one yet? Create an API key here.

curl -X POST "https://api.nanobananaapi.dev/v1/images/generate" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A professional product shot of a sleek perfume bottle on a marble surface with dramatic studio lighting",
    "num": 1,
    "model": "gemini-3-pro-image-preview",
    "image_size": "4:3"
  }'

const res = await fetch('https://api.nanobananaapi.dev/v1/images/generate', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    prompt: 'A professional product shot of a sleek perfume bottle on a marble surface with dramatic studio lighting',
    num: 1,
    model: 'gemini-3-pro-image-preview',
    image_size: '4:3',
  }),
});

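// Fail early on HTTP errors before parsing the body.
if (!res.ok) throw new Error(`Request failed with status ${res.status}`);

const result = await res.json();
console.log(result.data.url);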

import requests

res = requests.post(
  'https://api.nanobananaapi.dev/v1/images/generate',
  headers={
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  json={
    'prompt': 'A professional product shot of a sleek perfume bottle on a marble surface with dramatic studio lighting',
    'num': 1,
    'model': 'gemini-3-pro-image-preview',
    'image_size': '4:3',
  },
  timeout=60,
)

res.raise_for_status()  # fail early on HTTP errors before parsing the body
result = res.json()
print(result['data']['url'])

API Parameters Reference

Parameter	Type	Required	Description
`prompt`	string	Yes	The text description of the image to generate
`model`	string	Yes	Model identifier (see variants above)
`num`	integer	No	Number of images to generate (1–9, default 1)
`image_size`	string	No	Aspect ratio (default `1:1`)
`ref_images`	array	No	Reference images for style or character guidance
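The constraints in this table can be validated locally before a request is sent. Here is a minimal sketch, assuming the model identifiers, aspect ratios, and 1 to 9 range listed on this page are the complete set of accepted values; `build_payload` is a hypothetical helper, not part of the API.

```python
SUPPORTED_MODELS = {
    "gemini-3-pro-image-preview",
    "gemini-3-pro-image-preview-2k",
    "gemini-3-pro-image-preview-4k",
}
SUPPORTED_ASPECT_RATIOS = {
    "1:1", "16:9", "9:16", "4:3", "3:4", "2:3", "3:2", "4:5", "5:4",
}


def build_payload(prompt, model, num=1, image_size="1:1"):
    """Validate parameters against the documented limits and return a request body."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unknown model: {model}")
    if not 1 <= num <= 9:
        raise ValueError("num must be between 1 and 9")
    if image_size not in SUPPORTED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {image_size}")
    return {"prompt": prompt, "model": model, "num": num, "image_size": image_size}


if __name__ == "__main__":
    print(build_payload("A lighthouse at dawn", "gemini-3-pro-image-preview-2k",
                        num=3, image_size="16:9"))
```

The returned dictionary can be passed as the `json` argument of `requests.post`, as in the Quick Start example.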

Frequently Asked Questions

Why does Nano Banana Pro cost more than the Flash tier? The Pro tier uses a more sophisticated pipeline with an additional reasoning step (Thinking) and access to Google Search for grounding. This produces higher-quality results but requires more computational resources per image.

When should I use 4K resolution? Use 4K for print-ready materials, large-format displays (billboards, trade show banners), and any scenario where the image will be viewed at close range on a high-DPI screen. For web use, 1K or 2K is typically sufficient and more cost-effective.
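As a rough guide to when 4K matters for print, the sketch below converts each variant's pixel size into a print width. It assumes the nominal figure (for example 4096px) is the long-edge pixel count and uses 300 DPI, a common rule of thumb for close-range print; neither assumption comes from the API itself.

```python
# Nominal pixel sizes from the Model Variants table.
VARIANT_PIXELS = {"1K": 1024, "2K": 2048, "4K": 4096}


def print_size_inches(pixels: int, dpi: int = 300) -> float:
    """Return the print length in inches for a pixel count at a given DPI."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return pixels / dpi


if __name__ == "__main__":
    for name, px in VARIANT_PIXELS.items():
        print(f"{name}: {print_size_inches(px):.1f} in at 300 DPI")
    # 1K: 3.4 in at 300 DPI
    # 2K: 6.8 in at 300 DPI
    # 4K: 13.7 in at 300 DPI
```

Large-format pieces such as banners are usually viewed from farther away and printed at a lower DPI, so the same 4096 pixels cover about 27.3 inches at 150 DPI.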

Can I disable Thinking mode? Thinking mode is enabled by default and is recommended for best results. Disabling it may reduce latency but can result in less coherent compositions, especially for complex scenes with multiple subjects or text elements.

How accurate is the text rendering? Text accuracy depends on factors like font style, text length, and overall image complexity. For short phrases and headlines, accuracy is generally very high. For longer paragraphs or small text sizes, consider using a dedicated text overlay tool after generation.

Text to Image API — Full API reference for image generation
Image to Image API — Image editing and transformation
Gemini 2.5 Flash Image — Fastest speed, lowest cost
Gemini 3.1 Flash Image Preview — High efficiency with extended resolutions
