Gemini 3 Pro Image Preview (Nano Banana Pro)
Overview
Gemini 3 Pro Image Preview (also known as Nano Banana Pro) is designed for professional asset production and complex instructions. It adds a "Thinking" reasoning step that refines composition before generation, grounds output in real-world information through Google Search, and generates images at up to 4K resolution.
Best for: complex graphic design, high-fidelity product mockups, accurate text rendering, and data visualizations requiring real-world grounding.
Model Variants
| Model | Resolution | Credits / Image | Description |
|---|---|---|---|
| gemini-3-pro-image-preview | 1K (1024px) | 8 | Professional quality with Thinking |
| gemini-3-pro-image-preview-2k | 2K (2048px) | 8 | High-res professional output |
| gemini-3-pro-image-preview-4k | 4K (4096px) | 16 | Ultra-high-res studio quality |
The Nano Banana Pro 1K and 2K variants are priced identically, making 2K the better default choice when higher resolution is needed without additional cost. The 4K variant doubles the credit cost but delivers studio-grade output suitable for large-format prints.
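As a rough illustration of the table above, the credit cost of a request can be estimated by multiplying the per-image rate by the number of images. This is a minimal sketch that assumes batches are billed linearly per image. The helper name `estimate_credits` is hypothetical and not part of the API.

```python
# Credits per image, copied from the Model Variants table.
CREDITS_PER_IMAGE = {
    "gemini-3-pro-image-preview": 8,      # 1K
    "gemini-3-pro-image-preview-2k": 8,   # 2K
    "gemini-3-pro-image-preview-4k": 16,  # 4K
}


def estimate_credits(model: str, num_images: int = 1) -> int:
    """Estimate credits for one request (assumes linear per-image billing)."""
    if model not in CREDITS_PER_IMAGE:
        raise ValueError(f"unknown model variant: {model}")
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    return CREDITS_PER_IMAGE[model] * num_images


print(estimate_credits("gemini-3-pro-image-preview-2k", 4))  # 32
print(estimate_credits("gemini-3-pro-image-preview-4k", 4))  # 64
```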
Capabilities
| Feature | Support |
|---|---|
| Text to Image | ✅ Supported |
| Image Editing | ✅ Supported |
| Batch Generation | ✅ Up to 9 images per request |
| Max Input Images | 5 (high fidelity), up to 14 total |
| Thinking | ✅ Supported (default on) |
| Search Grounding | ✅ Supported |
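Because a single request is capped at 9 images, larger jobs have to be split across several requests. The sketch below shows one way to plan those batches. The function name `split_into_batches` is hypothetical, and the code only computes batch sizes without calling the API.

```python
MAX_IMAGES_PER_REQUEST = 9  # batch limit from the Capabilities table


def split_into_batches(total_images: int,
                       batch_limit: int = MAX_IMAGES_PER_REQUEST) -> list[int]:
    """Return the `num` value to send with each request, largest batches first."""
    if total_images < 0:
        raise ValueError("total_images must not be negative")
    full_batches, remainder = divmod(total_images, batch_limit)
    return [batch_limit] * full_batches + ([remainder] if remainder else [])


print(split_into_batches(20))  # [9, 9, 2]
```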
Supported Aspect Ratios
1:1 · 16:9 · 9:16 · 4:3 · 3:4 · 2:3 · 3:2 · 4:5 · 5:4
Pricing
Pricing is per image through the NanoBanana API and runs roughly 40% below the official price for every variant.
| Variant | Our Price | Official Price | Savings |
|---|---|---|---|
| 1K (1024px) | ~$0.08 | ~$0.134 | ~40% |
| 2K (2048px) | ~$0.08 | ~$0.134 | ~40% |
| 4K (4096px) | ~$0.16 | ~$0.268 | ~40% |
Since the Nano Banana Pro 1K and 2K variants share the same price, there is no reason to use 1K unless you specifically need smaller file sizes or faster response times.
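The savings column can be reproduced from the two price columns. The sketch below uses the approximate prices from the table, so the result is itself approximate. The function name `savings_percent` is illustrative only.

```python
# (our price, official price) in USD per image, from the Pricing table.
PRICES = {
    "1K": (0.08, 0.134),
    "2K": (0.08, 0.134),
    "4K": (0.16, 0.268),
}


def savings_percent(our_price: float, official_price: float) -> float:
    """Percentage saved relative to the official per-image price."""
    return (1 - our_price / official_price) * 100


for variant, (ours, official) in PRICES.items():
    print(f"{variant}: {savings_percent(ours, official):.1f}% savings")
# 1K: 40.3% savings
# 2K: 40.3% savings
# 4K: 40.3% savings
```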
Advanced Features
Nano Banana Pro Thinking Mode
Nano Banana Pro includes a built-in reasoning step called "Thinking" that plans the image composition before rendering. The Thinking step analyzes the prompt for:
- Spatial relationships: Where objects should be placed relative to each other
- Lighting consistency: Ensuring light sources and shadows align correctly
- Text layout: Planning where text appears in the composition to avoid overlap
- Style coherence: Matching artistic styles across all elements of the image
Thinking mode is enabled by default and generally produces more accurate and detailed results than models without it. It adds a small amount of latency, but the gain in output quality is most noticeable on complex prompts.
Search Grounding
When grounding is enabled, the model can incorporate real-world knowledge from Google Search into the generation process. This is particularly valuable for:
- Real locations: Generate accurate depictions of landmarks, cities, and natural sites
- Current events: Create images that reference recent happenings with visual accuracy
- Product accuracy: Generate images of real products with correct branding, colors, and proportions
- Historical accuracy: Produce period-appropriate imagery with correct clothing, architecture, and artifacts
Multi-Image Input
You can upload up to 14 reference images per request, with up to 5 designated as high-fidelity references. This enables workflows such as:
- Combining elements from multiple source images into a single composition
- Maintaining character consistency across a series of generated images
- Transferring artistic styles from reference artwork to new compositions
- Recreating product layouts with different backgrounds or settings
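A request that uses reference images might be assembled as sketched below. The `ref_images` parameter name comes from the parameters reference, but the entry format is an assumption: this sketch passes plain URL strings, and the actual API may expect something else (for example base64 data). The helper `build_reference_payload` and the example URLs are hypothetical.

```python
MAX_REF_IMAGES = 14  # total reference images allowed per request


def build_reference_payload(prompt: str, ref_image_urls: list[str],
                            model: str = "gemini-3-pro-image-preview") -> dict:
    """Build a request body with reference images, enforcing the 14-image cap."""
    if len(ref_image_urls) > MAX_REF_IMAGES:
        raise ValueError(
            f"at most {MAX_REF_IMAGES} reference images are allowed, "
            f"got {len(ref_image_urls)}"
        )
    return {
        "prompt": prompt,
        "model": model,
        # Assumption: entries are URL strings; check the API reference.
        "ref_images": list(ref_image_urls),
    }


payload = build_reference_payload(
    "The same character standing on a rainy street at night",
    ["https://example.com/character-front.png",
     "https://example.com/character-side.png"],
)
print(len(payload["ref_images"]))  # 2
```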
Best Practices
When to Use This Model
Choose Nano Banana Pro over the Flash tier when your project requires:
- Precise text rendering: Posters, infographics, or any image containing readable text
- Complex compositions: Scenes with multiple subjects, specific spatial arrangements, or intricate detail
- Factual accuracy: Images that must reflect real-world information (locations, products, data)
- Studio-quality output: Marketing assets, professional presentations, or client-facing deliverables
Prompt Tips for Professional Results
- Describe lighting explicitly: Specify "soft diffused northern light" or "dramatic side lighting with deep shadows" rather than relying on the model to guess.
- Include material descriptors: Phrases like "brushed aluminum surface," "matte ceramic finish," or "glossy magazine print" help the model produce realistic textures.
- Reference composition styles: Mention specific photography or art styles such as "product photography on seamless background," "editorial fashion spread layout," or "flat lay arrangement."
- Use negative descriptions sparingly: While you can instruct the model to avoid certain elements, positive descriptions of what you want tend to produce more reliable results.
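One way to apply these tips consistently is to assemble prompts from named parts, so lighting, materials, and composition are never left out by accident. This is only a sketch of a client-side helper. `compose_prompt` is a hypothetical name, and the model does not require any particular ordering of the parts.

```python
from typing import Optional, Sequence


def compose_prompt(subject: str,
                   lighting: Optional[str] = None,
                   materials: Sequence[str] = (),
                   composition: Optional[str] = None) -> str:
    """Join the subject with optional material, lighting, and composition cues."""
    parts = [subject.strip()]
    parts.extend(m.strip() for m in materials if m.strip())
    if lighting:
        parts.append(lighting.strip())
    if composition:
        parts.append(composition.strip())
    return ", ".join(parts)


print(compose_prompt(
    "A sleek perfume bottle",
    lighting="dramatic side lighting with deep shadows",
    materials=["matte ceramic finish"],
    composition="product photography on seamless background",
))
# A sleek perfume bottle, matte ceramic finish, dramatic side lighting with deep shadows, product photography on seamless background
```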
Use Cases
- Professional asset production — Studio-quality images for commercial advertising campaigns
- Complex graphic design — Follow intricate multi-step instructions accurately for posters and packaging
- Accurate text in images — Precise text rendering for advertisements, infographics, and social media graphics
- Product mockups — High-fidelity commercial photography with accurate logo and branding integration
- Data-driven visualizations — Generate charts, diagrams, and infographics grounded with real data from Google Search
- Style transfer — Mix and blend artistic styles within a single composition using reference images
- Editorial content — Magazine covers, book illustrations, and feature article headers with professional polish
- Architecture visualization — Render interior and exterior design concepts with realistic lighting and materials
Quick Start
Replace YOUR_API_KEY with your actual API key. Don't have one yet? Create an API key here.
```bash
curl -X POST "https://api.nanobananaapi.dev/v1/images/generate" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A professional product shot of a sleek perfume bottle on a marble surface with dramatic studio lighting",
    "num": 1,
    "model": "gemini-3-pro-image-preview",
    "image_size": "4:3"
  }'
```

```javascript
const res = await fetch('https://api.nanobananaapi.dev/v1/images/generate', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    prompt: 'A professional product shot of a sleek perfume bottle on a marble surface with dramatic studio lighting',
    num: 1,
    model: 'gemini-3-pro-image-preview',
    image_size: '4:3',
  }),
});
const result = await res.json();
console.log(result.data.url);
```

```python
import requests

res = requests.post(
    'https://api.nanobananaapi.dev/v1/images/generate',
    headers={
        'Authorization': 'Bearer YOUR_API_KEY',
        'Content-Type': 'application/json',
    },
    json={
        'prompt': 'A professional product shot of a sleek perfume bottle on a marble surface with dramatic studio lighting',
        'num': 1,
        'model': 'gemini-3-pro-image-preview',
        'image_size': '4:3',
    },
    timeout=60,
)
result = res.json()
print(result['data']['url'])
```

API Parameters Reference
| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | The text description of the image to generate |
| model | string | Yes | Model identifier (see variants above) |
| num | integer | No | Number of images to generate (1–9, default 1) |
| image_size | string | No | Aspect ratio (default 1:1) |
| ref_images | array | No | Reference images for style or character guidance |
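Checking a request body against the documented constraints before sending it can catch mistakes without spending credits. The sketch below validates the required fields, the 1–9 range for `num`, and the supported aspect ratios. The function name `build_generate_payload` is hypothetical, and the server may enforce further rules not listed here.

```python
SUPPORTED_ASPECT_RATIOS = {
    "1:1", "16:9", "9:16", "4:3", "3:4", "2:3", "3:2", "4:5", "5:4",
}
MODEL_VARIANTS = {
    "gemini-3-pro-image-preview",
    "gemini-3-pro-image-preview-2k",
    "gemini-3-pro-image-preview-4k",
}


def build_generate_payload(prompt: str, model: str,
                           num: int = 1, image_size: str = "1:1") -> dict:
    """Validate parameters client-side and return the JSON body as a dict."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if model not in MODEL_VARIANTS:
        raise ValueError(f"unknown model variant: {model}")
    if not 1 <= num <= 9:
        raise ValueError("num must be between 1 and 9")
    if image_size not in SUPPORTED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {image_size}")
    return {"prompt": prompt, "model": model, "num": num, "image_size": image_size}


body = build_generate_payload(
    "A flat lay arrangement of brushed aluminum kitchen tools",
    "gemini-3-pro-image-preview-2k",
    num=3,
    image_size="4:5",
)
print(body["num"], body["image_size"])  # 3 4:5
```

The returned dictionary can be passed as the `json=` argument of the `requests.post` call shown in the Quick Start.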
Frequently Asked Questions
Why does Nano Banana Pro cost more than the Flash tier?
The Pro tier uses a more sophisticated pipeline with an additional reasoning step (Thinking) and access to Google Search for grounding. This produces higher-quality results but requires more computational resources per image.
When should I use 4K resolution?
Use 4K for print-ready materials, large-format displays (billboards, trade show banners), and any scenario where the image will be viewed at close range on a high-DPI screen. For web use, 1K or 2K is typically sufficient and more cost-effective.
Can I disable Thinking mode?
Thinking mode is enabled by default and is recommended for best results. Disabling it may reduce latency but can result in less coherent compositions, especially for complex scenes with multiple subjects or text elements.
How accurate is the text rendering?
Text accuracy depends on factors like font style, text length, and overall image complexity. For short phrases and headlines, accuracy is generally very high. For longer paragraphs or small text sizes, consider using a dedicated text overlay tool after generation.
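For the 4K question, a quick calculation shows how far each resolution stretches in print. The sketch below assumes the pixel figure in the variants table is the image's long edge and uses 300 DPI, a common target for close-viewed print. Large-format pieces viewed from a distance are often printed at much lower DPI. The helper name `max_print_inches` is illustrative.

```python
# Pixel sizes from the Model Variants table (assumed to be the long edge).
LONG_EDGE_PX = {"1K": 1024, "2K": 2048, "4K": 4096}


def max_print_inches(pixels: int, dpi: int = 300) -> float:
    """Longest printable edge, in inches, at the given print density."""
    return pixels / dpi


for variant, pixels in LONG_EDGE_PX.items():
    print(f"{variant}: {max_print_inches(pixels):.1f} in at 300 DPI")
# 1K: 3.4 in at 300 DPI
# 2K: 6.8 in at 300 DPI
# 4K: 13.7 in at 300 DPI
```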
Related
- Text to Image API — Full API reference for image generation
- Image to Image API — Image editing and transformation
- Gemini 2.5 Flash Image — Fastest speed, lowest cost
- Gemini 3.1 Flash Image Preview — High efficiency with extended resolutions