Qwen Image Text to Image | High-Quality Text-to-Image API

Qwen-Image is a 20B MMDiT next-generation text-to-image model that generates images from text prompts. It is available through a ready-to-use REST inference API with no cold starts and affordable pricing.

text-to-image

Input

prompt*

size: width × height in pixels, shown here as 1024 × 1024 px. Each dimension ranges from 256 to 1536.

seed

output_format

enable_sync_mode

If set to true, the function will wait for the result to be generated and uploaded before returning the response. It allows you to get the result directly in the response. This property is only available through the API.

enable_base64_output

If enabled, the output will be encoded into a BASE64 string instead of a URL. This property is only available through the API.

Enable Safety Checker
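The inputs above can be collected into a request body before calling the API. The following is a minimal sketch, not the provider's client: the function name `build_payload`, the `"width*height"` encoding of `size`, the lowercase format names, and the default `output_format` are assumptions, while the 256 to 1536 range and the 1024 × 1024 starting size come from this page. The safety checker is left out because its field name is not shown here.

```python
from typing import Optional

# Limits taken from this page. The JSON field names mirror the input labels,
# but their exact wire format is an assumption.
MIN_SIDE, MAX_SIDE = 256, 1536
OUTPUT_FORMATS = {"jpeg", "png", "webp"}


def build_payload(
    prompt: str,
    width: int = 1024,
    height: int = 1024,
    seed: Optional[int] = None,
    output_format: str = "jpeg",  # assumed default
    enable_sync_mode: bool = False,
    enable_base64_output: bool = False,
) -> dict:
    """Validate the inputs and return a JSON-serializable request body."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    for name, value in (("width", width), ("height", height)):
        if not MIN_SIDE <= value <= MAX_SIDE:
            raise ValueError(f"{name} must be between {MIN_SIDE} and {MAX_SIDE}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"output_format must be one of {sorted(OUTPUT_FORMATS)}")

    payload = {
        "prompt": prompt,
        # Assumed encoding: a single "width*height" string.
        "size": f"{width}*{height}",
        "output_format": output_format,
        "enable_sync_mode": enable_sync_mode,
        "enable_base64_output": enable_base64_output,
    }
    # Only send a seed when the caller wants a reproducible result.
    if seed is not None:
        payload["seed"] = seed
    return payload
```

Validating locally means an out-of-range size fails before a paid run is submitted.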

$0.02 per run · ~50 runs per $1

A beautiful Chinese woman wearing a "WaveSpeedAI" T-shirt is smiling at the camera with a black marker. Behind her, a glass panel reads in handwriting, "Meet Qwen Image - a powerful image foundation model capable of complex text rendering and precise image editing."

Bookstore window display. A sign displays "New Arrivals This Week". Below, a shelf tag with the text "Best-Selling Novels Here". To the side, a colorful poster advertises "Author Meet And Greet on Saturday" with a central portrait of the author. There are four books on the bookshelf, namely "The light between worlds", "When stars are scattered", "The silent patient", and "The night circus".

A man in a suit is standing in front of the window, looking at the bright moon outside the window. The man is holding a yellowed paper with handwritten words on it: "A lantern moon climbs through the silver night, Unfurling quiet dreams across the sky, Each star a whispered promise wrapped in light, That dawn will bloom, though darkness wanders by." There is a cute cat on the windowsill.

A Victorian noble lady with an elegant updo and a gentle gaze, wearing a deep red velvet dress, sitting in an ornate library. Warm candlelight illuminates her face and the surrounding bookshelves. In the style of John Singer Sargent, classic oil painting, expressive brushstrokes, masterpiece, rich textures.

A movie poster. The first row is the movie title, which reads "Imagination Unleashed". The second row is the movie subtitle, which reads "Enter a world beyond your imagination". The third row reads "Cast: Qwen-Image". The fourth row reads "Director: The Collective Imagination of Humanity". The central visual features a sleek, futuristic computer from which radiant colors, whimsical creatures, and dynamic, swirling patterns explosively emerge, filling the composition with energy, motion, and surreal creativity. The background transitions from dark, cosmic tones into a luminous, dreamlike expanse, evoking a digital fantasy realm. At the bottom edge, the text "Launching in the Cloud, August 2025" appears in bold, modern sans-serif font with a glowing, slightly transparent effect, evoking a high-tech, cinematic aesthetic. The overall style blends sci-fi surrealism with graphic design flair—sharp contrasts, vivid color grading, and layered visual depth—reminiscent of visionary concept art and digital matte painting, 32K resolution, ultra-detailed.

Realistic style: three different-looking puppies look curiously at a camera placed in front of them. Elevated view.

A female athlete with defined muscles and a tight ponytail, preparing for a run. She is wearing a black sports top and leggings, her gaze focused and determined. The background is a city running track at dawn with a light mist on the ground. Dynamic action shot, strong rim lighting outlining her silhouette, powerful and energetic, high contrast.

A girl with little freckles and messy red hair sitting on a rooftop during sunset, denim jacket slightly worn, holding a Polaroid camera, city skyline glowing in soft hues behind her

An elven queen with long silver hair and glowing blue eyes, wearing a magnificent white gown adorned with jewels. She stands in an ancient, mystical forest surrounded by luminous plants and mist. Moonlight filtering through the canopy, creating magical light and shadows. Fantasy art, epic, intricate details, masterpiece, digital painting.

Related Models

Text-to-image models:
nano-banana/text-to-image
imagen4
imagen4-ultra
imagen4-fast
flux-1.1-pro
wan-2.1/text-to-image

README

Qwen-Image (Text-to-Image)

Qwen-Image is a 20B MMDiT-based text-to-image generation model, especially strong at native text rendering in both English and Chinese. It is a powerful creative tool for posters, comics, and visual storytelling, while also excelling at general image generation from photorealism to anime.

Why it looks great

State-of-the-art text rendering: Rivals GPT-4o in English and is best-in-class for Chinese.
In-pixel text generation: Text is fully integrated into the image (no overlays).
Bilingual typography: Handles diverse fonts, styles, and complex layouts.
General image capability: Excels across styles—photorealistic, anime, impressionist, minimalist.

Limits and Performance

Max resolution per job: 1536 × 1536 pixels
Custom size: set width and height manually (256 to 1536 pixels each)
Output formats: JPEG / PNG / WEBP
Processing speed: ~5–8 seconds per image (depends on size & queue)
Input prompt: supports detailed, multi-line descriptions
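To make the size limit concrete, the sketch below picks the largest width and height for a given aspect ratio that still fits the 256 to 1536 per-side range stated on this page. The helper name `fit_size` is hypothetical, and it assumes the API accepts any integer size in that range, which this page does not confirm.

```python
from typing import Tuple

MIN_SIDE, MAX_SIDE = 256, 1536  # per-side range stated on this page


def fit_size(aspect_w: int, aspect_h: int) -> Tuple[int, int]:
    """Return the largest (width, height) with the given aspect ratio within the limits."""
    if aspect_w <= 0 or aspect_h <= 0:
        raise ValueError("aspect ratio terms must be positive")
    if aspect_w >= aspect_h:
        # Landscape or square: the width hits the limit first.
        width = MAX_SIDE
        height = round(MAX_SIDE * aspect_h / aspect_w)
    else:
        # Portrait: the height hits the limit first.
        height = MAX_SIDE
        width = round(MAX_SIDE * aspect_w / aspect_h)
    if min(width, height) < MIN_SIDE:
        raise ValueError("aspect ratio is too extreme for the 256-1536 range")
    return width, height


print(fit_size(16, 9))  # (1536, 864)
print(fit_size(2, 3))   # (1024, 1536)
```

Ratios beyond 6:1 cannot be satisfied, because the shorter side would drop below 256 pixels.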

Price

$0.02 per image.
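As a quick worked example of the pricing, the sketch below estimates batch cost and how many images a budget covers. It assumes flat per-image pricing with no volume discounts or extra fees, and works in integer cents to avoid floating-point rounding. Both function names are hypothetical.

```python
PRICE_CENTS_PER_IMAGE = 2  # $0.02 per image, from this page


def cost_cents(num_images: int) -> int:
    """Total cost in cents for a batch, assuming flat per-image pricing."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return num_images * PRICE_CENTS_PER_IMAGE


def images_for_budget(budget_cents: int) -> int:
    """Number of whole images a budget (in cents) can pay for."""
    if budget_cents < 0:
        raise ValueError("budget_cents must be non-negative")
    return budget_cents // PRICE_CENTS_PER_IMAGE


print(images_for_budget(100))  # 50 images for $1
print(cost_cents(250) / 100)   # 5.0 dollars for 250 images
```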

How to Use

Write a prompt describing the image (can include embedded text).
Adjust size (width & height, up to 1536×1536).
Set a seed for reproducibility.
Choose output_format.
Run the job and download the generated image.
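The steps above can also be scripted against the REST API. The sketch below only prepares and sends a JSON POST request using the Python standard library. The endpoint URL, the Bearer-token header, and the `"width*height"` size encoding are assumptions that should be checked against the provider's API documentation, and the response schema is not shown on this page, so `submit` simply returns the parsed JSON.

```python
import json
import urllib.request

# Assumed endpoint; confirm it against the provider's API documentation.
API_URL = "https://api.wavespeed.ai/api/v3/wavespeed-ai/qwen-image/text-to-image"


def build_request(api_key: str, payload: dict, url: str = API_URL) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST request carrying the job parameters."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Bearer-token authentication is an assumption about this API.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def submit(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


payload = {
    "prompt": 'A bookstore sign that reads "New Arrivals This Week"',
    "size": "1024*1024",  # assumed "width*height" encoding
    "seed": 42,           # fixed seed for reproducibility
    "output_format": "png",
}
request = build_request("YOUR_API_KEY", payload)
# result = submit(request)  # uncomment with a real API key to run the job
```

Keeping request construction separate from sending makes it possible to inspect or log the exact body before spending a run.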

Pro tips for best quality

For poster design, explicitly describe font style, placement, and mood.
For bilingual text, specify both Chinese and English in the prompt.
Use consistent seeds to regenerate similar layouts with slight variations.
Keep the height-to-width ratio balanced (avoid extreme aspect ratios) for best typography results.
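One way to apply the poster tip consistently is to assemble the prompt from its parts, so every text string carries an explicit placement and font style. The helper below is a hypothetical illustration: the sentence template is not a format the model requires, and the sample strings are borrowed from the movie-poster example on this page.

```python
from typing import List, Tuple


def poster_prompt(
    subject: str,
    text_lines: List[Tuple[str, str]],
    font_style: str,
    mood: str,
) -> str:
    """Compose a prompt that states each text string, its placement, the font style, and the mood."""
    parts = [f"{subject}."]
    for placement, text in text_lines:
        # Quote the literal text so it is clearly separated from the description.
        parts.append(f'{placement}, the text "{text}" appears in a {font_style} font.')
    parts.append(f"Overall mood: {mood}.")
    return " ".join(parts)


prompt = poster_prompt(
    subject="A movie poster",
    text_lines=[
        ("At the top", "Imagination Unleashed"),
        ("At the bottom edge", "Launching in the Cloud, August 2025"),
    ],
    font_style="bold, modern sans-serif",
    mood="high-tech and cinematic",
)
print(prompt)
```

Reusing the same template with a fixed seed and small wording changes is a simple way to explore variations of one layout.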

Accessibility: This website uses AI models provided by third parties.

Qwen Image Text To Image API — Quick start

Qwen Image Text To Image API — Frequently asked questions