Shengshu · video · From $0.25/run

Vidu Q3 API

Shengshu Vidu Q3 covers text-to-video, image-to-video, reference-to-video (1-4 reference images for multi-entity consistency), and start-end-to-video. It comes in three tiers: Standard, Pro, and Turbo. Some variants support outputs up to 16 seconds.

The Pro tier supports 1-16 second outputs, and the Turbo tier favors faster processing. Reference-to-video accepts 1-4 reference images and produces multi-entity consistent videos at 360p-1080p, up to 16 seconds long. Start-end-to-video bridges two keyframes (1-16 seconds on Pro). The image-to-video-pro variant supports 720p, 1080p, 2K, and 4K.

About the Vidu Q3 API

What Vidu Q3 does, how it fits in the Shengshu model lineup, and why teams reach for it.

Vidu Q3 is a video generation model from Shengshu, available through the WaveSpeedAI REST API.

The Vidu Q3 family on WaveSpeedAI ships 12 REST endpoints covering image-to-video and text-to-video workflows. Each variant carries its own pricing, parameters, and example outputs. Pick the one that matches your input modality and production constraints, or call several from the same API key to compose multi-step pipelines.

Run Vidu Q3 through the same API key, billing account, and rate-limit envelope you use for the other 1,000+ AI models on WaveSpeedAI. No separate vendor setup, no per-provider SDKs, no per-vendor rate-limit envelopes — one integration covers everything from text-to-image and text-to-video through audio synthesis, 3D generation, upscaling, and editing.

All Vidu Q3 API endpoints

12 endpoints available now on WaveSpeedAI — pick the variant that matches your workflow.

Image To Video Spicy

Vidu Q3 Image-to-Video Spicy generates unlimited high-quality videos from images with smooth animations and diverse motion, optimized for scalable content generation. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video · from $0.35

Text To Video (Pro tier)

Vidu Q3 Pro Text to Video is a fast AI video generation model that creates high-quality, audio-capable videos from text prompts with support for 1–16 second outputs. Ready-to-use REST inference API for cinematic clips, advertising creatives, social media videos, product visuals, storytelling, and professional text-to-video workflows with simple integration, no coldstarts, and affordable pricing.

text-to-video · from $0.25

Start End To Video (Pro tier)

Vidu Q3 Pro Start-End-to-Video creates smooth transitions between two keyframes with viduq3-pro (1–16s). Billing follows Vidu's published Q3-pro per-second rates by resolution. Ready-to-use REST inference API on WaveSpeed.

image-to-video · from $0.25

Start End To Video (Turbo tier)

Vidu Q3 Turbo Start-End-to-Video creates smooth transitions between two images with faster processing. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video · from $0.30

Start End To Video

Vidu Q3 Start-End-to-Video generates a video that transitions between a start image and an end image, with high visual fidelity and diverse motion. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-video · from $0.35

Reference To Video

Vidu Q3 Reference-to-Video Mix generates multi-entity consistent videos from 1-4 reference images with text prompt guidance. Supports 360p to 1080p resolutions, up to 16 seconds duration, multiple aspect ratios, and optional audio generation. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video · from $0.35

Image To Video Pro

Vidu Q3 Image-to-Video Pro generates high-resolution videos (720p/1080p/2K/4K) from images with exceptional visual fidelity and diverse motion. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video · from $0.45

Image To Video (Pro tier)

Vidu Q3 Pro Image-to-Video animates still images with high-quality motion via viduq3-pro (1–16s). Billing follows Vidu's published Q3-pro per-second rates by resolution. Ready-to-use REST inference API on WaveSpeed.

image-to-video · from $0.25

Image To Video (Turbo tier)

Vidu Q3 Turbo Image-to-Video animates static images with high-quality motion and faster processing. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video · from $0.30

Text To Video

Vidu Q3 Text-to-Video turns text prompts into high-quality videos with exceptional visual fidelity and diverse motion. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-video · from $0.35

Drama Clip

Vidu Q3 Drama Clip generates short 8-12 second script-driven drama clips from structured assets, including characters, scenes, and tools. It is designed for compact scenes, storyboard shots, and focused narrative moments.

image-to-video · from $0.70

Image To Video

Vidu Q3 Image-to-Video turns still images into high-quality videos with exceptional visual fidelity and diverse motion. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-video · from $0.35

How to use the Vidu Q3 API

Four steps from signup to a finished generation. Full Python, Node.js, and cURL examples are in the API section below.

  1. Get an API key

    Sign up for a WaveSpeedAI account and copy your API key from the dashboard. New accounts come with free starter credits, enough to run the playground a few dozen times before billing kicks in.

  2. Submit a prediction

    POST your input as JSON to https://api.wavespeed.ai/api/v3/vidu/q3/text-to-video. The endpoint returns a prediction id immediately. Generations are asynchronous, so you don't hold an open connection during inference.

  3. Poll for completion

    GET https://api.wavespeed.ai/api/v3/predictions/{request_id}/result every 1-2 seconds, using the prediction id as {request_id}. The response includes a status field; keep polling until it changes from "queued" or "processing" to "completed".

  4. Read the output URL

    Once status is "completed", read the URL from data.outputs[0]. The URL points to your generated video on the WaveSpeedAI CDN.
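The four steps above can be sketched in Python using only the standard library. This is a minimal sketch rather than the official SDK, and it rests on several assumptions: the input field is named "prompt", the prediction id is returned at data.id, and a "failed" status marks a terminal error. The helper names (http_json, wait_for_output, generate) are hypothetical.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"


def http_json(method, url, api_key, body=None):
    """Send one request with a bearer token and return the decoded JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Authorization": f"Bearer {api_key}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    request = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_output(fetch_result, interval=1.5, timeout=600.0, sleep=time.sleep):
    """Call fetch_result() until status is "completed", then return data.outputs[0]."""
    deadline = time.monotonic() + timeout
    while True:
        data = fetch_result()["data"]
        if data["status"] == "completed":
            return data["outputs"][0]
        if data["status"] == "failed":
            # Assumption: "failed" is the terminal error status.
            raise RuntimeError("prediction failed")
        if time.monotonic() >= deadline:
            raise TimeoutError("prediction did not complete in time")
        sleep(interval)  # poll every 1-2 seconds, as described in step 3


def generate(api_key, prompt):
    """Submit a text-to-video prediction and block until its output URL is ready."""
    submitted = http_json(
        "POST", f"{API_BASE}/vidu/q3/text-to-video", api_key, {"prompt": prompt}
    )
    request_id = submitted["data"]["id"]  # assumed location of the prediction id
    result_url = f"{API_BASE}/predictions/{request_id}/result"
    return wait_for_output(lambda: http_json("GET", result_url, api_key))
```

Passing the fetch function and the sleep function as arguments keeps the polling loop testable without network access or real delays.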

What you can build with Vidu Q3

Common workflows developers and creators use the Vidu Q3 API for.

Reference-to-video with 1-4 references

vidu/q3/reference-to-video generates multi-entity consistent videos from 1-4 reference images with text-prompt guidance. Supports 360p-1080p, up to 16 seconds, multiple aspect ratios. Useful for ensemble scenes where multiple referenced subjects must stay coherent.

reference · multi-entity · consistency

Start-end keyframe interpolation

vidu/q3/start-end-to-video creates videos by interpolating between two keyframes. Pro and Turbo tiers also support start-end with 1-16s durations. Useful for animatic-style storyboarding where you have the start and end stills.

interpolation · keyframes · animatic

Up to 16-second outputs

The Pro and reference-to-video variants are listed with 1-16 second durations. That is longer than the 5-8 second window of many competing video models, and long enough for a full narrative beat in a single generation.

long-form · 16-seconds · duration

High-resolution image-to-video

vidu/q3/image-to-video-pro supports 720p / 1080p / 2K / 4K from images. Useful for delivery-grade output without an upscaling pass when starting from a still.

i2v-pro · 4k · high-res

Pro tier (cheapest)

vidu/q3-pro/* is the cheapest Vidu Q3 tier — useful for high-volume work. Covers image-to-video, text-to-video, and start-end-to-video with 1-16 second outputs.

pro-tier · cost · volume

Spicy variant for unlimited content

vidu/q3/image-to-video-spicy generates "unlimited high-quality videos from images with smooth animations and diverse motion, optimized for scalable content generation." Useful for high-volume i2v pipelines.

spicy · scalable · unlimited

Tips for prompting Vidu Q3

Practical advice for getting better outputs from Vidu Q3 — drawn from the patterns that work across video models in production pipelines.

Use the 1-4 reference-image support for multi-entity scenes

vidu/q3/reference-to-video accepts 1-4 reference images for multi-entity consistent videos with text-prompt guidance. Useful for ensemble cast scenes, group product showcases, and multi-subject storyboards.
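As a rough illustration, the 1-4 image limit can be checked on the client before a request is submitted. The field names "prompt" and "images" below are assumptions for illustration, and build_reference_payload is a hypothetical helper; check the playground for the actual input schema.

```python
def build_reference_payload(prompt, reference_images):
    """Return a request body for vidu/q3/reference-to-video.

    The "prompt" and "images" field names are assumed, not confirmed.
    """
    images = list(reference_images)
    if not 1 <= len(images) <= 4:
        raise ValueError(f"expected 1-4 reference images, got {len(images)}")
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {"prompt": prompt, "images": images}
```

Rejecting a bad image count locally avoids spending a round trip on a request that cannot satisfy the documented limit.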

Start-end frame interpolation for keyframe workflows

vidu/q3/start-end-to-video bridges two stills with generated motion. Particularly useful for animatic-style work, key-pose-driven storyboarding, and stitching concept art into motion without re-prompting each segment.
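One way to plan a multi-segment sequence is to pair each still with the next one, so that every pair becomes a separate start-end generation. The sketch below only does the pairing and makes no API calls; keyframe_segments is a hypothetical helper name.

```python
def keyframe_segments(stills):
    """Pair consecutive stills into (start, end) tuples, one per generation."""
    if len(stills) < 2:
        raise ValueError("need at least two stills to form a segment")
    # zip stops at the shorter sequence, so N stills yield N-1 segments.
    return list(zip(stills, stills[1:]))


segments = keyframe_segments(["pose_a.png", "pose_b.png", "pose_c.png"])
print(segments)  # [('pose_a.png', 'pose_b.png'), ('pose_b.png', 'pose_c.png')]
```

Because each segment ends on the still that starts the next one, the generated clips can be concatenated in order.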

Pick the tier deliberately

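Standard is the default delivery tier, Pro (1-16 second outputs) is positioned for high-volume work, and Turbo prioritizes speed. In the pricing table on this page, the vidu/q3-pro endpoints start at $0.25, the vidu/q3-turbo endpoints at $0.30, and the base vidu/q3 text-to-video, image-to-video, and start-end-to-video endpoints at $0.35. Check the live pricing for the current per-tier cost.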
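A small lookup can make the tier choice explicit in code. The sketch below builds endpoint paths from the ones listed in the pricing table on this page, assuming the base vidu/q3 prefix corresponds to the Standard tier. Note that the table lists no Turbo text-to-video endpoint. The name endpoint_for is hypothetical.

```python
TIER_PREFIX = {
    "standard": "vidu/q3",  # assumption: the base prefix is the Standard tier
    "pro": "vidu/q3-pro",
    "turbo": "vidu/q3-turbo",
}

# Workflows each tier exposes, according to the pricing table on this page.
TIER_WORKFLOWS = {
    "standard": {"text-to-video", "image-to-video", "start-end-to-video"},
    "pro": {"text-to-video", "image-to-video", "start-end-to-video"},
    "turbo": {"image-to-video", "start-end-to-video"},
}


def endpoint_for(tier, workflow):
    """Return the endpoint path for a tier and workflow, or raise ValueError."""
    if tier not in TIER_PREFIX:
        raise ValueError(f"unknown tier: {tier!r}")
    if workflow not in TIER_WORKFLOWS[tier]:
        raise ValueError(f"{workflow!r} is not listed for the {tier} tier")
    return f"{TIER_PREFIX[tier]}/{workflow}"


print(endpoint_for("pro", "text-to-video"))  # vidu/q3-pro/text-to-video
```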

image-to-video-pro for high-res output

vidu/q3/image-to-video-pro supports 720p / 1080p / 2K / 4K resolution from images. Useful for delivery-grade output without an upscaling pass when starting from a still.

Spicy variant for unlimited content generation

vidu/q3/image-to-video-spicy is positioned for "unlimited high-quality videos from images with smooth animations and diverse motion, optimized for scalable content generation." Useful for high-volume i2v pipelines.

Vidu Q3 API pricing

Pricing is per-output. The final charge scales with the parameters you set in each variant's playground (resolution, duration, output count, references).

Endpoint                          Type            Starting price
vidu/q3/image-to-video-spicy      image-to-video  $0.35
vidu/q3-pro/text-to-video         text-to-video   $0.25
vidu/q3-pro/start-end-to-video    image-to-video  $0.25
vidu/q3-turbo/start-end-to-video  image-to-video  $0.30
vidu/q3/start-end-to-video        image-to-video  $0.35
vidu/q3/reference-to-video        image-to-video  $0.35
vidu/q3/image-to-video-pro        image-to-video  $0.45
vidu/q3-pro/image-to-video        image-to-video  $0.25
vidu/q3-turbo/image-to-video      image-to-video  $0.30
vidu/q3/text-to-video             text-to-video   $0.35
vidu/q3/drama-clip                image-to-video  $0.70
vidu/q3/image-to-video            image-to-video  $0.35
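Since these are starting prices, they give a lower bound on spend rather than an exact figure. The sketch below computes that floor for a planned batch, using the values from the table above; the real charge depends on the resolution, duration, output count, and references you set. The name minimum_cost is hypothetical.

```python
from decimal import Decimal

# Starting prices in USD per run, copied from the pricing table on this page.
STARTING_PRICE = {
    "vidu/q3/image-to-video-spicy": Decimal("0.35"),
    "vidu/q3-pro/text-to-video": Decimal("0.25"),
    "vidu/q3-pro/start-end-to-video": Decimal("0.25"),
    "vidu/q3-turbo/start-end-to-video": Decimal("0.30"),
    "vidu/q3/start-end-to-video": Decimal("0.35"),
    "vidu/q3/reference-to-video": Decimal("0.35"),
    "vidu/q3/image-to-video-pro": Decimal("0.45"),
    "vidu/q3-pro/image-to-video": Decimal("0.25"),
    "vidu/q3-turbo/image-to-video": Decimal("0.30"),
    "vidu/q3/text-to-video": Decimal("0.35"),
    "vidu/q3/drama-clip": Decimal("0.70"),
    "vidu/q3/image-to-video": Decimal("0.35"),
}


def minimum_cost(runs_by_endpoint):
    """Return the lowest possible total for the given run counts per endpoint."""
    total = Decimal("0")
    for endpoint, runs in runs_by_endpoint.items():
        total += STARTING_PRICE[endpoint] * runs
    return total


# 100 Pro text-to-video runs plus 10 drama clips: 25.00 + 7.00
print(minimum_cost({"vidu/q3-pro/text-to-video": 100, "vidu/q3/drama-clip": 10}))  # 32.00
```

Decimal is used instead of float so that sums of cent values stay exact.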

Call the Vidu Q3 API

Sign up for an API key at wavespeed.ai/accesskey, then submit a prediction via REST. The playground generates ready-to-paste samples for any combination of inputs.

HTTP example
# 1. Submit a prediction
#    Assumes the endpoint takes a "prompt" field; see the playground for the full schema.
curl -X POST "https://api.wavespeed.ai/api/v3/vidu/q3/text-to-video" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d '{"prompt": "A paper boat drifting down a rain-soaked street"}'

# 2. Poll the result until status = "completed"
#    Set REQUEST_ID to the prediction id returned by step 1.
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/$REQUEST_ID/result" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY"

# Read the output URL from data.outputs[0].
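
Node.js example
// npm install wavespeed
const WaveSpeed = require('wavespeed');
const client = new WaveSpeed(); // reads WAVESPEED_API_KEY

// Top-level await is unavailable in CommonJS, so wrap the call in an async function.
// Assumes the endpoint takes a "prompt" field; see the playground for the full schema.
(async () => {
  const result = await client.run("vidu/q3/text-to-video", {
    prompt: "A paper boat drifting down a rain-soaked street",
  });
  console.log(result.outputs[0]); // → URL of the generated output
})().catch(console.error);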
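
Python example
# pip install wavespeed
import wavespeed

# Assumes the endpoint takes a "prompt" field; see the playground for the full schema.
output = wavespeed.run(
    "vidu/q3/text-to-video",
    {"prompt": "A paper boat drifting down a rain-soaked street"},
)
print(output["outputs"][0])  # → URL of the generated output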

Vidu Q3 vs alternatives

When to pick Vidu Q3 over similar models on WaveSpeedAI.

Vidu Q3 vs Seedance 2.0

Seedance 2.0 ships native audio synthesis across every variant and the Turbo tier. Vidu Q3 is meaningfully cheaper and ships start-end interpolation as a first-class endpoint that Seedance doesn't have.

Vidu Q3 vs Kling 3.0

Kling 3.0 covers Standard, Pro, and 4K with motion-control as a sub-endpoint. Vidu Q3 is cheaper for most tiers and ships start-end-to-video and reference-to-video (1-4 refs) as core variants.

Vidu Q3 vs Wan 2.7

Wan 2.7 ships reference-to-video, video-edit, video-extend, image-edit, and text-to-image in the same family. Vidu Q3 stays focused on the video-generation surface with cheaper tiers and the start-end interpolation workflow.

Vidu Q3 API — Frequently asked questions

Pricing, license, integration — common questions about running Vidu Q3 on WaveSpeedAI.

What is the Vidu Q3 API?

Vidu Q3 is a Shengshu video generation model exposed as a REST API on WaveSpeedAI. It covers text-to-video, image-to-video, reference-to-video (1-4 reference images for multi-entity consistency), and start-end-to-video across three tiers: Standard, Pro, and Turbo. Some variants support outputs up to 16 seconds. You can call it programmatically or try it from the playground.

How do I call the Vidu Q3 API?

Sign up for a WaveSpeedAI account, copy your API key from /accesskey, then POST to https://api.wavespeed.ai/api/v3/vidu/q3/text-to-video with your input as JSON. The endpoint returns a prediction id; poll the prediction endpoint until status flips to "completed", then read the output URL from data.outputs[0]. Full Python / Node.js / cURL examples are above.

How much does the Vidu Q3 API cost?

Vidu Q3 starts at $0.25 per run. The exact cost scales with the parameters you set (resolution, duration, output count, references). The live cost preview next to the Generate button in the playground shows the exact price for your current input.

Which Vidu Q3 variants are available?

WaveSpeedAI hosts 12 Vidu Q3 endpoints: vidu/q3/image-to-video-spicy, vidu/q3-pro/text-to-video, vidu/q3-pro/start-end-to-video, vidu/q3-turbo/start-end-to-video, vidu/q3/start-end-to-video, vidu/q3/reference-to-video, vidu/q3/image-to-video-pro, vidu/q3-pro/image-to-video, vidu/q3-turbo/image-to-video, vidu/q3/text-to-video, vidu/q3/drama-clip, and vidu/q3/image-to-video. Each variant has its own playground page and pricing.

Can I use Vidu Q3 outputs commercially?

Commercial usage rights follow the Shengshu model license. Most Shengshu models permit commercial output use; see each model's playground page for the specific license summary, and WaveSpeedAI's Terms of Service for platform-level conditions.

Why use Vidu Q3 on WaveSpeedAI instead of going direct?

One API key and one billing account cover Vidu Q3 and 1,000+ other AI models from other providers. There is no per-vendor SDK setup, no separate rate-limit envelopes, and no per-vendor integration code to rewrite. Pricing is typically at parity with or below Shengshu's direct API.

About Shengshu

The team behind Vidu Q3 and the broader Shengshu model lineup on WaveSpeedAI.

Shengshu Technology is a Chinese AI lab spun out of Tsinghua University, behind the Vidu family of video generation models. Vidu Q3 ships text-to-video, image-to-video, reference-to-video (1-4 reference images for multi-entity consistency), and start-end-to-video (keyframe interpolation between two stills) across Standard, Pro, and Turbo tiers. Some variants support up to 16-second outputs.

Start building with Vidu Q3 on WaveSpeedAI

Free starter credits on signup. One API key across 1,000+ AI models from Shengshu and every other provider.