WAN 2.7 Text-to-Video turns plain prompts into coherent, cinematic clips with crisp detail, stable motion, and strong instruction-following—great for ads, explainers, and social posts. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.
$0.50 per run · about 20 runs per $10
Close-up portrait of a woman pressing her palm against a rain-soaked window at night, shot from outside. Heavy rain streaks distort her face through the glass, neon signs from the street below cast fragmented magenta and cyan reflections across her skin. Camera slowly pushes in at 0.3x speed. Shallow depth of field, bokeh rain drops, dual-exposure ghosting of her reflection overlapping her face. Cinematic 4K, anamorphic lens flare, 24fps.
Extreme close-up portrait, woman's face in profile against golden hour sun, shot with 85mm f/1.2. Sun positioned directly behind her head creating a blazing rim light halo. Individual strands of hair catch the light and glow translucent amber. A gentle breeze causes hair to drift across her cheek in slow motion. Skin subsurface scattering visible on ear and nose tip. Dust particles float through the backlit air. Hyper-realistic, 120fps slow motion playback at 24fps, anamorphic 2.39:1.
Portrait of a woman aging from 20 to 80 and back to 20, seamless morphing over 10 seconds. Shot in style of 1960s 16mm film with authentic gate weave, film grain, occasional splice marks and light leaks. The background remains constant — a sun-drenched window in a Paris apartment — while only her face transforms. Color grade shifts from saturated Kodachrome warmth in youth to desaturated, slightly faded tones in age, then back. No CGI aesthetic — must feel like archival film footage.
Wan 2.7 is an advanced text-to-video model that generates high-quality cinematic video from natural language prompts. With audio input support, negative prompt control, flexible resolution and aspect ratio options, and an optional prompt expansion mode, it suits a wide range of creative and production workflows.
High-quality text-to-video generation: Produces detailed, visually coherent video with accurate motion, lighting, and scene composition from text descriptions.
Audio input support: Upload an audio track to guide the rhythm, mood, and pacing of the generated video for synchronized results.
Negative prompt support: Specify what you don't want in the video for more precise control over the output.
Prompt expansion: Set enable_prompt_expansion to let the model automatically enrich and optimize your prompt before generation.
Resolution options: Generate at 720p or 1080p to match your delivery requirements.
Flexible aspect ratios: Supports multiple orientations for social, cinematic, and broadcast formats.
Reproducible results: Use the seed parameter to lock in a specific output for exact reproduction.
| Parameter | Required | Description |
|---|---|---|
| prompt | Yes | Text description of the scene, motion, camera style, and atmosphere. |
| negative_prompt | No | Elements to exclude from the generated video. |
| audio | No | Optional audio track to synchronize with the generated video. |
| resolution | No | Output resolution: 720p (default) or 1080p. |
| aspect_ratio | No | Output aspect ratio. Default: 16:9. |
| duration | No | Clip length in seconds. Default: 5. |
| enable_prompt_expansion | No | Enable automatic prompt optimization before generation. Default: off. |
| seed | No | Random seed for reproducible results. Use -1 for a random seed. |
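The parameter table above can be turned into a small request-body builder. The following is a minimal sketch, not part of any official SDK: `build_payload` is a hypothetical helper, the defaults mirror the table, and the allowed durations (5, 10, 15) are an assumption taken from the tiers in the pricing table rather than from a documented schema.

```python
from typing import Any, Dict, Optional

# Resolutions come from the parameter table; the duration set is an
# assumption based on the tiers listed in the pricing table.
ALLOWED_RESOLUTIONS = {"720p", "1080p"}
ASSUMED_DURATIONS = {5, 10, 15}


def build_payload(
    prompt: str,
    negative_prompt: Optional[str] = None,
    audio: Optional[str] = None,
    resolution: str = "720p",
    aspect_ratio: str = "16:9",
    duration: int = 5,
    enable_prompt_expansion: bool = False,
    seed: int = -1,
) -> Dict[str, Any]:
    """Return a JSON-serializable request body (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}")
    if duration not in ASSUMED_DURATIONS:
        raise ValueError(f"duration must be one of {sorted(ASSUMED_DURATIONS)}")

    payload: Dict[str, Any] = {
        "prompt": prompt,
        "resolution": resolution,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
        "enable_prompt_expansion": enable_prompt_expansion,
        "seed": seed,
    }
    # Optional fields are left out entirely when unset.
    if negative_prompt:
        payload["negative_prompt"] = negative_prompt
    if audio:
        payload["audio"] = audio
    return payload
```

Validating locally like this catches typos such as `"1080"` before a request is submitted; the API's own schema remains the authority on allowed values.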
| Duration | 720p | 1080p |
|---|---|---|
| 5s | $0.50 | $0.75 |
| 10s | $1.00 | $1.50 |
| 15s | $1.50 | $2.25 |
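As a rough illustration of the pricing table above, the sketch below looks up the per-run price and works out how many runs fit in a budget. The function names are hypothetical, and the table is copied from this page, so the live price shown in the playground should be treated as authoritative.

```python
# Prices in US cents, copied from the pricing table above.
PRICE_CENTS = {
    (5, "720p"): 50,
    (5, "1080p"): 75,
    (10, "720p"): 100,
    (10, "1080p"): 150,
    (15, "720p"): 150,
    (15, "1080p"): 225,
}


def estimate_cost_cents(duration: int, resolution: str = "720p") -> int:
    """Return the listed price in cents for one run (hypothetical helper)."""
    try:
        return PRICE_CENTS[(duration, resolution)]
    except KeyError:
        raise ValueError(
            f"no listed price for {duration}s at {resolution}"
        ) from None


def runs_within_budget(budget_cents: int, duration: int = 5,
                       resolution: str = "720p") -> int:
    """Return how many whole runs the budget covers at the listed price."""
    return budget_cents // estimate_cost_cents(duration, resolution)


if __name__ == "__main__":
    # A $10 budget at the base tier (5s, 720p).
    print(runs_within_budget(1000))
```

Working in integer cents avoids floating-point rounding when multiplying or dividing prices.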
Grab a WaveSpeedAI API key, then call POST https://api.wavespeed.ai/api/v3/alibaba/wan-2.7/text-to-video with your input as JSON. The endpoint returns a prediction id; poll the prediction endpoint until status flips to completed, then read the output URL from data.outputs[0]. Examples for Wan 2.7 Text To Video follow.
# Submit the prediction
curl -X POST "https://api.wavespeed.ai/api/v3/alibaba/wan-2.7/text-to-video" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $WAVESPEED_API_KEY" \
-d '{
"prompt": "A cinematic shot of a city at sunset, soft golden light",
"negative_prompt": "blurry, low quality, distorted",
"audio": "https://example.com/your-audio.mp3",
"resolution": "720p",
"aspect_ratio": "16:9",
"duration": 5,
"enable_prompt_expansion": false,
"seed": -1
}'
# Response includes a prediction id. Poll for the result:
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
-H "Authorization: Bearer $WAVESPEED_API_KEY"
# When status is "completed", read the output from data.outputs[0].

// npm install wavespeed
const WaveSpeed = require('wavespeed');
const client = new WaveSpeed(); // reads WAVESPEED_API_KEY from env
// Top-level await is not available in CommonJS, so wrap the call in an async function.
(async () => {
  const result = await client.run("alibaba/wan-2.7/text-to-video", {
    "prompt": "A cinematic shot of a city at sunset, soft golden light",
    "negative_prompt": "blurry, low quality, distorted",
    "audio": "https://example.com/your-audio.mp3",
    "resolution": "720p",
    "aspect_ratio": "16:9",
    "duration": 5,
    "enable_prompt_expansion": false,
    "seed": -1
  });
  console.log(result.outputs[0]); // → URL of the generated output
})().catch(console.error);

# pip install wavespeed
import wavespeed
output = wavespeed.run(
"alibaba/wan-2.7/text-to-video",
{
"prompt": "A cinematic shot of a city at sunset, soft golden light",
"negative_prompt": "blurry, low quality, distorted",
"audio": "https://example.com/your-audio.mp3",
"resolution": "720p",
"aspect_ratio": "16:9",
"duration": 5,
"enable_prompt_expansion": false,
"seed": -1
}
)
print(output["outputs"][0])  # → URL of the generated output

Wan 2.7 Text To Video is an Alibaba model for video generation, exposed as a REST API on WaveSpeedAI. You can call it programmatically or try it from the playground above.
POST your input parameters to the model's REST endpoint (shown in the API tab of this playground) with your WaveSpeedAI API key in the Authorization header. Submission returns a prediction ID; poll the prediction endpoint until status flips to "completed", then read the output URL from the result. The playground generates a ready-to-paste code sample in Python, JavaScript, or cURL for whatever inputs you've set. Full request/response shape is documented at https://wavespeed.ai/docs/docs-api/alibaba/alibaba-wan-2.7-text-to-video.
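The submit-then-poll flow described above could be sketched in Python roughly as follows, using only the standard library. This is an illustration under stated assumptions, not the official client: `fetch_result` and `wait_for_output` are hypothetical names, the status is assumed to live at `data.status` alongside `data.outputs`, and `"failed"` is an assumed terminal status; check the API reference for the actual response shape.

```python
import json
import os
import time
import urllib.request
from typing import Any, Callable, Dict

RESULT_URL = "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result"


def fetch_result(request_id: str) -> Dict[str, Any]:
    """GET the prediction result; reads WAVESPEED_API_KEY from the environment."""
    req = urllib.request.Request(
        RESULT_URL.format(request_id=request_id),
        headers={"Authorization": f"Bearer {os.environ['WAVESPEED_API_KEY']}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_output(
    request_id: str,
    fetch: Callable[[str], Dict[str, Any]] = fetch_result,
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the prediction completes, then return data.outputs[0]."""
    deadline = time.monotonic() + timeout
    while True:
        data = fetch(request_id).get("data", {})
        status = data.get("status")  # assumed location of the status field
        if status == "completed":
            return data["outputs"][0]
        if status == "failed":  # assumed failure status
            raise RuntimeError(f"prediction {request_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prediction {request_id} not done after {timeout}s")
        sleep(interval)
```

Passing `fetch` and `sleep` as parameters keeps the loop testable without network access; in normal use the defaults apply and only the prediction id returned by the submit call is needed.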
Wan 2.7 Text To Video starts at $0.50 per run. That figure is the base price for a 5-second clip at 720p; the final charge scales with the parameters you set in the form (for this model, resolution and duration, as listed in the pricing table), so a longer or higher-resolution output costs more than a minimal one. The exact cost for your current input is shown live next to the Generate button before you submit, and the actual per-call charge is recorded on the prediction afterwards.
Key inputs: `prompt`, `audio`, `aspect_ratio`, `resolution`, `duration`, `seed`. The full JSON schema (types, defaults, allowed values) is rendered above the Generate button and mirrored in the API reference at https://wavespeed.ai/docs/docs-api/alibaba/alibaba-wan-2.7-text-to-video.
Sign up for a free WaveSpeedAI account to claim starter credits, copy your API key from /accesskey, then call the endpoint shown in the API tab of the playground. The playground also auto-generates a code sample in Python, JavaScript, or cURL for the parameters you've set.
Commercial usage rights depend on the model's license, set by its provider (Alibaba). The license summary appears on the model card above; see WaveSpeedAI's Terms of Service for platform-level conditions.