WaveSpeedAI·video·From $0.075/run

InfiniteTalk API

WaveSpeedAI InfiniteTalk converts one photo plus audio into talking or singing avatar videos up to 10 minutes long. It comes in Standard and Fast tiers with output up to 720p. Multi variants accept a single image and two audio inputs to generate multi-character talking or singing video, and Video-to-Video variants drive an existing video with new audio for lip-sync replacement.

About the InfiniteTalk API

What InfiniteTalk does, how it fits in the WaveSpeedAI model lineup, and why teams reach for it.

InfiniteTalk is a video generation model from WaveSpeedAI, available through the WaveSpeedAI REST API.

The InfiniteTalk family on WaveSpeedAI ships 8 REST endpoints covering digital-human workflows. Each variant carries its own pricing, parameters, and example outputs. Pick the one that matches your input modality and production constraints, or call several from the same API key to compose multi-step pipelines.

Run InfiniteTalk through the same API key, billing account, and rate-limit envelope you use for the other 1,000+ AI models on WaveSpeedAI. No separate vendor setup, no per-provider SDKs, no per-vendor rate-limit envelopes — one integration covers everything from text-to-image and text-to-video through audio synthesis, 3D generation, upscaling, and editing.

All InfiniteTalk API endpoints

8 endpoints available now on WaveSpeedAI — pick the variant that matches your workflow.

Video To Video Multi (Fast)

InfiniteTalk Fast Video-to-Video Multi converts a video and two audio inputs into multi-character talking or singing videos. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

digital-human · from $0.075

Video To Video (Fast)

Audio-driven InfiniteTalk Fast turns one video plus audio into realistic talking or singing videos with lip-sync. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

digital-human · from $0.075

Video To Video Multi

InfiniteTalk Video-to-Video Multi converts a video and two audio inputs into multi-character talking or singing videos at up to 720p. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

digital-human · from $0.15

Multi (Fast)

InfiniteTalk Fast Multi converts a single image and two audio inputs into multi-character talking or singing videos. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

digital-human · from $0.075

Multi

InfiniteTalk Multi converts a single image and two audio inputs into multi-character talking or singing videos at up to 720p. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

digital-human · from $0.15

InfiniteTalk (Fast)

InfiniteTalk Fast converts one photo + audio into audio-driven talking or singing avatar videos (Image-to-Video), up to 10 minutes. Ready-to-use REST API, no cold starts, affordable pricing.

digital-human · from $0.075

Video To Video

Audio-driven InfiniteTalk turns one video plus audio into realistic talking or singing videos with lip-sync in 480p or 720p. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

digital-human · from $0.15

InfiniteTalk

InfiniteTalk converts one photo + audio into audio-driven talking or singing avatar videos (Image-to-Video), up to 10 minutes. The 720p tier is priced at $0.30 per 5 seconds. Ready-to-use REST API, no cold starts, affordable pricing.

digital-human · from $0.15

How to use the InfiniteTalk API

Four steps from signup to a finished generation. Full Python, Node.js, and cURL examples are in the API section below.

  1. Get an API key

    Sign up for a WaveSpeedAI account and copy your API key from the dashboard. New accounts come with free starter credits, enough to run the playground a few dozen times before billing kicks in.

  2. Submit a prediction

    POST your input as JSON to https://api.wavespeed.ai/api/v3/wavespeed-ai/infinitetalk. The endpoint returns a prediction id immediately. Generations are async, so you don't hold an open connection during inference.

  3. Poll for completion

    GET https://api.wavespeed.ai/api/v3/predictions/{request_id}/result every 1-2 seconds, substituting the prediction id from step 2 for {request_id}. The response includes a status field; keep polling until it changes from "queued" or "processing" to "completed".

  4. Read the output URL

    Once status is "completed", read the URL from data.outputs[0]. The URL points to your generated video on the WaveSpeedAI CDN.
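
The submit-and-poll flow in the steps above can be sketched in plain Python with only the standard library. This is a minimal sketch under assumptions, not an official client: the URLs, the status values "queued", "processing", and "completed", and the data.outputs[0] path come from this page, while the location of the prediction id (data.id), the "failed" status, and the helper names submit and wait_for_output are assumptions made for illustration.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"


def http_json(method, url, api_key, payload=None):
    """Send one JSON request with a Bearer token and decode the JSON reply."""
    body = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(
        url,
        data=body,
        method=method,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def submit(model_id, payload, api_key, request=http_json):
    """POST the input JSON and return the prediction id."""
    reply = request("POST", f"{API_BASE}/{model_id}", api_key, payload)
    return reply["data"]["id"]  # assumption: the id is returned under data.id


def wait_for_output(request_id, api_key, request=http_json,
                    interval=2.0, timeout=900.0, sleep=time.sleep):
    """Poll the result endpoint until completion and return data.outputs[0]."""
    url = f"{API_BASE}/predictions/{request_id}/result"
    deadline = time.monotonic() + timeout
    while True:
        data = request("GET", url, api_key)["data"]
        if data["status"] == "completed":
            return data["outputs"][0]
        if data["status"] == "failed":  # assumption: a terminal failure status exists
            raise RuntimeError(f"prediction {request_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prediction {request_id} still {data['status']}")
        sleep(interval)


# Example call (needs a real key and the input fields listed in the playground):
#   rid = submit("wavespeed-ai/infinitetalk", input_payload, api_key)
#   print(wait_for_output(rid, api_key))
```

The request, sleep, and timeout parameters are injectable so the loop can be exercised without network access, and so a long generation does not poll forever.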

What you can build with InfiniteTalk

Common workflows developers and creators use the InfiniteTalk API for.

Talking or singing avatar from a photo

wavespeed-ai/infinitetalk converts one photo + audio into talking or singing avatar videos. The catalog describes it as Image-to-Video, up to 10 minutes, with a 720p tier. Singing support is a distinguishing feature.

talking · singing · avatar

Multi-character with two audio inputs

wavespeed-ai/infinitetalk/multi accepts a single image + two audio inputs to generate multi-character talking/singing videos at up to 720p. Useful for dialog scenes, duets, and two-person presentations from a single setup.

multi-character · two-audio · dialog

Video-to-video lip-sync replacement

wavespeed-ai/infinitetalk/video-to-video drives an existing video with new audio, producing realistic talking or singing with lip-sync (480p or 720p). Useful for re-dubbing existing footage with a new voiceover.

video-to-video · dub · lip-sync

Up to 10-minute outputs

The catalog lists the base variant at up to 10 minutes, significantly longer than most video models. That makes it usable for podcasts, lectures, long-form narration, and full-episode talking-avatar content in one generation.

long-form · 10-minute · podcast

Fast tier

wavespeed-ai/infinitetalk-fast runs the same image-to-video flow with optimized inference, which is useful for high-volume work and pre-production iteration. Multi and video-to-video also ship Fast variants.

fast · cost · iteration

Multi-character video-to-video

wavespeed-ai/infinitetalk/video-to-video-multi accepts a video and two audio inputs for multi-character talking/singing at up to 720p. It combines the multi-character and video-source workflows in one call.

multi · video-source · two-character

Tips for prompting InfiniteTalk

Practical advice for getting better outputs from InfiniteTalk — drawn from the patterns that work across video models in production pipelines.

Clean audio first, sync second

Lip-sync quality is bottlenecked by audio clarity. Remove background noise, normalize levels, and check for clipping before feeding the audio in. Clean audio improves lip-sync more than any other variable.
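
A quick pre-flight check for levels and clipping can be sketched with Python's standard library. This is an illustrative sketch, not part of the InfiniteTalk API: it assumes the audio is a 16-bit PCM WAV file, the function names are made up for this example, and noise removal is out of scope.

```python
import array
import sys
import wave

INT16_MAX = 32767


def read_pcm16(path):
    """Load samples from a 16-bit PCM WAV file (channels stay interleaved)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        samples = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return samples


def peak_and_clipping(samples):
    """Return (peak level in 0..1, fraction of samples at full scale)."""
    if not samples:
        return 0.0, 0.0
    peak = max(abs(s) for s in samples)
    clipped = sum(1 for s in samples if abs(s) >= INT16_MAX)
    return peak / 32768, clipped / len(samples)


def normalize_peak(samples, target_peak=0.9):
    """Scale samples so the loudest one sits at target_peak of full scale."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return array.array("h", samples)  # silence: nothing to scale
    gain = target_peak * INT16_MAX / peak
    return array.array(
        "h",
        (max(-32768, min(INT16_MAX, round(s * gain))) for s in samples),
    )
```

A non-zero clipping fraction suggests re-exporting the recording at a lower gain, since normalizing afterwards cannot restore the flattened peaks.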

Match reference image to audio language/culture

English-language audio paired with a clearly Western character looks more natural than a mismatch. Same for Japanese / Chinese / Korean audio + corresponding character references.

Front-facing portraits sync cleanest

Straight-on portrait references produce the most natural lip-sync. Three-quarter and profile angles work, but with subtle artifacts. If you have control over the character image, supply a near-frontal pose.

Singing is supported, not just talking

The catalog lists "talking or singing avatar videos" as a feature. Use it for music videos, choir or soloist content, and lyric videos with a featured character, not just spoken-word content.

Multi variant accepts two audio inputs

wavespeed-ai/infinitetalk/multi takes a single image + two audio inputs to generate multi-character talking/singing videos at up to 720p. Useful for dialog scenes, duets, and two-person presentations from one setup.

Video-to-video for re-dubbing existing footage

wavespeed-ai/infinitetalk/video-to-video drives an existing video with new audio, producing realistic talking or singing with lip-sync. Useful for re-dubbing clips with a new voiceover while keeping the original visual.

InfiniteTalk API pricing

Pricing is per-output. The final charge scales with the parameters you set in each variant's playground (resolution, duration, output count, references).
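
As a rough worked example of how duration drives the charge, the sketch below uses the one concrete rate quoted on this page: $0.30 per 5 seconds for the 720p tier of the base variant. It assumes the charge scales linearly with duration and that a partial 5-second block is rounded up. Neither assumption is confirmed here, and the function name is illustrative.

```python
import math

RATE_720P_PER_BLOCK = 0.30  # USD per 5 seconds, from the base variant's catalog text
BLOCK_SECONDS = 5


def estimate_720p_cost(duration_seconds):
    """Estimate the 720p charge, assuming billing in whole 5-second blocks."""
    blocks = math.ceil(duration_seconds / BLOCK_SECONDS)
    return round(blocks * RATE_720P_PER_BLOCK, 2)
```

Under these assumptions a 60-second clip is 12 blocks, or about $3.60. Treat the figure as an estimate only; the actual charge depends on the parameters set for each variant.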

Call the InfiniteTalk API

Sign up for an API key at wavespeed.ai/accesskey, then submit a prediction via REST. The playground generates ready-to-paste samples for any combination of inputs.

HTTP example
# 1. Submit a prediction
curl -X POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/infinitetalk" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d '{}'

# 2. Poll the result until status = "completed"
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY"

# Read the output URL from data.outputs[0].
Node.js example
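// npm install wavespeed
const WaveSpeed = require('wavespeed');
const client = new WaveSpeed(); // reads WAVESPEED_API_KEY

// Top-level await is not available in CommonJS, so wrap the call in an async function.
async function main() {
  // Replace {} with the input JSON for your variant.
  const result = await client.run("wavespeed-ai/infinitetalk", {});
  console.log(result.outputs[0]); // → URL of the generated output
}

main().catch(console.error);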
Python example
# pip install wavespeed
import wavespeed

output = wavespeed.run(
    "wavespeed-ai/infinitetalk",
    {}
)
print(output["outputs"][0])  # → URL of the generated output

InfiniteTalk vs alternatives

When to pick InfiniteTalk over similar models on WaveSpeedAI.

InfiniteTalk vs Wan 2.2 Speech-to-Video

Wan 2.2 Speech-to-Video (wavespeed-ai/wan-2.2/speech-to-video) supports up to 10-minute clips at 480p — same maximum duration as InfiniteTalk. InfiniteTalk adds Multi variants (two audio inputs), Video-to-Video, and a Fast tier that Wan 2.2 doesn't ship as a separate endpoint.

InfiniteTalk vs Stock avatar tools

Stock-avatar tools (HeyGen-style) limit you to a curated library of pre-trained avatars. InfiniteTalk accepts any character image — brand mascot, AI-generated character, illustrated host — without per-character setup, and supports singing as well as talking.

InfiniteTalk vs ElevenLabs voice tools

ElevenLabs handles voice (generation, cloning, multilingual TTS). InfiniteTalk is the video layer: pair an ElevenLabs voiceover with a character image (or existing video) to produce a full lip-synced video.

InfiniteTalk API — Frequently asked questions

Pricing, license, integration — common questions about running InfiniteTalk on WaveSpeedAI.

What is the InfiniteTalk API?

InfiniteTalk is a WaveSpeedAI video generation model exposed as a REST API. It converts one photo plus audio into talking or singing avatar videos up to 10 minutes long, in Standard and Fast tiers, with Multi variants (single image + two audio inputs for multi-character output) and Video-to-Video variants (drive an existing video with new audio). You can call it programmatically or try it from the playground linked above.

How do I call the InfiniteTalk API?

Sign up for a WaveSpeedAI account, copy your API key from /accesskey, then POST to https://api.wavespeed.ai/api/v3/wavespeed-ai/infinitetalk with your input as JSON. The endpoint returns a prediction id; poll the prediction endpoint until status flips to "completed", then read the output URL from data.outputs[0]. Full Python / Node.js / cURL examples are above.

How much does the InfiniteTalk API cost?

InfiniteTalk starts at $0.075 per run. The exact cost scales with the parameters you set (resolution, duration, output count, references). The live cost preview next to the Generate button in the playground shows the exact price for your current input.

Which InfiniteTalk variants are available?

WaveSpeedAI hosts 8 InfiniteTalk endpoints: wavespeed-ai/infinitetalk-fast/video-to-video-multi, wavespeed-ai/infinitetalk-fast/video-to-video, wavespeed-ai/infinitetalk/video-to-video-multi, wavespeed-ai/infinitetalk-fast/multi, wavespeed-ai/infinitetalk/multi, wavespeed-ai/infinitetalk-fast, wavespeed-ai/infinitetalk/video-to-video, wavespeed-ai/infinitetalk. Each variant has its own playground page and pricing.
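
The eight ids above follow a regular naming pattern, which the sketch below captures in a small helper. The helper names are hypothetical, and the URL form for the non-base variants is an assumption: this page only shows the full URL for wavespeed-ai/infinitetalk, and the sketch assumes the other ids slot into the same path.

```python
API_BASE = "https://api.wavespeed.ai/api/v3"


def infinitetalk_model_id(source="image", multi=False, fast=False):
    """Build an InfiniteTalk model id from the input type and tier."""
    if source not in ("image", "video"):
        raise ValueError("source must be 'image' or 'video'")
    base = "wavespeed-ai/infinitetalk-fast" if fast else "wavespeed-ai/infinitetalk"
    if source == "video":
        suffix = "/video-to-video-multi" if multi else "/video-to-video"
    else:
        suffix = "/multi" if multi else ""
    return base + suffix


def endpoint_url(model_id):
    """Assumed submit URL: the base path shown on this page plus the model id."""
    return f"{API_BASE}/{model_id}"
```

For example, infinitetalk_model_id("video", multi=True, fast=True) yields wavespeed-ai/infinitetalk-fast/video-to-video-multi, matching the first id in the list.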

Can I use InfiniteTalk outputs commercially?

Commercial usage rights follow the WaveSpeedAI model license. Most WaveSpeedAI models permit commercial output use; see each model's playground page for the specific license summary, and WaveSpeedAI's Terms of Service for platform-level conditions.

Why use InfiniteTalk on WaveSpeedAI instead of going direct?

InfiniteTalk is a first-party WaveSpeedAI model, so WaveSpeedAI is the direct route. The same API key and billing account also cover 1,000+ other AI models from other providers, with no per-vendor SDK setup, no separate rate-limit envelopes, and no rewrite-per-vendor integration code.

About WaveSpeedAI

The team behind InfiniteTalk and the broader WaveSpeedAI model lineup.

WaveSpeedAI runs an inference platform that hosts 1,000+ AI models from every major provider — ByteDance, Google, OpenAI, Alibaba, Kuaishou, ElevenLabs, and dozens of independent labs — behind one API key, one billing account, and one rate-limit envelope. WaveSpeedAI also ships first-party models (Image / Video Upscalers, Watermark Removers, Animate, InfiniteTalk) tuned for production pipelines.

Start building with InfiniteTalk on WaveSpeedAI

Free starter credits on signup. One API key across 1,000+ AI models from WaveSpeedAI and every other provider.