Speech Generation

Convert text into expressive spoken audio

Our selection

google/veo3-fast

Google Veo 3 Fast generates video from text with synchronized audio, and is faster and more cost-effective than standard Veo 3. Commercial use is allowed, and pricing starts at $0.25 per second. It is available through a ready-to-use REST inference API with no cold starts.
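
To make the per-second pricing concrete, here is a minimal sketch that turns the quoted starting rate into a lower-bound cost estimate. The function name and the 8-second clip length are illustrative assumptions, not part of the listing, and because the price "starts at" $0.25 per second, the actual charge may be higher.

```python
from decimal import Decimal

# Starting rate quoted in the listing for google/veo3-fast, in USD per second.
# This is a floor ("starts at"), so the result is a lower bound, not a quote.
STARTING_RATE_USD_PER_SECOND = Decimal("0.25")


def estimate_min_cost(duration_seconds: int) -> Decimal:
    """Return a lower-bound cost in USD for a clip of the given length.

    Hypothetical helper for illustration; it is not part of any official SDK.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return STARTING_RATE_USD_PER_SECOND * duration_seconds


# Assumed example: an 8-second clip at the starting rate.
print(estimate_min_cost(8))  # 2.00
```

Decimal is used instead of float so that currency amounts stay exact.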

All models

google/veo3-fast
minimax/speech-02-hd
minimax/speech-02-turbo
google/veo3-fast/image-to-video
google/veo3/image-to-video
sync/lipsync-2
sync/lipsync-2-pro
wavespeed-ai/mmaudio-v2
google/veo3
minimax/voice-clone
minimax/voice-design
elevenlabs/flash-v2
elevenlabs/flash-v2.5
elevenlabs/multilingual-v1
elevenlabs/multilingual-v2
elevenlabs/turbo-v2
elevenlabs/turbo-v2.5
minimax/speech-2.5-hd-preview
minimax/speech-2.5-turbo-preview