Frame-accurate sync · Async REST + webhooks · Up to 4K output · Idempotent retries

Powering lip-sync workflows for 900+ engineering teams

Lip Sync API.

Lip-sync as a service. Drop a face URL, drop an audio URL, get a frame-accurate lip-synced MP4 back — sub-minute latency, async webhooks.

Sample API call

POST /v1/lipsync
Authorization: Bearer sk_live_***

{
  "face_url": "https://cdn.example.com/face.jpg",
  "audio_url": "https://cdn.example.com/voice.mp3",
  "format": "16-9",
  "resolution": "1080p"
}

→ 202 Accepted
{ "job_id": "ls_8a2...", "webhook": "..." }

Async REST · sub-minute latency · idempotent
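The sample call above could be assembled in Python roughly as follows. This is a minimal sketch using only the standard library; the host in API_BASE is a placeholder (the page shows only the path), and the Content-Type header is an assumption based on the JSON body in the sample.

```python
import json
import urllib.request

# Hypothetical host: the sample above shows only the path /v1/lipsync.
API_BASE = "https://api.example.com"


def build_lipsync_request(api_key, face_url, audio_url,
                          fmt="16-9", resolution="1080p"):
    """Build (but do not send) a POST /v1/lipsync request."""
    body = {
        "face_url": face_url,
        "audio_url": audio_url,
        "format": fmt,
        "resolution": resolution,
    }
    return urllib.request.Request(
        url=f"{API_BASE}/v1/lipsync",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumed: the sample body is JSON, so we label it as such.
            "Content-Type": "application/json",
        },
    )


def submit(request):
    """Send the request and return the parsed JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping the builder separate from submit() makes the request easy to inspect or unit-test without touching the network. The response should carry the job_id shown in the sample.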

118k lip-sync renders served last 30 days

— Output Example

[Video preview: 9:16 · 1080p · 00:45]

— Endpoints

Lip-sync any face to any audio.

— Sample request

Endpoint

112 params

POST /v1/lipsync with { face_url, audio_url }. Single photo + audio → MP4 with photoreal lip movement. Up to 4K.

Auth: Bearer API key

Limits: Tiered per plan

— Webhook payload

[Video preview: webhook response · 200 OK · 9:16 · 1080p · 00:45]

— How it works

From request to lip-synced render in 3 simple steps.

Step 1

Get an API key

Step 2

POST face + audio URLs

Send signed URLs for the face image or video and for the audio. The API extracts the face model and the phoneme timing.

Step 3

Receive webhook

On completion, we POST a signed MP4 URL to your webhook. Alternatively, poll the job endpoint until the job is done.
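If you choose polling over webhooks, a loop with a growing delay keeps request volume low. The sketch below is one way to do it, under stated assumptions: the job endpoint's path and response shape are not documented on this page, so the caller supplies a fetch_status function, and the "done" and "failed" status names are hypothetical.

```python
import time


def wait_for_job(fetch_status, job_id, timeout_s=600.0, first_delay_s=2.0,
                 max_delay_s=30.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reaches a terminal state or the timeout elapses.

    fetch_status(job_id) must return a dict with a "status" key.
    The terminal status names below are assumptions, not documented values.
    """
    deadline = clock() + timeout_s
    delay = first_delay_s
    while True:
        job = fetch_status(job_id)
        if job.get("status") in ("done", "failed"):
            return job
        # Give up if the next wait would run past the deadline.
        if clock() + delay > deadline:
            raise TimeoutError(f"job {job_id} not finished after {timeout_s}s")
        sleep(delay)
        # Double the delay each round, capped at max_delay_s.
        delay = min(delay * 2, max_delay_s)
```

The sleep and clock parameters exist only so the loop can be tested without real waiting. For high volumes, webhooks are usually the better fit, since polling scales with the number of in-flight jobs.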

— Docs

How to dub 100k videos via API without breaking the bank?

From auth to webhook handler with code samples in TypeScript and Python.

I dubbed 10,000 product demos via API in 4 hours (walkthrough).

— Who it's for

Built for engineering teams.

Localization

Localization platforms

Offer video dubbing as a service via API. Frame-accurate, multilingual, at scale.

L&D

L&D platforms

Re-sync narrator audio across course updates. Same instructor, new lines, no re-shoot.

Media

Media & news

Auto-dub news clips for international audiences. Same anchor, every language, every clip.

SaaS

Personalized media SaaS

Generate personalized lip-synced videos at scale — sales outreach, onboarding, transactional.

— Comparison

DIY lip-sync vs ClipNova Lip Sync API.

Building it yourself means months of ML infra. ClipNova ships it behind a single REST endpoint.

Feature | Lip Sync API | DIY infra
Setup | One API key, one endpoint | Stand up GPU farm + models
Time to first sync | Minutes | Months of ML work
Quality | Frame-accurate, photoreal | Hire ML team to match
Idempotency | Built in | Build yourself
Compliance | SOC 2 + EU residency | Audit yourself

— Use Cases

See what teams build with it.

Production deployments across categories.

Video dubbing at scale.

A media company dubs 10,000 news clips per week into 8 languages. Same anchor, same energy, every language.

Batch endpoints
8 languages per pass
Anchor consistency preserved
Webhook on each clip

Personalized sales outreach.

A SaaS sends every prospect a lip-synced video pitch from the founder, personalized to their company.

Per-prospect rendering
Founder face + cloned voice
Sub-minute latency
Audit logs

L&D narrator updates.

An LMS pushes script updates to existing lessons. Same narrator, new lines, no re-shoot — just a re-sync.

Audio-only updates
Visual continuity preserved
Version control
Bulk endpoints

— FAQs

Frequently asked.

What is the Lip Sync API?

A REST endpoint that takes a face URL and an audio URL, and returns a frame-accurate lip-synced MP4. Designed for high-volume, programmatic use.

What inputs are accepted?

Face: JPG, PNG, MP4, or MOV. Audio: MP3, WAV, or M4A. Pass both as signed URLs, or upload them via the /uploads endpoint.
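A cheap client-side check on file extensions can catch unsupported inputs before a job is submitted. The sketch below covers only the formats listed above; check_inputs and extension_of are hypothetical helper names, and the API may well validate by content rather than extension.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Formats as listed in the answer above. ".jpeg" is left out because
# only JPG is named; add it if the API turns out to accept it.
FACE_EXTENSIONS = {".jpg", ".png", ".mp4", ".mov"}
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a"}


def extension_of(url):
    """Return the lowercased file extension, ignoring any query string."""
    return PurePosixPath(urlparse(url).path).suffix.lower()


def check_inputs(face_url, audio_url):
    """Return a list of problems; an empty list means both URLs look fine."""
    problems = []
    if extension_of(face_url) not in FACE_EXTENSIONS:
        problems.append(f"unsupported face format: {extension_of(face_url)!r}")
    if extension_of(audio_url) not in AUDIO_EXTENSIONS:
        problems.append(f"unsupported audio format: {extension_of(audio_url)!r}")
    return problems
```

Parsing the URL first matters for signed URLs, where the signature sits in the query string after the file name.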

Quality compared to ClipNova UI?

Same model, same quality. The API is the same engine that powers the UI tool.

Latency?

Under 2 minutes for a 60-second source video. 4K renders take 4–6 minutes.

Idempotency?

Yes. Every request accepts an Idempotency-Key header. Safe to retry.
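For a retry to be recognized as the same request, the retry has to send the same Idempotency-Key as the first attempt. One way to get that, sketched below, is to derive the key from the request payload. The key format here (a SHA-256 hex digest) and the helper names are assumptions; the page specifies only the header name.

```python
import hashlib
import json


def idempotency_key(payload, scope=""):
    """Derive a stable key from the payload, so retries reuse the same key.

    scope is an optional caller-chosen prefix (for example a campaign id)
    that lets the same payload be rendered again on purpose.
    """
    # Sort keys and drop whitespace so equal payloads serialize identically.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256((scope + "\n" + canonical).encode("utf-8"))
    return digest.hexdigest()


def idempotent_headers(api_key, payload, scope=""):
    """Headers for a POST /v1/lipsync call that is safe to retry."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",  # assumed, as the body is JSON
        "Idempotency-Key": idempotency_key(payload, scope),
    }
```

One caveat: if your signed URLs are re-signed between attempts, the payload changes and so does the derived key. In that case, generate a random key once (for example with uuid.uuid4()), store it alongside the job in your own system, and reuse it on every retry.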

Webhook reliability?

Payloads are signed and retried with exponential backoff for 24 hours, with full delivery logs in the dashboard.
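On the receiving side, a handler would typically verify the signature before trusting the payload and ignore deliveries it has already processed, since retries can deliver the same event more than once. The sketch below assumes an HMAC-SHA256 signature over the raw request body, sent as a hex string; the actual signing scheme, header name, and delivery id field are not stated on this page, so treat all three as hypothetical.

```python
import hashlib
import hmac


def verify_signature(secret, raw_body, received_signature):
    """Check an assumed HMAC-SHA256 hex signature over the raw body.

    secret and raw_body are bytes; received_signature is the hex string
    taken from the (hypothetical) signature header.
    """
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"),
                               received_signature.encode("utf-8"))


class DeliveryLog:
    """Remembers processed delivery ids so retried webhooks run once."""

    def __init__(self):
        self._seen = set()

    def first_time(self, delivery_id):
        """Return True only the first time a given id is seen."""
        if delivery_id in self._seen:
            return False
        self._seen.add(delivery_id)
        return True
```

Verify against the raw bytes as received, not a re-serialized copy of the parsed JSON, because re-serialization can change whitespace or key order. In production the delivery log would live in a database or cache rather than in process memory.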

Compliance?

SOC 2 Type II. EU data residency available on enterprise. No training on user inputs.

Pricing?

Billed per second of output. Volume discounts start at 50k+ renders per month. A free sandbox tier is available for development.

View complete API docs

Find detailed reference for every endpoint, parameter and webhook

or check our OpenAPI spec optimized for LLMs →

— Tools

Free AI ads tools.

Pick your tool.

Prompt → Video

Type a sentence, ship a video.

Read the docs →

AI TikTok Generator

Vertical videos tuned for the For You Page.

Read the docs →

Anime Video Generator

Five anime styles, one prompt away.

Read the docs →

Talking Avatar

AI hosts with realistic voices.

Read the docs →

Movie Maker

Cinematic multi-scene shorts.

Read the docs →

Music Video Maker

Beat-matched visuals from a track.

Read the docs →

AI Ads Generator

Scroll-stopping ads for Meta, TikTok, YouTube.

Read the docs →

Audio to Video

Podcasts and voice memos, made visual.

Read the docs →

YouTube to Video

Long-form videos cut into 9:16 shorts.

Read the docs →

AI Cartoon Video Generator

Hand-drawn shorts from one paragraph.

Read the docs →

AI Content Generator

Scripts, hooks and captions — one brief, a week of content.

Read the docs →

AI UGC Generator

Native UGC at scale, from a brief.

Read the docs →

See all tools

ClipNova

The fastest way to lip-sync via API.

Get an API key

Free sandbox tier