🎬 KLING AI ⏱ 4 min read 🗣️ Avatar v2

Avatar v2 — Technical Guide

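Turn any portrait photo into a talking video: upload a photo, then provide audio or type what the person should say, and the AI animates the face with natural motion and lip-sync.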

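Avatar v2 brings still photos to life. Upload a portrait of your character and provide audio, either by uploading a recording or by typing text and letting the AI generate the voice. The result is a video in which the person in the photo appears to speak naturally, with realistic head movement, blinking, and closely synchronized mouth motion.

This differs from lip-sync, which requires an existing video. Avatar starts from a single still photo. The AI adds all of the motion (subtle head tilts, natural blinks, changes in facial expression, and precise mouth animation) to create a convincing talking video from one static image.

Two audio modes cover all use cases. Upload Audio lets you use any pre-recorded speech, voice-over, podcast clip, or translated narration. Enter Text lets you write the dialogue, pick a voice from the catalog, choose a language and an emotion (happy, sad, angry, surprised, and so on), and adjust the speaking speed; the AI then does everything in one step.

An optional prompt lets you steer the mood and gesture style: describe the expression, energy level, or emotion, and the AI adjusts head movement and facial animation to match. The result is a finished talking video, ready for social media, customer-support replies, training material, product announcements, or personalized video messages.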
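To make the split between the two audio modes and the optional prompt concrete, here is a minimal Python sketch of how a job might be described before submission. The class `AvatarJob` and every field name in it are hypothetical and chosen only for illustration; this page does not document a request format, so treat this as a way to reason about the inputs, not as the service's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AvatarJob:
    """Hypothetical job description; field names are illustrative only."""
    photo_path: str
    audio_path: Optional[str] = None   # Upload Audio mode
    text: Optional[str] = None         # Enter Text mode
    voice: Optional[str] = None        # voice picked from the catalog
    language: Optional[str] = None
    emotion: Optional[str] = None
    speed: float = 1.0                 # assumed: 1.0 means normal speed
    prompt: Optional[str] = None       # mood and gesture style, not the words

    def audio_mode(self) -> str:
        """Return "upload" or "text", or raise if the job is ambiguous."""
        has_audio = self.audio_path is not None
        has_text = self.text is not None
        if has_audio == has_text:
            raise ValueError("provide exactly one of audio_path or text")
        if has_text and not self.voice:
            raise ValueError("Enter Text mode needs a voice")
        return "text" if has_text else "upload"


job = AvatarJob(
    photo_path="portrait.jpg",
    text="Welcome to our store.",
    voice="narrator-1",                # hypothetical voice name
    language="English",
    emotion="Happy",
    prompt="confident and energetic",
)
print(job.audio_mode())  # text
```

Keeping `prompt` separate from `text` mirrors the guidance that the prompt shapes expression while the spoken words come from the audio or the typed text.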
✦ Best Results Tips
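👤 Front-facing portrait, good lighting
Use a well-lit photo in which the face is clearly visible from the front. Center the head, have the eyes look at the camera, and keep the expression neutral or smiling. Avoid sunglasses, masks, and ghosting on the face.
🎭 The prompt controls mood, not words
The prompt field controls expression and gesture style, not what is said. Write something like "confident and energetic" or "calm and thoughtful". The spoken words come from the audio file or the text you enter.
⌨️ Enter Text for the fastest results
Enter Text mode generates the voice and syncs the lips in one step, with no need to record or find an audio file. Pick a voice, set the emotion, write the text, and the AI handles the rest.
😊 Choose the right emotion
In Enter Text mode, the emotion setting changes both the tone of the voice and the motion of the face. Happy adds warmth and smiles, Angry adds intensity, and Sad adds softness. Match the emotion to the content.
⏱️ Keep audio under 60 seconds
Shorter audio clips produce the highest-quality animation. Ideally, stay under 60 seconds so the AI can keep natural motion consistent throughout. Longer clips, up to the 5-minute maximum, may show drift in expression quality.
📐 Head-and-shoulders framing
The best results come from photos framed from the upper chest up. Full-body photos reduce facial detail. Crops that are too tight leave no room for natural head movement during animation.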
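The 60-second tip above can be checked locally before uploading. The sketch below uses only Python's standard `wave` module, so it handles WAV files only; MP3 and M4A would need a third-party decoder. The function names and the three labels are assumptions made for this example. The 60-second threshold comes from the tip and the 5-minute ceiling from the maximum duration listed in this page's specifications.

```python
import wave

RECOMMENDED_MAX_SECONDS = 60   # guidance from the tip above
HARD_MAX_SECONDS = 5 * 60      # maximum duration listed in the specifications


def wav_duration_seconds(source) -> float:
    """Return the duration of a WAV file given a path or a binary file object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def classify_duration(seconds: float) -> str:
    """Label a clip length relative to the recommended and maximum durations."""
    if seconds > HARD_MAX_SECONDS:
        return "too_long"
    if seconds > RECOMMENDED_MAX_SECONDS:
        return "allowed_but_may_drift"
    return "ideal"
```

For example, `classify_duration(wav_duration_seconds("narration.wav"))` would flag a 90-second narration as `allowed_but_may_drift`, a hint to split it into shorter clips.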

Avatar v2 — Available Models

Avatar Standard (Default)
Model: kling-v2-avatar
Mode: std
Natural lip-sync and expressive motion from portrait + audio.
Avatar Pro
Model: kling-v2-avatar
Mode: pro
Higher fidelity, smoother motion, improved expressivity.
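Both tiers above use the same model name and differ only in mode. A small lookup table, sketched here in Python, is one way to keep that pairing in a single place. The dictionary layout and the helper `resolve_tier` are assumptions for illustration; only the model name, the two mode values, and the choice of default come from the listing above.

```python
from typing import Dict, Optional

MODEL_NAME = "kling-v2-avatar"

# Tier name -> settings, copied from the model listing above.
TIERS: Dict[str, Dict[str, str]] = {
    "Avatar Standard": {"model": MODEL_NAME, "mode": "std"},
    "Avatar Pro": {"model": MODEL_NAME, "mode": "pro"},
}
DEFAULT_TIER = "Avatar Standard"


def resolve_tier(name: Optional[str] = None) -> Dict[str, str]:
    """Return a copy of the settings for a tier, falling back to the default."""
    tier = name or DEFAULT_TIER
    if tier not in TIERS:
        raise KeyError(f"unknown tier: {tier}")
    return dict(TIERS[tier])
```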
📥 You Give: 🖼️ Character Photo, 🎤 Audio (TTS or Upload), 🎭 Expression Prompt
🎬 You Get: Video
Quality modes: Standard, Professional
TTS emotions: 😐 Neutral, 😊 Happy, 😠 Angry, 😢 Sad, 😨 Fearful, 🤢 Disgusted, 😲 Surprised
⏱️ Max duration: 5 min
🎤 Audio source: Upload (MP3/WAV/M4A)
🎤 Audio source: TTS
🌐 TTS languages: English, Chinese
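The limits listed above (upload formats, TTS languages, TTS emotions) lend themselves to a quick client-side check before a job is submitted. The following Python sketch is one possible way to do that. The function names and message wording are hypothetical, and the service may accept spellings or values not shown on this page.

```python
from pathlib import Path
from typing import List

# Values copied from the specifications listed above.
UPLOAD_EXTENSIONS = {".mp3", ".wav", ".m4a"}
TTS_LANGUAGES = {"English", "Chinese"}
TTS_EMOTIONS = {"Neutral", "Happy", "Angry", "Sad",
                "Fearful", "Disgusted", "Surprised"}


def check_upload(filename: str) -> List[str]:
    """Return a list of problems with an uploaded audio file name."""
    suffix = Path(filename).suffix.lower()
    if suffix not in UPLOAD_EXTENSIONS:
        return [f"unsupported audio format: {suffix or '(none)'}"]
    return []


def check_tts(language: str, emotion: str) -> List[str]:
    """Return a list of problems with the chosen TTS language and emotion."""
    problems = []
    if language not in TTS_LANGUAGES:
        problems.append(f"unsupported TTS language: {language}")
    if emotion not in TTS_EMOTIONS:
        problems.append(f"unknown TTS emotion: {emotion}")
    return problems
```

Returning a list of messages, instead of raising on the first failure, lets a form show every problem at once.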

💰 Avatar v2 — Pricing

Estimated cost
Failed jobs are automatically refunded
The Avatar 2.0 feature allows you to upload character images, add voiceovers, and describe the character’s expressions to generate lifelike dynamic avatar videos. The newly upgraded Avatar 2.0 dramatically enhances performance, offering full coverage for 5-minute-long content scenes!

Showcase Kling Avatar

Prompt Excited and joyful, the child raises her hands covered in paint, laughing and interacting with the colorful art supplies on the table, camera zooms in.
Prompt Selfie of a young lady with a bright smile, her eyes sparkling with excitement as she sits in the driver's seat. Very subtle handheld camera movement. No cars passing by. No distortions. Very natural movements.
Prompt With a joyful expression Santa laughs and interacts with the camera, gesturing with open hands wearing white gloves, exuding holiday cheer, surrounded by festive lights and decorations.
Prompt While talking, they excitedly shook their heads and swayed their bodies. Finally, they clenched their fists and decided to set off, jumping and skipping happily.
Prompt Put hands together in front of your chest, and finally hold them together and tell a story naturally.
Prompt He raised his hand to touch his glasses and then angrily pointed at the camera with his finger.
Prompt Patient and gentle explanations, occasionally glancing at the item in the hand, maintaining a smile, with natural movement.
Prompt Professional explanations, natural movements, and sometimes use gestures to assist in the explanation.
Prompt The singer sings earnestly, enjoying the stage with a smile, her body movements swaying naturally in coordination with the performance.
Prompt The female singer sings to the audience while looking confident, occasionally smiling at the camera, hand on the microphone, natural arm movements.
Prompt In a commercial advertisement, a person holds a product in one hand and speaks directly to the camera. The gesture is deliberate and confident.
Prompt The expression is intoxicated, emotions high, gently shaking the head. The snake around the neck moves as light reflects off its body, gradually zooming in on the face.
Prompt Smiling, swaying confidently while rapping, holding a microphone. Eyes focused on the audience, natural and fluid movements. Occasional head movements.
Prompt Confidently posing with a sultry gaze, the figure exudes an aura of mystery and allure, captivating the audience with every movement.
Prompt A teacher is speaking politely and earnestly.
Prompt Confidently holding a smartphone, standing in an empty street, exuding a mysterious aura with a slight smile.
Prompt The man is angry, shown in both facial expression and action.
Prompt Smiling warmly at the camera, she gently touches her necklace, exuding confidence and grace.
