
Motion Control — Technical Guide

Upload a photo and a motion video, and your character performs exactly the same movements, dance, or gestures in a new AI video.


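Have you ever watched a dance video and wished your character could do the same moves? That is exactly what Motion Control does.

Upload any photo of your character and a video of a person performing an action (dancing, walking, waving, working out), and the AI generates a brand-new video in which your character performs exactly the same action. Your character keeps their face, clothing, and identity, but their movements now match those of the person in the reference video.

Two modes let you control the result: Camera Follow keeps the same camera angle as the reference video, while Image Match preserves your character's original pose orientation and lets you add camera movement through the prompt.

It works well for having your AI character dance to trending choreography, walk through a scene, perform sign language, demonstrate a workout, or act out any body movement you can film or find online.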

✦ Best Results Tips

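🧍 Show the full body

Both your character photo and the motion video need to show the full body clearly, from head to toe, with no cropping. The AI needs to see every limb to transfer the motion accurately.

🎬 One continuous shot

The motion video must be a single continuous shot: no cuts, no scene changes, no camera switches. Use one smooth clip of the action from start to finish.

🤚 Keep the motion steady

Slow and moderate movements produce the best results. Fast or jerky movements tend to get lost or distorted during transfer.

👤 One person only

Use a motion video with only one person clearly visible. If there are several people, the AI picks the largest one, which may not be the one you expect.

📐 Match the framing

If your character photo is a full-body shot, use a full-body motion video. If it is a half-body shot, use a half-body motion video. Mismatched framing produces awkward results.

🏃 Real human motion works best

Film a real person performing the action you want. Animated or CGI reference videos transfer less well than footage of a real person.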

Motion Control — Available Models

Kling v2.6 (kling-v2-6, default): Stable motion transfer from reference video.

Kling v3.0 (kling-v3): Higher quality motion control.
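The model list can be captured as a small lookup table with a default. The sketch below is illustrative only: the `MODELS` dict and the `resolve_model` helper are hypothetical names and not part of any Kling SDK. Only the two identifiers and the "Default" marking come from this page.

```python
# Identifiers as listed on this page; the dict and helper are hypothetical.
MODELS = {
    "kling-v2-6": "Kling v2.6: stable motion transfer from reference video",
    "kling-v3": "Kling v3.0: higher quality motion control",
}
DEFAULT_MODEL = "kling-v2-6"  # marked "Default" on this page


def resolve_model(model_id=None):
    """Return a known model identifier, falling back to the default."""
    if model_id is None:
        return DEFAULT_MODEL
    if model_id not in MODELS:
        raise ValueError(
            f"unknown model {model_id!r}; expected one of {sorted(MODELS)}"
        )
    return model_id
```

Calling `resolve_model()` with no argument yields the default identifier, while an unrecognized identifier fails early instead of being passed along.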

📥 You Give: Character Image, Motion Reference Video, Text Prompt (optional)

🎬 You Get: Video

Quality modes: Standard, Professional

Orientation modes: Camera Follow (up to 30s), Image Match (up to 10s)

Keep audio: preserve the original sound

Max duration: 30s

Video input: MP4/MOV, 3-30s, max 100MB

Image input: JPG/PNG, max 10MB
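The input limits above lend themselves to a quick local check before uploading. The following is a minimal sketch, not an official validator: the function names are hypothetical, `.jpeg` is assumed to be accepted alongside `.jpg`, and 1 MB is assumed to mean 1024 × 1024 bytes, which the page does not specify. File size and duration are passed in as plain numbers so the sketch needs no media library.

```python
from pathlib import Path

VIDEO_EXTS = {".mp4", ".mov"}
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}   # ".jpeg" is an assumption
MAX_VIDEO_BYTES = 100 * 1024 * 1024      # assumes 1 MB = 1024 * 1024 bytes
MAX_IMAGE_BYTES = 10 * 1024 * 1024
MIN_VIDEO_S, MAX_VIDEO_S = 3.0, 30.0


def check_video(filename, size_bytes, duration_s):
    """Return a list of problems with a motion video (empty if none found)."""
    problems = []
    if Path(filename).suffix.lower() not in VIDEO_EXTS:
        problems.append("video must be MP4 or MOV")
    if size_bytes > MAX_VIDEO_BYTES:
        problems.append("video exceeds 100MB")
    if not MIN_VIDEO_S <= duration_s <= MAX_VIDEO_S:
        problems.append("video must be 3-30s long")
    return problems


def check_image(filename, size_bytes):
    """Return a list of problems with a character image (empty if none found)."""
    problems = []
    if Path(filename).suffix.lower() not in IMAGE_EXTS:
        problems.append("image must be JPG or PNG")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append("image exceeds 10MB")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue with a file in a single pass.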

💰 Motion Control — Pricing

Failed jobs are automatically refunded

Kling VIDEO 3.0 Motion Control

The newly released Kling VIDEO 3.0 Motion Control builds upon the Motion Control introduced in VIDEO 2.6, delivering key capability upgrades. VIDEO 3.0 Motion Control enhances facial consistency across scenarios, ensuring stable facial features and smooth expressions even in complex, multi-angle, long-duration motions. This upgrade expands Motion Control into cinematic performance, high-precision motion capture, and diverse entertainment scenarios, delivering more powerful and reliable video generation.

Consistent Facial Identity from Any Angle

[Comparison examples: Reference Image, Element, Output with Element Binding, Output without Element Binding]

Complex Emotions, Faithfully Reproduced

[Comparison examples: Reference Image, Element, Output with Element Binding, Output without Element Binding]

Face Occlusion, High-Fidelity Restoration

[Comparison example: Reference Image, Element, Output with Element Binding, Output without Element Binding]

Consistent Facial Clarity Across Dynamic Framing

[Comparison examples: Reference Image, Element, Output with Element Binding, Output without Element Binding]

How to Achieve the Desired Outputs

1. The Motion Control Element Library only uses facial information for reference. It does not include clothing, hairstyle, makeup, or props. Therefore, we recommend uploading clear facial close-ups to ensure sufficient facial data.

2. Whether you upload images or videos, follow this core principle: Upload facial references that match the result you want to generate.

a) Head Turn Accuracy — To achieve more accurate head turns, upload: a front-facing view, side views (left and/or right).

b) Facial Expression Accuracy — To better match facial expressions (such as smiling), upload: a neutral front-facing image, a smiling front-facing image.

c) 360° Smiling Rotation — For a seamless 360° smiling rotation, upload: front-facing smile, left-profile smile, right-profile smile, upward-facing smile, downward-facing smile.

d) Complex Emotional Transitions with Head Movement — For complex emotional changes (e.g. happy to sad) combined with head turns, upload: a front-facing image, a smiling expression, a sad expression, side views (left or right).

e) If you need complex facial expressions while maintaining high identity accuracy, we strongly recommend uploading a video, which provides richer and more continuous facial information.

3. Edge Cases

- The first frame in Motion Control may contain multiple people, but only one element is supported; the system selects the person with the largest on-screen presence as the element. If several people occupy similar portions of the frame, no element is selected.

- If the element's face differs significantly from the face in the first frame (for example, a cat's face used as the reference for a human), there is a small chance that facial quality will degrade.
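The multi-person edge case can be pictured as a simple selection rule over how much of the frame each person occupies. The sketch below only illustrates the described behavior and is not Kling's actual logic: the function name is hypothetical, and the `similarity` threshold of 0.9 is an assumption, since the page does not define what counts as "similar portions".

```python
def select_element(areas, similarity=0.9):
    """Pick the index of the person to bind the element to, or None.

    areas: on-screen area per detected person (any consistent unit).
    similarity: assumed cutoff; if the runner-up covers at least this
    fraction of the largest person's area, nobody is selected.
    """
    if not areas:
        return None
    # Indices sorted from largest to smallest on-screen area.
    order = sorted(range(len(areas)), key=lambda i: areas[i], reverse=True)
    if len(order) == 1:
        return order[0]
    largest, runner_up = areas[order[0]], areas[order[1]]
    if largest <= 0:
        return None
    if runner_up / largest >= similarity:
        return None  # too close to call, so no element is selected
    return order[0]
```

In practice, this suggests choosing a first frame where the intended person is clearly the largest figure, so the selection is unambiguous.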

Technical Requirements:

1. Ensure the character's entire body and head are clearly visible and not obstructed.

2. Upload a single character motion reference. For motion references with two or more characters, the motion of the character occupying the largest portion of the frame will be used for generation.

3. Real human actions are recommended, although some stylized characters with humanoid body proportions can also be recognized.

4. The action video must be a single continuous shot, with the character consistently visible in the frame. Please avoid cuts, shot changes, or camera movements; otherwise, the video may be truncated.

5. Avoid overly fast motions; steady, moderate movements yield the best results.

6. The short edge must be at least 340px, and the long edge must not exceed 3850px.

7. The supported duration for uploaded action videos is 3–30 seconds, and the generated video duration will match the length of the uploaded video. If the action is highly complex or performed at a very fast pace, there is a possibility that the generated result may be shorter than the original upload. This is because the model extracts only the valid and continuous action segments for generation. As long as a minimum of 3 seconds of usable continuous motion is extracted, the video can be generated. Please note that in such cases, the consumed Credits are non-refundable. We recommend adjusting the action difficulty and speed accordingly for optimal results.
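Requirements 6 and 7 are numeric, so they can be checked mechanically. The sketch below is written under stated assumptions: the helper names are hypothetical, the page does not say whether the pixel limits apply to the image, the video, or both (so the helper simply takes a width and height), and `usable_s` stands in for the length of continuous motion the model manages to extract, which you cannot know in advance.

```python
MIN_SHORT_EDGE_PX = 340
MAX_LONG_EDGE_PX = 3850
MIN_MOTION_S = 3.0
MAX_MOTION_S = 30.0


def check_resolution(width, height):
    """True if the short edge is >= 340px and the long edge is <= 3850px."""
    short_edge, long_edge = sorted((width, height))
    return short_edge >= MIN_SHORT_EDGE_PX and long_edge <= MAX_LONG_EDGE_PX


def expected_output_duration(upload_s, usable_s=None):
    """Return the output length in seconds, or None if nothing can be generated.

    upload_s: duration of the uploaded motion reference (must be 3-30s).
    usable_s: hypothetical length of the valid continuous motion segment;
    defaults to the full upload when the whole clip is usable.
    """
    if not MIN_MOTION_S <= upload_s <= MAX_MOTION_S:
        raise ValueError("motion reference must be 3-30 seconds long")
    usable = upload_s if usable_s is None else min(usable_s, upload_s)
    # Generation needs at least 3 seconds of usable continuous motion.
    return usable if usable >= MIN_MOTION_S else None
```

For example, a 12-second upload from which only 7.5 seconds of motion is usable would produce a 7.5-second result, and per the note above the credits for that shortened result are not refunded.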

Perfectly Synchronized Full-Body Motions

[Example: Image Reference, Motion Reference, Output]

How to Achieve the Desired Outputs

1. Match the framing: pair a full-body image reference with a full-body motion reference, and a half-body image reference with a half-body motion reference.

2. Use a motion reference that features a wide range of motion, moderate speed, and minimal displacement.

3. For motion references with large movements, make sure the image reference leaves enough space around the character to move freely.
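The framing rule in step 1 is easy to encode as a pre-upload check. This is a minimal sketch: the labels `"full_body"` and `"half_body"` and the function name are hypothetical, and deciding which label applies to a given file is left to the user.

```python
FRAMINGS = ("full_body", "half_body")  # hypothetical labels


def framing_warning(image_framing, motion_framing):
    """Return None when the framings match, else a short warning string."""
    for name in (image_framing, motion_framing):
        if name not in FRAMINGS:
            raise ValueError(
                f"unknown framing {name!r}; expected one of {FRAMINGS}"
            )
    if image_framing == motion_framing:
        return None
    return (
        f"image reference is {image_framing} but motion reference is "
        f"{motion_framing}; use matching framing"
    )
```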

[Examples: Image Reference (Half Body, Full Body) paired with Motion Reference (Half-Body, Full-Body)]


Character Orientation

1. By default, the video is generated in "Character Orientation Matches Video" mode: the character's movements, expressions, camera movements, and orientation all follow the motion reference. Other details can be controlled via prompts.

2. In "Character Orientation Matches Image" mode, the character's movements and expressions still follow the motion reference, but the orientation aligns with the character's orientation in the image reference. Camera movements and other elements can be customized through prompts.
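The two orientation modes differ in what comes from the motion reference and what is left to the prompt. The sketch below summarizes that as data. It assumes that "Character Orientation Matches Video" corresponds to the Camera Follow mode (up to 30s) and "Character Orientation Matches Image" to the Image Match mode (up to 10s) listed earlier on this page; that mapping is inferred, not stated. All names in the code are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrientationMode:
    label: str
    max_duration_s: int          # assumed from the Camera Follow / Image Match limits
    camera_from_reference: bool  # True: camera movement follows the motion reference
    orientation_source: str      # "video" or "image"


MODES = {
    "matches_video": OrientationMode(
        "Character Orientation Matches Video", 30, True, "video"
    ),
    "matches_image": OrientationMode(
        "Character Orientation Matches Image", 10, False, "image"
    ),
}
DEFAULT_MODE = "matches_video"  # the page describes this mode as the default


def fits_mode(mode_key, motion_duration_s):
    """True if a motion reference of this length fits the mode's duration cap."""
    return motion_duration_s <= MODES[mode_key].max_duration_s
```

Under these assumptions, a 25-second motion reference fits the default mode but would exceed the limit for the image-matched mode, where camera movement is instead described in the prompt.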

Camera Movement Showcase (Character Orientation Matches Image)

[Examples: Zoom In, Zoom Out, Camera Up, Camera Down, Fixed Position]