🎬 Kitchen Cinema

AI video generation is moving fast. Here's the current landscape, what each tool is best at, and how to prompt for video that actually looks good.

The tools (as of 2025)

Sora (OpenAI) → Best quality

Best overall quality and coherence. Clips run up to 20 seconds. Available to ChatGPT Plus and Pro users.

Most creative control. Good for stylised and artistic video. Strong image-to-video.

Kling → Best realism

Strong photorealistic video, especially for faces and natural motion. Good free tier.

Fast generation, good for product shots and camera movements. Free tier available.

Good for short creative clips. Easy to use, solid free tier, good lip sync.

HeyGen → Talking heads

AI avatars and talking head videos. Best tool for generating presenter videos at scale.

Prompt structure for video

1. The scene: "A chef in a sunlit kitchen at 7am"
2. The action: "carefully cracks an egg into a cast iron pan"
3. The camera: "close-up shot, slow push-in"
4. The atmosphere: "steam rising, warm morning light, peaceful"
5. The style: "cinematic, 35mm film look, shallow depth of field"
Combined:

"A chef in a sunlit kitchen at 7am carefully cracks an egg into a cast iron pan. Close-up shot, slow push-in. Steam rising, warm morning light, peaceful. Cinematic, 35mm film look, shallow depth of field."

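If you write prompts in bulk, the five-part structure above can be sketched as a small helper. This is only an illustration: the `VideoPrompt` class and its field names are hypothetical and not part of any video tool's API. It just builds the text you would paste into a generator.

```python
from dataclasses import dataclass


@dataclass
class VideoPrompt:
    """Hypothetical container for the five prompt parts (not a real tool API)."""
    scene: str
    action: str
    camera: str
    atmosphere: str
    style: str

    def render(self) -> str:
        # Scene and action read as one sentence.
        first = f"{self.scene} {self.action}."
        # Camera, atmosphere, and style follow as short sentences.
        rest = [self.camera, self.atmosphere, self.style]
        # Capitalise only the first character so terms like "35mm" stay intact.
        tail = " ".join(f"{part[0].upper()}{part[1:]}." for part in rest if part)
        return f"{first} {tail}"


prompt = VideoPrompt(
    scene="A chef in a sunlit kitchen at 7am",
    action="carefully cracks an egg into a cast iron pan",
    camera="close-up shot, slow push-in",
    atmosphere="steam rising, warm morning light, peaceful",
    style="cinematic, 35mm film look, shallow depth of field",
)
print(prompt.render())
```

Keeping the parts separate makes it easy to change one element, such as the camera or the style, while holding the rest of the shot fixed.
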
Camera moves that work

Slow push-in

Camera slowly moves toward subject. Creates intimacy and tension.

Pull back reveal

Starts close, pulls back to show the wider scene. Great for reveals.

Orbit / 360

Camera circles the subject. Works well for products and objects.

Tracking shot

Camera moves alongside a moving subject. Good for walking scenes.

Top-down / bird's eye

Looking straight down. Works beautifully for food, maps, crowds.

Static lock-off

Camera doesn't move at all. Often the cleanest choice: let the scene do the work.
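
One way to compare these moves is to hold the scene and style constant and change only the camera phrase. The sketch below does that in plain Python. The `CAMERA_MOVES` phrasing and the `camera_variants` helper are assumptions made for illustration, and no tool is guaranteed to follow a given camera instruction exactly.

```python
# Camera phrases based on the moves listed above; the wording is illustrative.
CAMERA_MOVES = {
    "slow push-in": "slow push-in toward the subject",
    "pull back reveal": "starts close, then pulls back to reveal the wider scene",
    "orbit": "camera orbits the subject",
    "tracking shot": "tracking shot moving alongside the subject",
    "top-down": "top-down bird's-eye view",
    "static lock-off": "static locked-off shot, no camera movement",
}


def camera_variants(base: str, style: str) -> dict[str, str]:
    """Return one prompt per camera move, keeping scene and style fixed."""
    return {
        name: f"{base}. {phrase[0].upper()}{phrase[1:]}. {style}."
        for name, phrase in CAMERA_MOVES.items()
    }


variants = camera_variants(
    base="A chef in a sunlit kitchen carefully cracks an egg into a cast iron pan",
    style="Cinematic, 35mm film look, shallow depth of field",
)
for name, text in variants.items():
    print(f"[{name}] {text}")
```

Running each variant through the same tool shows which move suits the shot, since nothing else in the prompt changes.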

💡

Pro tip: Generate a high-quality image first in Midjourney or DALL·E, then use image-to-video in Runway or Kling. You get more control over the starting frame and better consistency than text-to-video alone.