🎨 Image & Video
Prompting for visual AI is its own craft. These techniques work across Midjourney, DALL·E, Stable Diffusion, Runway, and Sora.
Describe, don't instruct
Image models aren't assistants; they're visual interpreters. Don't say "make me a photo of a dog." Say "golden retriever mid-leap over a stream, natural forest light, motion blur, Canon 5D."
Camera & lens matter
Adding camera specs usually pushes results toward photorealism: "shot on Sony a7R IV, 85mm f/1.4, shallow depth of field, bokeh background." The model associates these terms with the look of real photographs.
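As a rough illustration of the two tips above, the descriptive fragments can be assembled in code. This is a minimal sketch: `build_image_prompt` and its parameter names are hypothetical helpers, not part of any tool's API, and the result is just a string to paste into whichever generator you use.

```python
def build_image_prompt(subject, action="", lighting="", camera="", extras=()):
    """Join descriptive fragments into one comma-separated prompt.

    Empty or whitespace-only fragments are skipped, so the prompt
    never contains stray commas.
    """
    parts = [subject, action, lighting, camera, *extras]
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_image_prompt(
    subject="golden retriever mid-leap over a stream",
    lighting="natural forest light",
    camera="shot on Sony a7R IV, 85mm f/1.4",
    extras=("shallow depth of field", "bokeh background"),
)
print(prompt)
```

Keeping subject, lighting, and camera as separate arguments makes it easy to change one element at a time and compare results.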
Aspect ratio is part of the prompt
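Common ratios: wide landscape (16:9), portrait (2:3), square (1:1), cinema (2.39:1). Midjourney takes an --ar flag (for example, --ar 16:9); other tools offer size or aspect-ratio presets. Pick the ratio before you prompt, because it shapes the composition.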
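For tools that ask for pixel dimensions instead of a ratio, the conversion is simple arithmetic. The sketch below assumes a 1024-pixel long side and rounds both sides to a multiple of 64. Both are common conventions rather than requirements of any particular tool, and `dimensions_for` is a hypothetical helper name.

```python
def dimensions_for(ratio, long_side=1024, multiple=64):
    """Return (width, height) in pixels for a ratio string such as "16:9".

    The longer side is set to `long_side`, the shorter side is scaled to
    match, and both are rounded to the nearest `multiple` (assumed here).
    """
    w_part, h_part = (float(x) for x in ratio.split(":"))

    def snap(value):
        # Round to the nearest multiple, never going below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    if w_part >= h_part:
        return snap(long_side), snap(long_side * h_part / w_part)
    return snap(long_side * w_part / h_part), snap(long_side)


for ratio in ("16:9", "2:3", "1:1", "2.39:1"):
    print(ratio, dimensions_for(ratio))
```

Because of the rounding, the result only approximates the requested ratio. For example, 2.39:1 becomes 1024 x 448, which is closer to 2.29:1.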
Negative prompts remove problems
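Many tools support negative prompts: things to exclude. Stable Diffusion interfaces have a separate negative prompt field, and Midjourney uses the --no flag. Common ones: "blurry, low quality, watermark, extra fingers, deformed, text overlay."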
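How exclusions are passed differs by tool, so it can help to keep one list and format it two ways. In this sketch the `negative_prompt` key mirrors the parameter name used by Hugging Face diffusers' Stable Diffusion pipelines, and `--no` is Midjourney's inline flag. The function names are hypothetical, and nothing here calls a real model.

```python
NEGATIVE_TERMS = [
    "blurry", "low quality", "watermark",
    "extra fingers", "deformed", "text overlay",
]


def with_negative_field(prompt, negatives):
    """Keyword arguments for tools with a separate negative-prompt field."""
    return {"prompt": prompt, "negative_prompt": ", ".join(negatives)}


def with_no_flag(prompt, negatives):
    """Single prompt string for tools that take exclusions inline via --no."""
    if not negatives:
        return prompt
    return f"{prompt} --no {', '.join(negatives)}"


base = "portrait of an elderly fisherman, overcast light, 85mm"
print(with_negative_field(base, NEGATIVE_TERMS))
print(with_no_flag(base, NEGATIVE_TERMS))
```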
Reference images anchor style
Tools like Midjourney support --sref (style reference) and --cref (character reference). Upload an image of the style or person you want to match.
Seed for consistency
Once you get a result you like, save the seed number. Reusing it with variations of your prompt gives more consistent character and style across images.
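Why a seed helps: image generators typically start from pseudo-random noise, and the seed fixes that starting point. The sketch below is a toy model built on Python's `random` module, not a real image model. `initial_noise` and `seeded_jobs` are hypothetical names, the example prompts are illustrative, and the way a seed is actually passed (for example, Midjourney's --seed parameter) varies by tool.

```python
import random


def initial_noise(seed, n=4):
    """Toy stand-in for the starting noise of a generation run."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def seeded_jobs(base_prompt, variations, seed):
    """Pair every prompt variation with the same saved seed."""
    return [{"prompt": f"{base_prompt}, {v}", "seed": seed} for v in variations]


# Same seed, same starting noise; a different seed starts somewhere else.
assert initial_noise(1234) == initial_noise(1234)
assert initial_noise(1234) != initial_noise(1235)

jobs = seeded_jobs(
    "portrait of a red-haired sailor, oil painting",
    ["smiling on deck", "reading a map by lantern light"],
    seed=1234,
)
for job in jobs:
    print(job)
```

Only the variation text changes between jobs; the base prompt and the seed stay fixed, which is what keeps the results related.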
Start with a still
The best video prompts describe a frozen moment first, then the motion: "A candle flame on a dark table. The flame flickers gently in a light breeze."
Describe motion precisely
Vague motion = bad output. "The camera slowly pulls back to reveal" beats "camera movement." Specify direction, speed, and what's being revealed.
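The still-first structure and the three motion details (direction, speed, what is revealed) can be captured in a small template. This is a sketch under assumptions: `CameraMove` and `build_video_prompt` are hypothetical names, and the "dim, empty room" reveal is an invented detail for illustration.

```python
from dataclasses import dataclass


@dataclass
class CameraMove:
    movement: str      # direction, e.g. "pulls back" or "pans left"
    speed: str         # e.g. "slowly"
    reveals: str = ""  # what comes into view, if anything

    def describe(self):
        text = f"The camera {self.speed} {self.movement}"
        if self.reveals:
            text += f" to reveal {self.reveals}"
        return text + "."


def build_video_prompt(still, subject_motion, camera=None):
    """Describe the frozen moment first, then subject motion, then the camera."""
    sentences = [still.rstrip(".") + ".", subject_motion.rstrip(".") + "."]
    if camera is not None:
        sentences.append(camera.describe())
    return " ".join(sentences)


video_prompt = build_video_prompt(
    still="A candle flame on a dark table",
    subject_motion="The flame flickers gently in a light breeze",
    camera=CameraMove("pulls back", "slowly", "a dim, empty room"),
)
print(video_prompt)
```

Making `movement` and `speed` required fields means a vague "camera movement" cannot slip through without a direction and a pace.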
Duration changes everything
Current AI video tools (Sora, Runway, Kling) have sweet spots, usually 4 to 10 seconds. Design your scene to work in that window: one moment, not a story.
Lighting and atmosphere
Same as image prompting: golden hour, neon-lit rain, foggy morning. Atmosphere carries more visual weight in video than in stills.
What tools exist now
Runway Gen-3, Kling, Luma Dream Machine, Sora (limited access), Pika. Each has different strengths: Kling for realism, Runway for creative control.
Image-to-video is more reliable
Generate a strong image first (Midjourney, DALL·E), then animate it with a video tool. More control, better consistency than text-to-video alone.