Most people write AI image prompts the same way they write Google searches: a few words that describe what they want. The result is generic, flat images that look like every other AI output. The difference between a forgettable image and a stunning one usually comes down to six things: subject clarity, style, lighting, composition, mood, and quality modifiers. This guide walks through them, with copy-paste examples you can use today.
Why most AI image prompts fail
AI image models don't think like photographers. They fill every unspecified variable with the statistical average: the most common background, the most common lighting, the most common angle they've seen in training data. When you write 'a dog in a park', you get the most average dog in the most average park under average noon lighting. The more you specify, the closer the result gets to what you envisioned.
The anatomy of a great AI image prompt
Every effective AI image prompt has the same six components, in roughly this order:
- Subject: what you want in the image (be specific about appearance, action, position)
- Style: the visual language (photography, illustration style, art movement, named artist)
- Lighting: the single biggest influence on mood and realism
- Composition & camera: angle, distance, lens characteristics
- Mood & atmosphere: weather, time of day, emotional tone
- Quality modifiers: resolution, rendering quality, platform-specific tags
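As a rough sketch, the six components can be treated as slots that are joined in order. The `build_prompt` helper and its parameter names are hypothetical and not part of any image tool's API; it only assembles a text string, and the example values are illustrative.

```python
def build_prompt(subject, style="", lighting="", composition="", mood="", quality=""):
    """Join the six prompt components in order, skipping any left empty."""
    parts = [subject, style, lighting, composition, mood, quality]
    # Strip whitespace and drop empty slots so no stray commas appear.
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    subject="a border collie mid-leap catching a red frisbee",
    style="documentary photography",
    lighting="golden hour backlight",
    composition="low angle, 35mm lens",
    mood="joyful, late summer evening",
    quality="sharp focus, highly detailed",
)
print(prompt)
```

Keeping the slots separate makes it easy to see which component is missing before you submit a prompt.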
Subject: be specific or get generic
The subject is where most people stop, but it's only the starting point. Vague subjects produce vague images. Compare these two prompts:
❌ Weak: a woman sitting at a desk
✅ Strong: a focused woman in her late 30s, dark hair pulled back, wearing a grey linen blazer, sitting at a mid-century walnut desk covered in open notebooks, looking slightly to the left off-camera
Every added detail is a constraint that removes randomness. Specify age, clothing, expression, and position for people. For objects, specify material, colour, and context. For environments, specify architectural style, time period, and key elements.
Style: the visual language of your image
Style determines how the image is rendered, not what's in it. The same subject photographed versus illustrated versus painted produces completely different outputs. Common style frameworks:
- Photography styles: editorial fashion photography, documentary, commercial product photography, photojournalism
- Art movements: Bauhaus, Art Nouveau, surrealism, impressionism, brutalism
- Illustration styles: flat vector, hand-drawn, watercolour, ink sketch, pixel art, 3D render
- Named director/photographer: Wes Anderson palette, Annie Leibovitz portraiture, Stanley Kubrick symmetry
- Platform shorthand: '--style raw' in Midjourney for a less stylised, more photographic look; 'anime style' for Japanese animation aesthetics
Lighting: the most underrated variable
Lighting is what makes an image feel real, dramatic, warm, clinical, or cinematic. Most beginners skip it entirely and get flat images. These are the lighting types worth knowing:
- Golden hour: warm, soft, long shadows — romantic, nostalgic, outdoor lifestyle
- Blue hour: cool, twilight quality — atmospheric, moody, urban
- Soft-box studio: even, professional, no harsh shadows — commercial, beauty, product photography
- Rembrandt lighting: one strong side light with characteristic triangle shadow — dramatic portraits
- Rim/backlight: subject outlined in light against dark background — cinematic, separates subject from background
- Natural window light: soft, directional, realistic — interior photography, lifestyle
- Harsh midday sun: strong shadows, high contrast — documentary, street photography
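One way to learn what each lighting type does is to hold the rest of the prompt fixed and change only the lighting. The sketch below does that with a hypothetical `lighting_variations` helper; the base prompt is illustrative, and the lighting names come from the list above.

```python
LIGHTING_TYPES = [
    "golden hour",
    "blue hour",
    "soft-box studio lighting",
    "Rembrandt lighting",
    "rim lighting",
    "natural window light",
    "harsh midday sun",
]


def lighting_variations(base_prompt, lighting_types=LIGHTING_TYPES):
    """Return one prompt per lighting type, keeping everything else fixed."""
    return [f"{base_prompt}, {lighting}" for lighting in lighting_types]


base = "portrait of an elderly fisherman, documentary photography"
for variant in lighting_variations(base):
    print(variant)
```

Generating the seven variants side by side shows how much of an image's mood comes from lighting alone.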
Composition and camera
Camera angle and distance change what a subject communicates emotionally. Low angles make subjects look powerful. Eye-level angles feel relatable. A bird's-eye view creates detachment or emphasises pattern. Focal length matters too: 35mm feels natural and environmental, 85mm flatters portraits, and 200mm compresses backgrounds dramatically.
Camera examples:
- 'low angle, looking up': makes the subject feel dominant
- 'straight-on eye level, subject centred': direct, confrontational
- 'wide angle, environmental shot': subject small in context
- '85mm portrait lens, f/1.4, shallow depth of field': professional headshot look
- 'macro extreme close-up': texture and detail emphasis
Quality modifiers that actually work
These are the end-of-prompt tags and parameters that tend to improve output quality, grouped by platform:
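- Midjourney: --ar 16:9 (widescreen aspect ratio), --stylize 750 (more artistic; the scale runs from 0 to 1000), --v 6 (selects model version 6; check whether a newer version is available), --q 2 (higher quality, but only on model versions that accept it)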
- DALL-E 3: 'ultra-detailed, photorealistic, 8K resolution' in the prompt text (these steer the level of detail; the output size itself is fixed by the tool)
- Stable Diffusion: 'masterpiece, best quality, highly detailed' as positive prompt; 'blurry, watermark, text' as negative
- Universal: 'award-winning photography', 'professional quality', 'sharp focus, high contrast'
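If you reuse the same prompt across tools, the platform-specific tags can be kept in a small lookup table. This is a minimal sketch: the `MODIFIERS` table, its platform keys, and the `add_quality_modifiers` helper are hypothetical, and the tag strings are taken from the list above. How a negative prompt is actually submitted depends on the interface you use.

```python
# Tag sets copied from the list above; adjust them to the model version you use.
MODIFIERS = {
    "midjourney": {"suffix": " --ar 16:9 --stylize 750", "negative": ""},
    "dalle3": {"suffix": ", ultra-detailed, photorealistic", "negative": ""},
    "stable_diffusion": {
        "suffix": ", masterpiece, best quality, highly detailed",
        "negative": "blurry, watermark, text",
    },
}


def add_quality_modifiers(prompt, platform):
    """Return (positive_prompt, negative_prompt) for the given platform key."""
    if platform not in MODIFIERS:
        raise ValueError(f"unknown platform: {platform!r}")
    entry = MODIFIERS[platform]
    return prompt + entry["suffix"], entry["negative"]


positive, negative = add_quality_modifiers("a lighthouse at dusk", "stable_diffusion")
print(positive)
print(negative)
```

Only the Stable Diffusion entry carries a negative prompt here, which mirrors the list above.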
Putting it all together: a before and after
❌ Before: a coffee shop in the morning
✅ After: A warm independent coffee shop interior, early morning, soft golden light streaming through large front windows onto worn oak tables, a single espresso cup with rising steam in the foreground, slightly blurred barista visible in background, shot at 35mm eye level, shallow depth of field, warm amber colour grade, editorial lifestyle photography, ultra-detailed, 8K
Common mistakes to avoid
- Over-stacking adjectives: 'incredibly extremely stunning beautiful amazing portrait' piles up redundant superlatives that add no visual information
- Conflicting styles: 'realistic watercolour pencil sketch 3D render' — pick one primary style
- Forgetting the background: an unspecified background defaults to the most common context for your subject
- Skipping lighting entirely: the single most common reason images look flat and AI-generated
- Too many subjects: AI struggles with 'three people doing five different things' — simplify
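Some of these mistakes can be caught with a quick check before a prompt is submitted. The sketch below is a crude keyword heuristic, not a real analysis of the prompt: the `lint_prompt` function and its word lists are hypothetical and only cover three of the mistakes above.

```python
import re

# Illustrative word lists only; extend them for your own prompts.
SUPERLATIVES = {"incredibly", "extremely", "stunning", "beautiful", "amazing"}
STYLES = ["watercolour", "pencil sketch", "3d render", "photograph", "oil painting"]
LIGHTING_WORDS = {
    "light", "lighting", "lit", "backlight", "backlit",
    "sun", "sunlight", "shadow", "shadows", "hour",
}


def lint_prompt(prompt):
    """Return a list of warnings for common prompt mistakes."""
    text = prompt.lower()
    words = re.findall(r"[a-z0-9']+", text)
    warnings = []
    # Three or more superlatives is treated as over-stacking.
    if sum(w in SUPERLATIVES for w in words) >= 3:
        warnings.append("over-stacked adjectives")
    # Two or more named styles suggests no single primary style.
    if sum(style in text for style in STYLES) >= 2:
        warnings.append("conflicting styles")
    # No lighting-related word at all means lighting was skipped.
    if not any(w in LIGHTING_WORDS for w in words):
        warnings.append("no lighting specified")
    return warnings


print(lint_prompt("incredibly extremely stunning beautiful amazing portrait"))
print(lint_prompt("realistic watercolour pencil sketch 3D render"))
```

Both example prompts from the list above trigger a warning, and both are also flagged for missing lighting.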