App Store screenshot captions that convert

Screenshot captions are marketing copy, not feature documentation — and they're read in under three seconds, at thumbnail size, by someone scrolling search results. These are the patterns that survive that test.

Where captions are actually read

In App Store search results, the first one to three portrait screenshots appear directly on the results card — most installs are decided there, before anyone opens your product page. Two consequences:

Screenshot #1 carries most of the weight. It gets your single strongest claim — the reason the app exists — not a login screen or a feature tour stop.
Captions must work at thumbnail size. If a headline isn't legible at a third of its design size, it's decoration, not copy.
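
The thumbnail rule comes down to simple arithmetic, so it can be checked before anything is exported. The sketch below is a rough check under stated assumptions: the shrink factor of 3 comes from the "a third of its design size" rule above, while `MIN_LEGIBLE_PX` and the example font sizes are illustrative placeholders, not App Store values.

```python
# Rough thumbnail legibility check. All thresholds are assumptions for illustration.

SHRINK_FACTOR = 3     # "a third of its design size", from the rule above
MIN_LEGIBLE_PX = 30   # assumed minimum size after shrinking; tune against your own previews


def thumbnail_font_px(design_font_px: float, shrink: float = SHRINK_FACTOR) -> float:
    """Return the effective headline size once the screenshot is scaled down."""
    return design_font_px / shrink


def is_legible_at_thumbnail(design_font_px: float,
                            min_px: float = MIN_LEGIBLE_PX,
                            shrink: float = SHRINK_FACTOR) -> bool:
    """True if the scaled-down headline still meets the assumed minimum size."""
    return thumbnail_font_px(design_font_px, shrink) >= min_px


if __name__ == "__main__":
    for size in (60, 96, 140):
        print(size, is_legible_at_thumbnail(size))
```

A number like this only narrows the field. Zooming the real preview out is still the test that counts.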

Benefit first, feature second

The most common failure mode is captioning what the screen is instead of what the user gets:

Feature-led (weak)	Benefit-led (strong)
PDF scanner with OCR	Scan anything. Search everything.
Customizable workout plans	A plan that adapts to your week
Multi-language support	Speaks your customer's language

A useful structure for a 5–6 slide set is hook → benefits → payoff: slide 1 states the core promise, the middle slides each claim one concrete benefit (one per slide — two claims on one slide means neither gets read), and the last slide closes with the outcome or social proof.
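
If you keep your slides in a list, the hook → benefits → payoff order is easy to sanity-check. This is a minimal sketch: the `Slide` type, the role labels, and the sample headlines are all hypothetical, and whether a slide really makes just one claim still needs a human read.

```python
from dataclasses import dataclass


@dataclass
class Slide:
    role: str       # "hook", "benefit", or "payoff" (labels assumed for this sketch)
    headline: str


def check_structure(slides: list[Slide]) -> list[str]:
    """Return a list of problems with the slide order; an empty list means it passes."""
    problems = []
    if not 5 <= len(slides) <= 6:
        problems.append(f"expected 5-6 slides, got {len(slides)}")
    if not slides:
        return problems
    if slides[0].role != "hook":
        problems.append("slide 1 should state the core promise (hook)")
    if slides[-1].role != "payoff":
        problems.append("last slide should close with the outcome or social proof (payoff)")
    # Every slide between the first and the last should claim exactly one benefit.
    for number, slide in enumerate(slides[1:-1], start=2):
        if slide.role != "benefit":
            problems.append(f"slide {number} should claim one benefit")
    return problems


# Hypothetical five-slide set for a scanner app.
example_set = [
    Slide("hook", "Scan anything. Search everything."),
    Slide("benefit", "Find any receipt in seconds"),
    Slide("benefit", "Clean scans in bad light"),
    Slide("benefit", "Share as PDF in one tap"),
    Slide("payoff", "Your paperwork, finally sorted"),
]
```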

Length rules that survive thumbnails

Headline: roughly 3–6 words. It should fit on one or two lines at large type. If you're tempted to shrink the font to fit, cut words instead.
Subheadline: optional, one short line. It qualifies the headline; it never introduces a second idea.
Don't repeat your app name or subtitle. They're already on the page — captions that restate them waste the only space you control inside the image.
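
These length rules are mechanical enough to lint. The sketch below assumes Python 3.9+ and counts whitespace-separated words, which is a crude proxy. The function name `lint_caption` and the app name `ScanPro` are hypothetical, used only for illustration.

```python
MIN_WORDS, MAX_WORDS = 3, 6  # "roughly 3-6 words", from the headline rule above


def lint_caption(headline: str, subheadline: str = "",
                 app_name: str = "", subtitle: str = "") -> list[str]:
    """Return a list of rule violations; an empty list means the caption passes."""
    problems = []

    # Headline length: count whitespace-separated words.
    n_words = len(headline.split())
    if not MIN_WORDS <= n_words <= MAX_WORDS:
        problems.append(f"headline has {n_words} words; aim for {MIN_WORDS}-{MAX_WORDS}")

    # Subheadline: optional, but a single line if present.
    if "\n" in subheadline.strip():
        problems.append("subheadline should be one short line")

    # No app-name or subtitle repetition (case-insensitive substring match).
    caption = f"{headline} {subheadline}".casefold()
    for label, value in (("app name", app_name), ("subtitle", subtitle)):
        if value and value.casefold() in caption:
            problems.append(f"caption repeats the {label}")

    return problems


if __name__ == "__main__":
    print(lint_caption("Scan anything. Search everything."))
    print(lint_caption("ScanPro: the fastest PDF scanner with OCR built in",
                       app_name="ScanPro"))
```

A linter like this catches length and repetition only. It cannot tell a feature-led headline from a benefit-led one: "PDF scanner with OCR" passes the word count and still fails as copy.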

Caption and screen must tell the same story

The caption sets the claim; the UI underneath is the evidence. “Scan anything” over a settings screen reads as a broken promise. Pick the screen first, then write the claim it can support — or design the screen state specifically for the slide (clean demo data, the feature mid-action, results visible).

App Store Review Guideline 2.3.3 cuts the other way too: screenshots should show the app in use, not just title art, a login page, or a splash screen. Pure marketing slides with no real UI risk rejection, so the caption-over-real-screen format is both the safe choice and the effective one.

Localize the copy, not just the words

A translated caption is not localized copy. German runs ~30% longer than English and will wrap differently; Korean and Japanese compress and lose rhetorical punch if translated literally. Treat each language as a rewrite of the claim, keep the layout shared, and preview every locale before exporting — the mechanics of that workflow are in the localization guide.
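
Overflow can be flagged roughly before you open the canvas. The sketch below wraps each locale's headline with Python's standard `textwrap` module and checks it against a line budget. `MAX_CHARS_PER_LINE` is an assumed value that depends on your font and layout, the German string is an illustrative translation, and character counts are only a proxy: they ignore proportional fonts and are a poor fit for CJK text.

```python
import textwrap

MAX_CHARS_PER_LINE = 18  # assumed budget for the headline box; depends on font and layout
MAX_LINES = 2            # headlines should fit on one or two lines


def wrap_headline(text: str, width: int = MAX_CHARS_PER_LINE) -> list[str]:
    """Greedy word wrap; long words are kept whole so overflow stays visible."""
    return textwrap.wrap(text, width=width, break_long_words=False)


def overflows(text: str, width: int = MAX_CHARS_PER_LINE,
              max_lines: int = MAX_LINES) -> bool:
    """True if the headline needs too many lines or contains a word wider than the box."""
    lines = wrap_headline(text, width)
    too_many_lines = len(lines) > max_lines
    too_wide = any(len(line) > width for line in lines)
    return too_many_lines or too_wide


captions = {
    "en": "A plan that adapts to your week",
    "de": "Ein Plan, der sich deiner Woche anpasst",  # illustrative translation
}

if __name__ == "__main__":
    for locale, text in captions.items():
        print(locale, wrap_headline(text), "OVERFLOW" if overflows(text) else "ok")
```

With these assumed budgets the English headline fits on two lines and the literal German one spills onto a third, which is the cue to rewrite the claim rather than shrink the type.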

A checklist before you export

Does slide 1 state the app's core promise, not a feature?
One claim per slide, everywhere?
Legible at thumbnail size (zoom your preview out to ~33%)?
No app-name or subtitle repetition?
Does every caption match the screen under it?
Has every locale been previewed for overflow?

App Store Screenshot Studio: write headlines and subheadlines per slide, keep them as a translatable slide × language table, preview any locale on the canvas, and export every language at exact App Store sizes. Free, open source, fully client-side.

FAQ

How many screenshots should I upload?
Use at least 5 of the 10 slots — more scroll depth means more chances to convince. But order them as if only the first three will ever be seen, because for most visitors that's true.

Should captions be sentence case or title case?
Pick one and keep it consistent across the set. Sentence case generally reads more naturally in non-English languages, which matters once you localize.

Can I A/B test captions?
Yes — App Store Connect's Product Page Optimization tests screenshot variants natively. Test one variable (usually the slide-1 headline) per experiment.