Why I built a fully client-side App Store screenshot studio
I ship indie iOS apps localized into nine languages. Every release ends the same way: a wall of screenshots that all need frames, headlines, translations, and pixel-exact export sizes. This is the story of the tool I built instead of doing that by hand again — and the two stubborn decisions that shaped it.
The arithmetic that breaks you
One app. Six screenshots. iPhone and iPad. Nine languages. That's 108 PNGs, each at an exact App Store Connect resolution, each with a translated headline that mustn't overflow its box. Change one layout detail and all 108 need regenerating. A design tool treats every one of those as a separate artboard; my releases were stalling on marketing images, not code.
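To make that multiplication concrete, here is a minimal sketch that enumerates every image a release needs. The names (`ExportJob`, `buildExportJobs`) and the locale codes are illustrative assumptions, not the tool's actual code; only the counts come from the paragraph above.

```typescript
// Hypothetical names throughout; only the 6 x 2 x 9 counts come from the article.
type Device = "iphone" | "ipad";

interface ExportJob {
  slide: number; // 1-based slide index
  device: Device;
  locale: string;
}

// One job per (slide, device, locale) combination.
function buildExportJobs(slideCount: number, devices: Device[], locales: string[]): ExportJob[] {
  const jobs: ExportJob[] = [];
  for (let slide = 1; slide <= slideCount; slide++) {
    for (const device of devices) {
      for (const locale of locales) {
        jobs.push({ slide, device, locale });
      }
    }
  }
  return jobs;
}

// Nine example locale codes (assumed; the article only says "nine languages").
const exampleLocales = ["en", "ko", "ja", "de", "fr", "es", "it", "pt", "zh"];
const exportJobs = buildExportJobs(6, ["iphone", "ipad"], exampleLocales);
console.log(exportJobs.length); // 6 * 2 * 9 = 108
```

Adding a seventh screenshot or a tenth language changes one argument here, which is the property a per-artboard design file lacks.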
The existing answers didn't fit. Screenshot SaaS tools solve the arithmetic, but they want my raw screenshots uploaded to their servers and a subscription for something I need four times a year. Figma templates solve the design, but not the multiplication.
Decision one: nothing leaves the browser
So the first constraint was almost emotional: my screenshots stay on my machine. No accounts, no upload, no backend — which turned out to be a forcing function for the whole architecture:
- Projects persist in localStorage, images in IndexedDB. Closing the tab loses nothing; clearing site data deletes everything. Honest storage.
- The canvas is Fabric.js; export renders each slide on an offscreen canvas at full Apple resolution and zips the result, all in-process.
- No backend also means free static hosting forever, and nothing to monetize, secure, or apologize for. The project is MIT-licensed open source.
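As a rough illustration of what "full Apple resolution" implies for export, the sketch below computes the scale factor needed to render an editor-sized slide at an exact target size. It is an assumption about one possible approach, not a description of how the tool does it: the `Size` type, the `TARGETS` table, and `exportMultiplier` are hypothetical, and the two pixel sizes are examples that should be checked against Apple's current screenshot specifications. Fabric.js does accept a `multiplier` option in `canvas.toDataURL`, which is where a value like this could be used.

```typescript
interface Size {
  width: number;
  height: number;
}

// Example portrait targets (assumed; verify against Apple's current list).
const TARGETS = {
  iphone67: { width: 1290, height: 2796 },
  ipad129: { width: 2048, height: 2732 },
};

// Returns the factor that scales the editor canvas to the target size.
// Throws when the aspect ratios differ, because the exported image
// would then miss the required pixel dimensions.
function exportMultiplier(editor: Size, target: Size): number {
  // Cross-multiplication compares aspect ratios without floating-point division.
  if (editor.width * target.height !== editor.height * target.width) {
    throw new Error("editor canvas and export target have different aspect ratios");
  }
  return target.width / editor.width;
}

console.log(exportMultiplier({ width: 430, height: 932 }, TARGETS.iphone67)); // 3
console.log(exportMultiplier({ width: 1024, height: 1366 }, TARGETS.ipad129)); // 2
```

Failing loudly on an aspect mismatch matters here because an upload at the wrong pixel size is rejected only at the very end of the release process.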
Client-side-only sounds limiting until you notice that nothing in this problem needs a server. Rendering? The browser has a GPU. Translation? Caption tables export as CSV/JSON — translate them in a spreadsheet or paste them into any AI chat, then re-import. The one feature a backend would have enabled (server-side API translation) I eventually removed entirely; a copy-paste prompt does the same job with zero keys.
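The caption round trip can be sketched as a small CSV serializer and parser. The table shape (one row per slide, one column per locale), the `slide` header name, and the sample captions are assumptions made for illustration; the tool's real export format may differ.

```typescript
// Hypothetical shape: one row per slide, one caption column per locale.
type CaptionTable = { locales: string[]; rows: { slide: string; captions: string[] }[] };

// Quote a field only when it contains a comma, quote, or line break.
function quoteField(value: string): string {
  return /[",\n\r]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

function toCsv(table: CaptionTable): string {
  const header = ["slide", ...table.locales].map(quoteField).join(",");
  const lines = table.rows.map((row) => [row.slide, ...row.captions].map(quoteField).join(","));
  return [header, ...lines].join("\n");
}

// Minimal CSV reader: handles quoted fields, doubled quotes, embedded newlines.
function parseCsv(text: string): string[][] {
  const records: string[][] = [];
  let record: string[] = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (inQuotes) {
      if (ch === '"' && text.charAt(i + 1) === '"') {
        field += '"'; // doubled quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      record.push(field);
      field = "";
    } else if (ch === "\n") {
      record.push(field);
      records.push(record);
      record = [];
      field = "";
    } else if (ch !== "\r") {
      field += ch;
    }
  }
  // Flush the last record unless the text ended with a newline.
  if (field !== "" || record.length > 0) {
    record.push(field);
    records.push(record);
  }
  return records;
}

function fromCsv(text: string): CaptionTable {
  const [header, ...body] = parseCsv(text);
  if (!header || header[0] !== "slide") {
    throw new Error('expected a header row starting with "slide"');
  }
  const locales = header.slice(1);
  const rows = body.map((cells) => ({
    slide: cells[0] ?? "",
    // Pad missing cells so every row has one caption per locale.
    captions: locales.map((_, i) => cells[i + 1] ?? ""),
  }));
  return { locales, rows };
}

// Made-up sample data.
const sampleTable: CaptionTable = {
  locales: ["en", "ko"],
  rows: [
    { slide: "01-home", captions: ["Plan your day, fast", "하루를 빠르게 계획하세요"] },
    { slide: "02-stats", captions: ['See "streaks" at a glance', "연속 기록을 한눈에"] },
  ],
};
const sampleCsv = toCsv(sampleTable);
console.log(JSON.stringify(fromCsv(sampleCsv)) === JSON.stringify(sampleTable)); // true
```

Because the table is plain text, a translator, a spreadsheet, or an AI chat can edit it without knowing anything about the editor.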
Decision two: files are the interface
The second decision came from watching how I actually work now: half my development happens through an AI coding agent. If the agent can't drive a tool, the tool is a bottleneck. But I didn't want to build an MCP server or a plugin API — those rot. What doesn't rot is files:
- A documented import manifest (JSON) describes a whole project — slides, layouts, text-block counts.
- Captions travel as a CSV/JSON table of slide × language.
- Screenshots route by filename convention: 01-home.ko.png → slide 1, Korean.
- A headless pipeline renders the folder to final PNGs with no browser open at all.
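The filename convention in the list above can be sketched as a small parser. Only the `01-home.ko.png` example comes from the article. The exact pattern, the treatment of files without a locale suffix, the support for region subtags such as `pt-BR`, and the names `RoutedScreenshot` and `routeScreenshot` are assumptions for illustration.

```typescript
interface RoutedScreenshot {
  slide: number; // 1-based slide number from the numeric prefix
  label: string; // free-form name between the prefix and the locale
  locale: string | null; // null when the filename has no locale suffix
}

// Assumed pattern: <digits>-<label>[.<locale>].png, e.g. "01-home.ko.png".
const SCREENSHOT_PATTERN = /^(\d+)-([^.]+)(?:\.([A-Za-z]{2,3}(?:-[A-Za-z0-9]+)*))?\.png$/i;

// Returns the routing info for a filename, or null when it does not match.
function routeScreenshot(filename: string): RoutedScreenshot | null {
  const match = SCREENSHOT_PATTERN.exec(filename);
  if (!match) return null;
  const [, digits = "", label = "", locale] = match;
  return {
    slide: parseInt(digits, 10),
    label,
    locale: locale ?? null,
  };
}

console.log(routeScreenshot("01-home.ko.png")); // slide 1, label "home", locale "ko"
console.log(routeScreenshot("notes.txt")); // null
```

Returning `null` for non-matching names lets an importer skip stray files in the folder instead of failing on them.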
So an AI agent can draft the entire screenshot project — copy, layout choices, file organization — as plain files, render it, look at the PNGs, and iterate. A human picks the folder once in the import dialog, or never. The agent loop closed the day I watched a release's screenshot set go from brief to rendered without me opening the editor.
What building it taught me
- The 80% feature set is small. Frames, headlines, captions-as-data, exact-size export. Everything else — magnifier loupes, 2-page panoramas, span groups, per-locale screenshot overrides — earned its way in by an actual release needing it.
- Localization is a data problem wearing a design costume. Once captions became a table instead of text baked into artboards, nine languages stopped being nine times the work.
- Constraints compound pleasantly. No backend forced file-based workflows; file-based workflows made the tool agent-drivable for free; agent-drivability is now the feature I'd least want to give up.
The source, the import manifest spec, and the headless pipeline are all in the GitHub repository. If you ship localized iOS apps and this scratches your itch too, an issue or a star is the best way to say so.