A keynote on building

Designing with Intention

When generation is free, intention is the only thing left to optimise for.

The failure

“Trash. Slop.”

  • It’s never been easier to build and ship — Replit, Claude, Cursor and v0 hand you the keys.
  • So I did. I put AutoScout on Reddit. Car people were blunt — eight features, none past 80% reliable.
  • Shipping was right. Moving to the next feature before I understood the problem was not.

The most valuable work is often the work nobody sees.

autoscout.fyi/cars/audi-rs3

The scale of the problem

It isn’t just me · the whole industry
Chart · feature usage: Frequent 12% · Moderate 8% · Rare 56% · Never 24% (20% used · 80% rarely or never used)

80% rarely or never used. My eight half-built features weren’t the exception — they were the rule. Source: Pendo, 2019 Feature Adoption Report — usage measured across 615 products.

The turn

When generation becomes cheap, craft and judgement become the only real moat.

So what’s the point of building fast if it solves a problem for no one?

The slot machine in your IDE

Dopamine peaks in anticipation, not consumption.

Vibe-coding feels productive, but the rush comes from the next prompt, not the shipped product. Here’s why the loop is so hard to leave.

01 · The pull

Every prompt is a lever pull

The hit is in walking to the freezer, not the ice cream — in the next prompt, not the shipped product. Each prompt is mechanically a slot lever, and intermittent good output keeps you pulling.

02 · The illusion

Output feels like progress

Every prompt fires back something that looks like work, so you feel productive even when nothing real has shipped. The stream of output is the reward, and it keeps you in the loop.

Cal Newport · Slow Productivity

Traditional productivity is output over time.

AI inflates the output, not the judgement. A manager can’t tell judgement applied from judgement skipped — both produce the same-shaped artifact, so pseudo-productivity wins even harder.

Newport’s answer: do fewer things · work at a natural pace · obsess over quality.

Diagram · the manager's view: Option 1 · you judge (human) and Option 2 · use AI (AI decides) both produce the same-shaped artifact.
Counters · lines of code per week ↑ · human judgement per week

Faster every second. The judgement is the part that never speeds up.

The pre-flight checklist

When generation is free, intention is the only thing left to optimise for.

Pilots run the pre-flight checklist. Not to fly slowly. To reach the destination they intended, in a plane that works.

15 minutes of clarity saves 15 hours of rework.

The design process is dead.

Jenny Wen, Anthropic · Julie Zhuo

We lead with prototypes, not mocks and docs. Hold many ideas loosely, stay anchored on the problem, not your first solution, and prune toward the one that works.

The old way
Idea → Mocks → PRD → Code → Launch → Surprise

The new way · a loop
Prototype → Test → Learn → Prune ↻ repeat

The prune test

Capability

Can today’s tech even do this?

Accuracy

Will it consistently meet expectations?

Speed

Fast enough for real use?

Kill the idea if any answer is no.
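The prune test reduces to one rule: a single "no" kills the idea. A minimal TypeScript sketch of that rule follows; `PruneAnswers` and `pruneTest` are illustrative names, not something from the talk.

```typescript
// The three prune-test questions, answered yes (true) or no (false).
interface PruneAnswers {
  capability: boolean // can today's tech even do this?
  accuracy: boolean   // will it consistently meet expectations?
  speed: boolean      // fast enough for real use?
}

// Returns the questions that failed; an empty list means the idea survives.
function pruneTest(answers: PruneAnswers): string[] {
  return (Object.keys(answers) as (keyof PruneAnswers)[]).filter(
    (question) => !answers[question],
  )
}

const failed = pruneTest({ capability: true, accuracy: false, speed: true })
const verdict =
  failed.length === 0 ? "keep" : `kill (failed: ${failed.join(", ")})`
console.log(verdict) // kill (failed: accuracy)
```

Returning the failed questions, rather than a bare boolean, keeps the reason for pruning on record.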

Karri Saarinen · Co-founder, Linear

Craft isn’t a choice. It’s about being intentional about it.

Like the sushi chef refining one knife for a decade, or the maker who knows wood. The skill is not producing. It is judging, reinterpreting, challenging what the tool gives back.

Intuition is compressed experience. AI simulates output. It cannot simulate the years that tell you this output is wrong even when it looks right.

Every ring · a year of judgement

Dogfooding · traintimesuk

Live the problem, every single day.

I built traintimesuk for my own commute. Wrong platforms, departures that wouldn’t load, API errors. I hit every one myself, day after day, until I knew the product in my bones.

Getting comfortable living the problem is how you build with intention.

traintimesuk.co.uk/station/PAD
traintimesuk — Paddington live departures

One person. A small, sharp stack.

Under the hood

Frontend

React + Vite

TypeScript, Tailwind and shadcn/ui. TanStack Query for live data, React Router for the 2,600 pages.

Backend

Supabase

Postgres, edge functions and pg-cron. Caches the boards, runs the warmers, tracks API quota.

Hosting

Vercel

Static build plus one serverless proxy that fronts the edge function and keeps keys server-side.

Insight

PostHog + Playwright

Every fetch logs latency, cache tier and errors. Playwright guards the regressions.

No team, no microservices. A Vite app on Vercel, a Supabase backend, and the whole product hanging on two train-data APIs.
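The Insight card says every fetch logs latency, cache tier and errors. A minimal sketch of a wrapper that records those three things might look like this. The names (`FetchMetric`, `measured`, the tier labels) are assumptions for illustration, and the sketch is synchronous for brevity, whereas a real board fetch would be async.

```typescript
// Hypothetical tier labels, mirroring the sources named in the talk.
type CacheTier = "static" | "supabase" | "darwin" | "rtt"

interface FetchMetric {
  station: string
  tier: CacheTier
  latencyMs: number
  error: string | null
}

// Runs one board load and records what happened, on success or failure.
// `log` stands in for whatever ships the metric (for example an analytics call).
function measured<T>(
  station: string,
  tier: CacheTier,
  load: () => T,
  log: (metric: FetchMetric) => void,
  now: () => number = Date.now,
): T {
  const started = now()
  try {
    const result = load()
    log({ station, tier, latencyMs: now() - started, error: null })
    return result
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err)
    log({ station, tier, latencyMs: now() - started, error: message })
    throw err // the caller still decides how to fall back
  }
}
```

Logging in both branches is the point: the failures are the rows that reveal the slow path.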

The whole product hangs on two APIs.

The API story

The free one · always on

Darwin (National Rail)

Times, status, destinations and most platforms. Sanctioned, effectively unmetered, ~1.7s. This is the board.

The rich one · rationed

RealTimeTrains

Exact platforms, calling points, live movement. But rate-limited to 9,000 calls a day, and slow: one call per train, 6–19s for a busy board.

Cache 0.6s · Darwin 1.7s · RTT enrich 6–19s (behind a quota circuit-breaker)

The intention wasn’t piling on features. It was launching something stable first — a board that always loads beats a richer one that stalls.

I shipped fewer features.
The product got real.

The train app · proof
Chart · March to May: unique visitors vs. errors & rage clicks

Instead of eight half-built features at 80%, I went deep on a few. Removed the bugs. Focused on craft.

Final week of May

3 errors · 0 rage clicks

Two lines moving in opposite directions. That is what shipping a real product looks like.

The bugs were real. I had the data.

traintimesuk · PostHog

1.6% of live departure fetches errored over 30 days, mostly timeouts and “all data sources unavailable” on the slow live path.

Cached board · 0.6s
Peak error week · 4.1%
p95 fetch latency · 19s
Worst single fetch · 92s

The live data path (RealTimeTrains enrichment) was the tail: cache 0.6s, Darwin 1.7s, RTT 6–19s. PostHog, departures_fetch_metrics, last 30 days.
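Numbers like the error rate and p95 latency above can be derived from raw fetch records. The sketch below shows one way to do it, using the nearest-rank percentile definition. The `FetchSample` shape is an assumption; the real `departures_fetch_metrics` schema is not shown in the talk.

```typescript
interface FetchSample {
  latencyMs: number
  errored: boolean
}

// Share of fetches that errored, as a percentage.
function errorRate(samples: FetchSample[]): number {
  if (samples.length === 0) return 0
  const errors = samples.filter((s) => s.errored).length
  return (errors * 100) / samples.length
}

// Nearest-rank percentile: the smallest latency such that
// at least p% of samples are at or below it.
function percentile(samples: FetchSample[], p: number): number {
  if (samples.length === 0) return 0
  const sorted = samples.map((s) => s.latencyMs).sort((a, b) => a - b)
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100))
  return sorted[rank - 1]
}
```

A p95 sits far above the median whenever a slow tail exists, which is why the talk tracks it separately from the cached-board time.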

Then I shipped fixes, one PR at a time.

11 pull requests

#1 · Stop the burn: Stopped cron jobs spending the RealTimeTrains API quota; reserved it for live enrichment.
#10·11 · Platforms: Backfilled missing ‘TBA’ platforms after load, refresh and ‘Show more’.
#2 · Show more: ‘Show more’ now loads later trains via the RTT fallback at busy stations.
#3–8 · Indexable: SEO: pre-rendered 2,599 station pages, upgraded schema, added pillar + FAQ pages.
#9 · Health + mobile: Rebuilt the station-health charts and fixed the mobile search header.
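The platform backfill in PRs #10·11 can be pictured as a merge: keep whatever the fast source returned, and fill only the gaps from the slower one. This is a sketch under assumptions, not the actual PR code. The `Service` shape and the idea of keying by a shared service `id` are hypothetical.

```typescript
interface Service {
  id: string               // hypothetical identifier shared by both sources
  destination: string
  platform: string | null  // null is what the board renders as 'TBA'
}

// Fills missing platforms from an enrichment lookup keyed by service id.
// A platform the board already has is never overwritten.
function backfillPlatforms(
  board: Service[],
  enriched: Record<string, string>,
): Service[] {
  return board.map((service) =>
    service.platform === null && enriched[service.id] !== undefined
      ? { ...service, platform: enriched[service.id] }
      : service,
  )
}
```

Because the function is pure, the same call can run after the first load, after a refresh, and after ‘Show more’.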

The errors fell.

Weekly fetch-error rate
Chart · Apr 12 to Jun 7: peak 4.1%, low 0.6%. Annotation: platform + warming fixes merged · May 26.

Peak 4.1% the week of May 10. Two weeks after the platform and warming fixes merged, it hit 0.6%. The craft is ongoing — it now hovers near 2%, and the next target is the latency tail. PostHog.

traintimesuk · reliability

The guardrails that held the line.

Two changes did most of the work: a circuit-breaker on the rationed RealTimeTrains quota, and pre-warmed caches so the board loads before anyone asks.

Errors 4.1% → 0.6%. Uptime climbed and held.

guardrails.ts
// keep the live board honest
// slide sketch: rtt, darwin, cron, warmHotBoards and both caches live elsewhere in the app
const RTT_BUDGET = 9000                  // calls/day · hard cap

cron("*/5 * * * *", warmHotBoards)       // pre-warm top stations

function loadBoard() {
  const cache = staticCache ?? supabaseCache

  if (rtt.used >= RTT_BUDGET)            // circuit-breaker
    return darwin() ?? cache             // fall back, never blank

  return cache                           // 4-layer cascade
      ?? darwin()
      ?? rtt()                           // ~95% never hit live
}

Build intuition, then build with intention.

The four moves AI can’t make for you

01 · Dogfood

Use your own product daily and live every flaw. Watch PostHog session replays. The reps compound into instinct.

02 · Research

Scrape Reddit, X and LinkedIn for real complaints. Go deeper with NotebookLM. Talk to real customers, not just friends.

03 · Craft

Use the psychology: Hick’s Law, social proof. Julie Zhuo: the eye over the hand. Saarinen: craft is intentional.

04 · Preview

Ship in preview like Anthropic. Set expectations and learn in public.

AI cannot decide what to build. That is your job.

Answer five questions on paper first.

The Intention Brief · 15 minutes

01 · Problem
What am I solving? One sentence a stranger could understand.
e.g. Show the next trains and the right platform for the stations I actually use.

02 · Person
Who exactly is it for? Name five real people, not adjectives.
e.g. My mum in Essex, two weekend commuters, a colleague who travels for work.

03 · Scope
What does success look like at the smallest scope?
e.g. One station, the next three departures, the correct platform, refreshed live.

04 · Quality
What does ‘good’ look like for this?
e.g. Loads in under a second and never shows the wrong platform.

05 · Human
What part can only a human decide?
e.g. Which stations matter, and what ‘reliable enough’ feels like to an anxious commuter.

Alī ibn al-Ḥusayn

At night he carried sacks of flour through Madinah to the widows and the poor, his face covered. For years. Nobody knew it was him.

When he died, they found dark marks across his back from the sacks. That night, over a hundred households found no food had come. That is when the city learned what he had been doing all along.

Thank you

Build with intention.
Find me after.

The deck lives online

Scan to keep the slides.

traintimesuk.co.uk/designing-with-intention-build/deck

Further reading

Appendix · a curated few

Karri Saarinen · Why is Quality So Rare? (Linear) · craft is the moat
Julie Zhuo · The Death of Product Development · taste and the process
Jenny Wen · Don’t Trust the Process (Anthropic) · prototype-first
Cal Newport · Slow Productivity · pseudo-productivity
Anna Lembke · Dopamine Nation · the dopamine loop
Cory Doctorow · LLMs are slot machines · the trap

The hadith of intention (Bukhari 1) anchors the close. The rest? Find me after.

A special time to build

It has never been this easy to build.

Replit · Claude · ChatGPT · Cursor · v0

Three biases keep you in the casino

Appendix · why you can’t zoom out

01 · Anchoring

The reference point

Your first idea becomes the bar. You ask ‘better than mine?’ and stop asking ‘is mine even right?’

02 · Confirmation

Weighted evidence

The signal that supports your idea is remembered. The one that contradicts it gets explained away.

03 · Sunk cost

Harder to leave

Every prompt and prototype you commit makes the idea harder to abandon.

The feedback was brutal — and mostly fair.

AutoScout · r/CarTalkUK

TorqueTinkerer_91 · ’16 Peugeot 208

“The MOT doesn’t test for this. Half the cars you list don’t have the engines you claim. Stop lying to promote your bullshit AI app.”

ChamferedHead

“MOT pass rate is an incredibly poor indicator of reliability — it says more about the people who own the cars than the cars themselves.”

ProperBrim-08 · Seat Ibiza FR

“ChatGPT analysed 33 million MOT tests, you mean.”

BoltGremlin_42

“Claude’s done a good job on this one.” — it was not a compliment.

Posted to r/CarTalkUK the week we shipped. Comments unedited.

1,000+ users in a day. Then I listened.

Building in public

I posted on Reddit. Over a thousand people pulled the site apart in 24 hours. Every comment and DM got triaged into the fixes that shaped the product.

Shipping to Reddit

Banned, called “vibecoded”

Banned from several car subs for looking AI-built. The criticism was often right — nothing surfaced bugs faster.

Data is the product

CSVs → DVSA API

Built v1 on static CSV dumps. When DVSA API access came through, I rewrote the whole reliability engine.

Feature discipline

Rebuilt twice

Shipped a servicing tab in a day. Wrong from the start. Shipping fast only works if you’ll throw v1 away.

Toward self-healing

Claude in the loop

Today a user finds a bug, I route it to Claude, the fix lands. Next: Claude finds the issues itself.

Here’s the stack.

traintimesuk.co.uk

GitHub: Version control · Actions CI/CD
React + Vite: TypeScript SPA · static route generation
Tailwind + shadcn/ui: Responsive components · dark interface
Darwin OpenLDBWS: Fast live-board source · current services
Realtime Trains: Platform enrichment · future dates & fallbacks
Service worker: Offline shell · asset cache · network bypasses
Vercel: Deployment · clean URLs · serverless proxy
Supabase: Postgres · edge functions · pg-cron
Static SEO pipeline: 2,600+ pages · sitemap · JSON-LD
Claude + Codex: Pair-programming · reviews · production fixes
PostHog EU: Fetch telemetry · session replay · product signals
Playwright: E2E tests · desktop and mobile profiles
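As a rough illustration of the static SEO pipeline, the sketch below emits JSON-LD and a sitemap entry for one station page. It assumes the `/station/<code>` URL pattern seen earlier in the deck, an `https` scheme, and the schema.org `TrainStation` type. The function names and the `Station` shape are hypothetical, and the real pipeline may emit richer markup.

```typescript
interface Station {
  code: string // station code used in the URL, e.g. "PAD"
  name: string
}

const SITE = "https://traintimesuk.co.uk" // scheme is an assumption

const stationUrl = (station: Station): string =>
  `${SITE}/station/${station.code}`

// JSON-LD for one pre-rendered station page.
function stationJsonLd(station: Station): string {
  return JSON.stringify({
    "@context": "https://schema.org",
    "@type": "TrainStation",
    name: station.name,
    url: stationUrl(station),
  })
}

// One <url> entry per station for sitemap.xml.
function sitemapEntries(stations: Station[]): string {
  return stations
    .map((station) => `<url><loc>${stationUrl(station)}</loc></url>`)
    .join("\n")
}
```

Generating both outputs from the same station list keeps the sitemap and the structured data from drifting apart.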

The model narrates. The data asserts.

How I got quality

I never let the model recall facts from memory. Before it writes a word it gets eight feeds of verified, source-tagged data — real prices, official spec, known faults, recalls, MOT history, market context.

Verified price · Official spec · Insurance & tax · Known faults · DVSA recalls · MOT pass rates · Manufacturer DNA · Market context

The craft isn’t the prompt. It’s the data you refuse to let it guess.
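One way to picture "the model narrates, the data asserts" is a prompt builder that refuses to run without verified facts. This is a sketch under assumptions, not the actual AutoScout prompt. The `Feed` shape, the wording of the instructions and the choice to throw on an empty feed are all illustrative.

```typescript
interface Feed {
  name: string    // e.g. "Verified price"
  source: string  // tag saying where the facts came from
  facts: string[]
}

// Builds a prompt that carries every fact with its source tag and
// tells the model to narrate only from those facts.
function anchoredPrompt(car: string, feeds: Feed[]): string {
  const empty = feeds.filter((feed) => feed.facts.length === 0)
  if (empty.length > 0) {
    // Fail loudly rather than let the model fill the gap from memory.
    const names = empty.map((feed) => feed.name).join(", ")
    throw new Error(`No verified data for: ${names}`)
  }
  const blocks = feeds.map(
    (feed) =>
      `[${feed.name} | source: ${feed.source}]\n` +
      feed.facts.map((fact) => `- ${fact}`).join("\n"),
  )
  return [
    `Write a buyer's summary for the ${car}.`,
    "Use only the facts below. If something is not listed, say it is unknown.",
    ...blocks,
  ].join("\n\n")
}
```

A feed with nothing to report would carry an explicit fact such as "no recalls on record", so an empty list always signals a pipeline problem.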

Same prompt. Three frontier models.

Choosing for quality

I run the same anchored prompt across three models on every car — measuring latency, UK-specificity, and whether they stay inside the data I gave them.

Gemini 3 Flash

Sharpest

Best UK-specific trim framing. Cites the Takata recall by name. Stays inside the price anchor.

Grok-4-fast · shipped

The pick

A near-tie with Gemini on quality, at a fraction of the cost per call. Clean JSON first try, no repair step.

Gemini-2.5-flash-lite

Fallback

Fastest end-to-end, but stays generic and skips per-trim detail. Good for long-tail cars.

Anchored prompts matter more than model choice. Fast-and-grounded beats slow-and-clever.
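Two of the checks described above can be automated: whether the output parses as JSON on the first try, and whether it stays inside the price anchor. The sketch below is an assumption about how such a check could look. The `price` field and the `PriceAnchor` shape are hypothetical.

```typescript
interface PriceAnchor {
  min: number
  max: number
}

interface Verdict {
  validJson: boolean    // parsed first try, no repair step
  insideAnchor: boolean // quoted price stays within the verified range
}

// Checks one raw model response against the anchor it was given.
function checkOutput(raw: string, anchor: PriceAnchor): Verdict {
  let parsed: unknown
  try {
    parsed = JSON.parse(raw)
  } catch {
    return { validJson: false, insideAnchor: false }
  }
  const price = (parsed as { price?: unknown } | null)?.price
  const insideAnchor =
    typeof price === "number" && price >= anchor.min && price <= anchor.max
  return { validJson: true, insideAnchor }
}
```

Running the same check over every model's response turns "stays inside the data" from an impression into a count.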