TrustYourWebsite: multi-locale GDPR scanner for EU/UK SMBs

Hey all 👋

I’ve been building TrustYourWebsite (https://trustyourwebsite.com), a GDPR and cookie compliance scanner for EU and UK small businesses. After ~8 weeks of solo work it’s in production, and a few of the engineering decisions might be useful for other Vercel-hosted SaaS folks, so sharing here.

What it does

You paste a URL, it returns a risk score (0 to 100) plus a list of GDPR, cookie, and accessibility issues in about 60 seconds. Free tier shows the score and issue counts (details blurred). €2.50 unlocks the full report (150+ checks, downloadable PDF). €9.99 runs a multi-page premium scan with image copyright detection and AI-powered policy review.

The stack

  • Next.js 16 (App Router, SSG-first) on Vercel

  • Supabase for auth and database

  • Stripe Checkout for payments (no subscriptions, one-time scans only)

  • Puppeteer scanner running off a Mac Mini

  • Local Qwen 3.5 9B for customer-facing report copy. Claude for our own marketing content.

Engineering bits that might be interesting

  1. 12-locale routing with full hreflang and sitemap agreement. The hard part wasn’t the routing, it was making sure the page-head canonical and the sitemap entry never disagree. There’s a regression test that cross-checks them on every build, so a guide that quietly stops emitting in one language fails CI.

  2. CSP under SSG is harder than under SSR. We kept 'unsafe-inline' on script-src because Next.js cannot attach nonces to its SSG-emitted RSC streaming scripts without forcing the route tree dynamic and losing the edge cache. Under CSP3, a nonce or hash in script-src makes browsers ignore 'unsafe-inline', so it’s one mode or the other. We almost shipped 'strict-dynamic', which makes browsers ignore 'self' and host allowlists. With no nonce on the SSG-emitted scripts, that would have blocked every framework chunk. Wrote that one up as an incident postmortem.

  3. The scanner runs outside Vercel. Vercel Functions are great for the marketing site and the API, but a fleet of 4 to 6 concurrent Puppeteer instances plus a local LLM does not fit serverless. A Mac Mini with 24 GB RAM handles it, and results come back to Vercel through a signed webhook.

  4. Code-first, LLM-only-where-necessary. Roughly 90% of the scanning is deterministic TypeScript (DOM queries, regex, axe-core, HTTP checks). The LLM only generates the human-readable explanation per finding. Way cheaper, faster, and more predictable than LLM-everything.
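
For anyone wanting to try the canonical/sitemap cross-check from item 1, here is a minimal sketch of the idea. It is not the actual regression test: the RenderedPage and SitemapEntry shapes, the route key, and the findMismatches and normalizeUrl names are all made up for illustration. It assumes an earlier build step has already collected each page's canonical URL and each sitemap entry.

```typescript
// Hypothetical shapes for illustration; not taken from the real codebase.
interface RenderedPage {
  route: string;      // locale-prefixed route, e.g. "/de/guides/cookies"
  canonical: string;  // href of <link rel="canonical"> in the built HTML
}

interface SitemapEntry {
  route: string;
  loc: string;        // <loc> value emitted in sitemap.xml
}

// Drop fragments and trailing slashes so cosmetic differences are not flagged.
function normalizeUrl(raw: string): string {
  return raw.replace(/#.*$/, "").replace(/\/+$/, "");
}

// Returns one human-readable problem per route where head and sitemap disagree.
function findMismatches(pages: RenderedPage[], sitemap: SitemapEntry[]): string[] {
  const sitemapByRoute = new Map<string, string>();
  for (const entry of sitemap) {
    sitemapByRoute.set(entry.route, normalizeUrl(entry.loc));
  }

  const problems: string[] = [];
  const renderedRoutes = new Set<string>();

  for (const page of pages) {
    renderedRoutes.add(page.route);
    const loc = sitemapByRoute.get(page.route);
    if (loc === undefined) {
      problems.push(`${page.route}: rendered but missing from sitemap`);
    } else if (loc !== normalizeUrl(page.canonical)) {
      problems.push(`${page.route}: canonical and sitemap disagree`);
    }
  }

  // Catch the "quietly stopped emitting in one language" case.
  sitemapByRoute.forEach((_loc, route) => {
    if (!renderedRoutes.has(route)) {
      problems.push(`${route}: in sitemap but page was not emitted`);
    }
  });

  return problems;
}
```

In CI, a non-empty result would fail the build. Checking both directions is what catches a page that disappears in a single locale, not only a URL mismatch.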

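On the signed webhook from item 3: the post above doesn't say which signing scheme is used, so treat this as one common way to do it, not a description of the real setup. The sketch assumes HMAC-SHA256 over the raw request body with a shared secret, hex-encoded. The signPayload and verifySignature names are hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: HMAC-SHA256 over the raw body, hex-encoded.
// The scanner would call this before POSTing results.
function signPayload(rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

// The receiving API route would call this before trusting the body.
function verifySignature(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = Buffer.from(signPayload(rawBody, secret), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (received.length !== expected.length) return false;
  // Constant-time comparison avoids leaking how many bytes matched.
  return timingSafeEqual(expected, received);
}
```

The signature has to be computed over the exact bytes received, so the handler should read the raw body before any JSON parsing.
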
Happy to answer questions about any of the above. You can play with it at https://trustyourwebsite.com, just paste a URL.