feat: production hardening + smart subpage scanning with layout dedup

Security:
- Add CRON_SECRET auth to /api/cron/* endpoints
- Add admin role verification to /api/admin/* routes
- Add org membership check to /api/billing/usage
- Add security headers (HSTS, X-Frame-Options, CSP, etc.)
- Add env variable validation at startup
- Add rate limiting to backend API (30 req/min per IP)

Infrastructure:
- Multi-stage Dockerfiles with non-root user + healthchecks
- Updated cron workflow to pass CRON_SECRET header
- Updated .env.example with all optional vars

Smart subpage scanning:
- Crawler now computes template_hash (DOM structure without content)
- Scanner scans ALL unique-layout pages, not just main page
- Pages with same layout (e.g. product pages) scanned only once
- Deduplication by template_hash, fallback to content_hash
- Main page always scanned with high priority
- Re-checks subscription limits before each page scan

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit is contained in:
Dennis
2026-03-06 07:44:32 +01:00
parent d8de0a973a
commit 1c545c93b4
18 changed files with 498 additions and 59 deletions
@@ -2,8 +2,12 @@ import { NextResponse } from "next/server";
import { scanScheduler } from "@/services/scanScheduler";
import { lighthouseScanner } from "@/services/lighthouseScanner";
import { logError } from "@/utils/errorUtils";
import { verifyCronSecret } from "@/lib/apiAuth";
export async function GET(request: Request) {
const authError = verifyCronSecret(request);
if (authError) return authError;
try {
const url = new URL(request.url);
const mode = url.searchParams.get("mode") || "all"; // "scheduled", "change_detection", "all"