feat: production hardening + smart subpage scanning with layout dedup
Security: - Add CRON_SECRET auth to /api/cron/* endpoints - Add admin role verification to /api/admin/* routes - Add org membership check to /api/billing/usage - Add security headers (HSTS, X-Frame-Options, CSP, etc.) - Add env variable validation at startup - Add rate limiting to backend API (30 req/min per IP) Infrastructure: - Multi-stage Dockerfiles with non-root user + healthchecks - Updated cron workflow to pass CRON_SECRET header - Updated .env.example with all optional vars Smart subpage scanning: - Crawler now computes template_hash (DOM structure without content) - Scanner scans ALL unique-layout pages, not just main page - Pages with same layout (e.g. product pages) scanned only once - Deduplication by template_hash, fallback to content_hash - Main page always scanned with high priority - Re-checks subscription limits before each page scan Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit is contained in:
@@ -203,4 +203,8 @@ CREATE TABLE IF NOT EXISTS alert_configurations (
|
||||
created_at timestamp with time zone DEFAULT now(),
|
||||
updated_at timestamp with time zone DEFAULT now()
|
||||
);
|
||||
|
||||
-- Add template_hash to pages table for layout deduplication
|
||||
ALTER TABLE pages ADD COLUMN IF NOT EXISTS template_hash VARCHAR;
|
||||
CREATE INDEX IF NOT EXISTS idx_pages_template_hash ON pages(template_hash) WHERE template_hash IS NOT NULL;
|
||||
);
|
||||
Reference in New Issue
Block a user