Architecture Overview
System architecture, tech stack, data flow, and deployment overview
VantageDash is a competitive intelligence SaaS that tracks competitor pricing, matches products using AI, and provides market insights. It follows a multi-tenant architecture with strict data isolation.
High-Level Architecture
┌─────────────────────────────┐     ┌──────────────────────────────┐
│  Frontend (Vercel)          │     │  Backend (Coolify)           │
│  Next.js 16 + App Router    │────▶│  FastAPI + Python Scripts    │
│  shadcn/ui v4 + Tailwind v4 │     │  JWT Auth + Background Tasks │
│  Supabase Auth (client)     │     │  Service Role Key            │
└──────────┬──────────────────┘     └──────────┬───────────────────┘
           │                                   │
           │ Supabase JS SDK                   │ Supabase Python SDK
           │ (RLS-scoped reads)                │ (service role for background ops)
           ▼                                   ▼
┌──────────────────────────────────────────────────────────────────┐
│  Supabase (PostgreSQL)                                           │
│  RLS Policies ──▶ get_user_tenant_id() ──▶ tenant_id scoping     │
│  pgvector extension ──▶ HNSW indexes ──▶ semantic product search │
│  15 tables ──▶ all FK'd to tenants(id)                           │
│  Auth trigger ──▶ handle_new_user() auto-provisions tenants      │
└──────────────────────────────────────────────────────────────────┘
Frontend Stack
| Technology | Purpose |
|---|---|
| Next.js 16 | App Router, server/client components, TypeScript strict mode |
| shadcn/ui v4 | Component library (uses @base-ui/react, NOT Radix — render prop, not asChild) |
| Tailwind v4 | CSS-first config, no tailwind.config.js |
| Supabase JS | Auth + database queries from server/client components |
| next-themes | Light/dark mode toggle |
| lucide-react | Icons |
| recharts | Charts (analytics, price history) |
| sonner | Toast notifications |
| jspdf + html2canvas | PDF/CSV export |
Frontend Directory Layout
frontend/src/
├── app/(auth)/ # Login, signup, callback pages
├── app/(dashboard)/ # All 9 dashboard pages (sidebar + header layout)
│ ├── overview/ # Main dashboard with metrics
│ ├── competitors/ # Add/delete competitors
│ ├── products/ # Brand products from Shopify
│ ├── comparison/ # Side-by-side product matches
│ ├── price-history/ # Price trend charts per competitor
│ ├── analytics/ # Recharts analytics dashboard
│ ├── alerts/ # Alert config + history
│ ├── logs/ # Scrape/sync/matching session logs
│ ├── settings/ # General + Matching Profile tabs
│ ├── products/[id]/ # Product detail page
│ └── competitors/[id]/ # Competitor report page
├── components/ui/ # shadcn components (auto-generated)
├── components/dashboard/ # Custom: MetricCard, StatusBadge, etc.
├── components/layout/ # Sidebar, header, mobile nav
├── lib/supabase/ # Client + server helpers
├── lib/api/ # backend.ts — typed FastAPI client
├── lib/utils/ # config.ts, price.ts, export.ts
└── lib/types/ # database.ts (TypeScript interfaces)
Key Frontend Patterns
- Server components by default: "use client" only for forms, interactivity, hooks
- Data fetching: Server components use createClient() from @/lib/supabase/server; client components use @/lib/supabase/client
- Every page has: loading state, empty state, error handling (try-catch with error UI cards + retry buttons)
- Multi-tenant: All Supabase queries automatically scoped by RLS
- Backend calls: lib/api/backend.ts provides typed helpers (startScrape(), startMatch(), getProfile(), etc.)
Backend Stack
| Technology | Purpose |
|---|---|
| FastAPI | Async API framework with JWT auth |
| Pydantic | Settings validation, request/response models |
| supabase-py | Database access (per-request client with user JWT, service client for background) |
| RapidFuzz | Fuzzy string matching |
| OpenAI | GPT-4o-mini for AI matching, text-embedding-3-small for vector embeddings |
| BeautifulSoup | HTML parsing for product extraction |
| cryptography | Fernet encryption for tenant credentials |
| Stripe | Subscription billing (Checkout, Portal, webhooks) |
Backend Directory Layout
backend/
├── app/
│ ├── config.py # Pydantic BaseSettings (env vars, get_secret())
│ ├── dependencies.py # Auth: get_current_user, get_supabase_client, get_tenant_id
│ ├── models.py # Request/response Pydantic schemas
│ ├── crypto.py # Fernet encryption for tenant secrets
│ ├── middleware/
│ │ ├── audit.py # Request ID + structured logging (AU-2, AU-3)
│ │ ├── rate_limit.py # Token bucket per IP (SC-5)
│ │ └── security_headers.py # CSP, HSTS, etc. (SC-8)
│ ├── routers/
│ │ ├── health.py # GET /api/health
│ │ ├── scrape.py # POST /api/scrape + status polling
│ │ ├── sync.py # POST /api/sync + status polling
│ │ ├── match.py # POST /api/match (ai/fuzzy/hybrid/embed/validate)
│ │ ├── profile.py # GET/PUT/DELETE /api/profile
│ │ ├── credentials.py # GET/PUT/DELETE /api/tenant/credentials
│ │ └── data_lifecycle.py # GET/DELETE /api/tenant/data
│ └── services/
│ ├── scrape_service.py # Async wrapper for scraper.py
│ ├── sync_service.py # Async wrapper for shopify_sync.py
│ ├── match_service.py # Async wrapper for matching scripts
│ ├── alert_service.py # Price change detection + alerts
│ ├── notification_service.py # Slack + webhook delivery
│ ├── billing_service.py # Stripe subscription + plan limits
│ └── scheduler_service.py # Background scrape scheduling
├── tests/ # ~1,357 pytest tests across 50 files
├── Dockerfile # Python 3.12 slim, non-root user
└── requirements.txt # Production dependencies
Root Python Scripts
These scripts predate the FastAPI backend and run standalone or are imported by backend services:
| Script | Purpose |
|---|---|
| scraper.py | Multi-strategy competitor scraping (Shopify, Uline, WooCommerce, Magento, Wix, Ecwid, Playwright+AI) |
| ai_matcher.py | GPT-4o-mini semantic product matching |
| product_matcher.py | RapidFuzz fuzzy matching |
| shopify_sync.py | Shopify Admin API product sync |
| embedding_service.py | OpenAI text-embedding-3-small for pgvector |
| industry_profile.py | IndustryProfile dataclass + DB loader with caching |
| industry_templates.py | 8 pre-built industry profile templates |
| price_utils.py | PPU (price-per-unit) calculations |
All entry points accept db=None, tenant_id=None for FastAPI injection while remaining backward-compatible for standalone use.
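The optional-argument convention can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: `run_sync`, `_default_db`, `InMemoryDB`, and the `"default-tenant"` fallback are hypothetical names invented for the example.

```python
from typing import Optional


class InMemoryDB:
    """Hypothetical stand-in for a database client; stores rows in a list."""

    def __init__(self, label: str) -> None:
        self.label = label
        self.rows: list[dict] = []

    def insert(self, row: dict) -> None:
        self.rows.append(row)


def _default_db() -> InMemoryDB:
    # Standalone mode: build a client from local configuration (assumed).
    return InMemoryDB(label="standalone")


def run_sync(products: list[str], db: Optional[InMemoryDB] = None,
             tenant_id: Optional[str] = None) -> InMemoryDB:
    """Entry point usable from a script (no arguments) or from FastAPI (injected)."""
    db = db if db is not None else _default_db()
    tenant_id = tenant_id or "default-tenant"  # assumed standalone fallback
    for name in products:
        # Every row carries the tenant explicitly, so writes stay scoped
        # even when the injected client bypasses RLS.
        db.insert({"tenant_id": tenant_id, "name": name})
    return db


# Standalone call, as a script would make it.
standalone = run_sync(["Mylar bag"])
# Injected call, as a backend service would make it.
injected = run_sync(["Glass jar"], db=InMemoryDB("request"), tenant_id="t-123")
```

Defaulting both parameters to None keeps existing script invocations working unchanged while letting the backend pass its own client and tenant.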
New Shared Components (Session 44)
| Component | Purpose |
|---|---|
| confidence-bar.tsx | Reusable confidence percentage bar (extracted from comparison page) |
| competitor-avatar.tsx | Shows competitor favicon via Google's favicon service |
| competitor-link.tsx | Competitor name with avatar linking to /competitors/[id] |
| searchable-product-select.tsx | Product combobox using @base-ui/react/combobox for Edit/Create match dialogs |
Scrape Progress (Session 44)
The RunTaskButton component now shows live scrape progress during scraping operations:
- Polls progress_percent + progress_message from the scrape session
- Displays a progress bar with percentage and descriptive message
- Updates in real-time until the scrape completes or fails
Authentication & Multi-Tenancy
Auth Flow
- User signs up via Supabase Auth (email/password)
- DB trigger handle_new_user() auto-creates a tenants row + user_tenants mapping
- Frontend stores Supabase JWT in cookies
- Backend validates JWT via get_current_user() dependency
- RLS policies use get_user_tenant_id() SQL function to scope all queries
Request Flow (Backend)
Request → CORS → SecurityHeaders → RateLimit → Audit → Router
│
get_current_user()
(validates Supabase JWT)
│
get_supabase_client()
(per-request, RLS-scoped)
│
get_tenant_id()
(reads user_tenants)
│
get_user_role() [optional]
require_role() guard
│
Background Task
(service role client +
explicit tenant_id)
Two Client Patterns
- Per-request client (get_supabase_client): Created with user's JWT token, RLS automatically scopes all queries to the tenant. Used for synchronous reads.
- Service client (get_service_client): Created with service role key, bypasses RLS. Used for background tasks where the user's JWT may expire. Callers must filter by tenant_id explicitly.
Role-Based Access Control
The require_role() dependency factory enforces endpoint-level authorization:
- owner: Full control — manage team, settings, credentials, data lifecycle
- admin: Invite/remove members (except owner), configure settings, run pipelines
- member: View all data, run scrapes and matches, add competitors
- viewer: Read-only — view dashboards, export data
The get_user_role() dependency reads the current user's role from user_tenants (scoped by RLS). Frontend components use GET /api/team/me to conditionally render admin-only UI.
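A dependency factory of this kind can be sketched in plain Python. The sketch assumes the four roles form a strict hierarchy (viewer < member < admin < owner), which the role descriptions suggest but do not state outright; `ROLE_RANK` and the `PermissionError` are illustrative choices, and a FastAPI version would raise an HTTP 403 instead.

```python
from typing import Callable

# Roles ordered from least to most privileged (assumed strict hierarchy).
ROLE_RANK = {"viewer": 0, "member": 1, "admin": 2, "owner": 3}


def require_role(minimum: str) -> Callable[[str], str]:
    """Return a guard that accepts any role at or above `minimum`."""
    if minimum not in ROLE_RANK:
        raise ValueError(f"unknown role: {minimum}")

    def guard(current_role: str) -> str:
        # Unknown roles rank below viewer and are always rejected.
        if ROLE_RANK.get(current_role, -1) < ROLE_RANK[minimum]:
            raise PermissionError(f"{current_role!r} lacks {minimum!r} access")
        return current_role

    return guard


require_admin = require_role("admin")
```

Building the guard once per endpoint keeps the minimum role next to the route definition rather than buried in the handler body.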
Invite Flow
Admin sends invite → team_invitations row created → Supabase invite email sent
│
Recipient clicks link → auth callback → exchangeCodeForSession
│
handle_new_user() trigger fires
(checks team_invitations for pending invite)
│
┌─── Found: joins existing tenant with invited role
└─── Not found: creates new tenant as owner
Auth callback detects invited users (non-owner role) and skips the onboarding wizard.
Resilient Startup
The backend is designed to start even with missing config:
- config.py stores settings_error instead of crashing if env vars are missing
- main.py starts with only the health router if settings fail to load
- Health endpoint reports config_error and env_hint fields for debugging
- All operational routers only mount if settings loaded successfully
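The startup behaviour above can be sketched without any framework. The environment variable names, the `Settings` fields, and the router names below are assumptions made for the example; only the pattern (capture the error, mount health unconditionally, gate everything else) comes from the text.

```python
import os
from dataclasses import dataclass
from typing import Optional

REQUIRED_VARS = ("SUPABASE_URL", "SUPABASE_SERVICE_ROLE_KEY")  # assumed names


@dataclass
class Settings:
    supabase_url: str
    service_role_key: str


def load_settings(env: dict) -> tuple[Optional[Settings], Optional[str]]:
    """Return (settings, None) on success or (None, error message) on failure."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        return None, "missing env vars: " + ", ".join(missing)
    return Settings(env["SUPABASE_URL"], env["SUPABASE_SERVICE_ROLE_KEY"]), None


def routers_to_mount(settings_error: Optional[str]) -> list[str]:
    """Health is always mounted; operational routers need valid settings."""
    routers = ["health"]
    if settings_error is None:
        routers += ["scrape", "sync", "match", "profile"]
    return routers


settings, settings_error = load_settings(dict(os.environ))
mounted = routers_to_mount(settings_error)
```

Returning the error as a value instead of raising lets the health endpoint report it, which is what makes a misconfigured deployment debuggable from the outside.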
Billing & Monetization
Status: Stripe billing integration is live as of 2026-03-17.
VantageDash uses Stripe for subscription management with a 3-tier pricing model:
| Plan | Price | Competitors | Monthly Budgets | Key Features |
|---|---|---|---|---|
| Free | $0/mo | 2 | 5 scrapes, 0 AI, 0 embeddings | Manual scraping, fuzzy matching |
| Pro | $49/mo | 10 | 100 scrapes, 50 AI, 0 embeddings | AI matching, 24h auto-scrape, webhooks |
| Enterprise | $199/mo | Unlimited | Unlimited scrapes/AI, 100 embeddings | Vector embeddings, 1h auto-scrape, priority support |
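The limits in this table can be expressed as a small lookup, sketched below. The numbers are taken from the table; the dictionary layout, the key names, and the `is_allowed` helper are hypothetical and not necessarily how billing_service.py represents them.

```python
from typing import Optional

UNLIMITED = None  # None means "no cap" in this sketch

PLAN_LIMITS: dict[str, dict[str, Optional[int]]] = {
    "free": {"competitors": 2, "scrapes": 5, "ai": 0, "embeddings": 0},
    "pro": {"competitors": 10, "scrapes": 100, "ai": 50, "embeddings": 0},
    "enterprise": {"competitors": UNLIMITED, "scrapes": UNLIMITED,
                   "ai": UNLIMITED, "embeddings": 100},
}


def is_allowed(plan: str, resource: str, used: int) -> bool:
    """True if one more unit of `resource` fits within the plan's budget."""
    limit = PLAN_LIMITS[plan][resource]
    return limit is UNLIMITED or used < limit
```

For example, `is_allowed("free", "competitors", 2)` is False because a Free tenant already at two competitors cannot add a third.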
Billing Flow
User clicks "Upgrade" → POST /api/billing/checkout
│
Stripe Checkout session created
(with tenant_id in metadata)
│
User completes payment on Stripe
│
Stripe sends checkout.session.completed webhook
│
POST /api/billing/webhook
(signature-verified, no JWT)
│
billing_service upserts subscription row
+ logs billing_event
│
Plan limits enforced on next API call
Key implementation details:
- Webhook-driven: All subscription state changes flow through Stripe webhooks, not client-side confirmation
- Feature gating: billing_service.py checks plan limits before allowing competitor additions, AI matching, embedding generation, and auto-scrape scheduling
- Stripe Customer Portal: Users manage payment methods, view invoices, and cancel/switch plans via Stripe's hosted portal
- Database tables: subscriptions (plan state per tenant) and billing_events (webhook audit log), both RLS-enforced
- Free by default: New tenants start on the Free plan with no Stripe interaction required
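Because the webhook endpoint accepts requests without a JWT, the signature check is what authenticates them. The sketch below illustrates Stripe's documented scheme (an HMAC-SHA256 over `<timestamp>.<payload>`, carried in a `Stripe-Signature` header of the form `t=...,v1=...`) using only the standard library. It is an illustration of the scheme, not the project's code; a real backend would more likely call the official library's `stripe.Webhook.construct_event`. The function name and the `now` parameter are assumptions added for testability.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_stripe_signature(payload: bytes, header: str, secret: str,
                            tolerance: int = 300,
                            now: Optional[float] = None) -> bool:
    """Check a Stripe-Signature header of the form 't=<ts>,v1=<hex>[,v1=...]'."""
    parts = [item.split("=", 1) for item in header.split(",") if "=" in item]
    timestamp = next((v for k, v in parts if k == "t"), None)
    candidates = [v for k, v in parts if k == "v1"]
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False
    # The signed message is the timestamp, a dot, and the raw request body.
    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    current = time.time() if now is None else now
    fresh = abs(current - int(timestamp)) <= tolerance  # reject replayed events
    return fresh and any(hmac.compare_digest(expected, c) for c in candidates)
```

The check must run against the raw request bytes; re-serializing parsed JSON would change the payload and break the signature.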
Design Theme
Warm Neon Aero — a warm glass + golden neon aesthetic with light/dark mode support:
Color Palette
| Token | Light Mode | Dark Mode | Purpose |
|---|---|---|---|
| --background | #faf8f5 (warm cream) | #0d0b08 (warm black) | Page background |
| --primary | #b07a00 (deep gold) | #f0a500 (bright gold) | Buttons, active states, links |
| --aero-gold | #f0a500 | #f0a500 | Brand accent color |
| --aero-honey | #ffd166 | #ffd166 | Highlight / glow |
| --aero-amber | #e07c24 | #e07c24 | Secondary accent |
| --aero-bronze | #b8723b | #b8723b | Borders, shadows |
| --aero-ember | #c44d1a | #c44d1a | Warm emphasis |
| --aero-green | #69f0ae | #69f0ae | Success indicators |
| --aero-pink | #ff4081 | #ff4081 | Destructive / alert |
Visual Effects
- Glassmorphism cards: backdrop-filter: blur(20px) saturate(180%) with golden border glow on hover
- Ambient gradient orbs: Warm golden radial gradients fixed to the body background
- Light mode: Cream/ivory base with bronze-tinted glass cards and muted gold accents
- Dark mode: Deep warm black base (#0d0b08) with golden neon edge glow on cards
Typography
| Font | CSS Variable | Usage |
|---|---|---|
| Instrument Sans | --font-heading | Headings, nav links, labels, card titles, buttons |
| Space Grotesk | --font-space | Body text (set on <body>) |
| Geist Mono | --font-geist-mono | Code, SKUs, monospace data |
Instrument Sans replaced Sora in session 44 for a cleaner, more professional feel. Logo gradient darkened to #c07800→#e09520. Landing page buttons use rounded-md (not rounded-lg).
Light/Dark Mode Toggle
Implemented via next-themes (ThemeProvider wraps the app in layout.tsx):
- attribute="class": toggles the .dark class on <html>
- defaultTheme="dark": dark mode by default
- enableSystem: respects OS preference
- Toggle component in the dashboard header
Accent Color Customization
Tenants can set a brand color in Settings that overrides the default gold accent:
- AccentColorProvider reads brandColor from tenant settings
- theme.ts utilities (applyAccentColor, clearAccentColor) set CSS custom properties on <html>
- Overrides --primary, --ring, --sidebar-primary, --chart-1, and the --color-aero-* tokens
- Automatic WCAG contrast calculation for foreground text via getContrastForeground()
- Reverts to default palette when the provider unmounts
New files: components/providers/theme-provider.tsx, components/providers/accent-color-provider.tsx, lib/utils/theme.ts
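The WCAG contrast step can be illustrated with the standard relative-luminance formula. The sketch below is a Python rendering for illustration only: the real helper lives in theme.ts, and both the function names here and the choice between pure black and pure white as candidate foregrounds are assumptions.

```python
def relative_luminance(hex_color: str) -> float:
    """WCAG 2.x relative luminance of an sRGB color given as '#rrggbb'."""
    value = hex_color.lstrip("#")
    channels = [int(value[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma curve for each channel.
    linear = [c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
              for c in channels]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(a: str, b: str) -> float:
    """WCAG contrast ratio, from 1 (identical) to 21 (black on white)."""
    la, lb = relative_luminance(a), relative_luminance(b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


def contrast_foreground(background: str) -> str:
    """Pick black or white text, whichever contrasts more with the background."""
    black, white = "#000000", "#ffffff"
    if contrast_ratio(background, black) >= contrast_ratio(background, white):
        return black
    return white
```

With the default bright gold (#f0a500) as background this picks black text, while the dark-mode base (#0d0b08) gets white.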
Observability (Session 37)
Sentry — Error Tracking + Performance
- Frontend: @sentry/nextjs with client, server, edge configs. instrumentation.ts hooks for Next.js 16.
- Backend: sentry-sdk[fastapi] initialized in main.py before FastAPI creation.
- Traces sample rate: 10% in production, 100% in development.
- Session replay: 10% sessions, 100% on error.
PostHog — Product Analytics
- posthog-js with PostHogProvider in root layout (app/layout.tsx).
- Auto page view tracking via usePathname().
- Custom events: scrape_started, match_started, competitor_added.
- Helper: lib/analytics.ts (trackEvent(), identifyUser()).
Env Vars Required
| Variable | Where | Purpose |
|---|---|---|
| NEXT_PUBLIC_SENTRY_DSN | Vercel | Frontend Sentry |
| SENTRY_AUTH_TOKEN | Vercel (build) | Source map uploads |
| NEXT_PUBLIC_POSTHOG_KEY | Vercel | PostHog analytics |
| NEXT_PUBLIC_POSTHOG_HOST | Vercel | PostHog host (default: us.i.posthog.com) |
| SENTRY_DSN | Railway | Backend Sentry |
Super-Admin Dashboard (Session 37)
Architecture
- **super_admins** table: user_id UUID PK, RLS enabled with no policies (service_role only).
- **is_super_admin: true** in user app_metadata, checked by frontend layout.
- Backend: dependencies_admin.py → require_super_admin() validates against super_admins table via service_role.
- Router: routers/admin.py, 6 endpoints under /api/admin, all require super_admin.
Admin Endpoints
| Endpoint | Returns |
|---|---|
| GET /api/admin/stats | Total tenants, active tenants, users, scrape sessions, success rate, MRR |
| GET /api/admin/tenants?page=N | Paginated tenant list with competitor/product counts |
| GET /api/admin/tenants/{id} | Tenant drill-down: members, competitors, recent scrapes |
| GET /api/admin/activity | Cross-tenant activity feed (last 50 sessions) |
| GET /api/admin/errors | Failed sessions with error messages |
| GET /api/admin/billing | Subscribers by plan, MRR calculation |
Frontend Routes
| Route | Page |
|---|---|
| /admin | Platform KPIs, billing overview, errors, activity feed |
| /admin/tenants | Paginated tenant table with search and drill-down |
| /admin/tenants/[id] | Tenant detail: members, competitors, scrapes, subscription |
| /admin/activity | Cross-tenant scrape/match/sync activity feed |
Access Control
- Header user dropdown shows "Admin Dashboard" link only when isSuperAdmin prop is true.
- Dashboard layout checks user.app_metadata.is_super_admin.
- Backend enforces via super_admins table lookup on every admin endpoint.
Code Quality & Shared Utilities (Session 39)
Python Linting
All Python code is linted and formatted with ruff (configured in pyproject.toml at repo root).
- Rules: E/F/W (pycodestyle/pyflakes), I (isort), UP (pyupgrade), B (bugbear), SIM (simplify), RUF
- Pre-commit: .husky/pre-commit runs ruff check and ruff format --check on staged .py files
- Dev dependency: ruff>=0.8.0 in backend/requirements-dev.txt
Shared Backend Utilities
| Module | Purpose |
|---|---|
| backend/app/utils/sessions.py | update_session_status(): single helper replacing 23+ duplicated session-update try/except blocks across services and scripts |
| backend/app/constants.py | Centralized magic numbers: timeouts, pagination limits, matcher thresholds, AI params, webhook config |
Shared Frontend Utilities
| Module | Purpose |
|---|---|
| frontend/src/lib/utils/tenant.ts | getCurrentTenantId(supabase): replaces 8 duplicated user_tenants fetch patterns |
| frontend/src/lib/utils/date.ts | timeAgo(), getWeekKey(): shared date formatting |
| frontend/src/lib/constants.ts | POLLING, CONFIDENCE_TIERS, CHART_COLORS: app-wide constants |
Analytics Chart Components
The analytics page is split into focused components in frontend/src/components/dashboard/analytics/:
| Component | Renders |
|---|---|
| CompetitorPriceDiffChart | Horizontal bar chart of avg price difference by competitor |
| PriceCompetitivenessChart | Donut chart of win/loss price distribution |
| CompetitorTrendChart | Line chart of price diff trends over time |
| ConfidenceChart | Bar chart of match confidence distribution |
| CategoryBreakdown | Pie chart of matches by product category |
| BiggestPriceGaps | List of products with largest price differences |
Contributor Onboarding
- CONTRIBUTING.md — Dev setup, testing, code style, branch naming, PR checklist
- backend/README.md — FastAPI structure, auth flow, how to add endpoints, test patterns
- README.md — Updated to reflect current Next.js + FastAPI stack (was outdated Streamlit-era)
Error Handling Policy
All production code follows: never silently swallow exceptions. Every except block either:
- Re-raises the exception, or
- Logs a logger.warning() with context before continuing
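Both accepted shapes of an except block can be shown side by side. The function names and the logger name below are hypothetical; only the two-option policy comes from the text.

```python
import logging
from typing import Optional

logger = logging.getLogger("vantagedash.example")  # hypothetical logger name


def parse_price(raw: str, sku: str) -> Optional[float]:
    """Option 2: log a warning with context, then continue with a fallback."""
    try:
        return float(raw.replace("$", "").replace(",", ""))
    except ValueError as exc:
        logger.warning("Could not parse price %r for SKU %s: %s", raw, sku, exc)
        return None


def require_price(raw: str, sku: str) -> float:
    """Option 1: add context in the log, then re-raise for the caller to handle."""
    try:
        return float(raw)
    except ValueError:
        logger.error("Invalid price %r for SKU %s", raw, sku)
        raise
```

What the policy rules out is the third shape, `except Exception: pass`, where the failure leaves no trace at all.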
Recent Changes (Sessions 38–44)
Last updated: 2026-03-20
Session 38 — Quality Fixes
- Fixed 7 vitest test errors across frontend test suite
- Billing URL config: FRONTEND_URL env var added to backend Settings for Stripe checkout redirect URLs
- Admin error boundary: added error handling to super-admin pages to prevent white-screen crashes
- Logs page type safety: fixed TypeScript type errors in scrape/sync/matching session log displays
Session 39 — Code Maintainability Overhaul
- Ruff linting fully integrated: pyproject.toml at repo root configures rules (E/F/W/I/UP/B/SIM/RUF), .husky/pre-commit runs ruff check + ruff format --check on staged .py files
- Shared backend utilities: backend/app/utils/sessions.py (update_session_status() replacing 23+ duplicated blocks), backend/app/constants.py (centralized magic numbers)
- Shared frontend utilities: lib/utils/tenant.ts (getCurrentTenantId()), lib/utils/date.ts (timeAgo(), getWeekKey()), lib/constants.ts (POLLING, CONFIDENCE_TIERS, CHART_COLORS)
- Error handling policy: all except blocks now either re-raise or log logger.warning() with context, so nothing is silently swallowed
- Contributor docs: CONTRIBUTING.md, backend/README.md, updated root README.md
Session 40 — Monitoring Activation
- Sentry activated: Env vars set in both Vercel (NEXT_PUBLIC_SENTRY_DSN, SENTRY_AUTH_TOKEN, SENTRY_ORG, SENTRY_PROJECT) and Railway (SENTRY_DSN). Error tracking + performance monitoring live in production.
- PostHog activated: NEXT_PUBLIC_POSTHOG_KEY set in Vercel. Product analytics, session replay, and custom event tracking active.
- 8 new Shopify competitors scraped: Dragon Chewer, Green Tech Packaging, Marijuana Packaging, CannaSup Co, Grove Bags, Sana Packaging, Flush Packaging, 420 Science. Total: 17 competitors, 7,648 products.
Session 41 — CI & Stripe Finalization
- GitHub Actions set to manual-only: All 4 workflows (ci.yml, backend CI, Playwright, security) changed to workflow_dispatch only. No more auto-triggers on push/PR (free minutes exhausted). Re-enable by restoring on: push/pull_request triggers.
- Stripe setup wizard completed: Tax collection enabled (automatic mode, SaaS tax code txcd_10000000, Maryland registered, NAICS 541512). Checkout portal configured.
Session 42 — Scraper Resilience & Marketing Page
Scraper Retry Logic
- Per-competitor retry with exponential backoff: Each competitor scrape now retries up to 3 times on failure with backoff delays, preventing a single timeout from killing an entire batch scrape
- Session-level finally block: scrape_sessions status always gets updated (completed/failed) even on unexpected exceptions
- Single-competitor session tracking: POST /api/scrape/{competitor_id} now creates its own scrape session (previously only batch scrapes had sessions)
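The retry and finally-block behaviour can be sketched as below. The attempt count (3 total), the delay schedule (1s, 2s, ...), and all function names are assumptions for illustration; the session is modelled as a plain dict rather than a scrape_sessions row.

```python
import logging
import time
from typing import Callable

logger = logging.getLogger("scraper.retry")  # hypothetical logger name


def scrape_with_retry(scrape: Callable[[], list], max_attempts: int = 3,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> list:
    """Call `scrape` until it succeeds or `max_attempts` is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return scrape()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: the caller records the failure
            sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
    return []  # only reached if max_attempts < 1


def run_batch(scrapers: dict[str, Callable[[], list]], session: dict,
              sleep: Callable[[float], None] = time.sleep) -> dict[str, list]:
    """Scrape every competitor; one failure never aborts the batch."""
    results: dict[str, list] = {}
    session["status"] = "running"
    final_status = "failed"
    try:
        for name, scrape in scrapers.items():
            try:
                results[name] = scrape_with_retry(scrape, sleep=sleep)
            except Exception as exc:
                logger.warning("Competitor %s failed after retries: %s", name, exc)
                results[name] = []
        final_status = "completed"
    finally:
        # Runs on every exit path, so the session never stays "running".
        session["status"] = final_status
    return results
```

Passing `sleep` as a parameter is a testing convenience: tests can record the delays instead of actually waiting.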
Frontend Route Restructuring
- Public landing/marketing page: New (marketing) route group with a public page at /. The root URL now shows a marketing/landing page instead of redirecting to login
- Dashboard moved to **/overview**: The main dashboard page moved from / to /overview within the (dashboard) route group
- Updated directory layout:
frontend/src/app/
├── (marketing)/ # Public pages (landing page at /)
├── (auth)/ # Login, signup, callback
├── (dashboard)/ # All dashboard pages (requires auth)
│ ├── overview/ # Main dashboard (was previously at /)
│ ├── competitors/
│ ├── products/
│ └── ...
└── (admin)/ # Super-admin pages
Session 43 — Warm Neon Aero Redesign
Visual Redesign
- Color palette overhaul: Replaced cyan/purple Alienware palette with warm gold/amber/bronze ("Warm Neon Aero"). New CSS custom properties: --aero-gold (#f0a500), --aero-honey (#ffd166), --aero-amber (#e07c24), --aero-bronze (#b8723b), --aero-ember (#c44d1a).
- Heading font: Sora replaced Exo 2 (--font-heading variable). Warmer geometric sans-serif that pairs better with the gold accent palette.
- Light/dark mode: Added next-themes ThemeProvider in root layout. Light mode uses warm cream base (#faf8f5), dark mode uses warm black (#0d0b08). Both modes have matching glassmorphism card effects and ambient gradient orbs in gold/bronze tones.
- Landing page bento grid: Marketing page at / redesigned with a 2+4 bento grid layout (2 large feature cards + 4 compact cards), "How it Works" numbered steps, and pricing cards. All using the warm neon aero palette.
Accent Color Customization
- **AccentColorProvider** (components/providers/accent-color-provider.tsx): Reads tenant's brand color from settings and injects it as CSS custom properties via applyAccentColor().
- **theme.ts** (lib/utils/theme.ts): Utilities for hex-to-RGB conversion, WCAG contrast calculation, color lightening/darkening, and applying/clearing accent color overrides on <html>.
- Overridden tokens: --primary, --primary-foreground, --ring, --sidebar-primary, --chart-1, --color-aero-gold/honey/bronze.
Product Variant Expansion
- Products page now shows expandable variant sub-rows. Products with multiple variants display a chevron toggle and "N variants" badge.
- Clicking a product row expands inline sub-rows showing each variant's title, SKU, price, pack quantity, and calculated PPU.
- Variant sub-rows use extractPackQuantity() from price.ts to parse pack sizes from variant titles and compute per-unit pricing.
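The pack-size parsing and per-unit calculation can be sketched as follows. This is a Python illustration rather than the TypeScript in price.ts, and the title formats it recognizes ("Pack of 50", "100 ct", "25/case", and similar) are assumptions about what variant titles look like, not the project's actual patterns.

```python
import re

# Assumed title formats; the real patterns may differ.
_PACK_PATTERNS = [
    re.compile(r"pack\s+of\s+(\d+)", re.IGNORECASE),
    re.compile(r"(\d+)\s*[- ]?\s*(?:pack|pk|ct|count|pcs|pieces)\b", re.IGNORECASE),
    re.compile(r"(\d+)\s*/\s*case", re.IGNORECASE),
]


def extract_pack_quantity(title: str) -> int:
    """Return the pack size parsed from a variant title, defaulting to 1."""
    for pattern in _PACK_PATTERNS:
        match = pattern.search(title)
        if match and int(match.group(1)) > 0:
            return int(match.group(1))
    return 1


def price_per_unit(price: float, title: str) -> float:
    """Divide the variant price by its pack size to get the PPU."""
    return round(price / extract_pack_quantity(title), 4)
```

Defaulting to a quantity of 1 means a title with no recognizable pack size simply yields a PPU equal to the listed price.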
Updated Test Counts
| Suite | Count |
|---|---|
| vitest (frontend) | 651 tests (53 files) |
| pytest (backend) | 1,427+ tests (50+ files) |
| Playwright (e2e) | ~170 tests (8 files) |
| Total | ~2,240+ |
Recent Updates (Sessions 38–44)
Session 38 — Quality Fixes
- Fixed 7 vitest test errors (type mismatches, missing mocks)
- Billing URL configuration fix (backend URL from env vars)
- Admin page error boundary added
- Logs page type safety improvements
Session 39 — Code Maintainability Overhaul
- Ruff linting: Configured in pyproject.toml, enforced via pre-commit hook
- Shared utilities: backend/app/utils/sessions.py, backend/app/constants.py, frontend/src/lib/utils/tenant.ts, frontend/src/lib/utils/date.ts, frontend/src/lib/constants.ts
- Error handling: All production except Exception: pass replaced with logger.warning()
- Type hints: Return types on all 9 critical Python entry points
- Component architecture: Analytics page split into 6 extracted chart components
- Contributor docs: CONTRIBUTING.md, backend/README.md, updated root README.md
Session 40 — Monitoring & Scraping
- Sentry activated: Env vars set in Vercel (NEXT_PUBLIC_SENTRY_DSN, SENTRY_AUTH_TOKEN) and Railway (SENTRY_DSN)
- PostHog activated: NEXT_PUBLIC_POSTHOG_KEY set in Vercel; product analytics + session replay live
- 8 new Shopify competitors scraped (17 total, 7,648 products)
Session 41 — CI & Stripe
- GitHub Actions: All 4 workflows set to workflow_dispatch only (manual trigger, no failure emails)
- Stripe setup wizard: Completed. Stripe Tax enabled, checkout portal configured, SaaS tax code applied
Session 42 — Scraper Resilience, Landing Page, Wiki Update
- Scraper resilience: Per-competitor retry with backoff (1 retry, 3s delay), session-level finally block guarantees sessions are never stuck on "running", single-competitor scrapes now create session rows for progress tracking
- Landing/marketing page: New (marketing) route group with public landing page at /. Dashboard overview moved from / to /overview. Hero, features grid, pricing cards, CTA sections. Middleware updated to allow / without auth.
- Route changes: All sidebar/nav links updated from / to /overview. Auth callbacks redirect to /overview.
- Test count: 651 vitest + 1,389+ pytest = ~2,210+ total (zero failures)
Session 43 — Warm Neon Aero Redesign
- Color palette: Warm gold/amber/bronze replacing cyan/purple Alienware palette. New --aero-gold/honey/amber/bronze/ember CSS custom properties.
- Typography: Sora font replacing Exo 2 for headings (--font-heading). Later replaced by Instrument Sans in session 44.
- Light/dark mode: next-themes ThemeProvider added to root layout. Light mode cream base, dark mode warm black.
- Accent color customization: AccentColorProvider + theme.ts allow per-tenant brand color overrides with WCAG contrast calculation.
- Product variants: Expandable variant sub-rows on Products page with pack quantity and PPU per variant.
- Landing page redesign: Bento grid layout (2 large + 4 compact feature cards), numbered "How it Works" steps, warm palette pricing cards.
- New files: theme-provider.tsx, accent-color-provider.tsx, theme.ts
Session 44 — UI Polish, New Pages, Playwright Scraping
- Typography: Heading font changed from Sora to Instrument Sans for a cleaner, more professional look.
- Logo: Gradient darkened to #c07800→#e09520.
- Landing page: Buttons changed from rounded-lg to rounded-md.
- New pages: /products/[id] (product detail page) and /competitors/[id] (competitor report page).
- New shared components: confidence-bar.tsx (extracted), competitor-avatar.tsx (Google favicon service), competitor-link.tsx, searchable-product-select.tsx (@base-ui/react/combobox).
- Edit/Create match dialogs: Now use searchable product combobox instead of plain select.
- CompetitorAvatar: Displays favicons from competitor URLs via Google's favicon service.
- Scrape progress bar: RunTaskButton shows live progress_percent + progress_message during scraping.
- Scraper fallback chain expanded to 7 steps: Shopify → Uline → WooCommerce API → WooCommerce Sitemap → HTML Listing → Playwright (JS) → Firecrawl+AI.
- Playwright headless browser: New step 6 in the fallback chain for JS-rendered sites (Squarespace+Ecwid, React SPAs). Uses Playwright with Chromium to render pages and extract product data.
- Design & Customize: Identified as Squarespace+Ecwid (not WooCommerce) — requires Playwright to scrape prices.
Session 46 — SEO, Blog, Ecwid Scraper
SEO
- metadataBase: Set to https://vantagedash.io in root layout so that all relative OG URLs resolve correctly.
- Open Graph + Twitter cards: Title, description, and image on all pages.
- JSON-LD structured data: Organization + SoftwareApplication schemas in root layout.
- Dynamic OG image: opengraph-image.tsx generates a branded 1200x630 image at build time.
- sitemap.ts: Auto-generates sitemap with all static pages + blog posts.
- robots.ts: Allows all crawlers, points to sitemap URL.
Blog (/blog)
- Route under (marketing) layout (standalone nav + footer, no sidebar).
- Markdown posts stored in frontend/content/blog/ with YAML frontmatter (title, date, tags, excerpt).
- Blog index page (/blog) shows post cards with excerpts, reading time, and tags.
- Individual post pages (/blog/[slug]) render markdown with Tailwind typography (@tailwindcss/typography).
- CTA footer on each post drives signup conversions.
- Blog link added to marketing navigation bar.
- 3 initial posts: competitor price tracking, AI vs fuzzy matching, PPU comparison.
- New files: lib/blog/index.ts (markdown loading, frontmatter parsing), content/blog/*.md.
Ecwid Scraper
- New scrape_ecwid_store() function in scraper.py.
- Auto-detects Ecwid store ID from page HTML (regex patterns for ecwid.com or Ecwid. references).
- Uses Ecwid public REST API (app.ecwid.com/api/v3/{store_id}/products), no auth token needed.
- Pagination support (100 products per page).
- Falls back to Playwright DOM scraping if API returns errors.
- Fallback chain updated to 8 steps: Shopify → Uline → WooCommerce API → WooCommerce Sitemap → HTML Listing → Playwright (JS) → Ecwid API → Firecrawl+AI.
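The 100-per-page pagination can be sketched with the HTTP call abstracted away. Here `fetch_page(offset, limit)` is a hypothetical callable, and the response shape (an "items" list plus a "total" count) is an assumption about what the API returns rather than something the text confirms.

```python
from typing import Callable

PAGE_SIZE = 100  # products per page, as stated above


def fetch_all_products(fetch_page: Callable[[int, int], dict]) -> list[dict]:
    """Collect every product by walking offset-based pages.

    `fetch_page(offset, limit)` is a hypothetical callable returning a dict
    with an "items" list and a "total" count (assumed response shape).
    """
    products: list[dict] = []
    offset = 0
    while True:
        page = fetch_page(offset, PAGE_SIZE)
        items = page.get("items", [])
        products.extend(items)
        offset += PAGE_SIZE
        # Stop on an empty page or once the reported total is covered.
        if not items or offset >= page.get("total", 0):
            return products
```

Stopping on an empty page as well as on the total guards against an endless loop if the reported total is wrong or missing.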
Session 47 — Firecrawl Eliminated, Ecwid Fixed, Blog Expansion, Email Drip
Firecrawl Eliminated
The paid Firecrawl dependency has been completely removed:
- Removed firecrawl-py from backend/requirements.txt, replaced with beautifulsoup4.
- Deprecated scrape_non_shopify_store(): emits DeprecationWarning, no longer in fallback chain.
- New scrape_with_playwright_and_ai(): uses free Playwright rendering + OpenAI text extraction. Cost: ~$0.003/site (vs $0.30 with Firecrawl).
Ecwid Scraper Fixed
- _detect_ecwid_store_id() now returns (store_id, token) tuple.
- fetch_ecwid_products() tries 3 token strategies: extracted token, public_{store_id}, no-token fallback.
- XHR interception: Playwright captures Ecwid API responses during browser rendering.
Enhanced Playwright Scraper
- XHR interception captures product data from JSON API responses.
- Scroll-to-load: Auto-scrolls 3x viewport height to trigger lazy loading.
- More CSS selectors: WooCommerce, Ecwid v2+, BigCommerce, generic.
- _extract_product_from_card() DRY helper and _parse_price_text() handles price ranges.
- Wait time increased from 3s to 5s.
BeautifulSoup HTML Scraper
New Strategy 4 in scrape_product_listing_html() using schema.org microdata and CSS class patterns.
Updated Fallback Chain
- Shopify /products.json (FREE)
- Uline dedicated scraper (FREE)
- WooCommerce Store API (FREE)
- Generic Sitemap + JSON-LD (FREE)
- HTML Listing + BeautifulSoup (FREE, enhanced)
- Ecwid API with token auth (FREE, fixed)
- Playwright JS renderer (FREE, enhanced)
- Playwright + OpenAI extraction (~$0.003/site, replaces Firecrawl)
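A fallback chain like this reduces to trying strategies in order until one returns products. The sketch below shows that control flow with three stub strategies standing in for the eight real ones; the stub names, their behaviour, and the logger name are all hypothetical.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("scraper.fallback")  # hypothetical logger name

Strategy = tuple[str, Callable[[str], list]]


def scrape_with_fallback(url: str,
                         strategies: list[Strategy]) -> tuple[Optional[str], list]:
    """Try each strategy in order; return the first that yields products."""
    for name, strategy in strategies:
        try:
            products = strategy(url)
        except Exception as exc:
            logger.warning("Strategy %s failed for %s: %s", name, url, exc)
            continue
        if products:
            return name, products
    return None, []


# Stub strategies for illustration only.
def shopify_json(url: str) -> list:
    raise ConnectionError("no /products.json")


def html_listing(url: str) -> list:
    return []  # parsed the page but found nothing


def playwright_render(url: str) -> list:
    return [{"name": "Mylar bag", "price": 12.5}]


CHAIN: list[Strategy] = [
    ("shopify_json", shopify_json),
    ("html_listing", html_listing),
    ("playwright_render", playwright_render),
]
```

Ordering the free strategies first means the paid extraction step only runs when everything cheaper has raised or come back empty.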
Blog Expansion
7 new SEO-targeted posts (10 total): competitive pricing, packaging industry, Shopify analysis, price alerts, supplements, web scraping legal, product matching algorithms.
Email Drip System
- email_drip_log table + pg_cron + pg_net extensions enabled.
- Edge Function send-drip-email deployed (Resend free tier).
- 4-email sequence: Day 0 welcome, Day 1 add competitor, Day 3 landscape, Day 7 upgrade.
- pg_cron runs daily at 2:07pm UTC. Set RESEND_API_KEY in Supabase secrets to activate.
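The scheduling logic behind a daily drip job can be sketched as a pure function. The real system runs in pg_cron and an Edge Function, so this Python version is illustrative only; the email keys and the catch-up behaviour (sending anything overdue, not only what falls due today) are assumptions.

```python
from datetime import date

# Day offsets from the sequence above; the key names are hypothetical.
DRIP_SEQUENCE = [(0, "welcome"), (1, "add_competitor"), (3, "landscape"), (7, "upgrade")]


def emails_due(signup: date, today: date, already_sent: set[str]) -> list[str]:
    """Return emails whose day offset has passed and that were not sent yet."""
    days_since = (today - signup).days
    return [key for offset, key in DRIP_SEQUENCE
            if days_since >= offset and key not in already_sent]
```

Tracking what was already sent (the role email_drip_log would play) makes the job safe to run more than once per day.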
Tests: 1,417 pytest + 651 vitest = 2,068+ total, zero failures.
Session 48 — RLS Bug Fix, Blog Public, Scraper Platform Expansion
Critical Bug Fix: Single-Competitor Scrape RLS Bypass
POST /api/scrape/{competitor_id} was saving 0 products due to an RLS bypass bug. The single-competitor code path did not inject the auth-scoped DB client into scrape_and_save_store() — it fell through to the anon key via _ensure_supabase(), causing RLS to silently block all product_tracking inserts. The batch scraper (POST /api/scrape) worked because run_scraper receives the DB client directly.
Fix: Inject the service-role DB client in run_scrape_single + add _ensure_supabase safety net in scrape_and_save_store.
Blog Made Public
/blog and /blog/* routes were behind auth middleware, breaking SEO crawling. Added to the public paths list in frontend/src/middleware.ts.
Scraper Platform Improvements
- Magento: Added _extract_product_from_microdata() fallback using BeautifulSoup for Magento pages that use schema.org microdata instead of JSON-LD. Sitemap URL quality check now validates microdata alongside JSON-LD.
- Wix: Added Wix-specific DOM selectors (data-hook, ProductItem) for Playwright. Added Wix Stores API interception in XHR handler.
- Ecwid: Expanded shop_paths with /shop-all, /all-products, /all for Ecwid stores.
- RSS/WP Feed: Accept all URLs from WordPress product feeds (not just /product/ paths).
- DRY refactoring: Extracted _xhr_items_to_products() and _extract_price_from_xhr_item() helpers.
Scrape Logging for Single-Competitor Scrapes
Single-competitor scrapes (POST /api/scrape/{id}) now write scrape_logs entries for debugging, matching the behavior of batch scrapes.
Email Template Assets
Added stacked windows SVG at /email/stacked-windows.svg for use in drip email templates (CSS positioning stripped by email clients).
RLS Policy Hardening
Replaced tautological RLS policies (USING true) in the supabase_enable_rls.sql migration script with real get_user_tenant_id() enforcement, matching the live Supabase state. All 20 public tables confirmed to have proper tenant isolation.
Marketing Claims Updated
- "NIST 800-53 compliance" changed to "NIST 800-53 aligned" (no third-party audit conducted)
- "Firecrawl+AI" feature badge replaced with "Ecwid" (Firecrawl eliminated in session 47)
Session 50 — Scraper Accuracy Testing & Fixes (2026-03-21)
Intensive scraper testing against all 17 live competitor sites with Chrome visual verification. Fixed 6 critical data quality bugs.
Bugs Fixed
| Bug | Affected Store | Fix |
|---|---|---|
| Sitemap sampling grabbed category pages first | ClearBags (Magento) | Smart sampling prefers .html URLs and SKU-like slugs over category pages |
| <dialog> "Product modal" extracted as product name | ClearBags | Remove <dialog> elements before BS4 parsing; blocklist UI text |
| Playwright extracted "Quick View" overlay text | Mylar Legends (Wix) | Added [data-hook='product-item-name'] and [data-hook='product-item-price-to-pay'] as priority selectors |
| All $0 prices on Wix stores | Mylar Legends | Added handlers for convertedPriceData, formattedPrice, variant nested price formats |
| Uline keyword substring false positives | Uline | Word-boundary regex matching ("tin" no longer matches "counting", "cart" no longer matches "carton") |
| Duplicate products across scrape sessions | All stores | Delete existing competitor products before inserting fresh snapshot (each scrape is a full replacement) |
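The Uline fix in the table relies on regex word boundaries rather than substring checks. A minimal sketch of that idea follows; the helper name keyword_matches is hypothetical and the project's actual matcher may differ:

```python
import re


def keyword_matches(keyword: str, text: str) -> bool:
    """True only when keyword appears as a whole word in text (case-insensitive)."""
    # re.escape guards against keywords that contain regex metacharacters.
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None
```

One trade-off of strict boundaries is that plurals such as "tins" no longer match "tin", so plural forms would need to be listed as separate keywords.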
Additional Improvements
- WooCommerce price_range fallback for variable products (range-priced items)
- Googlebot UA fallback when Chrome UA gets 403
- Sitemap product cap raised from 300 to 500, with .html URLs prioritized
- Magento-specific CSS selectors added to both BeautifulSoup and Playwright
- [data-hook='product-item-root'] Wix DOM selector added
- 10 new scraper tests (175 total scraper tests, 183 scrape-related total)
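The sitemap prioritization can be pictured as a stable sort followed by a cap. The sketch below is one way to do it under stated assumptions: the SKU_LIKE pattern (a slug mixing letters and digits) and the function names are guesses at what "SKU-like" means, not the project's actual heuristic.

```python
import re
from urllib.parse import urlparse

SITEMAP_PRODUCT_CAP = 500  # cap stated in the notes above

# Assumption: a "SKU-like" slug mixes letters and digits, e.g. "b54-clear".
SKU_LIKE = re.compile(r"(?=.*[a-z])(?=.*\d)[a-z0-9-]+$")


def url_priority(url: str) -> int:
    """Lower value sorts first: .html pages, then SKU-like slugs, then the rest."""
    path = urlparse(url).path.lower().rstrip("/")
    slug = path.rsplit("/", 1)[-1]
    if slug.endswith(".html"):
        return 0
    if SKU_LIKE.match(slug):
        return 1
    return 2


def sample_sitemap_urls(urls: list[str], cap: int = SITEMAP_PRODUCT_CAP) -> list[str]:
    """Stable-sort URLs by priority and keep at most cap of them."""
    return sorted(urls, key=url_priority)[:cap]
```

Because sorted() is stable, URLs within the same priority tier keep their original sitemap order.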
Verified Results
- ClearBags: 0 products → 500 products with real names and prices (verified in Chrome: exact match)
- Mylar Legends: 21 products, all at $0.00 → 10 products with real prices ($0.11-$3.00)
- Biohazard Inc: 6,898 rows (3,440 unique) → ~3,440 on next scrape via dedup fix
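The dedup fix treats each scrape as a full replacement of a competitor's rows. A minimal in-memory sketch of that semantics follows, with a list of dicts standing in for the product table; the function name and row shape are hypothetical:

```python
def replace_snapshot(table: list[dict], competitor_id: str, fresh: list[dict]) -> list[dict]:
    """Return the table with all rows for competitor_id replaced by the fresh scrape."""
    # Keep every row that belongs to other competitors untouched.
    kept = [row for row in table if row["competitor_id"] != competitor_id]
    # Tag the fresh products with the competitor they were scraped from.
    new_rows = [{**product, "competitor_id": competitor_id} for product in fresh]
    return kept + new_rows
```

Against a real database, the delete and insert would presumably run in one transaction, so that a failed insert does not leave the competitor with zero products.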
Known Limitations
- Cannaline: sgcaptcha bot protection blocks ALL HTTP requests (even Googlebot). Requires captcha-solving or headful browser approach.
- Design & Customize: Ecwid API returns 403; prices only visible on individual product detail pages (not listings). 16/77 product names extracted, 0 prices.
Session 49 — Operational Completion (2026-03-20)
- Resend API key activated: Email drip system now live. RESEND_API_KEY is set in Supabase Edge Function secrets. The 4-email onboarding sequence (welcome → add competitor → features → upgrade) runs daily at 2:07pm UTC via pg_cron, sent from onboarding@vantagedash.io.
- 3 WooCommerce competitors re-scraped: Mylar Legends (21 products), ClearBags (13 products), Design & Customize (4 products). All scraped successfully with enhanced Playwright + BeautifulSoup scrapers (previously failed due to exhausted Firecrawl credits).
- All operational items complete: Product is feature-complete with all infrastructure active. Remaining items: go-to-market (customer acquisition) and Shopify App Store (deferred).
- SVG favicon: Orange gradient circle with white lightning bolt (icon.svg), replaces old favicon.ico. Matches the .vd-logo used across nav/sidebar/auth pages.
- Monthly usage budget enforcement: PLAN_LIMITS now includes monthly_scrapes, monthly_ai_matches, monthly_embeddings. check_monthly_limit() in billing_service.py checks against session tables within billing period. Scrape router returns 429 at limit, match router returns 403 for disallowed methods + 429 at limit. Frontend "Monthly Budget" card shows usage bars.
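The 403 versus 429 split can be sketched as a small decision function. Everything below is illustrative: the limit values, the plan names, the monthly_limit_status helper, and the choice to model a disallowed method as a limit of 0 are assumptions, not the contents of billing_service.py.

```python
from typing import Optional

# Hypothetical limits for illustration; the real PLAN_LIMITS values are not shown here.
EXAMPLE_PLAN_LIMITS = {
    "free": {"monthly_scrapes": 10, "monthly_ai_matches": 0, "monthly_embeddings": 100},
    "pro": {"monthly_scrapes": 500, "monthly_ai_matches": 200, "monthly_embeddings": 5000},
}


def monthly_limit_status(plan: str, resource: str, used_this_period: int) -> Optional[int]:
    """Return an HTTP status to reject with, or None when the request may proceed."""
    limit = EXAMPLE_PLAN_LIMITS[plan][resource]
    if limit == 0:
        return 403  # assumption: a zero limit means the plan does not allow this at all
    if used_this_period >= limit:
        return 429  # monthly budget exhausted for the current billing period
    return None
```

A router could call this before starting work and raise an HTTP error with the returned status, which keeps the budget check out of the scraping and matching code paths.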
Session 51 — Stealth Playwright, Pagination, Enrichment, New Competitors (2026-03-21)
- Session 52 (2026-03-21): Railway→Coolify doc cleanup complete (10 files), SKS Bottle scraping fixed (BS4 Strategy C + sitemap listing fallback), 13 new tests (200 scraper total, 2,278+ overall)
Features Added
| Feature | Impact |
|---|---|
| playwright-stealth | Bypasses bot detection (Cannaline sgcaptcha solved: 0 → 365 products) |
| Pagination | Follows next-page links up to 15 pages (Cannaline: 29 → 365) |
| Detail page enrichment | Visits product URLs for prices when listings hide them (D&C: 0 → 14 prices) |
| Shop page preference | Tries /shop when root has <10 products (D&C: 4 → 14 products) |
| WooCommerce detail selectors | wait_for_selector for async-loaded variation prices |
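The pagination row describes following next-page links with a 15-page cap. A minimal sketch of that loop, assuming a page fetcher that returns the products on a page plus the next-page URL (or None); the scrape_paginated name and the fetcher signature are hypothetical, and the loop guard is an added safety measure rather than something the notes mention:

```python
from typing import Callable, Optional

MAX_PAGES = 15  # page cap stated in the table above

# A page fetcher returns (products on that page, URL of the next page or None).
FetchPage = Callable[[str], tuple[list[dict], Optional[str]]]


def scrape_paginated(
    start_url: str, fetch_page: FetchPage, max_pages: int = MAX_PAGES
) -> list[dict]:
    """Follow next-page links from start_url, stopping at max_pages or on a loop."""
    products: list[dict] = []
    seen: set[str] = set()
    url: Optional[str] = start_url
    while url is not None and url not in seen and len(seen) < max_pages:
        seen.add(url)
        page_products, url = fetch_page(url)
        products.extend(page_products)
    return products
```

Passing the fetcher in as a parameter keeps the loop testable without a browser; in practice it would wrap the Playwright page navigation.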
5 New Competitors (22 Total, 7 Platforms)
| Competitor | Platform | Products | Coverage |
|---|---|---|---|
| PackFreshUSA | BigCommerce | 104 | 100% |
| 420 Stock | WooCommerce | 482 | 100% |
| SKS Bottle & Packaging | Custom PHP | 4 | 100% |
| CannaZip | WooCommerce | 84 | 88% |
| Specialty Bottle | BigCommerce | 500 | ~100% |
Previously "Impossible" Stores Now Working
- Cannaline: 365 products, 98.4% with prices (sgcaptcha bypassed by stealth)
- Design & Customize: 14 products, 100% with prices (detail enrichment + shop preference)
Deployment Change
- Backend migrated from Railway to Coolify (Docker-based, same Dockerfile)
- api.vantagedash.io now points to Coolify instance
- Dockerfile is platform-agnostic (curl added for Coolify healthcheck)
Test Counts
- 200 scraper tests (175 + 12 new in Session 51 + 13 new in Session 52), 2,278+ total tests, zero failures