Database Schema & RLS

VantageDash uses Supabase (PostgreSQL) with Row Level Security (RLS) for multi-tenant data isolation. All tables reference tenants(id) via tenant_id foreign keys.

Extensions

uuid-ossp: UUID generation (uuid_generate_v4())
pgvector: Vector similarity search for product embeddings

Tables Overview

#	Table	Purpose	Key Columns
0	`tenants`	Tenant (brand) metadata	`id` (slug), `brand_name`, `brand_url`, `industry`, `encrypted_shopify_url/token`
1	`user_tenants`	Maps auth users → tenants	`user_id` (auth.uid), `tenant_id`, `role`
2	`competitors`	Competitor monitoring targets	`name`, `url`, `brand_color`
3	`product_tracking`	Scraped competitor products + prices	`competitor_name`, `product_name`, `current_price`, `embedding vector(1536)`
4	`brand_products`	Brand's own products (from Shopify)	`shopify_id`, `title`, `price`, `category`, `sku`, `embedding vector(1536)`
5	`product_groups`	Matched product pairs	`brand_product_id` → `brand_products`, `matched_competitor`, `match_confidence`
6	`scrape_sessions`	Scraping operation tracking	`status`, `total_targets`, `progress_percent`, `error_message`
7	`scrape_logs`	Per-competitor scrape details	`session_id` → `scrape_sessions`, `competitor_name`, `status`
8	`sync_sessions`	Shopify sync tracking	`status`, `total_products`, `progress_percent`, `error_message`
9	`sync_logs`	Per-product sync details	`session_id` → `sync_sessions`, `product_title`, `status`
10	`matching_sessions`	Matching operation tracking	`status`, `total_products`, `matched_products`, `error_message`
11	`matching_logs`	Per-product match details	`tenant_id` → `tenants`, `session_id` → `matching_sessions`, `brand_product_name`, `match_confidence`
12	`alerts_config`	Alert rule configuration	`email_enabled`, `price_drop_percent`, `price_increase_percent`
13	`alerts_history`	Historical alert records	`competitor_name`, `product_name`, `price_change_percent`, `read_at`
14	`industry_profiles`	Per-tenant matching config (JSONB)	`industry_slug`, `categories`, `scoring_config`, `use_new_matcher`

Detailed Table Schemas

tenants

Primary identity table. id is a short slug (e.g., "mybrand").

CREATE TABLE tenants (
    id TEXT PRIMARY KEY,
    brand_name TEXT NOT NULL,
    brand_url TEXT,
    shopify_store TEXT,
    brand_color TEXT DEFAULT '#0070f3',
    logo_url TEXT,
    timezone TEXT DEFAULT 'America/New_York',
    industry TEXT DEFAULT 'ecommerce',
    product_description TEXT DEFAULT 'products',
    encrypted_shopify_url TEXT,      -- Fernet-encrypted
    encrypted_shopify_token TEXT,    -- Fernet-encrypted
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

product_tracking

Central table for competitor pricing data. Has pgvector embedding column.

CREATE TABLE product_tracking (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    tenant_id TEXT REFERENCES tenants(id),
    competitor_name TEXT NOT NULL,
    product_name TEXT NOT NULL,
    current_price NUMERIC(10,2),
    url TEXT,
    finish_type TEXT DEFAULT 'standard',
    base_name TEXT,
    ppu NUMERIC,
    embedding vector(1536),          -- pgvector for similarity search
    created_at TIMESTAMPTZ DEFAULT NOW()
);

brand_products

Brand's own products synced from Shopify. Also has pgvector embedding.

CREATE TABLE brand_products (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    tenant_id TEXT REFERENCES tenants(id),
    shopify_id TEXT UNIQUE,
    title TEXT,
    price NUMERIC(10,2),
    category TEXT,
    sku TEXT,
    size TEXT,
    variants JSONB,
    embedding vector(1536),
    -- ... plus handle, vendor, description, weight, tags, etc.
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

product_groups

Matched product pairs — the core output of the matching engine.

CREATE TABLE product_groups (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    tenant_id TEXT REFERENCES tenants(id),
    brand_product_id UUID REFERENCES brand_products(id) ON DELETE CASCADE,
    brand_product_name TEXT,
    brand_price NUMERIC(10,2),
    matched_competitor TEXT,
    matched_product_name TEXT,
    matched_price NUMERIC(10,2),
    matched_url TEXT,
    match_confidence INTEGER,        -- 0-100
    differences JSONB,
    match_reason TEXT,
    comparison_method TEXT DEFAULT 'price',
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

industry_profiles

Per-tenant matching configuration. One row per tenant.

CREATE TABLE industry_profiles (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id TEXT NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
    industry_slug TEXT NOT NULL,
    industry_name TEXT NOT NULL,
    categories JSONB NOT NULL DEFAULT '[]',
    search_terms JSONB NOT NULL DEFAULT '{}',
    size_equivalences JSONB NOT NULL DEFAULT '{}',
    hard_block_rules JSONB NOT NULL DEFAULT '[]',
    scoring_config JSONB NOT NULL DEFAULT '{...}',
    specialty_brands JSONB NOT NULL DEFAULT '{}',
    unit_types JSONB NOT NULL DEFAULT '[]',
    ai_extraction_categories TEXT[] NOT NULL DEFAULT '{}',
    ai_matching_rules TEXT[] NOT NULL DEFAULT '{}',
    use_new_matcher BOOLEAN NOT NULL DEFAULT false,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW(),
    UNIQUE(tenant_id)
);

Session/Log Pattern

Three pipelines follow the same two-table pattern:

Pipeline	Session Table	Log Table	FK
Scraping	`scrape_sessions`	`scrape_logs`	`session_id` → `scrape_sessions(id)` ON DELETE CASCADE
Syncing	`sync_sessions`	`sync_logs`	`session_id` → `sync_sessions(id)` ON DELETE CASCADE
Matching	`matching_sessions`	`matching_logs`	`session_id` → `matching_sessions(id)` ON DELETE CASCADE

Each session tracks: status (running/completed/failed), progress counters, error_message (stores JSON reports for validation sessions).

Row Level Security (RLS)

Core RLS Function

Resolves the authenticated user's active tenant, with JWT-based tenant switching and backward-compatible fallback:

CREATE OR REPLACE FUNCTION get_user_tenant_id() RETURNS TEXT AS $$
  SELECT COALESCE(
    (SELECT tenant_id FROM public.user_tenants
     WHERE user_id = auth.uid()
       AND tenant_id = (auth.jwt() -> 'app_metadata' ->> 'active_tenant_id')),
    (SELECT tenant_id FROM public.user_tenants
     WHERE user_id = auth.uid()
     ORDER BY created_at ASC LIMIT 1)
  );
$$ LANGUAGE sql SECURITY DEFINER STABLE SET search_path TO 'public';

Policy Pattern

Every data table has this policy:

ALTER TABLE <table> ENABLE ROW LEVEL SECURITY;

CREATE POLICY "tenant_isolation" ON <table>
FOR ALL USING (tenant_id = get_user_tenant_id());

This ensures:

SELECT only returns rows matching the user's tenant
INSERT/UPDATE/DELETE only affects rows matching the user's tenant
No cross-tenant data leakage is possible

RLS Verification (Session 48)

All 20 public tables confirmed to have real get_user_tenant_id() enforcement (verified via live DB query — no tautology policies).

Special Cases

user_tenants: Policy is user_id = auth.uid() (identity-scoped, not tenant-scoped)
tenants: SELECT if id IN (SELECT tenant_id FROM user_tenants WHERE user_id = auth.uid()); UPDATE restricted to owners
scrape_logs, sync_logs, matching_logs: No tenant_id column — use join-based policies: session_id IN (SELECT id FROM <parent>_sessions WHERE tenant_id = get_user_tenant_id())
super_admins, email_drip_log: RLS enabled, no policies (deny all to anon/authenticated — service_role only)
team_invitations: Separate SELECT/UPDATE/DELETE policies (no INSERT — created via service_role)

Auth Trigger

Auto-provisions new tenants on Supabase Auth signup:

CREATE OR REPLACE FUNCTION handle_new_user()
RETURNS TRIGGER AS $$
DECLARE
    _tenant_id TEXT;
BEGIN
    SET search_path TO 'public';
    _tenant_id := REPLACE(NEW.id::text, '-', '');
    _tenant_id := SUBSTRING(_tenant_id FROM 1 FOR 20);
    
    INSERT INTO tenants (id, brand_name) VALUES (_tenant_id, 'My Brand');
    INSERT INTO user_tenants (user_id, tenant_id, role) VALUES (NEW.id, _tenant_id, 'owner');
    
    RETURN NEW;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

CREATE TRIGGER on_auth_user_created
AFTER INSERT ON auth.users
FOR EACH ROW EXECUTE FUNCTION handle_new_user();

Note: SET search_path TO 'public' is required — without it, the trigger fails to find the tenants table.

team_invitations

Multi-user invite management. RLS-enabled.

Column	Type	Notes
id	UUID PK	`gen_random_uuid()`
tenant_id	TEXT FK	References `tenants(id)` ON DELETE CASCADE
email	TEXT	Invitee email (lowercased)
role	TEXT	`admin`, `member`, or `viewer` (CHECK constraint)
invited_by	UUID	`auth.uid()` of the inviter
token	TEXT UNIQUE	UUID4 acceptance token
status	TEXT	`pending`, `accepted`, `revoked`, `expired` (CHECK)
expires_at	TIMESTAMPTZ	7 days from creation
accepted_at	TIMESTAMPTZ	Set when accepted
created_at	TIMESTAMPTZ	`now()`

Indexes:

Partial unique: (tenant_id, lower(email)) WHERE status = 'pending' — one pending invite per email per tenant
Partial: (lower(email)) WHERE status = 'pending' — signup invite lookup

RLS policies: SELECT/INSERT/UPDATE/DELETE scoped by get_user_tenant_id()

Indexes

Standard Indexes

Every table with tenant_id has: idx_<table>_tenant ON <table>(tenant_id)

Special Indexes

product_tracking(competitor_name) — fast competitor filtering
product_tracking(created_at DESC) — time-series queries
product_tracking(base_name) — variant grouping
brand_products(shopify_id) — Shopify ID lookup (UNIQUE)
product_groups(brand_product_id) — join optimization
alerts_history(created_at DESC) — recent alerts
alerts_history partial index on read_at IS NULL — unread notification queries

pgvector Indexes (HNSW)

CREATE INDEX idx_product_tracking_embedding ON product_tracking
USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);

CREATE INDEX idx_brand_products_embedding ON brand_products
USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);

HNSW parameters: m=16 (connections per node), ef_construction=64 (build-time search breadth).

RPC Functions

match_products_by_embedding

CREATE OR REPLACE FUNCTION match_products_by_embedding(
    query_embedding vector(1536),
    match_threshold float DEFAULT 0.3,
    match_count int DEFAULT 20,
    p_tenant_id text DEFAULT NULL
)
RETURNS TABLE (
    id uuid,
    product_name text,
    competitor_name text,
    current_price numeric,
    similarity float
)

Returns the top-N most similar competitor products for a given embedding vector, filtered by tenant and similarity threshold.

Data Lifecycle

When deleting tenant data (DELETE /api/tenant/data), tables are deleted in FK order:

Log tables (via parent session IDs)
product_groups (FK to brand_products)
Session tables
brand_products, product_tracking, competitors
alerts_config, alerts_history, industry_profiles
Credential columns on tenants nulled (row preserved)

Schema Changes (Session 33)

matching_logs: Added tenant_id TEXT REFERENCES tenants(id) column (was missing, causing silent insert failures in AI matcher)
product_tracking: Confirmed no image_url column exists — AI matcher was incorrectly selecting it
product_groups: Confirmed no status or price_comparison columns — only differences and comparison_method

PPU Columns (Session 34)

New columns added for price-per-unit normalization:

brand_products:

pack_quantity INTEGER — Base variant pack size (e.g., 100 for "Pack of 100")
ppu NUMERIC(10,4) — Price per unit for base variant

product_tracking:

pack_quantity INTEGER — Extracted from product name (e.g., 1000 from "1000 Count")
(already had ppu NUMERIC(10,4))

product_groups:

brand_pack_quantity INTEGER — Brand's pack size used in comparison
matched_pack_quantity INTEGER — Competitor's pack size
brand_ppu NUMERIC(10,4) — Brand PPU
matched_ppu NUMERIC(10,4) — Competitor PPU

super_admins (Session 37)

Platform-wide admin access table. No RLS policies — only accessible via service_role key.

Column	Type	Notes
`user_id`	UUID PK	References auth.users(id), ON DELETE CASCADE
`created_at`	TIMESTAMPTZ	DEFAULT NOW()

RLS: Enabled, no policies. Only service_role (backend admin endpoints) can read/write.

Access pattern: Backend require_super_admin() dependency queries this table via service_role client.

email_drip_log (Session 47)

Added: 2026-03-20

Tracks onboarding drip emails sent to users.

Column	Type	Notes
id	UUID PK	gen_random_uuid()
user_id	UUID	References auth.users(id)
tenant_id	TEXT	References tenants(id)
email_type	TEXT	'welcome', 'add_competitor', 'landscape', 'upgrade_pro'
sent_at	TIMESTAMPTZ	DEFAULT now()

Constraints: UNIQUE(user_id, email_type) — prevents duplicate sends. RLS: Enabled, no policies (service_role access only via Edge Function).

Supabase Extensions (Session 47)

Two new extensions enabled:

pg_cron (schema: pg_catalog) — PostgreSQL job scheduler. Used for daily drip email check.
pg_net (schema: extensions) — Async HTTP from SQL. Used by pg_cron to call Edge Functions.

On this page