VantageDash

Database Schema & RLS

Tables, RLS policies, pgvector indexes, auth triggers, data lifecycle

VantageDash uses Supabase (PostgreSQL) with Row Level Security (RLS) for multi-tenant data isolation. All tables reference tenants(id) via tenant_id foreign keys.

Extensions

  • uuid-ossp: UUID generation (uuid_generate_v4())
  • pgvector: Vector similarity search for product embeddings

Tables Overview

#TablePurposeKey Columns
0tenantsTenant (brand) metadataid (slug), brand_name, brand_url, industry, encrypted_shopify_url/token
1user_tenantsMaps auth users → tenantsuser_id (auth.uid), tenant_id, role
2competitorsCompetitor monitoring targetsname, url, brand_color
3product_trackingScraped competitor products + pricescompetitor_name, product_name, current_price, embedding vector(1536)
4brand_productsBrand's own products (from Shopify)shopify_id, title, price, category, sku, embedding vector(1536)
5product_groupsMatched product pairsbrand_product_idbrand_products, matched_competitor, match_confidence
6scrape_sessionsScraping operation trackingstatus, total_targets, progress_percent, error_message
7scrape_logsPer-competitor scrape detailssession_idscrape_sessions, competitor_name, status
8sync_sessionsShopify sync trackingstatus, total_products, progress_percent, error_message
9sync_logsPer-product sync detailssession_idsync_sessions, product_title, status
10matching_sessionsMatching operation trackingstatus, total_products, matched_products, error_message
11matching_logsPer-product match detailstenant_idtenants, session_idmatching_sessions, brand_product_name, match_confidence
12alerts_configAlert rule configurationemail_enabled, price_drop_percent, price_increase_percent
13alerts_historyHistorical alert recordscompetitor_name, product_name, price_change_percent, read_at
14industry_profilesPer-tenant matching config (JSONB)industry_slug, categories, scoring_config, use_new_matcher

Detailed Table Schemas

tenants

Primary identity table. id is a short slug (e.g., "mybrand").

CREATE TABLE tenants (
    id TEXT PRIMARY KEY,
    brand_name TEXT NOT NULL,
    brand_url TEXT,
    shopify_store TEXT,
    brand_color TEXT DEFAULT '#0070f3',
    logo_url TEXT,
    timezone TEXT DEFAULT 'America/New_York',
    industry TEXT DEFAULT 'ecommerce',
    product_description TEXT DEFAULT 'products',
    encrypted_shopify_url TEXT,      -- Fernet-encrypted
    encrypted_shopify_token TEXT,    -- Fernet-encrypted
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

product_tracking

Central table for competitor pricing data. Has pgvector embedding column.

CREATE TABLE product_tracking (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    tenant_id TEXT REFERENCES tenants(id),
    competitor_name TEXT NOT NULL,
    product_name TEXT NOT NULL,
    current_price NUMERIC(10,2),
    url TEXT,
    finish_type TEXT DEFAULT 'standard',
    base_name TEXT,
    ppu NUMERIC,
    embedding vector(1536),          -- pgvector for similarity search
    created_at TIMESTAMPTZ DEFAULT NOW()
);

brand_products

Brand's own products synced from Shopify. Also has pgvector embedding.

CREATE TABLE brand_products (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    tenant_id TEXT REFERENCES tenants(id),
    shopify_id TEXT UNIQUE,
    title TEXT,
    price NUMERIC(10,2),
    category TEXT,
    sku TEXT,
    size TEXT,
    variants JSONB,
    embedding vector(1536),
    -- ... plus handle, vendor, description, weight, tags, etc.
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

product_groups

Matched product pairs — the core output of the matching engine.

CREATE TABLE product_groups (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    tenant_id TEXT REFERENCES tenants(id),
    brand_product_id UUID REFERENCES brand_products(id) ON DELETE CASCADE,
    brand_product_name TEXT,
    brand_price NUMERIC(10,2),
    matched_competitor TEXT,
    matched_product_name TEXT,
    matched_price NUMERIC(10,2),
    matched_url TEXT,
    match_confidence INTEGER,        -- 0-100
    differences JSONB,
    match_reason TEXT,
    comparison_method TEXT DEFAULT 'price',
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

industry_profiles

Per-tenant matching configuration. One row per tenant.

CREATE TABLE industry_profiles (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id TEXT NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
    industry_slug TEXT NOT NULL,
    industry_name TEXT NOT NULL,
    categories JSONB NOT NULL DEFAULT '[]',
    search_terms JSONB NOT NULL DEFAULT '{}',
    size_equivalences JSONB NOT NULL DEFAULT '{}',
    hard_block_rules JSONB NOT NULL DEFAULT '[]',
    scoring_config JSONB NOT NULL DEFAULT '{...}',
    specialty_brands JSONB NOT NULL DEFAULT '{}',
    unit_types JSONB NOT NULL DEFAULT '[]',
    ai_extraction_categories TEXT[] NOT NULL DEFAULT '{}',
    ai_matching_rules TEXT[] NOT NULL DEFAULT '{}',
    use_new_matcher BOOLEAN NOT NULL DEFAULT false,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW(),
    UNIQUE(tenant_id)
);

Session/Log Pattern

Three pipelines follow the same two-table pattern:

PipelineSession TableLog TableFK
Scrapingscrape_sessionsscrape_logssession_idscrape_sessions(id) ON DELETE CASCADE
Syncingsync_sessionssync_logssession_idsync_sessions(id) ON DELETE CASCADE
Matchingmatching_sessionsmatching_logssession_idmatching_sessions(id) ON DELETE CASCADE

Each session tracks: status (running/completed/failed), progress counters, error_message (stores JSON reports for validation sessions).

Row Level Security (RLS)

Core RLS Function

Resolves the authenticated user's active tenant, with JWT-based tenant switching and backward-compatible fallback:

CREATE OR REPLACE FUNCTION get_user_tenant_id() RETURNS TEXT AS $$
  SELECT COALESCE(
    (SELECT tenant_id FROM public.user_tenants
     WHERE user_id = auth.uid()
       AND tenant_id = (auth.jwt() -> 'app_metadata' ->> 'active_tenant_id')),
    (SELECT tenant_id FROM public.user_tenants
     WHERE user_id = auth.uid()
     ORDER BY created_at ASC LIMIT 1)
  );
$$ LANGUAGE sql SECURITY DEFINER STABLE SET search_path TO 'public';

Policy Pattern

Every data table has this policy:

ALTER TABLE <table> ENABLE ROW LEVEL SECURITY;

CREATE POLICY "tenant_isolation" ON <table>
FOR ALL USING (tenant_id = get_user_tenant_id());

This ensures:

  • SELECT only returns rows matching the user's tenant
  • INSERT/UPDATE/DELETE only affects rows matching the user's tenant
  • No cross-tenant data leakage is possible

RLS Verification (Session 48)

All 20 public tables confirmed to have real get_user_tenant_id() enforcement (verified via live DB query — no tautology policies).

Special Cases

  • user_tenants: Policy is user_id = auth.uid() (identity-scoped, not tenant-scoped)
  • tenants: SELECT if id IN (SELECT tenant_id FROM user_tenants WHERE user_id = auth.uid()); UPDATE restricted to owners
  • scrape_logs, sync_logs, matching_logs: No tenant_id column — use join-based policies: session_id IN (SELECT id FROM <parent>_sessions WHERE tenant_id = get_user_tenant_id())
  • super_admins, email_drip_log: RLS enabled, no policies (deny all to anon/authenticated — service_role only)
  • team_invitations: Separate SELECT/UPDATE/DELETE policies (no INSERT — created via service_role)

Auth Trigger

Auto-provisions new tenants on Supabase Auth signup:

CREATE OR REPLACE FUNCTION handle_new_user()
RETURNS TRIGGER AS $$
DECLARE
    _tenant_id TEXT;
BEGIN
    SET search_path TO 'public';
    _tenant_id := REPLACE(NEW.id::text, '-', '');
    _tenant_id := SUBSTRING(_tenant_id FROM 1 FOR 20);
    
    INSERT INTO tenants (id, brand_name) VALUES (_tenant_id, 'My Brand');
    INSERT INTO user_tenants (user_id, tenant_id, role) VALUES (NEW.id, _tenant_id, 'owner');
    
    RETURN NEW;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

CREATE TRIGGER on_auth_user_created
AFTER INSERT ON auth.users
FOR EACH ROW EXECUTE FUNCTION handle_new_user();

Note: SET search_path TO 'public' is required — without it, the trigger fails to find the tenants table.

team_invitations

Multi-user invite management. RLS-enabled.

ColumnTypeNotes
idUUID PKgen_random_uuid()
tenant_idTEXT FKReferences tenants(id) ON DELETE CASCADE
emailTEXTInvitee email (lowercased)
roleTEXTadmin, member, or viewer (CHECK constraint)
invited_byUUIDauth.uid() of the inviter
tokenTEXT UNIQUEUUID4 acceptance token
statusTEXTpending, accepted, revoked, expired (CHECK)
expires_atTIMESTAMPTZ7 days from creation
accepted_atTIMESTAMPTZSet when accepted
created_atTIMESTAMPTZnow()

Indexes:

  • Partial unique: (tenant_id, lower(email)) WHERE status = 'pending' — one pending invite per email per tenant
  • Partial: (lower(email)) WHERE status = 'pending' — signup invite lookup

RLS policies: SELECT/INSERT/UPDATE/DELETE scoped by get_user_tenant_id()

Indexes

Standard Indexes

Every table with tenant_id has: idx_<table>_tenant ON <table>(tenant_id)

Special Indexes

  • product_tracking(competitor_name) — fast competitor filtering
  • product_tracking(created_at DESC) — time-series queries
  • product_tracking(base_name) — variant grouping
  • brand_products(shopify_id) — Shopify ID lookup (UNIQUE)
  • product_groups(brand_product_id) — join optimization
  • alerts_history(created_at DESC) — recent alerts
  • alerts_history partial index on read_at IS NULL — unread notification queries

pgvector Indexes (HNSW)

CREATE INDEX idx_product_tracking_embedding ON product_tracking
USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);

CREATE INDEX idx_brand_products_embedding ON brand_products
USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);

HNSW parameters: m=16 (connections per node), ef_construction=64 (build-time search breadth).

RPC Functions

match_products_by_embedding

CREATE OR REPLACE FUNCTION match_products_by_embedding(
    query_embedding vector(1536),
    match_threshold float DEFAULT 0.3,
    match_count int DEFAULT 20,
    p_tenant_id text DEFAULT NULL
)
RETURNS TABLE (
    id uuid,
    product_name text,
    competitor_name text,
    current_price numeric,
    similarity float
)

Returns the top-N most similar competitor products for a given embedding vector, filtered by tenant and similarity threshold.

Data Lifecycle

When deleting tenant data (DELETE /api/tenant/data), tables are deleted in FK order:

  1. Log tables (via parent session IDs)
  2. product_groups (FK to brand_products)
  3. Session tables
  4. brand_products, product_tracking, competitors
  5. alerts_config, alerts_history, industry_profiles
  6. Credential columns on tenants nulled (row preserved)

Schema Changes (Session 33)

  • matching_logs: Added tenant_id TEXT REFERENCES tenants(id) column (was missing, causing silent insert failures in AI matcher)
  • product_tracking: Confirmed no image_url column exists — AI matcher was incorrectly selecting it
  • product_groups: Confirmed no status or price_comparison columns — only differences and comparison_method

PPU Columns (Session 34)

New columns added for price-per-unit normalization:

brand_products:

  • pack_quantity INTEGER — Base variant pack size (e.g., 100 for "Pack of 100")
  • ppu NUMERIC(10,4) — Price per unit for base variant

product_tracking:

  • pack_quantity INTEGER — Extracted from product name (e.g., 1000 from "1000 Count")
  • (already had ppu NUMERIC(10,4))

product_groups:

  • brand_pack_quantity INTEGER — Brand's pack size used in comparison
  • matched_pack_quantity INTEGER — Competitor's pack size
  • brand_ppu NUMERIC(10,4) — Brand PPU
  • matched_ppu NUMERIC(10,4) — Competitor PPU

super_admins (Session 37)

Platform-wide admin access table. No RLS policies — only accessible via service_role key.

ColumnTypeNotes
user_idUUID PKReferences auth.users(id), ON DELETE CASCADE
created_atTIMESTAMPTZDEFAULT NOW()

RLS: Enabled, no policies. Only service_role (backend admin endpoints) can read/write.

Access pattern: Backend require_super_admin() dependency queries this table via service_role client.

email_drip_log (Session 47)

Added: 2026-03-20

Tracks onboarding drip emails sent to users.

ColumnTypeNotes
idUUID PKgen_random_uuid()
user_idUUIDReferences auth.users(id)
tenant_idTEXTReferences tenants(id)
email_typeTEXT'welcome', 'add_competitor', 'landscape', 'upgrade_pro'
sent_atTIMESTAMPTZDEFAULT now()

Constraints: UNIQUE(user_id, email_type) — prevents duplicate sends. RLS: Enabled, no policies (service_role access only via Edge Function).

Supabase Extensions (Session 47)

Two new extensions enabled:

  • pg_cron (schema: pg_catalog) — PostgreSQL job scheduler. Used for daily drip email check.
  • pg_net (schema: extensions) — Async HTTP from SQL. Used by pg_cron to call Edge Functions.