Database Schema & RLS
Tables, RLS policies, pgvector indexes, auth triggers, data lifecycle
VantageDash uses Supabase (PostgreSQL) with Row Level Security (RLS) for multi-tenant data isolation. All tables reference tenants(id) via tenant_id foreign keys.
Extensions
- uuid-ossp: UUID generation (
uuid_generate_v4()) - pgvector: Vector similarity search for product embeddings
Tables Overview
| # | Table | Purpose | Key Columns |
|---|---|---|---|
| 0 | tenants | Tenant (brand) metadata | id (slug), brand_name, brand_url, industry, encrypted_shopify_url/token |
| 1 | user_tenants | Maps auth users → tenants | user_id (auth.uid), tenant_id, role |
| 2 | competitors | Competitor monitoring targets | name, url, brand_color |
| 3 | product_tracking | Scraped competitor products + prices | competitor_name, product_name, current_price, embedding vector(1536) |
| 4 | brand_products | Brand's own products (from Shopify) | shopify_id, title, price, category, sku, embedding vector(1536) |
| 5 | product_groups | Matched product pairs | brand_product_id → brand_products, matched_competitor, match_confidence |
| 6 | scrape_sessions | Scraping operation tracking | status, total_targets, progress_percent, error_message |
| 7 | scrape_logs | Per-competitor scrape details | session_id → scrape_sessions, competitor_name, status |
| 8 | sync_sessions | Shopify sync tracking | status, total_products, progress_percent, error_message |
| 9 | sync_logs | Per-product sync details | session_id → sync_sessions, product_title, status |
| 10 | matching_sessions | Matching operation tracking | status, total_products, matched_products, error_message |
| 11 | matching_logs | Per-product match details | tenant_id → tenants, session_id → matching_sessions, brand_product_name, match_confidence |
| 12 | alerts_config | Alert rule configuration | email_enabled, price_drop_percent, price_increase_percent |
| 13 | alerts_history | Historical alert records | competitor_name, product_name, price_change_percent, read_at |
| 14 | industry_profiles | Per-tenant matching config (JSONB) | industry_slug, categories, scoring_config, use_new_matcher |
Detailed Table Schemas
tenants
Primary identity table. id is a short slug (e.g., "mybrand").
CREATE TABLE tenants (
id TEXT PRIMARY KEY,
brand_name TEXT NOT NULL,
brand_url TEXT,
shopify_store TEXT,
brand_color TEXT DEFAULT '#0070f3',
logo_url TEXT,
timezone TEXT DEFAULT 'America/New_York',
industry TEXT DEFAULT 'ecommerce',
product_description TEXT DEFAULT 'products',
encrypted_shopify_url TEXT, -- Fernet-encrypted
encrypted_shopify_token TEXT, -- Fernet-encrypted
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);product_tracking
Central table for competitor pricing data. Has pgvector embedding column.
CREATE TABLE product_tracking (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
tenant_id TEXT REFERENCES tenants(id),
competitor_name TEXT NOT NULL,
product_name TEXT NOT NULL,
current_price NUMERIC(10,2),
url TEXT,
finish_type TEXT DEFAULT 'standard',
base_name TEXT,
ppu NUMERIC,
embedding vector(1536), -- pgvector for similarity search
created_at TIMESTAMPTZ DEFAULT NOW()
);brand_products
Brand's own products synced from Shopify. Also has pgvector embedding.
CREATE TABLE brand_products (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
tenant_id TEXT REFERENCES tenants(id),
shopify_id TEXT UNIQUE,
title TEXT,
price NUMERIC(10,2),
category TEXT,
sku TEXT,
size TEXT,
variants JSONB,
embedding vector(1536),
-- ... plus handle, vendor, description, weight, tags, etc.
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);product_groups
Matched product pairs — the core output of the matching engine.
CREATE TABLE product_groups (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
tenant_id TEXT REFERENCES tenants(id),
brand_product_id UUID REFERENCES brand_products(id) ON DELETE CASCADE,
brand_product_name TEXT,
brand_price NUMERIC(10,2),
matched_competitor TEXT,
matched_product_name TEXT,
matched_price NUMERIC(10,2),
matched_url TEXT,
match_confidence INTEGER, -- 0-100
differences JSONB,
match_reason TEXT,
comparison_method TEXT DEFAULT 'price',
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);industry_profiles
Per-tenant matching configuration. One row per tenant.
CREATE TABLE industry_profiles (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id TEXT NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
industry_slug TEXT NOT NULL,
industry_name TEXT NOT NULL,
categories JSONB NOT NULL DEFAULT '[]',
search_terms JSONB NOT NULL DEFAULT '{}',
size_equivalences JSONB NOT NULL DEFAULT '{}',
hard_block_rules JSONB NOT NULL DEFAULT '[]',
scoring_config JSONB NOT NULL DEFAULT '{...}',
specialty_brands JSONB NOT NULL DEFAULT '{}',
unit_types JSONB NOT NULL DEFAULT '[]',
ai_extraction_categories TEXT[] NOT NULL DEFAULT '{}',
ai_matching_rules TEXT[] NOT NULL DEFAULT '{}',
use_new_matcher BOOLEAN NOT NULL DEFAULT false,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW(),
UNIQUE(tenant_id)
);Session/Log Pattern
Three pipelines follow the same two-table pattern:
| Pipeline | Session Table | Log Table | FK |
|---|---|---|---|
| Scraping | scrape_sessions | scrape_logs | session_id → scrape_sessions(id) ON DELETE CASCADE |
| Syncing | sync_sessions | sync_logs | session_id → sync_sessions(id) ON DELETE CASCADE |
| Matching | matching_sessions | matching_logs | session_id → matching_sessions(id) ON DELETE CASCADE |
Each session tracks: status (running/completed/failed), progress counters, error_message (stores JSON reports for validation sessions).
Row Level Security (RLS)
Core RLS Function
Resolves the authenticated user's active tenant, with JWT-based tenant switching and backward-compatible fallback:
CREATE OR REPLACE FUNCTION get_user_tenant_id() RETURNS TEXT AS $$
SELECT COALESCE(
(SELECT tenant_id FROM public.user_tenants
WHERE user_id = auth.uid()
AND tenant_id = (auth.jwt() -> 'app_metadata' ->> 'active_tenant_id')),
(SELECT tenant_id FROM public.user_tenants
WHERE user_id = auth.uid()
ORDER BY created_at ASC LIMIT 1)
);
$$ LANGUAGE sql SECURITY DEFINER STABLE SET search_path TO 'public';Policy Pattern
Every data table has this policy:
ALTER TABLE <table> ENABLE ROW LEVEL SECURITY;
CREATE POLICY "tenant_isolation" ON <table>
FOR ALL USING (tenant_id = get_user_tenant_id());This ensures:
- SELECT only returns rows matching the user's tenant
- INSERT/UPDATE/DELETE only affects rows matching the user's tenant
- No cross-tenant data leakage is possible
RLS Verification (Session 48)
All 20 public tables confirmed to have real get_user_tenant_id() enforcement (verified via live DB query — no tautology policies).
Special Cases
user_tenants: Policy isuser_id = auth.uid()(identity-scoped, not tenant-scoped)tenants: SELECT ifid IN (SELECT tenant_id FROM user_tenants WHERE user_id = auth.uid()); UPDATE restricted to ownersscrape_logs,sync_logs,matching_logs: Notenant_idcolumn — use join-based policies:session_id IN (SELECT id FROM <parent>_sessions WHERE tenant_id = get_user_tenant_id())super_admins,email_drip_log: RLS enabled, no policies (deny all to anon/authenticated — service_role only)team_invitations: Separate SELECT/UPDATE/DELETE policies (no INSERT — created via service_role)
Auth Trigger
Auto-provisions new tenants on Supabase Auth signup:
CREATE OR REPLACE FUNCTION handle_new_user()
RETURNS TRIGGER AS $$
DECLARE
_tenant_id TEXT;
BEGIN
SET search_path TO 'public';
_tenant_id := REPLACE(NEW.id::text, '-', '');
_tenant_id := SUBSTRING(_tenant_id FROM 1 FOR 20);
INSERT INTO tenants (id, brand_name) VALUES (_tenant_id, 'My Brand');
INSERT INTO user_tenants (user_id, tenant_id, role) VALUES (NEW.id, _tenant_id, 'owner');
RETURN NEW;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
CREATE TRIGGER on_auth_user_created
AFTER INSERT ON auth.users
FOR EACH ROW EXECUTE FUNCTION handle_new_user();Note: SET search_path TO 'public' is required — without it, the trigger fails to find the tenants table.
team_invitations
Multi-user invite management. RLS-enabled.
| Column | Type | Notes |
|---|---|---|
| id | UUID PK | gen_random_uuid() |
| tenant_id | TEXT FK | References tenants(id) ON DELETE CASCADE |
| TEXT | Invitee email (lowercased) | |
| role | TEXT | admin, member, or viewer (CHECK constraint) |
| invited_by | UUID | auth.uid() of the inviter |
| token | TEXT UNIQUE | UUID4 acceptance token |
| status | TEXT | pending, accepted, revoked, expired (CHECK) |
| expires_at | TIMESTAMPTZ | 7 days from creation |
| accepted_at | TIMESTAMPTZ | Set when accepted |
| created_at | TIMESTAMPTZ | now() |
Indexes:
- Partial unique:
(tenant_id, lower(email)) WHERE status = 'pending'— one pending invite per email per tenant - Partial:
(lower(email)) WHERE status = 'pending'— signup invite lookup
RLS policies: SELECT/INSERT/UPDATE/DELETE scoped by get_user_tenant_id()
Indexes
Standard Indexes
Every table with tenant_id has: idx_<table>_tenant ON <table>(tenant_id)
Special Indexes
product_tracking(competitor_name)— fast competitor filteringproduct_tracking(created_at DESC)— time-series queriesproduct_tracking(base_name)— variant groupingbrand_products(shopify_id)— Shopify ID lookup (UNIQUE)product_groups(brand_product_id)— join optimizationalerts_history(created_at DESC)— recent alertsalerts_historypartial index onread_at IS NULL— unread notification queries
pgvector Indexes (HNSW)
CREATE INDEX idx_product_tracking_embedding ON product_tracking
USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);
CREATE INDEX idx_brand_products_embedding ON brand_products
USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);HNSW parameters: m=16 (connections per node), ef_construction=64 (build-time search breadth).
RPC Functions
match_products_by_embedding
CREATE OR REPLACE FUNCTION match_products_by_embedding(
query_embedding vector(1536),
match_threshold float DEFAULT 0.3,
match_count int DEFAULT 20,
p_tenant_id text DEFAULT NULL
)
RETURNS TABLE (
id uuid,
product_name text,
competitor_name text,
current_price numeric,
similarity float
)Returns the top-N most similar competitor products for a given embedding vector, filtered by tenant and similarity threshold.
Data Lifecycle
When deleting tenant data (DELETE /api/tenant/data), tables are deleted in FK order:
- Log tables (via parent session IDs)
product_groups(FK tobrand_products)- Session tables
brand_products,product_tracking,competitorsalerts_config,alerts_history,industry_profiles- Credential columns on
tenantsnulled (row preserved)
Schema Changes (Session 33)
- matching_logs: Added
tenant_id TEXT REFERENCES tenants(id)column (was missing, causing silent insert failures in AI matcher) - product_tracking: Confirmed no
image_urlcolumn exists — AI matcher was incorrectly selecting it - product_groups: Confirmed no
statusorprice_comparisoncolumns — onlydifferencesandcomparison_method
PPU Columns (Session 34)
New columns added for price-per-unit normalization:
brand_products:
pack_quantity INTEGER— Base variant pack size (e.g., 100 for "Pack of 100")ppu NUMERIC(10,4)— Price per unit for base variant
product_tracking:
pack_quantity INTEGER— Extracted from product name (e.g., 1000 from "1000 Count")- (already had
ppu NUMERIC(10,4))
product_groups:
brand_pack_quantity INTEGER— Brand's pack size used in comparisonmatched_pack_quantity INTEGER— Competitor's pack sizebrand_ppu NUMERIC(10,4)— Brand PPUmatched_ppu NUMERIC(10,4)— Competitor PPU
super_admins (Session 37)
Platform-wide admin access table. No RLS policies — only accessible via service_role key.
| Column | Type | Notes |
|---|---|---|
user_id | UUID PK | References auth.users(id), ON DELETE CASCADE |
created_at | TIMESTAMPTZ | DEFAULT NOW() |
RLS: Enabled, no policies. Only service_role (backend admin endpoints) can read/write.
Access pattern: Backend require_super_admin() dependency queries this table via service_role client.
email_drip_log (Session 47)
Added: 2026-03-20
Tracks onboarding drip emails sent to users.
| Column | Type | Notes |
|---|---|---|
| id | UUID PK | gen_random_uuid() |
| user_id | UUID | References auth.users(id) |
| tenant_id | TEXT | References tenants(id) |
| email_type | TEXT | 'welcome', 'add_competitor', 'landscape', 'upgrade_pro' |
| sent_at | TIMESTAMPTZ | DEFAULT now() |
Constraints: UNIQUE(user_id, email_type) — prevents duplicate sends. RLS: Enabled, no policies (service_role access only via Edge Function).
Supabase Extensions (Session 47)
Two new extensions enabled:
pg_cron(schema: pg_catalog) — PostgreSQL job scheduler. Used for daily drip email check.pg_net(schema: extensions) — Async HTTP from SQL. Used by pg_cron to call Edge Functions.