Data model
The Currnt MySQL database holds 314 tables and approximately 140 GB on disk. About 95% of the volume lives in eight table clusters; the other ~300 tables average a few hundred MB each. This page tours the eight clusters that matter and points out the table-level oddities worth knowing.
The engine is MySQL 5.5 (years past EOL — see /security). The character set is utf8mb3 for legacy reasons; full-UTF-8 (utf8mb4) is on the someday list but applies only to a handful of newer tables.
Cluster 1 · prospects and friends
The CRM’s main fact table. About 15 million rows and ~20 GB.
prospects— one row per identified contact. Columns include identity (name, email, LinkedIn URL when known), enrichment (employer, title, seniority guess, location), engagement (last contacted, last opened, last replied), and a long tail of source/import metadata.prospect_companies— denormalized employer table; updated by enrichment runs.prospect_segments— many-to-many mapping of prospects to internal segments.prospect_dedup_candidates— rows generated by the dedup-run handler; staff resolves through Treehouse.campaign_membership— many-to-many prospects ↔ outreach campaigns.
Cluster 2 · users, roles, sessions
The unified user identity model.
users— every account: visitor, member, expert, client, staff. Role distinction is by flag andmember_type.user_alt_emails— secondary emails accumulated by the account-merge flow.roles,user_roles— for the very small set of staff permission tiers.- Sessions live in Redis, not in MySQL — but a
session_audittable records significant lifecycle events (login, suspend, force-logout).
Cluster 3 · boards, posts, transcripts
The product data model.
boards— one row per engagement (board or sprint).board_members— many-to-many users ↔ boards, with role on the board (expert, member, observer, staff).posts— every authored message in any board. Rich-text body stored as HTML.transcripts— full conversation exports, regenerated on demand.summaries— analyst-written deliverables, one or more per board.
Cluster 4 · LLM trace table
A single table accumulating about 12 GB of LLM call/response logs from the AI co-facilitator process. Each row records prompt, response, model, latency, and the board/post that triggered the call.
This table is unusually large for its row count because individual rows can be very long (a full prompt plus a multi-paragraph response). It is also write-only — nothing reads from it at runtime — and is the cleanest deletion target in the database if disk reclamation ever becomes urgent.
Cluster 5 · dispenserd_jobs and worker traces
The queue’s persistence layer.
dispenserd_jobs— one row per enqueued job: type, lane, payload, status, attempts, error, timestamps.dispenserd_worker_heartbeats— periodic check-ins from each worker process.dispenserd_dead_jobs— overflow fromdispenserd_jobsonce a job is marked dead (this is a view, not a separate table).
Cluster 6 · campaigns, email_sends, tracking
Outreach’s history.
campaigns— one per outreach blast; carries template id, segment id, scheduling.email_sends— one row per per-recipient send; the largest table after prospects, ~80 M rows.email_events— opens, clicks, bounces, complaints.smtp_accounts— the rotation pool, with per-account health stats.
Cluster 7 · invoices, payouts, ledger
Operations data.
invoices— one row per client invoice.invoice_line_items— line breakdowns.payouts— expert compensation rows.ledger_entries— append-only ledger of every money-affecting event.- (Dormant)
gnosis_safe_txns— historical multi-sig transfers; nothing recent.
Cluster 8 · blog_posts, assets, content
The content-production data store.
blog_posts— about a thousand rows; published, draft, scheduled.blog_post_revisions— versioned history.assets— uploaded files (images, PDFs).marketing_copy,marketing_copy_revisions— landing-page text and email-campaign templates.
The long tail
The remaining ~280 tables fall into three buckets:
- Lookup tables — countries, industries, seniority levels, board templates.
- Audit and event logs — most under 1 GB each; collectively ~10 GB.
- Dormant feature tables — schema still intact from features that no longer run (Calendly bookings, Lusha enrichments, NDA acceptances, MetaMask signatures). Several have zero rows since 2022 or 2023.
A condensed inventory of the dormant subset is on the sunset page.
Schema management
No migration tool was ever adopted (no Liquibase, no Flyway, no Alembic-equivalent). Schema changes were applied manually as ALTER TABLE statements run by staff during low-traffic windows. The codebase has a migrations/ directory but it stopped getting updated around 2019.
The implication: there is no canonical schema-as-code. The database is the schema. Reading the codebase’s model files (models/*.js in the Sails app) gets you close but not exact — fields have drifted, and several “legacy” columns that the models don’t reference are still present in the actual tables.