# Maverick Platform — Complete Documentation

> This file contains the full platform documentation for LLM consumption.
> Auto-generated technical docs + hand-maintained ops playbooks.

# PART 1: OPERATIONS PLAYBOOKS

---

## What This Does

The Maverick Dashboard is the central hub for managing insurance marketing campaigns. It handles the full contact pipeline: scraping leads off Xpressdocs, filtering and verifying them, and uploading batches to Email Bison for campaign delivery.

You access it at [dashboard.maverick-ins.com](https://dashboard.maverick-ins.com) (production) or [beta.maverick-ins.com](https://beta.maverick-ins.com) (staging).

## How To Log In

1. Go to `https://dashboard.maverick-ins.com`
2. Enter your email and password (Supabase auth)
3. You'll land on the **Overview** page showing pipeline stats for the current pull month

If you get a blank screen or redirect loop, try clearing your browser cookies for `maverick-ins.com` and logging in again.

## Navigating the Dashboard

The left sidebar contains all sections:

| Section | Purpose |
|---------|---------|
| **Overview** | Pipeline stats, email volume, monthly targets |
| **Scraping** | Xpressdocs jobs, zip assignments, pipeline runs |
| **Contacts** | Filtered and verified contacts, batch management |
| **Campaigns** | Email Bison campaigns, lead tracking |
| **Leads** | Interested leads, conversation threads, AI audit |
| **Analytics** | Revenue, ROI, campaign performance dashboards |
| **Agent** | AI assistant for querying data and running ops tasks |
| **Settings** | Workspace config, Bison tokens, user management |

## Workspace Selector

The workspace dropdown at the top-left switches between clients. Each workspace maps to one insurance marketing client.

- **Display names** (e.g., "Gregg Blanchard") are shown in the UI
- **Internal names** (e.g., `gregg_blanchard`) are used in the backend
- Switching workspaces reloads all data for that client

When `All Workspaces` is selected, aggregate data is shown across all clients.

## Month Selector

Many pages have a month dropdown. Its meaning depends on the page:

- **Pipeline month** (Overview, Scraping): The *renewal month*. Contacts are pulled 2 months before it and emailed 1 month before it, so if it's March 2026, the pipeline month is `2026-05` (May renewals).
- **Calendar month** (Analytics, Revenue): The actual current month for activity data.

Select `All Months` to see unfiltered data.

## Theme Settings

Click the palette icon in the sidebar footer to change themes:
- Families: Midnight, Ocean, Warm, Sage
- Variants: Dark, Light

Both the dashboard and client portal share the same theme system.

## Common Issues

| Symptom | Cause | Fix |
|---------|-------|-----|
| Blank page after login | Session expired mid-navigation | Clear cookies, log in again |
| "Unauthorized" errors | JWT token expired | Refresh the page (auto-refreshes token) |
| Data not updating | Stale cache | Wait 30s or hard-refresh (Ctrl+Shift+R) |
| Wrong workspace data | Workspace selector not synced | Click the workspace dropdown and reselect |

## Related Alerts

No direct alerts for login/navigation. If the dashboard itself is down, the **Deploy & CI/CD** module on the [Status Page](https://status.maverick-ins.com) will show it.

---

## What This Does

Scraping is the first stage of the contact pipeline. It uses Playwright (headless browser automation) to log into Xpressdocs and export insurance contact lists by ZIP code and renewal month.

The full pipeline flows in strict order:

```
Scrape → Filter → Verify → Batch → Upload
```

Each stage reads from the previous stage's output. Never skip stages.

## Pipeline Month Semantics

The stored month is the **renewal month**. Each phase is offset from it:

- **Pull phase** (renewal month - 2): Contacts are scraped
- **Email phase** (renewal month - 1): Campaigns are sent
- **Renewal** (the stored month): Insurance renewals happen

Example: Month `2026-06` means pull in April, email in May, June renewals.
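
The offsets above can be sketched in Python. This is a minimal illustration, not the platform's code: the helper names `shift_month`, `fmt`, `pipeline_month_for`, and `phases` are hypothetical.

```python
from datetime import date


def shift_month(year: int, month: int, delta: int) -> tuple[int, int]:
    """Shift a (year, month) pair by `delta` months, rolling the year over."""
    index = year * 12 + (month - 1) + delta
    return index // 12, index % 12 + 1


def fmt(year: int, month: int) -> str:
    """Format a (year, month) pair as YYYY-MM."""
    return f"{year:04d}-{month:02d}"


def pipeline_month_for(today: date) -> str:
    """Renewal month for contacts being pulled today (today + 2 months)."""
    return fmt(*shift_month(today.year, today.month, 2))


def phases(renewal_month: str) -> dict[str, str]:
    """Map a stored renewal month (YYYY-MM) to its pull and email months."""
    year, month = (int(part) for part in renewal_month.split("-"))
    return {
        "pull": fmt(*shift_month(year, month, -2)),
        "email": fmt(*shift_month(year, month, -1)),
        "renewal": renewal_month,
    }


print(phases("2026-06"))
print(pipeline_month_for(date(2026, 3, 15)))  # 2026-05
```

Working on a month index (year * 12 + month) keeps the year rollover correct, for example a January renewal is pulled in November of the previous year.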

## How To Use It

### Starting a Scraping Job

1. Go to **Scraping** in the sidebar
2. Select a workspace from the dropdown
3. Click **Create Job** — choose the target month and ZIP codes
4. The job appears in the jobs table with status `pending`
5. A Celery worker picks it up within seconds (status → `running`)

### Monitoring Progress

- The **jobs table** shows status, created time, and contact counts
- The **pipeline progress bar** shows counts at each stage
- On the [Status Page](https://status.maverick-ins.com), the **Celery Workers** module shows active scraping tasks

### Pipeline Stats

The Overview page shows pipeline stats per workspace per month:
- `raw`: Total contacts scraped
- `filtered`: After filter rules applied
- `verified`: After email verification
- `batched`: Grouped for upload
- `uploaded`: Pushed to Email Bison (this is what counts toward the monthly target)
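
To see where volume drops between stages, the five counters can be turned into stage-to-stage retention. A minimal sketch, assuming the stats arrive as a plain dict keyed by the stage names above; `stage_retention` is a hypothetical helper and the numbers are made up.

```python
STAGES = ["raw", "filtered", "verified", "batched", "uploaded"]


def stage_retention(stats: dict[str, int]) -> dict[str, float]:
    """Fraction of contacts kept at each stage relative to the previous one."""
    retention = {}
    for previous, current in zip(STAGES, STAGES[1:]):
        before = stats[previous]
        # Guard against an empty upstream stage.
        retention[current] = stats[current] / before if before else 0.0
    return retention


# Made-up numbers for illustration only.
example = {"raw": 20000, "filtered": 10000, "verified": 8000, "batched": 8000, "uploaded": 7600}
print(stage_retention(example))
```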

## Xpressdocs Constraints

- **10K record limit** per export — the system automatically paginates
- **One credential at a time** — only one scraping task can use a login simultaneously
- **Browser automation** — runs Playwright in a Docker container, so it's slower than API calls
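
How the system splits a large pull is not documented here. As a rough sketch of the 10K limit only, assuming exports can be windowed by offset (an assumption), a result set breaks down like this; `export_windows` is a hypothetical helper.

```python
EXPORT_LIMIT = 10_000  # per-export record limit quoted above


def export_windows(total_records: int, limit: int = EXPORT_LIMIT) -> list[tuple[int, int]]:
    """Split a result set into (offset, size) windows that each fit one export."""
    return [
        (offset, min(limit, total_records - offset))
        for offset in range(0, total_records, limit)
    ]


print(export_windows(25_000))  # [(0, 10000), (10000, 10000), (20000, 5000)]
```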

## Common Issues

| Symptom | Cause | Fix |
|---------|-------|-----|
| Job stuck at `running` for 30+ min | Playwright browser hung or Xpressdocs session expired | Check Celery logs on Status Page → Logs module. If hung, the job will eventually time out (6h limit) |
| Job failed with `login_failed` | Xpressdocs password changed or account locked | Update credentials in Settings → Xpressdocs Config |
| 0 contacts scraped | No contacts available for that ZIP/month combo, or Xpressdocs returned empty export | Verify ZIP assignments exist for the workspace, check the month is correct |
| Pipeline stops after scraping | Downstream tasks not triggered | Check the jobs table — each stage creates its own job. If filtering didn't start, check Celery queue depth on Status Page |
| Duplicate contacts across months | Same person has multiple renewal dates | Expected behavior — dedup happens at the verification stage via email uniqueness |

## Related Alerts

- **ScrapingQueueDepthHigh**: More than 10 scraping tasks queued — workers may be overloaded
- **CeleryHighFailureRateWarning**: >10% task failures across all queues (**CeleryHighFailureRateCritical** fires at >25%)
- **PipelineSuccessRateSLOBreach**: Pipeline success rate below 95% over 1 hour

---

## What This Does

Filtering is the second pipeline stage. It takes raw scraped contacts and applies workspace-specific filter rules to decide which contacts move forward to verification. This reduces volume and ensures only eligible contacts are emailed.

## How It Works

After scraping completes, filtering automatically starts. The system:

1. Loads filter rules configured for the workspace
2. Applies each rule (age range, coverage type, geography, etc.)
3. Uses **deterministic hash-based sampling** for reproducibility — the same contact always gets the same sampling decision
4. Outputs to the `filtered_contacts` table

### Deterministic Sampling

Filtering uses hash-based sampling so results are reproducible:
```python
from hashlib import md5

hash_input = f"routing_sample_{contact_id}"
hash_val = (int(md5(hash_input.encode()).hexdigest(), 16) % 10000) / 10000.0  # value in [0, 1)
```
Same contact always gets the same sampling decision.
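
The formula yields a stable value in [0, 1) per contact. How that value is compared against the workspace sampling rate is not shown here; the sketch below assumes a contact is kept when its value falls below the rate. `sample_value` and `is_sampled` are hypothetical names.

```python
from hashlib import md5


def sample_value(contact_id: str) -> float:
    """Map a contact ID to a stable value in [0, 1) using the formula above."""
    hash_input = f"routing_sample_{contact_id}"
    return (int(md5(hash_input.encode()).hexdigest(), 16) % 10000) / 10000.0


def is_sampled(contact_id: str, sampling_rate: float) -> bool:
    """Assumed rule: keep the contact when its value falls below the rate."""
    return sample_value(contact_id) < sampling_rate


ids = [f"contact-{n}" for n in range(1000)]
kept = [cid for cid in ids if is_sampled(cid, 0.5)]
print(len(kept))  # roughly half; the exact count depends on the IDs
```

Because the decision depends only on the contact ID, re-running the filter over the same IDs reproduces the same kept set, while re-scraped contacts with new IDs get new decisions.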

## How To Use It

### Viewing Filtered Contacts

1. Go to **Contacts** in the sidebar
2. Select the workspace and month
3. The contacts table shows filtering status for each contact
4. Use the status filter dropdown to see only `filtered` contacts

### Filter Rules

Filter rules are configured per workspace in **Settings → Filter Configuration**. Common rules:
- Age range (e.g., 65-80)
- Coverage type (Medicare Supplement, etc.)
- State inclusion/exclusion
- Sampling rate (e.g., 50% of eligible contacts)

## Common Issues

| Symptom | Cause | Fix |
|---------|-------|-----|
| 0 contacts after filtering | Filter rules too restrictive | Check filter rules in Settings. Missing rules are not the cause: if no rules exist, filtering passes everything through |
| Filtering takes too long | Large batch (>50K contacts) | Normal for large batches. Check queue depth on Status Page — filtering queue has concurrency 2 |
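| Different results after re-run | Contact IDs changed because the contacts were re-scraped | Expected for new scrapes: the hash input includes the contact ID, so new IDs get new sampling decisions. With unchanged IDs, sampling is deterministic and results should match |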

## Related Alerts

- **PipelineSuccessRateSLOBreach**: Filtering failures contribute to overall pipeline SLO
- **CeleryHighFailureRateWarning** / **CeleryHighFailureRateCritical**: Check if filtering tasks are failing

---

## What This Does

Verification is the third pipeline stage. It sends filtered contact emails to the Debounce API for validation. This prevents sending campaigns to invalid emails (which would hurt sender reputation).

## How It Works

1. Filtered contacts are batched and sent to the Debounce API
2. Debounce returns a status per email: `deliverable`, `risky`, `undeliverable`, `unknown`
3. Only `deliverable` and (optionally) `risky` contacts move to batching
4. Results are stored in the `verified_contacts` table

### Verification Statuses

| Status | Meaning | Action |
|--------|---------|--------|
| `deliverable` | Valid email, safe to send | Moves to batching |
| `risky` | Email exists but may bounce | Configurable — usually included |
| `undeliverable` | Invalid or non-existent email | Excluded from pipeline |
| `unknown` | Debounce couldn't determine | Excluded by default |
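
The table reduces to a small gating rule. A minimal sketch, assuming the `risky` behavior is controlled by a single flag; the name `include_risky` is hypothetical and defaults to `True` because risky contacts are usually included.

```python
def moves_to_batching(status: str, include_risky: bool = True) -> bool:
    """Return True when a Debounce status lets the contact continue."""
    if status == "deliverable":
        return True
    if status == "risky":
        return include_risky
    # "undeliverable" and "unknown" are excluded (unknown: by default).
    return False


statuses = ["deliverable", "risky", "undeliverable", "unknown"]
print([s for s in statuses if moves_to_batching(s)])  # ['deliverable', 'risky']
```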

## How To Use It

### Monitoring Verification

1. Go to **Contacts** → select workspace and month
2. The pipeline progress bar shows verified count vs. filtered count
3. On the Status Page, check the **Celery Workers** module for verification queue depth

### Manual Recovery (Without list_id)

If Debounce verification fails mid-batch and you have results in a CSV:

```bash
# Run from the server inside the backend container
python scripts/import_debounce_results.py <csv_file> <workspace> --month May --ready-only
```

Scope by month and status to avoid touching other pipeline rows.

## Common Issues

| Symptom | Cause | Fix |
|---------|-------|-----|
| Verification queue > 5K deep | Large batch submitted or Debounce rate-limited | Normal for large batches — Debounce rate limits at ~3K/hour. Check VerificationBacklogSLOBreach alert |
| All contacts marked `undeliverable` | Debounce API key expired or billing issue | Check `DEBOUNCE_API_KEY` in backend env, verify Debounce account status |
| Verification stuck (no progress) | Worker crashed or Debounce API down | Check Status Page → Logs → `celery-verification`. Restart: on server, `docker restart celery-verification` |
| Partial results | Task timed out mid-verification | Re-run the verification stage for the affected batch. Already-verified contacts are skipped (idempotent) |

## Related Alerts

- **VerificationBacklogSLOBreach**: Queue >15K contacts (5+ hours to clear at 3K/hour)
- **DebounceAPIHighErrorRate**: Debounce returning >10% errors — likely rate limiting
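
Queue depth translates into drain time by simple division. A back-of-the-envelope sketch using the approximate 3K/hour Debounce rate quoted in this section; `hours_to_clear` is a hypothetical helper.

```python
DEBOUNCE_RATE_PER_HOUR = 3000  # approximate rate limit quoted in this section


def hours_to_clear(queue_depth: int, rate_per_hour: int = DEBOUNCE_RATE_PER_HOUR) -> float:
    """Estimate how long a verification backlog takes to drain."""
    return queue_depth / rate_per_hour


print(round(hours_to_clear(5000), 2))  # 1.67
print(hours_to_clear(15000))           # 5.0
```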

---

## What This Does

Batching is the fourth stage — it groups verified contacts into upload-ready batches. Upload is the fifth and final stage — it pushes batches to Email Bison for campaign delivery.

## How It Works

### Batching
1. Verified contacts are grouped into batches (typically by workspace + month)
2. Batches are stored in the `contact_batches` table with status `ready`
3. Each batch contains metadata: workspace, month, contact count

### Upload
1. Ready batches are picked up by the `email_bison` queue
2. Contacts are pushed to the Bison API as leads
3. On success, batch status changes to `uploaded`
4. The Bison API host is always `send.maverickmarketingllc.com`

## Monthly Target

The `monthly_contact_target` measures contacts **uploaded to Bison** — not raw scraped contacts. Use `pipeline_stats_cache.uploaded` for progress tracking, not `raw_contacts` count.
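
As a small illustration of measuring progress against `uploaded` rather than `raw`, assuming the counts are already loaded into a dict (the numbers are made up and `target_progress` is a hypothetical helper):

```python
def target_progress(uploaded: int, monthly_contact_target: int) -> float:
    """Share of the monthly target met, based on uploaded contacts only."""
    if monthly_contact_target <= 0:
        raise ValueError("monthly_contact_target must be positive")
    return uploaded / monthly_contact_target


stats = {"raw": 20000, "uploaded": 7600}  # illustrative numbers
print(f"{target_progress(stats['uploaded'], 10000):.0%}")  # 76%
```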

## How To Use It

### Viewing Batches
1. Go to **Contacts** → **Batches** tab
2. See all batches with status, contact count, created date
3. Upload status shows: `ready`, `uploading`, `uploaded`, `failed`

### Manual Upload Retry
If a batch fails to upload, it stays in `failed` status. The system will not auto-retry. To retry:
1. Check the failure reason in Status Page → Logs → `celery-email-bison`
2. If the issue is resolved (Bison API was down, credentials fixed), the batch can be re-queued from the Contacts page

## Common Issues

| Symptom | Cause | Fix |
|---------|-------|-----|
| Batch stuck at `ready` | email_bison queue worker not running | Check Status Page → Celery Workers → email_bison queue depth. Restart worker if needed |
| Upload failed with auth error | Bison API token expired | Go to Settings → Bison Tokens, refresh the token for the workspace |
| Uploaded count doesn't match target | The target counts uploaded contacts, and Bison may have filtered or rejected some | Check Bison campaign dashboard for rejection reasons |
| Duplicate uploads | Batch re-queued while original was in progress | Bison deduplicates by email — no harm done, but check batch statuses |

## Related Alerts

- **BisonAPIHighErrorRate**: Bison returning >10% errors
- **BisonAPIDown**: Bison returning >75% errors — API likely down
- **EmailEventsQueueDepthHigh**: email_events queue >200 (webhook processing backlog)

---

## What This Does

After contacts are uploaded to Email Bison, they become **leads** in active campaigns. The dashboard syncs campaign data back from Bison and tracks lead engagement — replies, interest signals, and conversation threads.

## How Campaign Sync Works

A Celery Beat task runs **every hour** to sync campaign data from Bison:
1. Fetches all campaigns for each workspace
2. Pulls lead statuses, reply counts, and conversation data
3. Updates `client_leads` table with latest statuses
4. Bison API always returns 15 items per page regardless of `per_page` param — the sync paginates using `meta.last_page`
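
Pagination driven by `meta.last_page` can be sketched as below. This is not the sync task's actual code: the fetch function is injected, the `data` key is an assumed response field, and `fake_fetch` stands in for the real API call.

```python
from typing import Any, Callable, Iterator

Page = dict[str, Any]


def iter_all_items(fetch_page: Callable[[int], Page]) -> Iterator[Any]:
    """Yield items from every page, trusting meta.last_page rather than per_page."""
    page = 1
    while True:
        payload = fetch_page(page)
        yield from payload["data"]  # "data" is an assumed field name
        if page >= payload["meta"]["last_page"]:
            break
        page += 1


# Stand-in for the real API call: 40 items served 15 per page.
ITEMS = list(range(40))


def fake_fetch(page: int) -> Page:
    start = (page - 1) * 15
    return {"data": ITEMS[start:start + 15], "meta": {"last_page": 3}}


print(len(list(iter_all_items(fake_fetch))))  # 40
```

Looping until `meta.last_page` avoids relying on the page size, which matters because the requested `per_page` is ignored.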

## Lead Statuses and Interest Flow

### Source of Truth: Bison

- `is_interested` is set **ONLY** by the `lead_interested` Bison webhook
- AI audit flags `needs_review=True` but **NEVER** sets `is_interested=True`
- `sentiment_source='bison'` means Bison classified this lead
- `sentiment_source='pending_bison'` means awaiting Bison's verdict
- `sentiment_source='ai_audit'` means AI classified (not interested per Bison)

### Pipeline Stages

| Stage | Meaning |
|-------|---------|
| `new` | Fresh lead, no engagement yet |
| `contacted` | Email sent |
| `replied` | Lead replied to email |
| `follow-up` | In follow-up sequence |
| `interested` | Confirmed interested (Bison webhook) |
| `not_interested` | Confirmed not interested |
| `unsubscribed` | Lead opted out |
| `bounced` | Email bounced |

Note: Bison uses hyphens (`follow-up`) — the system normalizes to underscores internally.
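
The normalization amounts to a one-line mapping. A minimal sketch; `normalize_stage` is a hypothetical name, and the real code may normalize more than hyphens.

```python
def normalize_stage(bison_stage: str) -> str:
    """Convert a Bison stage name to the internal underscore form."""
    return bison_stage.replace("-", "_")


print(normalize_stage("follow-up"))  # follow_up
```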

## How To Use It

### Viewing Campaigns
1. Go to **Campaigns** in the sidebar
2. Select workspace — see all synced campaigns with send dates, open rates, reply counts
3. Click a campaign to see individual leads

### Viewing Leads
1. Go to **Leads** in the sidebar
2. Filter by workspace, month, status, interest
3. Click a lead to see conversation thread (via `bison_conversation_url`)

## Common Issues

| Symptom | Cause | Fix |
|---------|-------|-----|
| Campaigns not showing | Sync hasn't run yet or Bison tokens expired | Check last sync time. Go to Settings → Bison Tokens to verify |
| Lead shows `pending_bison` for days | Bison hasn't classified yet | Normal for low-engagement leads. AI audit will eventually classify |
| Conversation URL returns 404 | Bison conversation was deleted or lead was removed | Expected for cleaned-up leads |
| Reply count mismatch | Sync lag — Bison data is up to 1 hour behind | Wait for next sync cycle |

## Related Alerts

- **BisonAPIHighErrorRate**: Campaign sync will fail if Bison is down
- **EmailPeriodicQueueDepthHigh**: Campaign sync tasks queued >20

---

## What This Does

The Analytics section provides campaign performance and ROI metrics. It tracks email delivery, engagement, and revenue attribution across workspaces.

## Key Metrics

| Metric | Meaning | Source |
|--------|---------|--------|
| **Contacts Uploaded** | Leads pushed to Bison | `pipeline_stats_cache.uploaded` |
| **Emails Sent** | Campaigns delivered | Bison campaign metrics |
| **Open Rate** | Percentage of emails opened | Bison tracking pixels |
| **Reply Rate** | Percentage of leads that replied | Bison conversation data |
| **Interested Leads** | Confirmed interested | Bison webhook (`is_interested=True`) |
| **ROI** | Revenue / cost ratio | `campaign_metrics` + `client_costs` |

## Date Range Behavior

- **Analytics pages use calendar month** (`getCurrentMonth()`) — not pipeline month
- The month selector on Analytics means "show me activity from this month"
- This is different from Overview which uses pipeline month (renewal month)

**Important:** Do NOT use the pipeline month selector for daily stats or email volume.

## How To Use It

1. Go to **Analytics** in the sidebar
2. Select workspace and date range
3. Dashboard shows: email volume trends, campaign performance, lead conversion funnel
4. Revenue tab shows: per-client costs, ROI breakdown

### Client Costs

The `client_costs` table tracks costs per workspace:
- `email_account_costs` — email infrastructure
- `labor_costs` — manual review time
- `other_costs` — miscellaneous
- `total_costs` — **auto-generated column** (sum of above three, never INSERT directly)
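
The relationship between the cost columns and ROI can be modeled as follows. A sketch only: the `ClientCosts` class and `roi` function are hypothetical, the figures are made up, and ROI follows the "revenue / cost ratio" definition from the Key Metrics table.

```python
from dataclasses import dataclass


@dataclass
class ClientCosts:
    email_account_costs: float
    labor_costs: float
    other_costs: float

    @property
    def total_costs(self) -> float:
        # Mirrors the generated column: derived, never set directly.
        return self.email_account_costs + self.labor_costs + self.other_costs


def roi(revenue: float, costs: ClientCosts) -> float:
    """Revenue / cost ratio, as defined in the Key Metrics table."""
    if costs.total_costs == 0:
        raise ValueError("no cost data entered")
    return revenue / costs.total_costs


costs = ClientCosts(email_account_costs=300.0, labor_costs=150.0, other_costs=50.0)
print(costs.total_costs)   # 500.0
print(roi(2000.0, costs))  # 4.0
```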

## Common Issues

| Symptom | Cause | Fix |
|---------|-------|-----|
| Revenue shows $0 | Cost data not entered for the workspace | Enter costs in Settings → Client Costs |
| Charts show no data for current month | Campaign metrics sync runs hourly | Wait for sync, or check Celery Workers module |
| Numbers don't match Bison dashboard | Sync lag or different date boundaries | Bison uses UTC, dashboard uses CST. Data syncs hourly |
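
The UTC versus CST boundary effect is easy to reproduce. The sketch below treats CST as a fixed UTC-6 offset (an assumption that ignores daylight saving) and shows an event that lands in March for Bison but in February for the dashboard.

```python
from datetime import datetime, timedelta, timezone

CST = timezone(timedelta(hours=-6), "CST")  # fixed offset; ignores daylight saving

event_utc = datetime(2026, 3, 1, 3, 0, tzinfo=timezone.utc)
event_cst = event_utc.astimezone(CST)

print(event_utc.date())  # 2026-03-01
print(event_cst.date())  # 2026-02-28
```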

## Related Alerts

No specific alerts for analytics — it is read-only data derived from campaign and pipeline metrics.

---

## What This Does

ZIP assignments define which geographic territories each client workspace owns. When scraping runs, it pulls contacts from the ZIP codes assigned to that workspace for the target month.

## How To Use It

### Map View
1. Go to **Scraping** → **Zip Assignments** tab (or use the map toggle)
2. The map shows all assigned ZIPs color-coded by workspace (12 client colors)
3. Zoom in to see individual ZIP code markers
4. Click an assigned ZIP to see details (workspace, city, state)

### Assigning ZIPs

**Map selection (one or more ZIPs):**
1. Click an unassigned ZIP on the map
2. This enters multi-select mode (ZIP turns green)
3. Click more unassigned ZIPs to add them
4. A selection bar appears at the bottom — click a client name to assign all selected ZIPs

**Bulk upload:**
1. Click the Upload button in the toolbar
2. Upload a CSV with columns: `zip_code`, `state`, `month`
3. Select the target workspace
4. Preview results with dry-run, then confirm

### Managing Assignments
- Click an assigned ZIP → popup shows reassign option
- Use the table view (list icon) to see all assignments with delete option
- Bulk delete via the table view's multi-select

## Common Issues

| Symptom | Cause | Fix |
|---------|-------|-----|
| Map shows all dark / wrong colors in light mode | Known bug: map tiles don't respect the theme and always use CartoDB Dark Matter | No workaround yet; a fix is in progress |
| ZIPs not showing on map | Need to zoom in past threshold | Zoom in — ZIP markers appear at zoom level 6+ |
| ZIP already assigned error | Another workspace owns that ZIP for the same month | Check the map — click the ZIP to see which workspace has it. Reassign or delete first |
| Bulk upload shows 0 imported | CSV format incorrect | Ensure columns are: `zip_code`, `state`, `month` (YYYY-MM format) |
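
A CSV can be checked locally before upload. This is a hedged sketch of the format rules stated above (the three column names and `YYYY-MM` months), not the importer's own validation; `validate_zip_csv` is a hypothetical helper and the sample rows are made up.

```python
import csv
import io
import re

REQUIRED_COLUMNS = ["zip_code", "state", "month"]
MONTH_PATTERN = re.compile(r"^\d{4}-(0[1-9]|1[0-2])$")


def validate_zip_csv(text: str) -> list[str]:
    """Return a list of problems; an empty list means the file looks importable."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]
    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if not MONTH_PATTERN.match(row["month"] or ""):
            problems.append(f"line {line_no}: month {row['month']!r} is not YYYY-MM")
    return problems


good = "zip_code,state,month\n78701,TX,2026-05\n"
bad = "zip_code,state,month\n78701,TX,May\n"
print(validate_zip_csv(good))  # []
print(validate_zip_csv(bad))
```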

## Related Alerts

No specific alerts for ZIP assignments — it is configuration data, not pipeline execution.

---

## What This Does

The AI Agent is an assistant that can query platform data, look up contacts and leads, check pipeline status, and answer questions about workspaces. It uses Claude (Anthropic) with tool-calling to access Supabase data.

## How To Use It

1. Go to **Agent** in the sidebar
2. Type a question or command in the chat input
3. The agent processes your request using available tools
4. Results appear in the conversation thread

### What It Can Do

- Look up contacts, leads, and campaign data by workspace
- Check pipeline status and counts
- Query the `pipeline_stats_cache` table for month-over-month comparisons
- Search for specific leads by email or name
- Summarize workspace activity

### What It Cannot Do

- Modify data (read-only access)
- Run pipeline stages or restart workers
- Access external APIs (Bison, Debounce) directly
- Perform SQL aggregations (COUNT, SUM): these are not available through the agent's PostgREST access
- Access tables not listed in its system prompt

## Important Limitations

- **Token limits**: Tool outputs are truncated to under 2K tokens to avoid context overflow
- **Table awareness**: The agent only knows about tables listed in its system prompt (`backend/app/tools/system_prompt.md`). New tables must be added there.
- **Model**: Uses Haiku 4.5 by default for cost efficiency (90% of Sonnet performance at 3x lower cost)

## Common Issues

| Symptom | Cause | Fix |
|---------|-------|-----|
| Agent says it doesn't have access to that table | Table not in system prompt | Add table to `backend/app/tools/system_prompt.md` |
| Prompt too long error | Previous tool results too large | Start a new conversation thread |
| Agent gives wrong numbers | Used name-based search instead of `pipeline_stats_cache` | Ask explicitly, for example: "use `pipeline_stats_cache` to get X" |
| Slow responses on beta | Nginx SSE timeout | Known issue — beta nginx needs `proxy_read_timeout 300s` |

## Related Alerts

- Agent System module on Status Page shows: active threads, P95 latency, token usage, tool success rate

---

## What This Does

Settings manages workspace configuration, external service credentials, and user access. Each workspace has its own Bison token, filter rules, and contact targets.

## Workspace Configuration

### Bison Tokens
Each workspace needs an active Bison API token to sync campaigns and upload leads.

1. Go to **Settings** → **Bison Tokens**
2. Each workspace shows its token status (`active` / `expired`)
3. To refresh: click the workspace, enter new token from Bison dashboard
4. Tokens are stored in the `bison_tokens` table with `is_active=True`

### Client Registry
The `client_registry` table maps workspace names to display names and configuration:
- `workspace_name`: Internal snake_case name (e.g., `gregg_blanchard`)
- `display_name`: Human-readable name (e.g., "Gregg Blanchard")
- `bison_workspace_id`: Links to Bison workspace (IDs 14 and 15 are anomalous — reassigned to different clients)
- `monthly_contact_target`: How many contacts to upload per month

### Monthly Targets
The monthly target counts contacts **uploaded to Bison**, not raw scraped. Pipeline progress is tracked via `pipeline_stats_cache.uploaded`.

## User Management

Users are managed via Supabase Auth:
- `auth.users` must exist before creating `user_profiles` or `user_workspace_access`
- Workspace access is granted per-user in `user_workspace_access` table
- Users can have access to multiple workspaces

## Common Issues

| Symptom | Cause | Fix |
|---------|-------|-----|
| Token expired on campaign sync | Bison token needs refresh | Get new token from Bison dashboard, update in Settings |
| New user can't see any data | Missing workspace access | Add entry to `user_workspace_access` for the user + workspace |
| Workspace not appearing in dropdown | Not in `client_registry` | Add workspace to `client_registry` table |

## Related Alerts

- **BisonAPIHighErrorRate**: May indicate token issues across workspaces

---

## What This Does

The platform sends alerts to Slack when things go wrong. Alerts are routed by severity:
- **Critical** → `#maverick-alerts-critical` (immediate action needed)
- **Warning** → `#maverick-alerts` (investigate when possible)
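
The routing rule is a plain severity-to-channel lookup. A minimal sketch; the real routing lives in the alerting configuration, and `channel_for` is a hypothetical name.

```python
CHANNELS = {
    "critical": "#maverick-alerts-critical",
    "warning": "#maverick-alerts",
}


def channel_for(severity: str) -> str:
    """Return the Slack channel for an alert severity (case-insensitive)."""
    return CHANNELS[severity.lower()]


print(channel_for("Critical"))  # #maverick-alerts-critical
```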

## Alert Reference

### API Health

| Alert | Severity | Trigger | What To Do |
|-------|----------|---------|------------|
| `APIInstanceDown` | Critical | Prometheus can't reach API for >2 min | Check Status Page → Core Services. If down, SSH to server and run `docker compose -f docker-compose.production.yml restart api` |
| `APIHighErrorRateWarning` | Warning | 5xx rate >1% over 5 min | Check Status Page → Logs → `api` for error patterns |
| `APIHighErrorRateCritical` | Critical | 5xx rate >5% over 3 min | Likely a code bug or DB issue. Check Logs → `api` for stack traces. May need rollback |
| `APIHighLatencyWarning` | Warning | P95 latency >1s | Check database latency on Status Page → Database module |
| `APIHighLatencyCritical` | Critical | P95 latency >3s | Database or Redis likely overloaded. Check System Resources for CPU/memory |

### Celery Workers and Queues

| Alert | Severity | Trigger | What To Do |
|-------|----------|---------|------------|
| `EmailEventsQueueDepthHigh` | Warning | email_events queue >200 | Webhook backlog building. Check if email_bison worker is running |
| `EmailEventsQueueDepthCritical` | Critical | email_events queue >1000 | Worker likely crashed. Restart: `docker restart celery-email-bison` |
| `VerificationQueueDepthHigh` | Warning | verification queue >5000 | Large batch submitted. Normal — takes ~2h to clear at 3K/hour |
| `ScrapingQueueDepthHigh` | Warning | scraping queue >10 | Multiple scraping jobs queued. Only 1 runs at a time (browser automation). Queue will drain slowly |
| `CeleryHighFailureRateWarning` | Warning | >10% task failures | Check Logs for failing task patterns |
| `CeleryHighFailureRateCritical` | Critical | >25% task failures | Systemic issue — check DB connectivity, Redis, external APIs |
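
The two `email_events` thresholds in the table can be read as a tiered check. The sketch below is illustrative only, since the actual rules are evaluated by the alerting stack; `email_events_severity` is a hypothetical name.

```python
from typing import Optional


def email_events_severity(queue_depth: int) -> Optional[str]:
    """Map email_events queue depth to the severity that would fire, if any."""
    if queue_depth > 1000:
        return "critical"
    if queue_depth > 200:
        return "warning"
    return None


print(email_events_severity(150))   # None
print(email_events_severity(450))   # warning
print(email_events_severity(1200))  # critical
```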

### System Resources

| Alert | Severity | Trigger | What To Do |
|-------|----------|---------|------------|
| `DiskSpaceWarning` | Warning | Disk >70% full | Check Docker images and logs consuming space. Run `docker system prune` on server |
| `DiskSpaceCritical` | Critical | Disk >85% full | Urgent — clean up immediately or services will crash |
| `HighMemoryUsageWarning` | Warning | RAM >80% | Check which containers are using most memory: `docker stats` |
| `HighCPULoad` | Warning | 5-min load avg >6 | Usually scraping or verification spike. Should resolve on its own |

### Pipeline SLOs

| Alert | Severity | Trigger | What To Do |
|-------|----------|---------|------------|
| `PipelineSuccessRateSLOBreach` | Warning | below 95% success over 1h | Check which pipeline stage is failing — look at Celery Workers module |
| `VerificationBacklogSLOBreach` | Warning | Verification queue >15K | Massive batch that will take 5+ hours to clear at 3K/hour. No action needed unless it grows |
| `BisonAPIHighErrorRate` | Warning | Bison >10% errors | Check Bison status. May be rate limiting or token issue |
| `BisonAPIDown` | Critical | Bison >75% errors | Bison API is likely down. Check [send.maverickmarketingllc.com](https://send.maverickmarketingllc.com). No action until they resolve |
| `DebounceAPIHighErrorRate` | Warning | Debounce >10% errors | Likely rate limiting. Verification will slow but continue |

## General Troubleshooting Decision Tree

```
Something seems wrong
├── Is the Status Page showing any red modules?
│   ├── Yes → Click the red module for details
│   │   ├── Core Services down → restart API container
│   │   ├── Celery down → check which queue, restart that worker
│   │   ├── Database down → check Supabase status
│   │   └── System Resources critical → check disk/memory
│   └── No → Check Slack for recent alerts
│
├── Is a specific pipeline slow?
│   ├── Check Celery Workers → Active Tasks tab for what's running
│   ├── Check queue depths — high depth = backlog, not failure
│   └── Check Logs module for the relevant worker
│
└── Is data not updating?
    ├── Campaign data → wait for hourly sync (check Beat schedule)
    ├── Pipeline data → check if the stage completed (jobs table)
    └── Dashboard → hard refresh (Ctrl+Shift+R)
```

## Related Links

- [Status Page](https://status.maverick-ins.com)
- [Grafana](https://grafana.maverick-ins.com)

---

## What This Does

Deployments are automated via GitHub Actions. Merging to `main` triggers production deploy. Merging to `development` triggers beta deploy.

## Deploy Flow

### Production (main branch)
1. PR merged to `main` triggers `deploy-production.yml`
2. GitHub Actions SSHs to the VPS at `/opt/maverick-platform`
3. Backs up local `.env` files
4. Runs `git reset --hard origin/main`
5. Restores `.env` files
6. Runs `docker compose -f docker-compose.production.yml up -d --build`
7. Waits 90s for startup
8. Health checks: API (`/health`), Dashboard (HTTP 200), Status Page (HTTP 200)
9. Slack notification on success/failure

### Beta (development branch)
Same flow but targets `/opt/maverick-beta` on the beta VPS (187.124.152.114), uses `docker-compose.beta.yml`, and waits 45s.
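
The wait-then-check step (90s on production, 45s on beta) can be pictured as below. This is a sketch, not the workflow's implementation: probes are injected as callables, the stubs stand in for real HTTP checks, and `check_after_startup` is a hypothetical name.

```python
import time
from typing import Callable


def check_after_startup(
    probes: dict[str, Callable[[], bool]],
    startup_wait_s: float,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, bool]:
    """Wait for containers to start, then run each named probe once."""
    sleep(startup_wait_s)
    return {name: probe() for name, probe in probes.items()}


# Stub probes stand in for real HTTP checks; sleep is stubbed so the demo returns at once.
results = check_after_startup(
    {"api": lambda: True, "dashboard": lambda: True, "status_page": lambda: False},
    startup_wait_s=90,
    sleep=lambda seconds: None,
)
failed = [name for name, ok in results.items() if not ok]
print(failed)  # ['status_page']
```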

## What To Check After Deploy

1. **Status Page** → verify all modules are green (https://status.maverick-ins.com)
2. **Dashboard** → log in and navigate to a few pages
3. **Slack** → check for deploy success notification
4. **API health** → `curl https://api.maverick-ins.com/health`

## Adding New Environment Variables

Env files are **server-local** and NOT in git. If code requires a new env var:

1. SSH to the production server
2. Edit the relevant env file:
   - Backend: `backend/.env.production`
   - Frontend: `frontend/apps/dashboard/.env.production`
3. Restart the affected container: `docker compose -f docker-compose.production.yml restart <service>`

## Rollback

If a deploy breaks something:

1. Check the Slack notification for the commit that was deployed
2. On GitHub, identify the last known-good commit on `main`
3. **Option A** (preferred): Revert the bad commit via a PR, merge to `main`, auto-deploys
4. **Option B** (emergency): SSH to server, manually `git reset --hard <good-commit>`, rebuild

## Common Issues

| Symptom | Cause | Fix |
|---------|-------|-----|
| Deploy workflow failed | SSH connection timeout or build error | Check GitHub Actions run log for details |
| Health check failed after deploy | Service slow to start or config issue | Wait 2 more minutes. If still failing, check `docker compose logs api` |
| Env vars missing after deploy | `git reset --hard` cleared them, restore failed | SSH to server, re-add the env vars manually |
| Beta deploys not triggering | Merge target was wrong branch | Beta deploys on push to `development` only |

## Related Alerts

- **Deploy & CI/CD** module on Status Page shows latest workflow runs and branch status

# PART 2: TECHNICAL REFERENCE

---

<!-- ARCHITECTURE.md not found -->

---

<!-- API_REFERENCE.md not found -->

---

<!-- DATABASE_SCHEMA.md not found -->

---

<!-- ENV_REFERENCE.md not found -->