# EventheOdds Sports Database Audit
**Generated:** 2026-01-30 06:30 UTC
**Database:** PostgreSQL (sports schema)
**Purpose:** Comprehensive review for AI verification

---

## 1. DATABASE OVERVIEW

### 1.1 Table Count
- **Total Tables:** 121
- **Total Columns:** 2,100+ (approximate)

### 1.2 Top Tables by Row Count
| Table | Rows | Purpose |
|-------|------|---------|
| PlayerGameMetric | 2,912,987 | Player stats per game |
| PropBacktest | 376,956 | Backtesting results |
| BookmakerOdds | 327,971 | Odds from bookmakers |
| NFLPlayByPlay | 245,565 | NFL play-by-play data |
| PlayerPropLine | 199,705 | Player prop betting lines |
| SportsGame | 182,831 | Game schedules/results |
| LiveOddsSnapshot | 120,474 | Live odds snapshots |
| ExternalFeedRecord | 115,611 | External data ingestion logs |
| CanonicalGame | 86,376 | Deduplicated game records |
| Player | 45,955 | Player records |
| CanonicalPlayer | 21,531 | Canonical player references |

---

## 2. DATA FRESHNESS BY LEAGUE

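### 2.1 SportsGame Table
Dates in the Latest Game column that fall after the audit date (2026-01-30) can only be scheduled fixtures, not completed games.
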
| League | Games | Latest Game | Last Updated |
|--------|-------|-------------|--------------|
| NHL | 34,160 | 2026-04-17 | 2026-01-30 |
| MLB | 29,485 | 2026-09-02 | 2026-01-29 |
| NBA | 25,688 | 2026-04-13 | 2026-01-30 |
| NCAAB | 20,955 | 2026-03-07 | 2026-01-30 |
| NCAAF | 17,218 | 2026-01-20 | 2026-01-28 |
| NFL | 16,305 | 2026-02-08 | 2026-01-30 |
| WNBA | 11,103 | 2025-10-11 | 2026-01-28 |
| MMA | 10,184 | 2027-07-02 | 2026-01-28 |
| EPL | 7,360 | 2026-05-24 | 2026-01-30 |

### 2.2 Player Table
| League | Players | Last Updated |
|--------|---------|--------------|
| NFL | 5,801 | 2026-01-30 |
| NBA | 5,389 | 2026-01-30 |
| NHL | 5,214 | 2026-01-30 |
| MLB | 4,967 | 2026-01-29 |
| NCAAB | 4,694 | 2026-01-28 |
| UCL | 3,933 | 2026-01-12 |
| Serie A | 3,701 | 2026-01-21 |
| La Liga | 3,405 | 2026-01-21 |

### 2.3 Injury Data
| League | Injuries | Last Updated |
|--------|----------|--------------|
| NFL | 1,508 | 2026-01-28 |
| MLB | 447 | 2026-01-28 |
| NBA | 353 | 2026-01-28 |
| NHL | 180 | 2026-01-28 |
| NCAAF | 40 | 2026-01-28 |
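
The "Last Updated" columns above can be screened for staleness mechanically. The following is a minimal sketch, not the audit's actual tooling: the `FreshnessRow` shape, the function names, and the 7-day threshold are all assumptions made for illustration. The sample rows are copied from the Player table in section 2.2.

```typescript
// Hypothetical row shape mirroring the "League / Last Updated" columns.
interface FreshnessRow {
  league: string;
  lastUpdated: string; // ISO date, e.g. "2026-01-30"
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days between an ISO date and the audit date (both taken as UTC midnight).
function daysStale(lastUpdated: string, auditDate: string): number {
  const updated = Date.parse(`${lastUpdated}T00:00:00Z`);
  const audit = Date.parse(`${auditDate}T00:00:00Z`);
  return Math.round((audit - updated) / MS_PER_DAY);
}

// Leagues whose last update is more than maxAgeDays old (threshold is an assumption).
function staleLeagues(
  rows: FreshnessRow[],
  auditDate: string,
  maxAgeDays: number
): string[] {
  return rows
    .filter((row) => daysStale(row.lastUpdated, auditDate) > maxAgeDays)
    .map((row) => row.league);
}

// Sample rows copied from the Player table (section 2.2).
const playerFreshness: FreshnessRow[] = [
  { league: "NFL", lastUpdated: "2026-01-30" },
  { league: "MLB", lastUpdated: "2026-01-29" },
  { league: "UCL", lastUpdated: "2026-01-12" },
  { league: "Serie A", lastUpdated: "2026-01-21" },
  { league: "La Liga", lastUpdated: "2026-01-21" },
];

const staleAtAudit = staleLeagues(playerFreshness, "2026-01-30", 7);
console.log(staleAtAudit);
```

With a 7-day threshold, this flags the three soccer leagues (UCL, Serie A, La Liga), whose player records were last updated 9 to 18 days before the audit.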

---

## 3. CANONICAL ENTITY SYSTEM

### 3.1 CanonicalTeam Coverage
| League | Teams |
|--------|-------|
| NFL | 32 |
| NHL | 32 |
| MLB | 30 |
| NBA | 30 |

### 3.2 CanonicalPlayer Coverage
| League | Players | With Team Assigned | Coverage % |
|--------|---------|-------------------|------------|
| NFL | 5,717 | 5,617 | 98.3% |
| NBA | 5,210 | 4,947 | 95.0% |
| NHL | 5,081 | 1,071 | 21.1% |
| MLB | 4,963 | 3,983 | 80.3% |
| NCAAB | 48 | 0 | 0% |

**ISSUE:** Only 21.1% of NHL canonical players (1,071 of 5,081) have a team assigned. NCAAB has just 48 canonical players, none with a team.
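
The Coverage % column is the share of canonical players with a team assigned, rounded to one decimal place. As a minimal sketch of that arithmetic (the function name is an assumption, and the zero-total guard is a choice made here, not something the audit specifies):

```typescript
// Percentage of players with a team assigned, rounded to one decimal place.
// Returns 0 when there are no players, to avoid dividing by zero (an assumed convention).
function coveragePercent(withTeam: number, total: number): number {
  if (total === 0) return 0;
  return Math.round((withTeam / total) * 1000) / 10;
}

// Counts copied from the CanonicalPlayer coverage table.
const coverageByLeague = {
  NFL: coveragePercent(5617, 5717),
  NBA: coveragePercent(4947, 5210),
  NHL: coveragePercent(1071, 5081),
  MLB: coveragePercent(3983, 4963),
  NCAAB: coveragePercent(0, 48),
};

console.log(coverageByLeague);
```

Running this against the table's counts reproduces the reported figures: 98.3, 95, 21.1, 80.3, and 0.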

---

## 4. DATA INTEGRITY ISSUES

### 4.1 Duplicate Games (Same Date + Teams)
| League | Total Games (2025-26) | Potential Duplicates |
|--------|----------------------|---------------------|
| NHL | 2,557 | 96 |
| NBA | 2,374 | 64 |
| MLB | 96 | 2 |
| NFL | 253 | 1 |

**ISSUE:** 163 potential duplicate game records (96 NHL, 64 NBA, 2 MLB, 1 NFL) need deduplication
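
A "same date + teams" check can be sketched as a group-by on a composite key. This is an illustration only, not the logic of `scripts/deduplicate-games.ts`, whose contents are not shown in this audit. The `GameRow` shape and function name are assumptions, and the sample rows are invented. Matches are only "potential" duplicates: a legitimate MLB doubleheader, for example, shares date and teams.

```typescript
// Hypothetical subset of SportsGame columns needed for the check.
interface GameRow {
  id: string;
  league: string;
  gameDate: string; // ISO date, e.g. "2026-01-15"
  homeTeam: string;
  awayTeam: string;
}

// Group games by league + date + home + away and keep groups with more than one row.
function findDuplicateGroups(games: GameRow[]): GameRow[][] {
  const groups = new Map<string, GameRow[]>();
  for (const game of games) {
    const key = [game.league, game.gameDate, game.homeTeam, game.awayTeam].join("|");
    const bucket = groups.get(key) ?? [];
    bucket.push(game);
    groups.set(key, bucket);
  }
  return Array.from(groups.values()).filter((bucket) => bucket.length > 1);
}

// Invented sample rows: g1 and g2 describe the same matchup.
const sampleGames: GameRow[] = [
  { id: "g1", league: "NBA", gameDate: "2026-01-15", homeTeam: "BOS", awayTeam: "ATL" },
  { id: "g2", league: "NBA", gameDate: "2026-01-15", homeTeam: "BOS", awayTeam: "ATL" },
  { id: "g3", league: "NBA", gameDate: "2026-01-15", homeTeam: "CHI", awayTeam: "ATL" },
];

const duplicateIds = findDuplicateGroups(sampleGames).map((group) =>
  group.map((game) => game.id)
);
console.log(duplicateIds);
```

An exact-match key like this only catches rows whose team strings are identical, so it undercounts when the same team is stored under different spellings.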

### 4.2 Team Name Inconsistency
Games stored with both abbreviations AND full names:
- `ATL` vs `Atlanta Hawks`
- `BOS` vs `Boston Celtics`
- `CHI` vs `Chicago Bulls`

Because the same game can be stored under either form, it shows up as two separate records when querying by team.
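
One way to neutralize the mismatch is to map every team string to its abbreviation before comparing games. The sketch below is an assumption-laden illustration: the lookup contains only the three pairs listed above, and `normalizeTeam` and `matchupKey` are hypothetical helper names. A real mapping would presumably be derived from `CanonicalTeam` (`abbr`, `fullName`).

```typescript
// Full name -> abbreviation, limited to the three examples listed in this section.
const ABBR_BY_FULL_NAME = new Map<string, string>([
  ["Atlanta Hawks", "ATL"],
  ["Boston Celtics", "BOS"],
  ["Chicago Bulls", "CHI"],
]);

// Return the abbreviation for a known full name; pass other values through unchanged.
function normalizeTeam(raw: string): string {
  const trimmed = raw.trim();
  return ABBR_BY_FULL_NAME.get(trimmed) ?? trimmed;
}

// Composite key that is identical for both spellings of the same matchup.
function matchupKey(gameDate: string, homeTeam: string, awayTeam: string): string {
  return [gameDate, normalizeTeam(homeTeam), normalizeTeam(awayTeam)].join("|");
}

const keyFromFullNames = matchupKey("2026-01-15", "Boston Celtics", "Atlanta Hawks");
const keyFromAbbreviations = matchupKey("2026-01-15", "BOS", "ATL");
console.log(keyFromFullNames === keyFromAbbreviations);
```

Unknown strings pass through unchanged, so an incomplete mapping degrades to the current behavior rather than corrupting team values.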

---

## 5. KEY PLAYER VERIFICATION

### 5.1 Recent Trade Verification (2025-26 Season)
| Player | Player Table | Canonical Table | Status |
|--------|-------------|-----------------|--------|
| Julius Randle | MIN | MIN | CORRECT |
| De'Andre Hunter | CLE | CLE | CORRECT |
| LeBron James | LAL | LAL | CORRECT |
| Karl-Anthony Towns | NYK | NYK | CORRECT |
| Nikola Jokic | DEN | DEN | CORRECT |

---

## 6. API & ENDPOINT STATUS

### 6.1 Working Endpoints
- `/api/health` - Health check
- `/api/analytics/injury-severity` - Injury severity analysis
- `/api/analytics/injury-props-edge` - Injury props (fixed 2026-01-30)
- `/api/nba/games` - NBA games (requires auth)
- `/api/nba/players` - NBA players (requires auth)

### 6.2 External API Status
| API | Status | Notes |
|-----|--------|-------|
| BallDontLie | Working | Had 401 errors, now fixed |
| Grok (x.ai) | Working | Chat responses |
| SportsGameOdds | Working | Props & odds sync |
| ESPN | Working | Stats ingestion |

---

## 7. KNOWN ISSUES TO FIX

### 7.1 Critical
1. **Duplicate Games** - 163 potential duplicate game records, almost all NHL (96) and NBA (64)
   - Same game stored with abbreviation AND full team name
   - Need deduplication script

2. **NHL Team Assignment** - Only 21% of NHL canonical players have team IDs
   - Need to run roster sync for NHL

### 7.2 Moderate
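3. **NCAAB/NCAAF Player Canonicalization** - near-zero coverage
   - NCAAB has 48 canonical players (vs. 4,694 in Player), none with a team assigned; NCAAF does not appear in section 3.2
   - Need to populate canonical player records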

4. **Soccer League Fragmentation** - Multiple league identifiers
   - `epl` vs `soccer_epl`
   - `bundesliga` vs `soccer_bundesliga`
   - Need normalization
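
For the league identifiers, normalization could be as simple as stripping the `soccer_` prefix. This sketch assumes the short form (`epl`, `bundesliga`) is the intended canonical identifier, which matches the `soccer_epl -> epl` direction noted in section 10. Lowercasing the input and the function name are further assumptions.

```typescript
const SOCCER_PREFIX = "soccer_";

// Lowercase the identifier and drop a leading "soccer_" if present.
function normalizeLeague(raw: string): string {
  const league = raw.trim().toLowerCase();
  return league.startsWith(SOCCER_PREFIX)
    ? league.slice(SOCCER_PREFIX.length)
    : league;
}

// The four identifiers named in the issue above.
const observedLeagues = ["epl", "soccer_epl", "bundesliga", "soccer_bundesliga"];

// Distinct identifiers remaining after normalization.
const distinctLeagues = Array.from(new Set(observedLeagues.map(normalizeLeague)));
console.log(distinctLeagues);
```

The four observed identifiers collapse to two, `epl` and `bundesliga`.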

### 7.3 Minor
5. **Stale Local Fallback Data** - `/data/nba/` folder has old data
   - Used when BallDontLie API fails
   - Should auto-update or disable

---

## 8. SCHEMA HIGHLIGHTS

### 8.1 Core Entity Tables
```
Player (45,955 rows)
  - id, league, externalPlayerId, name, position, team, raw, createdAt, updatedAt

CanonicalPlayer (21,531 rows)
  - id, league, fullName, normalizedName, teamId
  - External IDs: sgoId, bdlId, espnId, nbaComId, mlbamId, nflGsisId, nhlApiId

CanonicalTeam (124 rows)
  - id, league, abbr, fullName, city, nickname
  - External IDs: sgoId, bdlId, espnId

CanonicalGame (86,376 rows)
  - id, league, season, gameDate, homeTeamId, awayTeamId
  - Odds: homeMoneyline, awayMoneyline, spread, total
  - External IDs: sgoEventId, bdlGameId, espnEventId
```

### 8.2 Odds & Props Tables
```
PlayerPropLine (199,705 rows)
  - playerExternalId, league, market, lineValue, overOdds, underOdds
  - Links to games and players

SportsGame (182,831 rows)
  - homeTeam, awayTeam, gameDate, league
  - homeScore, awayScore, status
  - Odds: homeMoneyline, awayMoneyline, spread, total
```

### 8.3 Analytics Tables
```
PlayerGameMetric (2.9M rows)
  - playerExternalId, gameKey, statKey, value, league

CLVAnalysis (18,323 rows)
  - Closing line value tracking

LineMovement (79,954 rows)
  - Line movement tracking
```

---

## 9. CRON JOBS CONFIGURED

| Job | Schedule (cron, UTC) | Purpose |
|-----|----------------------|---------|
| Master Sync | `*/30 15-23,0-7 * * *` | SGO SDK, props, canonical |
| Full Canonical | `0 4 * * *` | Deep linking, full sync |
| ESPN Metrics | Every 2 hours | Player game stats |
| Roster Sync | `0 8 * * *` | Refresh rosters |
| Odds Snapshots | `*/15 17-23,0-6 * * *` | Live odds capture |
| Analytics | `0 6 * * *` | CLV, market efficiency |
| Injuries | Every 4 hours | Injury status sync |
| Schedules | `0 5 * * *` | Upcoming games |
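
The hour windows for Master Sync (`15-23,0-7`) and Odds Snapshots (`17-23,0-6`) wrap past midnight UTC, which makes the daily run count easy to misjudge. The sketch below expands such an hour field and counts runs per day. It is a simplified illustration with hypothetical helper names. It handles only comma-separated values and ranges in the hour field, plus a `*/step` minute field, and is not a general cron parser.

```typescript
// Expand a cron hour field such as "15-23,0-7" into the sorted hours it matches.
// Only single values and ranges are supported; steps like "*/2" are not.
function expandHourField(field: string): number[] {
  const hours = new Set<number>();
  for (const part of field.split(",")) {
    const [startText, endText] = part.split("-");
    const start = Number(startText);
    const end = endText === undefined ? start : Number(endText);
    for (let hour = start; hour <= end; hour++) {
      hours.add(hour);
    }
  }
  return Array.from(hours).sort((a, b) => a - b);
}

// Minutes per hour matched by a "*/step" minute field (0, step, 2*step, ...).
function firesPerHour(minuteStep: number): number {
  return Math.ceil(60 / minuteStep);
}

// Total runs per day for a "*/step" minute field combined with an hour field.
function runsPerDay(minuteStep: number, hourField: string): number {
  return firesPerHour(minuteStep) * expandHourField(hourField).length;
}

const masterSyncRuns = runsPerDay(30, "15-23,0-7");
const oddsSnapshotRuns = runsPerDay(15, "17-23,0-6");
console.log({ masterSyncRuns, oddsSnapshotRuns });
```

Under these assumptions, Master Sync covers 17 hours and runs 34 times per day, while Odds Snapshots covers 14 hours and runs 56 times per day.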

---

## 10. RECOMMENDED FIXES

### Priority 1: Data Quality
```bash
# 1. Deduplicate games
npx tsx scripts/deduplicate-games.ts

# 2. Sync NHL rosters
npx tsx scripts/sync-rosters.ts --league=nhl

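# 3. Normalize team names in SportsGame (SQL, so run through psql; covers home and away).
#    Assumes DATABASE_URL holds the PostgreSQL connection string.
psql "$DATABASE_URL" <<'SQL'
UPDATE "SportsGame" SET "homeTeam" = 'ATL' WHERE "homeTeam" = 'Atlanta Hawks';
UPDATE "SportsGame" SET "awayTeam" = 'ATL' WHERE "awayTeam" = 'Atlanta Hawks';
-- (repeat for all teams)
SQL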
```

### Priority 2: Coverage Gaps
```bash
# 1. Populate NCAAB/NCAAF canonical players
npx tsx scripts/populate-canonical-ncaa.ts

# 2. Merge duplicate soccer leagues
# soccer_epl -> epl, etc.
```

### Priority 3: Maintenance
```bash
# 1. Update local fallback data
npx tsx scripts/update-local-fallback.ts

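# 2. Add indexes for common queries (SQL, so run through psql).
#    Assumes DATABASE_URL holds the PostgreSQL connection string.
psql "$DATABASE_URL" <<'SQL'
CREATE INDEX idx_sportsgame_league_date ON "SportsGame"(league, "gameDate");
CREATE INDEX idx_player_league_name ON "Player"(league, name);
SQL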
```

---

## 11. VERIFICATION QUERIES

### Check for orphaned records
```sql
-- Players without canonical entry
SELECT COUNT(*) FROM "Player" p
LEFT JOIN "CanonicalPlayer" c ON p.name = c."fullName" AND p.league = c.league
WHERE c.id IS NULL;

-- Props without valid player
SELECT COUNT(*) FROM "PlayerPropLine" pp
LEFT JOIN "Player" p ON pp."playerExternalId" = p."externalPlayerId"
WHERE p.id IS NULL;
```

### Check data freshness
```sql
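-- Tables with no manual or automatic ANALYZE in 7+ days (including never analyzed).
-- This is only a proxy for staleness: pg_stat_user_tables does not record
-- when rows were last written.
SELECT relname, last_vacuum, last_analyze, last_autoanalyze, n_live_tup
FROM pg_stat_user_tables
WHERE COALESCE(GREATEST(last_analyze, last_autoanalyze), '-infinity')
      < NOW() - INTERVAL '7 days'
ORDER BY n_live_tup DESC;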
```

---

## 12. SUMMARY

| Metric | Status |
|--------|--------|
| Total Tables | 121 |
| Total Rows | ~8.5M |
| Data Freshness | Current (Jan 30, 2026) |
| NBA Team-Assignment Coverage | 95% |
| NFL Team-Assignment Coverage | 98% |
| NHL Team-Assignment Coverage | 21% (needs work) |
| Duplicate Games | 163 potential (needs cleanup) |
| API Health | All working |
| Key Players Verified | All correct |

**Overall Status:** Operational, but two critical data quality issues (duplicate games, NHL team assignments) still need to be addressed.

---

*Audit generated by Claude Code for AI verification review.*
