# Backups and Monitoring

This repo now includes the minimum operational hooks needed for a safe launch:

- `scripts/backup-postgres.sh`
  Runs a PostgreSQL backup using `DATABASE_URL` or `PG*` env vars.
- `scripts/check-health.mjs`
  Checks the frontend and `/api/health`, exits non-zero on failure, and can post failures to `ALERT_WEBHOOK_URL`.

## Database Backups

Run a manual backup from the repo root:

```bash
npm run ops:backup-db
```

Defaults:

- output directory: `/sdb-disk/backups/rainmaker-postgres` when that mounted backup volume exists, otherwise `./backups/postgres`
- format: PostgreSQL custom dump
- retention: 14 days
- remote replication: disabled unless `REMOTE_BACKUP_TARGET` and `REMOTE_BACKUP_DIR` are set

Optional env vars:

- `BACKUP_DIR`
- `RETENTION_DAYS`
- `REMOTE_BACKUP_TARGET`
- `REMOTE_BACKUP_DIR`
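
The defaults and overrides above can be sketched roughly as follows. This is a hypothetical illustration, not the contents of `scripts/backup-postgres.sh`: the function names, the `rainmaker-<timestamp>.dump` file name, and the `*.dump` pruning pattern are all assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the default handling; the real script may differ.

# Pick the output directory: an explicit BACKUP_DIR wins, then the mounted
# backup volume if it exists, then a repo-local fallback.
# $1 optionally overrides the mounted-volume path (handy for testing).
resolve_backup_dir() {
  mounted_dir="${1:-/sdb-disk/backups/rainmaker-postgres}"
  if [ -n "${BACKUP_DIR:-}" ]; then
    printf '%s\n' "$BACKUP_DIR"
  elif [ -d "$mounted_dir" ]; then
    printf '%s\n' "$mounted_dir"
  else
    printf '%s\n' "./backups/postgres"
  fi
}

# Write a custom-format dump into directory $1.
# Without DATABASE_URL, pg_dump falls back to the PG* env vars via libpq.
run_dump() {
  mkdir -p "$1"
  dump_file="$1/rainmaker-$(date +%Y%m%dT%H%M%S).dump"  # assumed naming scheme
  if [ -n "${DATABASE_URL:-}" ]; then
    pg_dump --format=custom --file="$dump_file" --dbname="$DATABASE_URL"
  else
    pg_dump --format=custom --file="$dump_file"
  fi
}

# Delete dumps in directory $1 last modified more than RETENTION_DAYS days ago.
prune_old_dumps() {
  find "$1" -maxdepth 1 -type f -name '*.dump' \
    -mtime "+${RETENTION_DAYS:-14}" -delete
}
```

A run would then look something like `dir=$(resolve_backup_dir) && run_dump "$dir" && prune_old_dumps "$dir"`, so pruning only happens after a dump has succeeded.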

Recommended cron:

```cron
15 3 * * * /usr/bin/flock -n /tmp/cron-rm-pg-backup.lock /bin/bash -lc 'cd /var/www/html/rainmaker && BACKUP_DIR=/sdb-disk/backups/rainmaker-postgres RETENTION_DAYS=14 REMOTE_BACKUP_TARGET=administrator@10.10.0.3 REMOTE_BACKUP_DIR=/home/administrator/backups/rainmaker-postgres /usr/bin/npm run ops:backup-db' >> /var/log/rainmaker/postgres-backup.log 2>&1
```

For production, do not point backups back into the app tree unless you have no other volume. A repo-local dump on `/var/www/html` does not protect you from that disk dying.

If you have a cluster node available, replicate the dump there over SSH in the same cron run. The backups then survive the loss of the app host, and the restore files stay close by on the cluster network.
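
As a rough sketch of what the SSH replication step might look like, assuming plain `ssh` and `scp` with key-based authentication already set up: the `replicate_dump` function and the `DRY_RUN` flag are hypothetical names for illustration, and the actual script may use a different transfer tool.

```bash
#!/usr/bin/env bash
# Hypothetical sketch; not taken from scripts/backup-postgres.sh.

# Copy dump file $1 to REMOTE_BACKUP_TARGET:REMOTE_BACKUP_DIR.
# Replication is skipped unless both variables are set.
# DRY_RUN=1 (assumed flag) prints the commands instead of running them.
replicate_dump() {
  if [ -z "${REMOTE_BACKUP_TARGET:-}" ] || [ -z "${REMOTE_BACKUP_DIR:-}" ]; then
    echo "remote replication disabled"
    return 0
  fi
  run=""
  if [ "${DRY_RUN:-0}" = "1" ]; then
    run="echo"
  fi
  # Make sure the remote directory exists, then copy the dump into it.
  $run ssh "$REMOTE_BACKUP_TARGET" "mkdir -p '$REMOTE_BACKUP_DIR'" || return 1
  $run scp -q "$1" "$REMOTE_BACKUP_TARGET:$REMOTE_BACKUP_DIR/"
}
```

Running it once with `DRY_RUN=1` before enabling the cron entry is a cheap way to confirm the target and directory are what you expect.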

## Health Monitoring

Run a manual health check from the repo root:

```bash
npm run ops:check-health
```

Optional env vars:

- `FRONTEND_HEALTHCHECK_URL`
- `API_HEALTHCHECK_URL`
- `HEALTHCHECK_TIMEOUT_MS`
- `ALERT_WEBHOOK_URL`
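
For a quick check without Node, the same idea can be approximated in shell with `curl`. This is a minimal sketch and not a port of `scripts/check-health.mjs`: the function names and the 5000 ms default timeout are assumptions, and the URLs are passed as arguments because the script's built-in defaults are not documented here.

```bash
#!/usr/bin/env bash
# Hypothetical shell approximation of the health check.

# Convert HEALTHCHECK_TIMEOUT_MS to whole seconds for curl, rounding up.
# The 5000 ms default is an assumption.
timeout_seconds() {
  ms="${HEALTHCHECK_TIMEOUT_MS:-5000}"
  echo $(( ($ms + 999) / 1000 ))
}

# Succeeds only if the URL answers with a non-error HTTP status in time.
check_url() {
  curl -fsS --max-time "$(timeout_seconds)" -o /dev/null "$1"
}

# Check every URL given as an argument; return non-zero if any check fails.
run_checks() {
  failed=0
  for url in "$@"; do
    if check_url "$url"; then
      echo "ok   $url"
    else
      echo "FAIL $url"
      failed=1
    fi
  done
  return "$failed"
}
```

For example, `run_checks "$FRONTEND_HEALTHCHECK_URL" "$API_HEALTHCHECK_URL"` exits non-zero if either endpoint fails, which is the behavior cron or a wrapper script can act on.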

Recommended cron:

```cron
*/5 * * * * cd /var/www/html/rainmaker && /usr/bin/npm run ops:check-health >> /var/log/rainmaker/healthcheck.log 2>&1
```

If `ALERT_WEBHOOK_URL` is set, failures are POSTed as JSON so Slack, Discord, or another webhook consumer can alert immediately.
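
To illustrate the webhook step, here is a hedged sketch of building and POSTing a JSON failure message with `curl`. The payload shape is an assumption: the `text` field follows the Slack incoming-webhook convention, while Discord webhooks expect `content` instead, and the real script may send different fields. The function names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the alert step; the real payload may differ.

# Escape backslashes and double quotes for use inside a JSON string.
# Newlines and other control characters are not handled here.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the JSON body. $1 = failed URL, $2 = short error description.
build_alert_payload() {
  printf '{"text":"Health check failed: %s (%s)"}' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# POST the payload if ALERT_WEBHOOK_URL is set; otherwise do nothing.
send_alert() {
  if [ -z "${ALERT_WEBHOOK_URL:-}" ]; then
    return 0
  fi
  curl -fsS -X POST -H 'Content-Type: application/json' \
    --data "$(build_alert_payload "$1" "$2")" "$ALERT_WEBHOOK_URL"
}
```

Keeping the payload builder separate from the POST makes it easy to inspect the JSON locally before pointing it at a real webhook.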
