Operations Runbooks
Common operational procedures for the quinza infrastructure.
Runbooks
| Runbook | Description |
|---|---|
| Disaster Recovery | Recovery procedures for each failure scenario |
| Adding K8s Nodes | Step-by-step guide to provision new workers |
| Secrets Management | SOPS, Ansible Vault, ArgoCD credentials, deploy tokens |
| Troubleshooting | Known issues, observability pipeline, and recovery playbook |
Emergency Contacts
| Service | URL |
|---|---|
| PagerDuty | quinza.eu.pagerduty.com |
| SigNoz | dogs.quinza.dev |
| OneUptime | ops.quinza.dev |
| Semaphore | semaphore.quinza.dev |
Quick Reference
- Bastion down? See Disaster Recovery - Scenario 1
- Need to decrypt secrets? See Secrets Management
- Adding capacity? See Adding K8s Nodes
- ArgoCD credentials? See Secrets Management
- Dashboards empty? See Troubleshooting - Recovery Playbook
- Check cluster health? `talosctl get members` + `kubectl get nodes`
- etcd snapshot? `make etcd-snapshot` or `make etcd-snapshot-push` (offsite)
- PostgreSQL backup? Daily CronJob at 02:00 UTC; offsite: `scripts/pg-backup-to-bastion.sh`
- Restore PostgreSQL? `scripts/pg-restore.sh <backup-file>`
- Generate CI kubeconfig? `scripts/generate-ci-kubeconfig.sh`
- MetalLB VIP unreachable? See Troubleshooting - Issue 6
- CNPG metrics missing? See Troubleshooting - Issue 7
- Semaphore template blank? See Troubleshooting - Issue 8
- Auto-remediation? See Disaster Recovery - Semaphore Runbooks
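The two cluster health commands in the list above can be wrapped in one helper so that a failure in either view is reported explicitly. This is a minimal sketch, not part of the existing tooling: the function name `check_cluster_health` and its exit codes are assumptions, and it presumes `talosctl` and `kubectl` are already configured for the cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: runs both health checks from the quick reference.
# Returns 0 if both succeed, 1 if either check fails, 2 if a tool is missing.
check_cluster_health() {
  local failed=0
  local tool

  # Fail early if either CLI is not available on this machine.
  for tool in talosctl kubectl; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing tool: $tool" >&2
      return 2
    fi
  done

  # Talos view: cluster members as seen by Talos.
  talosctl get members || { echo "talosctl get members failed" >&2; failed=1; }

  # Kubernetes view: registered nodes and their status.
  kubectl get nodes || { echo "kubectl get nodes failed" >&2; failed=1; }

  return "$failed"
}
```

Source the snippet and run `check_cluster_health`. Both commands always run, so a failure on the Talos side does not hide the Kubernetes node status.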
URLs
| Service | URL |
|---|---|
| Carzying frontend | www.carzying.es |
| Directus CMS | content.carzying.es |
| ArgoCD | argocd.quinza.dev |
| SigNoz | dogs.quinza.dev |
| OneUptime | ops.quinza.dev |
| Semaphore | semaphore.quinza.dev |