
Kubernetes

Cluster: quinza

| Property | Value |
| --- | --- |
| Distribution | Talos Linux v1.13.0 |
| Kubernetes | v1.36.0 |
| Nodes | 3 (1 control plane + 2 workers) |
| Scheduling on control plane | Enabled (allowSchedulingOnControlPlanes: true) |
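
The control-plane scheduling setting corresponds to a cluster-level Talos option. A minimal patch sketch, assuming the option is set through a machine config patch (the file name is hypothetical):

```yaml
# controlplane-scheduling.patch.yaml (hypothetical file name)
# Allows regular workloads to be scheduled onto the control plane node.
cluster:
  allowSchedulingOnControlPlanes: true
```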

Nodes

| Role | Specs | WireGuard IP |
| --- | --- | --- |
| Control Plane | 2 vCPU, 4 GB RAM | 10.10.1.1 |
| Worker 1 | 4 vCPU, 8 GB RAM | 10.10.1.2 |
| Worker 2 | 4 vCPU, 8 GB RAM | 10.10.1.3 |

Networking

CNI: Flannel with --iface=wg0, so all pod traffic routes through the WireGuard mesh.
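
One way to pass the interface flag is through the Talos machine config. The sketch below assumes the cluster uses the Flannel bundled with Talos and sets the flag via the CNI extra arguments; whether this cluster configures it this way or through a custom manifest is an assumption.

```yaml
# Sketch of a Talos machine config patch (assumed mechanism).
cluster:
  network:
    cni:
      name: flannel
      flannel:
        extraArgs:
          - --iface=wg0   # bind Flannel's overlay traffic to the WireGuard interface
```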

Load Balancer: MetalLB in L2 mode, deployed via Helm in the metallb-system namespace. IP pool: 10.10.1.200-10.10.1.210 (WireGuard subnet).
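
In L2 mode, the pool is usually declared with an IPAddressPool and an L2Advertisement resource. A minimal sketch using the range above; the resource names are hypothetical:

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: wireguard-pool        # hypothetical name
  namespace: metallb-system
spec:
  addresses:
    - 10.10.1.200-10.10.1.210 # range inside the WireGuard subnet
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: wireguard-l2          # hypothetical name
  namespace: metallb-system
spec:
  ipAddressPools:
    - wireguard-pool          # announce only addresses from the pool above
```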

Ingress: Traefik v3.6.13, deployed via Helm chart with LoadBalancer service type. MetalLB assigns VIP 10.10.1.200. Caddy on the bastion points to this single VIP instead of hardcoded node IPs. The variable k8s_traefik_upstream in bastion vars controls the upstream address.
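
A sketch of the relevant Traefik Helm values, assuming the VIP is pinned with a MetalLB annotation. The cluster may instead rely on MetalLB handing out the first free address in the pool, so treat the annotation as illustrative.

```yaml
# values.yaml for the Traefik Helm chart (sketch)
service:
  type: LoadBalancer
  annotations:
    # Assumption: the VIP is requested explicitly. Older MetalLB releases
    # use the key metallb.universe.tf/loadBalancerIPs instead.
    metallb.io/loadBalancerIPs: 10.10.1.200
```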

MetalLB and WireGuard

MetalLB L2 mode announces the VIP via ARP, a layer-2 protocol that does not cross WireGuard's layer-3 tunnels. The VIP must therefore be added to the WireGuard AllowedIPs of the node peer so the bastion can route to it. See WireGuard - MetalLB VIP Routing.

DNS Automation: ExternalDNS watches K8s Ingress resources and auto-creates DNS A records in Cloudflare for quinza.dev domains. Deployed via Helm (external-dns/external-dns) in the external-dns namespace with txt-owner-id: quinza-k8s.

| Property | Value |
| --- | --- |
| Namespace | external-dns |
| Provider | Cloudflare (quinza.dev only) |
| Sync policy | sync with TXT owner ID quinza-k8s |
| GCore (carzying.es) | Pending (needs API token) |
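
The settings above map onto the external-dns Helm chart values roughly as follows. This is a sketch, not the deployed values file; the Secret name and key holding the Cloudflare token are hypothetical.

```yaml
# values.yaml for the external-dns/external-dns chart (sketch)
provider:
  name: cloudflare
sources:
  - ingress                # watch Ingress resources
domainFilters:
  - quinza.dev             # only manage records in this zone
policy: sync               # create, update, and delete records
txtOwnerId: quinza-k8s     # ownership marker written to TXT records
env:
  - name: CF_API_TOKEN
    valueFrom:
      secretKeyRef:
        name: cloudflare-api-token   # hypothetical Secret name
        key: api-token               # hypothetical key
```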

Storage

local-path-provisioner is the default StorageClass. Namespaces using PVCs need the pod-security label:

```yaml
labels:
  pod-security.kubernetes.io/enforce: privileged
```
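
For context, here is a sketch of a complete Namespace carrying that label, together with a claim that falls through to the default StorageClass. The namespace name, claim name, and size are hypothetical.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: example-app            # hypothetical namespace
  labels:
    pod-security.kubernetes.io/enforce: privileged
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data           # hypothetical claim
  namespace: example-app
spec:
  accessModes:
    - ReadWriteOnce
  # storageClassName omitted so the default StorageClass is used
  resources:
    requests:
      storage: 1Gi             # illustrative size
```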

PostgreSQL

CloudNativePG v1.29.0 operator manages PostgreSQL instances. StackGres was evaluated but is incompatible with Kubernetes v1.36.

PostgreSQL runs in the apps namespace as a shared instance for all applications.
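
A CloudNativePG instance of this shape is declared with a Cluster resource. The sketch below uses the version, instance count, and volume size listed in the workload summary; the cluster name and image reference are assumptions.

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: shared-pg              # hypothetical name
  namespace: apps
spec:
  instances: 1                 # single instance, no replicas
  imageName: ghcr.io/cloudnative-pg/postgresql:16   # assumed image for PostgreSQL 16
  storage:
    size: 10Gi
```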

Backups

A CronJob in talos/manifests/postgresql/backup-cronjob.yaml runs pg_dump daily at 02:00 UTC with 7-day retention, storing dumps on a 5Gi PVC.

| Property | Value |
| --- | --- |
| Schedule | 0 2 * * * (daily at 02:00 UTC) |
| Retention | 7 days |
| Storage | 5Gi PVC |
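
The actual manifest lives in the repository file named above. As an illustration only, a CronJob with this schedule and retention could look like the sketch below; every name, the image, and the dump format are assumptions.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: pg-backup                        # hypothetical name
  namespace: apps
spec:
  schedule: "0 2 * * *"                  # daily at 02:00
  timeZone: "Etc/UTC"                    # make the UTC schedule explicit
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: pg-dump
              image: postgres:16         # assumed image providing pg_dump
              command: ["/bin/sh", "-c"]
              args:
                - |
                  set -eu
                  # Write one custom-format dump per day.
                  pg_dump --dbname="$PG_URI" --format=custom \
                    --file="/backups/dump-$(date +%Y%m%d).dump"
                  # Retention: remove dumps older than 7 days.
                  find /backups -name 'dump-*.dump' -mtime +7 -delete
              env:
                - name: PG_URI
                  valueFrom:
                    secretKeyRef:
                      name: shared-pg-app   # hypothetical; CNPG creates a <cluster>-app Secret
                      key: uri
              volumeMounts:
                - name: backups
                  mountPath: /backups
          volumes:
            - name: backups
              persistentVolumeClaim:
                claimName: pg-backups    # hypothetical claim for the 5Gi volume
```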

Additional scripts:

| Script | Purpose |
| --- | --- |
| scripts/pg-backup-to-bastion.sh | Copy backups offsite to the bastion |
| scripts/pg-restore.sh | Restore a backup to the cluster |

Namespace Layout

```mermaid
graph TB
    subgraph kube-system
        COREDNS[CoreDNS x2]
        FLANNEL[Flannel x3]
        KUBEPROXY[kube-proxy x3]
        TRAEFIK[Traefik v3.6.13]
    end

    subgraph metallb-system
        METALLB[MetalLB L2]
    end

    subgraph external-dns
        EXTDNS[ExternalDNS - Cloudflare]
    end

    subgraph cnpg-system
        CNPG_OP[CloudNativePG v1.29.0]
    end

    subgraph local-path-storage
        LPP[local-path-provisioner]
    end

    subgraph apps
        PG[PostgreSQL 16 - 1 instance, 10Gi]
    end

    subgraph carzying
        FE[Frontend - SolidStart x1]
        DIRECTUS[Directus v11]
    end

    subgraph carzying-preview
        PREV[Preview Deployments]
    end

    subgraph argocd
        ARGO_SRV[ArgoCD Server]
        ARGO_REPO[Repo Server]
        ARGO_REDIS[Redis]
        ARGO_CTRL[Application Controller]
    end

    subgraph gitlab-runner
        GLRUNNER[GitLab Runner - Helm]
    end

    subgraph monitoring
        OTEL[OTel Collector DaemonSet x3]
        OTEL_CNPG[OTel CNPG Collector]
    end

    CNPG_OP -->|manages| PG
    DIRECTUS -->|uses| PG
    TRAEFIK -->|routes to| FE
    TRAEFIK -->|routes to| DIRECTUS
    TRAEFIK -->|routes to| ARGO_SRV
    OTEL -->|OTLP to bastion relay| RELAY[10.10.0.1:4317]
```

Workload Summary

| Namespace | Workloads |
| --- | --- |
| kube-system | CoreDNS (2), Flannel (3), kube-proxy (3), Traefik v3.6.13 (LoadBalancer), gitlab-ci-deploy SA |
| metallb-system | MetalLB controller + speaker (L2 mode, pool 10.10.1.200-210) |
| external-dns | ExternalDNS (Cloudflare provider, quinza.dev) |
| cnpg-system | CloudNativePG operator v1.29.0 |
| local-path-storage | local-path-provisioner (default StorageClass) |
| apps | PostgreSQL 16 (CNPG, 1 instance, 10Gi), pg_dump CronJob (daily) |
| carzying | Frontend (1 replica), Directus v11 |
| carzying-preview | Ephemeral preview deployments from MRs |
| argocd | Server, repo-server, redis, application-controller |
| gitlab-runner | GitLab Runner (Helm, Kubernetes executor) |
| monitoring | OTel Collector DaemonSet (3 pods), OTel CNPG Collector (Deployment) |
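
The namespace diagram shows the OTel Collector DaemonSet exporting OTLP to the bastion relay at 10.10.0.1:4317. A minimal collector config sketch for that hop follows; the receiver, the choice of the traces pipeline, and the plaintext TLS setting are assumptions, not the deployed configuration.

```yaml
receivers:
  otlp:
    protocols:
      grpc: {}               # assumed: workloads send OTLP over gRPC to the collector
exporters:
  otlp:
    endpoint: 10.10.0.1:4317 # bastion relay address from the diagram
    tls:
      insecure: true         # assumption: plaintext gRPC inside the WireGuard tunnel
service:
  pipelines:
    traces:                  # assumed signal; metrics and logs pipelines would look the same
      receivers: [otlp]
      exporters: [otlp]
```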

Configuration Management

Talos nodes are configured via talosctl machineconfig patch. Each node has its own patch file applied directly.
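
As a sketch of what a per-node patch file might contain, the example below sets a node label. The file name, label, and the file names in the commented command are hypothetical.

```yaml
# worker-1.patch.yaml (hypothetical file name)
# Applied with something like:
#   talosctl machineconfig patch worker.yaml \
#     --patch @worker-1.patch.yaml --output worker-1.yaml
machine:
  nodeLabels:
    example.com/node-role: worker-1   # hypothetical label
```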

talhelper was evaluated, but v3 is broken; do not use it.

Secrets

talsecret.yaml contains cluster secrets, encrypted with SOPS + age. The age key is stored outside the repository.

talsecret.yaml  →  SOPS (age encryption)  →  git
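
SOPS typically picks the age recipient from a .sops.yaml file at the repository root. A minimal sketch, assuming that convention is used here; the recipient value is a placeholder for the real age public key:

```yaml
# .sops.yaml (sketch)
creation_rules:
  - path_regex: talsecret\.yaml$
    age: "AGE_PUBLIC_KEY_PLACEHOLDER"   # replace with the age recipient (public key)
```

Only the public key belongs in this file; the private age key stays outside the repository, as noted above.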
