Why Talos
Bootstrapping Kubernetes by hand with kubeadm takes 15+ steps and leaves certificate management to you. Talos gives you a declarative cluster that manages its own certificates, certificate rotation, and upgrades — all through a single machine config.
The trade-off: Talos is opinionated. There is no SSH and no shell on the nodes — the only interface is the Talos API. But for infrastructure that should just work, this is a feature.
Air-gapped requirement: My homelab can’t reach public registries, so every container pull is redirected through my Harbor mirror. Talos’s registry mirror config makes this seamless.
> [!tip]
> Talos automatically rotates certificates before they expire. No manual intervention is needed for cluster certificate management.
Module Capabilities
The tf-module-proxmox-talos module provisions a complete Talos-based Kubernetes cluster on Proxmox VE in a single Terraform apply:
- Talos Image Factory — generates custom ISOs with specific extensions
- Machine Configuration — generates Talos machine configs with networking
- ISO Upload — downloads and uploads to Proxmox datastore
- Node Provisioning — provisions control plane and worker VMs across host pool
- Cluster Bootstrap — applies machine configs and bootstraps Kubernetes
- Day-0 GitOps — optionally installs Flux or Argo CD during bootstrap
- Registry Mirrors — configures container registry redirects
Quick Start
```hcl
module "talos_cluster" {
  source  = "registry.example.com/namespace/tf-module-proxmox-talos/talos"
  version = "1.2.1"

  configuration = {
    cluster = {
      name               = "prod-k8s"
      datastore          = { id = "nas", node = "alpha" }
      talos              = { version = "v1.12.4" }
      kubernetes_version = "v1.35.0"
    }

    host_pool = {
      alpha   = { datastore_id = "local-lvm" }
      charlie = { datastore_id = "local-lvm" }
      foxtrot = { datastore_id = "local-lvm" }
    }

    control_plane_nodes = {
      nodes = [
        { size = "control_plane", networks = { dmz = { address = "192.168.62.21/24", gateway = "192.168.62.1" } } }
      ]
      host_pool = ["alpha", "charlie", "foxtrot"]
      vip       = { enabled = true, address = "192.168.62.20" }
    }

    worker_nodes = {
      nodes = [
        { size = "worker", networks = { dmz = { address = "192.168.62.24/24", gateway = "192.168.62.1" } } }
      ]
      host_pool = ["alpha", "charlie", "foxtrot"]
    }

    node_size_configuration = {
      control_plane = { cpu = 4, memory = 8192, os_disk = 128 }
      worker        = { cpu = 10, memory = 49152, os_disk = 128, data_disk = 512 }
    }
  }
}
```

Talos Image Factory
The module uses Talos’s image factory to generate custom ISOs with specific extensions:
```hcl
# image.tf
resource "talos_image_factory_schematic" "this" {
  schematic = yamlencode({
    customization = {
      systemExtensions = {
        officialExtensions = data.talos_image_factory_extensions_versions.this.extensions_info[*].name
      }
    }
  })
}
```

The extensions are defined in locals:
```hcl
locals {
  image = {
    platform = "nocloud"
    customizations = {
      base = [
        "lldp",             # Network topology discovery
        "qemu-guest-agent", # Proxmox agent integration
        "util-linux-tools", # Core utilities
        "iscsi-tools",      # iSCSI storage support
        "nfs-utils"         # NFS mounting
      ]
    }
  }
}
```

The generated schematic ID is used to construct the ISO URL:
```hcl
resource "proxmox_download_file" "talos_iso" {
  file_name = "talos-${var.configuration.cluster.name}-${var.configuration.cluster.talos.version}-${data.talos_image_factory_urls.this.schematic_id}.iso"
  url = (
    var.configuration.cluster.talos.iso_mirror != null
    ? replace(data.talos_image_factory_urls.this.urls.iso, "https://", var.configuration.cluster.talos.iso_mirror)
    : data.talos_image_factory_urls.this.urls.iso
  )
}
```

This allows pulling the ISO through a mirror in air-gapped environments.
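The data sources feeding the resources above are not shown; a minimal sketch of how they could be wired together, reusing the `local.image` values from earlier (attribute names follow the siderolabs/talos provider):

```hcl
# Sketch: resolving extension versions and download URLs for the schematic
data "talos_image_factory_extensions_versions" "this" {
  talos_version = var.configuration.cluster.talos.version
  filters = {
    names = local.image.customizations.base # only the extensions listed in locals
  }
}

data "talos_image_factory_urls" "this" {
  talos_version = var.configuration.cluster.talos.version
  schematic_id  = talos_image_factory_schematic.this.id
  platform      = local.image.platform
}
```

The `filters.names` list keeps the schematic limited to the base extensions instead of every official extension the factory knows about.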
Machine Configuration
Talos machine configuration is generated through the Talos provider:
```hcl
data "talos_machine_configuration" "configurations" {
  cluster_name       = var.configuration.cluster.name
  cluster_endpoint   = "https://${var.configuration.control_plane_nodes.vip.address}:6443"
  machine_type       = "controlplane"
  machine_secrets    = talos_machine_secrets.this.machine_secrets
  talos_version      = var.configuration.cluster.talos.version
  kubernetes_version = var.configuration.cluster.kubernetes_version

  # Node-specific settings such as networking are applied as config patches
  config_patches = [
    yamlencode({
      machine = {
        network = {
          interfaces = [
            for name, network in var.configuration.control_plane_nodes.nodes[0].networks : {
              interface = name
              dhcp      = false
              addresses = [network.address]
            }
          ]
        }
      }
    })
  ]
}
```

The configuration supports:
- Multiple network interfaces per node
- Registry mirrors for all major registries
- Custom CNI (Cilium) configuration
- Disabling kube-proxy in favor of Cilium’s eBPF kube-proxy replacement
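For reference, disabling the default CNI and kube-proxy maps to a small cluster-level patch in the Talos machine config; a sketch (the local name is illustrative):

```hcl
# Sketch: cluster-level patch that turns off the default CNI and kube-proxy,
# leaving pod networking to a separately applied Cilium install
locals {
  cni_patch = yamlencode({
    cluster = {
      network = {
        cni = { name = "none" } # Talos installs no CNI
      }
      proxy = { disabled = true } # Cilium's eBPF replacement takes over
    }
  })
}
```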
Registry Mirrors
A key feature is container registry mirror configuration:
```hcl
configuration = {
  cluster = {
    registry_mirrors = {
      "ghcr.io" = {
        endpoints     = ["https://harbor.example.com/v2/gh"]
        override_path = true
      }
      "registry.k8s.io" = {
        endpoints     = ["https://harbor.example.com/v2/k8s"]
        override_path = true
      }
      "docker.io" = {
        endpoints     = ["https://harbor.example.com/v2/dh"]
        override_path = true
      }
      "quay.io" = {
        endpoints     = ["https://harbor.example.com/v2/qi"]
        override_path = true
      }
      "factory.talos.dev" = {
        endpoints     = ["https://harbor.example.com/v2/talos"]
        override_path = true
      }
    }
  }
}
```

All container pulls route through my Harbor registry — essential for air-gapped homelabs.
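Under the hood, a map like this can be translated into the `machine.registries` section of the Talos machine config; a sketch (the local name is illustrative):

```hcl
# Sketch: converting the registry_mirrors map into a Talos config patch
locals {
  registry_patch = yamlencode({
    machine = {
      registries = {
        mirrors = {
          for registry, mirror in var.configuration.cluster.registry_mirrors : registry => {
            endpoints    = mirror.endpoints
            overridePath = mirror.override_path # keep the /v2/<project> path on the mirror
          }
        }
      }
    }
  })
}
```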
Multi-Network Support
The module provisions VMs with multiple network interfaces:
```hcl
network_devices = [
  for network_name, network in each.value.networks : {
    name    = network_name
    enabled = true
    bridge  = network_name
    ipv4 = {
      address = network.address
      gateway = network.gateway
    }
  }
]
```

My production setup uses:
- dmz — frontend network with gateway (192.168.62.0/24)
- vmbr1 — backend network for inter-node communication (192.168.192.0/24)
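A node attached to both networks would look like this in the module input (the vmbr1 address shown here is illustrative):

```hcl
# One worker on both networks; only dmz carries a default gateway
{
  size = "worker"
  networks = {
    dmz   = { address = "192.168.62.24/24", gateway = "192.168.62.1" }
    vmbr1 = { address = "192.168.192.24/24", gateway = null } # backend, no gateway
  }
}
```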
Cluster Bootstrap
The bootstrap sequence is orchestrated by Terraform:
```hcl
# 1. Generate machine secrets
resource "talos_machine_secrets" "this" {}

# 2. Apply control plane configuration
resource "talos_machine_configuration_apply" "controlplane" {
  for_each = { for idx, node in var.configuration.control_plane_nodes.nodes : idx => node }

  node                        = module.control_plane_virtual_machine[each.key].virtual_machine.id
  client_configuration        = talos_machine_secrets.this.client_configuration
  machine_configuration_input = data.talos_machine_configuration.configurations[each.key].machine_configuration
}

# 3. Bootstrap the cluster (runs once, against the first control plane node)
resource "talos_machine_bootstrap" "this" {
  node                 = var.configuration.control_plane_nodes.nodes[0].name
  client_configuration = talos_machine_secrets.this.client_configuration
}
```

GitOps Bootstrap
One of the most powerful features — Flux or ArgoCD can be bootstrapped during cluster creation:
```hcl
configuration = {
  cluster = {
    gitops = {
      provider      = "flux" # or "argocd"
      namespace     = "flux-system"
      chart_version = "2.18.2"

      bootstrap = {
        repo_url              = "https://github.com/your-org/applications.git"
        revision              = "main"
        path                  = "src/k8s/prod"
        destination_namespace = "homelab"
      }
    }
  }
}
```

During bootstrap, the module:
- installs Flux via an inline manifest applied with the Talos machine config
- configures it to sync from the applications-homelab repository
- lets the cluster start deploying apps immediately after boot
```mermaid
sequenceDiagram
    participant TF as Terraform
    participant Talos as Talos
    participant Flux as Flux
    participant GH as GitHub
    participant K8s as Kubernetes
    TF->>Talos: Apply machine config
    Talos->>Talos: Bootstrap control plane
    Talos->>Flux: Install Flux CRDs
    Flux->>GH: Clone applications-homelab
    GH-->>Flux: Return repo contents
    Flux->>K8s: Deploy applications
```
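The inline-manifest step in the diagram can be sketched in Terraform like this, assuming Flux is rendered with the Helm provider (the data source name and chart repository are assumptions, not the module’s actual code):

```hcl
# Sketch: rendering Flux and injecting it as a Talos inline manifest
# so it is applied during bootstrap
data "helm_template" "flux" {
  name       = "flux"
  repository = "https://fluxcd-community.github.io/helm-charts"
  chart      = "flux2"
  version    = var.configuration.cluster.gitops.chart_version
  namespace  = var.configuration.cluster.gitops.namespace
}

locals {
  flux_patch = yamlencode({
    cluster = {
      inlineManifests = [
        {
          name     = "flux"
          contents = data.helm_template.flux.manifest
        }
      ]
    }
  })
}
```

Inline manifests are applied by Talos itself once the control plane is up, which is what lets GitOps start before any human touches the cluster.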
Cilium Integration
For advanced networking, the default CNI can be replaced with bundled Cilium:
```hcl
configuration = {
  cluster = {
    # Disable Talos-managed CNI and kube-proxy
    options = {
      disable_default_cni = true
      disable_kube_proxy  = true
    }

    # Configure Cilium via Helm values
    helm_values_override = {
      cilium = {
        operator = { replicas = 1 }
      }
    }
  }
}
```

The module uses the Helm provider to template the Cilium manifest:
```hcl
data "helm_template" "cilium" {
  name       = "cilium"
  repository = "https://helm.cilium.io/"
  chart      = "cilium"
  version    = var.configuration.cluster.cilium_version # Cilium chart version, distinct from the Talos version
  namespace  = "cilium"
  values     = [yamlencode(var.configuration.cluster.helm_values_override.cilium)]
}
```

Node Sizing
The node_size_configuration block keeps definitions DRY:
```hcl
node_size_configuration = {
  control_plane = {
    cpu     = 4
    memory  = 8192 # MB
    os_disk = 128  # GB
  }
  worker = {
    cpu       = 10
    memory    = 49152 # MB
    os_disk   = 128
    data_disk = 512 # Extra data disk for PVs
  }
}
```

My prod-k8s cluster:
- 3 control plane nodes: 4 vCPU, 8GB RAM, 128GB disk
- 3 worker nodes: 10 vCPU, 48GB RAM, 128GB OS + 512GB data
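As a sanity check, total worker capacity can be derived from the size map (the worker count is hard-coded here for illustration):

```hcl
# Sketch: total schedulable worker capacity for 3 workers
locals {
  worker_count  = 3
  worker_cpu    = local.worker_count * 10    # 30 vCPU
  worker_memory = local.worker_count * 49152 # 147456 MB = 144 GB
}
```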
Host Pool Scheduling
VMs are distributed across Proxmox nodes via modulo arithmetic:
# In nodes.tf
node_name = var.configuration.control_plane_nodes.host_pool[
each.key % length(var.configuration.control_plane_nodes.host_pool)
]With 3 nodes and 6 worker indices:
- Worker 0 → alpha (0 % 3 = 0)
- Worker 1 → charlie (1 % 3 = 1)
- Worker 2 → foxtrot (2 % 3 = 2)
- Worker 3 → alpha (3 % 3 = 0)
- Worker 4 → charlie (4 % 3 = 1)
- Worker 5 → foxtrot (5 % 3 = 2)
This ensures even distribution across the cluster.
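The same distribution can be reproduced with a small locals block:

```hcl
# Sketch: modulo placement over a 3-host pool for 6 worker indices
locals {
  host_pool = ["alpha", "charlie", "foxtrot"]
  placement = {
    for idx in range(6) : idx => local.host_pool[idx % length(local.host_pool)]
  }
  # { 0 = "alpha", 1 = "charlie", 2 = "foxtrot", 3 = "alpha", 4 = "charlie", 5 = "foxtrot" }
}
```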
Outputs
The module returns cluster credentials for external use:
```hcl
output "cluster_credentials" {
  value = {
    kubeconfig  = talos_cluster_kubeconfig.this.kubeconfig_raw
    talosconfig = data.talos_client_configuration.this.talos_config

    # Config files are also written locally when debug = true
    talosconfig_path = local.talosconfig_path
    kubeconfig_path  = local.kubeconfig_path
  }
  sensitive = true
}
```

Credentials are automatically stored in Bitwarden:
```hcl
resource "bitwarden-secrets_secret" "kubernetes_kubeconfig" {
  key   = "${local.cluster_name}-kubeconfig"
  value = module.kubernetes[0].cluster_credentials.kubeconfig
}

resource "bitwarden-secrets_secret" "kubernetes_talosconfig" {
  key   = "${local.cluster_name}-talosconfig"
  value = module.kubernetes[0].cluster_credentials.talosconfig
}
```

My Production Configuration
Here’s the actual production YAML configuration:
```yaml
# configurations/kubernetes/prod-k8s.yaml
cluster:
  name: prod-k8s
  datastore:
    id: nas
    node: alpha
  talos:
    version: v1.12.4
    installer_mirror: harbor.example.com/talos
    iso_mirror: https://proxy.example.com/
  kubernetes_version: v1.35.0
  registry_mirrors:
    ghcr.io: { endpoints: [https://harbor.example.com/v2/gh], override_path: true }
    registry.k8s.io: { endpoints: [https://harbor.example.com/v2/k8s], override_path: true }
    docker.io: { endpoints: [https://harbor.example.com/v2/dh], override_path: true }
    quay.io: { endpoints: [https://harbor.example.com/v2/qi], override_path: true }
    factory.talos.dev: { endpoints: [https://harbor.example.com/v2/talos], override_path: true }
  options:
    disable_default_cni: true
    disable_kube_proxy: true
    disable_scheduling_on_control_plane: true
  gitops:
    provider: flux
    bootstrap:
      repo_url: https://github.com/your-org/applications.git
      path: src/k8s/prod
      destination_namespace: homelab

host_pool:
  alpha: { datastore_id: local-lvm }
  charlie: { datastore_id: local-lvm }
  foxtrot: { datastore_id: local-lvm }

control_plane_nodes:
  nodes: [...] # 3 control planes
  host_pool: [alpha, charlie, foxtrot]
  vip: { enabled: true, address: 192.168.62.20 }

worker_nodes:
  nodes: [...] # 3 workers
  host_pool: [alpha, charlie, foxtrot]

node_size_configuration:
  control_plane: { cpu: 4, memory: 8192, os_disk: 128 }
  worker: { cpu: 10, memory: 49152, os_disk: 128, data_disk: 512 }
```

What’s Next
Current areas of exploration:
- Multi-cluster federation — connecting Talos clusters for workload distribution
- Nested Talos — running Talos inside Proxmox for testing
- Observability — centralized logging with Loki and Grafana
What Most People Get Wrong
- “Talos upgrades break clusters” — With proper machine configs and registry mirrors, upgrades are rolling. The immutability is a feature, not a bug.
- “Air-gapped is impossible” — Talos’s registry mirror config plus the image factory handles this. Your nodes don’t need public internet access.
- “No shell means no logging” — Talos has built-in `talosctl logs` and `talosctl metrics`. It’s different from Kubernetes logging, not less capable.
When to Use / When NOT to Use
| Use Talos | Stick with kubeadm |
|---|---|
| Want declarative infrastructure | Need full kubelet control |
| Air-gapped environments | Custom init systems required |
| Single apply to cluster | Manual certificate management needed |
The foundation is solid — every cluster can be versioned, reviewed, and rolled back.