<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Homelab on zharif.my</title>
        <link>https://zharif.my/tags/homelab/</link>
        <description>Recent content in Homelab on zharif.my</description>
        <generator>Hugo -- gohugo.io</generator>
        <language>en-us</language>
        <lastBuildDate>Wed, 22 Apr 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://zharif.my/tags/homelab/index.xml" rel="self" type="application/rss+xml" /><item>
        <title>GitHub Organization as Code with Terraform</title>
        <link>https://zharif.my/posts/github-management-plane/</link>
        <pubDate>Wed, 22 Apr 2026 00:00:00 +0000</pubDate>
        
        <guid>https://zharif.my/posts/github-management-plane/</guid>
        <description>&lt;img src="https://images.unsplash.com/photo-1556075798-4825dfaaf498?w=800&amp;h=400&amp;fit=crop" alt="Featured image of post GitHub Organization as Code with Terraform" /&gt;&lt;h2 id=&#34;why-this-matters&#34;&gt;Why This Matters
&lt;/h2&gt;&lt;p&gt;If you&amp;rsquo;ve ever tried to explain to your team that &amp;ldquo;we can&amp;rsquo;t create a new repo right now, I&amp;rsquo;m at dinner,&amp;rdquo; you understand why GitHub management should be code. Every infrastructure change in my homelab goes through code review — including how we manage GitHub itself.&lt;/p&gt;
&lt;p&gt;The problem: GitHub&amp;rsquo;s web UI is fine for 3 repos, painful for 30+. You can&amp;rsquo;t track who changed what, can&amp;rsquo;t enforce naming conventions, and can&amp;rsquo;t ensure consistency across repositories.&lt;/p&gt;
&lt;p&gt;The solution: treat your GitHub organization like database infrastructure. Define everything in YAML, let Terraform handle the drift, and sleep better at night.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Scale&lt;/strong&gt;: This setup manages 24+ repositories across my organization with full configuration, teams, secrets, and custom properties.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;All Terraform resources support &lt;code&gt;lifecycle { create_before_destroy = true }&lt;/code&gt; for zero-downtime deployments.&lt;/p&gt;
&lt;/blockquote&gt;
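&lt;p&gt;In practice that is a &lt;code&gt;lifecycle&lt;/code&gt; block on each resource. A minimal sketch (the repository here is a stand-in):&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;# Sketch: the lifecycle setting the note above refers to
resource &amp;#34;github_repository&amp;#34; &amp;#34;example&amp;#34; {
  name = &amp;#34;example&amp;#34;

  lifecycle {
    create_before_destroy = true
  }
}&lt;/code&gt;&lt;/pre&gt;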
&lt;h2 id=&#34;architecture&#34;&gt;Architecture
&lt;/h2&gt;&lt;p&gt;The &lt;code&gt;github-management-plane&lt;/code&gt; repository manages my GitHub organization through Terraform — the same configuration-driven pattern as my homelab infrastructure:&lt;/p&gt;
&lt;pre class=&#34;mermaid&#34;&gt;
  graph TB
    G[github-management-plane]
    R[Repository Module]
    S[Secrets/Variables Module]
    O[Organization Custom Properties]
    
    G --&amp;gt; R
    G --&amp;gt; S
    G --&amp;gt; O
    
    Repos[&amp;#34;All Repositories&amp;#34;]
    Vars[&amp;#34;Actions Variables&amp;#34;]
    Secs[&amp;#34;Actions Secrets&amp;#34;]
    
    Repos --&amp;gt; tf-infra-homelab[&amp;#34;tf-infra-homelab&amp;#34;]
    Repos --&amp;gt; tf-module-proxmox-talos[&amp;#34;tf-module-proxmox-talos&amp;#34;]
    Repos --&amp;gt; applications-homelab[&amp;#34;applications-homelab&amp;#34;]
    Repos --&amp;gt; More[&amp;#34;24+ repositories&amp;#34;]
&lt;/pre&gt;

&lt;h2 id=&#34;what-this-module-does&#34;&gt;What This Module Does
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Repository management&lt;/strong&gt; — create and configure all repositories via YAML&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Team management&lt;/strong&gt; — define teams and members&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Organization custom properties&lt;/strong&gt; — classify repositories&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Actions secrets/variables&lt;/strong&gt; — manage organization-wide secrets&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Issue labels&lt;/strong&gt; — standardize labels across repositories&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;quick-start&#34;&gt;Quick Start
&lt;/h2&gt;&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;module &amp;#34;repositories&amp;#34; {
  source   = &amp;#34;./modules/repository&amp;#34;
  for_each = local.filtered_repo_configurations
  
  configuration = each.value
  organization  = var.github_organization
}&lt;/code&gt;&lt;/pre&gt;&lt;hr&gt;
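&lt;p&gt;&lt;code&gt;local.filtered_repo_configurations&lt;/code&gt; is built in &lt;code&gt;locals.tf&lt;/code&gt;, which loads every repository YAML and keeps the enabled ones. A minimal sketch of that loading logic (names assumed; the real file isn&amp;rsquo;t shown in this post):&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;# Sketch: load configurations/repository/*.yaml and filter on `enabled`
locals {
  repo_configuration_files = fileset(&amp;#34;${path.module}/configurations/repository&amp;#34;, &amp;#34;*.yaml&amp;#34;)

  repo_configurations = {
    for f in local.repo_configuration_files :
    trimsuffix(f, &amp;#34;.yaml&amp;#34;) =&amp;gt; yamldecode(file(&amp;#34;${path.module}/configurations/repository/${f}&amp;#34;))
  }

  # Only repositories explicitly marked enabled are managed
  filtered_repo_configurations = {
    for name, cfg in local.repo_configurations : name =&amp;gt; cfg
    if try(cfg.enabled, false)
  }
}&lt;/code&gt;&lt;/pre&gt;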
&lt;h2 id=&#34;repository-management&#34;&gt;Repository Management
&lt;/h2&gt;&lt;p&gt;All repositories are defined as YAML configurations:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;# configurations/repository/tf-infra-homelab.yaml
name: tf-infra-homelab
description: A terraform infrastructure repository for managing my homelab environment
enabled: true
archived: false
visibility: private
type: terraform-infrastructure

topics:
  - homelab
  - proxmox

enabled_features:
  vulnerability_alerts: true
  issues: true
  wiki: false
  projects: false
  discussions: false&lt;/code&gt;&lt;/pre&gt;&lt;h3 id=&#34;repository-types&#34;&gt;Repository Types
&lt;/h3&gt;&lt;p&gt;The module supports different repository types with defaults:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;locals {
  repository_types = {
    terraform-infrastructure = {
      license_template = &amp;#34;mit&amp;#34;
      auto_init        = true
      topics           = [&amp;#34;terraform&amp;#34;, &amp;#34;homelab&amp;#34;]
    }
    terraform-module = {
      license_template = &amp;#34;mit&amp;#34;
      auto_init        = true
      topics           = [&amp;#34;terraform&amp;#34;, &amp;#34;proxmox&amp;#34;]
    }
    python-docker-application = {
      license_template = &amp;#34;mit&amp;#34;
      auto_init        = true
      topics           = [&amp;#34;python&amp;#34;, &amp;#34;docker&amp;#34;]
    }
    generic = {
      auto_init = true
    }
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;h3 id=&#34;repository-resource&#34;&gt;Repository Resource
&lt;/h3&gt;&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;resource &amp;#34;github_repository&amp;#34; &amp;#34;this&amp;#34; {
  name        = var.configuration.name
  description = var.configuration.description
  visibility  = var.configuration.visibility
  
  allow_rebase_merge     = true
  allow_squash_merge     = true
  delete_branch_on_merge = true
  
  vulnerability_alerts = var.configuration.enabled_features.vulnerability_alerts
  has_discussions      = var.configuration.enabled_features.discussions
  has_issues           = var.configuration.enabled_features.issues
  has_projects         = var.configuration.enabled_features.projects
  has_wiki             = var.configuration.enabled_features.wiki
}&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;organization-custom-properties&#34;&gt;Organization Custom Properties
&lt;/h2&gt;&lt;p&gt;Custom properties allow classification and filtering:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;locals {
  organization_custom_properties = {
    &amp;#34;can-be-public&amp;#34; = {
      description    = &amp;#34;To indicate whether the repository can be made public&amp;#34;
      value_type     = &amp;#34;single_select&amp;#34;
      required       = true
      allowed_values = [&amp;#34;true&amp;#34;, &amp;#34;false&amp;#34;]
      default_value  = &amp;#34;false&amp;#34;
    }
    
    &amp;#34;managed-by&amp;#34; = {
      description    = &amp;#34;To identify who manages the repository&amp;#34;
      value_type     = &amp;#34;single_select&amp;#34;
      required       = true
      allowed_values = [
        &amp;#34;github-management-plane&amp;#34;,
        &amp;#34;manual-management&amp;#34;,
      ]
      default_value = &amp;#34;manual-management&amp;#34;
    }
    
    &amp;#34;repository-type&amp;#34; = {
      description    = &amp;#34;To indicate the type of repository&amp;#34;
      value_type     = &amp;#34;single_select&amp;#34;
      required       = true
      allowed_values = [
        &amp;#34;generic&amp;#34;,
        &amp;#34;repository-template&amp;#34;,
        &amp;#34;golang-linux-package&amp;#34;,
        &amp;#34;golang-docker-application&amp;#34;,
        &amp;#34;python-docker-application&amp;#34;,
        &amp;#34;python-package&amp;#34;,
        &amp;#34;terraform-infrastructure&amp;#34;,
        &amp;#34;terraform-module&amp;#34;,
      ]
    }
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Apply them:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;resource &amp;#34;github_organization_custom_properties&amp;#34; &amp;#34;managed_properties&amp;#34; {
  for_each = local.organization_custom_properties
  
  property_name  = each.key
  value_type     = each.value.value_type
  required       = each.value.required
  description    = each.value.description
  default_value  = try(each.value.default_value, null) # not every property defines one
  allowed_values = each.value.allowed_values
}&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;secrets-and-variables&#34;&gt;Secrets and Variables
&lt;/h2&gt;&lt;p&gt;Actions secrets and variables are managed through Bitwarden:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;# configurations/secrets_variables/global.yaml
variables:
  - name: RUNNER
    value: self-hosted
    visibility: all

secrets:
  - name: BWS_ACCESS_TOKEN
    is_manual: true
    visibility: private&lt;/code&gt;&lt;/pre&gt;&lt;h3 id=&#34;variable-resource&#34;&gt;Variable Resource
&lt;/h3&gt;&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;resource &amp;#34;github_actions_organization_variable&amp;#34; &amp;#34;managed_variables&amp;#34; {
  for_each = { for v in var.configuration.variables : v.name =&amp;gt; v }
  
  variable_name = each.value.name
  visibility    = each.value.visibility
  value         = each.value.value
}&lt;/code&gt;&lt;/pre&gt;&lt;h3 id=&#34;secret-resource&#34;&gt;Secret Resource
&lt;/h3&gt;&lt;p&gt;For secrets, two patterns:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;# Manual secrets (value set outside Terraform)
resource &amp;#34;github_actions_organization_secret&amp;#34; &amp;#34;managed_secrets_manual&amp;#34; {
  for_each = { for v in var.configuration.secrets : v.name =&amp;gt; v if v.is_manual }
  
  secret_name     = each.value.name
  visibility      = each.value.visibility
  plaintext_value = &amp;#34;NONE&amp;#34; # placeholder; the real value is set outside Terraform
}

# Synced secrets (from Bitwarden)
resource &amp;#34;github_actions_organization_secret&amp;#34; &amp;#34;managed_secrets_sync&amp;#34; {
  for_each = { for v in var.configuration.secrets : v.name =&amp;gt; v if !v.is_manual }
  
  secret_name     = each.value.name
  visibility      = each.value.visibility
  plaintext_value = data.bitwarden-secrets_secret.secrets[each.value.name].value
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Get secrets from Bitwarden:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;data &amp;#34;bitwarden-secrets_secret&amp;#34; &amp;#34;secrets&amp;#34; {
  for_each = { for v in local.all_secrets : v.name =&amp;gt; v if !v.is_manual }
  id       = each.value.bw_secret_id
}&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;team-management&#34;&gt;Team Management
&lt;/h2&gt;&lt;p&gt;Teams are defined in the main module:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;resource &amp;#34;github_team&amp;#34; &amp;#34;organization_administrators&amp;#34; {
  name        = &amp;#34;organization-administrators&amp;#34;
  description = &amp;#34;Team with administrative access to the organization&amp;#34;
  privacy     = &amp;#34;closed&amp;#34;
}

resource &amp;#34;github_team_members&amp;#34; &amp;#34;organization_administrators_members&amp;#34; {
  team_id = github_team.organization_administrators.id
  
  members = {
    &amp;#34;your-username&amp;#34; = {
      role = &amp;#34;maintainer&amp;#34;
    }
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;issue-labels&#34;&gt;Issue Labels
&lt;/h2&gt;&lt;p&gt;Labels are standardized across repositories:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;locals {
  shared_labels = {
    &amp;#34;bug&amp;#34; = {
      color       = &amp;#34;d73a4a&amp;#34;
      description = &amp;#34;Bug report&amp;#34;
    }
    &amp;#34;enhancement&amp;#34; = {
      color       = &amp;#34;a2eeef&amp;#34;
      description = &amp;#34;New feature&amp;#34;
    }
    &amp;#34;documentation&amp;#34; = {
      color       = &amp;#34;0075ca&amp;#34;
      description = &amp;#34;Documentation improvements&amp;#34;
    }
  }
}

resource &amp;#34;github_issue_labels&amp;#34; &amp;#34;this&amp;#34; {
  repository = var.configuration.name
  
  # github_issue_labels takes a set of {name, color, description} objects,
  # so flatten the merged map into that shape
  label = [
    for name, l in merge(
      local.shared_labels,
      try(var.configuration.labels, {})
    ) : {
      name        = name
      color       = l.color
      description = l.description
    }
  ]
}&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;my-repository-configuration&#34;&gt;My Repository Configuration
&lt;/h2&gt;&lt;p&gt;Here&amp;rsquo;s the list of repositories managed:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-txt&#34;&gt;configurations/repository/
├── tf-infra-homelab.yaml          # Homelab infrastructure
├── tf-infra-github-management-plane.yaml  # GitHub management
├── tf-module-proxmox-lxc.yaml    # LXC module
├── tf-module-proxmox-vm.yaml    # VM module
├── tf-module-proxmox-talos.yaml  # Talos module
├── tf-module-proxmox-docker.yaml # Docker module
├── applications-homelab.yaml   # Kustomize apps
├── cf-worker-terraform-registry.yaml  # TF registry worker
├── cf-worker-apt-repository.yaml   # APT repository worker
├── template-terraform-basic.yaml   # Module template
├── template-cloudflare-worker-python.yaml  # Worker template
└── ... (24 total)&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Each is just a YAML file — adding a new repository is adding a file.&lt;/p&gt;
&lt;h2 id=&#34;configuration-structure&#34;&gt;Configuration Structure
&lt;/h2&gt;&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-txt&#34;&gt;github-management-plane/
├── configurations/
│   ├── repository/          # Repository definitions
│   ├── secrets_variables/   # Actions secrets/variables
│   └── rulesets/            # Branch protection (future)
├── modules/
│   ├── repository/          # Repository module
│   ├── secrets_variables/   # Secrets module
│   └── organization/        # Org properties module
├── main.tf                  # Orchestration
├── locals.tf                # Configuration loading
└── providers.tf             # Provider setup&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This mirrors the homelab infrastructure structure.&lt;/p&gt;
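&lt;p&gt;The post doesn&amp;rsquo;t show &lt;code&gt;providers.tf&lt;/code&gt;; a minimal setup might look like this (a sketch, with the token supplied via the &lt;code&gt;GITHUB_TOKEN&lt;/code&gt; environment variable):&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;# Sketch: minimal provider wiring; the token comes from GITHUB_TOKEN
provider &amp;#34;github&amp;#34; {
  owner = var.github_organization
}&lt;/code&gt;&lt;/pre&gt;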
&lt;h2 id=&#34;outputs&#34;&gt;Outputs
&lt;/h2&gt;&lt;p&gt;The module doesn&amp;rsquo;t expose outputs: nothing downstream consumes it, so it exists purely to converge the organization on the declared state.&lt;/p&gt;
&lt;h2 id=&#34;whats-next&#34;&gt;What&amp;rsquo;s Next
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Branch protection rulesets&lt;/strong&gt; — via rulesets API (when Terraform provider supports it)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Repository invitations&lt;/strong&gt; — managing outside collaborators&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Security advisories&lt;/strong&gt; — automated security scanning&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;what-most-people-get-wrong&#34;&gt;What Most People Get Wrong
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&amp;ldquo;GitHub Terraform is just for repos&amp;rdquo;&lt;/strong&gt; — It&amp;rsquo;s org-wide: teams, custom properties, secrets, variables. The whole thing.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&amp;ldquo;One Terraform run is enough&amp;rdquo;&lt;/strong&gt; — GitHub API rate limits apply. With dozens of repositories, reduce concurrency (for example &lt;code&gt;terraform apply -parallelism=5&lt;/code&gt;) and expect to re-run after throttling.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&amp;ldquo;Manual changes are fine if you revert&amp;rdquo;&lt;/strong&gt; — Terraform drift will catch you. Run &lt;code&gt;terraform plan&lt;/code&gt; regularly so drift shows up in the diff before it bites.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;when-to-use--when-not-to-use&#34;&gt;When to Use / When NOT to Use
&lt;/h2&gt;&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Use GitHub Management Plane&lt;/th&gt;
          &lt;th&gt;Do it manually&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;20+ repositories&lt;/td&gt;
          &lt;td&gt;1-5 repos&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Team collaboration&lt;/td&gt;
          &lt;td&gt;Personal projects&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Audit requirements&lt;/td&gt;
          &lt;td&gt;Quick experiments&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Secret/variable management&lt;/td&gt;
          &lt;td&gt;Ad-hoc scripts only&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;This makes GitHub management declarative — every change goes through code review.&lt;/p&gt;
</description>
        </item>
        <item>
        <title>Self-Service Infrastructure with Backstage</title>
        <link>https://zharif.my/posts/backstage-homelab/</link>
        <pubDate>Sat, 18 Apr 2026 00:00:00 +0000</pubDate>
        
        <guid>https://zharif.my/posts/backstage-homelab/</guid>
        <description>&lt;img src="https://images.unsplash.com/photo-1551288049-bebda4e38f71?w=800&amp;h=400&amp;fit=crop" alt="Featured image of post Self-Service Infrastructure with Backstage" /&gt;&lt;h2 id=&#34;why-self-service-matters&#34;&gt;Why Self-Service Matters
&lt;/h2&gt;&lt;p&gt;In my homelab, I was the bottleneck. Every new Kubernetes cluster meant:&lt;/p&gt;
&lt;pre class=&#34;mermaid&#34;&gt;
  flowchart LR
    A[Create YAML] --&amp;gt; B[Find free IPs]
    B --&amp;gt; C[Configure node sizes]
    C --&amp;gt; D[Manually create Backstage catalog entries]
    D --&amp;gt; E[Open PR]
    E --&amp;gt; F[Wait for review]
&lt;/pre&gt;

&lt;p&gt;That&amp;rsquo;s 6 steps where 4 could be automated.&lt;/p&gt;
&lt;p&gt;The insight: the infrastructure is already defined as YAML. Backstage should consume that same YAML and generate its own catalog entries.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Real-world constraint&lt;/strong&gt;: I deploy clusters infrequently (roughly quarterly), so I forget the steps. The templated approach ensures consistency whether I do this once a month or once a year.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This post covers the Backstage integration for homelab infrastructure. See the architecture overview for how it fits in the broader platform.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&#34;two-integration-points&#34;&gt;Two Integration Points
&lt;/h2&gt;&lt;p&gt;Backstage integrates with the homelab infrastructure in two ways:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Resource Catalog&lt;/strong&gt; — auto-generated entities from infrastructure YAML configurations&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Software Templates&lt;/strong&gt; — scaffolder templates for self-service provisioning&lt;/li&gt;
&lt;/ol&gt;
&lt;pre class=&#34;mermaid&#34;&gt;
  graph LR
    subgraph &amp;#34;Configuration&amp;#34;
        C[configurations/*.yaml]
    end
    
    subgraph &amp;#34;Generation&amp;#34;
        G[generate_backstage_catalog.py]
    end
    
    subgraph &amp;#34;Backstage&amp;#34;
        R[Resource Catalog]
        T[Software Templates]
    end
    
    C --&amp;gt; G
    G --&amp;gt; R
    G --&amp;gt; T
&lt;/pre&gt;

&lt;h2 id=&#34;auto-generated-catalog&#34;&gt;Auto-Generated Catalog
&lt;/h2&gt;&lt;p&gt;Running &lt;code&gt;make backstage-catalog&lt;/code&gt; generates Backstage Resource entities from configurations:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34;&gt;make backstage-catalog
# ✓ kubernetes--prod-k8s.yaml (prod-k8s, disabled)
# ✓ kubernetes--dev-k8s.yaml (dev-k8s, disabled)
# ✓ docker--prod-docker-lxc.yaml (prod-docker-lxc, disabled)
# ✓ docker--dev-docker-lxc.yaml (dev-docker-lxc, enabled)&lt;/code&gt;&lt;/pre&gt;&lt;h3 id=&#34;generated-entity-example&#34;&gt;Generated Entity Example
&lt;/h3&gt;&lt;p&gt;Each generated YAML file is a Backstage Resource:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;# backstage/catalog/kubernetes--prod-k8s.yaml
apiVersion: backstage.io/v1alpha1
kind: Resource
metadata:
  name: prod-k8s
  description: Talos kubernetes cluster configuration for production environment
  annotations:
    github.com/project-slug: your-org/your-infra-repo
    homelab.dev/configuration-file: configurations/kubernetes/prod-k8s.yaml
    homelab.dev/resource-type: kubernetes
    homelab.dev/schema-file: configuration_schemas/kubernetes.schema.yaml
    backstage.io/techdocs-entity: component:terraform-module-kubernetes
    homelab.dev/cluster-name: prod-k8s
    homelab.dev/talos-version: v1.12.4
    homelab.dev/kubernetes-version: v1.35.0
    homelab.dev/vip-address: 192.168.62.20
  labels:
    homelab.dev/enabled: &amp;#39;false&amp;#39;
    homelab.dev/environment: prod
    homelab.dev/control-plane-count: &amp;#39;3&amp;#39;
    homelab.dev/worker-count: &amp;#39;3&amp;#39;
    homelab.dev/size-control_plane-cpu: &amp;#39;4&amp;#39;
    homelab.dev/size-control_plane-memory: &amp;#39;8192&amp;#39;
    homelab.dev/size-worker-cpu: &amp;#39;10&amp;#39;
    homelab.dev/size-worker-memory: &amp;#39;49152&amp;#39;
  tags:
    - disabled
    - kubernetes
    - prod
    - proxmox
    - talos
spec:
  type: kubernetes-cluster
  lifecycle: experimental
  owner: group:default/homelab-admins
  system: tf-infra-homelab
  dependsOn:
    - component:default/terraform-module-kubernetes&lt;/code&gt;&lt;/pre&gt;&lt;h3 id=&#34;docker-cluster-entity&#34;&gt;Docker Cluster Entity
&lt;/h3&gt;&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;# backstage/catalog/docker--prod-docker-lxc.yaml
apiVersion: backstage.io/v1alpha1
kind: Resource
metadata:
  name: prod-docker-lxc
  description: Docker configuration on lxc for production environment
  annotations:
    github.com/project-slug: your-org/your-infra-repo
    homelab.dev/configuration-file: configurations/docker/prod-docker-lxc.yaml
    homelab.dev/resource-type: docker
    homelab.dev/cluster-name: prod-docker-lxc
    homelab.dev/cluster-type: lxc
    homelab.dev/vip-address: 192.168.61.20
  labels:
    homelab.dev/enabled: &amp;#39;false&amp;#39;
    homelab.dev/environment: prod
    homelab.dev/worker-count: &amp;#39;3&amp;#39;
    homelab.dev/size-medium-cpu: &amp;#39;8&amp;#39;
    homelab.dev/size-medium-memory: &amp;#39;32768&amp;#39;
  tags:
    - disabled
    - docker
    - lxc
    - prod
    - proxmox
spec:
  type: docker-cluster
  lifecycle: experimental
  owner: group:default/homelab-admins
  system: tf-infra-homelab
  dependsOn:
    - component:default/terraform-module-docker&lt;/code&gt;&lt;/pre&gt;&lt;h3 id=&#34;metadata-extraction&#34;&gt;Metadata Extraction
&lt;/h3&gt;&lt;p&gt;The generation script extracts key metadata from configurations:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;# scripts/generate_backstage_catalog.py

def extract_kubernetes_metadata(config: dict) -&amp;gt; dict:
    &amp;#34;&amp;#34;&amp;#34;Extract catalog-relevant metadata from a Kubernetes configuration.&amp;#34;&amp;#34;&amp;#34;
    annotations = {}
    labels = {}
    
    cluster = config.get(&amp;#34;cluster&amp;#34;, {})
    annotations[&amp;#34;homelab.dev/cluster-name&amp;#34;] = cluster.get(&amp;#34;name&amp;#34;, &amp;#34;&amp;#34;)
    
    talos = cluster.get(&amp;#34;talos&amp;#34;, {}) or {}
    annotations[&amp;#34;homelab.dev/talos-version&amp;#34;] = talos.get(&amp;#34;version&amp;#34;, &amp;#34;&amp;#34;)
    annotations[&amp;#34;homelab.dev/kubernetes-version&amp;#34;] = cluster.get(&amp;#34;kubernetes_version&amp;#34;, &amp;#34;&amp;#34;)
    
    cp_nodes = config.get(&amp;#34;control_plane_nodes&amp;#34;, {}).get(&amp;#34;nodes&amp;#34;, [])
    worker_nodes = config.get(&amp;#34;worker_nodes&amp;#34;, {}).get(&amp;#34;nodes&amp;#34;, [])
    labels[&amp;#34;homelab.dev/control-plane-count&amp;#34;] = str(len(cp_nodes))
    labels[&amp;#34;homelab.dev/worker-count&amp;#34;] = str(len(worker_nodes))
    
    cp_vip = config.get(&amp;#34;control_plane_nodes&amp;#34;, {}).get(&amp;#34;vip&amp;#34;, {})
    if cp_vip and cp_vip.get(&amp;#34;enabled&amp;#34;):
        annotations[&amp;#34;homelab.dev/vip-address&amp;#34;] = cp_vip.get(&amp;#34;address&amp;#34;, &amp;#34;&amp;#34;)
    
    sizes = config.get(&amp;#34;node_size_configuration&amp;#34;, {})
    for size_name, size_spec in sizes.items():
        labels[f&amp;#34;homelab.dev/size-{size_name}-cpu&amp;#34;] = str(size_spec.get(&amp;#34;cpu&amp;#34;, &amp;#34;&amp;#34;))
        labels[f&amp;#34;homelab.dev/size-{size_name}-memory&amp;#34;] = str(size_spec.get(&amp;#34;memory&amp;#34;, &amp;#34;&amp;#34;))
    
    return {&amp;#34;annotations&amp;#34;: annotations, &amp;#34;labels&amp;#34;: labels}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This enables filtering in Backstage:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;homelab.dev/environment=prod&lt;/code&gt; — production clusters&lt;/li&gt;
&lt;li&gt;&lt;code&gt;homelab.dev/enabled=true&lt;/code&gt; — currently deployed&lt;/li&gt;
&lt;li&gt;&lt;code&gt;homelab.dev/size-worker-memory=49152&lt;/code&gt; — large workers&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;catalog-info-definition&#34;&gt;Catalog-Info Definition
&lt;/h2&gt;&lt;p&gt;The root &lt;code&gt;catalog-info.yaml&lt;/code&gt; defines the domain, system, and components:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;# Domain: homelab
apiVersion: backstage.io/v1alpha1
kind: Domain
metadata:
  name: homelab
  description: Self-hosted homelab infrastructure managed with Terraform and Proxmox
  annotations:
    backstage.io/techdocs-ref: dir:.
    github.com/project-slug: your-org/your-infra-repo
spec:
  owner: group:default/homelab-admins

---
# System: tf-infra-homelab
apiVersion: backstage.io/v1alpha1
kind: System
metadata:
  name: tf-infra-homelab
  description: Terraform-managed homelab infrastructure provisioning system
spec:
  owner: group:default/homelab-admins
  domain: homelab

---
# Component: terraform-module-kubernetes
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: terraform-module-kubernetes
  description: Terraform module for provisioning Kubernetes (Talos) clusters on Proxmox
spec:
  type: terraform-module
  lifecycle: production
  owner: group:default/homelab-admins
  system: tf-infra-homelab

---
# Location: discovers auto-generated resources
apiVersion: backstage.io/v1alpha1
kind: Location
metadata:
  name: tf-infra-homelab-resources
spec:
  targets:
    - ./backstage/catalog/*.yaml&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;software-templates&#34;&gt;Software Templates
&lt;/h2&gt;&lt;p&gt;The Backstage scaffolder templates enable self-service provisioning:&lt;/p&gt;
&lt;h3 id=&#34;kubernetes-template&#34;&gt;Kubernetes Template
&lt;/h3&gt;&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;# backstage/templates/kubernetes/template.yaml
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: provision-kubernetes-cluster
  title: Provision Kubernetes Cluster
  description: Create a new Talos-based Kubernetes cluster configuration on Proxmox
  tags:
    - terraform
    - kubernetes
    - talos
    - proxmox
    - homelab
spec:
  owner: group:default/homelab-admins
  type: infrastructure
  system: tf-infra-homelab&lt;/code&gt;&lt;/pre&gt;&lt;h3 id=&#34;template-parameters&#34;&gt;Template Parameters
&lt;/h3&gt;&lt;p&gt;The template accepts parameters for cluster configuration:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;parameters:
  - title: Cluster Identity
    required:
      - name
      - environment
    properties:
      name:
        title: Cluster Name
        type: string
        pattern: &amp;#34;^[a-z][a-z0-9-]&amp;#43;$&amp;#34;
      environment:
        title: Environment
        type: string
        enum:
          - dev
          - staging
          - prod

  - title: Cluster Configuration
    properties:
      talos_version:
        title: Talos Version
        type: string
        default: v1.12.4
      kubernetes_version:
        title: Kubernetes Version
        type: string
        default: v1.35.0
      disable_default_cni:
        title: Disable Default CNI
        type: boolean
        default: true

  - title: Control Plane Nodes
    properties:
      cp_count:
        title: Number of Control Plane Nodes
        type: integer
        minimum: 1
        maximum: 5
        default: 3
      cp_cpu:
        title: CPU Cores per CP Node
        type: integer
        default: 4
      cp_memory:
        title: Memory per CP Node (MB)
        type: integer
        default: 8192

  - title: Worker Nodes
    properties:
      worker_count:
        title: Number of Workers
        type: integer
        default: 3
      worker_cpu:
        title: CPU Cores per Worker
        type: integer
        default: 8
      worker_memory:
        title: Memory per Worker (MB)
        type: integer
        default: 16384&lt;/code&gt;&lt;/pre&gt;&lt;h3 id=&#34;template-steps&#34;&gt;Template Steps
&lt;/h3&gt;&lt;p&gt;Each template has three steps:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;steps:
  - id: generate
    name: Generate Configuration
    action: fetch:template
    input:
      url: ./skeleton/kubernetes
      targetPath: configurations/kubernetes
      values:
        name: ${{ parameters.name }}
        talos_version: ${{ parameters.talos_version }}
        cp_count: ${{ parameters.cp_count }}

  - id: generate-catalog
    name: Generate Backstage Catalog Entry
    action: fetch:template
    input:
      url: ./skeleton/backstage/catalog
      targetPath: backstage/catalog

  - id: publish
    name: Open Pull Request
    action: publish:github:pull-request
    input:
      repoUrl: github.com?repo=tf-infra-homelab&amp;amp;owner=your-org
      title: &amp;#34;feat: provision Kubernetes cluster ${{ parameters.name }}&amp;#34;&lt;/code&gt;&lt;/pre&gt;&lt;h3 id=&#34;user-flow&#34;&gt;User Flow
&lt;/h3&gt;&lt;p&gt;In Backstage, users:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Choose template&lt;/strong&gt; — &amp;ldquo;Provision Kubernetes Cluster&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Fill parameters&lt;/strong&gt; — name, environment, node sizes&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Submit&lt;/strong&gt; — opens a PR automatically&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Review&lt;/strong&gt; — maintainers approve the PR&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Apply&lt;/strong&gt; — Terraform provisions the cluster&lt;/li&gt;
&lt;/ol&gt;
&lt;pre class=&#34;mermaid&#34;&gt;
  sequenceDiagram
    participant User
    participant Maintainer
    participant BS as Backstage
    participant GH as GitHub
    participant TF as Terraform
    participant PVE as Proxmox
    participant Talos
    participant Flux
    
    User-&amp;gt;&amp;gt;BS: Create new cluster (fill form)
    BS-&amp;gt;&amp;gt;GH: Open PR with config files
    Maintainer-&amp;gt;&amp;gt;GH: Review and approve PR
    GH-&amp;gt;&amp;gt;TF: Merge triggers apply
    TF-&amp;gt;&amp;gt;PVE: Provision VMs
    PVE-&amp;gt;&amp;gt;Talos: Bootstrap cluster
    Talos-&amp;gt;&amp;gt;Flux: Install GitOps
&lt;/pre&gt;

&lt;h2 id=&#34;generation-script&#34;&gt;Generation Script
&lt;/h2&gt;&lt;p&gt;The script lives in &lt;code&gt;scripts/generate_backstage_catalog.py&lt;/code&gt;; abridged:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;#!/usr/bin/env python3
&amp;#34;&amp;#34;&amp;#34;Generate Backstage catalog Resource entities from configuration YAML files.&amp;#34;&amp;#34;&amp;#34;

import yaml
from pathlib import Path

REPO_ROOT = Path(__file__).resolve().parent.parent

def load_yaml(path: Path) -&amp;gt; dict:
    return yaml.safe_load(path.read_text()) or {}

def build_resource_entity(resource_type, environment_name, config):
    &amp;#34;&amp;#34;&amp;#34;Build a Backstage Resource entity from configuration.&amp;#34;&amp;#34;&amp;#34;
    entity = {
        &amp;#34;apiVersion&amp;#34;: &amp;#34;backstage.io/v1alpha1&amp;#34;,
        &amp;#34;kind&amp;#34;: &amp;#34;Resource&amp;#34;,
        &amp;#34;metadata&amp;#34;: {
            &amp;#34;name&amp;#34;: config.get(&amp;#34;name&amp;#34;, environment_name),
            &amp;#34;description&amp;#34;: config.get(&amp;#34;description&amp;#34;, &amp;#34;&amp;#34;),
            &amp;#34;annotations&amp;#34;: {...},
            &amp;#34;labels&amp;#34;: {...},
        },
        &amp;#34;spec&amp;#34;: {
            &amp;#34;type&amp;#34;: RESOURCE_TYPE_META[resource_type][&amp;#34;backstage_type&amp;#34;],
            &amp;#34;lifecycle&amp;#34;: &amp;#34;experimental&amp;#34;,
            &amp;#34;owner&amp;#34;: &amp;#34;group:default/homelab-admins&amp;#34;,
            &amp;#34;system&amp;#34;: &amp;#34;tf-infra-homelab&amp;#34;,
        },
    }
    return entity

def main():
    for config_file in (REPO_ROOT / &amp;#34;configurations&amp;#34;).rglob(&amp;#34;*.yaml&amp;#34;):
        config = load_yaml(config_file)
        entity = build_resource_entity(...)
        # e.g. configurations/kubernetes/prod-k8s.yaml -&amp;gt; kubernetes--prod-k8s.yaml
        output_file = REPO_ROOT / &amp;#34;backstage&amp;#34; / &amp;#34;catalog&amp;#34; / f&amp;#34;{config_file.parent.name}--{config_file.stem}.yaml&amp;#34;
        output_file.write_text(yaml.dump(entity))&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Run via Makefile:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-makefile&#34;&gt;backstage-catalog:
  python3 scripts/generate_backstage_catalog.py&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;filtering-in-backstage&#34;&gt;Filtering in Backstage
&lt;/h2&gt;&lt;p&gt;With the generated metadata, users can filter in Backstage:&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Filter&lt;/th&gt;
          &lt;th&gt;Use Case&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;homelab.dev/environment=prod&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;Production clusters&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;homelab.dev/enabled=true&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;Currently deployed&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;homelab.dev/control-plane-count=3&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;Full quorum&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;homelab.dev/size-worker-memory&amp;gt;=16384&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;Large workers&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;homelab.dev/talos-version=v1.12.*&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;Specific Talos version&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;what-most-people-get-wrong&#34;&gt;What Most People Get Wrong
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&amp;ldquo;Backstage is only for Kubernetes&amp;rdquo;&lt;/strong&gt; — It catalogs anything. My LXC containers, VMs, and Docker clusters all have Backstage entries.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&amp;ldquo;Templates replace code review&amp;rdquo;&lt;/strong&gt; — My templates generate PRs. Human review still happens. Self-service ≠ unattended.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&amp;ldquo;Catalog must be perfect at launch&amp;rdquo;&lt;/strong&gt; — Start simple. The YAML-to-catalog pipeline can always regenerate.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;when-to-use--when-not-to-use&#34;&gt;When to Use / When NOT to Use
&lt;/h2&gt;&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Use Backstage&lt;/th&gt;
          &lt;th&gt;Use direct Terraform&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;Team self-service&lt;/td&gt;
          &lt;td&gt;Single admin&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;10+ resources&lt;/td&gt;
          &lt;td&gt;&amp;lt;5 resources&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Need catalog UI&lt;/td&gt;
          &lt;td&gt;CLI is enough&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;whats-next&#34;&gt;What&amp;rsquo;s Next
&lt;/h2&gt;&lt;p&gt;Current areas of exploration:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;More templates&lt;/strong&gt; — VM and LXC provisioning templates&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Approval workflows&lt;/strong&gt; — notification to maintainers&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Status tracking&lt;/strong&gt; — integration with Terraform Cloud state&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The Backstage integration makes infrastructure self-serviceable while maintaining code review.&lt;/p&gt;
</description>
        </item>
        <item>
        <title>High-Availability Docker Swarm on Proxmox</title>
        <link>https://zharif.my/posts/docker-swarm-proxmox/</link>
        <pubDate>Fri, 10 Apr 2026 00:00:00 +0000</pubDate>
        
        <guid>https://zharif.my/posts/docker-swarm-proxmox/</guid>
        <description>&lt;img src="https://images.unsplash.com/photo-1605745341112-85968b19335b?w=800&amp;h=400&amp;fit=crop" alt="Featured image of post High-Availability Docker Swarm on Proxmox" /&gt;&lt;h2 id=&#34;why-docker-swarm-not-kubernetes&#34;&gt;Why Docker Swarm (Not Kubernetes)
&lt;/h2&gt;&lt;p&gt;Kubernetes is the standard, but for 5-10 containers, it&amp;rsquo;s overkill. Docker Swarm gives you:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Service discovery with zero configuration&lt;/li&gt;
&lt;li&gt;Built-in load balancing&lt;/li&gt;
&lt;li&gt;Rolling updates without custom tooling&lt;/li&gt;
&lt;li&gt;Single-node control plane if needed&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The trade-off: advanced scheduling or custom CNIs need Kubernetes. For Home Assistant, Jellyfin, and WireGuard? Swarm is simpler.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Hardware constraint&lt;/strong&gt;: GPU passthrough works on LXC via device nodes, which is simpler than PCI passthrough on VMs. That constraint drove the choice.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Proxmox LXC containers with device nodes bypass the need for PCI passthrough. This simplifies GPU access significantly.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&#34;module-capabilities&#34;&gt;Module Capabilities
&lt;/h2&gt;&lt;p&gt;The &lt;code&gt;tf-module-proxmox-docker&lt;/code&gt; module provisions LXC containers or VMs with Docker Engine installed, optionally forming a Docker Swarm cluster:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Multi-node provisioning&lt;/strong&gt; — creates LXC or VM nodes across the host pool&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Docker installation&lt;/strong&gt; — installs Docker Engine via cloud-init&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Keepalived integration&lt;/strong&gt; — optional VIP for high availability&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Device passthrough&lt;/strong&gt; — passes through /dev/apex_0 (Coral Edge TPU) and /dev/dri/* (GPU) for hardware acceleration&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Host pool scheduling&lt;/strong&gt; — round-robin distribution across Proxmox nodes&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;quick-start&#34;&gt;Quick Start
&lt;/h2&gt;&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;module &amp;#34;docker_cluster&amp;#34; {
  source  = &amp;#34;registry.example.com/namespace/tf-module-proxmox-docker/docker&amp;#34;
  version = &amp;#34;1.2.3&amp;#34;

  configuration = {
    cluster = {
      name = &amp;#34;prod-docker&amp;#34;
      type = &amp;#34;lxc&amp;#34;  # or &amp;#34;vm&amp;#34;
      datastore = { id = &amp;#34;nas&amp;#34;, node = &amp;#34;alpha&amp;#34; }
    }

    host_pool = [
      { name = &amp;#34;alpha&amp;#34;, datastore_id = &amp;#34;local-lvm&amp;#34; },
      { name = &amp;#34;charlie&amp;#34;, datastore_id = &amp;#34;local-lvm&amp;#34; },
      { name = &amp;#34;foxtrot&amp;#34;, datastore_id = &amp;#34;local-lvm&amp;#34; }
    ]

    worker_nodes = [
      {
        size = &amp;#34;medium&amp;#34;
        networks = { dmz = { address = &amp;#34;192.168.61.21/24&amp;#34;, gateway = &amp;#34;192.168.61.1&amp;#34; } }
        vip = { state = &amp;#34;MASTER&amp;#34;, priority = 100, interface = &amp;#34;dmz&amp;#34; }
      },
      {
        size = &amp;#34;medium&amp;#34;
        networks = { dmz = { address = &amp;#34;192.168.61.22/24&amp;#34;, gateway = &amp;#34;192.168.61.1&amp;#34; } }
        vip = { state = &amp;#34;BACKUP&amp;#34;, priority = 90, interface = &amp;#34;dmz&amp;#34; }
      }
    ]

    node_size_configuration = {
      medium = { cpu = 8, memory = 32768, os_disk = 256 }
    }

    vip = { enabled = true, address = &amp;#34;192.168.61.20&amp;#34; }
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;lxc-vs-vm&#34;&gt;LXC vs VM
&lt;/h2&gt;&lt;p&gt;The module supports both container and VM backends:&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Aspect&lt;/th&gt;
          &lt;th&gt;LXC&lt;/th&gt;
          &lt;th&gt;VM&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;Resource overhead&lt;/td&gt;
          &lt;td&gt;Minimal&lt;/td&gt;
          &lt;td&gt;Full hypervisor&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;GPU passthrough&lt;/td&gt;
          &lt;td&gt;Device nodes&lt;/td&gt;
          &lt;td&gt;Full PCI&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Nesting support&lt;/td&gt;
          &lt;td&gt;No&lt;/td&gt;
          &lt;td&gt;Yes&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Use case&lt;/td&gt;
          &lt;td&gt;Simple containers&lt;/td&gt;
          &lt;td&gt;Full VMs&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;# LXC-based (type = &amp;#34;lxc&amp;#34;)
configuration = {
  cluster = {
    type = &amp;#34;lxc&amp;#34;
  }
}

# VM-based (type = &amp;#34;vm&amp;#34;)
configuration = {
  cluster = {
    type = &amp;#34;vm&amp;#34;
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The VM provisioner downloads a cloud image and imports it:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;resource &amp;#34;proxmox_download_file&amp;#34; &amp;#34;vm_image&amp;#34; {
  content_type = &amp;#34;iso&amp;#34;
  datastore_id = var.configuration.cluster.datastore.id
  file_name    = &amp;#34;docker-base.iso&amp;#34;
  url          = var.configuration.node_os_configuration[var.configuration.cluster.type].template_image_url
}&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;host-pool-scheduling&#34;&gt;Host Pool Scheduling
&lt;/h2&gt;&lt;p&gt;Nodes are distributed across the Proxmox hosts via modulo arithmetic:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;# In nodes.tf
node_name = var.configuration.host_pool[
  each.key % length(var.configuration.host_pool)
].name&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;With three Proxmox hosts and three node indices:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Node 0 → alpha (0 % 3)&lt;/li&gt;
&lt;li&gt;Node 1 → charlie (1 % 3)&lt;/li&gt;
&lt;li&gt;Node 2 → foxtrot (2 % 3)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This ensures even distribution across the cluster for resilience.&lt;/p&gt;
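&lt;p&gt;To make the arithmetic concrete, here is the same placement computed as a standalone local (illustrative only, using the three-host pool from the Quick Start):&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;# Sketch: round-robin placement as a local
locals {
  node_placement = {
    for idx, node in var.configuration.worker_nodes :
    idx =&amp;gt; var.configuration.host_pool[idx % length(var.configuration.host_pool)].name
  }
  # With three workers: { 0 = &amp;#34;alpha&amp;#34;, 1 = &amp;#34;charlie&amp;#34;, 2 = &amp;#34;foxtrot&amp;#34; }
}&lt;/code&gt;&lt;/pre&gt;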
&lt;h2 id=&#34;keepalived-ha&#34;&gt;Keepalived HA
&lt;/h2&gt;&lt;p&gt;For high availability, Keepalived provides a floating VIP:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;configuration = {
  vip = {
    enabled   = true
    address   = &amp;#34;192.168.61.20&amp;#34;
    router_id = 20
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Each node is configured with its role:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;worker_nodes = [
  {
    size = &amp;#34;medium&amp;#34;
    networks = { dmz = { address = &amp;#34;192.168.61.21/24&amp;#34;, gateway = &amp;#34;192.168.61.1&amp;#34; } }
    vip = { state = &amp;#34;MASTER&amp;#34;, priority = 100, interface = &amp;#34;dmz&amp;#34; }
  },
  {
    size = &amp;#34;medium&amp;#34;
    networks = { dmz = { address = &amp;#34;192.168.61.22/24&amp;#34;, gateway = &amp;#34;192.168.61.1&amp;#34; } }
    vip = { state = &amp;#34;BACKUP&amp;#34;, priority = 90, interface = &amp;#34;dmz&amp;#34; }
  },
  {
    size = &amp;#34;medium&amp;#34;
    networks = { dmz = { address = &amp;#34;192.168.61.23/24&amp;#34;, gateway = &amp;#34;192.168.61.1&amp;#34; } }
    vip = { state = &amp;#34;BACKUP&amp;#34;, priority = 80, interface = &amp;#34;dmz&amp;#34; }
  }
]&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The module generates Keepalived configuration:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;resource &amp;#34;proxmox_virtual_environment_file&amp;#34; &amp;#34;keepalived_config&amp;#34; {
  content = &amp;lt;&amp;lt;-EOF
    vrrp_instance VI_1 {
        state ${node.vip.state}
        interface ${node.vip.interface}
        virtual_router_id ${var.configuration.vip.router_id}
        priority ${node.vip.priority}
        virtual_ipaddress {
            ${var.configuration.vip.address}
        }
    }
  EOF
}&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;gpu-passthrough&#34;&gt;GPU Passthrough
&lt;/h2&gt;&lt;p&gt;For hardware acceleration (e.g., transcoding, ML workloads), device passthrough is configured in the host pool:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;host_pool = [
  {
    name = &amp;#34;alpha&amp;#34;
    device_map = [
      { device = &amp;#34;/dev/apex_0&amp;#34;, mode = &amp;#34;0666&amp;#34; },         # Coral Edge TPU
      { device = &amp;#34;/dev/dri/renderD128&amp;#34;, mode = &amp;#34;0666&amp;#34; }, # iGPU render node
      { device = &amp;#34;/dev/dri/card1&amp;#34;, mode = &amp;#34;0666&amp;#34; },
    ]
    datastore_id = &amp;#34;local-lvm&amp;#34;
  }
]&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The devices are passed through to containers:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;resource &amp;#34;proxmox_virtual_environment_container&amp;#34; &amp;#34;this&amp;#34; {
  # ...
  
  devices_passthrough = [
    for device in var.configuration.host_pool[each.key % length(var.configuration.host_pool)].device_map : {
      path = device.device
      mode = device.mode
    }
  ]
}&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;docker-installation&#34;&gt;Docker Installation
&lt;/h2&gt;&lt;p&gt;Docker is installed via cloud-init:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;resource &amp;#34;proxmox_virtual_environment_file&amp;#34; &amp;#34;cloud_config&amp;#34; {
  content = &amp;lt;&amp;lt;-EOF
#cloud-config
package_update: true
packages:
  - docker.io
  - docker-compose
runcmd:
  - systemctl enable docker
  - systemctl start docker
  - usermod -aG docker root
EOF
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Or for more complex setups, custom post-install commands:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;node_os_configuration = {
  debian = {
    family = &amp;#34;debian&amp;#34;
    template_image_url = &amp;#34;https://...&amp;#34;
    packages = [&amp;#34;docker.io&amp;#34;, &amp;#34;docker-compose&amp;#34;]
    package_manager = {
      install_command = &amp;#34;apt-get install -y&amp;#34;
    }
    post_install_commands = [
      &amp;#34;systemctl enable docker&amp;#34;,
      &amp;#34;usermod -aG docker root&amp;#34;
    ]
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;multi-network-support&#34;&gt;Multi-Network Support
&lt;/h2&gt;&lt;p&gt;The module supports multiple network interfaces per node:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;networks = {
  dmz = {
    address = &amp;#34;192.168.61.21/24&amp;#34;
    gateway = &amp;#34;192.168.61.1&amp;#34;
  }
  vmbr1 = {
    address = &amp;#34;192.168.192.121/25&amp;#34;
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This maps to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;dmz&lt;/strong&gt; — frontend network with gateway (for public access)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;vmbr1&lt;/strong&gt; — backend network (for inter-node communication)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;node-sizing&#34;&gt;Node Sizing
&lt;/h2&gt;&lt;p&gt;The &lt;code&gt;node_size_configuration&lt;/code&gt; block keeps definitions DRY:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;node_size_configuration = {
  small = {
    cpu     = 2
    memory  = 512
    os_disk = 20
  }
  medium = {
    cpu     = 8
    memory  = 32768
    os_disk = 256
  }
  large = {
    cpu     = 16
    memory  = 65536
    os_disk = 512
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;My production cluster uses medium nodes (8 vCPU, 32GB RAM, 256GB disk).&lt;/p&gt;
&lt;h2 id=&#34;optional-tools&#34;&gt;Optional Tools
&lt;/h2&gt;&lt;p&gt;The module can provision additional tools:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;configuration = {
  cluster = {
    options = {
      # Hawser - container management
      hawser = {
        enabled = true
        image   = &amp;#34;harbor.example.com/gh/finsys/hawser:latest&amp;#34;
      }
      
      # Newt - Pangolin tunnel client (fosrl/newt)
      newt = {
        enabled  = true
        image    = &amp;#34;harbor.example.com/dh/fosrl/newt:latest&amp;#34;
        endpoint = &amp;#34;https://newt.example.com&amp;#34;
      }
      
      # APT cache for faster downloads
      apt_cache = {
        enabled = true
        url     = &amp;#34;https://apt.example.com/&amp;#34;
      }
    }
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;my-production-configuration&#34;&gt;My Production Configuration
&lt;/h2&gt;&lt;p&gt;Here&amp;rsquo;s the actual production YAML configuration:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;# configurations/docker/prod-docker-lxc.yaml
name: prod-docker-lxc
enabled: true

cluster:
  name: prod-docker-lxc
  type: lxc
  datastore:
    id: nas
    node: alpha

host_pool:
  - name: alpha
    device_map:
      - device: /dev/apex_0
        mode: &amp;#34;0666&amp;#34;
      - device: /dev/dri/renderD128
        mode: &amp;#34;0666&amp;#34;
      - device: /dev/dri/card1
        mode: &amp;#34;0666&amp;#34;
    datastore_id: local-lvm
  - name: charlie
    device_map:
      - device: /dev/dri/renderD128
        mode: &amp;#34;0666&amp;#34;
      - device: /dev/dri/card0
        mode: &amp;#34;0666&amp;#34;
    datastore_id: local-lvm
  - name: foxtrot
    device_map:
      - device: /dev/apex_0
        mode: &amp;#34;0666&amp;#34;
      - device: /dev/dri/renderD128
        mode: &amp;#34;0666&amp;#34;
      - device: /dev/dri/card0
        mode: &amp;#34;0666&amp;#34;
    datastore_id: local-lvm

worker_nodes:
  - size: medium
    networks:
      dmz:
        address: 192.168.61.21/24
        gateway: 192.168.61.1
      vmbr1:
        address: 192.168.192.121/24
    vip:
      state: MASTER
      priority: 100
      interface: dmz
  - size: medium
    networks:
      dmz:
        address: 192.168.61.22/24
        gateway: 192.168.61.1
      vmbr1:
        address: 192.168.192.122/24
    vip:
      state: BACKUP
      priority: 90
      interface: dmz
  - size: medium
    networks:
      dmz:
        address: 192.168.61.23/24
        gateway: 192.168.61.1
      vmbr1:
        address: 192.168.192.123/24
    vip:
      state: BACKUP
      priority: 80
      interface: dmz

node_size_configuration:
  medium:
    cpu: 8
    memory: 32768
    os_disk: 256

vip:
  enabled: true
  address: 192.168.61.20
  router_id: 20&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;outputs&#34;&gt;Outputs
&lt;/h2&gt;&lt;p&gt;The module returns node credentials for access:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;output &amp;#34;nodes_credentials&amp;#34; {
  sensitive = true # the password and private key are sensitive values

  value = {
    password     = random_password.node_root_password.result
    ssh_key      = tls_private_key.node_root_ssh_key.private_key_pem
    hawser_token = random_uuid.hawser_token.id
  }
  }
}

output &amp;#34;nodes_configurations&amp;#34; {
  value = {
    for idx, node in proxmox_virtual_environment_container.this : idx =&amp;gt; {
      id      = node.id
      name    = node.name
      node    = node.node
      ip      = node.ip_addresses[0]
    }
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Credentials are automatically stored in Bitwarden:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;resource &amp;#34;bitwarden-secrets_secret&amp;#34; &amp;#34;docker_nodes_password&amp;#34; {
  key   = &amp;#34;${local.cluster_name}-nodes_password&amp;#34;
  value = module.docker[0].nodes_credentials.password
}

resource &amp;#34;bitwarden-secrets_secret&amp;#34; &amp;#34;docker_nodes_ssh_key&amp;#34; {
  key   = &amp;#34;${local.cluster_name}-nodes_ssh_key&amp;#34;
  value = module.docker[0].nodes_credentials.ssh_key
}&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;use-cases&#34;&gt;Use Cases
&lt;/h2&gt;&lt;p&gt;This cluster handles workloads like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Home Assistant&lt;/strong&gt; — Docker Compose-based home automation&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Media services&lt;/strong&gt; — Plex, Jellyfin with GPU transcoding&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;VPN services&lt;/strong&gt; — WireGuard, OpenVPN&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CI runners&lt;/strong&gt; — GitHub Actions self-hosted runners&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The hardware acceleration via GPU passthrough is critical for media workloads.&lt;/p&gt;
&lt;h2 id=&#34;what-most-people-get-wrong&#34;&gt;What Most People Get Wrong
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&amp;ldquo;Docker Swarm is dead&amp;rdquo;&lt;/strong&gt; — It&amp;rsquo;s not Kubernetes, but for 10-container workloads, it&amp;rsquo;s simpler. No RBAC complexity, no CNI headaches.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;GPU passthrough works on LXC&lt;/strong&gt; — Most guides assume PCI passthrough (VMs). With device nodes (&lt;code&gt;/dev/apex_0&lt;/code&gt;, &lt;code&gt;/dev/dri/*&lt;/code&gt;), LXC containers access GPUs directly.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Keepalived needs 3 nodes for quorum&lt;/strong&gt; — It doesn&amp;rsquo;t; VRRP has no quorum. Two nodes work fine with &lt;code&gt;nopreempt&lt;/code&gt;, and the backup only takes over if the master fails (see the sketch after this list).&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
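&lt;p&gt;For the two-node variant from myth #3, both nodes run as &lt;code&gt;BACKUP&lt;/code&gt; with &lt;code&gt;nopreempt&lt;/code&gt; so the VIP doesn&amp;rsquo;t bounce back after a failover. A sketch of what the generated template would gain (assumption; &lt;code&gt;nopreempt&lt;/code&gt; is only valid in the BACKUP state):&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;# Sketch: two-node Keepalived config with nopreempt
content = &amp;lt;&amp;lt;-EOF
  vrrp_instance VI_1 {
      state BACKUP
      nopreempt
      interface dmz
      virtual_router_id 20
      priority 100
      virtual_ipaddress {
          192.168.61.20
      }
  }
EOF&lt;/code&gt;&lt;/pre&gt;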
&lt;h2 id=&#34;when-to-use--when-not-to-use&#34;&gt;When to Use / When NOT to Use
&lt;/h2&gt;&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Use Docker Swarm&lt;/th&gt;
          &lt;th&gt;Use Kubernetes&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;3-15 containers&lt;/td&gt;
          &lt;td&gt;50+ containers&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Simple networking&lt;/td&gt;
          &lt;td&gt;Custom CNI required&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Single admin&lt;/td&gt;
          &lt;td&gt;Team with RBAC needs&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;GPU passthrough via LXC&lt;/td&gt;
          &lt;td&gt;GPU operators&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;whats-next&#34;&gt;What&amp;rsquo;s Next
&lt;/h2&gt;&lt;p&gt;Current areas of exploration:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;GPU scheduling&lt;/strong&gt; — Kubernetes-style GPU scheduling for Docker&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Portainer integration&lt;/strong&gt; — management UI for Docker&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Observability&lt;/strong&gt; — centralized logging with Loki&lt;/li&gt;
&lt;/ol&gt;
</description>
        </item>
        <item>
        <title>Talos Kubernetes on Proxmox</title>
        <link>https://zharif.my/posts/talos-kubernetes-proxmox/</link>
        <pubDate>Sat, 28 Mar 2026 00:00:00 +0000</pubDate>
        
        <guid>https://zharif.my/posts/talos-kubernetes-proxmox/</guid>
        <description>&lt;img src="https://images.unsplash.com/photo-1667372459510-55b5e2087cd0?w=800&amp;h=400&amp;fit=crop" alt="Featured image of post Talos Kubernetes on Proxmox" /&gt;&lt;h2 id=&#34;why-talos&#34;&gt;Why Talos
&lt;/h2&gt;&lt;p&gt;Bootstrapping Kubernetes by hand with kubeadm takes 15+ steps and leaves certificate management to you. Talos gives you a declarative cluster that manages its own certificates, API server rotation, and upgrades — all through a single machine config.&lt;/p&gt;
&lt;p&gt;The trade-off: Talos is opinionated. There&amp;rsquo;s no SSH and no shell on the nodes; everything goes through the Talos API. But for infrastructure that should just work, this is a feature.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Air-gapped requirement&lt;/strong&gt;: My homelab can&amp;rsquo;t reach public registries. Every container pull is redirected through my Harbor mirror. Talos&amp;rsquo;s registry mirror config makes this seamless.&lt;/p&gt;
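&lt;p&gt;The mirror configuration ends up as module input. A sketch of the shape (assumed; the Harbor project paths follow the same dh/gh convention as my other posts):&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;# Sketch: registry mirrors in the cluster configuration (shape assumed)
configuration = {
  cluster = {
    registry_mirrors = {
      &amp;#34;docker.io&amp;#34;       = &amp;#34;https://harbor.example.com/v2/dh&amp;#34;
      &amp;#34;ghcr.io&amp;#34;         = &amp;#34;https://harbor.example.com/v2/gh&amp;#34;
      &amp;#34;registry.k8s.io&amp;#34; = &amp;#34;https://harbor.example.com/v2/k8s&amp;#34;
    }
  }
}&lt;/code&gt;&lt;/pre&gt;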
&lt;blockquote&gt;
&lt;p&gt;Talos automatically rotates certificates before they expire. No manual intervention needed for cluster certificate management.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&#34;module-capabilities&#34;&gt;Module Capabilities
&lt;/h2&gt;&lt;p&gt;The &lt;code&gt;tf-module-proxmox-talos&lt;/code&gt; module provisions a complete Talos-based Kubernetes cluster on Proxmox VE in a single Terraform apply:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Talos Image Factory&lt;/strong&gt; — generates custom ISOs with specific extensions&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Machine Configuration&lt;/strong&gt; — generates Talos machine configs with networking&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ISO Upload&lt;/strong&gt; — downloads and uploads to Proxmox datastore&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Node Provisioning&lt;/strong&gt; — provisions control plane and worker VMs across host pool&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cluster Bootstrap&lt;/strong&gt; — applies machine configs and bootstraps Kubernetes&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Day-0 GitOps&lt;/strong&gt; — optionally installs Flux or Argo CD during bootstrap&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Registry Mirrors&lt;/strong&gt; — configures container registry redirects&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;quick-start&#34;&gt;Quick Start
&lt;/h2&gt;&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;module &amp;#34;talos_cluster&amp;#34; {
  source  = &amp;#34;registry.example.com/namespace/tf-module-proxmox-talos/talos&amp;#34;
  version = &amp;#34;1.2.1&amp;#34;

  configuration = {
    cluster = {
      name = &amp;#34;prod-k8s&amp;#34;
      datastore = { id = &amp;#34;nas&amp;#34;, node = &amp;#34;alpha&amp;#34; }
      talos = { version = &amp;#34;v1.12.4&amp;#34; }
      kubernetes_version = &amp;#34;v1.35.0&amp;#34;
    }

    host_pool = {
      alpha = { datastore_id = &amp;#34;local-lvm&amp;#34; }
      charlie = { datastore_id = &amp;#34;local-lvm&amp;#34; }
      foxtrot = { datastore_id = &amp;#34;local-lvm&amp;#34; }
    }

    control_plane_nodes = {
      nodes = [
        { size = &amp;#34;control_plane&amp;#34;, networks = { dmz = { address = &amp;#34;192.168.62.21/24&amp;#34;, gateway = &amp;#34;192.168.62.1&amp;#34; } } }
      ]
      host_pool = [&amp;#34;alpha&amp;#34;, &amp;#34;charlie&amp;#34;, &amp;#34;foxtrot&amp;#34;]
      vip = { enabled = true, address = &amp;#34;192.168.62.20&amp;#34; }
    }

    worker_nodes = {
      nodes = [
        { size = &amp;#34;worker&amp;#34;, networks = { dmz = { address = &amp;#34;192.168.62.24/24&amp;#34;, gateway = &amp;#34;192.168.62.1&amp;#34; } } }
      ]
      host_pool = [&amp;#34;alpha&amp;#34;, &amp;#34;charlie&amp;#34;, &amp;#34;foxtrot&amp;#34;]
    }

    node_size_configuration = {
      control_plane = { cpu = 4, memory = 8192, os_disk = 128 }
      worker = { cpu = 10, memory = 49152, os_disk = 128, data_disk = 512 }
    }
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;talos-image-factory&#34;&gt;Talos Image Factory
&lt;/h2&gt;&lt;p&gt;The module uses Talos&amp;rsquo;s image factory to generate custom ISOs with specific extensions:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;# image.tf
resource &amp;#34;talos_image_factory_schematic&amp;#34; &amp;#34;this&amp;#34; {
  schematic = yamlencode({
    customization = {
      systemExtensions = {
        officialExtensions = data.talos_image_factory_extensions_versions.this.extensions_info[*].name
      }
    }
  })
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The extensions are defined in locals:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;locals {
  image = {
    platform = &amp;#34;nocloud&amp;#34;
    customizations = {
      base = [
        &amp;#34;lldp&amp;#34;,              # Network topology discovery
        &amp;#34;qemu-guest-agent&amp;#34;,  # Proxmox agent integration
        &amp;#34;util-linux-tools&amp;#34;,  # Core utilities
        &amp;#34;iscsi-tools&amp;#34;,       # iSCSI storage
        &amp;#34;nfs-utils&amp;#34;          # NFS mounting
      ]
    }
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The generated schematic ID is used to construct the ISO URL:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;resource &amp;#34;proxmox_download_file&amp;#34; &amp;#34;talos_iso&amp;#34; {
  file_name = &amp;#34;talos-${var.configuration.cluster.name}-${var.configuration.cluster.talos.version}-${data.talos_image_factory_urls.this.schematic_id}.iso&amp;#34;
  url = (
    var.configuration.cluster.talos.iso_mirror != null
    ? replace(data.talos_image_factory_urls.this.urls.iso, &amp;#34;https://&amp;#34;, var.configuration.cluster.talos.iso_mirror)
    : data.talos_image_factory_urls.this.urls.iso
  )
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This allows using mirror registries for air-gapped environments.&lt;/p&gt;
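&lt;p&gt;Concretely, &lt;code&gt;replace()&lt;/code&gt; swaps the scheme prefix for the mirror base URL (values illustrative):&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;locals {
  # Illustrative inputs; the module derives the real URL from the image factory
  factory_iso = &amp;#34;https://factory.talos.dev/image/&amp;lt;schematic-id&amp;gt;/v1.12.4/nocloud-amd64.iso&amp;#34;
  iso_mirror  = &amp;#34;https://proxy.example.com/&amp;#34;

  # =&amp;gt; &amp;#34;https://proxy.example.com/factory.talos.dev/image/&amp;lt;schematic-id&amp;gt;/v1.12.4/nocloud-amd64.iso&amp;#34;
  mirrored_iso = replace(local.factory_iso, &amp;#34;https://&amp;#34;, local.iso_mirror)
}&lt;/code&gt;&lt;/pre&gt;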
&lt;h2 id=&#34;machine-configuration&#34;&gt;Machine Configuration
&lt;/h2&gt;&lt;p&gt;Talos machine configuration is generated through the Talos provider:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;data &amp;#34;talos_machine_configuration&amp;#34; &amp;#34;configurations&amp;#34; {
  for_each = { for idx, node in var.configuration.control_plane_nodes.nodes : idx =&amp;gt; node }

  cluster_name       = var.configuration.cluster.name
  cluster_endpoint   = &amp;#34;https://${var.configuration.control_plane_nodes.vip.address}:6443&amp;#34;
  machine_secrets    = talos_machine_secrets.this.machine_secrets
  talos_version      = var.configuration.cluster.talos.version
  kubernetes_version = var.configuration.cluster.kubernetes_version

  # Control plane specific config
  machine_type = &amp;#34;controlplane&amp;#34;

  # Static network configuration is injected as a machine config patch
  config_patches = [
    yamlencode({
      machine = {
        network = {
          interfaces = [
            for name, network in each.value.networks : {
              interface = name
              dhcp      = false
              addresses = [network.address]
            }
          ]
        }
      }
    })
  ]
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The configuration supports:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Multiple network interfaces per node&lt;/li&gt;
&lt;li&gt;Registry mirrors for all major registries&lt;/li&gt;
&lt;li&gt;Custom CNI (Cilium) configuration&lt;/li&gt;
&lt;li&gt;Disabling kube-proxy (for CNIs that replace it, like Cilium)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;registry-mirrors&#34;&gt;Registry Mirrors
&lt;/h3&gt;&lt;p&gt;A key feature is container registry mirror configuration:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;configuration = {
  cluster = {
    registry_mirrors = {
      &amp;#34;ghcr.io&amp;#34; = {
        endpoints = [&amp;#34;https://harbor.example.com/v2/gh&amp;#34;]
        override_path = true
      }
      &amp;#34;registry.k8s.io&amp;#34; = {
        endpoints = [&amp;#34;https://harbor.example.com/v2/k8s&amp;#34;]
        override_path = true
      }
      &amp;#34;docker.io&amp;#34; = {
        endpoints = [&amp;#34;https://harbor.example.com/v2/dh&amp;#34;]
        override_path = true
      }
      &amp;#34;quay.io&amp;#34; = {
        endpoints = [&amp;#34;https://harbor.example.com/v2/qi&amp;#34;]
        override_path = true
      }
      &amp;#34;factory.talos.dev&amp;#34; = {
        endpoints = [&amp;#34;https://harbor.example.com/v2/talos&amp;#34;]
        override_path = true
      }
    }
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;All container pulls route through my Harbor registry — essential for air-gapped homelabs.&lt;/p&gt;
&lt;h2 id=&#34;multi-network-support&#34;&gt;Multi-Network Support
&lt;/h2&gt;&lt;p&gt;The module provisions VMs with multiple network interfaces:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;network_devices = [
  for network_name, network in each.value.networks : {
    name    = network_name
    enabled = true
    bridge  = network_name
    ipv4 = {
      address = network.address
      gateway = network.gateway
    }
  }
]&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;My production setup uses:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;dmz&lt;/strong&gt; — frontend network with gateway (192.168.62.0/24)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;vmbr1&lt;/strong&gt; — backend network for inter-node communication (192.168.192.0/24)&lt;/li&gt;
&lt;/ul&gt;
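&lt;p&gt;A worker entry that attaches both networks looks like this (addresses illustrative; this assumes &lt;code&gt;gateway&lt;/code&gt; is an optional attribute in the module&amp;rsquo;s variable schema, since only the dmz interface carries one):&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;worker_nodes = {
  nodes = [
    {
      size = &amp;#34;worker&amp;#34;
      networks = {
        dmz   = { address = &amp;#34;192.168.62.24/24&amp;#34;, gateway = &amp;#34;192.168.62.1&amp;#34; }
        vmbr1 = { address = &amp;#34;192.168.192.24/24&amp;#34; }  # backend traffic, no gateway
      }
    }
  ]
  host_pool = [&amp;#34;alpha&amp;#34;, &amp;#34;charlie&amp;#34;, &amp;#34;foxtrot&amp;#34;]
}&lt;/code&gt;&lt;/pre&gt;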
&lt;h2 id=&#34;cluster-bootstrap&#34;&gt;Cluster Bootstrap
&lt;/h2&gt;&lt;p&gt;The bootstrap sequence is orchestrated by Terraform:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;# 1. Generate machine secrets
resource &amp;#34;talos_machine_secrets&amp;#34; &amp;#34;this&amp;#34; {}

# 2. Apply control plane configuration
resource &amp;#34;talos_machine_configuration_apply&amp;#34; &amp;#34;controlplane&amp;#34; {
  for_each = { for idx, node in var.configuration.control_plane_nodes.nodes : idx =&amp;gt; node }

  node                        = module.control_plane_virtual_machine[each.key].virtual_machine.id
  client_configuration        = talos_machine_secrets.this.client_configuration
  machine_configuration_input = data.talos_machine_configuration.configurations[each.key].machine_configuration
}

# 3. Bootstrap the cluster (runs once, against the first control plane node)
resource &amp;#34;talos_machine_bootstrap&amp;#34; &amp;#34;this&amp;#34; {
  node                 = var.configuration.control_plane_nodes.nodes[0].name
  client_configuration = talos_machine_secrets.this.client_configuration

  depends_on = [talos_machine_configuration_apply.controlplane]
}&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;gitops-bootstrap&#34;&gt;GitOps Bootstrap
&lt;/h2&gt;&lt;p&gt;One of the most powerful features — Flux or Argo CD can be bootstrapped during cluster creation:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;configuration = {
  cluster = {
    gitops = {
      provider      = &amp;#34;flux&amp;#34;  # or &amp;#34;argocd&amp;#34;
      namespace     = &amp;#34;flux-system&amp;#34;
      chart_version = &amp;#34;2.18.2&amp;#34;
      
      bootstrap = {
        repo_url              = &amp;#34;https://github.com/your-org/applications.git&amp;#34;
        revision              = &amp;#34;main&amp;#34;
        path                  = &amp;#34;src/k8s/prod&amp;#34;
        destination_namespace = &amp;#34;homelab&amp;#34;
      }
    }
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This configuration does three things:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Installs Flux during Talos bootstrap (via an inline manifest; sketched after the diagram)&lt;/li&gt;
&lt;li&gt;Configures it to sync from the applications-homelab repository&lt;/li&gt;
&lt;li&gt;The cluster starts deploying apps immediately after boot&lt;/li&gt;
&lt;/ol&gt;
&lt;pre class=&#34;mermaid&#34;&gt;
  sequenceDiagram
    participant TF as Terraform
    participant Talos as Talos
    participant Flux as Flux
    participant GH as GitHub
    participant K8s as Kubernetes
    
    TF-&amp;gt;&amp;gt;Talos: Apply machine config
    Talos-&amp;gt;&amp;gt;Talos: Bootstrap control plane
    Talos-&amp;gt;&amp;gt;Flux: Install Flux CRDs
    Flux-&amp;gt;&amp;gt;GH: Clone applications-homelab
    GH--&amp;gt;&amp;gt;Flux: Return repo contents
    Flux-&amp;gt;&amp;gt;K8s: Deploy applications
&lt;/pre&gt;
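&lt;p&gt;One plausible wiring for the inline-manifest step (a sketch, not necessarily the module&amp;rsquo;s exact implementation): render the Flux chart with &lt;code&gt;helm_template&lt;/code&gt; and inject it through Talos&amp;rsquo;s &lt;code&gt;cluster.inlineManifests&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;# Render the community Flux chart to plain manifests
data &amp;#34;helm_template&amp;#34; &amp;#34;flux&amp;#34; {
  name       = &amp;#34;flux&amp;#34;
  repository = &amp;#34;https://fluxcd-community.github.io/helm-charts&amp;#34;
  chart      = &amp;#34;flux2&amp;#34;
  version    = var.configuration.cluster.gitops.chart_version
  namespace  = var.configuration.cluster.gitops.namespace
}

locals {
  # Talos applies inline manifests as soon as the control plane is up
  gitops_config_patch = yamlencode({
    cluster = {
      inlineManifests = [
        {
          name     = &amp;#34;flux-system&amp;#34;
          contents = data.helm_template.flux.manifest
        }
      ]
    }
  })
}&lt;/code&gt;&lt;/pre&gt;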

&lt;h2 id=&#34;cilium-integration&#34;&gt;Cilium Integration
&lt;/h2&gt;&lt;p&gt;For advanced networking, the Talos-managed default CNI (Flannel) can be replaced with Cilium, templated and bundled at bootstrap:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;configuration = {
  cluster = {
    # Disable Talos-managed CNI
    options = {
      disable_default_cni = true
      disable_kube_proxy = true
    }
    
    # Configure Cilium via helm values
    helm_values_override = {
      cilium = {
        operator = { replicas = 1 }
      }
    }
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The module uses the Helm provider to template the Cilium manifest:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;data &amp;#34;helm_template&amp;#34; &amp;#34;cilium&amp;#34; {
  name       = &amp;#34;cilium&amp;#34;
  repository = &amp;#34;https://helm.cilium.io&amp;#34;
  chart      = &amp;#34;cilium&amp;#34;
  version    = var.configuration.cluster.cilium.version  # Cilium chart version
  namespace  = &amp;#34;cilium&amp;#34;
  values     = [yamlencode(var.configuration.cluster.helm_values_override)]
}&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;node-sizing&#34;&gt;Node Sizing
&lt;/h2&gt;&lt;p&gt;The &lt;code&gt;node_size_configuration&lt;/code&gt; block keeps definitions DRY:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;node_size_configuration = {
  control_plane = {
    cpu     = 4
    memory  = 8192  # MB
    os_disk = 128   # GB
  }
  worker = {
    cpu       = 10
    memory    = 49152  # MB
    os_disk   = 128
    data_disk = 512    # Extra data disk for PVs
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;My prod-k8s cluster:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;3 control plane nodes: 4 vCPU, 8GB RAM, 128GB disk&lt;/li&gt;
&lt;li&gt;3 worker nodes: 10 vCPU, 48GB RAM, 128GB OS + 512GB data&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;host-pool-scheduling&#34;&gt;Host Pool Scheduling
&lt;/h2&gt;&lt;p&gt;VMs are distributed across Proxmox nodes via modulo arithmetic:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;# In nodes.tf
node_name = var.configuration.control_plane_nodes.host_pool[
  each.key % length(var.configuration.control_plane_nodes.host_pool)
]&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;With 3 nodes and 6 worker indices:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Worker 0 → alpha (0 % 3)&lt;/li&gt;
&lt;li&gt;Worker 1 → charlie (1 % 3)&lt;/li&gt;
&lt;li&gt;Worker 2 → foxtrot (2 % 3)&lt;/li&gt;
&lt;li&gt;Worker 3 → alpha (3 % 3)&lt;/li&gt;
&lt;li&gt;Worker 4 → charlie (4 % 3)&lt;/li&gt;
&lt;li&gt;Worker 5 → foxtrot (5 % 3)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This ensures even distribution across the cluster.&lt;/p&gt;
&lt;h2 id=&#34;outputs&#34;&gt;Outputs
&lt;/h2&gt;&lt;p&gt;The module returns cluster credentials for external use:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;output &amp;#34;cluster_credentials&amp;#34; {
  value = {
    kubeconfig  = talos_cluster_kubeconfig.this.kubeconfig_raw
    talosconfig = data.talos_client_configuration.this.talos_config

    # Config files are also written locally when debug = true
    talosconfig_path = local.talosconfig_path
    kubeconfig_path  = local.kubeconfig_path
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Credentials are automatically stored in Bitwarden:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;resource &amp;#34;bitwarden-secrets_secret&amp;#34; &amp;#34;kubernetes_kubeconfig&amp;#34; {
  key   = &amp;#34;${local.cluster_name}-kubeconfig&amp;#34;
  value = module.kubernetes[0].cluster_credentials.kubeconfig
}

resource &amp;#34;bitwarden-secrets_secret&amp;#34; &amp;#34;kubernetes_talosconfig&amp;#34; {
  key   = &amp;#34;${local.cluster_name}-talosconfig&amp;#34;
  value = module.kubernetes[0].cluster_credentials.talosconfig
}&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;my-production-configuration&#34;&gt;My Production Configuration
&lt;/h2&gt;&lt;p&gt;Here&amp;rsquo;s the actual production YAML configuration:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;# configurations/kubernetes/prod-k8s.yaml
cluster:
  name: prod-k8s
  datastore:
    id: nas
    node: alpha
  talos:
    version: v1.12.4
    installer_mirror: harbor.example.com/talos
    iso_mirror: https://proxy.example.com/
  kubernetes_version: v1.35.0
  registry_mirrors:
    ghcr.io: { endpoints: [https://harbor.example.com/v2/gh], override_path: true }
    registry.k8s.io: { endpoints: [https://harbor.example.com/v2/k8s], override_path: true }
    docker.io: { endpoints: [https://harbor.example.com/v2/dh], override_path: true }
    quay.io: { endpoints: [https://harbor.example.com/v2/qi], override_path: true }
    factory.talos.dev: { endpoints: [https://harbor.example.com/v2/talos], override_path: true }
  options:
    disable_default_cni: true
    disable_kube_proxy: true
    disable_scheduling_on_control_plane: true
  gitops:
    provider: flux
    bootstrap:
      repo_url: https://github.com/your-org/applications.git
      path: src/k8s/prod
      destination_namespace: homelab

host_pool:
  alpha: { datastore_id: local-lvm }
  charlie: { datastore_id: local-lvm }
  foxtrot: { datastore_id: local-lvm }

control_plane_nodes:
  nodes: [...]  # 3 control planes
  host_pool: [alpha, charlie, foxtrot]
  vip: { enabled: true, address: 192.168.62.20 }

worker_nodes:
  nodes: [...]  # 3 workers
  host_pool: [alpha, charlie, foxtrot]

node_size_configuration:
  control_plane: { cpu: 4, memory: 8192, os_disk: 128 }
  worker: { cpu: 10, memory: 49152, os_disk: 128, data_disk: 512 }&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;whats-next&#34;&gt;What&amp;rsquo;s Next
&lt;/h2&gt;&lt;p&gt;Current areas of exploration:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Multi-cluster federation&lt;/strong&gt; — connecting Talos clusters for workload distribution&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Nested Talos&lt;/strong&gt; — running Talos inside Proxmox for testing&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Observability&lt;/strong&gt; — centralized logging with Loki and Grafana&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;what-most-people-get-wrong&#34;&gt;What Most People Get Wrong
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&amp;ldquo;Talos upgrades break clusters&amp;rdquo;&lt;/strong&gt; — With proper machine configs and registry mirrors, upgrades are rolling. The immutability is a feature, not a bug.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&amp;ldquo;Air-gapped is impossible&amp;rdquo;&lt;/strong&gt; — Talos&amp;rsquo;s registry mirror config and image factory handle this. Your nodes don&amp;rsquo;t need public internet access.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&amp;ldquo;No kubelet means no logging&amp;rdquo;&lt;/strong&gt; — Talos does run the kubelet; what&amp;rsquo;s missing is SSH. Node-level logs come from the built-in &lt;code&gt;talosctl logs&lt;/code&gt; and &lt;code&gt;talosctl dmesg&lt;/code&gt;. It&amp;rsquo;s different from traditional node logging, not less capable.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;when-to-use--when-not-to-use&#34;&gt;When to Use / When NOT to Use
&lt;/h2&gt;&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Use Talos&lt;/th&gt;
          &lt;th&gt;Stick with kubeadm&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;Want declarative infrastructure&lt;/td&gt;
          &lt;td&gt;Need full kubelet control&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Air-gapped environments&lt;/td&gt;
          &lt;td&gt;Custom init systems required&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Single apply to cluster&lt;/td&gt;
          &lt;td&gt;Manual certificate management needed&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The foundation is solid — every cluster can be versioned, reviewed, and rolled back.&lt;/p&gt;
</description>
        </item>
        <item>
        <title>Terraform-Driven Homelab Architecture</title>
        <link>https://zharif.my/posts/homelab-terraform-architecture/</link>
        <pubDate>Sun, 15 Mar 2026 00:00:00 +0000</pubDate>
        
        <guid>https://zharif.my/posts/homelab-terraform-architecture/</guid>
        <description>&lt;img src="https://images.unsplash.com/photo-1558494949-ef010cbdcc31?w=800&amp;h=400&amp;fit=crop" alt="Featured image of post Terraform-Driven Homelab Architecture" /&gt;&lt;h2 id=&#34;the-problem-space&#34;&gt;The Problem Space
&lt;/h2&gt;&lt;p&gt;Homelabs evolve. You start with one Docker container, add some LXCs, then Kubernetes, and suddenly your infrastructure is a house of cards held together by scripts you wrote two years ago and don&amp;rsquo;t remember.&lt;/p&gt;
&lt;p&gt;This architecture solves that through:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Everything in code&lt;/strong&gt; — from VM provisioning to Kubernetes bootstrap&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Versioned modules&lt;/strong&gt; — each update is a code review opportunity&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Self-service via Backstage&lt;/strong&gt; — templated provisioning, no Slack threads&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Numbers&lt;/strong&gt;: 3 Proxmox nodes, 2 production clusters (Docker Swarm + Talos K8s), ~50 resources defined across 24+ YAML configurations.&lt;/p&gt;
&lt;p&gt;Running in production:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;3 Proxmox nodes (alpha, charlie, foxtrot)&lt;/li&gt;
&lt;li&gt;Docker Swarm clusters with Keepalived HA&lt;/li&gt;
&lt;li&gt;Talos Kubernetes clusters with Flux GitOps&lt;/li&gt;
&lt;li&gt;GPU passthrough for hardware acceleration&lt;/li&gt;
&lt;li&gt;Multi-network topology (dmz + vmbr1)&lt;/li&gt;
&lt;li&gt;Private container registry (Harbor)&lt;/li&gt;
&lt;li&gt;Private Terraform registry (Cloudflare Workers)&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;Start with the basic template. All custom modules derive from it — maintaining consistency across the infrastructure.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&#34;module-hierarchy&#34;&gt;Module Hierarchy
&lt;/h2&gt;&lt;pre class=&#34;mermaid&#34;&gt;
  graph TB
    subgraph &amp;#34;Root Module&amp;#34;
        A[tf-infra-homelab]
    end
    
    subgraph &amp;#34;Compute Modules&amp;#34;
        B[tf-module-proxmox-lxc]
        C[tf-module-proxmox-vm]
        D[tf-module-proxmox-talos]
        E[tf-module-proxmox-docker]
    end
    
    subgraph &amp;#34;Application Layer&amp;#34;
        F[applications-homelab]
    end
    
    subgraph &amp;#34;Platform&amp;#34;
        G[Proxmox VE]
    end
    
    A --&amp;gt; B
    A --&amp;gt; C
    A --&amp;gt; D
    A --&amp;gt; E
    D --&amp;gt; F
    E --&amp;gt; F
    B --&amp;gt; G
    C --&amp;gt; G
    D --&amp;gt; G
    E --&amp;gt; G
&lt;/pre&gt;

&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Module&lt;/th&gt;
          &lt;th&gt;Purpose&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;terraform-basic-template&lt;/td&gt;
          &lt;td&gt;Foundation for all modules&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;tf-module-proxmox-lxc&lt;/td&gt;
          &lt;td&gt;LXC container provisioning&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;tf-module-proxmox-vm&lt;/td&gt;
          &lt;td&gt;Full VM provisioning&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;tf-module-proxmox-docker&lt;/td&gt;
          &lt;td&gt;Docker Swarm clusters&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;tf-module-proxmox-talos&lt;/td&gt;
          &lt;td&gt;Talos Kubernetes clusters&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;tf-infra-homelab&lt;/td&gt;
          &lt;td&gt;Root orchestration&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;applications-homelab&lt;/td&gt;
          &lt;td&gt;Kustomize deployments&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;github-management-plane&lt;/td&gt;
          &lt;td&gt;GitHub org management&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;the-dependency-graph&#34;&gt;The Dependency Graph
&lt;/h2&gt;&lt;pre class=&#34;mermaid&#34;&gt;
  graph TB
    T[terraform-basic-template]
    L[tf-module-proxmox-lxc]
    V[tf-module-proxmox-vm]
    DT[tf-module-proxmox-docker]
    TT[tf-module-proxmox-talos]
    RH[tf-infra-homelab]
    AH[applications-homelab]
    P[ProxmoxVE]
    
    T --&amp;gt; L
    T --&amp;gt; V
    L --&amp;gt; DT
    V --&amp;gt; DT
    L --&amp;gt; TT
    V --&amp;gt; TT
    DT --&amp;gt; RH
    TT --&amp;gt; RH
    RH --&amp;gt; P
    DT --&amp;gt; AH
    TT --&amp;gt; AH
&lt;/pre&gt;

&lt;p&gt;Key observations:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Template is foundational&lt;/strong&gt; — all modules derive from the same template&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;LXC and VM are leaf modules&lt;/strong&gt; — no dependencies on other custom modules&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Docker and Talos are composite&lt;/strong&gt; — build on LXC/VM modules&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Root module is the orchestrator&lt;/strong&gt; — composes modules based on configurations&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Applications deploy post-provisioning&lt;/strong&gt; — GitOps ties into Docker/Talos clusters&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;configuration-driven&#34;&gt;Configuration-Driven
&lt;/h2&gt;&lt;p&gt;All infrastructure is defined in YAML configurations, not ad-hoc Terraform runs:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-txt&#34;&gt;configurations/
├── docker/
│   ├── dev-docker-lxc.yaml
│   └── prod-docker-lxc.yaml
├── kubernetes/
│   ├── dev-k8s.yaml
│   └── prod-k8s.yaml
└── virtual_machine/
    └── ...&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Each config has an &lt;code&gt;enabled&lt;/code&gt; flag for gradual rollout:&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;name: prod-k8s
enabled: true  # Set to false to disable without deletion

cluster:
  name: prod-k8s
  datastore:
    id: nas
    node: alpha&lt;/code&gt;&lt;/pre&gt;
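&lt;p&gt;In the root module, these files can be loaded and filtered with a pattern like this (a minimal sketch; file paths and module wiring assumed):&lt;/p&gt;
&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-hcl&#34;&gt;locals {
  # Read every cluster definition under configurations/kubernetes/
  kubernetes_configs = {
    for filename in fileset(&amp;#34;${path.module}/configurations/kubernetes&amp;#34;, &amp;#34;*.yaml&amp;#34;) :
    trimsuffix(filename, &amp;#34;.yaml&amp;#34;) =&amp;gt; yamldecode(file(&amp;#34;${path.module}/configurations/kubernetes/${filename}&amp;#34;))
  }

  # Keep only the configs flagged enabled: true
  enabled_kubernetes = {
    for name, config in local.kubernetes_configs : name =&amp;gt; config
    if try(config.enabled, false)
  }
}

module &amp;#34;kubernetes&amp;#34; {
  source   = &amp;#34;registry.example.com/namespace/tf-module-proxmox-talos/talos&amp;#34;
  version  = &amp;#34;1.2.1&amp;#34;
  for_each = local.enabled_kubernetes

  configuration = each.value
}&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;whats-running&#34;&gt;What&amp;rsquo;s Running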
&lt;/h2&gt;&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Cluster&lt;/th&gt;
          &lt;th&gt;Type&lt;/th&gt;
          &lt;th&gt;Nodes&lt;/th&gt;
          &lt;th&gt;VIP&lt;/th&gt;
          &lt;th&gt;Purpose&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;prod-docker-lxc&lt;/td&gt;
          &lt;td&gt;Docker Swarm&lt;/td&gt;
          &lt;td&gt;3x medium (8vCPU/32GB)&lt;/td&gt;
          &lt;td&gt;192.168.61.20&lt;/td&gt;
          &lt;td&gt;Container workloads&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;prod-k8s&lt;/td&gt;
          &lt;td&gt;Talos K8s&lt;/td&gt;
          &lt;td&gt;3x CP (4vCPU/8GB) + 3x worker (10vCPU/48GB)&lt;/td&gt;
          &lt;td&gt;192.168.62.20&lt;/td&gt;
          &lt;td&gt;Kubernetes workloads&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Both clusters span all 3 Proxmox nodes for high availability.&lt;/p&gt;
&lt;h2 id=&#34;design-principles&#34;&gt;Design Principles
&lt;/h2&gt;&lt;p&gt;This architecture follows specific principles:&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Principle&lt;/th&gt;
          &lt;th&gt;Implementation&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;Single configuration object&lt;/td&gt;
          &lt;td&gt;All modules use unified &lt;code&gt;configuration&lt;/code&gt; input&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Host pools&lt;/td&gt;
          &lt;td&gt;Resilience through multi-node distribution&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Versioned modules&lt;/td&gt;
          &lt;td&gt;Each module has explicit versions&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;YAML configurations&lt;/td&gt;
          &lt;td&gt;Infrastructure as data, not ad-hoc apply&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Private registry&lt;/td&gt;
          &lt;td&gt;Distribution without Terraform Cloud cost&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Secrets integration&lt;/td&gt;
          &lt;td&gt;Bitwarden for credential storage&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;GitOps&lt;/td&gt;
          &lt;td&gt;Flux bootstrapped during cluster creation&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Multi-network&lt;/td&gt;
          &lt;td&gt;Separate DMZ and backend networks&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;GPU passthrough&lt;/td&gt;
          &lt;td&gt;Device mapping in host pool&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;what-most-people-get-wrong&#34;&gt;What Most People Get Wrong
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&amp;ldquo;More modules = better architecture&amp;rdquo;&lt;/strong&gt; — I started with 10+ modules. Consolidated to 5. Over-modularization creates maintenance overhead.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&amp;ldquo;YAML = Terraform&amp;rdquo;&lt;/strong&gt; — Terraform is the engine, YAML is the fuel. Don&amp;rsquo;t embed YAML in &lt;code&gt;.tf&lt;/code&gt; files; load from external files.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&amp;ldquo;GitOps replaces Terraform&amp;rdquo;&lt;/strong&gt; — They work together: Terraform provisions, Flux manages apps. Both are declarative.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;related-posts&#34;&gt;Related Posts
&lt;/h2&gt;&lt;p&gt;Each component has its own detailed post:&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Post&lt;/th&gt;
          &lt;th&gt;Focus&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://zharif.my/posts/talos-kubernetes-proxmox&#34; &gt;Talos Kubernetes on Proxmox&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;tf-module-proxmox-talos deep dive — image factory, machine config, bootstrap&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://zharif.my/posts/docker-swarm-proxmox&#34; &gt;Docker Swarm on Proxmox&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;tf-module-proxmox-docker deep dive — Keepalived HA, provisioning&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://zharif.my/posts/lxc-vm-modules&#34; &gt;LXC &amp;amp; VM Modules&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;tf-module-proxmox-lxc + tf-module-proxmox-vm basics&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://zharif.my/posts/backstage-homelab&#34; &gt;Backstage Integration&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;Catalog generation, software templates&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://zharif.my/posts/terraform-registry-cloudflare-workers&#34; &gt;Private Terraform Registry&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;Module distribution via Cloudflare Workers&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;a class=&#34;link&#34; href=&#34;https://zharif.my/posts/github-management-plane&#34; &gt;GitHub Management Plane&lt;/a&gt;&lt;/td&gt;
          &lt;td&gt;Managing GitHub org via Terraform&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
</description>
        </item>
        
    </channel>
</rss>
