
Proxmox SDN with Terraform

Purpose: Deploy a repeatable Proxmox SDN fabric (zone, VNets, subnets, host L3 gateways, NAT, DHCP) using the public terraform-proxmox-sdn module, either as standalone Terraform or via a Terragrunt stack (HybridOps is used here as an example).
Difficulty: Intermediate
Prerequisites: Proxmox VE 8.x with SDN enabled, dnsmasq installed, and a Linux control node with Terraform (and Terragrunt if you use the stack workflow).

Scope & reuse
HybridOps appears in this guide as an example host project and stack layout.
You can use the terraform-proxmox-sdn module in any Terraform or Terragrunt codebase – adapt paths, naming, and environments to your own repository.

Multi-node roadmap
The current module (v0.1.x) is tested and supported on single-node SDN zones.
Multi-node SDN clusters (a shared SDN zone across multiple nodes) are planned for a future release and prototyped under examples/multi-node in the module repository.

Operations note

This HOWTO covers initial provisioning only. For day-to-day operations (restarts, debugging, range changes), use the SDN operations runbook in your SDN stack directory (for HybridOps, this is sdn_operations.md alongside network-sdn).


Demo

No public walkthrough video is published for this HOWTO yet.


1. Module capabilities (v0.1.x)

At a high level, the module manages:

  • Proxmox SDN objects:
    • SDN VLAN zone (for example hybzone).
    • SDN VNets (for example vnetmgmt, vnetobs, …).
    • SDN subnets attached to those VNets.

  • Host-level networking on the Proxmox node:
    • L3 gateways – ip addr configuration on vnet* interfaces.
    • SNAT (masquerade) out of an uplink bridge (for example vmbr0).
    • dnsmasq-based DHCP, provisioned via systemd unit templates.

These are controlled with three boolean toggles:

  • enable_host_l3 – manage gateway IPs on the Proxmox host for each subnet.
  • enable_snat – configure SNAT rules for each subnet whose traffic should exit via the uplink.
  • enable_dhcp – configure dnsmasq DHCP for selected subnets.

Typical combinations:

| Scenario | enable_host_l3 | enable_snat | enable_dhcp |
|---|---|---|---|
| Control-plane only (no NAT, no DHCP) | true | false | false |
| Routed lab with static addressing | true | true | false |
| Routed lab with DHCP | true | true | true |

Guardrail
enable_dhcp = true requires enable_host_l3 = true, so that DHCP services always bind to valid VNet interfaces and the SDN zone stays in a healthy state.


2. Proxmox API token

  1. In the Proxmox web UI, create an API token:
     • User: automation@pam
     • Token ID: infra-token
     • Permissions: enough to manage SDN, node networking, and read node status.

  2. Note down:
     • API URL: https://<PROXMOX-IP>:8006/api2/json
     • Token ID: automation@pam!infra-token
     • Token secret: <UUID>

On the control node, export them as shell variables:

export PROXMOX_URL="https://<PROXMOX-IP>:8006/api2/json"
export PROXMOX_TOKEN_ID="automation@pam!infra-token"
export PROXMOX_TOKEN_SECRET="<YOUR-TOKEN-SECRET>"
export PROXMOX_NODE="<NODE-NAME>"
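The bpg/proxmox provider expects the API token as a single `ID=SECRET` string. The sketch below composes that string from the exports above and shows an optional reachability probe; all values here are illustrative placeholders, not real credentials:

```shell
# Illustrative placeholder values - substitute your own exports from above.
PROXMOX_URL="https://192.0.2.10:8006/api2/json"
PROXMOX_TOKEN_ID="automation@pam!infra-token"
PROXMOX_TOKEN_SECRET="00000000-0000-0000-0000-000000000000"

# The provider's api_token input is the token ID and secret joined with '=':
PROXMOX_API_TOKEN="${PROXMOX_TOKEN_ID}=${PROXMOX_TOKEN_SECRET}"
echo "$PROXMOX_API_TOKEN"

# Optional reachability probe against a real host (uncomment to use):
# curl -sk -H "Authorization: PVEAPIToken=${PROXMOX_API_TOKEN}" "${PROXMOX_URL}/version"
```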

3. Option A – Standalone Terraform quickstart

Use this when you want to use the module directly, without Terragrunt or a particular monorepo layout.

3.1. Minimal project layout

On the control node:

mkdir -p ~/proxmox-sdn-quickstart
cd ~/proxmox-sdn-quickstart

touch main.tf variables.tf terraform.tfvars

3.2. variables.tf

variable "proxmox_url" {
  description = "Proxmox API URL (e.g., https://192.168.1.10:8006/api2/json)"
  type        = string
}

variable "proxmox_token" {
  description = "Proxmox API token (USER@REALM!TOKENID=UUID)"
  type        = string
  sensitive   = true
}

variable "proxmox_insecure" {
  description = "Skip TLS verification"
  type        = bool
  default     = true
}

variable "proxmox_node" {
  description = "Proxmox node name (e.g., hybridhub)"
  type        = string
}

variable "proxmox_host" {
  description = "Proxmox host IP for SSH (host-level L3/NAT/DHCP orchestration)"
  type        = string
}

3.3. main.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    proxmox = {
      source  = "bpg/proxmox"
      version = ">= 0.50.0"
    }
  }
}

provider "proxmox" {
  endpoint  = var.proxmox_url
  api_token = var.proxmox_token
  insecure  = var.proxmox_insecure
}

module "sdn" {
  source  = "hybridops-tech/sdn/proxmox"
  version = "~> 0.1.2"

  # SDN zone ID must be <= 8 chars, lowercase, no dashes
  zone_name    = "hybzone"
  zone_bridge  = "vmbr0"
  proxmox_node = var.proxmox_node
  proxmox_host = var.proxmox_host

  # Host-level routing and services
  enable_host_l3 = true   # configure gateways on vnet* interfaces
  enable_snat    = true   # SNAT out of zone_bridge/uplink_interface
  enable_dhcp    = true   # DHCP where ranges are defined (see table below)

  dns_domain = "hybridops.local"
  dns_lease  = "24h"

  vnets = {
    vnetmgmt = {
      vlan_id     = 10
      description = "Management network"
      subnets = {
        mgmt = {
          cidr    = "10.10.0.0/24"
          gateway = "10.10.0.1"

          # DHCP enabled implicitly because:
          # - enable_dhcp = true
          # - dhcp_range_start / dhcp_range_end are set
          dhcp_range_start = "10.10.0.100"
          dhcp_range_end   = "10.10.0.200"
          dhcp_dns_server  = "8.8.8.8"
        }
      }
    }

    # vnetobs, vnetdev, vnetstag, vnetprod, vnetlab can be added here.
  }

  # Provider-related variables passed through to the module
  proxmox_url      = var.proxmox_url
  proxmox_token    = var.proxmox_token
  proxmox_insecure = var.proxmox_insecure
}

DHCP behaviour at a glance

This module treats DHCP as a host-side add-on on top of SDN + L3.
You can safely run L3 + NAT without DHCP, or toggle DHCP per subnet.

| enable_host_l3 | enable_dhcp | Subnet flags / ranges | Result |
|---|---|---|---|
| false | false | anything | Pure L2 SDN only. No host gateways, no NAT, no DHCP. |
| true | false | ranges optional | Host has .1 gateway per subnet, optional SNAT, no DHCP. |
| true | true | dhcp_range_start + dhcp_range_end set, dhcp_enabled omitted | DHCP enabled for that subnet (implicit, “ranges = on”). |
| true | true | ranges set, dhcp_enabled = true | DHCP enabled for that subnet (explicit). |
| true | true | ranges set, dhcp_enabled = false | DHCP disabled – ranges treated as documentation only. |

Guardrails:

  • enable_dhcp = true requires enable_host_l3 = true so dnsmasq can bind to the VNet interfaces.
  • If dhcp_enabled = true, both dhcp_range_start and dhcp_range_end must be set for that subnet.
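For example, a subnet can record its ranges without serving DHCP by setting `dhcp_enabled = false` explicitly. This sketch reuses the subnet shape from the main.tf example above:

```hcl
subnets = {
  mgmt = {
    cidr    = "10.10.0.0/24"
    gateway = "10.10.0.1"

    # Ranges are recorded for documentation, but dnsmasq is NOT configured
    # for this subnet, because dhcp_enabled is explicitly false.
    dhcp_enabled     = false
    dhcp_range_start = "10.10.0.100"
    dhcp_range_end   = "10.10.0.200"
  }
}
```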

3.4. terraform.tfvars

Use either real values or a redacted example:

# Proxmox API configuration
proxmox_url      = "https://<PROXMOX-IP>:8006/api2/json"
proxmox_token    = "automation@pam!infra-token=<YOUR-API-TOKEN-SECRET>"
proxmox_insecure = true

# Proxmox node configuration (single-node)
proxmox_node = "<PROXMOX-NODE-NAME>"  # e.g. "hybridhub"
proxmox_host = "<PROXMOX-IP>"         # usually same as <PROXMOX-IP> above

Replace the placeholders with real values when running this in your own environment.
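Alternatively, you can keep secrets out of terraform.tfvars entirely: Terraform reads any environment variable named `TF_VAR_<name>` as the input variable `<name>`. A sketch reusing the exports from section 2 (the concrete values here are illustrative):

```shell
# Illustrative stand-ins for the PROXMOX_* exports from section 2.
PROXMOX_URL="https://192.0.2.10:8006/api2/json"
PROXMOX_TOKEN_ID="automation@pam!infra-token"
PROXMOX_TOKEN_SECRET="00000000-0000-0000-0000-000000000000"
PROXMOX_NODE="hybridhub"

# Terraform maps TF_VAR_<name> onto variable "<name>" automatically.
export TF_VAR_proxmox_url="$PROXMOX_URL"
export TF_VAR_proxmox_token="${PROXMOX_TOKEN_ID}=${PROXMOX_TOKEN_SECRET}"
export TF_VAR_proxmox_node="$PROXMOX_NODE"
export TF_VAR_proxmox_host="192.0.2.10"

echo "$TF_VAR_proxmox_token"
```

With these exported, `terraform plan` and `apply` pick the values up without the corresponding entries in terraform.tfvars.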

3.5. Apply with Terraform

cd ~/proxmox-sdn-quickstart

terraform init
terraform plan
terraform apply

This will:

  • Create SDN zone hybzone.
  • Create SDN VNets (for example vnetmgmt).
  • Create SDN subnets and configure gateways on VNet interfaces when enable_host_l3 = true.
  • Configure NAT on the Proxmox host when enable_snat = true.
  • Generate dnsmasq DHCP snippets under /etc/dnsmasq.d/ for subnets that have DHCP enabled.
  • Reload SDN and dnsmasq on the Proxmox node.

4. Option B – Terragrunt SDN stack (HybridOps example)

Use this when you are working inside a Terragrunt-based monorepo. The example below assumes the HybridOps layout:

hybridops-platform/infra/terraform/live-v1/onprem/proxmox/core/00-foundation/network-sdn/
  ├─ terragrunt.hcl
  ├─ README.md
  └─ sdn_operations.md

The module remains external:

  • Registry: hybridops-tech/sdn/proxmox
  • GitHub: https://github.com/hybridops-tech/terraform-proxmox-sdn

4.1. Terragrunt wiring (example)

In network-sdn/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders("root.hcl")
}

terraform {
  source = "tfr://registry.terraform.io/hybridops-tech/sdn/proxmox?version=0.1.2"
}

locals {
  # Environment is typically loaded in root.hcl into local.env
  proxmox_url   = local.env.PROXMOX_URL
  proxmox_token = "${local.env.PROXMOX_TOKEN_ID}=${local.env.PROXMOX_TOKEN_SECRET}"

  proxmox_insecure = true
  proxmox_node     = local.env.PROXMOX_NODE
  proxmox_host     = split(":", local.env.PROXMOX_HOST)[0]

  vnets = {
    vnetmgmt = {
      vlan_id     = 10
      description = "Management network"
      subnets = {
        submgmt = {
          cidr             = "10.10.0.0/24"
          gateway          = "10.10.0.1"
          dhcp_range_start = "10.10.0.120"
          dhcp_range_end   = "10.10.0.220"
          dhcp_dns_server  = "8.8.8.8"
        }
      }
    }

    vnetobs = {
      vlan_id     = 11
      description = "Observability network"
      subnets = {
        subobs = {
          cidr             = "10.11.0.0/24"
          gateway          = "10.11.0.1"
          dhcp_range_start = "10.11.0.120"
          dhcp_range_end   = "10.11.0.220"
          dhcp_dns_server  = "8.8.8.8"
        }
      }
    }

    vnetdata = {
      vlan_id     = 12
      description = "Shared services / data tier"
      subnets = {
        subdata = {
          cidr             = "10.12.0.0/24"
          gateway          = "10.12.0.1"
          dhcp_range_start = "10.12.0.120"
          dhcp_range_end   = "10.12.0.220"
          dhcp_dns_server  = "8.8.8.8"
        }
      }
    }

    vnetdev = {
      vlan_id     = 20
      description = "Development network"
      subnets = {
        subdev = {
          cidr    = "10.20.0.0/24"
          gateway = "10.20.0.1"
        }
      }
    }

    vnetstag = {
      vlan_id     = 30
      description = "Staging network"
      subnets = {
        substag = {
          cidr    = "10.30.0.0/24"
          gateway = "10.30.0.1"
        }
      }
    }

    vnetprod = {
      vlan_id     = 40
      description = "Production network"
      subnets = {
        subprod = {
          cidr    = "10.40.0.0/24"
          gateway = "10.40.0.1"
        }
      }
    }

    vnetlab = {
      vlan_id     = 50
      description = "Lab network"
      subnets = {
        sublab = {
          cidr    = "10.50.0.0/24"
          gateway = "10.50.0.1"
        }
      }
    }
  }
}

inputs = {
  zone_name    = "hybzone"
  zone_bridge  = "vmbr0"
  proxmox_node = local.proxmox_node
  proxmox_host = local.proxmox_host

  # Host-level toggles
  enable_host_l3 = true
  enable_snat    = true
  enable_dhcp    = true

  dns_domain = "hybridops.local"
  dns_lease  = "24h"

  vnets = local.vnets

  # Provider-related variables are passed through to the module
  proxmox_url      = local.proxmox_url
  proxmox_token    = local.proxmox_token
  proxmox_insecure = local.proxmox_insecure
}

The exact local.env wiring may differ in your root.hcl, but the pattern is:

  • Parent root.hcl → loads infra/env/proxmox.credentials.tfvars (or similar).
  • network-sdn/terragrunt.hcl → passes concrete values to the module.
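One minimal way to wire this, if you are happy to read the shell exports from section 2 directly, is Terragrunt's built-in `get_env()` in root.hcl. This is a sketch only; your real root.hcl may load a credentials tfvars file instead:

```hcl
# root.hcl (sketch) - expose shell exports as local.env for child stacks.
locals {
  env = {
    PROXMOX_URL          = get_env("PROXMOX_URL", "")
    PROXMOX_TOKEN_ID     = get_env("PROXMOX_TOKEN_ID", "")
    PROXMOX_TOKEN_SECRET = get_env("PROXMOX_TOKEN_SECRET", "")
    PROXMOX_NODE         = get_env("PROXMOX_NODE", "")
    PROXMOX_HOST         = get_env("PROXMOX_HOST", "")
  }
}
```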

4.2. Apply with Terragrunt

From the SDN stack directory:

cd hybridops-platform/infra/terraform/live-v1/onprem/proxmox/core/00-foundation/network-sdn

terragrunt init
terragrunt plan
terragrunt apply

Behaviour is functionally equivalent to the standalone Terraform example, but:

  • It is integrated into the full live-v1 layout.
  • It participates in the same environment promotion and validation story as the rest of HybridOps.

5. Attach VMs to SDN VNets

In the Proxmox UI:

  1. Edit a VM.
  2. Add or update a network device:
     • Bridge: vnetmgmt (or another VNet created by the module).
     • VLAN tag: none (tagging is handled at SDN/VNet level).
  3. Boot the VM.

On the VM, verify:

ip addr
ip route

ping 10.10.0.1        # gateway
ping 8.8.8.8          # internet (if NAT configured)
nslookup google.com   # DNS
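The ping checks can also be scripted. This sketch loops over the gateway and an external IP from the commands above and reports per-target status; adjust the target list to your own subnet plan:

```shell
# Minimal connectivity report for a freshly booted VM.
# Targets mirror the manual checks above.
for target in 10.10.0.1 8.8.8.8; do
  if ping -c 1 -W 2 "$target" >/dev/null 2>&1; then
    echo "OK   $target"
  else
    echo "FAIL $target"
  fi
done
```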

6. Validation and run records

6.1. Control node (Terraform or Terragrunt)

For standalone Terraform:

cd ~/proxmox-sdn-quickstart
terraform state list
terraform output

For Terragrunt:

cd hybridops-platform/infra/terraform/live-v1/onprem/proxmox/core/00-foundation/network-sdn
terragrunt state list
terragrunt output

You should see:

  • SDN zone resource.
  • SDN VNets.
  • SDN subnets.
  • null_resource.gateway_setup when host L3 is enabled.
  • null_resource.nat_setup when SNAT is enabled.
  • null_resource.dhcp_setup when DHCP is enabled.

6.2. SDN module outputs

The SDN module exposes three primary outputs:

| Name | Type | Description |
|---|---|---|
| zone_name | string | SDN zone name (Proxmox SDN zone ID). |
| vnets | map | Map of VNet keys to objects with id, zone, and vlan_id. |
| subnets | map | Map of subnet keys (<vnet>-<subnet>) to objects with CIDR, gateway, and DHCP configuration. |

After terraform apply (or terragrunt apply), you can inspect them:

# Standalone Terraform
terraform output zone_name
terraform output vnets
terraform output subnets

# Terragrunt (same outputs, via Terragrunt wrapper)
terragrunt output zone_name
terragrunt output vnets
terragrunt output subnets

Example (simplified):

subnets = {
  "vnetmgmt-mgmt" = {
    id               = "hybzone-10.10.0.0-24"
    vnet             = "vnetmgmt"
    cidr             = "10.10.0.0/24"
    gateway          = "10.10.0.1"
    dhcp_enabled     = true
    dhcp_range_start = "10.10.0.120"
    dhcp_range_end   = "10.10.0.220"
    dhcp_dns_server  = "8.8.8.8"
  }
}

These outputs can be consumed by downstream modules (for example VM modules that attach NICs to specific VNets or use subnet CIDRs).
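For scripted consumption, `terraform output -json` pairs well with jq (assumed installed here). The sketch below runs a jq filter against an inline sample of the output shape shown above; against real state you would pipe `terraform output -json subnets` instead:

```shell
# Inline sample of the 'subnets' output shape, standing in for
# 'terraform output -json subnets' so the snippet is self-contained.
sample='{"vnetmgmt-mgmt":{"cidr":"10.10.0.0/24","gateway":"10.10.0.1","dhcp_enabled":true}}'

# One line per subnet: key, CIDR, gateway.
echo "$sample" | jq -r 'to_entries[] | "\(.key) \(.value.cidr) \(.value.gateway)"'
```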


6.3. Proxmox node checks

# SDN zones and VNets
ssh root@<PROXMOX_HOST> 'pvesh get /cluster/sdn/zones'
ssh root@<PROXMOX_HOST> 'pvesh get /cluster/sdn/vnets'

# VNet bridges
ssh root@<PROXMOX_HOST> 'ip link show | grep vnet'

# Gateway IPs
ssh root@<PROXMOX_HOST> 'ip addr show | grep "inet 10\."'

# DHCP config snippets & service (when DHCP is enabled)
ssh root@<PROXMOX_HOST> 'ls -1 /etc/dnsmasq.d/'
ssh root@<PROXMOX_HOST> 'systemctl list-units "dnsmasq@hybridops-sdn-dhcp-*" --no-pager || true'
ssh root@<PROXMOX_HOST> 'journalctl -u "dnsmasq@hybridops-sdn-dhcp-*" -n 20 --no-pager || true'

7. Troubleshooting

For module-specific edge cases, also see:

  • KNOWN-ISSUES-terraform-proxmox-sdn.md in the module repository.
  • Any SDN operations runbook you maintain alongside your SDN stack (for HybridOps: sdn_operations.md in network-sdn).

7.1. SDN zone or VNet name errors

Symptoms

  • Terraform fails with messages about invalid id or length.
  • Errors mention SDN identifiers or “at most 8 characters”.

Cause

Proxmox SDN enforces naming rules:

  • IDs must be ≤ 8 characters.
  • Typically lowercase.
  • No dashes.

Fix

Ensure zone_name and all VNet keys in vnets comply:

  • Good: hybzone, vnetmgmt, vclst01m
  • Bad: basic-zone, vnet-mgmt, cluster-zone

Re-run plan and apply.
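A rough pre-flight check for the rules above can catch bad names before a plan. This is an approximation of Proxmox's validation (lowercase alphanumeric, at most 8 characters, no dashes), not its exact regex:

```shell
# Approximate SDN ID check: lowercase letters/digits only, length 1-8.
check_sdn_id() {
  case "$1" in
    *[!a-z0-9]*|"")
      echo "BAD $1"
      return 1
      ;;
  esac
  if [ "${#1}" -le 8 ]; then
    echo "OK $1"
  else
    echo "BAD $1"
    return 1
  fi
}

check_sdn_id hybzone            # valid
check_sdn_id vnetmgmt           # valid (exactly 8 chars)
check_sdn_id basic-zone || true # invalid: contains a dash
check_sdn_id clusterzone || true # invalid: longer than 8 chars
```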

7.2. SDN zone already exists

Symptoms

  • Error: sdn zone object ID 'hybzone' already defined.

Cause

The SDN zone was created manually or by a previous run.

Fix

Either:

  • Import the existing zone into Terraform state, or
  • Delete it from the Proxmox UI and re-apply.

In a lab, deleting and re-applying is usually fine; in production, prefer import.

7.3. DHCP units failing to start

Symptoms

  • systemctl status dnsmasq@hybridops-sdn-dhcp-* shows failed or repeated restarts.
  • Logs contain unknown interface vnet* or similar.

Cause

dnsmasq is trying to bind to a VNet interface that does not yet exist or has been torn down by an SDN reload.

Fix

  • Confirm VNets exist and are up:
    ssh root@<PROXMOX_HOST> 'ip link show | grep vnet'
    
  • Confirm gateway IPs are present when enable_host_l3 = true:
    ssh root@<PROXMOX_HOST> 'ip addr show | grep "inet 10\."'
    
  • Re-run the SDN stack apply so that, in order:
    • SDN is applied via pvesh set /cluster/sdn.
    • VNets come up.
    • Host L3 and DHCP are applied.

If a unit is stuck in a crash loop, you can stop and disable it, then re-apply:

ssh root@<PROXMOX_HOST> '
  systemctl list-unit-files "dnsmasq@hybridops-sdn-dhcp-*" --no-legend \
    | awk "{print \$1}" \
    | while read -r unit; do
        [ -n "$unit" ] || continue
        systemctl disable "$unit" --now || true
      done
'
# Re-run Terraform/Terragrunt apply afterwards

7.4. SDN destroyed but VNet interfaces persist

Symptoms

  • destroy completes successfully.
  • /etc/pve/sdn/*.cfg no longer contains the VNets.
  • ip link show still lists vnet*.
  • Proxmox UI shows VNets as deleted or error.

Cause

On Proxmox VE 8.x, kernel interfaces may persist after SDN objects are removed via API.

Workaround (lab)

ssh root@<PROXMOX_HOST> '

for vnet in vnetdata vnetdev vnetlab vnetmgmt vnetobs vnetprod vnetstag; do
  ip link set "$vnet" down 2>/dev/null || true
  ip link delete "$vnet" 2>/dev/null || true
done

ifreload -a || true
pvesh set /cluster/sdn || true
'

Use with care if other SDN zones/VNets exist on the node.


8. Planned: Multi-node SDN clusters

Status: Design/roadmap – not yet supported in v0.1.x.
The goal is to reuse the same module to manage a shared SDN zone that spans multiple Proxmox nodes.

8.1. Target design

The planned multi-node model looks like:

  • One SDN zone (for example clust01) shared across all nodes.
  • Per-cluster VNets with SDN-safe identifiers (≤ 8 chars, no dashes), for example:
  • vclst01m – cluster management.
  • vclst01d – cluster data.
  • A single dnsmasq configuration that serves DHCP for all cluster VLANs.

In Terraform module terms, this would be driven by:

  • zone_name – shared across all nodes.
  • A future nodes input – list of node names (["pve1", "pve2", "pve3"]).
  • A shared vnets map (same structure as in the single-node examples).

8.2. Example scaffold (subject to change)

Do not use this in production yet – this is a design sketch for a future >= 0.2.0 release.

module "sdn_cluster" {
  source  = "hybridops-tech/sdn/proxmox"
  version = "~> 0.2.0"

  # Shared SDN zone across nodes (≤ 8 chars, no dashes)
  zone_name = "clust01"

  # Planned: list of nodes instead of a single node
  nodes = [
    "pve1",
    "pve2",
  ]

  vnets = {
    vclst01m = {
      vlan_id     = 200
      description = "Cluster management network"
      subnets = {
        mgmt = {
          cidr             = "10.200.0.0/24"
          gateway          = "10.200.0.1"
          dhcp_range_start = "10.200.0.120"
          dhcp_range_end   = "10.200.0.220"
          dhcp_dns_server  = "8.8.8.8"
        }
      }
    }

    # vclst01d, vclst01s, etc.
  }

  dns_domain = "hybridops.local"
  dns_lease  = "24h"

  proxmox_url      = var.proxmox_url
  proxmox_token    = var.proxmox_token
  proxmox_insecure = var.proxmox_insecure
}

A prototype of this layout is maintained under:

  • examples/multi-node/ in the module repository (draft, subject to change).

Optional: Proxmox SDN GUI compatibility helper

Some Proxmox VE releases can misreport SDN state (for example, missing VNet gateways or stale DHCP status) even though the SDN configuration and data plane are working correctly.

HybridOps includes an optional SDN auto-healing helper:

  • Script: control/tools/helper/proxmox/install-sdn-auto-healing.sh
  • Purpose: normalise /etc/network/interfaces.d/sdn, ensure expected VNet gateways are present, and sync SDN state so the Proxmox GUI reflects the actual configuration.

This helper is:

  • Not required for correct routing, NAT, or DHCP.
  • Tightly coupled to the HybridOps reference VLAN plan (10.10.0.0/24, 10.11.0.0/24, 10.12.0.0/24, 10.20.0.0/24, 10.30.0.0/24, 10.40.0.0/24, 10.50.0.0/24).
  • Safe to disable or remove if you upgrade to a Proxmox version where the SDN GUI behaves correctly.

9. References

Module & code

If you are using this inside HybridOps