Skip to content

Runbook – Bootstrap Jenkins Controller on Control Node

Intent

  • Purpose: Deploy the Jenkins controller on the control node with Docker and Configuration as Code.
  • Owner: Platform Operations
  • Trigger: New control node build, controller rebuild, or controlled controller recovery.
  • Impact: Jenkins control plane availability; pipeline scheduling may pause during deployment or restart.
  • Severity: P2
  • Pre-reqs: SSH access to the control node, Docker Engine and Compose v2, and the bootstrap secrets required for the admin user.
  • Rollback: Stop the compose stack, revert JCasC or config inputs, and restore JENKINS_HOME from backup or snapshot if required.

Context

This runbook establishes the long-lived Jenkins control plane on the control node while keeping build execution off the controller.

  • Controller runs in Docker using docker compose.
  • Controller is configured by JCasC loaded from /var/jenkins_home/casc_configs.
  • Build workload is expected to run on agents (Docker bootstrap agent initially; RKE2 agents in steady state).

This is aligned with the HybridOps Jenkins placement model in ADR-0603.


Preconditions and safety checks

1) Confirm you are targeting the correct host:

hostnamectl
ip -br a

2) Confirm Docker is healthy:

docker version --format '{{.Server.Version}}'
docker compose version

3) Confirm ports are available (or intentionally bound):

ss -lntp | egrep ':8080|:50000' || true

4) Confirm the secrets you will use are available:

  • jenkins_admin_password (required for bootstrap).
  • Optional: jenkins_github_token, other credential material.

Steps

1) Apply the controller role

Run your playbook that applies:

  • hybridops.common.docker_engine (baseline, if needed)
  • hybridops.app.jenkins_controller

Example execution:

ansible-playbook -i <inventory> <playbook>.yml

Expected result

  • Controller image exists (built from the pinned plugin list).
  • Compose stack starts successfully.
  • Healthcheck passes (if enabled).

Evidence

  • Save role output log to: <runtime-root>/logs/ansible/jenkins_controller/<run_id>.log

2) Validate container health

docker ps --format "table {{.Names}}    {{.Status}} {{.Image}}" | egrep "jenkins-controller|NAME"
docker logs --tail 200 jenkins-controller

Expected result

  • jenkins-controller is Up and, when enabled, (healthy).
  • Logs show Jenkins is fully up and running.

Evidence

  • <runtime-root>/logs/ci/jenkins_controller/docker_ps_<run_id>.txt
  • <runtime-root>/logs/ci/jenkins_controller/docker_logs_<run_id>.txt

3) Validate HTTP and API probes (host-local)

curl -sS -o /dev/null -w "HTTP=%{http_code}
" http://localhost:8080/login
curl -sS -u "admin:<ADMIN_PASSWORD_OR_TOKEN>" http://localhost:8080/whoAmI/api/json && echo

Expected result

  • /login returns 200/302 (depending on configuration).
  • /whoAmI/api/json returns authenticated JSON for the admin user.

Evidence

  • <runtime-root>/logs/ci/jenkins_controller/http_probe_<run_id>.txt
  • <runtime-root>/logs/ci/jenkins_controller/whoami_<run_id>.json

4) Confirm JCasC inputs are present

sudo ls -la /opt/jenkins/jenkins_home/casc_configs/
sudo sed -n '1,200p' /opt/jenkins/jenkins_home/casc_configs/00-controller.yaml

Expected result

  • Controller baseline file exists and renders expected values.

Evidence

  • <runtime-root>/logs/ci/jenkins_controller/casc_listing_<run_id>.txt

5) Optional: Admin API token management

If enabled, the role can mint and persist an admin API token to a root-only file. Retrieve it on the controller host:

sudo cat /opt/jenkins/.secrets/admin_api_token

Use it as the Basic Auth “password”:

TOKEN="$(sudo sh -c 'tr -d "\n" </opt/jenkins/.secrets/admin_api_token')"
curl -sS -u "admin:${TOKEN}" http://localhost:8080/whoAmI/api/json && echo

Security note: avoid printing tokens in CI logs; rotate if exposure is suspected.


Verification

Success criteria:

  • jenkins-controller container is running and healthy.
  • /login is reachable and /whoAmI authenticates.
  • JCasC files load without boot failures.
  • Controller is configured for controller-only operation (typically numExecutors: 0 in steady state).

Post-actions and clean-up

  • Reduce controller executors back to 0 after bootstrap validation (if you temporarily set 1 for smoke tests).
  • Ensure an agent execution surface is available (Docker bootstrap agent or RKE2 agents) before scheduling production-like pipelines.
  • Rotate secrets/tokens if any were exposed during the run.

References


License: MIT-0 for code, CC-BY-4.0 for documentation