Overview

I’m always looking for the easy way. Certbot already makes retrieving TLS certificates from Let’s Encrypt easy, but getting those certificates “into production” tends to be less easy. This is the easy way to get Let’s Encrypt TLS certificates into production with HashiCorp’s Nomad.

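This is an overview of what we’ll be doing:

  • Retrieving an API token from Hetzner DNS
  • Configuring Nomad host volumes to hold the certificates
  • Creating a single Nomad job that runs Certbot as a prestart task and then starts Nginx
  • Deploying the job with the Nomad CLI (or, optionally, with Terraform)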

Here is what we won’t be doing:

  • Setting up Nomad from scratch
  • Reviewing how to set up Nomad host volumes
  • Setting up Nomad jobs that Nginx can reverse proxy to. However, the example nginx configuration includes a proxy_pass block that uses Nomad service discovery
  • Auto-configuring DNS records to point at the Nomad client running our nginx job

If anything mentioned here is not exactly the same as your stack, don’t worry: the general approach carries over to other stacks. For example, let’s say you’re not using Hetzner, but AWS Route 53 for DNS – you would swap the Hetzner certbot Docker image for the official certbot Route 53 image and adjust the authenticator flags in run.sh to match.

Component           Details
Hosting Provider    Hetzner – no frills, simple, cheap cloud computing and DNS
Reverse Proxy/TLS   Nginx – the old standby
Job Scheduling      HashiCorp Nomad – no frills, simple, job scheduling

If you’re here, you’re probably familiar with each of these, at least in theory if not in experience.

Let’s Go

Enough setup, let’s get to it.

What to do with Hetzner

Head over to Hetzner DNS and retrieve an API token: https://dns.hetzner.com/settings/api-token

Save your token somewhere; you’ll need it later.

What to do with Nomad

First of all, while I’m not going to get into the details of setting up host volumes, I will provide my cluster’s client configuration for host volumes:

Create host volumes in /opt/nomad/config/client.hcl

client {
  enabled = true

  host_volume "letsencrypt" {
    path = "/var/www/certbot"
    read_only = false
  }

  host_volume "certbot" {
    path = "/etc/certbot"
    read_only = false
  }
}

What does this config do? We’re making the directories /var/www/certbot and /etc/certbot on our Nomad clients (that is, the nodes that run Nomad jobs) available to be mounted inside our Nomad jobs. Doing so has one implication that you should be aware of: every node that your nginx job runs on will need to fetch new TLS certificates from Let’s Encrypt the first time the job runs there. Therefore, if you’re deploying your nginx job to dozens of nodes, or auto-scaling up and down frequently, you can hit Let’s Encrypt rate limits. So tread lightly. Ideally, instead of host volumes, you would use CSI volumes, where the certbot job claims a read/write CSI volume on startup and the nginx jobs make multi-node read claims on the same volume. That way, a single set of certs is shared across all nginx nodes.
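To make the CSI idea a little more concrete, here is a rough sketch of what the nginx side of such a setup might look like. It is not part of the job used in this post. It assumes a CSI plugin is already running in the cluster, the volume ID "letsencrypt-certs" is made up, and the access_mode you can actually use depends on what your plugin supports and on the capabilities the volume was registered with.

```hcl
# Sketch only: a read-only claim on a shared CSI volume for the nginx group.
# Assumes a volume with the hypothetical ID "letsencrypt-certs" was registered
# with a capability that allows one writer (certbot) and many readers (nginx).
group "work" {
  volume "certs" {
    type            = "csi"
    source          = "letsencrypt-certs" # hypothetical volume ID
    read_only       = true                # nginx only needs to read the certs
    access_mode     = "multi-node-single-writer"
    attachment_mode = "file-system"
  }

  task "nginx" {
    driver = "docker"

    volume_mount {
      volume      = "certs"
      destination = "/etc/nginx/ssl/" # same path the nginx config reads from
      read_only   = true
    }
  }
}
```

In this arrangement the certbot task would live in its own job and claim the same volume with read_only = false, so only one place ever requests certificates.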

After adding this client configuration, restart your client nodes.

Create a combined Certbot / Nginx job

Full Example job.nomad.hcl

variable "hetzner_dns_access_token" {
  type = string
}

job "nginx" {
  datacenters = ["dc1"]
  namespace = "default"

  spread {
    attribute =  "${node.unique.name}"
    weight    = 100
  }

  update {
    stagger      = "30s"
    max_parallel = 1
  }

  group "work" {
    # My cluster node names are cluster-0, cluster-1, etc.
    # I run this job on a single node at a time, and "pin" it to the same node using the below constraint
    # This way, once certificates are generated for my domains, they'll be reused until they expire since
    # the job always runs on the same host where they were generated.
    count = 1
    constraint {
      attribute = "${node.unique.name}"
      operator = "set_contains_any"

      # CHANGE THIS TO THE NAME OF THE NODE THAT WILL RUN NGINX
      value    = "cluster-0"
    }

    # Claim our host volumes, so they can be mounted inside our certbot/nginx jobs
    volume "certbot" {
      type      = "host"
      read_only = false
      source    = "certbot"
    }

    volume "letsencrypt" {
      type      = "host"
      read_only = false
      source    = "letsencrypt"
    }

    network {
      port "http" {
        static = 80
      }
      port "https" {
        static = 443
      }
    }

   # This is the task that fetches certificates for all of our domains and places them on our host volume
   # Doing so makes the certificates available in our `nginx` job
   # Note: This task's lifecycle is "prestart", meaning it must complete before the `nginx` task starts
   task "certbot-all-domains" {
      driver = "docker"
      user = "root"

      config {
        image = "inetsoftware/certbot-dns-hetzner"
        # We use a custom entrypoint so we can script this task's behavior
        entrypoint = ["${NOMAD_TASK_DIR}/run.sh"]
      }

      template {
        data = <<EOF
#!/bin/sh

# Note, we're not dealing with bash here. We're dealing with 'ash', as this docker image is BusyBox linux
# Hence, we can't use normal bash arrays. Separate each domain by SPACES, NOT COMMAS
DOMAINS="example.com foo.com bar.com"

# Set this to an empty string "" for production mode
STAGING="--staging"

DNS_PROPAGATION_SECONDS=30

set -- $DOMAINS

while [ -n "$1" ]; do
  domain=$1
  echo "Domain: $domain"

  # The wildcard is quoted so the shell never tries to expand it as a filename glob
  certbot certonly -v $STAGING --cert-name "${domain}" -d "${domain}" -d "*.${domain}" --agree-tos -m youremail@example.com --keep-until-expiring --authenticator dns-hetzner --dns-hetzner-credentials /alloc/etc/letsencrypt/hetzner-certbot-config.ini --dns-hetzner-propagation-seconds=${DNS_PROPAGATION_SECONDS}

  echo "Done with: $domain"
  shift
done
EOF

        destination   = "${NOMAD_TASK_DIR}/run.sh"
        perms = "755"
      }

      # Make our hetzner DNS token available to the certbot job
      template {
        data = <<EOF
dns_hetzner_api_token = ${var.hetzner_dns_access_token}
EOF

        destination   = "${NOMAD_ALLOC_DIR}/etc/letsencrypt/hetzner-certbot-config.ini"
        change_mode   = "signal"
        change_signal = "SIGHUP"
      }

      resources {
        cpu = 50
        memory = 100
      }


      volume_mount {
        volume      = "certbot"
        destination = "/etc/letsencrypt" #<-- in the container
        read_only   = false
      }

      volume_mount {
        volume      = "letsencrypt"
        destination = "/var/www/certbot" #<-- in the container
        read_only   = false
      }

      lifecycle {
        hook    = "prestart"
        sidecar = false
      }
    }

    task "nginx" {
      driver = "docker"

      template {
        data = <<EOF
# You'll need to manually create `server` blocks for every domain that you host
# Here we provide example server blocks for a single domain
server {
    listen 80;
    listen [::]:80;

    server_tokens off;
    gzip on;
    gzip_vary on;
    gzip_min_length 1024;
    gzip_proxied expired no-cache no-store private auth;
    gzip_types text/plain text/css text/xml text/javascript application/x-javascript application/xml;
    gzip_disable "MSIE [1-6]\.";

    server_name example.com www.example.com;

    # Force TLS
    location / {
        return 301 https://example.com$request_uri;
    }
}

server {
    listen 443 ssl http2;
    listen [::]:443 ssl http2;

    server_tokens off;
    gzip on;
    gzip_vary on;
    gzip_min_length 1024;
    gzip_proxied expired no-cache no-store private auth;
    gzip_types text/plain text/css text/xml text/javascript application/x-javascript application/xml;
    gzip_disable "MSIE [1-6]\.";
    server_name example.com;

    ssl_certificate /etc/nginx/ssl/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/nginx/ssl/live/example.com/privkey.pem;

    location / {

      # I run my application servers on nomad as well and use nomad service discovery to find them
      # This is provided only for example purpose. If you don't have a nomad job named "example.com-application"
      # this will not work
      {{ range nomadService "example.com-application" }}
      proxy_pass http://{{ .Address }}:{{ .Port }};
      {{ end }}
    }
}
EOF

        destination   = "local/conf/load-balancer.conf"
        change_mode   = "signal"
        change_signal = "SIGHUP"
      }

      config {
        image = "nginx"
        ports = ["http", "https"]
        volumes = [
          # Make our load-balancer.conf available in nginx's config directory
          "local/conf:/etc/nginx/conf.d",
        ]
        privileged = true
      }

      resources {
        cpu = 250
        memory = 250
      }

      # We're mounting the same host volume claims that the certbot task mounted
      volume_mount {
        volume      = "certbot"
        destination = "/etc/nginx/ssl/" #<-- in the container
        read_only   = false
      }

      volume_mount {
        volume      = "letsencrypt"
        destination = "/var/www/certbot" #<-- in the container
        read_only   = false
      }


      service {
        port = "http"

        # I use nomad for service discovery, but many nomad users prefer "consul" for this
        provider = "nomad"

        meta {
          meta = "General purpose load balancer"
        }

        # Take this check with a grain of salt and tune yours for yourself
        check {
          type     = "tcp"
          port     = "http"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}

While you’ll obviously want to change all the example.com values, there are three other things you’ll want to explicitly set:

  1. You’ll need to change the following line to the name of one of your Nomad client nodes: value = "cluster-0"

  2. When you’re ready for production, you’ll want to change STAGING="--staging" to STAGING="" in run.sh.

  3. And if it’s not obvious, you’ll also want to change the DOMAINS variable in run.sh: DOMAINS="example.com foo.com bar.com" to your domain(s).
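
If you would rather not edit the job file every time the pinned node changes, one option is to lift the node name into an HCL2 variable, the same way the Hetzner token is handled. The fragment below is a minimal sketch of that idea, not the job from this post: the variable name nginx_node is my own invention, and only the parts of the job that change are shown.

```hcl
# Hypothetical variable; it does not exist in the job above.
variable "nginx_node" {
  type    = string
  default = "cluster-0"
}

job "nginx" {
  group "work" {
    count = 1

    # Pin the group to the node named by the variable.
    constraint {
      attribute = "${node.unique.name}"
      value     = var.nginx_node # operator defaults to "="
    }
  }
}
```

With that in place, the node could be overridden at deploy time with an extra -var="nginx_node=cluster-1" flag, and the default applies when the flag is omitted.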

Deploy the job

nomad run -var="hetzner_dns_access_token=<YOUR TOKEN HERE>" job.nomad.hcl
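
Passing the token with -var leaves it in your shell history. As an alternative sketch, the Nomad CLI also accepts a -var-file flag that reads variable values from a file. The filename secrets.vars.hcl below is only an example; whatever you call it, keep it out of version control.

```hcl
# secrets.vars.hcl (hypothetical filename)
# One assignment per job variable declared in job.nomad.hcl.
hetzner_dns_access_token = "<YOUR TOKEN HERE>"
```

The job would then be deployed with nomad run -var-file=secrets.vars.hcl job.nomad.hcl.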

Bonus for Terraform users

If you use the Terraform Nomad provider, which I recommend you do, you can drop this into your terraform config to deploy the above job with Terraform:

resource "nomad_job" "nginx" {
  jobspec = file("job.nomad.hcl")

  hcl2 {
    enabled = true

    vars = {
      # For Certbot to perform dns-01 challenges
      hetzner_dns_access_token = var.hetzner_dns_access_token
    }
  }
}
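
The resource above reads var.hetzner_dns_access_token, so your Terraform config needs a matching variable declaration. If you don’t already have one, a minimal sketch might look like the following. The description text is mine, and marking the variable sensitive assumes a Terraform version that supports that argument (0.14 or newer).

```hcl
# Assumed companion declaration for the nomad_job resource above.
variable "hetzner_dns_access_token" {
  type        = string
  sensitive   = true # redacts the value in plan and apply output
  description = "Hetzner DNS API token used by certbot for DNS-01 challenges"
}
```

One way to supply the value without writing it to disk is the TF_VAR_hetzner_dns_access_token environment variable, which Terraform reads automatically.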

Summary

What we’ve done here is created a single Nomad job that:

  1. Has a prestart lifecycle task that ensures all certificates are available to the nginx job, using Nomad host volumes.
  2. Uses a Docker certbot image with the Hetzner plugin, which performs a DNS-01 challenge using Hetzner’s DNS API.
  3. Starts Nginx with Nomad host volumes mounted, containing the certificates retrieved using certbot.

Note that if you run this job on multiple hosts, or auto-scale this job across multiple hosts, you should be aware of Let’s Encrypt rate limits. If you end up requesting too many certificates for the same domain, LE can block your domain for long periods of time, so be careful.

Also note that the provided Nomad job does not automate the creation of DNS records pointing at the Nomad client running the nginx job. It only creates the TXT records needed for the DNS-01 challenge. This is why the job is pinned to a specific client node. Doing so allows the same certificates to be reused until they expire.

You will need to create your own DNS records pointing at the “pinned” node, for every domain listed in the DOMAINS variable.

Good luck, and feel free to reach out to me via email.