Let’s Encrypt (certbot) with Hashicorp’s Nomad, Nginx, and Docker, the easy way
Overview⌗
I’m always looking for the easy way. Certbot already makes retrieving TLS certificates from Let’s Encrypt easy. But it’s getting those certificates “into production” that tends to be less easy. This is the easy way to get Let’s Encrypt TLS certificates into production with Hashicorp’s Nomad.
This is an overview of what we’ll be doing:
- Using Nomad’s docker driver to run docker container jobs.
- Using a certbot docker image that performs a DNS-01 challenge using Hetzner’s DNS API.
- Using Nomad host volumes to share certificates between nomad tasks.
- Using Nomad’s lifecycle block to initialize certificates before Nginx startup.
- Using Nginx as a reverse proxy and for TLS termination.
Here is what we won’t be doing:
- Setting up Nomad from scratch
- Reviewing how to set up Nomad host volumes
- Setting up Nomad jobs that Nginx can reverse proxy to. However, the example nginx configuration shows one way to do this using Nomad service discovery.
- Auto-configuring DNS records to point at the Nomad client running our nginx job
If anything mentioned here is not exactly the same as your stack, don’t worry: the general approach carries over to other stacks. For example, if you use AWS Route 53 for DNS instead of Hetzner, you would swap the Hetzner certbot docker image for the official certbot route53 image and adjust the authenticator flags passed to certbot to match.
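As a rough sketch of what that swap involves: the image changes, and so do the authenticator flags handed to certbot in run.sh. The helper below is hypothetical (the function name and the provider keys are mine, not part of any tool). The Hetzner values are the ones used later in this post; the Route 53 values assume certbot's dns-route53 plugin, which picks up AWS credentials from the environment instead of an ini file.

```shell
#!/bin/sh
# Hypothetical helper: print the docker image and certbot authenticator flags
# for a DNS provider. Only the two providers mentioned in this post are covered.
certbot_provider() {
  case "$1" in
    hetzner)
      echo "image=inetsoftware/certbot-dns-hetzner"
      echo "flags=--authenticator dns-hetzner --dns-hetzner-credentials /alloc/etc/letsencrypt/hetzner-certbot-config.ini"
      ;;
    route53)
      # Assumption: AWS credentials arrive via the usual AWS_* environment variables
      echo "image=certbot/dns-route53"
      echo "flags=--dns-route53"
      ;;
    *)
      echo "unsupported provider: $1" >&2
      return 1
      ;;
  esac
}

certbot_provider route53
```

Everything else in the job (host volumes, the prestart lifecycle, the nginx task) should stay the same regardless of the provider.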
Component | Details |
---|---|
Hosting Provider | Hetzner – no frills, simple, cheap cloud computing and DNS |
Reverse Proxy/TLS | Nginx – the old standby |
Job Scheduling | Hashicorp Nomad – no frills, simple, job scheduling |
If you’re here, you’re probably familiar with each of these, at least in theory if not in experience.
Let’s Go⌗
Enough setup, let’s get to it.
What to do with Hetzner⌗
Head over to Hetzner DNS and retrieve an API token: https://dns.hetzner.com/settings/api-token
Save your token somewhere; you’ll need it later.
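One option for "somewhere" is a file that only your user can read. This is a minimal sketch; the function name and the path in the usage line are placeholders of mine, not something Hetzner or Nomad requires.

```shell
#!/bin/sh
# Hypothetical helper: read a token from stdin and store it in a file
# that only the current user can read or write.
save_token() {
  token_file=$1
  (
    umask 077                              # anything created here is private
    mkdir -p "$(dirname "$token_file")" || exit 1
    cat > "$token_file" || exit 1
    chmod 600 "$token_file"                # also tighten a pre-existing file
  )
}

# Usage (example path): run this, paste the token, press Enter, then Ctrl-D
#   save_token "$HOME/.config/hetzner/dns-token"
```

Pasting the token into stdin, rather than passing it as an argument, keeps it out of your shell history.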
What to do with Nomad⌗
First of all, while I’m not going to get into the details of setting up host volumes, I will provide my cluster’s client configuration for host volumes:
Create host volumes in /opt/nomad/config/client.hcl⌗
client {
  enabled = true

  host_volume "letsencrypt" {
    path      = "/var/www/certbot"
    read_only = false
  }

  host_volume "certbot" {
    path      = "/etc/certbot"
    read_only = false
  }
}
What does this config do? We’re making the directories /var/www/certbot and /etc/certbot from our nomad clients (that is, the nodes that run nomad jobs) available to be mounted inside our nomad jobs. Doing so has one implication that you should be aware of: every node that your nginx job runs on will need to fetch new TLS certificates from Let’s Encrypt the first time the job runs. Therefore, if you’re deploying your nginx job to dozens of nodes, or auto-scaling up and down frequently, this can result in hitting Let’s Encrypt rate limits. So tread lightly. Ideally, instead of “host volumes”, you can use CSI volumes, where the certbot job claims a read/write CSI volume on startup, and nginx jobs make multi-node read claims to the same volume. That way, a single set of certs is shared across all nginx nodes.
After adding this client configuration, restart your client nodes.
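Nomad checks that each host_volume path exists when the client starts, so it is worth creating the directories before the restart. A minimal sketch, meant to be run as root on each client node. The function name and the optional prefix argument (there only so you can try it against a scratch directory first) are my own additions.

```shell
#!/bin/sh
# Hypothetical helper: create the two host volume directories from client.hcl.
# An optional prefix lets you dry-run against a scratch directory.
ensure_host_volume_dirs() {
  prefix=${1:-}
  for dir in /var/www/certbot /etc/certbot; do
    mkdir -p "${prefix}${dir}" || return 1
    echo "ok: ${prefix}${dir}"
  done
}

# On a client node (as root), then restart the client. The unit name "nomad"
# is an assumption about how Nomad is installed on your hosts:
#   ensure_host_volume_dirs && systemctl restart nomad
```

Once the client is back, running nomad node status -self -verbose on that node should list both volumes in its host volumes section.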
Create a combined Certbot / Nginx job⌗
variable "hetzner_dns_access_token" {
  type = string
}

job "nginx" {
  datacenters = ["dc1"]
  namespace   = "default"

  spread {
    attribute = "${node.unique.name}"
    weight    = 100
  }

  update {
    stagger      = "30s"
    max_parallel = 1
  }

  group "work" {
    # My cluster node names are cluster-0, cluster-1, etc.
    # I run this job on a single node at a time, and "pin" it to the same node using the below constraint
    # This way, once certificates are generated for my domains, they'll be reused until they expire since
    # the job always runs on the same host where they were generated.
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      operator  = "set_contains_any"
      # CHANGE THIS TO THE NAME OF THE NODE THAT WILL RUN NGINX
      value = "cluster-0"
    }

    # Claim our host volumes, so they can be mounted inside our certbot/nginx jobs
    volume "certbot" {
      type      = "host"
      read_only = false
      source    = "certbot"
    }

    volume "letsencrypt" {
      type      = "host"
      read_only = false
      source    = "letsencrypt"
    }

    network {
      port "http" {
        static = 80
      }
      port "https" {
        static = 443
      }
    }

    # This is the task that fetches certificates for all of our domains and places them on our host volume
    # Doing so makes the certificates available in our `nginx` job
    # Note: This task's lifecycle is "prestart", meaning it must complete before the `nginx` task starts
    task "certbot-all-domains" {
      driver = "docker"
      user   = "root"

      config {
        image = "inetsoftware/certbot-dns-hetzner"
        # We use a custom entrypoint so we can script this task's behavior
        entrypoint = ["${NOMAD_TASK_DIR}/run.sh"]
      }

      template {
        data = <<EOF
#!/bin/sh
# Note, we're not dealing with bash here. We're dealing with 'ash', as this docker image is BusyBox linux
# Hence, we can't use normal bash arrays. Separate each domain by SPACES, NOT COMMAS
DOMAINS="example.com foo.com bar.com"
# Set this to an empty string "" for production mode
STAGING="--staging"
DNS_PROPAGATION_SECONDS=30
set -- $DOMAINS
while [ -n "$1" ]; do
  domain=$1
  echo "Domain: $domain"
  certbot certonly -v $STAGING --cert-name ${domain} -d ${domain} -d *.${domain} --agree-tos -m youremail@example.com --keep-until-expiring --authenticator dns-hetzner --dns-hetzner-credentials /alloc/etc/letsencrypt/hetzner-certbot-config.ini --dns-hetzner-propagation-seconds=${DNS_PROPAGATION_SECONDS}
  echo "Done with: $domain"
  shift
done
EOF
        destination = "${NOMAD_TASK_DIR}/run.sh"
        perms       = "755"
      }

      # Make our hetzner DNS token available to the certbot job
      template {
        data = <<EOF
dns_hetzner_api_token = ${var.hetzner_dns_access_token}
EOF
        destination   = "${NOMAD_ALLOC_DIR}/etc/letsencrypt/hetzner-certbot-config.ini"
        change_mode   = "signal"
        change_signal = "SIGHUP"
      }

      resources {
        cpu    = 50
        memory = 100
      }

      volume_mount {
        volume      = "certbot"
        destination = "/etc/letsencrypt" #<-- in the container
        read_only   = false
      }

      volume_mount {
        volume      = "letsencrypt"
        destination = "/var/www/certbot" #<-- in the container
        read_only   = false
      }

      lifecycle {
        hook    = "prestart"
        sidecar = false
      }
    }

    task "nginx" {
      driver = "docker"

      template {
        data = <<EOF
# You'll need to manually create `server` blocks for every domain that you host
# Here we provide example server blocks for a single domain
server {
  listen 80;
  listen [::]:80;
  server_tokens off;
  gzip on;
  gzip_vary on;
  gzip_min_length 1024;
  gzip_proxied expired no-cache no-store private auth;
  gzip_types text/plain text/css text/xml text/javascript application/x-javascript application/xml;
  gzip_disable "MSIE [1-6]\.";
  server_name example.com www.example.com;
  # Force TLS
  location / {
    return 301 https://example.com$request_uri;
  }
}

server {
  listen 443 ssl http2;
  listen [::]:443 ssl http2;
  server_tokens off;
  gzip on;
  gzip_vary on;
  gzip_min_length 1024;
  gzip_proxied expired no-cache no-store private auth;
  gzip_types text/plain text/css text/xml text/javascript application/x-javascript application/xml;
  gzip_disable "MSIE [1-6]\.";
  server_name example.com;
  ssl_certificate /etc/nginx/ssl/live/example.com/fullchain.pem;
  ssl_certificate_key /etc/nginx/ssl/live/example.com/privkey.pem;
  location / {
    # I run my application servers on nomad as well and use nomad service discovery to find them
    # This is provided only for example purpose. If you don't have a nomad job named "example.com-application"
    # this will not work
    {{ range nomadService "example.com-application" }}
    proxy_pass http://{{ .Address }}:{{ .Port }};
    {{ end }}
  }
}
EOF
        destination   = "local/conf/load-balancer.conf"
        change_mode   = "signal"
        change_signal = "SIGHUP"
      }

      config {
        image = "nginx"
        ports = ["http", "https"]
        volumes = [
          # Make our load-balancer.conf available in nginx's config directory
          "local/conf:/etc/nginx/conf.d",
        ]
        privileged = true
      }

      resources {
        cpu    = 250
        memory = 250
      }

      # We're mounting the same host volume claims that the certbot task mounted
      volume_mount {
        volume      = "certbot"
        destination = "/etc/nginx/ssl/" #<-- in the container
        read_only   = false
      }

      volume_mount {
        volume      = "letsencrypt"
        destination = "/var/www/certbot" #<-- in the container
        read_only   = false
      }

      service {
        port = "http"
        # I use nomad for service discovery, but many nomad users prefer "consul" for this
        provider = "nomad"

        meta {
          meta = "General purpose load balancer"
        }

        # Take this check with a grain of salt and tune yours for yourself
        check {
          type     = "tcp"
          port     = "http"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}
While you’ll obviously want to change all the example.com values, there are three other things you’ll want to explicitly set:
- You’ll need to change the following line to be the name of one of your Nomad client nodes: value = "cluster-0"
- When you’re ready for production, you’ll want to change STAGING="--staging" to STAGING="" in run.sh.
- And if it’s not obvious, you’ll also want to change the DOMAINS variable in run.sh, DOMAINS="example.com foo.com bar.com", to your domain(s).
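If the way run.sh walks through DOMAINS (set -- followed by shift in a loop) is unfamiliar, here is a standalone sketch you can run in any POSIX shell before editing the real script. It only prints the -d arguments that would be handed to certbot for each domain and calls nothing else. The function name is mine.

```shell
#!/bin/sh
# Print the certbot -d arguments for each space-separated domain.
print_domain_args() {
  # Unquoted on purpose: word splitting turns the string into $1, $2, ...
  set -- $1
  while [ -n "$1" ]; do
    domain=$1
    # Quoted so the shell never tries to expand the wildcard as a glob
    printf '%s\n' "-d ${domain} -d *.${domain}"
    shift
  done
}

print_domain_args "example.com foo.com"
```

Each domain produces one line with the bare name and its wildcard, which mirrors the -d pair in the certbot command above. A comma-separated list would be treated as a single, invalid domain.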
Deploy the job⌗
nomad run -var="hetzner_dns_access_token=<YOUR TOKEN HERE>" job.nomad.hcl
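Passing the token with -var leaves it in your shell history. Nomad also reads values for job variables from environment variables named NOMAD_VAR_ followed by the variable name, so an alternative sketch is to load the token from a file first. The helper name and the file path are placeholders of mine.

```shell
#!/bin/sh
# Hypothetical helper: export the token from a file as the environment
# variable Nomad maps onto the job's "hetzner_dns_access_token" variable.
load_token() {
  token_file=$1
  if [ ! -r "$token_file" ]; then
    echo "cannot read $token_file" >&2
    return 1
  fi
  # Drop any trailing newline or stray whitespace
  NOMAD_VAR_hetzner_dns_access_token=$(tr -d '[:space:]' < "$token_file")
  export NOMAD_VAR_hetzner_dns_access_token
}

# Usage (assumes the nomad CLI is already configured for your cluster):
#   load_token "$HOME/.config/hetzner/dns-token" && nomad run job.nomad.hcl
```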
Bonus for Terraform users⌗
If you use the Terraform Nomad provider, which I recommend you do, you can drop this into your terraform config to deploy the above job with Terraform:
resource "nomad_job" "nginx" {
  jobspec = file("job.nomad.hcl")

  hcl2 {
    enabled = true
    vars = {
      # For Certbot to perform dns-01 challenges
      hetzner_dns_access_token = var.hetzner_dns_access_token
    }
  }
}
Summary⌗
What we’ve done here is create a single Nomad job that:
- Has a prestart lifecycle task that ensures all certificates are available to the nginx job, using Nomad host volumes.
- Uses a Docker certbot image with the Hetzner plugin, which performs a DNS-01 challenge using Hetzner’s DNS API.
- Starts Nginx with Nomad host volumes mounted, containing the certificates retrieved using certbot.
Note that if you run this job on multiple hosts, or auto-scale this job across multiple hosts, you should be aware of Let’s Encrypt rate limits. If you end up requesting too many certificates for the same domain, Let’s Encrypt can refuse to issue more for that domain for long periods of time, so be careful.
Also note that the provided Nomad job does not automate the creation of DNS records pointing at the Nomad client running the nginx job. It only creates DNS records for the DNS-01 challenge. This is why you’ll notice that the job is pinned to a specific client node. Doing so also allows the same certificates to be reused until they expire.
You will need to create your own DNS records pointing at the “pinned” node for every domain listed in the DOMAINS variable.
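After the first deploy, a quick way to confirm the prestart task did its job is to look for the certificate files on the pinned node. A minimal sketch, assuming the /etc/certbot host volume path from the client config above and the live/ directory layout that the nginx config reads from. The function name is mine. Run it as root, since certbot typically restricts access to the live directory.

```shell
#!/bin/sh
# Hypothetical helper: report whether fullchain.pem and privkey.pem exist
# for each domain under a certbot config directory.
check_certs() {
  cert_root=$1
  shift
  missing=0
  for domain in "$@"; do
    for file in fullchain.pem privkey.pem; do
      path="${cert_root}/live/${domain}/${file}"
      if [ -e "$path" ]; then
        echo "found:   $path"
      else
        echo "missing: $path"
        missing=1
      fi
    done
  done
  return "$missing"
}

# On the pinned node:
#   check_certs /etc/certbot example.com foo.com bar.com
```

The function returns non-zero if any file is missing, so it can also serve as a simple post-deploy check in a script.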
Good luck, and feel free to reach out to me via email.