Traefik not issuing certs in HA cluster

Traefik eventually started working on its own. It might have been a lingering lock that needed to expire, or something similar.

I added the following to the Traefik Consul policy:

key_prefix "traefik/" {
  policy = "write"
}
session "" {
  policy = "write"
}
session_prefix "" {
  policy = "write"
}

I’ll be honest that I’m not sure which part fixed the ACL issue. Will update this thread if I get time to do more debugging.
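
One way to narrow down which rule matters is to replay, using the Traefik token, the two calls a Consul-backed lock relies on: creating a session and acquiring a key with it. The sketch below does that through Consul's HTTP API. The agent address and the key name traefik/acl-check are assumptions for illustration, not the key Traefik itself locks, and the script is a rough probe rather than a reproduction of Traefik's election code.

```python
import json
import urllib.error
import urllib.request

CONSUL_ADDR = "http://127.0.0.1:8500"  # assumption: local agent on the default HTTP port
TEST_KEY = "traefik/acl-check"          # hypothetical probe key under the traefik/ prefix


def consul_put(path, token, body=b""):
    """PUT to the Consul HTTP API; return (status, parsed JSON or raw text)."""
    req = urllib.request.Request(
        CONSUL_ADDR + path,
        data=body,
        method="PUT",
        headers={"X-Consul-Token": token},
    )
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            status, raw = resp.status, resp.read().decode()
    except urllib.error.HTTPError as err:
        # ACL failures arrive here as 403 with a plain-text body.
        status, raw = err.code, err.read().decode()
    try:
        return status, json.loads(raw)
    except ValueError:
        return status, raw


def check_lock_permissions(token, put=consul_put):
    """Create a session, try to lock TEST_KEY with it, then destroy the session."""
    # Behavior=delete removes the probe key when the session is destroyed.
    session_spec = {"Name": "acl-check", "TTL": "30s", "Behavior": "delete"}
    status, session = put("/v1/session/create", token, json.dumps(session_spec).encode())
    if status != 200:
        return f"session create failed: {status} {session}"  # points at the session rules
    sid = session["ID"]
    try:
        status, acquired = put(f"/v1/kv/{TEST_KEY}?acquire={sid}", token, b"probe")
        if status != 200:
            return f"lock acquire failed: {status} {acquired}"  # points at the key rules
        return "ok" if acquired is True else "lock held by another session"
    finally:
        put(f"/v1/session/destroy/{sid}", token)
```

Calling check_lock_permissions() with the Traefik token while removing one rule at a time from the policy should show which rule the 403 depends on: a failure at session creation implicates the session rules, and a failure at the acquire step implicates key_prefix.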

I recently migrated Traefik from a single-host deployment to an HA cluster backed by Consul.

I initially got the permissions wrong, and Traefik was unable to claim the lock session needed for cluster leader election. The logs were full of:

Node <uuid> elected worker
Leadership elector error Unexpected response code: 403 (rpc error making call: Permission denied)

I fixed the permissions, and it seems to have gone through a successful election: 3 workers, one leader.

Now, after submitting a new ingress object, any node that handles the request, leader or worker, produces the following logs, and never issues a cert:

level=debug msg="Looking for an existing ACME challenge for mydomain.com..."
level=debug msg="No provided certificate found for domains mydomain.com, get ACME certificate."

I’ve cleared the Consul lock manually to force a new election. I’ve rebooted nodes to force them to rejoin the cluster. I’ve verified that mydomain.com points to the cluster and is being served by Traefik. No luck. There are no error logs from ACME about rate limiting or anything else; it looks like it just never tried to issue the cert.

Traefik version: 1.7.12
Consul version: 1.5