Breaking down monoliths: blue-green microservice deployments with Traefik

When downtime is not acceptable, the conversation turns to zero-downtime deployment strategies such as blue-green deployment, sometimes also referred to as A/B deployment.

The main idea behind blue-green deployment is that you have some kind of load balancer, with a live system behind it that we refer to as “green” and a stand-by system called “blue”. We deploy new software to the blue system. When the deployment is finished, we switch the load balancer so that traffic hits the newly configured system: what was “blue” becomes “green”, and the old live system takes over the stand-by “blue” role.

Articles on the internet describe several variations of this setup. The minimal solution is a load balancer and one or more nodes of the service you are trying to redeploy without downtime.

Traefik

At GPMD we also use this solution. In our case the load balancer is a Traefik container and the nodes are other Docker containers. Everything runs on a single AWS t3.micro instance, or a bigger one, depending on the resources needed.

Our goal was to minimise resource and configuration needs for really small services without compromising – actually improving – availability and separation of services.

Traefik acts as a reverse proxy, providing access to the desired ports of the running Docker containers: HTTP, HTTPS, SSH and custom TCP ports.

Traefik “understands” Docker, so no extra steps are necessary for launching a new container into a load balancer group. There are other options for defining services, such as file-based configuration, Kubernetes, Rancher or Marathon.

Our solution is simple: while the new node is starting, the load balancer keeps sending traffic to the old node. Once the new node is ready, traffic is balanced across both nodes for a short time, and then we stop the old node gracefully.
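As a rough illustration of how “ready” can be signalled, here is a minimal Docker Compose sketch rather than our exact setup. It assumes the image ships curl and exposes a hypothetical /health endpoint; the service and image names are placeholders as well. As far as we know, Traefik's Docker provider ignores containers whose Docker health status is not yet healthy, so the new node should only start receiving traffic once the check passes.

```yaml
# docker-compose.yml (sketch): service name, image and endpoint are hypothetical
services:
  website-new:
    image: "example/website:2.0"
    healthcheck:
      # Assumes curl is installed in the image and /health answers 200 when ready
      test: ["CMD", "curl", "-f", "http://localhost/health"]
      interval: 5s
      timeout: 2s
      retries: 3
    labels:
      # Routing labels omitted here; only the readiness part is shown
      traefik.enable: "true"
```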

Breaking monoliths down to microservices

As a very welcome – definitely not accidental – side-effect, the implemented solution also allowed us to break down monolithic applications into microservices. This is possible because Traefik can route domains and endpoint path prefixes to separate backend services.

As an example, suppose we want to serve the https://foo.gpmd.co.uk/ and https://foo.gpmd.co.uk/api endpoints from separate microservices. We can fire up two separate Docker containers with labels indicating our needs, and Traefik will create the necessary internal router-middleware-backend structure on the fly.

We also terminate TLS in Traefik, with plain HTTP endpoints running behind it, and Traefik takes care of the automated installation and renewal of Let’s Encrypt certificates.
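For Traefik v2, certificate automation lives in the static configuration. The fragment below is a minimal sketch, not our production file: the resolver name letsencrypt, the email address and the storage path are assumptions, while the web-secure entry point name matches the labels used in this article.

```yaml
# traefik.yml (static configuration, Traefik v2), a sketch under the assumptions above
entryPoints:
  web:
    address: ":80"        # plain HTTP, also used for the ACME HTTP challenge
  web-secure:
    address: ":443"       # TLS is terminated here

providers:
  docker:
    exposedByDefault: false   # containers opt in with traefik.enable: "true"

certificatesResolvers:
  letsencrypt:                # hypothetical resolver name
    acme:
      email: "ops@example.com"            # placeholder address
      storage: "/letsencrypt/acme.json"   # placeholder path, must be persistent
      httpChallenge:
        entryPoint: web
```

A router would then opt in to this resolver with a label such as traefik.http.routers.website.tls.certresolver: "letsencrypt".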

Using Traefik v1.7, our Docker container labels would look like this:

# main website (labels on the website container)
traefik.frontend.rule:        "Host: foo.gpmd.co.uk;PathPrefixStrip:/"
traefik.frontend.entryPoints: "https"
traefik.port:                 "80"
# api endpoint (labels on the api container)
traefik.frontend.rule:        "Host: foo.gpmd.co.uk;PathPrefixStrip:/api"
traefik.frontend.entryPoints: "https"
traefik.port:                 "80"

Traefik v2 has a different structure:

# main website (labels on the website container)
traefik.enable:                           "true"
traefik.http.routers.website.entrypoints: "web-secure"
traefik.http.routers.website.tls:         "true"
traefik.http.routers.website.rule:        "Host(`foo.gpmd.co.uk`)"
traefik.http.routers.website.service:     "website"
# api endpoint (labels on the api container)
traefik.enable:                       "true"
traefik.http.routers.api.entrypoints: "web-secure"
traefik.http.routers.api.tls:         "true"
traefik.http.routers.api.rule:        "Host(`foo.gpmd.co.uk`) && PathPrefix(`/api`)"
traefik.http.routers.api.service:     "api"
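The v1 labels set the backend port with traefik.port. In v2 the equivalent setting sits on the service that each router points to. A minimal sketch, assuming both containers still listen on port 80 as in the v1 example:

```yaml
# main website container: defines the "website" service referenced by the router
traefik.http.services.website.loadbalancer.server.port: "80"
# api container: defines the "api" service referenced by the router
traefik.http.services.api.loadbalancer.server.port: "80"
```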

Let’s make it a bit more complicated for the /api endpoint. We have to make sure HTTP is redirected to HTTPS, so we introduce an https-only middleware configured with redirectscheme.scheme. We also want to remove the path prefix, so we add a strip-prefix middleware configured with stripprefix.prefixes. When we need more than one middleware, we can chain them together:

traefik.http.routers.api.middlewares: "chained"
traefik.http.middlewares.chained.chain.middlewares: "https-only,strip-prefix"
traefik.http.middlewares.https-only.redirectscheme.scheme: "https"
traefik.http.middlewares.strip-prefix.stripprefix.prefixes: "/api"
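One detail worth noting: a redirect middleware only sees plain HTTP requests if some router listens on a plain HTTP entry point. The labels below are a sketch of such an extra router on the api container. The router name api-insecure and the entry point name web are assumptions and need to match your own static configuration.

```yaml
# Hypothetical additional router bound to a plain-HTTP entry point named "web"
traefik.http.routers.api-insecure.entrypoints: "web"
traefik.http.routers.api-insecure.rule:        "Host(`foo.gpmd.co.uk`) && PathPrefix(`/api`)"
# Only the redirect is needed here; the request never reaches the service over HTTP
traefik.http.routers.api-insecure.middlewares: "https-only"
traefik.http.routers.api-insecure.service:     "api"
```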

If we already had the main website Docker container running and serving the full content of the website, including /api, then as soon as we launch the second Docker container, Traefik stops sending /api requests to the monolithic application and routes them to the microservice. This works because, by default, Traefik v2 gives the longer, more specific rule the higher priority.

Simple blue-green

The setup allows us to run two or more website and api Docker containers and load balance between them. When launching a new version, as mentioned before, we start the new containers and then stop the old ones. We could do this with any load balancing solution, but not with such a small footprint. When we also need to provide SSH access to the container, using the same port in our case, Traefik 2’s TCP support comes in handy.
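For the SSH case, a TCP router can be declared with labels in much the same way as the HTTP routers. This is a sketch under assumptions: it presumes a TCP entry point named ssh exists in the static configuration and that the container's SSH daemon listens on port 22. The router and service names are illustrative.

```yaml
# Assumed static configuration (not shown in this article):
#   entryPoints:
#     ssh:
#       address: ":2222"
traefik.tcp.routers.ssh.entrypoints: "ssh"
# Non-TLS TCP traffic carries no SNI, so the catch-all rule is required
traefik.tcp.routers.ssh.rule:        "HostSNI(`*`)"
traefik.tcp.routers.ssh.service:     "ssh"
traefik.tcp.services.ssh.loadbalancer.server.port: "22"
```

Because plain TCP cannot be routed by host name, each such entry point can serve only one SSH backend group.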

Conclusion

Introducing Traefik for blue-green deployments has many benefits, at the cost of a slight speed decrease compared to a similar nginx setup. We gain a way to introduce new endpoint microservices without affecting the live site and without writing or using complex configuration management solutions.

Where to go next?

We did not stop here. Our CI toolchains needed an automated solution, so we created a small tool that automates this process: it launches new Docker containers with the right configuration and stops the running service once the new container is ready.