Building an Elasticsearch cluster using Docker Compose and Traefik

Marco Franssen

In a previous blog I have already written about setting up Elasticsearch in docker-compose.yml. I have also shown you before how to set up Traefik 1.7 in docker-compose.yml. Today I want to show you how we can use Traefik to expose a load-balanced endpoint on top of an Elasticsearch cluster.


We will set up our cluster using docker-compose so we can easily run and clean up this cluster on our laptop.

Create an Elasticsearch cluster

Let's first create a two-node Elasticsearch cluster using the following docker-compose setup.

docker-compose.yml
version: "3.7"
 
services:
  es01:
    image: "docker.elastic.co/elasticsearch/elasticsearch-oss:7.7.1"
    ports:
      - "9200:9200"
      - "9300:9300"
    environment:
      node.name: es01
      discovery.seed_hosts: es02
      cluster.initial_master_nodes: es01,es02
      cluster.name: traefik-tutorial-cluster
      bootstrap.memory_lock: "true"
      ES_JAVA_OPTS: -Xms256m -Xmx256m
    volumes:
      - "es-data-es01:/usr/share/elasticsearch/data"
    ulimits:
      memlock:
        soft: -1
        hard: -1
 
  es02:
    image: "docker.elastic.co/elasticsearch/elasticsearch-oss:7.7.1"
    ports:
      - "9201:9200"
      - "9301:9300"
    environment:
      node.name: es02
      discovery.seed_hosts: es01
      cluster.initial_master_nodes: es01,es02
      cluster.name: traefik-tutorial-cluster
      bootstrap.memory_lock: "true"
      ES_JAVA_OPTS: -Xms256m -Xmx256m
    volumes:
      - "es-data-es02:/usr/share/elasticsearch/data"
    ulimits:
      memlock:
        soft: -1
        hard: -1
 
volumes:
  es-data-es01:
  es-data-es02:

When we run this docker-compose setup, you will be able to reach the first node at http://localhost:9200 and the second node at http://localhost:9201. However, for every node we want to add to this cluster, we would have to expose yet another port from our Docker environment to be able to connect to that node directly.
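
To verify that both nodes joined the cluster, you can query the cluster health API on one of the nodes; for this fresh two-node cluster it should report "number_of_nodes" : 2 and a green status.

curl http://localhost:9200/_cluster/health?pretty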

A cleaner solution would be to expose just a single port on our host and have every request on that port load balanced across all the nodes in our cluster.

Add Traefik as Loadbalancer

Traefik has different configuration providers. One of them is the Docker provider, which allows you to configure Traefik via Docker labels. Let us first add the Traefik container.

docker-compose.yml
version: "3.7"
 
services:
  gateway:
    image: traefik:v2.2
    command:
      - --api.insecure=true
      - --providers.docker=true
      - --providers.docker.exposedByDefault=false
    ports:
      - "80:80"
      - "8080:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro

In order for Traefik to be able to read the Docker labels, we need to mount docker.sock as a volume. We also enable the Traefik Docker provider and configure it to only include containers that are explicitly enabled using a Docker label. Last but not least we enable the API, so we can also have a look at the Traefik dashboard.

When we now run docker-compose up -d again, you will be able to navigate to the Traefik dashboard at http://localhost:8080. Here you can see an overview of the routers, services and middlewares for HTTP, TCP and UDP. At the moment none are configured, as we haven't specified the labels on our Elasticsearch containers just yet.

Now let's define the labels on the Elasticsearch containers. For brevity I left out the other properties of these containers in the example below.

docker-compose.yml
es01:
  labels:
    - "traefik.enable=true"
    - "traefik.http.routers.elasticsearch.entrypoints=http"
    - "traefik.http.routers.elasticsearch.rule=Host(`localhost`) && PathPrefix(`/es`) || Host(`elasticsearch`)"
    - "traefik.http.routers.elasticsearch.middlewares=es-stripprefix"
    - "traefik.http.middlewares.es-stripprefix.stripprefix.prefixes=/es"
    - "traefik.http.services.elasticsearch.loadbalancer.server.port=9200"
es02:
  labels:
    - "traefik.enable=true"
    - "traefik.http.routers.elasticsearch.entrypoints=http"
    - "traefik.http.routers.elasticsearch.rule=Host(`localhost`) && PathPrefix(`/es`) || Host(`elasticsearch`)"
    - "traefik.http.routers.elasticsearch.middlewares=es-stripprefix"
    - "traefik.http.middlewares.es-stripprefix.stripprefix.prefixes=/es"
    - "traefik.http.services.elasticsearch.loadbalancer.server.port=9200"

With the labels on these two containers we do the following:

  • Enable the container with Traefik.
  • Listen on the default http (:80) entrypoint.
  • Add a rule that directs all traffic for http://localhost/es to one of the Elasticsearch nodes.
  • Register a middleware that strips the /es prefix before forwarding the request.
  • Explicitly tell Traefik to connect on port 9200 of the Elasticsearch containers (required because Elasticsearch exposes both port 9200 and 9300).

Because both containers use the same router and service name (elasticsearch), Traefik groups them into a single service that load balances across both nodes.

Now when we run docker-compose up -d again, we will see the Elasticsearch containers being recreated. When navigating to the Traefik dashboard, you will now see that a router, a service and a middleware have been configured. With all of this in place you can access Elasticsearch at http://localhost/es. Refresh your browser a couple of times and notice you are being load balanced across the two Elasticsearch nodes.
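
You can check this from the command line too; the Elasticsearch root endpoint returns the name of the node that served the request, and with Traefik's default round-robin strategy repeated calls should alternate between es01 and es02.

curl http://localhost/es
curl http://localhost/es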

If you update your hosts file with the following entry, you can also access the Elasticsearch cluster at http://elasticsearch, which matches the other host we defined in the Traefik routing rule.

/etc/hosts
127.0.0.1 localhost elasticsearch
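
With this entry in place you can, for example, list the cluster's nodes through Traefik using the _cat API:

curl "http://elasticsearch/_cat/nodes?v"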

Since all traffic now flows through Traefik, you can also remove the port mappings from docker-compose.yml. Go ahead and remove the following ports sections from both containers; the nodes can still reach each other over the internal Docker network.

docker-compose.yml
es01:
  ports:
    - 9200:9200
    - 9300:9300
es02:
  ports:
    - 9201:9200
    - 9301:9300

Cerebro as your Elasticsearch admin interface

Last but not least I want to show you Cerebro, a nice little admin tool for working with your Elasticsearch cluster. In the following docker-compose configuration we expose Cerebro at http://localhost/admin.

docker-compose.yml
cerebro:
  image: lmenezes/cerebro:0.8.5
  volumes:
    - "./conf/cerebro/application.conf:/opt/cerebro/conf/application.conf"
  depends_on:
    - gateway
  links:
    - "gateway:elasticsearch"
  labels:
    - "traefik.enable=true"
    - "traefik.http.routers.admin.entrypoints=http"
    - "traefik.http.routers.admin.rule=Host(`localhost`) && PathPrefix(`/admin`)"
    - "traefik.http.services.cerebro.loadbalancer.server.port=9000"

Here too we enable the container in Traefik and add a rule that listens at http://localhost/admin. We also include a link that defines a network alias elasticsearch for our gateway container. Remember the rule we defined earlier that listens for http://elasticsearch? We will now utilize it in the Cerebro configuration which we mount into the container.

conf/cerebro/application.conf
# Secret will be used to sign session cookies, CSRF tokens and for other encryption utilities.
# It is highly recommended to change this value before running cerebro in production.
secret = "ki:s:[[@=Ag?QI`W2jMwkY:eqvrJ]JqoJyi2axj3ZvOv^/KavOT4ViJSv?6YY4[N"

# Application base path
basePath = "/admin/"

# Defaults to RUNNING_PID at the root directory of the app.
# To avoid creating a PID file set this value to /dev/null
#pidfile.path = "/var/run/cerebro.pid"
pidfile.path=/dev/null

# Rest request history max size per user
rest.history.size = 50 // defaults to 50 if not specified

# Path of local database file
#data.path: "/var/lib/cerebro/cerebro.db"
data.path = "./cerebro.db"

play {
  # Cerebro port, by default it's 9000 (play's default)
  server.http.port = ${?CEREBRO_PORT}
}

es = {
  gzip = true
}

# Authentication
auth = {
  # either basic or ldap
  type: ${?AUTH_TYPE}
  settings {
    # LDAP
    url = ${?LDAP_URL}
    # OpenLDAP might be something like "ou=People,dc=domain,dc=com"
    base-dn = ${?LDAP_BASE_DN}
    # Usually method should be "simple" otherwise, set it to the SASL mechanisms to try
    method = ${?LDAP_METHOD}
    # user-template executes a string.format() operation where
    # username is passed in first, followed by base-dn. Some examples
    #  - %s => leave user untouched
    #  - %[email protected] => append "@domain.com" to username
    #  - uid=%s,%s => usual case of OpenLDAP
    user-template = ${?LDAP_USER_TEMPLATE}
    // User identifier that can perform searches
    bind-dn = ${?LDAP_BIND_DN}
    bind-pw = ${?LDAP_BIND_PWD}
    group-search {
      // If left unset parent's base-dn will be used
      base-dn = ${?LDAP_GROUP_BASE_DN}
      // Attribute that represent the user, for example uid or mail
      user-attr = ${?LDAP_USER_ATTR}
      // Define a separate template for user-attr
      // If left unset parent's user-template will be used
      user-attr-template = ${?LDAP_USER_ATTR_TEMPLATE}
      // Filter that tests membership of the group. If this property is empty then there is no group membership check
      // AD example => memberOf=CN=mygroup,ou=ouofthegroup,DC=domain,DC=com
      // OpenLDAP example => CN=mygroup
      group = ${?LDAP_GROUP}
    }

    # Basic auth
    username = ${?BASIC_AUTH_USER}
    password = ${?BASIC_AUTH_PWD}
  }
}

# A list of known hosts
hosts = [
  {
   host = "http://elasticsearch"
   name = "traefik-tutorial-cluster"
   headers-whitelist = [ "x-proxy-user", "x-proxy-roles", "X-Forwarded-For" ]
  }
  # Example of host with authentication
  # {
  #  host = "http://some-authenticated-host:9200"
  #  name = "Secured Cluster"
  #  auth = {
  #    username = "username"
  #    password = "secret-password"
  #  }
  # }
]

The two important settings for Cerebro to work properly with our Traefik setup are basePath, configured as /admin/ because we serve Cerebro at http://localhost/admin, and the host http://elasticsearch, which matches the Traefik routing rule we defined and resolves via the network alias we added for the gateway container.

Now you can run docker-compose up -d again. Give Cerebro a try at http://localhost/admin.

Summary

Below you can find the entire docker-compose.yml that was covered in this blog.

docker-compose.yml
version: "3.7"

services:
  gateway:
    image: traefik:v2.2
    command:
      - --api.insecure=true
      - --providers.docker=true
      - --providers.docker.exposedByDefault=false
    ports:
      - "80:80"
      - "8080:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro

  es01:
    image: "docker.elastic.co/elasticsearch/elasticsearch-oss:7.7.1"
    environment:
      node.name: es01
      discovery.seed_hosts: es02
      cluster.initial_master_nodes: es01,es02
      cluster.name: traefik-tutorial-cluster
      bootstrap.memory_lock: "true"
      ES_JAVA_OPTS: -Xms256m -Xmx256m
    volumes:
      - "es-data-es01:/usr/share/elasticsearch/data"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.elasticsearch.entrypoints=http"
      - "traefik.http.routers.elasticsearch.rule=Host(`localhost`) && PathPrefix(`/es`) || Host(`elasticsearch`)"
      - "traefik.http.routers.elasticsearch.middlewares=es-stripprefix"
      - "traefik.http.middlewares.es-stripprefix.stripprefix.prefixes=/es"
      - "traefik.http.services.elasticsearch.loadbalancer.server.port=9200"

  es02:
    image: "docker.elastic.co/elasticsearch/elasticsearch-oss:7.7.1"
    environment:
      node.name: es02
      discovery.seed_hosts: es01
      cluster.initial_master_nodes: es01,es02
      cluster.name: traefik-tutorial-cluster
      bootstrap.memory_lock: "true"
      ES_JAVA_OPTS: -Xms256m -Xmx256m
    volumes:
      - "es-data-es02:/usr/share/elasticsearch/data"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.elasticsearch.entrypoints=http"
      - "traefik.http.routers.elasticsearch.rule=Host(`localhost`) && PathPrefix(`/es`) || Host(`elasticsearch`)"
      - "traefik.http.routers.elasticsearch.middlewares=es-stripprefix"
      - "traefik.http.middlewares.es-stripprefix.stripprefix.prefixes=/es"
      - "traefik.http.services.elasticsearch.loadbalancer.server.port=9200"

  cerebro:
    image: lmenezes/cerebro:0.8.5
    volumes:
      - "./conf/cerebro/application.conf:/opt/cerebro/conf/application.conf"
    depends_on:
      - gateway
    links:
      - "gateway:elasticsearch"
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.admin.entrypoints=http"
      - "traefik.http.routers.admin.rule=Host(`localhost`) && PathPrefix(`/admin`)"
      - "traefik.http.services.cerebro.loadbalancer.server.port=9000"

volumes:
  es-data-es01:
  es-data-es02:
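
With this file saved, you can bring up the whole stack and try out the endpoints we configured; the commands below are just a quick smoke test.

docker-compose up -d
curl http://localhost/es       # Elasticsearch, load balanced over es01 and es02
curl http://localhost/admin    # Cerebro (or open it in your browser)
# Traefik dashboard: http://localhost:8080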

Homework

Now, last but not least, you can add Kibana yourself as an exercise. Try to expose Kibana at http://localhost by defining a Traefik rule for it, as sketched below. The official Kibana documentation will help you get started.
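
To get you going, here is a minimal sketch of what such a service could look like. The image tag and environment variable are assumptions based on the Elasticsearch version used above, so double-check them against the Kibana documentation.

docker-compose.yml
kibana:
  image: "docker.elastic.co/kibana/kibana-oss:7.7.1"
  environment:
    # Point Kibana at one of the cluster nodes on the internal Docker network
    ELASTICSEARCH_HOSTS: http://es01:9200
  labels:
    - "traefik.enable=true"
    - "traefik.http.routers.kibana.entrypoints=http"
    # Traefik v2 gives longer rules higher priority, so /es and /admin keep working
    - "traefik.http.routers.kibana.rule=Host(`localhost`)"
    - "traefik.http.services.kibana.loadbalancer.server.port=5601"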

I hope you enjoyed this blog. As always, please share it with your friends and colleagues and leave me some feedback in the comments below.

