Using Apache Mesos and Docker in production
Introduction
At Revisely, to deploy our infrastructure and scale it in a fast and reliable way, we chose to use Apache Mesos and Docker in our systems. Docker is an incredible tool for deploying components quickly and reliably, allowing us to package images and reuse them on different hosts depending on demand.
Apache Mesos and the Marathon API are the matching pieces for orchestrating all these Docker components. They allow us to declare applications based on Docker containers and to express the configuration and relationships between them. Apache Mesos provides the infrastructure for container deployment and communication, and the Marathon API provides a useful tool for interacting with the system.
First step: install Mesos and Marathon
We currently rely on DigitalOcean for our hosting, so we followed their guide to set up the main infrastructure: https://www.digitalocean.com/community/tutorials/how-to-configure-a-production-ready-mesosphere-cluster-on-ubuntu-14-04
By default, Mesos uses its own containerizer to deploy apps rather than Docker containers. To enable Docker, the Docker packages need to be installed on each slave, and the Docker containerizer needs to be properly configured there: https://mesosphere.com/docs/tutorials/launch-docker-container-on-mesosphere/
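A minimal sketch of that slave configuration, following the Mesosphere tutorial linked above (file names follow the convention that each file under /etc/mesos-slave/ maps to a mesos-slave flag; the Docker package name is just one option):

# Install Docker on the slave, e.g. from the Ubuntu archive
sudo apt-get install -y docker.io
# Offer the Docker containerizer in addition to the native Mesos one
echo 'docker,mesos' | sudo tee /etc/mesos-slave/containerizers
# Allow enough time to pull Docker images before executor registration times out
echo '5mins' | sudo tee /etc/mesos-slave/executor_registration_timeout
# Restart the slave so it picks up the new settings
sudo service mesos-slave restart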
We needed to customize these settings a bit to match our needs. One of our requirements was to be able to use the same ports in the Docker containers as on the host, so we could map them in a 1:1 relationship. This involves updating the default Marathon and Mesos configuration to allow the ports we need.
On the Mesos master node we created a local_port_min file in /etc/marathon/conf/, setting the initial port to 7000, which was the first port we needed.
The slaves also need to offer the right ports. This involves creating a resources file in /etc/mesos-slave/ on each slave we are going to use. The file looks like this:

ports:[7000-10000,31000-32000]
This way the slave can offer the port ranges specified. We can assign different port ranges to different slaves depending on our needs, and Mesos will take care of matching each app with the slaves that best fit its requirements.
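Put together, the port-related changes on the master and the slaves look roughly like this:

# On the Mesos master: make Marathon assign local/service ports starting at 7000
echo '7000' | sudo tee /etc/marathon/conf/local_port_min
sudo service marathon restart
# On each slave: advertise the port ranges that tasks may use on this host
echo 'ports:[7000-10000,31000-32000]' | sudo tee /etc/mesos-slave/resources
sudo service mesos-slave restart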
Second step: create and deploy Docker images – simple approach
To deploy our components we first needed to create custom images, based on Dockerfiles, that describe how each component is deployed. We have two different components at the moment: memcached and Cassandra.
Memcached is a simple component: it installs memcached and exposes it, optionally allowing a password to be set. As there was an upstream image that worked for us, we reused it instead of building our own: https://github.com/tutumcloud/tutum-docker-memcached. To deploy this image to Mesos, a JSON file needs to be created:
{ "id": "memcached", "cpus": 0.5, "mem": 64.0, "instances": 1, "container": { "type": "DOCKER", "docker": { "image": "tutum/memcached", "network": "BRIDGE", "portMappings": [ { "containerPort": 11211, "hostPort": 0, "servicePort": 11211, "protocol": "tcp" } ] } }, "healthChecks": [ { "protocol": "TCP", "portIndex": 0, "path": "/", "gracePeriodSeconds": 5, "intervalSeconds": 20, "maxConsecutiveFailures": 3 } ], "env": { "MEMCACHED_PASS": "admin" } }
This JSON file can be used to deploy the initial app to Mesos with a Marathon API call:
curl -X POST -H "Content-Type: application/json" localhost:8080/v2/apps -d@memcached.json
In this case we use bridged networking because we aren’t tied to any specific port and can let Mesos decide which host port to use (hostPort is set to 0). We use 11211 as the service port, so our HAProxy configuration can map the standard memcached port to the ones actually used by the Docker containers. We also use the healthChecks feature to ensure that our component is working properly: if the health check fails, the Docker instance is destroyed and recreated.
We also use environment variables to set a password for memcached, so that we can connect to it with a known password.
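Once posted, the app's state (the host port Mesos picked, the running tasks and the health check results) can be checked through the Marathon API, for example:

# Show the memcached app definition, its running tasks and their health check results
curl -s localhost:8080/v2/apps/memcached | python -m json.tool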
Third step: create and deploy Docker images – complex approach
The memcached component is an example of a simple Docker image. There are more complex cases, such as creating a Cassandra cluster with Docker. We face two challenges here: Cassandra’s ports are not easily configurable, so it’s better to rely on fixed ports, and the containers need to communicate with each other.
To achieve that, we rely on bridged networking and use environment variables to pass the IPs needed for communication. We needed to create custom Cassandra images designed to work with Mesos. The source code is here: https://github.com/yrobla/docker-cassandra. It is based on Spotify’s Cassandra images, with some updates.
The first step is to create the initial Cassandra node, using the cassandra:latest image. The app is deployed with the following JSON:
{ "id": "yroblacassandra", "instances": 1, "mem": 512, "ports": [7199, 7000, 7001, 9160, 9042, 8012], "requirePorts": true, "container": { "type": "DOCKER", "docker": { "image": "yrobla/cassandra:latest", "network": "BRIDGE", "portMappings": [ { "containerPort": 7199, "hostPort": 7199, "protocol": "tcp" }, { "containerPort": 7000, "hostPort": 7000, "protocol": "tcp" }, { "containerPort": 7001, "hostPort": 7001, "protocol": "tcp" }, { "containerPort": 9160, "hostPort": 9160, "protocol": "tcp" }, { "containerPort": 9042, "hostPort": 9042, "protocol": "tcp" }, { "containerPort": 8012, "hostPort": 8012, "protocol": "tcp" } ] } }, "env": { } }
In the portMappings section we associate each container port with the same host port. The ports setting lists all the ports the component is going to expose, and requirePorts forces Marathon to choose a slave with all of these ports available. This way we ensure that only one Cassandra node is deployed on each slave, so the ports don’t overlap.
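Assuming the JSON above is saved as cassandra.json (the file name is just an example), the first node is deployed with the same kind of Marathon API call as before:

# Submit the first Cassandra node to Marathon
curl -X POST -H "Content-Type: application/json" localhost:8080/v2/apps -d@cassandra.json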
Once the first node is deployed, we can add a second one and cluster them. We use a different JSON this time, hardcoding in the env settings the IP of the first node, which will act as a seed:
{ "id": "yroblacassandracluster", "instances": 1, "mem": 512, "ports": [7199, 7000, 7001, 9160, 9042, 8012, 61621], "requirePorts": true, "container": { "type": "DOCKER", "docker": { "image": "yrobla/cassandra:cluster", "network": "BRIDGE", "portMappings": [ { "containerPort": 7199, "hostPort": 7199, "protocol": "tcp" }, { "containerPort": 7000, "hostPort": 7000, "protocol": "tcp" }, { "containerPort": 7001, "hostPort": 7001, "protocol": "tcp" }, { "containerPort": 9160, "hostPort": 9160, "protocol": "tcp" }, { "containerPort": 9042, "hostPort": 9042, "protocol": "tcp" }, { "containerPort": 8012, "hostPort": 8012, "protocol": "tcp" } ] } }, "env": { "CASSANDRA_TOKEN": "1", "CASSANDRA_SEEDS": "xx.xx.xx.xx" } }
This JSON file uses the cassandra:cluster image rather than cassandra:latest; it is a slightly modified version of the image that runs the clustering scripts. We pass the internal IP of the first node, which acts as the seed, and a CASSANDRA_TOKEN to identify the node number. This way both nodes are able to communicate and form a Cassandra cluster.
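As a quick sanity check (a sketch, assuming a Docker version that supports docker exec; the container id has to be looked up first), nodetool can be run inside one of the containers to confirm that both nodes joined:

docker ps                                   # find the id of the cassandra container on this slave
docker exec <container_id> nodetool status  # both nodes should be listed as UN (Up/Normal)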
This is an initial working approach, but the final image will be integrated with Mesos: we will pass the Marathon endpoint to the images, and they will be able to ask the API about the existing Cassandra nodes in the deployment and use them as seeds.
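A hedged sketch of that lookup, using Marathon's existing tasks endpoint (the Marathon address and the jq filter are illustrative, not part of our current images):

# Ask Marathon on which hosts the first cassandra app is running, to use them as seeds
curl -s -H "Accept: application/json" http://<marathon_host>:8080/v2/apps/yroblacassandra/tasks | jq -r '.tasks[].host'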
Final step: add HAProxy
If we want a load-balancer-as-a-service (LBaaS) setup, providing a single endpoint instead of having to query the API for the IPs of all the components, we can run HAProxy on the master: https://mesosphere.github.io/marathon/docs/service-discovery-load-balancing.html
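The document linked above describes a haproxy-marathon-bridge helper script shipped with Marathon that generates the HAProxy configuration from the running apps; a rough sketch of using it (treat the exact URL and invocation as assumptions):

# Download the bridge script from the Marathon repository
wget https://raw.githubusercontent.com/mesosphere/marathon/master/bin/haproxy-marathon-bridge
chmod +x haproxy-marathon-bridge
# Generate an HAProxy configuration from the apps Marathon knows about and reload HAProxy
./haproxy-marathon-bridge localhost:8080 > /etc/haproxy/haproxy.cfg
service haproxy reload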
Conclusion
This post describes our first steps with Docker and Mesos and tries to show their potential. There is still much work to do on our side, one of the main tasks being the integration with the Marathon API. Once all the steps are completed, it looks like a very powerful system to us, allowing us to scale depending on our needs and to react to demand or failures.