Estimated reading time: 8 mins
Is it possible to build a five-node (3 manager nodes, 2 worker nodes) Docker Swarm in under 2 minutes? Yes it is! Some weeks ago, Henning Jacobs, who works at Zalando Technology, posted a tweet referencing an article he wrote called “Why Kubernetes?”. That article discusses another post, “Maybe You Don’t Need Kubernetes”, written by Matthias Endler, who works at Trivago. There are always pros and cons to every solution, but I missed Docker Swarm in his article. And there was another thing that caught my attention. He wrote that “[…] creating a cluster on DigitalOcean takes less than 4 minutes and is reasonably cheap ($30/month for 3 small nodes with 2 GiB and 1 CPU each).”. In addition, he wrote that at Zalando they “[…] run 100+ Kubernetes clusters […]”.
So I asked myself how long it would take to set up a Docker Swarm cluster with 3 managers and 2 workers on AWS myself, and furthermore whether it would be possible to start 101 (100+) Docker Swarm clusters too. Short answer: yes it is 😎! But let’s start with the idea. As a side note, “3 small nodes” are not a production setup for Kubernetes, whereas 3 manager nodes and 2 worker nodes are a production setup for Docker Swarm.
After some testing it was clear to me that I would use a script to run multiple Ansible Playbooks in parallel. I ended up with the following Bash script:
This script uses the variable $1, which is provided by a wrapper script (to create n Docker Swarms) that is a simple counter loop. First of all, I need the mm-node, the master-manager node. The master-manager node is the node where the docker swarm init command will be issued after the EC2 instances are created - see line 20. To make things easier, the script waits on line 17 for all parallel executions to finish. The & after the code lines indicates that these lines run in parallel. In line 20 the join tokens for manager and worker joins are created, and from line 23 to line 28 they are used to register the nodes as manager or worker, again in parallel. A rough sketch of this structure is shown below.
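To give you an idea of the structure, here is a minimal sketch of how such a per-swarm script and its wrapper could look. The playbook names, variable names and file layout in this sketch are placeholders, not the exact files from my setup.

```bash
#!/bin/bash
# Minimal sketch of a per-swarm script (create-swarm.sh).
# Playbook names and variables are placeholders, not the exact files used above.
SWARM_NO=$1   # swarm number, handed over by the wrapper loop

# create the master-manager node and the remaining nodes in parallel
ansible-playbook create-ec2.yml -e "swarm_no=${SWARM_NO} node_role=mm node_count=1" &
ansible-playbook create-ec2.yml -e "swarm_no=${SWARM_NO} node_role=manager node_count=2" &
ansible-playbook create-ec2.yml -e "swarm_no=${SWARM_NO} node_role=worker node_count=2" &
wait   # block until all EC2 instances exist

# run "docker swarm init" on the master-manager and save the join tokens
ansible-playbook swarm-init.yml -e "swarm_no=${SWARM_NO}"

# join the remaining managers and the workers in parallel
ansible-playbook swarm-join.yml -e "swarm_no=${SWARM_NO} node_role=manager" &
ansible-playbook swarm-join.yml -e "swarm_no=${SWARM_NO} node_role=worker" &
wait
```

The wrapper script is then nothing more than a counter loop that forks one such script per swarm:

```bash
#!/bin/bash
# Wrapper: start the creation of n Docker Swarms in parallel.
for i in $(seq 1 "$1"); do
  ./create-swarm.sh "$i" &
done
wait
```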
To get this up and running smoothly and quickly, I had to use some tricks 🤩😁 - they might be useful to others out there!
Trick #1: “dynamic” inventory
The EC2 instances use dynamic IP addresses. Therefore the script and also the Ansible Playbooks cannot rely on static Ansible inventories! There are dynamic inventory scripts for Ansible and AWS (and many others) out there, and they are officially supported, but they are often not that fast. Thankfully, there is ec2_instance_facts for AWS to filter (find) instances which meet certain requirements. If instances are found, we add them to an in-memory Ansible inventory. Look at the Ansible Playbook below, lines 10-23.
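A stripped-down sketch of this trick could look like the two tasks below; region, tag pattern and group name are placeholders, not my exact playbook.

```yaml
# Sketch of the in-memory inventory trick; region, tag pattern and
# group name are placeholders.
- name: Find the running EC2 instances of this swarm by their Name tag
  ec2_instance_facts:
    region: eu-central-1
    filters:
      "tag:Name": "swarm-{{ swarm_no }}-*"
      instance-state-name: running
  register: swarm_instances

- name: Add the found instances to an in-memory inventory group
  add_host:
    name: "{{ item.public_ip_address }}"
    groups: "swarm_{{ swarm_no }}"
  loop: "{{ swarm_instances.instances }}"
```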
Trick #2: Name your instances
The second trick is to tag the instances you create with names that are dynamic but predictable. With this Ansible Playbook we can create not only one Docker Swarm cluster, but hundreds if we like. Look at the Ansible Playbook below, line 14.
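For illustration, a task that creates instances with such a dynamic but predictable name could look like this; the AMI, key name and naming pattern are placeholders.

```yaml
# Sketch of the naming trick; AMI, key name and the naming pattern
# "swarm-<number>-<role>-<index>" are placeholders.
- name: Create an EC2 instance with a predictable Name tag
  ec2:
    region: eu-central-1
    image: ami-xxxxxxxx
    instance_type: t3.nano
    key_name: my-key
    wait: yes
    count: 1
    instance_tags:
      Name: "swarm-{{ swarm_no }}-{{ node_role }}-{{ node_index }}"
```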
Trick #3: Save Ansible data (variables) to a local file
This is huge! You can save Ansible output to a local file and afterwards load the data from this file to use it in another playbook. Look at the Ansible Playbook below, lines 40-44. If you are clever, you can create very smart playbooks. In this case, I save the Docker join tokens to files that are named according to the Docker Swarm cluster that is currently being created. Therefore, you can use this information during the parallel creation of Docker Swarms!
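A minimal sketch of how the join tokens could be written to per-swarm files on the Ansible host; paths and variable names are placeholders.

```yaml
# Sketch of saving Ansible data to a local file; paths and variable
# names are placeholders.
- name: Read the manager join token on the master-manager node
  command: docker swarm join-token -q manager
  register: manager_token

- name: Save the token to a file named after the current swarm
  copy:
    content: "{{ manager_token.stdout }}"
    dest: "/tmp/swarm-{{ swarm_no }}-manager-token"
  delegate_to: localhost
```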
Trick #4: Load Ansible data (variables) from a local file
It is easy (once you have found out how) to load Ansible data from local files saved previously. Look at the Ansible Playbook below, lines 30-38.
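Loading the data back could then look like this, using the same placeholder path as above; the file lookup runs on the Ansible control host, which is exactly where the files were saved.

```yaml
# Sketch of loading previously saved data; the file lookup is executed
# on the Ansible control host where the token files were written.
- name: Load the manager join token saved by the init playbook
  set_fact:
    manager_join_token: "{{ lookup('file', '/tmp/swarm-' ~ swarm_no ~ '-manager-token') }}"
```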
Trick #5: Use Ansible built-in functions
Ansible comes with a lot of handy functions. In this example, I use split to extract the number of the Docker Swarm this Ansible Playbook is running for, in order to load the correct Docker Swarm join command. See line 32 below.
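As an illustration, assuming the instance Name tag follows the "swarm-&lt;number&gt;-&lt;role&gt;-&lt;index&gt;" pattern from trick #2, the swarm number could be extracted like this; instance_name, the token file path and master_manager_ip are placeholders.

```yaml
# Sketch of the split trick; instance_name (e.g. "swarm-5-worker-1"),
# the token file path and master_manager_ip are placeholders.
- name: Derive the swarm number from the instance Name tag
  set_fact:
    swarm_no: "{{ instance_name.split('-')[1] }}"

- name: Join the node to the correct swarm
  command: >
    docker swarm join
    --token {{ lookup('file', '/tmp/swarm-' ~ swarm_no ~ '-worker-token') }}
    {{ master_manager_ip }}:2377
```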
Here is the video of this run, which took 1 minute and 47 seconds.
Create 100+ Docker Swarms
AWS raised our limit for EC2 nano instances from 28 (the default) to 550 - the only thing you have to do for this is to open a support ticket. The next problem is that the Bash fork mechanism really exhausts the resources of our Ansible host - which is OK, as copying the Python processes is expensive. Just for a test, I gave it 32 cores and 64 GB of memory (VMware). With this configuration I was able to start the creation of 101 Docker Swarms, but then I was locked out by the AWS API - “Too many requests” 😂 Maybe in the future I will try to create 10 or 20 Docker Swarms at a time to stay below this limit.
Ansible in combination with AWS and Docker Swarm is pretty awesome! It was a lot of fun to optimize the playbook runs to get this up and running in parallel. I will upload the playbooks to a GitLab repository in the next few weeks. If you need them earlier, let me know!
Estimated reading time: 4 mins
For the second time I attended the DevOps Gathering as a speaker. This year I shared the stage with Alexander Ortner, a colleague and friend of mine, and we did our talk together. Bernhard Rausch was with us as well, but we had to leave him behind during our travel challenge. 😮
Our travel challenge started on Tuesday, 12th of March, at 5 am at our workplace. The plan was to travel to Salzburg airport by car to catch our flight to Düsseldorf. Normally this ride takes about 90 minutes. Our flight was scheduled for 8:25 am, so usually plenty of time in reserve. But this day was one of those days where hardly anything works as expected. The first thing that happened was that a truck crash stopped our ride to Salzburg airport. Unfortunately we were stuck on the motorway for more than two hours and therefore were not able to catch our flight.
As the original plan had been to attend the DevOps Gathering 2019 as private persons, all chances seemed gone to make it to our talk on Wednesday. But between booking the flights some months ago and the day this story happened, our employer, STRABAG BRVZ GmbH, had kindly agreed to support our travel. So Alex called his superior and we got the go-ahead to book flights from Munich for Alex and me (Mario). MANY, MANY THANKS FOR THIS SUPPORT TO STRABAG BRVZ GMBH! 💗
But we had to leave Bernhard behind at Salzburg train station 😓. Nevertheless, Alex and I (Mario) went on to Munich to catch the flight from there. After the ride to Munich we were able to check in in time, and after a rough flight with a nice crosswind landing we caught the train to Bochum without any problems. After thirteen hours we finally arrived at the DevOps Gathering location in Bochum (G DATA).
When we arrived, we received a really huge welcome from the other attendees! Special thanks to Xinity - you are always welcome, my friend! We talked a lot with the other attendees and were able to catch up on the latest information. After some hours we left the venue and went back to our hotel, where Alex and I updated our presentation with a special slide to honor Bernhard for everything he did to try to be with us. It’s always about the people and friends - people matter!
The next day we started early to set everything up and test our equipment at the conference location. Then it was stage time, and overall everything went smoothly! You can find the slides from our presentation on Speaker Deck | C4 - Continuous Culture Change Challenges! It’s different whether you do a talk alone or share the stage - both ways have their own challenges. After the talk we got a load of positive feedback, and we would like to say THANK YOU for all of it!
An hour later or so, I checked the trains from Düsseldorf to Bochum for our return travel, and that’s when I noticed that all trains from Düsseldorf to Bochum were cancelled for the whole day because of the storm (trees on the track). Niclas Mietz from bee42 (the DevOps Gathering organizer) was so kind as to bring us to Essen, where we were able to catch our flight to Munich. After the ride back to our work location we arrived home happily.
The DevOps Gathering 2019 was a great conference for us, even if we were not able to stay for long. It was very, very nice to see how everyone tried to help us, and our short stay was very intense. Many, many thanks to all who supported us! 🤗
And here is the recording of our talk!
Here are some pictures from the conference!
Posted on: Mon, 18 Mar 2019 19:21:00 +0100 by Mario Kleinsasser , Alexander O. Ortner , Bernhard Rausch
Estimated reading time: 4 mins
This weekend, from the 25th until the 27th of January, DEVCONF.cz took place at the Faculty of Information Technology in Brno (Czech Republic), and I got the chance to attend. As an open-source-focused, community-driven conference mainly sponsored by Red Hat, there was no ticket charge, but a free ticket registration was required. Now, after the conference, I know why, but more on that later. Red Hat runs a large office in Brno (around 1200 employees) and most of them work in a technical area. Therefore there is an intense partnership between the technical university of Brno and Red Hat.
I learned about this conference from colleagues who work at Red Hat Vienna. A while ago they told me that there is a large annual conference in Brno and asked if I would be interested in attending. I said yes, because the conference is free of charge, community driven, the schedule was very interesting, and my company (STRABAG) paid the hotel expenses - many thanks for that at this point! ❤️ 😃
My journey started on Friday morning at the company, and after a 6-hour drive I arrived at my hotel in Brno around 3 pm, safe and sound. I checked in and went off to the conference venue immediately by tram. What should I say - there were lots of people there! As written above, now I know why a free registration is needed and recommended before the conference. DEVCONF.cz used the registration system provided by Eventbrite, which worked perfectly!
The first track I listened to was Ansible Plugins by Abhijeet Kasurde, which was very informative because it showed how easy it is to extend Ansible, for example by plugging in filters. The second and last track on Friday was Convergence of Communities: OKD = f(Kubernetes++) by Daniel Izquierdo and Diane Mueller. This one was really interesting as it gave a cool insight into how people contribute to various open source projects, based on GitHub repositories, commits and comments.
After that, I went back to the hotel and met the colleagues from Red Hat, Franz Theisen and Armin Müellner, and after some chatting we went to dinner, which was really delicious! During the dinner I had the chance to talk to other colleagues who were with us.
On Saturday I got the chance to visit the Red Hat office in Brno, and after a delicious coffee we went on to the conference.
I had a fully packed day with a lot of sessions, which are listed below. The full schedule can be found here.
- Container Meetup
- Ansible to manage Desktops
- Visualized Workstation
- Containers without Daemons - Podman Internals
- Container pipeline for devs and enterprises alike!
- How OpenShift Builds Container Images
- Insiders info from the Masters of Clouds
- Managing a fleet of Linux desktops with Ansible
- Legacy Monolith to Microservices
All of the tracks that I attended were great! But I would like to highlight two of them. The Containers Meetup with Daniel Walsh was super interesting because of the discussion about cgroups v2, which might cause a lot of problems for container software. The problem is that the cgroups v2 interface of the Linux kernel is not compatible with the v1 version. This means that software which relies on libraries implementing cgroups v1, like Docker and others, will break if the new kernel interface is enabled. In the meetup it was discussed whether the upcoming Fedora version should go this way. Well, we will see what's coming up…
Insiders info from the Masters of Clouds is the second one I would like to mention, because it gave lots of insight into how Red Hat manages its infrastructure. For me it was really cool to see that Red Hat also uses Zabbix heavily for system monitoring, just like we do on-premises!
On Saturday evening we had a very nice dinner and the opportunity to continue our chats from Friday. On Sunday I went back to Austria early, as I had to drive 6 hours back. 🚗😃
In summary, I am very happy that I got the chance to attend this conference and I will try to attend next year too! I met a lot of cool people, like Akihiro Suda, the Docker Community Leader of Tokyo, which I am really proud of. DevConf.cz, I will come back!
Estimated reading time: 2 mins
The two-week holiday vacation is over, we are back at work, and the Docker Swarms ran fully unattended without a single outage during this period! Reliability may mean something different for every IT engineer, because it depends on which goals you have to achieve with your team.
Last year at the same time we ran roughly 450 containers in our Docker Swarms; this year we already had more than 1500 containers - almost three times more than last year.
For me, the Yin and Yang symbol is not the worst symbol to reflect the idea of Site Reliability Engineering, because there are always trade-offs you have to accept between the ultimately reliable system and the infinite time it would take to implement such a system. Therefore different and often completely contrary needs have to work together to still create a system that fulfills all requirements, like Yin and Yang.
The monitoring and alerting system observed the Docker Swarms autonomously, and today we reviewed the tracked data. The only thing that happened was a failure of a single Docker worker node, which rebooted. The Docker Swarm automatically started the missing containers on the remaining Docker workers and nothing more happened.
I think we did a great job, because we had two full vacation weeks without any stress. The Docker Swarm didn’t break, all services were always up and running, and the system handled a failing Docker worker as expected.
Times like these are always exciting because they prove whether the systems keep working even without people watching them. Happy new year and happy hacking!
Estimated reading time: 4 mins
This year we had the great opportunity to attend DockerCon EU 2018 in Barcelona with four people. Two of us, Alex and Martin, are developers, and Bernhard and I are operators, so it was a real DevOps journey! The decision to go with both teams, in terms of DevOps, was the best we ever made, and we are very thankful that our company, STRABAG BRVZ, supported this idea. In fact, there were a lot of topics which were developer focused, and in parallel there were also a lot of breakouts that were more operator focused. So we got the best of both worlds.
We will not write a long summary of all the sessions, breakouts and workshops we attended, as you can find all the sessions already online - videos of all sessions are available here!
Rather, we will give you an inside view of a great community.
I (Mario) am an active Docker Community Leader and therefore got the chance to attend the Docker Community Leader Summit, which took place on Monday afternoon. I came late to the summit because our flight was delayed, but Lisa (Docker Community Manager) reserved a seat for me. I was therefore only able to take part in the last two hours of the summit, but this was still a huge benefit. You might think that at such summits there are only soft, watered-down discussions going on, but from my point of view I can tell you that this was not the case. Instead, the discussion was very focused on the pros and cons of what Docker expects from the Community Leaders and what the Community Leaders can expect from Docker in terms of support for their meetups. In short, there will be a new Code of Conduct for the Community Leaders in the near future. The second discussion was about Bevy, the “Meetup” platform where the Docker Meetup pages are created and the Docker Meetups are announced. Not all of us are happy with the current split of the community between bevy.com and meetup.com, and we discussed both sides of the coin. This is obviously a topic we will have to look at more in the next few months, and we will see how things progress. Sadly, I had to leave the summit right on time, as Bernhard and I were about to hold a Hallway Track, and therefore I missed the Community Leader Summit group photo…
The Hallway Track we did was really fun and impressive. We shared our BosnD project, as we think that a lot of people are still struggling to run more than a handful of services in production. There are new load balancer concepts like Traefik out there, and there are also service meshes, but most of the time people just want to get up and running with the things they already have - only in containers and with the many benefits of an orchestrator (like Docker Swarm). And judging from our Hallway Track, and also from the Hallway Track held by Rachid Zarouali (AMA Docker Captains Track), which I attended too, this is still one of the main issues.
The DockerCon party was huge and we had the chance to talk to a lot of people and friends. It was a very nice evening with great food and a large number of discussions. After DockerCon EU 2017, people said that Docker was dead and that the Docker experiment would be over soon. And yes, it was not clear how Docker Inc. would handle the challenges it was facing. One year later, after Microsoft bought GitHub and Red Hat was swallowed by IBM, Docker Inc. is now on a good course. Of course, they have to run their Enterprise program, they have to earn money, but they are still dedicated to the community and - and this was surprising - to their customers. There were some really cool breakouts, like the one from Citizens Bank, which clearly showed that Docker Inc. (the company) is able to handle both Docker Swarm AND Kubernetes very well with their Docker EE product.
Well, we will see where this is all going, but in our opinion Docker Inc. currently seems to be in good shape (look at their growing customer numbers) and their business model seems to work.