Estimated reading time: 4 mins
This weekend, from the 25th until the 27th of January, DEVCONF.cz took place at the Faculty of Information Technology in Brno (Czech Republic) and I got the chance to attend. As it is a community-driven open source conference, mainly sponsored by Red Hat, there was no ticket charge, but a free ticket registration was required. Now, after the conference, I know why, but more on that later. Red Hat runs a large office in Brno (around 1200 employees) and most of them work in a technical area. Therefore there is a close partnership between the technical university of Brno and Red Hat.
I learned about this conference from colleagues who work at Red Hat Vienna. A while ago they told me that there is a large annual conference in Brno and asked whether I would be interested in attending. I said yes, because the conference is free of charge and community driven, the schedule was very interesting, and my company (STRABAG) paid the hotel expenses - many thanks for that at this point! ❤️ 😃
My journey started on Friday morning at the company and after a 6 hour drive I arrived at my hotel in Brno around 3 pm, safe and sound. I checked in and immediately went off to the conference venue by tram. What should I say, there were lots of people there! As written above, now I know why a free registration is needed before the conference. DEVCONF.cz used the system provided by Eventbrite, which worked perfectly!
The first talk I attended was Ansible Plugins by Abhijeet Kasurde, which was very informative because it showed how easily Ansible can be extended with plugins, for example filters. The second and last talk on Friday was Convergence of Communities: OKD = f(Kubernetes++) by Daniel Izquierdo and Diane Mueller. This one was really interesting as it gave a cool insight into how people contribute to various open source projects, based on the GitHub repositories, commits and comments.
After that, I went back to the hotel and met the colleagues from Red Hat, Franz Theisen and Armin Müellner, and after some chatting we went to dinner, which was really delicious! During the dinner I had the chance to talk to other colleagues who were with us.
On Saturday I got the chance to visit the Red Hat office in Brno and after a delicious coffee we went on to the conference.
I had a fully packed day with a lot of sessions, which are listed below. The full schedule can be found here.
- Container Meetup
- Ansible to manage Desktops
- Visualized Workstation
- Containers without Daemons - Podman Internals
- Container pipeline for devs and enterprises alike!
- How OpenShift Builds Container Images
- Insiders info from the Masters of Clouds
- Managing a fleet of Linux desktops with Ansible
- Legacy Monolith to Microservices
All of the talks I attended were great, but I would like to highlight two of them. The Container Meetup with Daniel Walsh was super interesting because of the discussion about cgroups v2, which might cause a lot of problems for container software. The problem is that the cgroups v2 interface of the Linux kernel is not compatible with v1. This means that software which relies on libraries that implement cgroups v1, like Docker and others, will break if the new kernel interface is enabled. In the meetup it was discussed whether the upcoming Fedora version should go this way. Well, we will see what's coming up…
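To make the v1/v2 distinction a bit more tangible, here is a minimal Python sketch (not something shown at the meetup) that classifies a host from a /proc/mounts-style listing. The function name and the sample lines are assumptions for illustration; the only fact it relies on is that v1 hierarchies are mounted with the filesystem type cgroup and the unified v2 hierarchy with the type cgroup2.

```python
def cgroup_mode(mounts_text):
    """Classify the cgroup setup from /proc/mounts-style text.

    Returns "v2" (only cgroup2 mounted), "v1" (only cgroup),
    "hybrid" (both) or "none".
    """
    fstypes = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fstype, options, ...
        if len(fields) >= 3 and fields[2] in ("cgroup", "cgroup2"):
            fstypes.add(fields[2])
    if fstypes == {"cgroup2"}:
        return "v2"
    if fstypes == {"cgroup"}:
        return "v1"
    if fstypes:
        return "hybrid"
    return "none"


# Hypothetical sample that resembles a v1-only host
sample_v1 = (
    "cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,blkio 0 0\n"
    "cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,memory 0 0\n"
)
print(cgroup_mode(sample_v1))  # prints "v1"
```

On a real host you could pass in the content of /proc/mounts instead of the sample string.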
The Insiders info from the Masters of Clouds is the second one I would like to mention, because it gave a lot of insight into how Red Hat manages its infrastructure. For me it was mega cool to see that Red Hat also uses Zabbix heavily for system monitoring, like we do on-premises!
On Saturday evening we had a very nice dinner and the opportunity to continue our chats from Friday. On Sunday I went back to Austria early, as I had to drive 6 hours back. 🚗😃
In summary, I am very happy that I got the chance to attend this conference and I will try to attend next year too! I met a lot of cool people, like Akihiro Suda, the Docker Community Leader of Tokyo, which I am really proud of. DevConf.cz, I will come back!
Posted on: Sun, 27 Jan 2019 16:32:12 +0100 by Mario Kleinsasser
Estimated reading time: 2 mins
The two-week holiday vacation is over, we are back at work and the Docker Swarms ran fully unattended without a single outage during this period! Reliability may mean something different for every IT engineer, because it depends on the goals you have to achieve with your team.
At the same time last year we ran roughly 450 containers in our Docker Swarms; this year we already had more than 1500 containers, more than three times as many.
For me, the Yin and Yang symbol is not the worst symbol to reflect the idea of Site Reliability Engineering, because there are always trade-offs you have to accept between the ultimately reliable system and the infinite time it would take to implement such a system. Therefore different and often fully contrary needs have to work together to still create a system that fulfills all needs, like Yin and Yang.
The monitoring and alerting system observed the Docker Swarms autonomously and today we reviewed the tracked data. The only thing that happened was a failure of a single Docker worker node, which rebooted. The Docker Swarm automatically started the missing containers on the remaining Docker workers and nothing more happened.
I think that we did a great job, because we had two full vacation weeks without any stress. The Docker Swarm didn't break, all services were always up and running and the system handled a failing Docker worker as expected.
Times like these are always exciting because they prove whether the systems work even without people watching them. Happy new year and happy hacking!
Posted on: Mon, 07 Jan 2019 20:24:46 +0100 by Mario Kleinsasser
Estimated reading time: 4 mins
This year we had the great opportunity to attend DockerCon EU 2018 in Barcelona with four people. Two of us, Alex and Martin, are developers, and Bernhard and I are operators, so we did a real DevOps journey! The decision to go with both teams, in terms of DevOps, was the best we ever made and we are very thankful that our company, STRABAG BRVZ, supported this idea. In fact, there were a lot of topics which were developer focused and in parallel there were also a lot of breakouts that were more operator focused. So we got the best of both worlds.
We will not write a long summary of all sessions, breakouts and workshops we attended, as you can find all the sessions online already - videos of all sessions are available here!
Rather, we will give you an inside view of a great community.
I (Mario) am an active Docker Community Leader and therefore I got the chance to attend the Docker Community Leader Summit, which took place on Monday afternoon. I came late to the summit because our flight was delayed, but Lisa (Docker Community Manager) reserved a seat for me. Therefore I was only able to join for the last two hours of the summit, but this was still a huge benefit.
You might think that at such summits there are only watered-down discussions going on, but from my point of view I can tell that this was not the case. Instead, the discussion focused on the pros and cons of what Docker expects from the Community Leaders and what support the Community Leaders can expect from Docker for their meetups. In short, there will be a new Code of Conduct for the Community Leaders in the near future.
The second discussion was about Bevy, the “Meetup” platform where the Docker Meetup pages are created and the Docker Meetups are to be announced. Not all of us are happy with the current split of the community between bevy.com and meetup.com and we discussed both sides of the coin. This is obviously a topic we will have to look at more closely in the next few months and we will see how things progress. Sadly, I had to leave the summit right at the end, as Bernhard and I were going to hold a Hallway Track, and therefore I missed the Community Leader Summit group photo…
The Hallway Track we did was really fun and impressive. We shared our BosnD project, as we think that a lot of people are still struggling to run more than a handful of services in production. There are new load balancer concepts like Traefik out there and there are also service meshes, but most of the time people just want to get up and running with the things they already have, only in containers and with the many benefits of an orchestrator (like Docker Swarm). Judging from our Hallway Track and from the one held by Rachid Zarouali (AMA Docker Captains Track), which I attended too, this is still one of the main issues.
The DockerCon party was huge and we had the chance to talk to a lot of people and friends. It was a very nice evening with great food and a large number of discussions. After DockerCon EU 2017 people said that Docker is dead and that the Docker experiment will be over soon. And yes, it was not clear how Docker Inc. would handle the challenges it faced. One year later, after Microsoft bought GitHub and Red Hat was swallowed by IBM, Docker Inc. is now on a good course. Of course, they have to run their Enterprise program and they have to earn money, but they are still dedicated to the community and, which was surprising, to their customers. There were some really cool breakouts, like the one from Citizens Bank, which clearly showed that Docker Inc. (the company) is able to handle both Docker Swarm AND Kubernetes very well with their Docker EE product.
Well, we will see where this is all going, but in our opinion Docker Inc. currently seems to be healthy (look at their growing number of customers) and their business model seems to work.
Posted on: Sun, 30 Dec 2018 20:00:09 +0100 by Mario Kleinsasser, Bernhard Rausch
Estimated reading time: 10 mins
Some weeks ago I dived a little bit into the Play with Docker GitHub repository because I would like to run Play With Docker (called PWD) locally, to have a backup option during a Docker Meetup in case something goes wrong with the internet connectivity or with the workshop sessions prepared by Docker.
The most important thing first: running PWD locally means running it on localhost by default, and this obviously will not allow others to connect to the PWD setup on your localhost.
Second, I read a PWD GitHub issue where a user asked how to run PWD on AWS. I thought that this would be nice to have and of course I would like to help this user. So, this one is for you too, Kevin Hung.
Third, due to our job as Cloud Solution Architects at STRABAG BRVZ IT we have the possibility to try things out without having to worry about the general conditions. This blog post is a holiday gift from #strabagit. If you like it, share it, as sharing is caring. :-)
To be honest, this blog post will be very technical (again) and there are probably a lot of other ways to achieve the same result. Therefore this post is not meant to be the holy grail and it is far from perfect in terms of security, e.g. authentication. This post is meant to be a cooking recipe - feel free to change the ingredients as you like! I will try to describe all steps in enough detail that everyone can adapt them to their personal needs and possibilities.
Tip: It might be helpful to read the whole article once before you start working with it!
As every cooking recipe needs an ingredient list, here it comes:
- AWS account (but you can use another cloud provider too)
- EC2 (a free tier AMI is ok for testing)
- Route53 (needed, as otherwise no one can connect to your PWD installation)
- Ansible (we use Ansible for IaaS tasks, but I will try to explain everything, so you can get it up and running without it)
- Domain (you need a domain! A free domain from Freenom is enough)
This is going to be a cloud solution, hosted on AWS. And as with nearly every cloud solution, it is hard to bootstrap the components in the correct order because there might be implicit dependencies. Before we can cover the installation of PWD we have to prepare the environment. First of all we need the internet domain name we would like to use, as this name needs to be known later during the PWD configuration.
1. The domain and AWS Route53
As written above, a free domain from Freenom fits perfectly! Therefore, choose a domain name and register it on Freenom. At this point, we have to do two things in parallel, as your domain name and the AWS Route53 configuration depend on each other!
Once you have registered a domain name on Freenom, move to your AWS console and open the AWS Route53 dashboard. Create a public hosted zone there with your zone name from Freenom. What we would like to achieve is a so-called DNS delegation. To achieve this, write down the NS records you get when you create a hosted zone with AWS Route53. For example, I registered m4r10k.cf at Freenom. Therefore I created a hosted zone called m4r10k.cf in AWS Route53, which results in a list of NS records, in my case e.g. ns-296.awsdns-37.com. and ns-874.awsdns-45.net. Head over to Freenom, edit your domain name and under your domain configuration choose DNS and enter the NS records provided by AWS Route53. See the picture on the right for details.
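A typical mistake with such a delegation is a mismatch between the name servers shown by Route53 and the ones typed in at the registrar. The following Python sketch only illustrates that comparison; the function names and the registrar list are assumptions, while the two NS names are the examples from above. It compares the sets case-insensitively and ignores the trailing dot.

```python
def normalize_ns(name):
    """Lower-case a name server name and drop the trailing dot."""
    return name.strip().lower().rstrip(".")


def delegation_matches(route53_ns, registrar_ns):
    """True if the registrar lists exactly the name servers Route53 assigned."""
    expected = {normalize_ns(n) for n in route53_ns}
    configured = {normalize_ns(n) for n in registrar_ns}
    return expected == configured


# NS records as shown by Route53 (examples from the text)
route53_ns = ["ns-296.awsdns-37.com.", "ns-874.awsdns-45.net."]
# What was typed in at the registrar (hypothetical input)
registrar_ns = ["NS-874.AWSDNS-45.NET", "ns-296.awsdns-37.com"]

print(delegation_matches(route53_ns, registrar_ns))  # prints True
```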
We will need the AWS Route53 hosted domain later to automatically register our AWS EC2 instance with an appropriate DNS CNAME entry called pwd.m4r10k.cf.
2. The AWS EC2 instance and Play with Docker installation
As mentioned above, we are using Ansible to automate our cloud setups, but of course you can do all the next steps manually. I will reference the Ansible tasks in the correct sequence to show you how to set up Play With Docker on an AWS EC2 instance. The process itself is fully automated but once again, you can do all this manually too.
At first we start the AWS EC2 instance which is pretty easy with Ansible. The documentation for every module, in this example this is ec2, can be found in the Ansible documentation. The most important thing here is, that the created instance is tagged, so we can find it later by the provided tag. As operating system (AMI), we use Ubuntu 18.04 as it is easier to install go-dep which is needed later.
After that, we install the needed software into the newly created AWS EC2 instance. This is the longer part of the Ansible playbook. Be aware that you might have to wait a little bit until the SSH connection to the AWS EC2 instance is ready. You can use the following to wait for it. The ec2training inventory is dynamically built during runtime.
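If you are not using Ansible, the waiting step can be approximated with a few lines of Python. This is only a sketch under assumptions: the function name, the timeout and the polling interval are made up, and an open TCP port 22 is merely a proxy for "SSH is ready" (the SSH daemon may still need a moment before logins work).

```python
import socket
import time


def wait_for_port(host, port, timeout=300.0, interval=5.0):
    """Poll until a TCP connection to host:port succeeds or the timeout expires.

    Returns True as soon as a connection could be opened, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Try to open (and immediately close) a TCP connection
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example call (hypothetical host name):
# wait_for_port("ec2-host.example.com", 22)
```

Polling with a deadline keeps the playbook or script from hanging forever if the instance never comes up.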
The next thing we have to do is to install Python as the AWS EC2 Ubuntu AMI does not include Python. Python is needed for the Ansible modules. Therefore we install Python into the AWS EC2 instance the hard way.
Now we go on and install the whole Docker and PWD software. Here comes the description of the tasks in the playbook. The most important step here is, that you replace the localhost in the config.go file of PWD with your Freenom domain!
- Ping pong: Check if Ansible and Python work correctly
- Add Docker GPG key: Add the Docker apt repository GPG key
- Add Docker APT repository: Add the Docker apt repository
- Install Docker: Install the Docker version given by var docker_version
- Apt mark hold Docker: We hold back the Docker package, as we do not want it to be updated automatically during system updates
- Install go-dep: Install go-dep because we need it later for the Play With Docker dependencies
- Install docker-compose: Install docker-compose because we need it later to start Play With Docker
- Add ubuntu user to Docker group: We add the ubuntu user to the docker group to be able to run the Docker commands without sudo
- Run Docker Swarm Init: We create a Docker swarm because this is needed by PWD
- Git clone Docker PWD: Now we clone the PWD repository from GitHub to the correct local folder
- Run go dep: Now we run go dep to resolve all dependencies which are needed by PWD
- Replace localhost in config.go of PWD: This is the most important part! Replace this with your Freenom domain name!
- Docker pull franela/dind: Pull the needed Docker franela/dind image
- Run docker compose: Start the Docker compose file.
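The most important task in the list, replacing localhost in config.go, can also be done without Ansible. The Python sketch below is one possible way and rests on assumptions: the helper name is made up, the Go line in the demo is a stand-in and not a copy of the real config.go, pwd.example.com is a placeholder domain, and only the quoted literal "localhost" is replaced so that other occurrences of the word stay untouched.

```python
import tempfile
from pathlib import Path


def replace_quoted_host(path, new_host, old_host="localhost"):
    """Replace the quoted literal "old_host" in a file.

    Returns the number of replacements so the caller can check
    that something was actually changed.
    """
    target = Path(path)
    text = target.read_text()
    old, new = f'"{old_host}"', f'"{new_host}"'
    count = text.count(old)
    if count:
        target.write_text(text.replace(old, new))
    return count


# Demo on a throwaway file; the Go line is a made-up stand-in
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "config.go"
    demo.write_text('var PlaygroundDomain = "localhost"\n')
    replaced = replace_quoted_host(demo, "pwd.example.com")
    result = demo.read_text()

print(replaced, result.strip())
```

Checking the returned count is useful because a silent "nothing replaced" would leave PWD bound to localhost.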
3. Automatically create the AWS Route53 CNAME records
Now the only thing left is to create AWS Route53 CNAME records. We can use Ansible for this too. The most important thing here is, that you also create a wildcard entry for your domain. If you later run Docker images which are exposing ports, like Nginx for example, PWD will automatically map the ports to a dynamic domain name which resides under your PWD domain.
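As a rough illustration of the two records, here is a Python sketch that builds a change batch in the structure the Route53 API expects for change_resource_record_sets. The helper name, the TTL, the domain and the target host name are assumptions; the sketch only builds the data and does not call AWS.

```python
def cname_change_batch(domain, target, ttl=300):
    """Build a Route53 ChangeBatch with a CNAME for `domain`
    and a wildcard CNAME for everything below it."""

    def upsert(name):
        return {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": name,
                "Type": "CNAME",
                "TTL": ttl,
                "ResourceRecords": [{"Value": target}],
            },
        }

    return {"Changes": [upsert(domain), upsert(f"*.{domain}")]}


# Hypothetical values: PWD domain and public DNS name of the EC2 instance
batch = cname_change_batch("pwd.example.com", "ec2-host.example.net")
for change in batch["Changes"]:
    print(change["ResourceRecordSet"]["Name"])

# With boto3 installed and credentials configured, the batch could be sent with:
# boto3.client("route53").change_resource_record_sets(
#     HostedZoneId="<your zone id>", ChangeBatch=batch)
```

The wildcard entry is what lets the dynamically generated host names below the PWD domain resolve to the same instance.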
What does it look like
After the setup is up and running, you can point your browser to your domain, which in my case is http://pwd.m4r10k.cf. Then you can just click the start button to start your PWD session. Create some instances and start an Nginx, for example. Just wait a little bit and the dynamic port number, usually 33768, will come up and you can click on it to see the Nginx welcome page.
This blog post should show that it is possible to set up a Play With Docker environment for your personal usage in Amazon's AWS cloud, fully automated with Ansible. You can use the PWD setup for different purposes like your Docker Meetups. Furthermore, you do not have to use Ansible; all steps can of course also be done manually or with another automation framework.
Have a lot of fun, happy hacking, nice Holidays and a happy new year!
Posted on: Fri, 28 Dec 2018 13:20:53 +0100 by Mario Kleinsasser
Estimated reading time: 7 mins
Last week, Bernhard and I had to investigate high disk I/O load reported by one of our storage colleagues on our NFS server, which serves the data for our Docker containers. We still have high disk load, because we are running lots of containers, and therefore this post will not resolve a load issue, but it will give you some deep insights into some strange behaviors and technical details which we discovered during our I/O deep dive.
The question: Which container (or project) generates the I/O load?
First of all, the high I/O load is not the problem per se. We have plenty of reserves in our storage and we are not investigating any performance issues. The question asked by our storage colleagues was simply: which container (or project) generates the I/O load?
Short answer: We do not know and we are not able to track it down. Not now and not with the current setup. Read on to understand why.
Finding 1: docker stats does not track all block I/O
No, really, it doesn’t track all block I/O. This took us some time to understand, but let's do it step by step. The first thing you will think about when triaging block I/O loads is to run docker stats, which is absolutely correct. And that’s where you reach the end of the world, because Docker and, to be more precise, the Linux kernel do not see block I/O which is served over an NFS mount! You don’t believe it? Just look at the following example.
First, create and mount a file system over a loop device. Mount an NFS share onto a folder inside this mount and monitor the block I/O on this device to see what happens or, rather, what you cannot see.
At this point, open a second console. In the first console enter a dd command to write a file into /mnt/testmountpoint/nfsmount and in the second console, start the iostat command to monitor the block I/O on your loop device.
Here is an output from this run and as you can see, iostat does not recognize any block I/O because the I/O never touches the underlying disk. If you do the same test without using the mounted NFS share, you will see the block I/O in the iostat command as usual.
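The same observation can be made without iostat by looking at the kernel counters in /proc/diskstats, which is where iostat gets its numbers from. The Python sketch below is an illustration only: the function name and the sample counter values are invented. It reads the "sectors written" column, which is always counted in 512-byte units, and compares two snapshots.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def bytes_written(diskstats_text, device):
    """Return the bytes written to `device` according to /proc/diskstats-style text,
    or None if the device is not listed."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Columns: major, minor, name, reads completed, reads merged, sectors read,
        # ms reading, writes completed, writes merged, sectors written, ...
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[9]) * SECTOR_BYTES
    return None


# Invented snapshot of the loop device before the dd run
before = "   7       0 loop0 120 0 2048 40 10 0 4096 300 0 200 340\n"
# After writing to the NFS share the counters are unchanged,
# because the data left the host over the network
after_nfs_write = before

delta = bytes_written(after_nfs_write, "loop0") - bytes_written(before, "loop0")
print(delta)  # prints 0
```

On a real host you would read /proc/diskstats before and after the dd command instead of using the sample strings.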
The same is true if you are using docker volume NFS mounts! The block I/O is not tracked, which is fully logical because this block I/O never touches a local disk. Bad luck. This is also true for any other mount type that is not written to local disks, like Gluster (FUSE) and many more.
Finding 2: docker stats does not count block I/O fully correctly
We think we will open an issue for this, because docker stats counts the block I/O incorrectly. You can test this by starting a container, running a deterministic dd command and watching the docker stats output of the container in parallel. See the terminal recording to get an idea.
As the recording shows, the first dd write is completely unseen by the docker stats command. This might be OK, because there are several buffers involved in write operations. But when the dd command is issued a second time, to write an additional 100 megabytes, the docker stats command shows a summary of 0B / 393MB, roughly 400 megabytes. The test wrote 200 megabytes, but docker stats shows double the amount of data written. Strange, but why does this happen?
At this point, more information is needed. Therefore it is recommended to query the Docker API to retrieve more detailed information about the container stats. This can be done by using a current version of curl, which would generate the following output.
Now, search for io_service_bytes_recursive in the JSON output. There will be something like this:
Oops, there are three block devices here. Where are they coming from? If the totals are summed up, we get the 393 megabytes we have seen before. The major numbers identify the device type. The documentation of the Linux kernel includes the complete list of the device major and minor numbers. The major number 8 identifies a block device as SCSI disk device and this is correct, as the server uses sd* for the local devices. The major number 253 refers to RESERVED FOR DYNAMIC ASSIGNMENT which is also correct, because the container gets a local mount for the write layer. Therefore there are multiple devices: the real device sd* and the dynamic device for the writeable image layer, which writes the data to the local disk. That’s why the block I/O is counted multiple times!
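To see the double counting in numbers, the entries of io_service_bytes_recursive can be grouped by device. The following Python sketch assumes the entry layout of the stats API on cgroups v1 (the keys major, minor, op and value); the function name and the sample values are invented and are not the values from our test.

```python
from collections import defaultdict


def totals_by_device(entries, op="Total"):
    """Sum the `value` of all entries with the given op per (major, minor) device."""
    totals = defaultdict(int)
    for entry in entries:
        if entry["op"] == op:
            totals[(entry["major"], entry["minor"])] += entry["value"]
    return dict(totals)


# Invented sample: one real disk (major 8) and two dynamic devices (major 253)
sample = [
    {"major": 8, "minor": 0, "op": "Write", "value": 10},
    {"major": 8, "minor": 0, "op": "Total", "value": 10},
    {"major": 253, "minor": 0, "op": "Total", "value": 20},
    {"major": 253, "minor": 1, "op": "Total", "value": 30},
]

per_device = totals_by_device(sample)
print(per_device)
print(sum(per_device.values()))  # sum over all devices, the figure docker stats appears to report
```

Filtering `per_device` by major number would be one way to count only the physical disk.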
But we can dig even deeper and inspect the cgroup information used by the Linux kernel to isolate the resources for the container. This information can be found under /sys/fs/cgroup/blkio/docker/<container id>, e.g. /sys/fs/cgroup/blkio/docker/195fd970ab95d06b0ca1199ad19ca281d8da626ce6a6de3d29e3646ea1b2d033. The file blkio.throttle.io_service_bytes contains the information about what data was really transferred to the block devices. For this test container the output will be:
There we have the correct output. In sum, the total is roughly 250 megabytes. 200 megabytes were written by the dd commands and the rest would be logging and other I/O stuff. This would be the correct number. You can test this yourself by running a dd command and watching the blkio.throttle.io_service_bytes file.
The docker stats command is really helpful to get an overview of your block device I/O, but it does not show the full truth. It is still useful for monitoring containers that write local data, which may indicate that something is not configured correctly regarding data persistence. Furthermore, if you use network shares to allow the containers to persist data, you cannot measure the block I/O on the Docker host the container is running on. The ugly part is that if you are using one physical device (a large LVM for example) on your network share server, you will only get one big number for the block I/O, but you will not be able to assign the I/O to a container, a group of containers or a project.
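For monitoring, the content of blkio.throttle.io_service_bytes mentioned above is easy to parse. The Python sketch below assumes the cgroups v1 layout of that file (lines of the form "major:minor operation bytes" followed by a final "Total bytes" line); the function name and the sample numbers are invented.

```python
def parse_blkio(text):
    """Parse blkio.throttle.io_service_bytes-style text.

    Returns (per_device, grand_total) where per_device maps
    "major:minor" to a dict of operation -> bytes.
    """
    per_device = {}
    grand_total = None
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "Total":
            # Final summary line over all devices
            grand_total = int(parts[1])
        elif len(parts) == 3:
            device, op, value = parts
            per_device.setdefault(device, {})[op] = int(value)
    return per_device, grand_total


# Invented sample content
sample = (
    "8:0 Read 100\n"
    "8:0 Write 900\n"
    "8:0 Total 1000\n"
    "Total 1000\n"
)

devices, total = parse_blkio(sample)
print(devices["8:0"]["Write"], total)  # prints "900 1000"
```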
Facts:
- If you use NFS (or any other shares) backed by a single block device on the NFS server, you can only get the sum of all block I/O and you cannot assign this block I/O to a concrete project or container
- Use separate block devices for your shares
- Even if you use Gluster, there will be exactly the same problem
- I/O on FUSE mounts is also not seen as block I/O by the Linux kernel
We are currently evaluating thin-provisioned LVM devices in combination with Gluster to report the block I/O via iostat (the JSON output) to Elasticsearch. Stay tuned for more about this soon!
Posted on: Sun, 30 Sep 2018 10:47:39 +0100 by Mario Kleinsasser