While Docker has been around for a few years, it is fairly new to me. The organization I work for has only been using it for about a year, and I've never had to delve too deep into it.
In most cases, I run a command to start my containers and start working. It all just works out of the box, as it was intended.
However, I recently needed to set up a new project on my own. Our standard setup wouldn't work, and I realized I really don't know much about Docker. So, this article is about a few key parts of Docker that I struggled with, in the hope that others might learn something of benefit.
Images and Words
Dream Theater references aside, one key aspect of Docker I didn't understand was the difference between an image and a container. I thought the words were interchangeable but, alas, they most certainly are not.
One metaphor I read about was that an image is to a class as a container is to an instance. I think that's fairly accurate. Saying that a container is just a running image isn't quite right because it doesn't capture the full picture. A container is not only a running image but also has additional configuration and may have a filesystem attached.
You can have two containers, two instances of an image, with different settings. This is important to note.
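To make the class/instance metaphor concrete, here is a small Python sketch. It is only a model of the idea, not how Docker works internally, and the names `Image`, `Container`, `site_a` and `site_b` (along with the paths and settings) are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Image:
    """Read-only template, playing the role of the class."""
    name: str
    tag: str = "latest"


@dataclass
class Container:
    """One instance of an image, plus settings of its own."""
    image: Image
    name: str
    env: dict = field(default_factory=dict)
    volumes: dict = field(default_factory=dict)  # host path -> container path


nginx = Image("nginx", "alpine")

# Two containers built from one image, each with different settings.
site_a = Container(nginx, "site-a",
                   volumes={"/srv/a/nginx.conf": "/etc/nginx/nginx.conf"})
site_b = Container(nginx, "site-b", env={"TZ": "UTC"})

print(site_a.image is site_b.image)      # both point at the same image
print(site_a.volumes == site_b.volumes)  # but their settings differ
```

The image is frozen (it never changes once built), while each container carries its own mutable configuration.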
Monolithic vs Multiple Containers
Another question was: "Why would one want to run multiple containers orchestrated to work together rather than a single, monolithic image?"
At first, I thought the only real difference was the argument between statically linked binaries and dynamically linked ones. Why go through the headache of requiring additional tools, like docker compose, just to achieve what can be done with a single image? Turns out, the two approaches are not equivalent.
Of course, reproducibility is a key part of developing an application. The ideal situation is having a development environment that perfectly reflects the production environment so that any bugs or issues can be replicated easily. A single docker image does that job just fine. In fact, in many organizations, web applications are deployed using containers so that the development environment and production environment are, quite literally, identical from a software point of view.
However, unlike a binary, where libraries are linked at compile time and become static dependencies, Docker containers behave more like plug and play peripherals.
A perfect example is adding a container to trap emails sent by an application. In a development environment, you can quickly spin up a container containing a tool like Mailhog to capture emails without affecting the rest of the environment.
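From the application's side, trapping mail is just a matter of pointing SMTP at the trap. The sketch below assumes the Mailhog container is reachable under the hostname `mailhog` (a name I picked for illustration) and listens for SMTP on Mailhog's default port 1025. The addresses are placeholders.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Build a plain-text email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_to_mail_trap(msg, host="mailhog", port=1025):
    """Hand the message to the mail-trapping container instead of a real server.

    The host name is an assumption; use whatever your container is called.
    """
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)


message = build_message("app@example.test", "user@example.test",
                        "Welcome", "Hello from dev!")
# send_to_mail_trap(message)  # only works while the Mailhog container is running
```

Nothing else in the application has to change, which is what makes the container feel like a plug-and-play peripheral.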
Another advantage is that you can mock application services with a dummy container or even swap different versions of a dependent service without having to rebuild the image each time.
Imagine needing to swap between different versions of PHP. You could have 3 complete images of 500 MB each to cover PHP 5.6, 7.0 and 7.2, or you could have 3 different PHP images that are 50 MB each. Need to add a new PHP version to support? Easy! Just pull down another PHP image rather than be forced to rebuild an entirely new image for your app.
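As a rough sketch of what "just pull down another PHP image" means in practice, swapping versions comes down to changing the image tag. The tag pattern `php:<version>-fpm-alpine` and the container names below are assumptions for illustration, so check which tags are actually published before relying on them.

```python
def php_image(version):
    """Image reference for a PHP version (the tag pattern is an assumption)."""
    return f"php:{version}-fpm-alpine"


def run_args(version):
    """Arguments for starting a detached PHP container of the given version."""
    return ["docker", "run", "-d", "--name", f"php-{version}", php_image(version)]


# One command per supported version; only the tag and the name change.
commands = {v: run_args(v) for v in ("5.6", "7.0", "7.2")}

for version, args in commands.items():
    print(version, "->", " ".join(args))
```

The rest of the stack (web server, database) stays untouched whichever of these is running.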
I'm a bit embarrassed to admit it, but I didn't understand how to configure a container. I assumed that you had to build an image, instantiate a container, open an interactive terminal and perform the configuration by hand. The idea felt all wrong, but I didn't understand why. The reason it felt wrong was, well, because it was wrong.
The problem was, I didn't know what Volumes were or how to use them. I also realized I didn't understand how containers were persistent or how to access files on a local computer. I assumed, again incorrectly, that you must have to copy the files into the image or container. Again, how wrong I was.
When starting an Nginx or Apache server, you need to be able to configure it. Whether it's core configuration or virtual host configuration, you need to supply this information to the web service. Do you create the file outside the image (yes) and then copy it in (no; well, not exactly)?
This is when the brilliance of Docker set in: when I understood how this worked.
You can configure the web server using the same configuration files as the deployment server, whether it uses containers or not, by attaching them to a container via volumes.
Volumes work much like file system mounting. Take a directory, or file, on your local system and mount it to a path in the container's file system (strictly speaking, Docker calls this kind of host-path mount a bind mount). In the case of Nginx, you might attach a "myapp/nginx.conf" to "/etc/nginx/nginx.conf" of the container. Then, when the container is started, it uses your local filesystem's file as if it were part of its own filesystem. A little like a chroot. It's brilliant!
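Here is a minimal sketch of that Nginx mount, written as a Python helper that assembles a `docker run` command. The container name `myapp-web` and the `nginx:alpine` image are my own choices for illustration, and I'm assuming a Unix-like host. The host path is made absolute because `-v` treats a bare relative name as a named volume rather than a path.

```python
from pathlib import Path


def nginx_run_args(config_file, container_name="myapp-web"):
    """Build a `docker run` command that mounts a local nginx.conf read-only."""
    host_path = Path(config_file).resolve()  # -v needs an absolute host path
    mount = f"{host_path}:/etc/nginx/nginx.conf:ro"
    return ["docker", "run", "-d", "--name", container_name,
            "-v", mount, "nginx:alpine"]


args = nginx_run_args("myapp/nginx.conf")
print(" ".join(args))
```

To actually start the container you could pass the list to `subprocess.run(args, check=True)`. The `:ro` suffix keeps the container from modifying your local file.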
This, of course, opens all kinds of wondrous doors! Need to import a database or assets for a web application? Use volumes to mount them where the live system would expect them to be! It's great!
This was a crucial part of the system I was missing, and it explains one reason why images are so reusable. You can have a single image whereby container A uses volumes for project A and container B uses volumes for project B. This yields major size savings.
Communication is Key
If configuration, volumes specifically, was the biggest hurdle I needed to get over, then the second largest would be intercommunication. How in the hell do different containers communicate?
If one image contains Apache, a second image contains PHP, a third image contains MySQL, how in the world do they talk to one another? The web server acts as a proxy to PHP (presume php-fpm) and thereby forwards paths to scripts to that container. PHP runs the code attached to the container in a volume but that code needs access to MySQL which is, again, in another container.
Within a single image or server, all these programs live in one space and so intercommunication happens (typically, but not always) through localhost. So, how in heck does it work with multiple images that have no knowledge of each other?
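Enter networks and some Docker magic. The easiest way to run multiple containers together is docker compose, hereinafter referred to only as compose (you can wire up networks by hand with plain Docker commands, but compose does it for you). Compose does two things:
- Creates a network for the containers to communicate over.
- Makes each container reachable on that network by its service name. On current Docker versions this name lookup goes through Docker's built-in DNS rather than entries written into each container's hosts file.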
Now, with some simple setup, different containers can communicate with each other as if they were all together in a single environment. Far out, solid and right on!
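What this means for application code is that "localhost" gets replaced by the name of the other container. The sketch below assumes a database service named `mysql` on the default port 3306. The `MYSQL_HOST` / `MYSQL_PORT` override variables are a convention I invented for this example, not something Docker sets for you.

```python
import os
import socket


def service_address(name, default_port, env=None):
    """Return (host, port) for a service, defaulting to the service name as host."""
    env = os.environ if env is None else env
    host = env.get(f"{name.upper()}_HOST", name)
    port = int(env.get(f"{name.upper()}_PORT", default_port))
    return host, port


def can_reach(host, port, timeout=1.0):
    """Try a TCP connection; False if the name does not resolve or nothing answers."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


db_host, db_port = service_address("mysql", 3306)
print(f"database expected at {db_host}:{db_port}")
```

Inside a container on the shared network, `can_reach(db_host, db_port)` should succeed once the database container is up. From your host machine the same name usually will not resolve, which is a handy way to convince yourself the network is doing the work.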
Last, and I think a bit more minor, might be what these Alpine images are. Alpine is a small Linux distro that has become a popular base for Docker images because of its tiny footprint; it was not created for Docker, but it suits it well. An Ubuntu image is quite large by comparison. Unless you have a specific need to perfectly imitate an Ubuntu system, Alpine is an excellent choice.
However, I have discovered a potential best of both worlds that I haven't tested yet: https://blog.ubuntu.com/2018/07/09/minimal-ubuntu-released
Anyway, that's the gist of Alpine and its primary use.
That about covers the things I wanted to talk about. With any luck, a fellow newb will learn something that helps them on their way to devops greatness!