What is Docker?

Photo by Ian Taylor on Unsplash

Docker is a virtualization technology that is rapidly becoming more and more popular for application deployment.

Why is it so popular, and why should you learn to use Docker? In this article, I will try to answer those questions while showing how to use it along the way.

What is it?


Docker is a virtualization technology that allows a developer to create a so-called container that contains everything an application needs to run. One of the many struggles for web developers and other developers is that an application may run fine on one developer's computer, but not on another's.

Back in the day, developers ran software directly on their own computers. This was fast and efficient, but it often caused problems when the app was put on a production server. The "it did run on my computer" phenomenon was an annoying problem that was largely solved when developers started to run their programs in VMs (virtual machines) with a software stack similar, or even identical, to that of their production server. To work on the files inside the development VM, a mixture of NFS (Network File System) and SMB (Server Message Block) shares was used. This worked fine, but the memory overhead of running a complete VM could be annoying. In addition, a VM is slower to work on, and problems with symlinks and other file system features could sometimes slow things down as well.

Along comes Docker to save the day! With Docker, you package everything a service needs to run, except the OS kernel, into a package called an image. An image can be run much like a "normal" application, but it normally holds only a single service like MySQL, unlike a VM that typically bundles several services like Nginx, PHP, and MySQL.

When a Docker image is running it is called a container. A container will have a network interface, make use of storage and have software packages installed just like a VM.

The thing I love about Docker is that it does not consume as much memory as traditional VMs, and is a lot quicker to start. In addition, I can separate every service into its own container, so I don't have to worry about breaking changes in one service affecting another. I can also have all the software packages I need for a project inside the containers, without installing them on my development computer. This reduces the clutter on my own computer.

For example, if I develop an application that uses the Solr search engine, I really appreciate that I don't need to have Solr installed and running on my computer. When I need Solr, I can quickly start a container that interacts with the application I'm working on. It's just as fast as a natively installed service, but it is separated from my main OS, similarly to a VM.

In my opinion, Docker is amazing for everything that you would normally run headless on a server. It's quick to deploy, start and stop.


What to know before learning Docker

Two levels of knowledge are relevant when using Docker: user-level knowledge and developer-level knowledge.

Typical user-level use of Docker containers is running ready-made applications like Node-RED, or web applications that you don't need to develop, just deploy.

Very little knowledge is needed to run Docker containers that others develop and maintain. You need to be able to install Docker on your computer of choice and to download the Docker image. Once that is done, you just run the container the way its maintainers specify. This is very simple. GUI-based Docker managers are available for all major operating systems, which makes deployment easy if you don't feel comfortable in the CLI.

To create and maintain those ready-made container applications you will need a lot more knowledge.

Since a lot of the functionality Docker provides is an abstraction of how a VM functions, it's good to have a solid understanding of how virtual machines are used and how Linux systems are managed by system administrators. In addition, good fundamental knowledge of networking is needed to fully understand and make use of Docker containers.

I think networking knowledge at CompTIA Network+ level and Linux system administration knowledge at CompTIA Linux+ level are essential. If you think Linux system administration via the CLI and network configuration are hard to understand, Docker configuration will be even harder. Having said that, I think that once you have learned to develop Docker images, Docker containers can replace virtual machines in many typical use cases. Once you know how, they are easier both to manage and to use than VMs.

Images and Containers

Clear the confusion

On a system, you can have many Docker images. When you run a Docker image, all the processes it starts run inside a container. The relationship between an image and a container is in some ways similar to that between a program and a running process.

The main reason it's called a container is that the processes inside it can only access files and programs within the container. This makes it possible to run a Linux distribution that is very different from the one running on the host computer. You might run a bleeding-edge version of Node.js or PHP on your host computer, while keeping things safe with a more stable version inside the container that is being developed for production.

Another use case is keeping legacy code inside a container, so that an exploited vulnerability is less likely to hurt the rest of the websites on the server.

One of the reasons for the small memory footprint of a Docker container is that you can use very lightweight versions of Linux distros inside your container. A normal operating system needs many packages that help with general use of the computer, like text editors and other productivity tools. A container that is used for only one purpose doesn't need packages like that. A container also shares the OS kernel with the host OS, which significantly helps to keep its memory footprint at a minimum.
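The shared kernel can be observed directly. The sketch below is only an illustration: it assumes Docker is installed and uses the small public alpine image as an example (any Linux image would do). On a Linux host both lines should report the same kernel release. On macOS or Windows the container reports the kernel of the Linux VM that Docker runs behind the scenes.

```shell
# Kernel release reported by the host.
host_kernel="$(uname -r)"
echo "host kernel:      ${host_kernel}"

# Kernel release reported from inside a throwaway container.
# "alpine" is only an example of a small image, not something this article depends on.
if command -v docker >/dev/null 2>&1; then
    container_kernel="$(docker run --rm alpine uname -r)"
    echo "container kernel: ${container_kernel}"
else
    echo "docker is not installed, skipping the container check"
fi
```

The --rm flag removes the throwaway container again as soon as the command has finished.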


Blueprint for Docker images

To build Docker images, a Dockerfile is used. It uses a special syntax to describe, step by step, how to build a Docker image. In this example, we will create a Docker image based on CentOS 8.

We start by creating a project directory, and a Dockerfile inside it.

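mkdir haxorDocker
cd haxorDocker
vim Dockerfile

# base this image on CentOS 8
FROM centos:8

# switch the repositories from the mirror list to vault.centos.org
# (the sed commands use absolute paths, so no "cd" is needed; a "RUN cd"
# would not carry over to the next instruction anyway)
RUN sed -i 's/mirrorlist/#mirrorlist/g' /etc/yum.repos.d/CentOS-*
RUN sed -i 's|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-*

# Update packages ("dnf update" is an alias for "dnf upgrade", so one call is enough)
RUN dnf upgrade -y

# Install PHP
RUN dnf install php -y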

The Dockerfile above creates a simple image based on CentOS 8 with PHP installed. The Dockerfile is essentially a recipe for how a new image is built.

It first states which image it is derived from, in this case an image called centos, version 8. The new image is built by first downloading the image it is derived from. Once downloaded, a container is created from that image, and the commands in the Dockerfile are executed to add content to the container or run programs inside it. When the script in the Dockerfile has completed, the container is stopped and a new image is created from that container.
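Each RUN instruction in a Dockerfile is stored as its own layer in the resulting image. As a sketch of a common variation, the same steps can be chained into a single RUN instruction so that they end up in one layer. The file name Dockerfile.compact and the tag compact are names made up for this example, and the final dnf clean all is an addition that is not part of the Dockerfile above.

```shell
# Write an alternative Dockerfile; the file name is only an example.
cat > Dockerfile.compact <<'EOF'
FROM centos:8

# One RUN instruction, so all steps end up in a single layer.
RUN sed -i 's/mirrorlist/#mirrorlist/g' /etc/yum.repos.d/CentOS-* \
 && sed -i 's|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-* \
 && dnf upgrade -y \
 && dnf install -y php \
 && dnf clean all
EOF

if command -v docker >/dev/null 2>&1; then
    # -f selects the Dockerfile to build from, -t names the resulting image.
    docker build -f Dockerfile.compact -t haxor.no/centos_php:compact .
    # List the layers of the image that was just built.
    docker history haxor.no/centos_php:compact
else
    echo "docker is not installed, only Dockerfile.compact was written"
fi
```

Comparing the docker history output of the two images is a quick way to see how the instructions in a Dockerfile map to layers.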

When a container is created based on the new image it will run from the point the Dockerfile script was completed.

I like to think of an image as something similar to a VM snapshot. If you stop a VM and save its state as a snapshot, you can quickly start the VM right where you left it.

To build the image from the Dockerfile run this command:

docker build -t haxor.no/centos_php .

This will create a new image called centos_php, from the vendor haxor.no.

Running a container

The magic of Docker is that you can have a container that has everything you need to perform a specific task. A super simple example is running a specific version of a tool or service. To continue with our example, we can create a proof-of-concept PHP file and run it in a container created from the image we just built.

vim test.php

<?php
echo 2*22;

To have the centos_php container run this script, run this command:

docker run -i haxor.no/centos_php:latest php < test.php

With this command, the test.php script is sent to the PHP interpreter inside the container over standard input, and the output from the container is written to our terminal.
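Piping the script in over standard input is one way to do it. Another common approach, sketched here, is to bind-mount the current directory into the container and let PHP read the file directly. The mount point /app is a path chosen for this example, and the sketch assumes the haxor.no/centos_php image from above has been built.

```shell
# Recreate the example script so this sketch is self-contained.
printf '<?php\necho 2*22;\n' > test.php

if command -v docker >/dev/null 2>&1; then
    # --rm        remove the container again once the command has finished
    # -v SRC:DST  bind-mount the current directory at /app inside the container
    # -w /app     use /app as the working directory inside the container
    docker run --rm -v "$PWD":/app -w /app haxor.no/centos_php:latest php test.php
else
    echo "docker is not installed, skipping the run"
fi
```

Without --rm, every docker run leaves a stopped container behind, which you can list with docker ps -a.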


Orchestration of containers

Docker containers are amazing since you can use them to run specific versions of programs and services like a MySQL server or MongoDB server without having them installed on your host.

One of the main benefits of using Docker is that the programs and services running inside a container cannot access anything outside it. The flip side of this isolation is that you must network the containers for them to be able to communicate with each other. If you, for example, have a container running a web server and another running a database, you must network these two containers.

Another benefit of Docker containers is that everything that is done in the container while it's running is volatile, meaning that it will not be present in a new container created from the same image. This is an issue if you have a database that you want to store data in permanently. To solve this, a part of your host filesystem can be bind-mounted into the container. The path to this part of your host's filesystem is called a volume in Docker lingo.
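The difference between the container's own filesystem and a bind mount can be tried out with a few throwaway containers. This is a minimal sketch: the directory name data, the file name note.txt and the mount point /data are all made up for the example, and it assumes the haxor.no/centos_php image from earlier has been built.

```shell
mkdir -p data   # "data" is a directory name chosen for this example

if command -v docker >/dev/null 2>&1; then
    image="haxor.no/centos_php:latest"

    # 1) Write a file to the container's own filesystem, then look for it
    #    in a second, fresh container created from the same image.
    docker run --rm "$image" sh -c 'echo hello > /tmp/note.txt'
    docker run --rm "$image" sh -c 'test -f /tmp/note.txt && echo "found" || echo "gone"'

    # 2) Write the file to a bind-mounted host directory instead,
    #    then read it back on the host.
    docker run --rm -v "$PWD/data":/data "$image" sh -c 'echo hello > /data/note.txt'
    cat data/note.txt
else
    echo "docker is not installed, skipping the demonstration"
fi
```

The first file lives only in the container that wrote it, while the second ends up in the data directory on the host and is visible to any container that mounts that directory.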

Once a Docker image is created it is stored in a local image repository that is accessible by docker. Because of this, Dockerfiles can be version controlled by git in their own repository without having to be a part of any project git repository.

Once an image is created it can be a part of any docker-compose orchestrated project.

To simplify the process of starting containers, bind-mounting volumes, and creating networks between the containers, a tool called Docker-compose is used. This process is called orchestration.

The really cool part about docker-compose is that not all containers on a network need to run all the time. As an example we can build upon the centos_php container we created earlier by creating a docker-compose configuration file.

mkdir db
vim docker-compose.yaml

version: '2'
services:
  php:
    image: haxor.no/centos_php:latest
  db:
    image: mysql:latest
    restart: always
    environment:
      MYSQL_DATABASE: test_db
    ports:
      - 3306:3306
    volumes:
      - ./db:/var/lib/mysql
    command: ["--default-authentication-plugin=mysql_native_password"]

Docker-compose organises the orchestration by calling each container a service. If we focus on the db service, we can read that it will create a container from an image called mysql, and that it will bind-mount the host directory "db" to the place where MySQL stores its files inside the container. This way the database is stored persistently.

To start the container orchestration run this command:

docker-compose up

This will create two containers: the DB container and the PHP container. To check their status, run this command:

docker-compose ps
      Name                    Command               State                           Ports
centos_php_db_1    docker-entrypoint.sh --def ...   Up       0.0.0.0:3306->3306/tcp,:::3306->3306/tcp, 33060/tcp
centos_php_php_1   /bin/bash                        Exit 0

From the terminal dump above you can see that the DB container is running, and the PHP container has stopped.

Since they are in the same orchestration their service names can be used to resolve IP addresses on the virtual network connecting the containers.

To test connectivity this command can be run on the PHP container:

docker-compose run --rm php ping -c 2 db
Creating centos_php_php_run ... done
PING db ( 56(84) bytes of data.
64 bytes from centos_php_db_1.centos_php_default ( icmp_seq=1 ttl=64 time=0.127 ms
64 bytes from centos_php_db_1.centos_php_default ( icmp_seq=2 ttl=64 time=0.072 ms

--- db ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1004ms
rtt min/avg/max/mdev = 0.072/0.099/0.127/0.029 ms

The terminal dump above demonstrates that it's very simple to network and run commands on containers using Docker-compose. Because the PHP container was not running, Docker-compose created a one-off container for the php service, ran the ping command in it, and removed the container again afterwards because of the --rm flag.

Since the DB container is running under a service called "db", the hostname db was resolved to the container's IP address on the network that Docker-compose created.

To stop the containers run this command:

docker-compose stop 
Stopping centos_php_db_1 ... done

To start the containers again, run this command:

docker-compose start 
Starting php ... done
Starting db  ... done

Docker-compose only creates containers that do not already exist. When you run docker-compose up after editing the docker-compose.yaml file, it recreates the containers whose configuration has changed. To start from scratch, you can delete the previously created containers and let Docker-compose create them again.

Please note that you do have to stop the containers before you can delete them with this command:

docker-compose rm
Going to remove centos_php_db_1, centos_php_php_1
Are you sure? [yN] y
Removing centos_php_db_1  ... done
Removing centos_php_php_1 ... done
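As a shortcut, stopping and removing can be combined in one step with docker-compose down, which also removes the network that was created for the project. The sketch below assumes it is run in the directory that holds docker-compose.yaml. Files in bind-mounted directories such as ./db are left untouched.

```shell
compose_file="docker-compose.yaml"

if command -v docker-compose >/dev/null 2>&1 && [ -f "$compose_file" ]; then
    # Stop and remove the containers and the project network in one step.
    docker-compose -f "$compose_file" down
    # Create fresh containers from the current configuration; -d runs them in the background.
    docker-compose -f "$compose_file" up -d
    # Check the state of the new containers.
    docker-compose -f "$compose_file" ps
else
    echo "docker-compose or ${compose_file} not found, nothing to do"
fi
```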