Introduction to Docker Containers

First published — Aug 15, 2023
Last updated — Apr 29, 2024
#infrastructure #cloud #tools #docker

VMs, containers, and Docker. Dockerfiles, images, and containers. Docker and docker compose: description and use.

Article Collection

This article is part of the following series:

1. Docker


Introduction

You might have heard of “virtualization” and “virtual machines” (VMs). Virtualization enables users to concurrently run multiple, completely separate operating systems on a single physical computer.

A conceptually similar, but technically different solution from virtualization is “containerization”. In this model virtualization is not implemented at the hardware level but at the kernel level. All containers still run in isolated environments, but under the same operating system kernel.

Both approaches have their preferred use cases. Virtual machines emulate hardware, so usually a whole operating system must run in them. That provides more flexibility, but also adds overhead. Containers run under the host’s kernel and do not require a full operating system; a single statically compiled executable is enough. However, containerized processes are visible to the host OS, and only operating systems and programs compatible with the host’s kernel can run in containers.

Docker is a complete containerization solution. It comes with tools necessary for creating software images and running them in containers. Images are binary snapshots of files and directories, packed together as a single unit. Containers are running instances of those images. In addition to creating and running images, Docker also comes with all the features necessary for supporting the lifecycle of images and orchestration of containers.

Docker is a high-level technology that builds on top of numerous computing, Unix, Linux, and networking concepts. To understand Docker, you should understand those supporting elements first.

Docker containers rely heavily on the functionality provided by the host’s kernel. Two primary groups of such functionality provided by the Linux kernel are Linux control groups and Linux namespaces.
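
You can peek at both facilities directly on most GNU/Linux systems. For example (lsns is part of the util-linux package, and the cgroup hierarchy is typically mounted under /sys/fs/cgroup):

# List the namespaces visible to the current user
lsns

# Inspect the control group hierarchy
ls /sys/fs/cgroup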

Docker Components

As mentioned, Docker is a complete containerization solution. It consists of the following components:

  • Dockerfile and Dockerfile syntax, which allow one to define how images are built using a procedural syntax. Images are often easily created by referencing other, existing images and customizing them

  • Engine for building the images, which takes Dockerfiles as input and produces Docker images as output

  • Functionality for uploading Docker images to public or private Docker registries

  • Functionality for downloading Docker images from existing public or private (authenticated) Docker registries

  • Engine for running the containers, which runs Docker images in isolated environments (containers)

  • Engine for orchestrating the containers, which runs groups of possibly dependent and scalable containers (docker compose)

It is important to know that the command docker is a single program providing access to all of the described functionality. Different subcommands invoke its different parts, such as docker image, docker container, and docker system.

Previously, a separate command docker-compose was used for orchestration, but its functionality has since been integrated into docker compose. It uses a dedicated configuration file, [docker-]compose.yml.
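
To give a taste, a minimal compose.yml might look like this (the service name “web” is an arbitrary example):

# compose.yml

services:
  web:
    image: httpd
    ports:
      - "8080:80"

Running docker compose up in the directory containing this file would pull the image if needed and start the described container.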

And finally there is Docker Swarm, which can centrally manage a fleet of machines running Docker. That is a separate system and outside the scope of this article.

Why Use Docker?

There are a number of entirely different reasons why you might use Docker.

For example:

  • If the host OS is outdated and can’t be upgraded, a newer version of a component can be run in a Docker container

  • If you want to try some software, but not risk it making modifications to your host OS, you might run it in a container

  • Some software is complex to install, so its authors might have prepared ready-to-use Docker images, which is convenient for beginners

  • Container images (most often Docker, but could also be others) can be used as basic units for software deployment, especially in the cloud such as in Kubernetes

Why Not Use Docker?

This section is not intended to steer you away from using Docker, but to help you position it correctly in your mind. Containers are sometimes ideal for deploying software and tinkering with systems and concepts, but there are some concerns.

In a standard (non-container) scenario, when you want to use software on GNU/Linux, you install it via the host’s native package management tools (such as Debian’s apt). Then you configure the software if needed and run it. That is the default workflow. You are involved in the whole process, while also taking maximum advantage of all the effort that package maintainers have invested in:

  • Reviewing the licensing and quality of the software
  • Making it adhere to the distribution’s defined standards
  • Integrating it into the distribution’s standard procedures and tools
  • Documenting it and often providing configuration examples
  • Pre-configuring it and generally making it ready and easy to use

Docker images, on the other hand, can be created by anyone. Images are not verified or tuned by the distribution’s package maintainers, and software in them is not installed and configured manually by end users. Both steps are already done in advance by image authors and to their liking.

That raises the following concerns:

  • Images may contain code or behavior that you would not approve of. Since images are bigger and less transparent than packages, you might start using them without knowing what exactly they are doing, or without the determination necessary to audit and remove any offending parts

  • Software images, being pre-installed and pre-configured, can deliver functionality quickly. But if you rely only on images, you never learn how to install and configure the software yourself. You potentially miss out on features of the original software that were not made available through the image, and in general your level of skill is reduced

  • Using software through images and containers, or through proprietary platforms on which they may be deployed, might make you accustomed to using “software as a service”, rather than demanding to have full control and ownership of your software, data, and devices

Installation

Docker

Docker installation is not a part of this article since it is adequately covered in numerous places elsewhere.

For Debian GNU/Linux-based systems, see for example Install Docker Engine on Debian and then return here.

Permissions

In the default scenario, Docker uses a simple permission model where all members of the group docker are able to use it.

So our first task is to add the current user to group docker:

sudo adduser $USER docker

Adding user `user' to group `docker' ...
Done.

The operating system reads a user’s group memberships at login and caches them for the duration of the session. Thus, for the new group membership to be applied, you should completely log out of the system and then log back in. However, that may be inconvenient, so in the meantime there is a command that will force the current shell to apply the new group:

newgrp docker; newgrp $USER

Alternatively, if that asks for a password, but you do have root privileges, there is another method:

sudo su - -c "su - $USER"

To confirm that you have the necessary privileges to use Docker, simply run id to check that “docker” is in the list of supplementary groups, and then run e.g. docker ps. If no error message is printed, you are OK.
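
For example, on a fresh installation:

# “docker” should appear among the printed groups
id

# Should print just the header line and no error
docker ps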

Quick Start

Starting Containers

As mentioned, Docker in its core is a system for creating software images and running them in containers.

However, we do not have to build all images ourselves. Docker maintains a public registry of available images, and as soon as we reference an image that does not exist locally, Docker will connect to its public Internet registry and try to download it from there.

Docker’s eagerness to look up images remotely can even be inconvenient – a one-letter typo or a mismatch in the image version is enough for Docker to not find the image locally and go try to download it from the Internet!

Let’s start using Docker by confirming that in a clean installation we do not have any containers or images. The following commands executed on a fresh installation should just print empty results:

# Show running containers
docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

# Show all containers
docker container list -a
CONTAINER ID   IMAGE     COMMAND     CREATED          STATUS          PORTS     NAMES

# Show all images available locally
docker images
REPOSITORY    TAG       IMAGE ID       CREATED        SIZE

Now, knowing about Docker’s automatic lookup of images in the public registry, let’s run our first container “hello-world”.

When we run the command, the first part of the output will be from Docker, informing us about downloading the image. The second part will be the actual message from the container that was started, printing “Hello from Docker!” and a bunch of extra text.

Here is the first part of the output in which Docker is telling us that the image was downloaded:

docker run --rm hello-world

Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
719385e32844: Pull complete 
Digest: sha256:dcba6daec718f547568c562956fa47e1b03673dd010fe6ee58ca806767031d1c
Status: Downloaded newer image for hello-world:latest

And here is the second part which comes from the running container and begins with “Hello from Docker!”:

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/

In short – it worked!

Now we can check our list of local images again. One image will be there:

docker images

REPOSITORY    TAG       IMAGE ID       CREATED        SIZE
hello-world   latest    9c7a54a9a43c   3 months ago   13.3kB

The command docker ps, which shows running containers, will still be empty. That is because our container has started, printed the message, and exited, so there are no running containers at the moment:

docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

Building Images

While we have the “hello-world” image at hand, let’s show the simplest possible customization and build of our own Docker image.

We already mentioned that the instructions for building images are found in files named Dockerfile, and that new images can be built on top of existing ones.

Let’s create a very simple image based on “hello-world” and its program /hello that printed the welcome message.

To make our image a little different, we are going to start from the “hello-world” image and add a new command /hello2 to it, which will just print a brief Hello, World! to the screen and exit.

First, we need to create the hello2 program. If you have programmed in C, you will recognize the following snippet as a C program. But in any case, just run the following commands which will install the compiler, create a minimal .c file, compile it, and run it:

sudo apt install build-essential

echo -e '#include <stdio.h>\nint main() { printf("Hello, World!\\n");}' > hello2.c
gcc -o hello2 -static hello2.c

./hello2

Hello, World!

Now that we have our program, let’s create a Dockerfile for our new image.

# Dockerfile

FROM hello-world
COPY ./hello2 /
CMD [ "/hello2" ]

The above lines specify that we want to use an existing image “hello-world” as a base, copy file hello2 from the host to /hello2 in the new image, and define the command ("/hello2") that will run by default every time we run this image.

Note that Dockerfiles only define how images will be built, not how they will be named or which version they will have; those options are passed at build time.

We can then build the image with:

docker build -f Dockerfile -t hello-world2 .  # (Don't forget the dot at the end)

Once the image is built, we can verify its presence in the local Docker cache:

docker images

REPOSITORY    TAG       IMAGE ID       CREATED        SIZE
hello-world2  latest    d97789789d8d   4 seconds ago  775kB

Note that the tag (version) “latest” is important. If no tag is specified when running an image, Docker will look for an image tagged “latest”.
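
For example, we could have built and run the image with an explicit tag instead (the version number here is an arbitrary example):

docker build -f Dockerfile -t hello-world2:1.0 .

docker run --rm hello-world2:1.0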

And we can now start a container based on image “hello-world2”:

docker run --rm hello-world2

Hello, World!

Explicitly mentioning the command to run in the container (/hello2) was not necessary because we already configured it as the default CMD in Dockerfile.

Since we built our image on top of the original “hello-world”, the image now contains both /hello and /hello2. What if we wanted to run the original /hello?

We just specify the command to run after all other parameters:

docker run --rm hello-world2 /hello

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
...
...

Removing Images

The images “hello-world” and “hello-world2” are extremely simple. They consist of programs /hello and /hello2 which print welcome messages and exit. There is nothing else useful we can do with them, other than maybe inspecting them for the sake of practice, and then removing them from the local cache:

docker image inspect hello-world

{
		"Id": "sha256:9c7a54a9a43cca047013b82af109fe963fde787f63f9e016fdc3384500c2823d",
		"RepoTags": [
				"hello-world:latest"
		],
		...
		...
		...

docker image rm hello-world
docker image rm hello-world2

It is possible that the above commands will fail, saying:

Error response from daemon: conflict: unable to remove repository reference "hello-world2" (must force) - container ... is using its referenced image d97789789d8d

That simply means there are containers which reference this image, so it cannot be removed. List the containers and remove them before removing the images:

docker container list -a
docker container rm ...

docker image rm hello-world
docker image rm hello-world2

Lastly, while working with Docker, you will notice that its cache can easily fill gigabytes of disk space, so we will also show a space-saving command here. It will not make a difference with only a few images, but will come in handy in the future. Please note that it removes all stopped containers, unused networks, dangling images, and the build cache:

docker system df
docker system prune -f
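
If you also want to remove all unused images, not only dangling ones, add option -a:

docker system prune -a -f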

Interacting with Containers

By default, for easier communication, Docker creates one virtual network, and all containers get assigned an IP address from its subnet so they can talk to each other.

The containers also have access to the host OS’ networking, so if the host machine is connected to the Internet, containers will be able to access it as well.
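
This default network is usually named “bridge”. You can list Docker’s networks and inspect its details, including the subnet from which containers get their addresses:

docker network ls

docker network inspect bridge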

However, other than that, containers run with pretty much everything separate from each other, including storage.

That is great for isolation, but may be a problem for durability of data. While it is quite normal to have long-lived containers, containers are often also created temporarily, and all their data is lost when containers are removed.

Similarly, container isolation may be a problem if we actually want software in the containers to interact with the outside system.

There are a couple of ways to enable that interaction:

  • By exposing the container’s network ports to the host OS or other containers

  • By copying additional files or other required data into the image at build time

  • By mounting host OS directories (ad-hoc volumes) into the container at startup time

Mounting disk volumes inside containers is done with option -v HOST_PATH:CONTAINER_PATH, and exposing ports is done with option -p HOST_PORT:CONTAINER_PORT.

Let’s show it in practice.

Exposed Network Ports in Containers

We have seen the “hello-world” image in the previous section. The image did not exist locally, so it was automatically pulled from Docker’s public registry when we ran it.

That container did not require much interaction. All it did was print a welcome message and exit.

But now, to show network interaction with containers and set things up for other examples, we are going to explicitly download and then run a Docker image for Apache, an HTTP (web) server.

The image name is httpd:

docker pull httpd

To be useful, an HTTP server must of course be accessible. So we are going to run the container and route the host OS’ port 8080 to port 80 in the container. Port 80 is the standard port on which web servers listen for unencrypted (non-SSL/TLS) connections.

docker run -ti --rm -p 8080:80 httpd

With that command, the container will start in foreground mode.

We can now use a web browser to open http://0:8080/ (“0” is a shorthand for 0.0.0.0, i.e. the local machine) and we will be greeted by Apache with a simple message “It works!”.
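
The same check can also be done from another terminal with curl; the output should be similar to:

curl http://localhost:8080/

<html><body><h1>It works!</h1></body></html>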

When you are done with the test, press Ctrl+c to terminate the process. Because of option --rm, the container will also be removed automatically upon termination.

Additional Files in Containers

But, what about a more useful website? What if we had a personal or company website, and wanted to serve it from this container?

If you are familiar with the basics of the HTTP protocol, you know the original idea was that a client would request a particular URL on the server, that URL would map to an HTML file on disk, and the server would return its contents to the client.

From the documentation of the Docker official image 'httpd', we see that Apache’s root directory for serving HTML files is /usr/local/apache2/htdocs/.

Therefore, the simplest thing we could do to serve our website instead of the default “It works!” would be to copy our files over the default ones.

Let’s do that now and confirm that it worked by seeing the message change from “It works!” to “Hello, World!”:

First, locally we will create a directory public_html/ containing one page for our new website:

mkdir public_html
echo "<html><body>Hello, World!</body></html>" > public_html/index.html

Then, we will create a separate Dockerfile, e.g. Dockerfile.apache, for our new image:

FROM httpd
COPY ./public_html/ /usr/local/apache2/htdocs/

And finally, we will build and run the image:

docker build -f Dockerfile.apache -t hello-apache2 .  # (Don't forget the dot at the end)

docker run -ti --rm --name test-website -p 8080:80 hello-apache2

Visiting http://0:8080/ will now show our own website and message “Hello, World!”.

We are done with the test, so press Ctrl+c to terminate the process.

Mounted Host OS Directories

The previous example works, but copying data into images is not very flexible. When data changes, we need to rebuild images and also restart containers using them.

As mentioned earlier, the solution is to mount host OS directories (ad-hoc volumes) into the container with option -v HOST_PATH:CONTAINER_PATH.

Since we already have our public_html/ directory, and mounting volumes does not require changing the images, we can use the original httpd image directly:

docker run -ti --rm --name test-website-volume -p 8080:80 -v ./public_html:/usr/local/apache2/htdocs/ httpd

Visiting http://0:8080/ will now show our new website and message “Hello, World!”.

But the example is not functionally equivalent to the previous one. The data is now “live”. If we modify any file in public_html/ and visit the page in the browser, we will immediately see its current content. (You might need to press Ctrl+r, F5, or Ctrl+Shift+r, or Shift+click Reload, to make the browser refresh its cache.)
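
For example, while the container is still running, change the page from the host and reload the browser (the new text is an arbitrary example):

echo "<html><body>Hello again, World!</body></html>" > public_html/index.html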

Furthermore, since we now have a long-running container, we can verify its presence in the output of docker ps:

docker ps

CONTAINER ID   IMAGE     COMMAND            CREATED      STATUS      PORTS                                    NAMES
66bb93476a99   httpd     "httpd-foreground"   1 hour ago   Up 1 hour   0.0.0.0:8080->80/tcp, :::8080->80/tcp    test-website-volume

Running Commands in Containers

In containers, you can only run commands that exist in the underlying image.

As long as they exist, you can run them at container startup, or later while the container is running.

Let’s look at each option.

At Startup

From Dockerfile

There are two Dockerfile directives that define the default program to run in the container at startup – ENTRYPOINT and CMD. Both are by default empty (undefined).

The full command that Docker will run by default is $ENTRYPOINT $CMD. (That is, any ENTRYPOINT to which any CMD is appended.)

We have seen an example of CMD in our earlier Dockerfile:

FROM hello-world
COPY ./hello2 /
CMD [ "/hello2" ]

Example of CMD with additional command line arguments:

CMD [ "/some/program", "--with-option", "123" ]

And a combination of ENTRYPOINT and CMD which will result in Docker starting /some/program --with-option 123:

ENTRYPOINT [ "/some/program" ]
CMD [ "--with-option", "123" ]

Note that the ENTRYPOINT and CMD lines above use the preferred “exec” syntax, but a “shell” syntax is also available. See more in the ENTRYPOINT and CMD documentation.
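
For illustration, here is the same default command in both syntaxes; the shell form runs the program through /bin/sh -c:

# “exec” syntax: the program is executed directly
CMD [ "/hello2" ]

# “shell” syntax: the program is executed via /bin/sh -c
CMD /hello2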

From Command Line

It is possible to override both ENTRYPOINT and CMD on the command line, at the time of container startup.

Option --entrypoint overrides ENTRYPOINT, while CMD is overridden just by listing the arguments on the command line. Note that --entrypoint goes before the image name, while the CMD arguments come after it:

#               [       ENTRYPOINT       ]            [      CMD      ]
docker run --rm --entrypoint /some/program some-image --with-option 123

At Runtime

Often, we want to connect to containers that are currently running and run some commands in them.

Let’s first start a container running Debian GNU/Linux:

docker run --name my_debian -ti --rm debian

root@5821b3a41434:/#

Then, in another terminal let’s run docker ps to confirm our container is running:

CONTAINER ID   IMAGE     COMMAND   CREATED          STATUS          PORTS     NAMES
5821b3a41434   debian    "bash"    15 seconds ago   Up 13 seconds             my_debian

Now with a running container, we can execute commands in it via docker container exec. Here is an example that shows disk space:

docker container exec -ti my_debian df -h

Filesystem      Size  Used Avail Use% Mounted on
overlay          15G  6.0G  7.6G  44% /
tmpfs            64M     0   64M   0% /dev
tmpfs           1.2G     0  1.2G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
/dev/xvda3       15G  6.0G  7.6G  44% /etc/hosts
tmpfs           1.2G     0  1.2G   0% /proc/asound
tmpfs           1.2G     0  1.2G   0% /proc/acpi
tmpfs           1.2G     0  1.2G   0% /proc/scsi
tmpfs           1.2G     0  1.2G   0% /sys/firmware

If there is a shell in the container, which there of course is in the Debian image, we can also run the shell directly:

docker container exec -ti my_debian /bin/bash

root@5821b3a41434:/#

The shell can be exited with command exit or by pressing Ctrl+d, as usual.

Hybrid

It is completely fine to combine startup and runtime methods of executing commands in Docker containers.

In many container images that implement client/server applications, such as databases, it is customary that the Docker image starts the server by default; if you want to start a client, you override the command to start just the client.

You can do this by running docker run IMAGE_NAME [CMD] twice, once without and once with the command manually specified. This runs the same image twice, in two separate containers, which you can confirm with docker ps.

However, you can also run the second command with docker container exec CONTAINER CMD, which has a similar but different effect: it runs the second command in the first container, rather than starting a second, separate container.
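
As a sketch, here is that pattern with the official postgres image (the container name and password are arbitrary examples):

# Start the server in the background; the image’s default command runs the server
docker run --name my-postgres -e POSTGRES_PASSWORD=mysecret -d postgres

# Later, run a client in the same container
docker container exec -ti my-postgres psql -U postgres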


Automatic Links

The following links appear in the article:

1. Install Docker Engine on Debian - https://docs.docker.com/engine/install/debian/
2. CMD - https://docs.docker.com/reference/dockerfile/#cmd
3. ENTRYPOINT - https://docs.docker.com/reference/dockerfile/#entrypoint
4. Linux Control Groups - https://en.wikipedia.org/wiki/Cgroups
5. Linux Namespaces - https://en.wikipedia.org/wiki/Linux_namespaces
6. Virtualization - https://en.wikipedia.org/wiki/Virtualization
7. Containerization - https://en.wikipedia.org/wiki/Virtualization#Containerization
8. Docker Official Image 'httpd' - https://hub.docker.com/_/httpd