Article Collection
This article is part of the following series:
1. Docker
- Part 1: Introduction to Docker Containers (this article)
Introduction
You might have heard of “virtualization” and “virtual machines” (VMs). Virtualization enables users to concurrently run multiple, completely separate operating systems on a single physical computer.
A conceptually similar, but technically different solution from virtualization is “containerization”. In this model virtualization is not implemented at the hardware level but at the kernel level. All containers still run in isolated environments, but under the same operating system kernel.
Both approaches have their preferred use cases. Virtual machines emulate hardware, so usually a whole operating system must be running in them. That provides more flexibility, but also adds overhead. Containers run under the host’s kernel and do not require a full operating system; a single statically compiled executable file would work. However, containerized processes are visible to the host OS, and only operating systems and programs compatible with the host’s kernel can run in containers.
Docker is a complete containerization solution. It comes with tools necessary for creating software images and running them in containers. Images are binary snapshots of files and directories, packed together as a single unit. Containers are running instances of those images. In addition to creating and running images, Docker also comes with all the features necessary for supporting the lifecycle of images and orchestration of containers.
Docker is a high-level technology that builds on top of numerous computing, Unix, Linux, and networking concepts. To understand Docker, you should understand those supporting elements first.
Docker containers rely heavily on the functionality provided by the host’s kernel. Two primary groups of such functionality provided by the Linux kernel are Linux control groups and Linux namespaces.
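You can glimpse both mechanisms on any Linux machine, even without Docker installed. As a quick sketch (the paths below are standard on modern Linux kernels), the following commands show the namespaces and control groups of the current shell:

```shell
# Each symlink in this directory represents one namespace
# (mnt, net, pid, uts, ...) that the current process runs in
ls -l /proc/self/ns/

# The control groups (cgroups) the current process belongs to
cat /proc/self/cgroup
```

Every containerized process Docker starts gets its own set of such namespaces and its own cgroup limits; that is all the "isolation" a container really is.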
Docker Components
As mentioned, Docker is a complete containerization solution. It consists of the following components:
- Dockerfile and Dockerfile syntax, which allow one to define how images are built using a procedural syntax. Images are often easily created by referencing other, existing images and customizing them
- Engine for building the images, which takes Dockerfiles as input and produces Docker images as output
- Functionality for uploading Docker images to public or private Docker registries
- Functionality for downloading Docker images from existing public or private (authenticated) Docker registries
- Engine for running the containers, which runs Docker images in isolated environments (containers)
- Engine for orchestrating the containers, which runs groups of possibly dependent and scalable containers (docker compose)
It is important to know that the command docker is a single program used to access all of the described functionality. Different subcommands invoke different parts, such as docker image, docker container, and docker system.
Previously a separate command docker-compose was used for orchestration, but its function has since been integrated into docker compose. Orchestration uses a dedicated config file, [docker-]compose.yml.
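As a small illustration (the service names here are hypothetical and the setup is just a sketch, not a production configuration), a compose.yml describing a web server together with a database might look like:

```yaml
# compose.yml (hypothetical example)
services:
  web:
    image: httpd
    ports:
      - "8080:80"
  db:
    image: postgres
    environment:
      POSTGRES_PASSWORD: example
```

Such a group of containers would then be started together with docker compose up and stopped with docker compose down.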
And finally there is Docker Swarm, which can centrally manage a fleet of machines running Docker. That is a separate system and out of scope of this article.
Why Use Docker?
There are a number of entirely different reasons why you might use Docker.
For example:
- If the host OS is outdated and can’t be upgraded, a newer version of a component can be run in a Docker container
- If you want to try some software without risking modifications to your host OS, you might run it in a container
- Some software is complex to install, so its authors might have prepared ready-to-use Docker containers, good for beginners
- Container images (most often Docker, but could also be others) can be used as basic units for software deployment, especially in the cloud, such as in Kubernetes
Why Not Use Docker?
This section is not intended to steer you away from using Docker, but to help you position it correctly in your mind. Containers are sometimes ideal for deploying software and tinkering with systems and concepts, but there are some concerns.
In a standard (non-container) scenario, when you want to use software on GNU/Linux you install it via the host’s native package management tools (such as Debian’s apt). Then, you configure the software if needed and run it.
That is the default workflow. You are involved in the whole process, while also taking maximum advantage of all the effort that package maintainers have invested in:
- Reviewing the licensing and quality of the software
- Making it adhere to the distribution’s defined standards
- Integrating it into the distribution’s standard procedures and tools
- Documenting it and often providing configuration examples
- Pre-configuring it and generally making it ready and easy to use
Docker images, on the other hand, can be created by anyone. Images are not verified or tuned by the distribution’s package maintainers, and software in them is not installed and configured manually by end users. Both steps are already done in advance by image authors and to their liking.
That raises the following concerns:
- Images may contain code or behavior that you would not approve of. Since images are bigger and less transparent than packages, you might start using them without knowing what exactly they are doing, or without the determination necessary to audit and remove any offending parts
- Software images, being pre-installed and pre-configured, can deliver functionality quickly. But if you rely only on images, you never learn how to install and configure software yourself. That makes you potentially miss out on features of the original software that were not made available through the image, and in general reduces your level of skill
- Using software through images and containers, or through proprietary platforms on which they may be deployed, might make you accustomed to using “software as a service”, rather than demanding full control and ownership of your software, data, and devices
Installation
Docker
Docker installation is not a part of this article since it is adequately covered in numerous places elsewhere.
For Debian GNU/Linux-based systems, see for example Install Docker Engine on Debian and then return here.
Permissions
In a default scenario, Docker uses a simple permission model where all members of the group docker are able to use it.
So our first task is to add the current user to group docker:
sudo adduser $USER docker
Adding user `user' to group `docker' ...
Done.
The operating system caches user group memberships for performance. Group memberships are re-read on first user login. Thus, for the cache to be refreshed and the new group membership applied, you should completely log out of the system, and then log back in. However, that may be inconvenient, so in the meantime there is a command that will force the current shell to apply the new group:
newgrp docker; newgrp $USER
Alternatively, if that asks for a password, but you do have root privileges, there is another method:
sudo su - -c "su - $USER"
To confirm that you have the necessary privilege to use Docker, simply run id to check that “docker” is in the list of auxiliary groups, and then run e.g. docker ps. If no error message is printed, you are OK.
Quick Start
Starting Containers
As mentioned, Docker in its core is a system for creating software images and running them in containers.
However, we do not have to build all images ourselves. Docker maintains a public registry of available images, and as soon as we reference an image that does not exist locally, Docker will connect to its public Internet registry and try to download it from there.
Docker’s eagerness to look up images remotely can even be inconvenient – it only takes a one-letter typo or a mismatch in image version for Docker to not find the image locally and go try to download it from the Internet!
Let’s start using Docker by confirming that in a clean installation we do not have any containers or images. The following commands executed on a fresh installation should just print empty results:
# Show running containers
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
# Show all containers
docker container list -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
# Show all images available locally
docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
Now, knowing about Docker’s automatic lookup of images in the public registry, let’s run our first container “hello-world”.
When we run the command, the first part of output will be from Docker, informing us about downloading the image. The second part will be the actual message from a container that was started, printing “Hello from Docker” and a bunch of extra text.
Here is the first part of the output in which Docker is telling us that the image was downloaded:
docker run --rm hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
719385e32844: Pull complete
Digest: sha256:dcba6daec718f547568c562956fa47e1b03673dd010fe6ee58ca806767031d1c
Status: Downloaded newer image for hello-world:latest
And here is the second part which comes from the running container and begins with “Hello from Docker!”:
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
(amd64)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/
For more examples and ideas, visit:
https://docs.docker.com/get-started/
In short – it worked!
Now we can check our list of local images again. One image will be there:
docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
hello-world latest 9c7a54a9a43c 3 months ago 13.3kB
The command docker ps, which shows running containers, will still print an empty list. That is because our container started, printed the message, and exited, so there are no running containers at the moment:
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
Building Images
While we have the “hello-world” image at hand, let’s show the simplest possible customization and build of our own Docker image.
We already mentioned that Docker commands for building images are found in files named Dockerfile, and that new images can be built on top of existing ones.
Let’s create a very simple image based on “hello-world” and its program /hello that printed the welcome message.
To make our image a little different, we are going to start from the “hello-world” image and add a new command /hello2 to it, which will just print a brief Hello, World! to the screen and exit.
First, we need to create the hello2 program. If you have programmed in C, you will recognize the following snippet as a C program. But in any case, just run the following commands, which will install the compiler, create a minimal .c file, compile it, and run it:
sudo apt install build-essential
echo -e '#include <stdio.h>\nint main() { printf("Hello, World!\\n");}' > hello2.c
gcc -o hello2 -static hello2.c
./hello2
Hello, World!
Now that we have our program, let’s create a Dockerfile for our new image.
# Dockerfile
FROM hello-world
COPY ./hello2 /
CMD [ "/hello2" ]
The above lines specify that we want to use the existing image “hello-world” as a base, copy the file hello2 from the host to /hello2 in the new image, and define the command (“/hello2”) that will run by default every time we run this image.
Note that Dockerfiles only define how images will be built, not how they will be named or which version they will have; those options are passed at build time.
We can then build the image with:
docker build -f Dockerfile -t hello-world2 . # (Don't forget the dot at the end)
Once the image is built, we can verify its presence in the local Docker cache:
docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
hello-world2 latest d97789789d8d 4 seconds ago 775kB
Note that the tag (version) “latest” is important. If a version is not specified when trying to run an image, Docker will look for an image tagged “latest”.
And we can now start a container based on image “hello-world2”:
docker run --rm hello-world2
Hello, World!
Explicitly mentioning the command to run in the container (/hello2) was not necessary because we already configured it as the default CMD in the Dockerfile.
Since we built our image on top of the original “hello-world”, the image now contains both /hello and /hello2.
What if we wanted to run the original /hello?
We just specify the command to run after all other parameters:
docker run --rm hello-world2 /hello
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
...
...
Removing Images
The images “hello-world” and “hello-world2” are extremely simple. They consist of programs /hello and /hello2, which print welcome messages and exit.
There is nothing else useful we can do with them, other than maybe inspecting them for the sake of practice, and then removing them from the local cache:
docker image inspect hello-world
{
"Id": "sha256:9c7a54a9a43cca047013b82af109fe963fde787f63f9e016fdc3384500c2823d",
"RepoTags": [
"hello-world:latest"
],
...
...
...
docker image rm hello-world
docker image rm hello-world2
It is possible that the above commands will fail, saying:
Error response from daemon: conflict: unable to remove repository reference "hello-world2" (must force) - container ... is using its reference image d97789789d8d
That simply means there exist containers which reference this image, so the image cannot be removed. List containers and remove them before removing the images:
docker container list -a
docker container rm ...
docker image rm hello-world
docker image rm hello-world2
Lastly, while working with Docker, you will notice that its cache can easily fill gigabytes of disk space, so we will also show a space-saving command here. That command will not make a difference with only a few images, but will come in handy in the future. Please note that it will remove all stopped containers, unused networks, dangling images, and the build cache:
docker system df
docker system prune -f
Interacting with Containers
By default, for easier communication, Docker creates one default virtual network and all containers get assigned an IP from its subnet so they can talk to each other.
The containers also have access to the host OS’ networking, so if the host machine is connected to the Internet, containers will be able to access it as well.
However, other than that, containers run with pretty much everything separate from each other, including storage.
That is great for isolation, but may be a problem for durability of data. While it is quite normal to have long-lived containers, containers are often also created temporarily, and all their data is lost when containers are removed.
Similarly, container isolation may be a problem if we actually want software in the containers to interact with the outside system.
There are a couple ways to enable that interaction:
- By exposing the container’s network ports to the host OS or other containers
- By copying additional files or other required data directly into the container at build time
- By mounting some host OS directories (ad-hoc volumes) into the container at startup time
Mounting disk volumes inside containers is done with option -v HOST_PATH:CONTAINER_PATH, and exposing ports is done with option -p HOST_PORT:CONTAINER_PORT.
Let’s show it in practice.
Exposed Network Ports in Containers
We have seen the “hello-world” image in the previous chapter. The image did not exist locally, so it was automatically pulled from Docker’s public registry when we ran it.
That container did not require much interaction. All it did was print a welcome message and exit.
But now, to show network interaction with containers and set things up for other examples, we are going to explicitly download and then run a Docker image for Apache, an HTTP (web) server.
The image name is httpd:
docker pull httpd
To be useful, an HTTP server must of course be accessible. So we are going to run the container and route the host OS’ port 8080 to port 80 in the container.
Port 80 is the standard port on which web servers listen for unencrypted (non-SSL/TLS) connections.
docker run -ti --rm -p 8080:80 httpd
With that command, the container will start in foreground mode.
We can now use a web browser to open http://0:8080/ and we will be greeted by Apache with a simple message “It works!”.
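If you prefer the terminal over a browser, the same check can be done with curl (assuming it is installed) while the container is running:

```shell
# Request the default page from the container's published port;
# Apache should answer with its "It works!" page
curl -s http://127.0.0.1:8080/
```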
When you are done with the test, press Ctrl+c to terminate the process. Because of option --rm, the container will also be removed automatically upon termination.
Additional Files in Containers
But, what about a more useful website? What if we had a personal or company website, and wanted to serve it from this container?
If you are familiar with the basics of the HTTP protocol, you know the original idea was that a client would request a particular URL on the server, that URL would map to some HTML file on disk, and the server would return its contents to the user.
From the documentation on the Docker official image 'httpd', we see that Apache’s root directory for serving HTML files is /usr/local/apache2/htdocs/.
Therefore, the simplest thing we could do to serve our website instead of the default “It works!” would be to copy our files over the default ones.
Let’s do that now and confirm that it worked by seeing the message change from “It works!” to “Hello, World!”:
First, locally we will create a directory public_html/ containing one page for our new website:
mkdir public_html
echo "<html><body>Hello, World!</body></html>" > public_html/index.html
Then, we will create a separate Dockerfile, e.g. Dockerfile.apache, for our new image:
FROM httpd
COPY ./public_html/ /usr/local/apache2/htdocs/
And finally, we will build and run the image:
docker build -f Dockerfile.apache -t hello-apache2 . # (Don't forget the dot at the end)
docker run -ti --rm --name test-website -p 8080:80 hello-apache2
Visiting http://0:8080/ will now show our own website and message “Hello, World!”.
We are done with the test so press Ctrl+c
to terminate the process.
Mounted Host OS Directories
The previous example works, but copying data into images is not very flexible. When data changes, we need to rebuild images and also restart containers using them.
As mentioned earlier, the solution is to mount host OS directories (ad-hoc volumes) into the container with option -v HOST_PATH:CONTAINER_PATH.
Since we already have our public_html/ directory, and mounting volumes does not require changing the images, we can use the original httpd image directly:
docker run -ti --rm --name test-website-volume -p 8080:80 -v ./public_html:/usr/local/apache2/htdocs/ httpd
Visiting http://0:8080/ will now show our new website and message “Hello, World!”.
But the example is not functionally equivalent to the previous one. This data is now “live”. If we modify any file in public_html/ and visit it through the browser, we will immediately see its current content.
(You might need to press Ctrl+r or F5, or Ctrl+Shift+r, or click Shift+Reload, to make the browser refresh its cache.)
Furthermore, since we now have a long-running container, we can verify its presence in the output of docker ps:
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
66bb93476f99 httpd "httpd-foreground" 1 hour ago Up 1 hour 0.0.0.0:8080->80/tcp, :::8080->80/tcp test-website-volume
Running Commands in Containers
In containers, you can only run commands that exist in the underlying image.
As long as they exist, you can run them at container startup, or later while the container is running.
Let’s look at each option.
At Startup
From Dockerfile
There are two Dockerfile directives that define the default program to run in the container at startup – ENTRYPOINT and CMD.
Both are by default empty (undefined).
The full command that Docker will run by default is $ENTRYPOINT $CMD. (That is, any ENTRYPOINT to which any CMD is appended.)
We have seen an example of CMD in our earlier Dockerfile:
FROM hello-world
COPY ./hello2 /
CMD [ "/hello2" ]
Example of CMD with additional command-line arguments:
CMD [ "/some/program", "--with-option", "123" ]
And a combination of ENTRYPOINT and CMD, which will result in Docker starting /some/program --with-option 123:
ENTRYPOINT [ "/some/program" ]
CMD [ "--with-option", "123" ]
Note that ENTRYPOINT and CMD above use the preferred “exec” syntax, but a “shell” syntax is also available. See more in the ENTRYPOINT and CMD documentation.
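For comparison, a sketch of the “shell” form of CMD, in which Docker wraps the command in /bin/sh -c (note that if ENTRYPOINT itself is given in shell form, any CMD is ignored):

```dockerfile
# "shell" form: equivalent to /bin/sh -c '/some/program --with-option 123'
CMD /some/program --with-option 123
```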
From Command Line
It is possible to override both ENTRYPOINT and CMD on the command line, at the time of container startup.
Option --entrypoint overrides ENTRYPOINT, while CMD is overridden simply by listing arguments after the image name. Note that --entrypoint, like all docker run options, must come before the image name:
#               [ ENTRYPOINT ]              [ CMD ]
docker run --rm --entrypoint /some/program some-image --with-option 123
At Runtime
Often we want to connect to containers that are currently running and run some commands in them.
Let’s first start a container running Debian GNU/Linux:
docker run --name my_debian -ti --rm debian
root@5821b3a41434:/#
Then, in another terminal, let’s run docker ps to confirm our container is running:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5821b3a41434 debian "bash" 15 seconds ago Up 13 seconds my_debian
Now, with a running container, we can execute commands in it via docker container exec. Here is an example that shows disk space:
docker container exec -ti my_debian df -h
Filesystem Size Used Avail Use% Mounted on
overlay 15G 6.0G 7.6G 44% /
tmpfs 64M 0 64M 0% /dev
tmpfs 1.2G 0 1.2G 0% /sys/fs/cgroup
shm 64M 0 64M 0% /dev/shm
/dev/xvda3 15G 6.0G 7.6G 44% /etc/hosts
tmpfs 1.2G 0 1.2G 0% /proc/asound
tmpfs 1.2G 0 1.2G 0% /proc/acpi
tmpfs 1.2G 0 1.2G 0% /proc/scsi
tmpfs 1.2G 0 1.2G 0% /sys/firmware
If there is a shell in the container, which the Debian image of course has, we can also run the shell directly:
docker container exec -ti my_debian /bin/bash
root@5821b3a41434:/#
The shell can be exited with the command exit or by pressing Ctrl+d, as usual.
Hybrid
It is completely fine to combine startup and runtime methods of executing commands in Docker containers.
In many container images that implement some client/server applications, such as databases, it is customary that their Docker image will start the server by default, but if you want to start a client, then you override the command to just start the client.
You can do this by running docker run IMAGE_NAME [CMD] twice, once without and once with the command manually specified. This will run the same image twice, in two separate containers, and you will be able to confirm that with docker ps.
Alternatively, you can run the second command with docker container exec CONTAINER CMD, which has a similar but different effect: it runs the second command in the first container, rather than starting a second container.
Automatic Links
The following links appear in the article:
1. Install Docker Engine on Debian - https://docs.docker.com/engine/install/debian/
2. CMD - https://docs.docker.com/reference/dockerfile/#cmd
3. ENTRYPOINT - https://docs.docker.com/reference/dockerfile/#entrypoint
4. Linux Control Groups - https://en.wikipedia.org/wiki/Cgroups
5. Linux Namespaces - https://en.wikipedia.org/wiki/Linux_namespaces
6. Virtualization - https://en.wikipedia.org/wiki/Virtualization
7. Containerization - https://en.wikipedia.org/wiki/Virtualization#Containerization
8. Docker Official Image 'Httpd' - https://hub.docker.com/_/httpd