Docker
Notes, links, & reference code for Docker/Docker Compose.
Warning
In progress...
Todo
- Add sections for things that took me entirely too long to learn/understand
  - Multistage builds
    - How to target specific layers, i.e. dev vs prod
  - Common Docker commands, how to interpret/modify them
    - Docker build
    - Docker run
  - ENV vs ARG
  - EXPOSE
  - CMD vs ENTRYPOINT vs RUN
- Add section(s) for Docker Compose
  - Add an example docker-compose.yml
    - Detail required vs optional sections (i.e. version (required) and volumes (optional))
    - Links (with depends_on)
  - Networking
    - Internal & external networks
    - Proxying
    - Exposing ports (and when you don't need to/shouldn't)
How to use Docker build layers (multistage builds)
Docker caches the result of each instruction in a Dockerfile as a layer, and BuildKit (the default builder in recent Docker releases) reuses those cached layers so subsequent rebuilds with docker build are much faster. A layer is rebuilt only if its instruction, or the files it copies, have changed; every layer after it is then rebuilt as well. What this means in practice is that you can separate the steps you use to build your image into named stages like base, build, and run, and if nothing that feeds your build stage has changed (i.e. no new dependencies added), that stage will not be rebuilt. This page uses "layer" and "stage" loosely: strictly, a stage is a FROM ... AS <name> block, and each instruction inside it produces a layer.
Example: layered Dockerfile
In this example, I am building a simple Python app inside a Docker container. The Python code itself does not matter for this example.
To illustrate the differences in a multistage Dockerfile, let's start with a "flat" Dockerfile, and modify it with build layers. This is the basic Dockerfile:
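A minimal sketch of what such a flat Dockerfile might look like. The file names requirements.txt and main.py, and the two Python ENV settings, are assumptions made for illustration rather than details from the original project:

```dockerfile
# Flat (single-stage) Dockerfile: everything happens in one stage.
FROM python:3.11-slim

# Assumed settings: unbuffered output, no .pyc files
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1

WORKDIR /app

# Copying the whole project first means ANY file change invalidates
# the cache for this layer and every layer after it.
COPY . .

# Re-runs on every code change, even if requirements.txt is untouched
RUN pip install --no-cache-dir -r requirements.txt

# Documents the port the app listens on (assumed to be 8000)
EXPOSE 8000

# main.py is a hypothetical entrypoint name
CMD ["python", "main.py"]
```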
In this example, any change to the code or dependencies causes nearly the entire image to be rebuilt each time. This is slow & inefficient, and leads to a larger container image. We can break these steps into multiple build stages. In the example below, the image is built in 3 "stages": base, build, and run:
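One way the three stages could be laid out is sketched here. The virtual environment path /opt/venv and the file names requirements.txt and main.py are assumptions for illustration, not the original project's layout:

```dockerfile
# --- base: shared environment for every later stage ---
FROM python:3.11-slim AS base

ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1

# --- build: install Python dependencies ---
FROM base AS build

WORKDIR /app

# Copy only the dependency manifest so this stage's cache survives
# changes to application code.
COPY requirements.txt .

# Install into a virtual environment (assumed location: /opt/venv)
# so it can be copied into the run stage as a single directory.
RUN python -m venv /opt/venv \
    && /opt/venv/bin/pip install --no-cache-dir -r requirements.txt

# --- run: the last stage, which becomes the final image ---
FROM base AS run

WORKDIR /app

# Bring over the installed dependencies from the build stage
COPY --from=build /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Application code is copied last, so code edits only rebuild from here
COPY . .

EXPOSE 8000

# main.py is a hypothetical entrypoint name
CMD ["python", "main.py"]
```

Copying requirements.txt on its own, before the application code, is the design choice that lets the dependency install stay cached when only the code changes.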
Layers:
- base: The base layer provides a common environment for the rest of the layers.
  - In this example, we set ENV variables, which persist into every stage that is built FROM base.
    - In contrast, ARG values are scoped to the stage that declares them. This example does not use any ARG lines, but be aware that a build argument you set with ARG is not visible in an unrelated stage. If a stage that does not build on the declaring stage needs the same argument, you will need to declare the ARG again in that stage.
- build: The build layer is where you install your Python dependencies.
  - You can also install system packages in this layer with apt/apt-get.
    - The python:3.11-slim base image is built on Debian. If you are using a different base image, i.e. python:3.11-alpine, use the appropriate package manager (i.e. apk for Alpine, dnf for Fedora, zypper for openSUSE, etc) to install packages in the build layer.
- run: Finally, the run layer executes the code using what was prepared in the previous base & build steps. It also documents port 8000 with EXPOSE. EXPOSE on its own does not publish the port; map it with docker run -p 1234:8000, where 1234 is the port on your host you want to map to port 8000 inside the container.
Using this method, each time you run docker build after the first, only layers that have changed in some way will trigger a rebuild. For example, if you add a Python dependency with pip install <pkg> and update the requirements.txt file with pip freeze > requirements.txt, the build layer (and the part of the run layer that copies from it) will be rebuilt. If you make changes to your Python application, only the run layer will be rebuilt. Each layer that does not need to be rebuilt reduces the overall build time, and only the final run stage is saved as your image, leading to smaller Docker images.
Example: Targeting a specific Dockerfile build stage
With multistage builds, you can also create a dev and prod stage, which you can target with docker build --target or a docker-compose.yml file. This allows you to build the development & production version of an application using the same Dockerfile.
Let's modify the multistage Dockerfile example from above to add a dev and prod layer. Modifications to the multistage Dockerfile include adding an ENV variable for storing the app's environment (dev/prod). In my projects, I use Dynaconf to manage app configurations depending on my environment. Dynaconf allows you to set an ENV variable called $ENV_FOR_DYNACONF so you can control app configurations per-environment (Dynaconf environment docs).
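A sketch of how the dev and prod stages might be added. It assumes the Dynaconf settings define environments named dev and prod, and reuses the hypothetical /opt/venv path and main.py entrypoint:

```dockerfile
FROM python:3.11-slim AS base

ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1

FROM base AS build

WORKDIR /app
COPY requirements.txt .
RUN python -m venv /opt/venv \
    && /opt/venv/bin/pip install --no-cache-dir -r requirements.txt

# --- dev: selected with `docker build --target dev .` ---
FROM base AS dev

# Dynaconf reads this variable to pick the active settings environment
# (assumes the settings files define an environment named "dev")
ENV ENV_FOR_DYNACONF=dev

WORKDIR /app
COPY --from=build /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY . .

EXPOSE 8000
CMD ["python", "main.py"]

# --- prod: the last stage, so it is the default when no --target is given ---
FROM base AS prod

# Assumes the settings files define an environment named "prod"
ENV ENV_FOR_DYNACONF=prod

WORKDIR /app
COPY --from=build /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY . .

EXPOSE 8000
CMD ["python", "main.py"]
```

Because prod is the final stage, a plain docker build without --target produces the production image in this layout.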
With this multistage Dockerfile, you can target a specific stage with docker build --target <stage-name> (i.e. docker build --target dev). This will run through the base and build stages, but skip the prod stage.
You can also target a specific layer in a docker-compose.yml file:
The example docker-compose.yml file above demonstrates targeting the dev layer of the multistage Dockerfile above it. We also set the entrypoint (instead of using CMD in the Dockerfile), and expose port 8000 in the container.
ENV vs ARG in a Dockerfile
The ENV and ARG instructions in a Dockerfile can be used to control how an image is built and how it functions when live. The differences between an ENV and an ARG are outlined below.
Note
This list is not a complete comparison between ENV and ARG. For more information, please check the Docker build documentation guide.
ENV
- Define environment variables for the container.
- Can be accessed the same way you would on a host, with $ENV_VAR_NAME.
- Can be set/overridden at runtime with docker run -e/--env, or the environment: stanza in a docker-compose.yml file.
- Available during both the build and run phases of a container.
  - When building an image (docker build or docker compose build), ENV variables will always use the value declared in the Dockerfile.
  - At runtime (i.e. when running docker run or docker compose up), the values can be overridden with docker run -e/--env or the environment: stanza in a docker-compose.yml file.
ARG
- Define variables that are only available at build time; they are not set in the running container.
- Values may be changed while building, i.e. by declaring the ARG again in a later stage. An ENV with the same name takes precedence over the ARG.
- Can be set/overridden with docker build --build-arg ARG_NAME=value, or the build: args: stanza in a docker-compose.yml file.
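The ENV and ARG behaviour listed above can be sketched in a single Dockerfile. APP_VERSION and LOG_LEVEL are hypothetical names with hypothetical defaults, chosen only for illustration:

```dockerfile
FROM python:3.11-slim

# Build-time only: override with `docker build --build-arg APP_VERSION=2.0.0 .`
# (APP_VERSION and its default are hypothetical)
ARG APP_VERSION=0.0.0

# Stored in the image and visible to the running container;
# override with `docker run -e LOG_LEVEL=debug ...`
ENV LOG_LEVEL=info

# An ARG is gone once the build ends unless its value is copied into an ENV
ENV APP_VERSION=${APP_VERSION}

# During the build, both kinds of variable expand inside RUN instructions
RUN echo "building version ${APP_VERSION} with log level ${LOG_LEVEL}"
```

Copying an ARG into an ENV of the same name is a common way to keep a build-time value available to the running application.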
Example:
Build ARGs are useful for setting things like a software version number, i.e. when downloading a specific software release from GitHub. You can set a build arg for the release version, i.e. ARG RELEASE_VER, and provide it at build time with docker build --build-arg RELEASE_VER=1.2.3, or in a docker-compose.yml file like:
Example build arg stanza
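On the Dockerfile side, the RELEASE_VER argument described above might be consumed roughly like this. The repository, archive name, and install path are hypothetical placeholders, not a real project:

```dockerfile
FROM debian:bookworm-slim

# No default: supply with `docker build --build-arg RELEASE_VER=1.2.3 .`
ARG RELEASE_VER

# Hypothetical download URL; substitute the real release asset path.
# ADD fetches a remote file at build time (it does not unpack remote archives).
ADD https://github.com/example-org/example-tool/releases/download/v${RELEASE_VER}/example-tool.tar.gz /tmp/example-tool.tar.gz

# Unpack into a hypothetical install directory and remove the archive
RUN mkdir -p /opt/example-tool \
    && tar -xzf /tmp/example-tool.tar.gz -C /opt/example-tool \
    && rm /tmp/example-tool.tar.gz
```

Changing the value passed to --build-arg changes the resolved URL, so the download and everything after it are rebuilt for the new version.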
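ENV variables, meanwhile, can store configurations for the app. Values like a database password or some other secret are better passed at runtime (with docker run -e or the environment: stanza) than declared in the Dockerfile, since ENV values declared there are stored in the image.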
Example ENV vars stanza
Exposing container ports
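In previous examples you have seen the EXPOSE line in a Dockerfile. This instruction documents which network port the application inside the container listens on; it does not publish the port to the host by itself. This is useful if your containerized application utilizes network ports (i.e. running a web frontend on port 8000), and you are running the container directly with docker run instead of through an orchestrator like Docker Compose or Kubernetes. To make the port reachable from the host, publish it with docker run -p (or -P, which publishes every exposed port on a random host port).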
Note
When using an orchestrator like Docker Compose, Kubernetes, HashiCorp Nomad, etc, it is not necessary (and often counterproductive) to define EXPOSE lines in a Dockerfile. It is better to define port binds between the host and container using the orchestrator's capabilities, i.e. the ports: stanza of a docker-compose.yml file.
When building & running a container image locally or without an orchestrator, you can add EXPOSE lines to a Dockerfile to record the container port, then bind it when you run the built image with docker run -p $HOST_PORT:$CONTAINER_PORT.
Example:
Example EXPOSE syntax
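A minimal sketch of the EXPOSE syntax. The application here is Python's built-in static file server, used purely as a stand-in for a real app listening on port 8000:

```dockerfile
FROM python:3.11-slim

WORKDIR /app
COPY . .

# Document that the app listens on TCP port 8000 (tcp is the default protocol).
# This is metadata only; the port is not published to the host.
EXPOSE 8000

# The protocol can be given explicitly, e.g. for a UDP listener:
# EXPOSE 8000/udp

# Stand-in app: serve the working directory over HTTP on port 8000
CMD ["python", "-m", "http.server", "8000"]
```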
After building this image, you can run it and bind to a port on the host (i.e. port 80) with docker run --rm -p 80:8000 ..., or by specifying the port binding in a docker-compose.yml file.
Warning
If you are using Docker Compose, comment/remove the EXPOSE and CMD lines in your Dockerfile and pass the values in through Docker Compose
Docker compose port binds
CMD vs RUN vs ENTRYPOINT
RUN
- Executes when an image is built. The command defined with RUN is executed on top of the current image, and its result is saved as a new layer.
  - Example: Installing the neovim package inside of a Dockerfile built on top of the ubuntu:latest image: RUN apt-get update -y && apt-get install -y neovim
  - Commands defined with RUN show their output in the console as the image is built, but are not executed when the built image is run with docker run or docker compose up
CMD
- Executes when the container is starting.
  - Commands defined with CMD execute when you run the container with docker run or docker compose up
  - These commands do not execute if a different command is passed, i.e. with docker run myimage cat log.txt
    - The cat log.txt command overrides the CMD defined in the image
- IMPORTANT: Only the last CMD defined in your image is executed. If you specify more than one, all but the last CMD are ignored.
- CMD does not replace ENTRYPOINT: when both are defined, the CMD value is passed as default arguments to the ENTRYPOINT. For most images CMD on its own is enough, and it should almost always be used instead of ENTRYPOINT.
ENTRYPOINT
- ENTRYPOINT functions almost the same way as CMD, but should be used only when extending an existing image (i.e. nginx, tomcat, etc)
- Setting an ENTRYPOINT on top of an existing image changes what that image executes at startup: the container runs the command you provide instead of the entrypoint the base image defined.
- The entrypoint for a container can be overridden with docker run --entrypoint
- If you define a CMD and an ENTRYPOINT (both in exec/JSON form), the CMD line will be provided as arguments to the ENTRYPOINT. The CMD is not run as a separate command and nothing is piped between the two: i.e. ENTRYPOINT ["cat"] with CMD ["log.txt"] runs cat log.txt, and docker run myimage other.txt runs cat other.txt instead.
- In general, it is best practice to use CMD in your Dockerfiles, unless you are aware of and fully understand a reason to use ENTRYPOINT instead.
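The three instructions can be seen side by side in a short sketch built on the neovim example from the RUN section. The file name notes.txt in the comments is a hypothetical argument:

```dockerfile
FROM ubuntu:latest

# RUN: executes once at build time; the result is stored in the image
RUN apt-get update -y \
    && apt-get install -y neovim \
    && rm -rf /var/lib/apt/lists/*

# ENTRYPOINT (exec form): the executable that runs when the container starts
ENTRYPOINT ["nvim"]

# CMD (exec form): default arguments appended to the ENTRYPOINT.
# `docker run <image>` runs `nvim --version`;
# `docker run -it <image> notes.txt` runs `nvim notes.txt` instead.
CMD ["--version"]
```

Arguments given after the image name replace only the CMD part; replacing the ENTRYPOINT itself requires docker run --entrypoint.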