Making applications portable (containerisation using docker)
Table of contents
- What is Docker?
- How does this work?
- More than just Containers
- Setup Docker
- Working with Docker
- Dockerize the Application
- Multi-Stage Docker Build
- Managing Docker Containers
- Recommended reading
What is Docker?
Say that we write an application with OpenJDK 14 and we would like to deploy this application on a server somewhere within the organisation. The OS where our application will be running needs to have OpenJDK 14 installed; otherwise, our application will not run.
Another team builds another application, this time using NodeJS 12. The OS where this application will be running needs to have the right version of NodeJS installed.
This raises the following questions
- Who will be managing this?
- Who will make sure that the correct versions of libraries/frameworks/platforms are installed?
- How can we manage multiple versions of libraries/frameworks/platforms running at the same time?
- How can we ensure that the application works as expected on the production environment?
- How will we save the configuration such that we can scale it to hundreds or thousands of servers?
Gradle helps us build and package all of the application’s dependencies into one fat JAR, but it falls short in setting up the operating system. That’s outside its scope.
Docker is a tool that we can use to bridge this gap and have an environment ready for our application to run on.
How does this work?
In a nutshell, an application is packaged into a Docker image. For example, if OpenJDK 14 is needed to run the application, we set up a Docker image that has OpenJDK 14 installed. If another application requires Oracle JRE 8, we set up a separate Docker image for the second application.
Similar to Java JAR files, docker creates images, which are the unit of work for docker. A docker image can be started using a command similar to the following.
$ docker run my-app
Docker takes the image named my-app and runs it as a docker container. Note that here we switched from docker image to docker container.
An instance of a docker image is called a docker container.
A docker container is a running version of the docker image. If the application produces log files, these log files will be in the docker container (not in the docker image).
A docker container can start and stop like any OS. The state of a docker container may or may not be preserved between different runs. With that said, do not rely on the container state in production. Docker Volumes can be used to address this issue, but they are beyond the scope of this document.
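The difference between stopping and removing a container can be sketched as follows. This is a minimal sketch, assuming the bash:5.0.17 image used later in this document is available; the container name state-demo and the file /tmp/note.txt are hypothetical and only serve as an illustration.

```shell
#!/bin/sh
# Sketch: a stopped container keeps its filesystem until it is removed.
# The container name and the file path are hypothetical.
container_state_demo() {
  name="state-demo"

  # Start a container in the background; "sleep 3600" only keeps it alive.
  docker run -d --name "$name" bash:5.0.17 sleep 3600

  # Write a file inside the running container.
  docker exec "$name" sh -c 'echo "hello" > /tmp/note.txt'

  # Stop and start the SAME container: the file written earlier is still there.
  docker stop -t 1 "$name"
  docker start "$name"
  docker exec "$name" cat /tmp/note.txt

  # Remove the container: its state is discarded together with it.
  docker rm -f "$name"
}
```

Calling container_state_demo on a machine with docker running walks through these steps. A container created afterwards with docker run starts again from the image and does not contain the file.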
More than just Containers
Docker provides more than just the correct configuration.
Limited Access
Consider the case where two or more applications are running on the same OS. One of the applications may be able (intentionally or unintentionally) to access files saved by another application running on the same OS.
Someone needs to make sure that an application does not access resources that belong to another application.
Secrets used to access resources, such as databases, may be saved as environment variables. OS scope environment variables are available to all applications running on the same OS. Any application running on the same OS will be able to access the secrets that belong to another application.
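With containers, environment variables can be scoped to a single container instead of the whole OS. The following is a minimal sketch, assuming the bash:5.0.17 image used later in this document; the variable name DB_PASSWORD and its values are hypothetical placeholders, and passing real secrets on the command line is not necessarily how a production setup would do it.

```shell
#!/bin/sh
# Sketch: variables passed with -e are visible only inside that container.
# DB_PASSWORD and its values are hypothetical placeholders.
show_container_env() {
  # The first container sees only the value passed to it...
  docker run --rm -e DB_PASSWORD="secret-for-app-a" bash:5.0.17 env

  # ...and the second container sees only its own value.
  docker run --rm -e DB_PASSWORD="secret-for-app-b" bash:5.0.17 env
}
```

Each call to env prints the environment of one container only, so neither container can read the value given to the other.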
Limit attack surface area
Say that we have a server with several applications running on it and one of these applications has a security vulnerability.
Vulnerabilities may come from different places including libraries used by an application or the platform on which it runs.
The other applications running on the same OS may be affected by this vulnerability too.
Limit Damage
An application may misbehave such that it causes the OS to misbehave, perform poorly or crash. For example, it may:
- Consume more memory than planned
- Open too many file handles
- Start too many threads
It is not always easy to configure the OS such that it limits the resources each application uses.
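Docker lets us set such limits per container when the container is started. The following is a minimal sketch using options of docker run; the limit values and the function name run_with_limits are hypothetical and would need tuning for a real application.

```shell
#!/bin/sh
# Sketch: start a container with resource limits.
# All limit values below are hypothetical examples.
run_with_limits() {
  # --memory      caps the memory the container may use
  # --cpus        caps the CPU time (0.5 is half of one CPU)
  # --pids-limit  caps the number of processes/threads
  # --ulimit      caps the number of open file handles (soft:hard)
  docker run --rm \
    --memory 256m \
    --cpus 0.5 \
    --pids-limit 100 \
    --ulimit nofile=1024:1024 \
    "$@"
}
```

For example, run_with_limits -it bash:5.0.17 would start the bash image used later in this document with these limits applied to that one container only.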
Setup Docker
Verify that docker is installed
$ docker --version
Docker version 19.03.8, build afacb8b
Install docker if missing following the instructions: https://docs.docker.com/docker-for-mac/install/
Verify that docker is running
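One way to check this from a terminal is to ask the docker client to talk to the docker daemon. The following is a minimal sketch that relies on docker info failing when the daemon cannot be reached; the function name docker_status is hypothetical.

```shell
#!/bin/sh
# Sketch: report whether docker is installed and whether the daemon responds.
docker_status() {
  if ! command -v docker >/dev/null 2>&1; then
    # The docker client is not on the PATH.
    echo "not installed"
  elif docker info >/dev/null 2>&1; then
    # The client reached the daemon.
    echo "running"
  else
    # The client exists, but the daemon did not respond.
    echo "not running"
  fi
}

docker_status
```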
Working with Docker
A docker hub account is required. Create an account if you do not have one yet.
Work with an existing docker image (created by someone else)
This docker image, bash:5.0.17, is a basic Linux OS that has bash support.
$ docker pull bash:5.0.17
Alternatively, we can run the image immediately using run instead of pull.
You need to be logged in; otherwise, you may get an error similar to the following.
Error response from daemon: Get https://registry-1.docker.io/v2/library/bash/manifests/5.0.17: unauthorized: incorrect username or password
Login
$ docker login --username <YOUR-USERNAME>
Login Succeeded
Run the docker image
$ docker run -it bash:5.0.17
Now you are in the bash:5.0.17 docker container
bash-5.0#
The -i option indicates that we need to interact with the docker container (it keeps the standard input open), while the -t option attaches a terminal. Without -i, we will not be able to interact with the docker container. This is very useful while debugging.
Open another terminal and run
$ docker ps
This will show the running docker containers
CONTAINER ID   IMAGE         COMMAND                  CREATED          STATUS          PORTS   NAMES
110810b3d472   bash:5.0.17   "docker-entrypoint.s…"   53 minutes ago   Up 53 minutes           brave_payne
Some of the information, such as the CONTAINER ID and the NAMES, will be different. Note that rerunning the image will create a new container, which will have a different and independent state from any other containers, even containers for the same image.
Try out some commands
pwd
bash-5.0# pwd
/
ls -la
bash-5.0# ls -la
total 64
drwxr-xr-x 1 root root 4096 May 5 09:54 .
drwxr-xr-x 1 root root 4096 May 5 09:54 ..
-rwxr-xr-x 1 root root 0 May 5 09:54 .dockerenv
drwxr-xr-x 1 root root 4096 Apr 24 22:51 bin
drwxr-xr-x 5 root root 360 May 5 09:54 dev
drwxr-xr-x 1 root root 4096 May 5 09:54 etc
drwxr-xr-x 2 root root 4096 Apr 23 06:25 home
drwxr-xr-x 1 root root 4096 Apr 24 22:51 lib
drwxr-xr-x 5 root root 4096 Apr 23 06:25 media
drwxr-xr-x 2 root root 4096 Apr 23 06:25 mnt
drwxr-xr-x 2 root root 4096 Apr 23 06:25 opt
dr-xr-xr-x 188 root root 0 May 5 09:54 proc
drwx------ 2 root root 4096 Apr 23 06:25 root
drwxr-xr-x 2 root root 4096 Apr 23 06:25 run
drwxr-xr-x 2 root root 4096 Apr 23 06:25 sbin
drwxr-xr-x 2 root root 4096 Apr 23 06:25 srv
dr-xr-xr-x 13 root root 0 May 5 09:54 sys
drwxrwxrwt 1 root root 4096 Apr 24 22:51 tmp
drwxr-xr-x 1 root root 4096 Apr 24 22:51 usr
drwxr-xr-x 1 root root 4096 Apr 24 22:51 var
echo
bash-5.0# echo "Hello Docker"
Hello Docker
env
bash-5.0# env
HOSTNAME=110810b3d472
PWD=/
_BASH_GPG_KEY=7C0135FB088AAF6C66C650B9BB5869F064EA74AB
HOME=/root
_BASH_VERSION=5.0
_BASH_PATCH_LEVEL=0
_BASH_LATEST_PATCH=17
TERM=xterm
SHLVL=1
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
_=/usr/bin/env
Add curl
curl is not available on the image we are using. We can use a different image that already contains curl, or install it ourselves.
⚠️ Note that we are working inside a container and the changes we make apply only to this container. They will be lost once this container is stopped and a new one is started.
bash-5.0# curl http://www.google.com
bash: curl: command not found
Use the package manager available to the OS you are using. Alpine, the OS we are using, uses the apk package manager. A list of available packages is available here.
bash-5.0# apk add curl
fetch http://dl-cdn.alpinelinux.org/alpine/v3.11/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.11/community/x86_64/APKINDEX.tar.gz
(1/4) Installing ca-certificates (20191127-r1)
(2/4) Installing nghttp2-libs (1.40.0-r0)
(3/4) Installing libcurl (7.67.0-r0)
(4/4) Installing curl (7.67.0-r0)
Executing busybox-1.31.1-r9.trigger
Executing ca-certificates-20191127-r1.trigger
OK: 8 MiB in 21 packages
Now we have curl installed
bash-5.0# curl http://www.google.com
<!doctype html><html itemscope="" ...
Note that curl is not part of the image. Only this container has curl installed. curl will not be available on any other container created from the same image.
Stop the container
bash-5.0# exit
$ exit
Start a new container
$ docker run -it bash:5.0.17
bash-5.0#
Try the curl command
bash-5.0# curl http://www.google.com
bash: curl: command not found
Any changes made to a container are not saved to the image, so they are lost when a new container is started from the same image.
How can we run a Java application within a container?
The bash:5.0.17 image does not include Java.
$ docker run -it bash:5.0.17
bash-5.0# java -version
bash: java: command not found
We can use an image which has the Java we need already installed, such as the adoptopenjdk/openjdk14:jre-14.0.1_7-alpine image.
$ docker run -it adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
Note that some examples may also append the command to be executed when the container starts, such as
$ docker run -it adoptopenjdk/openjdk14:jre-14.0.1_7-alpine /bin/sh
The above is instructing docker to open a shell terminal. Note that the image adoptopenjdk/openjdk14:jre-14.0.1_7-alpine does not have bash installed. We can use sh instead. That is why we are running the /bin/sh command instead of /bin/bash.
Check the Java version installed
# java -version
openjdk version "14.0.1" 2020-04-14
OpenJDK Runtime Environment AdoptOpenJDK (build 14.0.1+7)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14.0.1+7, mixed mode, sharing)
This docker image comes with Java 14 already set up. A more important observation is that, given that we are using a specific version of the image, we will always have the same version of Java installed.
How do we get our application into the docker container? The application needs to be dockerized, as described in the Dockerize the Application section.
Dockerize the Application
The Dockerfile text file
Docker will use a text file named Dockerfile to create our image. The Dockerfile file is part of the source code and can be used by the build pipeline to build our docker images and deploy them into production environments.
Steps:
Create the Dockerfile
$ vi Dockerfile
Extend an existing docker image
We can create a docker image from scratch, but this will require lots of effort as we need to install the OS files, the packages we need (such as curl for example) and install the correct version of Java. Alternatively, we can use an existing image from the docker repository that does this for us.
⚠️ Note that many companies have internal docker repositories and only allow images coming from these repositories for security purposes. Such docker repositories scan the images and make sure that these docker images are secure and do not contain any funny business.
FROM adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
The above docker file is extending one of the adoptopenjdk images as defined by the FROM instruction.
Fragments of the docker image adoptopenjdk/openjdk14:jre-14.0.1_7-alpine are shown next.
FROM alpine:3.11
ENV LANG='en_US.UTF-8' LANGUAGE='en_US:en' LC_ALL='en_US.UTF-8'
RUN apk add --no-cache --virtual .build-deps curl binutils \
    && GLIBC_VER="2.31-r0" \
    && ALPINE_GLIBC_REPO="https://github.com/sgerrand/alpine-pkg-glibc/releases/download" \
    && GCC_LIBS_URL="https://archive.archlinux.org/packages/g/gcc-libs/gcc-libs-9.1.0-2-x86_64.pkg.tar.xz" \
    && GCC_LIBS_SHA256="91dba90f3c20d32fcf7f1dbe91523653018aa0b8d2230b00f822f6722804cf08" \
    ...
ENV JAVA_HOME=/opt/java/openjdk \
    PATH="/opt/java/openjdk/bin:$PATH"
Please note that the above is incomplete for brevity and the full example can be found here.
The adoptopenjdk/openjdk14:jre-14.0.1_7-alpine image installs the Adopt OpenJDK 14 and sets the environment. The adoptopenjdk/openjdk14:jre-14.0.1_7-alpine docker image is built on top of another image, the alpine:3.11, shown next.
FROM scratch
ADD alpine-minirootfs-3.11.6-x86_64.tar.gz /
CMD ["/bin/sh"]
The alpine:3.11 image does not depend on anything (as indicated by the FROM scratch instruction) and is referred to as a base image.
If we want to create a docker image from scratch, we need to merge both docker images, and the files they are referring to, into our docker file. Note that this is not necessary in our case, and we will simply extend an existing image.
Steps:
Import from adoptopenjdk/openjdk14:jre-14.0.1_7-alpine in the Dockerfile
FROM adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
Build the new docker image
$ docker build . -t demo:local
Sending build context to Docker daemon 166.9kB
Step 1/1 : FROM adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
 ---> 82f70d1be68e
Successfully built 82f70d1be68e
Successfully tagged demo:local
Run the newly built docker image
$ docker run -it demo:local /bin/sh
#
Verify the Java version
# java -version
openjdk version "14.0.1" 2020-04-14
OpenJDK Runtime Environment AdoptOpenJDK (build 14.0.1+7)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14.0.1+7, mixed mode, sharing)
Check the working directory
# pwd
/
Set the working directory
WORKDIR /opt/app
The WORKDIR instruction defines the directory where our application will be running from.
Why do we need to change the working directory?
Putting our application in a specific directory helps us organise our application better. In some cases, we have more than one file. For example, a web application may contain several files and other web assets. Having such an application in the root directory is a bit messy. Furthermore, the application itself may expose some files found on the OS and return these to the caller. We do not want to return a sensitive file by mistake, just because we deployed our application in the root folder.
Putting our application in a specific directory allows us to limit the access rights for the users that will be used to run our application to just this directory. This helps prevent an attacker from accessing anything else in the docker container by simply taking advantage of a vulnerability within our application.
The working directory of our docker image is / (the root folder). We will change it to /opt/app. The directory does not need to exist and will be created automatically.
Steps:
Add the working directory to the Dockerfile
WORKDIR /opt/app
Build the new docker image
$ docker build . -t demo:local
Sending build context to Docker daemon 166.9kB
Step 1/2 : FROM adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
 ---> 82f70d1be68e
Step 2/2 : WORKDIR /opt/app
 ---> Using cache
 ---> e5738c7ba12f
Successfully built e5738c7ba12f
Successfully tagged demo:local
Note that now we have two steps, one for every line in the Dockerfile
Run the image and print the current working directory
$ docker run -it demo:local /bin/sh
# pwd
/opt/app
Copy our application to docker
We need to copy our JAR file from the local filesystem to the docker image using the COPY instruction.
COPY ./build/libs/demo-all.jar ./application.jar
When docker builds the image, it will copy the file ./build/libs/demo-all.jar to the docker image being created. Here we are also renaming the JAR file to application.jar.
Steps:
Build the project
$ ./gradlew clean build
The JAR file produced by the build task will be used to create the docker image. The docker image relies on the fat JAR file build/libs/demo-all.jar.
This needs to be an executable (fat) JAR containing all dependencies. The JAR file needs to be able to run using just
$ java -jar build/libs/demo-all.jar
This is how docker will run our application
Add the COPY instruction to the Dockerfile
COPY ./build/libs/demo-all.jar ./application.jar
Build the new docker image
$ docker build . -t demo:local
Sending build context to Docker daemon 14.77MB
Step 1/3 : FROM adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
 ---> 82f70d1be68e
Step 2/3 : WORKDIR /opt/app
 ---> Using cache
 ---> e5738c7ba12f
Step 3/3 : COPY ./build/libs/demo-all.jar ./application.jar
 ---> b5ff637e4e91
Successfully built b5ff637e4e91
Successfully tagged demo:local
Note that now we have three steps, one for every instruction we have in the Dockerfile.
Manually run the application
Run the newly built docker image and list the files in the current directory.
$ docker run -it demo:local /bin/sh
# pwd
/opt/app
# ls -l
-rw-r--r-- application.jar
Run the application using the same java -jar command, passing application.jar as the JAR file
# java -jar application.jar
Hello world.
Note that our application was copied into docker, but we have to manually start it.
Make the application run on start-up
CMD ["java", "-jar", "application.jar"]
The CMD instruction instructs the docker container to run the given command when the container starts.
Steps:
Add the CMD instruction to the Dockerfile
CMD ["java", "-jar", "application.jar"]
Build the new docker image
$ docker build . -t demo:local
Sending build context to Docker daemon 14.77MB
Step 1/4 : FROM adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
 ---> 82f70d1be68e
Step 2/4 : WORKDIR /opt/app
 ---> Using cache
 ---> e5738c7ba12f
Step 3/4 : COPY ./build/libs/demo-all.jar ./application.jar
 ---> Using cache
 ---> b5ff637e4e91
Step 4/4 : CMD ["java", "-jar", "application.jar"]
 ---> Running in cd66c5b54493
Removing intermediate container cd66c5b54493
 ---> efab5e9092f4
Successfully built efab5e9092f4
Successfully tagged demo:local
Note that now we have four steps, one for every instruction we have in the Dockerfile.
Run the docker image
$ docker run -t demo:local
Hello world.
Note that we are not using the -i flag anymore. Docker knows what needs to be done and we do not need to interact with it unless we need to debug something. This is how docker will actually run our container.
The complete Dockerfile is shown next
FROM adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
WORKDIR /opt/app
COPY ./build/libs/demo-all.jar ./application.jar
CMD ["java", "-jar", "application.jar"]
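The working directory discussion mentions limiting the access rights of the user that runs our application. The following is a minimal sketch of how that might look, written as a shell script that generates a variant of the Dockerfile. The user and group name app is hypothetical, the addgroup/adduser syntax assumes the Alpine-based image used above, and this variant is an illustration rather than part of the steps in this document.

```shell
#!/bin/sh
# Sketch: generate a Dockerfile that runs the application as a non-root user.
# The user/group name "app" is hypothetical.
workdir="$(mktemp -d)"

cat > "$workdir/Dockerfile" <<'EOF'
FROM adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
# Create an unprivileged system group and user (Alpine/BusyBox syntax).
RUN addgroup -S app && adduser -S -G app app
WORKDIR /opt/app
# Copy the JAR and make the unprivileged user its owner.
COPY --chown=app:app ./build/libs/demo-all.jar ./application.jar
# The instructions that follow, and the container itself, run as "app".
USER app
CMD ["java", "-jar", "application.jar"]
EOF

echo "Dockerfile written to $workdir/Dockerfile"
```

Copying the generated file into the project and building it with the same docker build command used above should produce an image whose container runs as app instead of root.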
Multi-Stage Docker Build
The docker file depends on the JAR file being generated before it runs. Docker can be used to first build the executable JAR and then create the image.
For docker to be able to build the application, it now needs the source code and any other artefacts required to run ./gradlew build. We can copy all files, but that is considered bad practice: ideally docker builds a clean image and does not reuse anything but the source files from our local filesystem. Docker should be able to build the image by simply checking out the source from the repository and then executing docker build.
We have two options to selectively copy the files required by ./gradlew build to run successfully.
Create a .dockerignore file
.classpath
.dockerignore
.git
.gitattributes
.gitignore
.gradle
.idea
.project
.settings
.vscode
Dockerfile
bin
build
gradlew.bat
out
The COPY command will ignore all matching files.
As an alternative to adding a .dockerignore, add multiple COPY commands to the Dockerfile
COPY ./build.gradle .
COPY ./gradle ./gradle
COPY ./gradlew .
COPY ./settings.gradle .
COPY ./src ./src
I personally prefer this option as I intentionally include the files and folders I need to copy. If the IDE generates new files, these are not automatically copied just because I forgot to add them to the .dockerignore.
Clean the project
$ ./gradlew clean
This is not required, but I prefer to remove any artefacts from the local filesystem to rule out any chance that built artefacts are copied by mistake.
Update the Dockerfile, making it a multi-stage docker file
Example using .dockerignore
FROM adoptopenjdk/openjdk14:jdk-14.0.1_7-alpine-slim AS builder
WORKDIR /opt/app
COPY . .
RUN ./gradlew build

FROM adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
WORKDIR /opt/app
COPY --from=builder /opt/app/build/libs/demo.jar ./application.jar
CMD ["java", "-jar", "application.jar"]
Alternatively, copy individual files and folders (preferred option).
FROM adoptopenjdk/openjdk14:jdk-14.0.1_7-alpine-slim AS builder
WORKDIR /opt/app
COPY ./build.gradle .
COPY ./gradle ./gradle
COPY ./gradlew .
COPY ./settings.gradle .
COPY ./src ./src
RUN ./gradlew build

FROM adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
WORKDIR /opt/app
COPY --from=builder /opt/app/build/libs/demo.jar ./application.jar
CMD ["java", "-jar", "application.jar"]
The RUN instruction can be used to run commands, such as installing packages that we need (like curl) or building the project as we did here.
We do not need to include the clean Gradle task here as we only copied the sources without any built artefacts. Therefore, there is nothing to clean as the build directory was not copied.
This time docker will build the JAR and then package it. Unfortunately, it does not take advantage of any caching, which makes it a bit slower. While this is slow for development purposes, it ensures that the build is not relying on caches.
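Building the multi-stage image uses the same docker build command as before. The following is a minimal sketch, assuming the multi-stage Dockerfile above is in the current directory; the function name build_multi_stage and the demo:builder tag are hypothetical, and the second build is optional and only useful for inspecting what the first stage produced.

```shell
#!/bin/sh
# Sketch: build the multi-stage Dockerfile.
# The demo:builder tag is hypothetical.
build_multi_stage() {
  # Build all stages; only the final stage ends up in demo:local.
  docker build . -t demo:local

  # Optionally build just the first stage (named "builder" in the Dockerfile)
  # and list the JAR files it produced.
  docker build . --target builder -t demo:builder
  docker run --rm demo:builder ls -l build/libs
}
```

The --target option stops the build at the named stage, which can help when the Gradle build inside docker fails and we want to look at the intermediate result.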
Run the docker image
$ docker run -it demo:local
Hello world.
For more details, please refer to: https://docs.docker.com/develop/develop-images/multistage-build/.
Multi-stage docker builds are not very common as this feature is not supported by some providers. Furthermore, the functionality provided by the multi-stage docker build is also provided by pipeline tools such as Jenkins and GoCD, and developers/dev-ops prefer to use these as they tend to provide more than just building the project.
Managing Docker Containers
Once an application is built and packaged into a container, this needs to be executed. We ran our application by using the run command
$ docker run -it demo:local
That’s all great for development.
Consider the following:
- What happens if our application becomes unresponsive or crashes?
- What happens if our application experiences more load and new instances need to be started?
- How will we reduce the number of instances running when our application is not being used?
- How will we deploy new versions of our application?
- Can we have blue/green deployments?
- How will we monitor our application?
- How can we access the logs of our application?
There are many more things to consider when running an application, irrespective of docker. Docker enables developers to take advantage of tools that can help us with the above concerns and more, as discussed before (such as security). Following are some (not complete) tools and services we can use to manage our docker containers in a production environment.
Some of the above services are able to work with JAR files directly and we do not need to create a docker container. For example, using AWS Elastic Beanstalk we can deploy the JAR file and have AWS handle the rest.
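Docker itself covers a small part of the questions above, such as restarting a crashed container and reading its logs. The following is a minimal sketch using plain docker commands on a single machine; the container name demo and the function names are hypothetical, and dedicated container management tools go well beyond what is shown here.

```shell
#!/bin/sh
# Sketch: run the application in the background and inspect it.
# The container name "demo" is hypothetical.
run_in_background() {
  # -d detaches from the container; --restart on-failure asks docker to
  # restart the container when the application exits with an error.
  docker run -d --name demo --restart on-failure demo:local
}

follow_logs() {
  # Print the last 100 log lines and keep following new output.
  docker logs --tail 100 -f demo
}

stop_and_remove() {
  # Stop the container and remove it together with its state.
  docker stop demo
  docker rm demo
}
```

This does not address scaling, deployments or monitoring across several servers, which is where the tools and services mentioned above come in.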
Recommended reading
- Docker in Action, Second Edition (O’Reilly Books)
- Docker Essentials: The Definitive Guide to Docker Containerization (O’Reilly Video Series)
- Docker, Dockerfile, and Docker-Compose (2020 Ready!) (O’Reilly Video Series)
- Kubernetes in Action (O’Reilly Books)
- Kubernetes: Up and Running, 2nd Edition (O’Reilly Books)