Link

Making applications portable (containerisation using docker)

Table of contents

  1. What is Docker?
  2. How does this work?
  3. More than just Containers
  4. Setup Docker
  5. Working with Docker
  6. Dockerize the Application
  7. Multi-Stage Docker Build
  8. Managing Docker Containers
  9. Recommended reading

What is Docker?

Say that we write an application with OpenJDK 14 and we would like to deploy this application on a server somewhere within the organisation. The OS where our application will be running needs to have OpenJDK 14 installed otherwise our application will not run.

Another team builds another application, this time using NodeJS 12. The OS where this application will be running needs to have the right version of NodeJS installed.

This raises the following questions

  1. Who will be managing this?
  2. Who will make sure that the correct version of libraries/frameworks platforms are installed?
  3. How can we manage multiple version of libraries/frameworks/platforms running at the same time?
  4. How can we ensure that the application works as expected on the production environment?
  5. How will we save the configuration such that we can scale it to hundreds or thousands of servers?

Gradle helps us building and packaging all application’s dependencies into one fat JAR, but it falls short in setting up the operating system. That’s outside its scope.

Docker is a tool that we can use to bridge this gap and have an environment ready for our application to run on.

How does this work?

In a nutshell, an application is packaged into a Docker Image. For example, if OpenJDK 14 is needed to run the application, we set up Docker image to have OpenJDK 14 installed. If another application requires Oracle JRE 8, we setup a separate Docker image for the second application.

Similar to Java JAR files, docker creates images, which are the unit of work for docker. A docker image can be started using a command similar to the following.

$ docker run my-app

Docker takes the image named my-app and runs it as a docker container. Note that here we switched from docker image to docker container.

An instance of a docker image is called a docker container.

A docker container is a running version of the docker image. If the application produces logs files, these logs files will be in the docker container (not in the docker image).

A docker container can start and stop like any OS. The state of a docker container may or may not be preserved between different runs. With that said, do not rely on the container state in production. Docker Volumes can be used to address this issue, but this goes way beyond the scope of this literature.

More than just Containers

Docker provides more than just the correct configuration.

  1. Limited Access

    Consider the case where two or more applications are running on the same OS. One of the applications may be able (intentionally or unintentionally) be able to access files saved by another application running on the same OS.

    Someone needs to make sure that an application does not access resources that belong to another application.

    Secrets used to access resources, such as databases, may be saved as environment variables. OS scope environment variables are available to all applications running on the same OS. Any application running on the same OS will be able to access the secrets that belong to another application.

  2. Limit attack surface area

    Say that we have a server with several applications running on it and one of these applications have a security vulnerability.

    Vulnerabilities may come from different places including libraries used by an application or the platform on which it runs.

    The other applications running on the same OS may be affected by this vulnerability too.

  3. Limit Damage

    An application may misbehave such that it causes the OS to misbehave, perform poorly or crash

    1. Consume more memory than planned
    2. Open too many files handles
    3. Open too many threads

    It is not always easy to configure the OS such that it limits the resources each application uses.

Setup Docker

  1. Verify that docker is installed

    $ docker --version
    Docker version 19.03.8, build afacb8b
    

    Install docker if missing following the instructions: https://docs.docker.com/docker-for-mac/install/

  2. Verify that docker is running

    Docker Desktop

Working with Docker

  1. A docker hub account is required. Create an account if you do not have one yet.

  2. Work with an existing docker image (created by someone else)

    This docker image bash:5.0.17 is a basic Linux OS that has bash support.

    $ docker pull bash:5.0.17
    

    Alternative, we can run the image immediately using run instead of pull.

    You need to be logged in, otherwise you will get an error similar to the following.

    Error response from daemon: Get https://registry-1.docker.io/v2/library/bash/manifests/5.0.17: unauthorized: incorrect username or password
    

    Login

    $ docker login --username <YOUR-USERNAME>
    Login Succeeded
    
  3. Run the docker image

    $ docker run -it bash:5.0.17
    

    Now you are in the bash:5.0.17 docker container

    bash-5.0#
    

    The -i option indicates that we need to interact with the docker container. Without it, we will not be able to interact with the docker container. This is very useful while debugging.

  4. Open another terminal and run

    $ docker ps
    

    This will show the running docker containers

    CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
    110810b3d472        bash:5.0.17         "docker-entrypoint.s…"   53 minutes ago      Up 53 minutes                           brave_payne
    

    Some of the information, such as the CONTAINER ID and the NAMES, will be different. Note that rerunning the image will create a new container which will have different and independent state from any other containers, even containers for the same image.

  5. Try out some commands

    1. pwd

      bash-5.0# pwd
      /
      
    2. ls -la

      bash-5.0# ls -la
      total 64
      drwxr-xr-x    1 root     root          4096 May  5 09:54 .
      drwxr-xr-x    1 root     root          4096 May  5 09:54 ..
      -rwxr-xr-x    1 root     root             0 May  5 09:54 .dockerenv
      drwxr-xr-x    1 root     root          4096 Apr 24 22:51 bin
      drwxr-xr-x    5 root     root           360 May  5 09:54 dev
      drwxr-xr-x    1 root     root          4096 May  5 09:54 etc
      drwxr-xr-x    2 root     root          4096 Apr 23 06:25 home
      drwxr-xr-x    1 root     root          4096 Apr 24 22:51 lib
      drwxr-xr-x    5 root     root          4096 Apr 23 06:25 media
      drwxr-xr-x    2 root     root          4096 Apr 23 06:25 mnt
      drwxr-xr-x    2 root     root          4096 Apr 23 06:25 opt
      dr-xr-xr-x  188 root     root             0 May  5 09:54 proc
      drwx------    2 root     root          4096 Apr 23 06:25 root
      drwxr-xr-x    2 root     root          4096 Apr 23 06:25 run
      drwxr-xr-x    2 root     root          4096 Apr 23 06:25 sbin
      drwxr-xr-x    2 root     root          4096 Apr 23 06:25 srv
      dr-xr-xr-x   13 root     root             0 May  5 09:54 sys
      drwxrwxrwt    1 root     root          4096 Apr 24 22:51 tmp
      drwxr-xr-x    1 root     root          4096 Apr 24 22:51 usr
      drwxr-xr-x    1 root     root          4096 Apr 24 22:51 var
      
    3. echo

      bash-5.0# echo "Hello Docker"
      Hello Docker
      
    4. env

      bash-5.0# env
      HOSTNAME=110810b3d472
      PWD=/
      _BASH_GPG_KEY=7C0135FB088AAF6C66C650B9BB5869F064EA74AB
      HOME=/root
      _BASH_VERSION=5.0
      _BASH_PATCH_LEVEL=0
      _BASH_LATEST_PATCH=17
      TERM=xterm
      SHLVL=1
      PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
      _=/usr/bin/env
      
  6. Add curl

    curl is not available on the image we are using. We can use a different image that already contains curl, or install it ourselves.

    ⚠️ Note that we are working inside a container and all changes we make to this container will be lost once this container is stopped.

    bash-5.0# curl http://www.google.com
    bash: curl: command not found
    

    Use the package manager available to the OS you are using. Alpine, the OS we are using, uses the apk package manager. A list of available packages is available here.

    bash-5.0# apk add curl
    fetch http://dl-cdn.alpinelinux.org/alpine/v3.11/main/x86_64/APKINDEX.tar.gz
    fetch http://dl-cdn.alpinelinux.org/alpine/v3.11/community/x86_64/APKINDEX.tar.gz
    (1/4) Installing ca-certificates (20191127-r1)
    (2/4) Installing nghttp2-libs (1.40.0-r0)
    (3/4) Installing libcurl (7.67.0-r0)
    (4/4) Installing curl (7.67.0-r0)
    Executing busybox-1.31.1-r9.trigger
    Executing ca-certificates-20191127-r1.trigger
    OK: 8 MiB in 21 packages
    

    Now we have curl installed

    bash-5.0# curl http://www.google.com
    <!doctype html><html itemscope="" ...
    

    Note that curl is not part of the image. Only this container has curl installed. The curl will not be available on any other container for the same image.

    1. Stop the container

      bash-5.0# exit
      $ exit
      
    2. Start a new container

      $ docker run -it bash:5.0.17
      bash-5.0#
      
    3. Try the curl command

      bash-5.0# curl http://www.google.com
      bash: curl: command not found
      

    Any changes made to a container are lost once the container is stopped.

  7. How can we run a Java application within a container?

    The bash:5.0.17 image does not include Java.

    $ docker run -it bash:5.0.17
    bash-5.0# java -version
    bash: java: command not found
    

    We can use an image which have the Java we need already installed, such as the adoptopenjdk/openjdk14:jre-14.0.1_7-alpine image.

    $ docker run -it adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
    

    Note that some examples may also append the command to be executed when the container starts, such as

    $ docker run -it adoptopenjdk/openjdk14:jre-14.0.1_7-alpine /bin/sh
    

    The above is instructing docker to open a shell terminal. Note that the image adoptopenjdk/openjdk14:jre-14.0.1_7-alpine does not have bash installed. We can use the sh instead. That is why we are running the /bin/sh command instead of /bin/bash.

    Check the Java version installed

    # java -version
    openjdk version "14.0.1" 2020-04-14
    OpenJDK Runtime Environment AdoptOpenJDK (build 14.0.1+7)
    OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14.0.1+7, mixed mode, sharing)
    

    This docker image comes with Java 14 already setup. A more important observation is that given that we are using a specific version of the image, we will always have the same version of Java installed.

    How do we get our application in the docker container? The application needs to be dockerize, described in the Dockerize the Application section.

Dockerize the Application

  1. The Dockerfile text file

    Docker will use a text file named Dockerfile to create our image. The Dockerfile file is part of the source code and can be used by the build pipeline to build our docker images and deploy them into production environments.

    Steps:

    1. Create the Dockerfile

      $ vi Dockerfile
      
  2. Extend an existing docker image

    We can create a docker image from scratch, but this will require lots of effort as we need to install the OS files, the packages we need (such as curl for example) and install the correct version of Java. Alternatively, we can use an existing image from the docker repository that does this for us.

    ⚠️ Note that many companies, have internal docker repositories and only allow images coming from these repositories for security purposes. Such docker repositories scan the images and make sure that these docker images are secure and do not contain any funny business.

    FROM adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
    

    The above docker file is extending one of the adoptopenjdk images as defined by the FROM instruction.

    Fragments of the docker image adoptopenjdk/openjdk14:jre-14.0.1_7-alpine are shown next.

    FROM alpine:3.11
    
    ENV LANG='en_US.UTF-8' LANGUAGE='en_US:en' LC_ALL='en_US.UTF-8'
    
    RUN apk add --no-cache --virtual .build-deps curl binutils \
        && GLIBC_VER="2.31-r0" \
        && ALPINE_GLIBC_REPO="https://github.com/sgerrand/alpine-pkg-glibc/releases/download" \
        && GCC_LIBS_URL="https://archive.archlinux.org/packages/g/gcc-libs/gcc-libs-9.1.0-2-x86_64.pkg.tar.xz" \
        && GCC_LIBS_SHA256="91dba90f3c20d32fcf7f1dbe91523653018aa0b8d2230b00f822f6722804cf08" \
    
    ...
    
    ENV JAVA_HOME=/opt/java/openjdk \
        PATH="/opt/java/openjdk/bin:$PATH"
    

    Please note that the above is incomplete for brevity and the full example can be found here.

    The adoptopenjdk/openjdk14:jre-14.0.1_7-alpine installs the Adopt OpenJDK 14 and set the environment. The adoptopenjdk/openjdk14:jre-14.0.1_7-alpine docker image is built on top of another image, the alpine:3.11, shown next.

    FROM scratch
    ADD alpine-minirootfs-3.11.6-x86_64.tar.gz /
    CMD ["/bin/sh"]
    

    The alpine:3.11 does not depend on anything (as indicated by the FROM scratch instruction) and is referred to as base image.

    If we want to create a docker image from scratch, we need to merge both docker images, and the files they are referring to, into our docker file. Note that this is necessary in our case, and will simply extend an existing image.

    Steps:

    1. Import from adoptopenjdk/openjdk14:jre-14.0.1_7-alpine in the Dockerfile

      FROM adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
      
    2. Build the new docker image

      $ docker build . -t demo:local
      Sending build context to Docker daemon  166.9kB
      Step 1/1 : FROM adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
       ---> 82f70d1be68e
      Successfully built 82f70d1be68e
      Successfully tagged demo:local
      
    3. Run the newly built docker image

      $ docker run -it demo:local /bin/sh
      #
      
    4. Verify the Java version

      # java -version
      openjdk version "14.0.1" 2020-04-14
      OpenJDK Runtime Environment AdoptOpenJDK (build 14.0.1+7)
      OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14.0.1+7, mixed mode, sharing)
      
    5. Check the working directory

      # pwd
      /
      
  3. Set the working directory

    WORKDIR /opt/app
    

    The WORKDIR instruction defines directory where our application will be running from.

    Why do we need to change the working directory?

    1. Putting our application in a specific directory helps us organise our application better. In some cases, we have more than one file. For example, a web application may contain several files and other web assets. Having such application in the root directory is a bit messy. Furthermore, the application itself may expose some files found on the OS and return these to the caller. We do not want to return a sensitive file by mistake, just because we deployed our application in the root folder.

    2. Putting our application in a specific directory allows us to limit the access rights for the users that will be used to run our application to just this directory. This will prevent an attacker, accessing anywhere else in the docker container by simply taking advantage of a vulnerability within our application.

    The working directory of our docker image is / (the root folder). We will change this it to /opt/app. The directory does not need to exist and will be created automatically.

    Steps:

    1. Add the working directory to the Dockerfile

      WORKDIR /opt/app
      
    2. Build the new docker image

      $ docker build . -t demo:local
      Sending build context to Docker daemon  166.9kB
      Step 1/2 : FROM adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
       ---> 82f70d1be68e
      Step 2/2 : WORKDIR /opt/app
       ---> Using cache
       ---> e5738c7ba12f
      Successfully built e5738c7ba12f
      Successfully tagged demo:local
      

    Note that now we have two steps, one for every line in the Dockerfile

    1. Run the image and print the current working directory

      $ docker run -it demo:local /bin/sh
      # pwd
      /opt/app
      
  4. Copy our application to docker

    We need to copy our JAR file from the local filesystem to the docker image using the COPY instruction.

    COPY ./build/libs/demo-all.jar ./application.jar
    

    When docker builds the image, it will copy the file ./build/libs/demo-all.jar to the docker image being created. Here we are also renaming the JAR file to application.jar.

    Steps:

    1. Built the project

      ./gradlew clean build
      

      The JAR file produced by the build task will be used to create the docker image. The docker image relies on the fat JAR file

      build/libs/demo-all.jar
      

      This needs to be an executable (fat) JAR containing all dependencies. The JAR file needs to be able to run using just

      $ java -jar build/libs/demo.jar
      

      This is how docker will run our application

    2. Add the COPY instruction to the Dockerfile

      COPY ./build/libs/demo-all.jar ./application.jar
      
    3. Build the new docker image

      $ docker build . -t demo:local
      Sending build context to Docker daemon  14.77MB
      Step 1/3 : FROM adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
       ---> 82f70d1be68e
      Step 2/3 : WORKDIR /opt/app
       ---> Using cache
       ---> e5738c7ba12f
      Step 3/3 : COPY ./build/libs/demo-all.jar ./application.jar
       ---> b5ff637e4e91
      Successfully built b5ff637e4e91
      Successfully tagged demo:local
      

      Note that now we have three steps, one for every instruction we have in the Dockefile.

    4. Manually run the application

      Run the newly built docker image and list the files in the current directory.

      $ docker run -it demo:local /bin/sh
      # pwd
      /opt/app
      # ls -l
      -rw-r--r-- application.jar
      

      Run the application using the same java -jar command passing application.jar as the JAR file

      # java -jar application.jar
      Hello world.
      

    Note that our application was copied into docker, but we have to manually start it.

  5. Make the application to run on start-up

    CMD ["java", "-jar", "application.jar"]
    

    The CMD instruction instructs docker container to run the given command when the container starts.

    Steps:

    1. Add the CMD instruction to the Dockerfile

      CMD ["java", "-jar", "application.jar"]
      
    2. Build the new docker image

      $ docker build . -t demo:local
      Sending build context to Docker daemon  14.77MB
      Step 1/4 : FROM adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
       ---> 82f70d1be68e
      Step 2/4 : WORKDIR /opt/app
       ---> Using cache
       ---> e5738c7ba12f
      Step 3/4 : COPY ./build/libs/demo-all.jar ./application.jar
       ---> Using cache
       ---> b5ff637e4e91
      Step 4/4 : CMD ["java", "-jar", "application.jar"]
       ---> Running in cd66c5b54493
      Removing intermediate container cd66c5b54493
       ---> efab5e9092f4
      Successfully built efab5e9092f4
      Successfully tagged demo:local
      

      Note that now we have four steps, one for every instruction we have in the Dockefile.

    3. Run the docker image

      $ docker run -t demo:local
      Hello world.
      

      Note that we are not using the -i flag anymore. Docker knows what needs to be done and we do not need to interact with it unless we need to debug something. This is how docker will actually run our container.

The complete Dockerfile is shown next

FROM adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
WORKDIR /opt/app
COPY ./build/libs/demo.jar ./application.jar
CMD ["java", "-jar", "application.jar"]

Multi-Stage Docker Build

The docker file depends on the JAR file to be generated before it runs. Docker can be used to first build the executable JAR and then creates the image.

For docker to be able to build the application it now needs the source code and any other artefacts required to run ./gradlew build. We can copy all files, but that’s considered as bad practice as ideally docker build a clean image and should not reuse anything else but the source files from our local filesystem. Docker should be able to build the image by simply checking out the source from the repository and the execute docker build.

We have two options to selectively copy the files required by the ./gradlew build to successfully run.

  1. Create .dockerignore file

    .classpath
    .dockerignore
    .git
    .gitattributes
    .gitignore
    .gradle
    .idea
    .project
    .settings
    .vscode
    Dockerfile
    bin
    build
    gradlew.bat
    out
    

    The COPY command will ignore all matching files.

    Alternatively to adding a .dockerignore, add multiple COPY commands to the Dockerfile

    COPY ./build.gradle .
    COPY ./gradle ./gradle
    COPY ./gradlew .
    COPY ./settings.gradle .
    COPY ./src ./src
    

    I personally prefer this option as I intentionally include the file and folders I need to copy. If the IDE generates new files, these are not automatically copied as I forgot to include them to the .dockerignore.

  2. Clean the project

    $ ./gradlew clean
    

    This is not required, but I prefer to remove any artefacts from the local filesystem to iron out any chances that built artefacts are copied by mistake.

  3. Update the dockerfile making it a multi-stage docker file

    Example using .dockerignore

    FROM adoptopenjdk/openjdk14:jdk-14.0.1_7-alpine-slim AS builder
    WORKDIR /opt/app
    COPY . .
    RUN ./gradlew build
    
    FROM adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
    WORKDIR /opt/app
    COPY --from=builder /opt/app/build/libs/demo.jar ./application.jar
    CMD ["java", "-jar", "application.jar"]
    

    Alternatively, copy individual files and folders (preferred option).

    FROM adoptopenjdk/openjdk14:jdk-14.0.1_7-alpine-slim AS builder
    WORKDIR /opt/app
    COPY ./build.gradle .
    COPY ./gradle ./gradle
    COPY ./gradlew .
    COPY ./settings.gradle .
    COPY ./src ./src
    RUN ./gradlew build
    
    FROM adoptopenjdk/openjdk14:jre-14.0.1_7-alpine
    WORKDIR /opt/app
    COPY --from=builder /opt/app/build/libs/demo.jar ./application.jar
    CMD ["java", "-jar", "application.jar"]
    

    The RUN instruction can be used to run command, such as install packages that we need (like curl) or build the project as we did here.

    We do not need to include the clean Gradle task here as we only copied the sources without any built artefacts. Therefore, there is nothing to clean as the build directory was not copied.

    This time docker will build the JAR and then package it. Unfortunately, it does not take advantage of any caching and makes it a bit slower. While this is slow for development purposes, it ensures that the build is not relying on caches.

  4. Run the docker image

    $ docker run -it demo:local
    Hello world.
    

For more details, please refer to: https://docs.docker.com/develop/develop-images/multistage-build/.

Multi-stage docker images are not very common as this feature is not supported by some providers. Furthermore, the functionality provided by the Multi-stage docker built is also provided by the Pipeline tools such as Jenkins and GoCD and developers/dev-ops prefer to use this as they tend to provide more than just build the project.

Managing Docker Containers

Once an application is built and packaged into a container, this needs to be executed. We ran our application by using the run command

$ docker run -it demo:local

That’s all great for development.

Consider the following:

  1. What happens if our application becomes unresponsive or crashes?
  2. What happens if our application experiences more load and new instances need to be started?
  3. How will we reduce the number of instances running when our application is not being used?
  4. How will we deploy new versions of our application?
  5. Can we have red/green deployments?
  6. How will we monitor our application?
  7. How can we access the logs of our application?

There are many more things to consider when running an application irrespective from docker. Docker enables developers to take advantage of tools that can help us with the above concerns and more as discussed before (such as security). Following are some (not complete) tools and services we can use to manage our docker containers in a production environment.

  1. AWS
  2. Google Cloud
  3. Microsoft Azure
  4. Digital Ocean
  5. Kubernetes
  6. Docker Swarm
  7. Portainer
  8. Rancher

Some of the above services are able work with JAR files directly and we do not need to create a docker container. For example, using AWS Elastic Beanstalk we can deploy the JAR file and have AWS handling the rest.

  1. Docker in Action, Second Edition (O’Reilly Books)
  2. Docker Essentials: The Definitive Guide to Docker Containerization (O’Reilly Video Series)
  3. Docker, Dockerfile, and Docker-Compose (2020 Ready!) (O’Reilly Video Series)
  4. Kubernetes in Action (O’Reilly Books)
  5. Kubernetes: Up and Running, 2nd Edition (O’Reilly Books)