Docker in NutShell

Docker is an open platform for developing, shipping, and running applications. Docker is designed to deliver your applications faster. With Docker you can separate your applications from your infrastructure.

Docker does this by combining kernel containerization features with workflows and tooling that help you manage and deploy your applications. At its core, Docker provides a way to run almost any application securely isolated in a container. The isolation and security allow you to run many containers simultaneously on your host.

Docker’s container-based platform allows for highly portable workloads. Docker containers can run on a developer’s local host, on physical or virtual machines in a data center, or in the Cloud.

What are the major Docker components?

Docker has two major components:

  • Docker Engine: the open source containerization platform.
  • Docker Hub: Docker's Software-as-a-Service platform for sharing and managing Docker images.

What is Docker’s architecture?

Docker uses a client-server architecture. The Docker client talks to the Docker daemon, which does the heavy lifting of building, running, and distributing your Docker containers. Both the Docker client and the daemon can run on the same system, or you can connect a Docker client to a remote Docker daemon.
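As a rough illustration of the client-server split, the sketch below wraps two client calls: one sent to whichever daemon the client currently points at, and one sent to a daemon on another machine through the client's `-H` option. The function names and the address in the usage line are placeholders, not values taken from this page.

```shell
# Ask the daemon the client is currently pointed at for its version.
# `docker version` reports a Client section and a Server (daemon) section.
local_daemon_version() {
  docker version
}

# Send the same request to a daemon running on another machine.
# $1 is the daemon address (hypothetical example: tcp://build-host.example:2376).
remote_daemon_version() {
  docker -H "$1" version
}
```

For example, `remote_daemon_version tcp://build-host.example:2376` would query a remote daemon, assuming one is listening at that address and is reachable from your machine.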

How does a Docker image work?

Docker images are read-only templates from which Docker containers are launched. Each image consists of a series of layers. Docker makes use of union file systems to combine these layers into a single image. Union file systems allow files and directories of separate file systems, known as branches, to be transparently overlaid, forming a single coherent file system.

One of the reasons Docker is so lightweight is these layers. When you change a Docker image, for example to update an application to a new version, a new layer gets built. Rather than replacing or entirely rebuilding the whole image, as you might do with a virtual machine, only that layer is added or updated. You then only need to distribute the update rather than a whole new image, which makes distributing Docker images faster and simpler.
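One way to look at the layers of an image is the `docker history` command. The helper below is only a sketch: the function name is made up for this example, and it assumes the image you pass in has already been pulled or built locally.

```shell
# List the layers of an image, newest first, together with the
# instruction that created each layer and the layer's size.
# $1 is an image name, for example ubuntu.
show_layers() {
  docker history "$1"
}
```

Running `show_layers ubuntu` before and after rebuilding an image is a simple way to see which layers were reused and which were added.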

Every image starts from a base image, for example ubuntu (a base Ubuntu image) or fedora (a base Fedora image). You can also use images of your own as the basis for a new image.

Docker images are then built from these base images using a simple, descriptive set of steps we call instructions. Each instruction creates a new layer in our image. Instructions include actions like:

  • Run a command
  • Add a file or directory
  • Create an environment variable
  • Specify the process to run when launching a container from this image

These instructions are stored in a file called a Dockerfile. A Dockerfile is a text-based script that contains the instructions and commands for building an image from a base image. Docker reads this Dockerfile when you request a build of an image, executes the instructions, and returns a final image.

# Base Image
FROM biocontainers/biocontainers:latest

# Metadata
LABEL base.image="biocontainers:latest"
LABEL version="3"
LABEL software="Comet"
LABEL software.version="2016012"
LABEL description="an open source tandem mass spectrometry sequence database search tool"
LABEL website=""
LABEL documentation=""
LABEL license=""
LABEL tags="Proteomics"

# Maintainer
MAINTAINER Felipe da Veiga Leprevost <[email protected]>

USER biodocker

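# ZIP is inferred from the directory name used below; DOWNLOAD_URL is a
# placeholder for the archive's location, which is not given here.
RUN ZIP=comet_binaries_2016012.zip && \
  wget DOWNLOAD_URL/$ZIP -O /tmp/$ZIP && \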
  unzip /tmp/$ZIP -d /home/biodocker/bin/Comet/ && \
  chmod -R 755 /home/biodocker/bin/Comet/* && \
  rm /tmp/$ZIP

RUN mv /home/biodocker/bin/Comet/comet_binaries_2016012/comet.2016012.linux.exe /home/biodocker/bin/Comet/comet

ENV PATH=/home/biodocker/bin/Comet:$PATH

WORKDIR /data/

CMD ["comet"]
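To turn a Dockerfile like the one above into an image, you ask Docker to build it. The following is a minimal sketch: `build_image` is a made-up helper name, and the tag `comet:2016012` in the usage line is only an example.

```shell
# Build an image from the Dockerfile found in a directory.
# $1 is the name (tag) to give the image, $2 is the directory that
# contains the Dockerfile and any files it needs.
build_image() {
  docker build -t "$1" "$2"
}
```

For example, `build_image comet:2016012 .` would build from the Dockerfile in the current directory, after which `docker run comet:2016012` would start a container that runs the process named in the CMD instruction.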

How does a container work?

A container consists of an operating system, user-added files, and metadata. As we’ve seen, each container is built from an image. That image tells Docker what the container holds, what process to run when the container is launched, and a variety of other configuration data. The Docker image is read-only.

When Docker runs a container from an image, it adds a read-write layer on top of the image (using a union file system as we saw earlier) in which your application can then run.
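The read-write layer can be observed with `docker diff`, which lists what a container has changed relative to its image. The sketch below assumes the ubuntu image is available; the function name, the container name you pass in, and the file `/hello.txt` are arbitrary choices for this example.

```shell
# Create a file inside a new container, then list what changed in the
# container's read-write layer compared with the read-only image.
# $1 is a name for the temporary container.
show_container_changes() {
  docker run --name "$1" ubuntu touch /hello.txt
  docker diff "$1"   # marks paths as added (A), changed (C) or deleted (D)
  docker rm "$1"     # remove the stopped container again
}
```

Calling `show_container_changes demo` should report `/hello.txt` as an added path, while the ubuntu image itself stays unchanged.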

When you run a container, either by using the docker binary or via the API, the Docker client tells the Docker daemon to run a container.

$ docker run -i -t ubuntu /bin/bash

Here the Docker client is launched using the docker binary, and the run option tells it to start a new container. At a minimum, the Docker client needs to tell the Docker daemon:

  • What Docker image to build the container from, for example: ubuntu
  • The command you want to run inside the container when it is launched, for example: /bin/bash

DockerHub Automated Builds

DockerHub provides limited resources for automated builds. Depending on how your program compiles, the build of your image may fail because of these limits. When we last checked, DockerHub provided the following resources:

  • 2 hours of build time
  • 2 GB RAM
  • 1 CPU
  • 30 GB Disk Space

If you want to containerize your software, make sure that the build does not use more than the resources provided; otherwise the image will not be available through the Docker Registry.
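Before relying on an automated build, it can help to time the build locally. The helper below is a rough sketch, not a DockerHub tool: the function name is invented, the 7200-second limit is simply the 2 hours quoted above, and a local machine will usually have more CPU and memory than the automated builder, so a local build that is close to the limit is already a warning sign.

```shell
BUILD_LIMIT_SECONDS=7200   # the 2-hour build limit quoted above

# Build an image and report how long the build took.
# $1 is the image tag, $2 is the directory containing the Dockerfile.
timed_build() {
  start=$(date +%s)
  docker build -t "$1" "$2"
  status=$?
  elapsed=$(( $(date +%s) - start ))
  echo "build took ${elapsed}s (limit ${BUILD_LIMIT_SECONDS}s)"
  if [ "$elapsed" -gt "$BUILD_LIMIT_SECONDS" ]; then
    echo "warning: this build would exceed the automated build time limit" >&2
  fi
  return "$status"
}
```

For example, `timed_build mytool:test .` builds the Dockerfile in the current directory and prints the elapsed time next to the limit.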

How to avoid Timeouts

One of the problems we are facing with some images is the TimeOut error. This happens when the compilation takes more than 2 hours. To avoid this, one strategy is to break your build into separate Dockerfile recipes, and then use the FROM instruction to import them.
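The splitting strategy can be sketched as two recipes, where the second starts FROM the image produced by the first. Everything named here is hypothetical: the image names `myorg/mytool-deps` and `myorg/mytool`, the file names, and the build steps inside each recipe stand in for whatever your software actually needs.

```shell
# Recipe 1: the slow part of the build (here, installing a compiler
# toolchain), built and published as an image of its own.
cat > Dockerfile.deps <<'EOF'
FROM ubuntu
RUN apt-get update && apt-get install -y build-essential
EOF

# Recipe 2: starts from the image produced by recipe 1, so only the
# remaining steps count against the time limit of this build.
cat > Dockerfile.app <<'EOF'
FROM myorg/mytool-deps
COPY . /src
RUN make -C /src
EOF

# Build the two recipes in order; the second build is skipped if the first fails.
build_in_stages() {
  docker build -t myorg/mytool-deps -f Dockerfile.deps . &&
  docker build -t myorg/mytool -f Dockerfile.app .
}
```

Calling `build_in_stages` reproduces the chain locally. For automated builds, each recipe would be set up as its own build, with the first image published before the second build starts, so that each build stays under the limit on its own.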