🐋 Docker

Updated at 2017-02-13 01:46

Docker encapsulates processes inside Linux containers which feel a lot like lightweight virtual machines.

OS X  > boot2docker (Linux VM) > Docker Image > Docker Container
Linux > Docker Image > Docker Container

Docker sits between deployment automation and virtualization software. It's all about encapsulated environments. Docker containers are a thin layer on top of the host OS, containing the application and its dependencies, while using the host kernel to run the processes.

Deployment Automation: Puppet, Chef, Ansible
-> Docker
Virtualization Software: VirtualBox, Vagrant

Docker is usually used at the top level of the development stack, e.g. on top of Vagrant:

    - works on all machines, giving a uniform dev and production environment
    - folder sync
    - port forwarding
    - Vagrant runs the machine image, which:
        - installs Docker
        - downloads Docker images
        - starts Docker containers
            - one Docker container for each application

Docker Hub is the central registry for Docker images. You can host images yourself, but Docker Hub is like GitHub for images, private or public. Other options are AWS ECR and dogestry.

docker images # Lists the container images on the current host.
docker login  # Saves credentials to .docker/config.json
docker pull user/image-name:tag
docker push user/image-name:tag

You use boot2docker or Docker Toolbox on non-Linux machines. Docker Toolbox contains a bare bones Linux VM.

# execute Docker Quickstart Terminal in Applications after install

Images are Docker's bread and butter. Container images are used to create running containers. Each running container has its own file system, libraries, shell, etc.

# Searching for public container images.
docker search <KEYWORD_IN_IMAGE_NAME>
docker search tutorial

# Downloading a container image from the registry.
docker pull learn/tutorial
# The tag defaults to `latest` if it's missing.

# You run the containers.
# When the command processing stops, the container stops.
docker run <USER>/<IMAGE_REPOSITORY_NAME> # Use command specified in the image.

You turn manipulated containers into images. After downloading an image, you can run commands inside a container and then commit the changes.

docker run learn/tutorial apt-get install -y ping
docker ps -l # Find out the identifier for the new container version.
docker commit <CONTAINER_ID> learn/pinger:1 # Saving changes with a new name.
# => Returns the identifier for the new container image e.g. effb66b31edb

Containers can be run in interactive mode or daemon mode.

docker run name:tag echo "Hello World!"
docker run -it name:tag /bin/bash
docker run -d name:tag sleep 10

You can get a lot more details about containers and images.

docker run -d alpine sleep 300

# Show all running containers.
docker ps
docker ps -a # even stopped ones

# Shows CPU, memory and net usage in an automatically updating table.
docker stats

# Show a message per container event like start, die, etc.
docker events

# Show what is running inside a container.
# Note that the user name might sometimes be a bit off because of namespaces.
docker top <CONTAINER_ID>

# Full information about a container.
docker inspect <IMAGE_OR_CONTAINER_ID>

# See image or container environmental variables:
docker inspect -f "{{ .Config.Env }}" <IMAGE_OR_CONTAINER_ID>

Anything that goes to stdout or stderr inside a container is sent to a configurable destination, which is a local JSON file by default. That is what docker logs shows. This is fine for low volumes, but consider that:

  • Logging inside the container is usually best, as high-volume logging to the JSON file backend might slow down the Docker daemon.
  • Logging from your application to a remote server / syslog works, but can't be done for 3rd-party software you don't control.
  • Logging from your in-container process manager or wrapper to a remote server also works; e.g. the New Relic supervisor plugin or the Spotify syslog redirector.
  • Logging outside the container with another application like Logspout.
  • You can also just disable logging if you really want to: --log-driver=none.

docker logs <CONTAINER_ID>
docker logs -f <CONTAINER_ID>

Docker images and containers are made up of stacked file system layers identified by hashes. It's a lot like how Git works.

# Lists image layers with commit comments and executed command:
docker history <IMAGE_ID>

# Shows what has changed in a container:
docker run alpine sh -c 'mkdir gorilla && echo "King Kong" > gorilla/test.txt'
docker ps -a
docker diff <CONTAINER_ID>
# A /gorilla
# A /gorilla/test.txt

You can tag specific layers. They act as a version control mechanism.

docker tag <IMAGE_ID> user/image-name:image-tag
docker tag d42a you/my-app:v1
docker push you/my-app:v1
docker pull you/my-app:v1

Avoid using the latest tag (i.e. not specifying a tag). latest is the default tag given by build -t, commit, pull etc. when a tag is not specified, so latest basically means "the latest build without a specific tag". If you ever specify a tag, latest will no longer point to the latest image. You can avoid a lot of hassle by just using version tags all the time.

Images and containers can be removed with rmi and rm.

docker images
docker rmi <IMAGE_ID>
docker rmi <IMAGE_ID>:<TAG>
docker rmi -f <IMAGE_ID>:<TAG> # force removal, e.g. when the image has multiple tags

docker ps -l
docker rm <CONTAINER_ID>
docker rm -f <CONTAINER_ID> # also stops if running

Building a custom image is easy enough.

docker build -t ruksi/redis . # Build the current directory using its Dockerfile.
docker images # See that the new image was added: "ruksi/redis"
docker run -p 6379:6379 ruksi/redis # -p publishes a port, check for any errors
docker run -p 6379:6379 -d ruksi/redis # -d runs in the background
docker ps # You can find the container id here.
docker logs <CONTAINER_ID> # Shows what has been printed while running.
docker stop <CONTAINER_ID>

Docker build instructions are in a Dockerfile. Each instruction becomes a layer, so you might want to do everything in a single script that you COPY into the container.

FROM <IMAGE>:<TAG>             # Set the base image, must come first.
MAINTAINER <AUTHOR_NAME>       # Set who is the author of this Dockerfile.
RUN <COMMAND>                  # Execute a command inside the container.
COPY /path/to/* /path/to       # Copy files from host to container.
# ADD /path/to/* /path/to      # COPY with some magic like unarchiving tars, avoid.
CMD ["command", "param1"]      # Default command when starting the container.
ENTRYPOINT ["command", "param1"] # Make the container act as an executable.
EXPOSE <PORT>                  # Document that the container listens on this port.
WORKDIR /path/to               # Set the working directory.
ENV <KEY> <VALUE>              # Set environment variables.
USER <UID_OR_NAME>             # Run as this user, defaults to root.
VOLUME ["/path/to"]            # Create a mount point for volumes.
# You execute the Dockerfile with `build`
docker build -t you/image:tag .
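Put together, the instructions above might combine into something like this minimal sketch (the base image, file names and values are made up for illustration):

```dockerfile
FROM alpine:3.5
MAINTAINER you@example.com
WORKDIR /app
COPY app.sh /app/
ENV APP_ENV production
USER nobody
CMD ["/app/app.sh"]
```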

You can exclude files from the build context with .dockerignore; it tells docker build to leave out specific files or directories.
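For example, a hypothetical .dockerignore (written here from the shell) that keeps VCS metadata and local files out of the build context; the listed paths are just examples:

```shell
# Write a hypothetical .dockerignore file.
cat > .dockerignore <<'EOF'
.git
node_modules
*.log
.env
EOF
cat .dockerignore
```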


Avoid using EXPOSE in Dockerfiles. An image shouldn't dictate ports; the actual container run should specify which ports to publish. But do comment which ports should be opened and what they serve.

You can overwrite the CMD and ENTRYPOINT.

  • ENTRYPOINT makes the image work seemingly like an executable.
  • CMD specifies a sensible default command.
  • CMD is appended to the end of ENTRYPOINT.
# `ping` image definition:
FROM ubuntu:trusty
ENTRYPOINT ["/bin/ping", "-c", "3"]
CMD ["localhost"]

docker run ping           # /bin/ping -c 3 localhost
docker run ping <HOST>    # /bin/ping -c 3 <HOST>, overwriting CMD

# /bin/echo localhost, overwriting entrypoint
docker run --entrypoint /bin/echo ping

The most common base images:

  • scratch: An empty image; you can only run raw binaries with it. Used for minimal application images. Note that you won't even have a shell for debugging.
  • busybox: A small set of common utilities.
  • alpine: BusyBox + a package repository.
  • ubuntu: A full Linux distribution.
# Basic usage of scratch with a built binary:
FROM scratch
COPY hello /
CMD ["/hello"]

# Usage of more common base images:
docker run --rm busybox /bin/sh -c "ls /bin" # => Image is 1.1MB
docker run --rm alpine /bin/sh -c "ls /bin"  # => Image is 4.8MB
docker run --rm ubuntu /bin/sh -c "ls /bin"  # => Image is 188MB

Docker favors microservices. Try to have only one running process per container.

# Basic setup:
1 Nginx Docker, balances traffic to Node containers
1 Redis Docker
1 PostgreSQL Docker
1-10 Node.js Dockers, depending on the traffic
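A setup like this is typically wired together with Docker Compose; a hypothetical docker-compose.yml sketch (service names, images and ports are assumptions, not from the source):

```yaml
version: "2"
services:
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"     # only the load balancer is published to the host
    links:
      - app
  app:
    build: .        # your Node.js image, scaled up or down with traffic
  redis:
    image: redis:alpine
  db:
    image: postgres:9.6
```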

You can tell Docker to restart containers automatically.

docker run -d busybox sleep 3
docker run -d --restart=always busybox sleep 3
docker run -d --restart=on-failure:3 busybox sh -c "sleep 3 && exit 1"
docker events
docker ps -a
docker rm -f `docker ps -aq`

You can enter a running container with exec.

docker run -d --restart=always busybox sleep 300
docker ps
docker exec -it `docker ps -q` /bin/sh
docker rm -f `docker ps -aq`

You can create a container without running it.

docker create busybox sleep 10
docker ps -a
docker start <CONTAINER_ID>
docker stop <CONTAINER_ID>
docker start <CONTAINER_ID>
docker rm -f `docker ps -aq`

You can send any signal to the processes inside a container with kill. You can catch those signals in your application and act on them, e.g. self-restart.

docker kill <CONTAINER_ID>
docker kill --signal=USR1 <CONTAINER_ID>
docker ps -a
docker start <CONTAINER_ID>
# You can even restart killed containers.
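What catching such a signal might look like inside the container; a plain-sh sketch (the script name and message are made up) that reacts to USR1 instead of dying:

```shell
# Hypothetical entrypoint that traps USR1, e.g. to reload configuration.
cat > entrypoint.sh <<'EOF'
#!/bin/sh
trap 'echo "USR1 received, reloading"' USR1
while true; do sleep 1; done
EOF
chmod +x entrypoint.sh

./entrypoint.sh > handler.log &
PID=$!
sleep 1                 # let the trap get installed
kill -USR1 $PID         # same effect as `docker kill --signal=USR1 <CONTAINER_ID>`
sleep 2                 # the trap runs once the current `sleep 1` finishes
kill $PID
cat handler.log         # => USR1 received, reloading
```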

Pausing keeps the container alive but won't allocate any CPU cycles to it.

docker run -d busybox sleep 10
docker pause <CONTAINER_ID>
docker ps # You notice that the status is paused.
docker unpause <CONTAINER_ID>
docker rm -f `docker ps -aq`

Containers running on the same machine can communicate using their identifiers, names, or aliases given with --link. This is rarely useful in production, but nice for development and testing. Also check out Docker Compose for more advanced linking.

# Start an nginx server.
docker run -d -p 3000:80 nginx:alpine
docker ps
# Short identifier is something like 80e66a8a8e24

# Hosts file is pretty basic without any links.
docker run --rm alpine /bin/cat /etc/hosts
#    dcdcf609565c

# Let's see what our hosts file looks like after linking.
docker run --link 80e66a8a8e24:noob --rm alpine /bin/cat /etc/hosts
#    noob 80e66a8a8e24 awesome_pasteur
#    a6caec808b7c

docker run --link 80e66a8a8e24:noob --rm alpine sh -c "printenv | grep NOOB_"
# Show 13 or so env variables that tell more details about the linked container.

# Now you can curl the linked container, note that the port is the original
# set on image.
docker run --link 80e66a8a8e24:noob --rm alpine \
    sh -c "apk add --update curl && curl noob:80"
# Shows basic nginx HTML response.

# You can also use the container id or generated name but that is less useful
# for prebuilt images.
docker run --link 80e66a8a8e24 --rm alpine \
    sh -c "apk add --update curl && curl 80e66a8a8e24:80"

You can add metadata to containers using labels.

docker run -d -l purpose=annihilate -l means=chaos alpine sleep 300
docker ps
docker inspect 286452d55895
# ...
# "Labels": {
#     "means": "chaos",
#     "purpose": "annihilate"
# },
# ...
docker ps -a -f label=means=chaos       # finds the container
docker ps -a -f label=means=diplomacy   # doesn't find the container

You can share storage volumes (directories) from the host to a container. But the less a Docker image relies on the host having specific files, the better, so avoid volumes when not required.

mkdir gorilla
echo "King Kong" > gorilla/test.txt
cat gorilla/test.txt
docker run alpine sh -c "cat /gorilla/test.txt"
# cat: can't open '/gorilla/test.txt': No such file or directory
docker run -v `pwd`/gorilla:/gorilla alpine sh -c "cat /gorilla/test.txt"
# King Kong
docker run -v `pwd`/gorilla:/gorilla alpine sh -c 'echo "Diddy Kong" > /gorilla/test.txt'
cat gorilla/test.txt
# Diddy Kong

# You can make a volume read-only by appending :ro, e.g. -v `pwd`/gorilla:/gorilla:ro

You can copy files from a running container to the host, or back, with docker cp. Prefer COPY in a Dockerfile though.

mkdir hedgehog
echo "Sonic" > hedgehog/test.txt
cat hedgehog/test.txt
docker run alpine sh -c "cat /hedgehog/test.txt"
docker run -v `pwd`/hedgehog:/hedgehog alpine sh -c "cat /hedgehog/test.txt"
docker run -d -v `pwd`/hedgehog:/hedgehog alpine sleep 300
docker ps
docker cp b2417cface1c:/hedgehog/test.txt ./dunno.txt
docker rm -f `docker ps -aq`
cat dunno.txt
# Sonic

You can limit CPU usage and pin containers to a specific CPU.

docker run alpine sleep 3
docker run --cpu-shares 1024 alpine sleep 3 # Up to 100%, not limited, the default.
docker run --cpu-shares 512 alpine sleep 3  # Up to 50% when there's competition.
docker run --cpu-shares 256 alpine sleep 3  # Up to 25% when there's competition.
docker run --cpu-shares 2 alpine sleep 3    # Really low priority execution.
# Shares are a relative scheduler weight, not a hard limit; they only matter
# when containers compete for CPU. A container with 1024 shares gets twice
# the CPU time of one with 512 when both are busy.
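The proportions are easy to compute: under full contention, each busy container gets shares / total_shares of the CPU. A quick shell-arithmetic check with two hypothetical containers:

```shell
# Two busy containers with 1024 and 512 shares competing for one CPU.
a=1024
b=512
total=$((a + b))
echo "$((100 * a / total))% vs $((100 * b / total))%"   # => 66% vs 33%
```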

docker run alpine sleep 3
docker run --cpuset-cpus="0" alpine sleep 3   # Runs only on 1st CPU
docker run --cpuset-cpus="1" alpine sleep 3   # Runs only on 2nd CPU
docker run --cpuset-cpus="0,1" alpine sleep 3 # Runs only on 1st and 2nd CPU

You can limit RAM and swap memory usage of containers. Memory restrictions are hard limits.

docker run alpine sleep 3
docker run -m 512m alpine sleep 3 # Limited to 512MB of RAM and 512MB of swap.
docker run -m 256m alpine sleep 3 # Limited to 256MB of RAM and 256MB of swap.
docker run -m 256m --memory-swap=-1 alpine sleep 3 # 256 RAM, no swap
docker run -m 128m --memory-swap=192m alpine sleep 3 # 128 RAM, 64MB of swap.

The --ulimit flag allows limiting open files and running processes.

docker run alpine sleep 3
docker run --ulimit nofile=10:20 alpine sleep 3 # 10 soft, 20 hard, open files
docker run --ulimit nproc=10:20 alpine sleep 3  # 10 soft, 20 hard, processes

Common Docker deployment flow:

  1. Get the application code.
  2. Build a Docker image with the compiled source and all dependencies installed. Tag it with the build number or commit hash, NOT latest or empty.
  3. Start a container based on the image and run all tests against it.
  4. If the tests fail, send an email or Slack message.
  5. If they pass, push the Docker image to a registry.
  6. Trigger a deployment using orchestration tools:
      • Simple: Docker Swarm, New Relic Centurion, Spotify Helios, Ansible Docker tooling
      • Advanced: Kubernetes, Mesos
  7. Note that most of this should be automated! Jenkins is a common build environment and has multiple Docker plugins available.