Docker II: Basic Concepts

Image

An operating system is divided into kernel space and user space. On Linux, once the kernel has started, it mounts a root filesystem to support user space. A Docker image is the equivalent of that root filesystem. For example, the official ubuntu:18.04 image contains the complete root filesystem of a minimal Ubuntu 18.04 system.

A Docker image is a special filesystem that provides the programs, libraries, resources, and configuration files a container needs at runtime, along with some runtime configuration parameters (such as anonymous volumes, environment variables, and users). An image contains no dynamic data, and its contents do not change after it is built.

Tiered Storage

Because an image contains the complete root filesystem of an operating system, it is often very large. Docker was therefore designed around Union FS technology and uses a tiered (layered) storage architecture. Strictly speaking, an image is not a single packaged file like an ISO but a virtual concept: a set of filesystems, or in other words, a union of multiple filesystem layers.

An image is built one layer at a time, with each layer serving as the basis for the next. Once a layer is built it never changes, and any change made in a later layer is recorded only in that layer. For example, deleting a file that came from an earlier layer does not actually remove it from that layer; the file is only marked as deleted in the current layer. The final container will not see the file, but the file still travels with the image. Extra care is therefore needed when building an image: each layer should contain only what that layer needs to add, and anything extra should be cleaned up before that layer’s build finishes.
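The layering behaviour described above can be illustrated with a small Python model. This is only a sketch of the idea, not how Docker or any Union FS driver is implemented: layers are plain dictionaries, and the names `WHITEOUT`, `merged_view`, and `stored_size` are invented for this example.

```python
from typing import Dict, List, Optional

# Hypothetical marker meaning "this path is deleted in this layer".
WHITEOUT = None

Layer = Dict[str, Optional[bytes]]


def merged_view(layers: List[Layer]) -> Dict[str, bytes]:
    """Return the filesystem a container would see (later layers win)."""
    view: Dict[str, bytes] = {}
    for layer in layers:  # bottom layer first
        for path, content in layer.items():
            if content is WHITEOUT:
                view.pop(path, None)  # hide the file; lower layers stay untouched
            else:
                view[path] = content
    return view


def stored_size(layers: List[Layer]) -> int:
    """Total bytes kept in all layers, including files hidden by a whiteout."""
    return sum(len(c) for layer in layers for c in layer.values() if c is not None)


base = {"/bin/sh": b"shell", "/tmp/cache.tar": b"x" * 1000}
cleanup = {"/tmp/cache.tar": WHITEOUT}  # the "delete" happens in a later layer
image = [base, cleanup]

view = merged_view(image)
print(sorted(view))        # ['/bin/sh']
print(stored_size(image))  # 1005
```

The deleted file is invisible in the merged view but still contributes its 1000 bytes to the image, which is why cleanup belongs in the same layer that created the file.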

Tiered storage also makes images easier to reuse and customise. For example, you can use a previously built image as the base layer and add further layers on top to build a new image containing exactly what you need.

Container

The relationship between an image and a container is like that between a class and an instance in object-oriented programming: the image is a static definition, and the container is the runtime entity of the image. Containers can be created, started, stopped, deleted, paused, and so on.

Containers are essentially processes, but unlike processes that run directly on the host, container processes run in their own separate namespaces. A container can therefore have its own root filesystem, its own network configuration, its own process space, and even its own user ID space. Container processes run in an isolated environment and can be used as if they were on a system independent of the host. This isolation generally makes applications packaged in containers more secure than applications running directly on the host. It is also why many people confuse containers with virtual machines when they first learn Docker.

As mentioned earlier, images use tiered storage, and so do containers. When a container runs, the image serves as the base layer, and a storage layer for that container is created on top of it. This layer, which exists so the container can read and write at runtime, is called the container storage layer.

The lifecycle of the container storage layer is the same as that of the container: when the container dies, its storage layer dies with it. Any information stored in the container storage layer is therefore lost when the container is deleted.

According to Docker’s best practices, containers should not write any data to their storage layer, which should remain stateless. All file writes should go to a data volume or to a bind-mounted host directory. Reads and writes there bypass the container storage layer and go directly to the host (or to network storage), which gives better performance and stability.

The lifecycle of a data volume is independent of the container: when the container dies, the data volume does not. Data stored in a data volume is therefore not lost when the container is deleted or re-created.
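The class/instance analogy and the two lifecycles can be sketched in Python. This is a toy model under stated assumptions, not Docker’s implementation: the `Image` and `Container` classes, the `/data` mount point, and the file paths are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class Image:
    """Static, read-only definition (the 'class' in the analogy)."""
    name: str
    files: Tuple[Tuple[str, str], ...]  # (path, content) pairs fixed at build time


@dataclass
class Container:
    """Runtime entity created from an image (the 'instance')."""
    image: Image
    volume: Dict[str, str]  # owned outside the container, so it outlives it
    storage_layer: Dict[str, str] = field(default_factory=dict)  # dies with the container

    def write(self, path: str, content: str, mount: str = "/data") -> None:
        # Writes under the (hypothetical) mount point bypass the storage layer.
        target = self.volume if path.startswith(mount + "/") else self.storage_layer
        target[path] = content


image = Image("ubuntu:18.04", (("/etc/os-release", "Ubuntu 18.04"),))
volume: Dict[str, str] = {}

first = Container(image, volume)
first.write("/tmp/scratch", "temporary")  # lands in the container storage layer
first.write("/data/db", "important")      # lands in the data volume
del first                                 # deleting the container discards its layer

second = Container(image, volume)         # a new container from the same image
print(second.storage_layer)  # {}
print(second.volume)         # {'/data/db': 'important'}
```

The second container starts with an empty storage layer, while the data written to the volume is still there and the image itself is unchanged.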

Registry

Once an image is built, it can easily be run on the current host. To use it on other servers, however, you need a centralised service for storing and distributing images. Docker Registry is such a service.

A Docker Registry can contain multiple repositories, each of which can contain multiple tags, each of which corresponds to an image.

Typically, a repository contains images for different versions of the same software, and tags are used to identify those versions. We can specify which version’s image we want by using the format <repository name>:<tag>. If no tag is given, latest is used as the default tag.

In the case of the Ubuntu image, ubuntu is the name of the repository, which contains tags for different versions, e.g., 16.04 and 18.04. We can specify which version of the image we need by using ubuntu:16.04 or ubuntu:18.04. If the tag is omitted, e.g., ubuntu, it is treated as ubuntu:latest.

Repository names often appear as two-part paths, e.g., jwilder/nginx-proxy, where the former often means the username in a Docker Registry multi-user environment, and the latter is often the corresponding software name. This is not absolute and depends on the specific Docker Registry software or service being used.
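The naming rules above can be expressed as a short parsing function. This is a simplified sketch, not the parser the docker CLI uses: the names `ImageRef` and `parse_image_ref` are invented here, and digest references (those containing `@`) are deliberately ignored.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    repository: str
    tag: str


def parse_image_ref(ref: str) -> ImageRef:
    """Split '<repository name>:<tag>', defaulting the tag to 'latest'."""
    # Only a colon after the last '/' separates a tag. An earlier colon
    # belongs to a registry host:port prefix such as 'localhost:5000/'.
    last_slash = ref.rfind("/")
    colon = ref.rfind(":")
    if colon > last_slash:
        return ImageRef(ref[:colon], ref[colon + 1:])
    return ImageRef(ref, "latest")


print(parse_image_ref("ubuntu:18.04"))         # tag given explicitly
print(parse_image_ref("ubuntu"))               # tag defaults to 'latest'
print(parse_image_ref("jwilder/nginx-proxy"))  # two-part repository name
```

Checking the colon’s position against the last slash keeps an address such as `localhost:5000/ubuntu` (an assumed local registry) from being misread as repository `localhost` with tag `5000/ubuntu`.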

Public Docker Registry

A public Docker Registry is a registry service that is open to users and allows them to manage images. Typically, these public services let users upload and download public images for free and may offer a paid service for managing private images.

The most commonly used public Registry is the official Docker Hub, which is also the default Registry and has a large number of high-quality official images. Others include Red Hat’s Quay.io; Google’s Google Container Registry, which has been used to host the Kubernetes images; and ghcr.io, launched by the code hosting platform GitHub.

Private Docker Registry

In addition to using the public service, you can also set up a private Docker Registry locally; Docker provides a Docker Registry image that can be used as a private registry service.

The open-source Docker Registry image provides only a server-side implementation of the Docker Registry API, which is enough to support the docker commands without affecting their use. However, it does not include a graphical interface or advanced features such as image maintenance, user management, and access control.
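To give a feel for what “a server-side implementation of the Docker Registry API” means, the sketch below builds a few URLs following the path layout of the Registry HTTP API V2. It performs no network requests. The helper name `registry_urls` and the address `http://localhost:5000` are assumptions for illustration, not something the text above specifies.

```python
from typing import Dict
from urllib.parse import quote


def registry_urls(base: str, repository: str, reference: str = "latest") -> Dict[str, str]:
    """Build Registry HTTP API V2 URLs for one repository (no network I/O)."""
    base = base.rstrip("/")
    name = quote(repository)  # '/' is left as-is, so two-part names stay intact
    return {
        "version_check": f"{base}/v2/",
        "catalog": f"{base}/v2/_catalog",
        "tags": f"{base}/v2/{name}/tags/list",
        "manifest": f"{base}/v2/{name}/manifests/{quote(reference)}",
    }


# Assumed address of a private registry running locally.
urls = registry_urls("http://localhost:5000", "jwilder/nginx-proxy", "latest")
for purpose, url in urls.items():
    print(f"{purpose}: {url}")
```

A client such as docker talks to endpoints of this shape; everything beyond them (web UI, user management, access control) is what the third-party products add.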

In addition to the official Docker Registry, there are third-party implementations of the Docker Registry API, such as Harbor and Sonatype Nexus, that also provide a user interface and some advanced features.

Published by jamie on 24 April 2024
