
I am trying to understand the underlying architecture of Docker. The diagram that's shown everywhere claims that, in contrast with virtualisation technologies such as VirtualBox, Docker uses the OS of the host directly and only ships the applications together with their dependencies (libraries etc.). Yet from what I can see, every Docker image includes an OS: it starts with a FROM <os-image> instruction. Isn't this contradictory to what's being claimed? Please advise.

Mike M

2 Answers


First, Docker is just a company. =)

There are two methods of isolating things,

  • Methods that isolate the kernel.
  • Methods that do not isolate the kernel.

For all intents and purposes, methods that do not isolate the kernel are called "containerization", while those that do are called "virtualization". In industry, almost 100% of the use cases of "containerization" refer to Linux containerization; it's for the most part correct to say that containers are a Linux thing.

One more point of confusion: many non-Linux systems that support "native" containerization do so with a virtual machine. That means you have the native kernel (like Darwin/BSD) running on the host and a Linux kernel running in a virtual machine, which hosts just the container environment (a quick way to see this is sketched after the list below). As a rule of thumb,

  • Containerization is generally less secure: it's vulnerable to kernel-level exploits.
  • Containerization is generally faster: there's less context switching and no hypervisor overhead.
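
That VM-backed case is easy to observe. A sketch, assuming Docker Desktop on macOS (the alpine image is just an example):

# the host's native kernel is Darwin
$ uname -s
Darwin
# but a container reports the Linux kernel of the hidden VM
$ docker run --rm alpine uname -s
Linux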

It's not true that just because something does not virtualize the kernel, it isn't isolated from the host. While it's true that Linux containers are just processes, and are thus

  • Visible from the host
  • Subject to any kernel-level resource optimizations, like memory deduplication

containerized processes must still

  • Run in different namespaces which, barring a kernel-level exploit, isolate them from other processes on the machine
  • (Usually) run in isolated cgroups subject to different quotas and limits.
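
Here's a minimal sketch of the namespace point, using unshare(1) directly rather than Docker (it needs root and a util-linux recent enough to have these flags):

# in a fresh PID namespace the shell becomes PID 1 and sees only its own children
$ sudo unshare --fork --pid --mount-proc sh -c 'ps -e'
  PID TTY          TIME CMD
    1 pts/0    00:00:00 sh
    2 pts/0    00:00:00 ps

Barring a kernel exploit, nothing outside that namespace is visible from inside it, even though the host still sees sh as an ordinary process.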

To drive it home: because

  • Containerization typically refers to the implementation in Linux,
  • Linux has no native concept of a container, only providing cgroups (resource control) and namespaces (isolation), and
  • A container is just a native process,

we tend to say any process on Linux that makes use of namespaces is running in a container, more so if it's also using cgroups.
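
You can watch this from the host side too. A sketch (the container name mycontainer is hypothetical, and the inode numbers will differ on your machine):

# a process's namespaces are visible under /proc; compare a container's PID namespace to init's
$ PID=$(docker inspect -f '{{.State.Pid}}' mycontainer)
$ sudo readlink /proc/$PID/ns/pid /proc/1/ns/pid
pid:[4026532626]
pid:[4026531836]

The two links differ, and that difference is essentially all "being in a container" means here.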

As a final point, typically when you hear "Docker image" people mean an OCI-compliant image, which is what everyone uses.


When you see

FROM <os-image>

in a Dockerfile, what you're actually saying is that you want to, in git parlance, clone a working set of files and build on top of it. This set does not include a kernel, but it does include everything else, because the container will not have access to the host's files (the container is in a different namespace and isolated). For example, a container must include its own copy of glibc if it needs one, and a Debian container must include apt and the other Debian utilities that constitute a "core" system.
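
One way to convince yourself (a sketch; the image tag, the probe container name, and the exact paths are illustrative and vary by image): export the image's filesystem and grep the listing.

# userland (apt, glibc) is in the image; a kernel (vmlinuz) is not
$ docker create --name probe debian:bookworm
$ docker export probe | tar -tf - | grep -E 'bin/apt$|libc.so.6$'
usr/bin/apt
usr/lib/x86_64-linux-gnu/libc.so.6
$ docker export probe | tar -tf - | grep vmlinuz    # prints nothing
$ docker rm probe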

Evan Carroll

Run uname -r inside a Docker container and then on your base OS; you'll notice it's the same kernel:

bash-4.4# uname -r
3.10.0-1160.59.1.el7.x86_64
bash-4.4# exit
[petturn@h-mybox-2 ~]$ uname -r
3.10.0-1160.59.1.el7.x86_64

That's probably the biggest distinction between virtualization and containerization. A virtual machine runs its own kernel; a container runs inside the space of your base OS. You see the processes and everything. What neither of them can (or should) be able to do is tamper with the base OS.
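
The "you see the processes" part is easy to check from the host (a sketch; sleep 300 is just a placeholder workload, and the PID shown is made up):

# start a throwaway container, then look for its process from the host
$ docker run -d --rm alpine sleep 300
$ ps -ef | grep '[s]leep 300'
root     12345 ...  sleep 300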

When you build a container "FROM ubuntu" or whatever, I believe that has more to do with specifying the userland and toolchain you're building on.

If you build a container "from scratch" (I've never tried this, but it seems like fun), I think that'd become apparent!
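
For reference, the whole Dockerfile for that experiment could be as small as this (a sketch; it assumes hello is a statically linked binary you built beforehand, since a scratch image contains no libc to link against):

# an empty image plus exactly one file; no shell, no package manager, and still no kernel
FROM scratch
COPY hello /hello
ENTRYPOINT ["/hello"]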

Peter Turner