
Container Virtualization

Review: Hardware Virtualization

How did we create a virtual machine (VM)?
  • Start with a physical machine

  • Create software (hypervisor) responsible for isolating the guest OS inside the VM

  • VM resources (memory, disk, networking, etc.) are provided by the physical machine but visibility outside of the VM is limited

What were the implications?
  • VM and physical machine share the same instruction set, so the host and guest OS must too

  • Guest OS can provide a different application binary interface (ABI) inside the VM

  • Lots of challenges in getting this to work because guest OS expects to have privileged hardware access

Operating System Virtualization

How do we create a virtual operating system (container)?
  • Start with a real operating system

  • Create software responsible for isolating guest software inside the container

    • (That software seems to lack a canonical name, and today it is actually a bunch of different tools.)

  • Container resources (processes, files, network sockets, etc.) are provided by the real operating system but visibility outside the container is limited

What are the implications?
  • Container and real OS share the same kernel

  • So applications inside and outside the container must share the same ABI

  • Challenges in getting this to work are due to shared OS namespaces

Containers v. VMs

You can run Windows inside a container provided by Linux.
  • False. Container shares the kernel with the host.

You can run SUSE Linux inside an Ubuntu container.
  • True. As long as both distributions use the same kernel, the differences are confined to binary tools and file locations.

Running ps inside the container will show all processes.
  • False. The container's process namespace is isolated from the host.

Hypervisor v. Container Virtualization

lxc

Why Virtualize an OS?

Shares many (but not all) of the benefits of hardware virtualization with much lower overhead.

Decoupling
  1. Cannot run multiple operating systems on the same machine.

  2. Can transfer software setups to another machine as long as it has an identical or nearly identical kernel.

  3. Can adjust container resources to system needs.

Isolation
  1. Container should not leak information into or out of the container

  2. Can isolate all of the configuration and software packages a particular application needs to run

OS v. Hardware Overhead

Hardware virtualization system call path:
  • Application inside the VM makes a system call

  • Trap to the host OS (or hypervisor)

  • Hand trap back to the guest OS

OS virtualization system call path:
  • Application inside the container makes a system call

  • Trap to the OS

  • Remember all of the work we had to do to deprivilege the guest OS and deal with uncooperative machine architectures like x86?

  • OS virtualization does not require any of this: there is only one OS!

OS Virtualization is About Names

What kind of names must the container virtualize?
  • Process IDs

    • top inside the container shows only processes running inside the container

    • top outside the container may show processes inside the container, but with different process IDs

  • File names

    • Processes inside the container may have a limited or different view of the mounted file system

    • File names may resolve to different files, and some file names visible outside the container may be hidden

  • User names:

    • Containers may have different users with different roles

    • root inside the container should not be root outside the container

  • Host name and IP address

    • Processes inside the container may use a different host name and IP address when performing network operations
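
On Linux, a process's namespace memberships can be inspected as symbolic links under /proc/<pid>/ns. The sketch below lists them for the current process. The helper name current_namespaces is made up for illustration, and the function simply returns an empty result on systems without that directory.

```python
import os


def current_namespaces(pid="self"):
    """Return {namespace type: identifier} read from /proc/<pid>/ns.

    Returns an empty dict where this Linux-specific directory is missing.
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        entries = sorted(os.listdir(ns_dir))
    except OSError:
        return result  # not Linux, no such process, or /proc not mounted
    for name in entries:
        try:
            # Each entry is a symlink whose target names the namespace instance
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # entry not readable by this user
    return result


if __name__ == "__main__":
    for name, ident in current_namespaces().items():
        print(f"{name:20} {ident}")
```

Two processes that print the same identifier for an entry share that namespace; a process inside a container will typically show different identifiers than a process on the host.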

OS Virtualization is About Control

The OS may want to ensure that the entire container, or everything that runs inside it, cannot consume more than a certain amount of:

  • CPU time

  • memory

  • disk or network bandwidth

Not a New Idea

Forms of OS virtualization go back to chroot from 1982:

chroot - run command or interactive shell with special root directory
How is this done?
  • Instead of starting path resolution at inode #2, start somewhere else.
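
The idea of starting path resolution somewhere else can be sketched with a toy file tree. This is a simulation, not the kernel's implementation: directories are dicts, files are strings, and the names resolve, real_root, and jail_root are invented for the example.

```python
def resolve(root, path):
    """Resolve an absolute path against a toy file tree rooted at `root`.

    Directories are dicts and files are strings. ".." at the root stays at
    the root, so a lookup cannot climb above its starting directory.
    """
    stack = [root]  # stack[0] is the process's root directory
    for part in path.split("/"):
        if part in ("", "."):
            continue
        if part == "..":
            if len(stack) > 1:
                stack.pop()
            continue
        node = stack[-1]
        if not isinstance(node, dict) or part not in node:
            raise FileNotFoundError(path)
        stack.append(node[part])
    return stack[-1]


real_root = {
    "etc": {"passwd": "host users"},
    "jail": {"etc": {"passwd": "jail users"}},
}

# A normal process starts resolution at the real root
print(resolve(real_root, "/etc/passwd"))        # host users

# A "chrooted" process starts at /jail instead
jail_root = resolve(real_root, "/jail")
print(resolve(jail_root, "/etc/passwd"))        # jail users
print(resolve(jail_root, "/../../etc/passwd"))  # jail users
```

The same path string names a different file depending on where resolution starts, and ".." cannot reach the host's /etc from inside the jail.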

Modern container management systems like Docker combine and build upon multiple lower-level tools and services.

Linux namespaces

Since 2002 Linux has provided namespace separation for a variety of resources that typically had unified namespaces:

  • Mount points: allows different namespaces to see different views of the file system

  • Process IDs: new processes are allocated IDs in their current namespace and all parent namespaces

  • Network: namespaces can have private IP addresses and their own routing tables, and can communicate with other namespaces through virtual interfaces

  • Devices: devices can be present or hidden in different namespaces
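
The process ID rule above (a new process gets an ID in its own namespace and in every parent namespace) can be modeled in a few lines. This is only a toy model, assuming each namespace hands out IDs from its own counter; PidNamespace and spawn are hypothetical names, not kernel interfaces.

```python
import itertools


class PidNamespace:
    """Toy PID namespace that numbers its processes independently from 1."""

    def __init__(self, parent=None):
        self.parent = parent
        self._next_pid = itertools.count(1)
        self.processes = {}  # pid in this namespace -> process name

    def spawn(self, name):
        """Create a process; return its PIDs from this namespace out to the root."""
        pids = []
        ns = self
        while ns is not None:
            pid = next(ns._next_pid)
            ns.processes[pid] = name
            pids.append(pid)
            ns = ns.parent
        return pids


host = PidNamespace()
host.spawn("init")
host.spawn("sshd")

container = PidNamespace(parent=host)
print(container.spawn("nginx"))              # [1, 3]
print(sorted(container.processes.values()))  # ['nginx']
print(sorted(host.processes.values()))       # ['init', 'nginx', 'sshd']
```

Inside the container, nginx is process 1 and is the only visible process. The host sees the same process too, but under a different ID.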

cgroups

…​a Linux kernel feature that limits, accounts for, and isolates the resource usage of a collection of processes.

  • Processes and their children remain in the same cgroup

  • cgroups make it possible to control the resources allocated to a set of processes
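
A rough model of the two properties above: children stay in their parent's group, and usage is accounted against a limit shared by the whole group. The Cgroup class here is a simplified stand-in invented for illustration. It only refuses an over-limit allocation, whereas a real kernel has more options (such as reclaiming memory or killing a process).

```python
class Cgroup:
    """Toy memory controller: one limit shared by every process in the group."""

    def __init__(self, name, memory_limit):
        self.name = name
        self.memory_limit = memory_limit  # bytes allowed for the whole group
        self.memory_used = 0
        self.members = set()

    def add(self, pid):
        self.members.add(pid)

    def fork(self, parent_pid, child_pid):
        # A child starts out in the same group as its parent
        if parent_pid not in self.members:
            raise ValueError(f"pid {parent_pid} is not in cgroup {self.name}")
        self.members.add(child_pid)

    def charge(self, pid, nbytes):
        """Account an allocation to the group; refuse it if over the limit."""
        if pid not in self.members:
            raise ValueError(f"pid {pid} is not in cgroup {self.name}")
        if self.memory_used + nbytes > self.memory_limit:
            return False
        self.memory_used += nbytes
        return True


web = Cgroup("web", memory_limit=100)
web.add(10)
web.fork(10, 11)            # child 11 lands in the same group
print(web.charge(10, 60))   # True
print(web.charge(11, 60))   # False: the group as a whole would exceed 100
print(web.charge(11, 40))   # True
```

The limit applies to the set of processes rather than to any one of them, which is what lets the OS bound an entire container.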

UnionFS

A stackable unification file system.

Path name resolution:
  • Does /foo/bar exist in the top layer? If yes, return its contents.

  • Does /foo/bar exist in the next layer? If yes, return its contents.

  • Etc.

Can also hide parts of the lower file systems:
  • Does /foo/bar exist in the top layer? If yes, return its contents.

  • Access to /foo in the next layer is prohibited, so stop. (Even if /foo/bar exists.)
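
The layered lookup and the hiding rule can be sketched as a search over a list of layers, top first. This is a toy model rather than the actual UnionFS implementation: Layer, hidden, and lookup are names invented here, and each layer is just a mapping from path to contents.

```python
class Layer:
    """One layer of a toy union file system."""

    def __init__(self, files, hidden=()):
        self.files = dict(files)     # path -> contents
        self.hidden = tuple(hidden)  # path prefixes where the search must stop

    def hides(self, path):
        return any(
            path == prefix or path.startswith(prefix.rstrip("/") + "/")
            for prefix in self.hidden
        )


def lookup(layers, path):
    """Return the contents of `path` from the first layer that has it."""
    for layer in layers:  # top layer first
        if layer.hides(path):
            break  # access at this layer is prohibited: do not look lower
        if path in layer.files:
            return layer.files[path]
    raise FileNotFoundError(path)


top = Layer({"/app/config": "container config"})
middle = Layer({"/etc/motd": "hello"}, hidden=["/home"])
bottom = Layer({
    "/etc/motd": "old message",
    "/home/alice/notes": "private",
    "/usr/bin/tool": "binary",
})
layers = [top, middle, bottom]

print(lookup(layers, "/etc/motd"))      # hello
print(lookup(layers, "/usr/bin/tool"))  # binary
try:
    lookup(layers, "/home/alice/notes")
except FileNotFoundError:
    print("hidden")                     # hidden
```

A higher layer shadows the same path in a lower one, and a hidden prefix makes a file unreachable even though it still exists in the bottom layer.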

COW File System

Previous container libraries made a copy of the parent’s entire file system. (Containers need a lot of it.)

What could we do instead?
  • Copy on write!

  • Share the underlying file system and only make a private copy of a file when the container modifies it.

  • Speeds start up and reduces storage usage.

    • The container mainly needs read-only access to host files.
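
A minimal copy-on-write sketch, assuming files are strings stored in a dict. CowFileSystem and its methods are hypothetical names for this example. Each container gets a private upper layer that starts empty, and a file is copied into it only on first modification.

```python
class CowFileSystem:
    """Toy copy-on-write view over a shared base file system."""

    def __init__(self, base):
        self.base = base  # shared with other containers; never modified here
        self.upper = {}   # private copies, created only on write

    def read(self, path):
        if path in self.upper:
            return self.upper[path]
        try:
            return self.base[path]
        except KeyError:
            raise FileNotFoundError(path) from None

    def append(self, path, data):
        # Copy-up: the first modification copies the file into the private layer
        if path not in self.upper:
            self.upper[path] = self.base.get(path, "")
        self.upper[path] += data


base = {
    "/etc/hosts": "127.0.0.1 localhost\n",
    "/bin/sh": "<shell binary>",
}
a = CowFileSystem(base)
b = CowFileSystem(base)

a.append("/etc/hosts", "10.0.0.2 db\n")

print(repr(a.read("/etc/hosts")))  # '127.0.0.1 localhost\n10.0.0.2 db\n'
print(repr(b.read("/etc/hosts")))  # '127.0.0.1 localhost\n'
print(len(a.upper), len(b.upper))  # 1 0
```

Starting a container costs nothing up front, and storage grows only with the files the container actually changes.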

What is Docker?

Docker builds on previous technologies
  • Provides a unified set of tools for container management on a variety of systems

  • Layered file system images for easy updates

  • Now involved in development of containerization libraries on Linux

Docker Linux interfaces

Example Dockerfile

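# Start from the komljen/ubuntu base image
FROM komljen/ubuntu
MAINTAINER Alen Komljen <alen.komljen@live.com>

# MongoDB version to install
ENV MONGO_VERSION 2.6.6

# Add the MongoDB package repository and its signing key, install the pinned
# version, then delete the apt package lists to keep the layer small
RUN \
  apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10 && \
  echo "deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen" \
       > /etc/apt/sources.list.d/mongodb.list && \
  apt-get update && \
  apt-get -y install \
          mongodb-org=${MONGO_VERSION} && \
  rm -rf /var/lib/apt/lists/*

# Keep database files in a volume, outside the container's writable layer
VOLUME ["/data/db"]

# Remove the policy-rc.d script so that services are allowed to start
RUN rm /usr/sbin/policy-rc.d
# Default command: run the MongoDB server
CMD ["/usr/bin/mongod"]

# Port the MongoDB server listens on
EXPOSE 27017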

Created 4/28/2017
Updated 9/18/2020
Commit 4eceaab
Built 4/27/2017 @ 20:00 EDT