OS-level virtualization
{{Short description|Operating system virtualization paradigm}}
OS-level virtualization is an operating system (OS) virtualization paradigm in which the kernel allows the existence of multiple isolated user space instances, including containers (LXC, Solaris Containers, AIX WPARs, HP-UX SRP Containers, Docker, Podman), zones (Solaris Containers), virtual private servers (OpenVZ), partitions, virtual environments (VEs), virtual kernels (DragonFly BSD), and jails (FreeBSD jail and chroot).{{Cite web |url=https://www.networkworld.com/article/749098/cisco-subnet-software-containers-used-more-frequently-than-most-realize.html |title=Software containers: Used more frequently than most realize |last1=Hogg |first1=Scott |date=2014-05-26 |website=Network World |publisher=Network world, Inc. |access-date=2015-07-09 |quote=There are many other OS-level virtualization systems such as: Linux OpenVZ, Linux-VServer, FreeBSD Jails, AIX Workload Partitions (WPARs), HP-UX Containers (SRP), Solaris Containers, among others. }} Such instances may look like real computers from the point of view of programs running in them. A computer program running on an ordinary operating system can see all resources (connected devices, files and folders, network shares, CPU power, quantifiable hardware capabilities) of that computer. Programs running inside a container can only see the container's contents and devices assigned to the container.
On Unix-like operating systems, this feature can be seen as an advanced implementation of the standard chroot mechanism, which changes the apparent root folder for the currently running process and its children. In addition to isolation mechanisms, the kernel often provides resource-management features to limit the impact of one container's activities on other containers. Linux containers are all based on the virtualization, isolation, and resource management mechanisms provided by the Linux kernel, notably Linux namespaces and cgroups.{{cite web|url=http://www.netdevconf.org/1.1/proceedings/slides/rosen-namespaces-cgroups-lxc.pdf|title=Namespaces and Cgroups, the basis of Linux Containers|first=Rami|last=Rosen|access-date=18 August 2016}}
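As a rough illustration of the namespace mechanism mentioned above, the following Python sketch lists the namespaces the calling process belongs to by reading the symbolic links under <code>/proc/self/ns</code>. It assumes a Linux system with <code>/proc</code> mounted; the function name <code>current_namespaces</code> is made up for this example, and on other systems the function simply returns an empty result.

```python
import os


def current_namespaces(proc_ns_dir="/proc/self/ns"):
    """Return {namespace_type: link_target} for the calling process.

    On Linux each entry in /proc/self/ns is a symbolic link whose target
    looks like 'pid:[4026531836]'. Processes whose link targets match
    for a given type are in the same namespace of that type.
    """
    namespaces = {}
    try:
        entries = sorted(os.listdir(proc_ns_dir))
    except OSError:
        # Not Linux, or /proc is not mounted: nothing to report.
        return namespaces
    for name in entries:
        try:
            namespaces[name] = os.readlink(os.path.join(proc_ns_dir, name))
        except OSError:
            # Entry unreadable in this environment; skip it.
            continue
    return namespaces


if __name__ == "__main__":
    for ns_type, target in current_namespaces().items():
        print(f"{ns_type:20} {target}")
```

Running the script once on the host and once inside a container would typically show different link targets for the namespace types the container runtime has unshared.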
Although the word container most commonly refers to OS-level virtualization, it is sometimes used to refer to fuller virtual machines operating in varying degrees of concert with the host OS,{{Citation needed|date=September 2024}} such as Microsoft's Hyper-V containers.{{Citation needed|date=September 2024}} For an overview of virtualization since 1960, see Timeline of virtualization technologies.
Operation
On ordinary operating systems for personal computers, a computer program can see (even though it might not be able to access) all the system's resources. They include:
- Hardware capabilities that can be employed, such as the CPU and the network connection
- Data that can be read or written, such as files, folders and network shares
- Connected peripherals it can interact with, such as a webcam, printer, scanner, or fax machine
The operating system may be able to allow or deny access to such resources based on which program requests them and the user account under which it runs. The operating system may also hide those resources, so that when the computer program enumerates them, they do not appear in the enumeration results. Nevertheless, from a programming point of view, the computer program has interacted with those resources and the operating system has mediated that interaction.
With operating-system-level virtualization, or containerization, it is possible to run programs within containers, to which only parts of these resources are allocated. A program expecting to see the whole computer, once run inside a container, can only see the allocated resources and believes them to be all that is available. Several containers can be created on each operating system, to each of which a subset of the computer's resources is allocated. Each container may contain any number of computer programs. These programs may run concurrently or separately, and may even interact with one another.
Containerization has similarities to application virtualization: in the latter, only one computer program is placed in an isolated container and the isolation applies to the file system only.
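The allocation model described above can be pictured with a small toy model in Python. This is only a conceptual sketch, not how any kernel implements containers: the classes <code>Host</code> and <code>Container</code> and the resource names are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """All resources known to the (hypothetical) host operating system."""
    resources: set


@dataclass
class Container:
    """An isolated instance that is allocated a subset of host resources."""
    name: str
    host: Host
    allocated: set = field(default_factory=set)

    def allocate(self, resource):
        # Only resources that exist on the host can be handed out.
        if resource not in self.host.resources:
            raise ValueError(f"host has no resource {resource!r}")
        self.allocated.add(resource)

    def enumerate_resources(self):
        # A program inside the container sees only the allocation.
        return sorted(self.allocated)


host = Host({"cpu0", "cpu1", "eth0", "/srv/data", "printer"})
web = Container("web", host)
web.allocate("cpu0")
web.allocate("eth0")
db = Container("db", host)
db.allocate("cpu1")
db.allocate("/srv/data")

print(web.enumerate_resources())  # ['cpu0', 'eth0']
print(db.enumerate_resources())   # ['/srv/data', 'cpu1']
```

Neither container can enumerate the printer, because it was never allocated to either of them, even though it exists on the host.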
Uses
Operating-system-level virtualization is commonly used in virtual hosting environments, where it is useful for securely allocating finite hardware resources among a large number of mutually distrusting users. System administrators may also use it for consolidating server hardware by moving services on separate hosts into containers on a single server.
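On Linux, this kind of resource allocation is typically expressed through cgroups. The sketch below writes a memory limit and a CPU limit in the format used by the cgroup v2 interface files <code>memory.max</code> and <code>cpu.max</code>. To stay runnable without privileges, it writes into a temporary directory rather than a real cgroup; the helper name <code>write_cgroup_limits</code> and the chosen limit values are assumptions made for this example.

```python
import os
import tempfile


def write_cgroup_limits(cgroup_dir, memory_bytes, cpu_quota_us, cpu_period_us=100_000):
    """Write memory and CPU limits in cgroup v2 interface-file format.

    memory.max takes a byte count; cpu.max takes '<quota> <period>' in
    microseconds. On a real system cgroup_dir would be a directory under
    the cgroup v2 mount (commonly /sys/fs/cgroup), and writing to it
    normally requires elevated privileges.
    """
    limits = {
        "memory.max": str(memory_bytes),
        "cpu.max": f"{cpu_quota_us} {cpu_period_us}",
    }
    for filename, value in limits.items():
        with open(os.path.join(cgroup_dir, filename), "w") as fh:
            fh.write(value + "\n")
    return limits


# Demonstration against a temporary directory instead of a real cgroup.
with tempfile.TemporaryDirectory() as demo_dir:
    applied = write_cgroup_limits(demo_dir, 256 * 1024 * 1024, 50_000)
    print(applied)
```

With the default period of 100,000 microseconds, a quota of 50,000 corresponds to half of one CPU's time.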
Other typical scenarios include placing programs in separate containers for improved security, hardware independence, and added resource management features.{{Cite web |date=2022-10-20 |title=Secure Bottlerocket deployments on Amazon EKS with KubeArmor {{!}} Containers |url=https://aws.amazon.com/blogs/containers/secure-bottlerocket-deployments-on-amazon-eks-with-kubearmor/ |access-date=2023-06-20 |website=aws.amazon.com |language=en-US}} The improved security provided by the use of a chroot mechanism, however, is not perfect.{{Cite book |title=Mastering FreeBSD and OpenBSD security |series=O'Reilly Series |first1=Yanek |last1=Korff |first2=Paco |last2=Hope |first3=Bruce |last3=Potter |publisher=O'Reilly Media, Inc. |year=2005 |isbn=0596006268 |page=59 |url=https://books.google.com/books?id=gqKwaHmXp4YC&pg=PA59 }} Operating-system-level virtualization implementations capable of live migration can also be used for dynamic load balancing of containers between nodes in a cluster.
= Overhead =
Operating-system-level virtualization usually imposes less overhead than full virtualization because programs in OS-level virtual partitions use the operating system's normal system call interface and do not need to be subjected to emulation or be run in an intermediate virtual machine, as is the case with full virtualization (such as VMware ESXi, QEMU, or Hyper-V) and paravirtualization (such as Xen or User-mode Linux). This form of virtualization also does not require hardware support for efficient performance.
= Flexibility =
Operating-system-level virtualization is not as flexible as other virtualization approaches since it cannot host a guest operating system different from the host one, or a different guest kernel. For example, with Linux, different distributions are fine, but other operating systems such as Windows cannot be hosted. Operating systems using variable input systematics are subject to limitations within the virtualized architecture. Adaptation methods including cloud-server relay analytics maintain the OS-level virtual environment within these applications.{{Cite book |last1=Huang |first1=D. |title=Proceedings of the 10th Parallel Data Storage Workshop |chapter=Experiences in using os-level virtualization for block I/O |year=2015|pages=13–18 |doi=10.1145/2834976.2834982 |isbn=9781450340083 |s2cid=3867190 }}
Solaris partially overcomes the limitation described above with its branded zones feature, which provides the ability to run an environment within a container that emulates an older Solaris 8 or 9 version on a Solaris 10 host. Linux branded zones (referred to as "lx" branded zones) are also available on x86-based Solaris systems, providing a complete Linux user space and support for the execution of Linux applications; additionally, Solaris provides utilities needed to install Red Hat Enterprise Linux 3.x or CentOS 3.x Linux distributions inside "lx" zones.{{Cite web |url=http://docs.oracle.com/cd/E19044-01/sol.containers/817-1592/zones.intro-1/index.html |title=System administration guide: Oracle Solaris containers-resource management and Oracle Solaris zones, Chapter 16: Introduction to Solaris zones |year=2010 |access-date=2014-09-02 |publisher=Oracle Corporation }}{{Cite web |url=http://docs.oracle.com/cd/E19044-01/sol.containers/817-1592/gchhy/index.html |title=System administration guide: Oracle Solaris containers-resource management and Oracle Solaris zones, Chapter 31: About branded zones and the Linux branded zone |year=2010 |access-date=2014-09-02 |publisher=Oracle Corporation }} However, in 2010 Linux branded zones were removed from Solaris; in 2014 they were reintroduced in Illumos, which is the open source Solaris fork, supporting 32-bit Linux kernels.{{Cite web |url=http://www.slideshare.net/bcantrill/illumos-lx |title=The dream is alive! Running Linux containers on an illumos kernel |date=2014-09-28 |access-date=2014-10-10 |author=Bryan Cantrill |website=slideshare.net }}
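The shared-kernel constraint discussed in this section can be observed from inside a container. The Python sketch below reports the userland distribution, read from an os-release style file, alongside the running kernel release from <code>os.uname()</code>. It assumes a Unix-like system; the helper names are invented for this example, and the default path <code>/etc/os-release</code> may not exist on every system, in which case the userland is reported as unknown.

```python
import os


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


def describe_environment(os_release_path="/etc/os-release"):
    """Return (userland name, kernel release) for the current environment."""
    try:
        with open(os_release_path, encoding="utf-8") as fh:
            userland = parse_os_release(fh.read()).get("PRETTY_NAME", "unknown")
    except OSError:
        userland = "unknown"
    # os.uname() reports the running kernel; inside an OS-level container
    # this is the host's kernel, whatever distribution the userland is.
    kernel = os.uname().release
    return userland, kernel


if __name__ == "__main__":
    userland, kernel = describe_environment()
    print(f"userland: {userland}")
    print(f"kernel:   {kernel}")
```

Run in containers built from different Linux distributions on the same host, the userland line would be expected to differ while the kernel line stays the same.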
= Storage =
Some implementations provide file-level copy-on-write (CoW) mechanisms. (Most commonly, a standard file system is shared between partitions, and those partitions that change the files automatically create their own copies.) This is easier to back up, more space-efficient and simpler to cache than the block-level copy-on-write schemes common on whole-system virtualizers. Whole-system virtualizers, however, can work with non-native file systems and create and roll back snapshots of the entire system state.
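File-level copy-on-write can be sketched with a toy in-memory model: every container reads from one shared base tree and receives a private copy of a file only when it writes to it. The class <code>CowView</code> and the file paths below are hypothetical and exist only for illustration; real implementations operate on actual file systems rather than dictionaries.

```python
class CowView:
    """A container's view of a shared, read-only base file tree.

    Files are modelled as {path: contents}. Reads fall through to the
    shared base until the container writes, at which point it gets a
    private copy; the base is never modified.
    """

    def __init__(self, base):
        self._base = base          # shared between all views
        self._private = {}         # this container's own copies
        self._deleted = set()      # paths removed in this view only

    def read(self, path):
        if path in self._deleted:
            raise FileNotFoundError(path)
        if path in self._private:
            return self._private[path]
        if path in self._base:
            return self._base[path]
        raise FileNotFoundError(path)

    def write(self, path, contents):
        self._deleted.discard(path)
        self._private[path] = contents

    def delete(self, path):
        self.read(path)            # raises if the file is not visible
        self._private.pop(path, None)
        self._deleted.add(path)

    def private_paths(self):
        return sorted(self._private)


base = {"/etc/hostname": "host", "/bin/sh": "<shell binary>"}
a, b = CowView(base), CowView(base)
a.write("/etc/hostname", "container-a")

print(a.read("/etc/hostname"))  # container-a
print(b.read("/etc/hostname"))  # host
print(a.private_paths())        # ['/etc/hostname']
```

Only the files a container has changed occupy extra space, which is the space-efficiency argument made above; backing up a container in this model means saving its private copies and deletions.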
{{Anchor|IMPLEMENTATIONS}}Implementations
class="wikitable sortable" style="font-size: 85%; text-align: center; width: 100%" | |
rowspan="2" | Mechanism
! rowspan="2" | Operating system ! rowspan="2" | License ! rowspan="2" | Actively developed since or between ! colspan="10" | Features | |
---|---|
File system isolation
!I/O rate limiting !Memory limits !Network isolation !Nested virtualization !Partition checkpointing and live migration !Root privilege isolation | |
chroot
| Most UNIX-like operating systems | Varies by operating system | 1982 | {{Partial}}{{Efn|name="root-escape"|Root user can easily escape from chroot. Chroot was never supposed to be used as a security mechanism.{{Cite web |url=http://www.freebsd.org/doc/en/books/developers-handbook/secure-chroot.html|title=3.5. Limiting your program's environment |work=freebsd.org}}}} | {{No}} | {{No}} | {{No}} | {{No}} | {{No}} | {{No}} | {{Yes}} | {{No}} | {{No}} | |
Docker
|Linux,{{Cite web |url=http://www.infoq.com/news/2014/03/docker_0_9|title=Docker drops LXC as default execution environment |work=InfoQ }} Windows x64,{{Cite web |date=9 February 2023 |title=Install Docker desktop on Windows {{!}} Docker documentation |url=https://docs.docker.com/desktop/install/windows-install/ |work=Docker }} macOS{{Cite web |url=https://docs.docker.com/docker-for-mac/ |title=Get started with Docker desktop for Mac |date=December 6, 2019 |website=Docker documentation}} |{{open source|Apache License 2.0}} | 2013 | {{Yes}} | {{Yes}} | {{Partial}}{{Efn|name="docker-disk-quotas"|For btrfs, overlay2, windowsfilter, and zfs storage drivers.}} | {{Yes}} {{Nowrap|(since 1.10)}} | {{Yes}} | {{Yes}} | {{Yes}} | {{Yes}} | {{No|Only in experimental mode with CRIU [https://criu.org/Docker]}} | {{Yes}} {{Nowrap|(since 1.10)}} | |
Linux-VServer (security context) |{{Open source|GNU GPLv2}} | 2001 | {{Yes}} | {{Yes}} | {{Yes}} | {{Yes}}{{Efn|name="cfq"|Using the CFQ scheduler, there is a separate queue per guest.}} | {{Yes}} | {{Yes}} | {{Partial}}{{Efn|name="vserver-net"|Networking is based on isolation, not virtualization.}} | {{Dunno}} | {{No}} | {{Partial|Partial{{Efn|name="linux-vserver-paper"|A total of 14 user capabilities are considered safe within a container. The rest cannot be granted to processes within that container without allowing those processes to potentially interfere with things outside that container.{{Cite web |url=http://linux-vserver.org/Paper#Secure_Capabilities|title=Paper - Linux-VServer| website=linux-vserver.org }}}}}} | |
lmctfy
| Linux | {{open source|Apache License 2.0}} | 2013{{Ndash}}2015 | {{Yes}} | {{Yes}} | {{Yes}} | {{Yes}}{{Efn|name="cfq"}} | {{Yes}} | {{Yes}} | {{Partial}}{{Efn|name="vserver-net"}} | {{Dunno}} | {{No}} | {{Partial|Partial{{Efn|name="linux-vserver-paper"}}}} | |
LXC
| Linux |{{open source|GNU GPLv2}} | 2008 | {{Yes}} | {{Partial}}{{Efn|name="lxc-dq"|Disk quotas per container are possible when using separate partitions for each container with the help of LVM, or when the underlying host filesystem is btrfs, in which case btrfs subvolumes are automatically used.}} | {{Partial}}{{Efn|name="lxc-iolimit"|I/O rate limiting is supported when using Btrfs.}} | {{Yes}} | {{Yes}} | {{Yes}} | {{Yes}} | {{Yes}} | |
Singularity
| Linux |{{Open source|BSD Licence}} | {{Yes}} | {{Yes}} | {{No}} | {{No}} | {{No}} | {{No}} | {{No}} | {{No}} | |
OpenVZ
|{{open source|GNU GPLv2}} | 2005 | {{Yes}} | {{Yes}} | {{Yes}}{{Efn|name="ioprio"|Available since Linux kernel 2.6.18-028stable021. Implementation is based on CFQ disk I/O scheduler, but it is a two-level schema, so I/O priority is not per-process, but rather per-container.{{Cite web |url=http://wiki.openvz.org/I/O_priorities_for_VE |title=I/O priorities for containers |work=OpenVZ Virtuozzo Containers Wiki }}}} | {{Yes}} | {{Yes}} | {{Yes}}{{Efn|name="vn"|Each container can have its own IP addresses, firewall rules, routing tables and so on. Three different networking schemes are possible: route-based, bridge-based, and assigning a real network device (NIC) to a container.}} | {{Partial}}{{Efn|name="docker-inside-openvz"|Docker containers can run inside OpenVZ containers.{{Cite web |url=https://openvz.org/Docker_inside_CT |title=Docker inside CT }}}} | {{Yes}} | {{Yes|Yes{{Efn|name="openvz-wiki-container"|Each container may have root access without possibly affecting other containers.{{Cite web |url=http://wiki.openvz.org/Container|title=Container |work=OpenVZ Virtuozzo Containers Wiki }}}}}} | |
Virtuozzo
|{{Proprietary|Trialware}} | {{Yes}} | {{Yes}} | {{Yes}} | {{Yes}}{{Efn|name="vz4"|Available since version 4.0, January 2008.}} | {{Yes}} | {{Yes}} | {{Yes}}{{Efn|name="vn"}} | {{Partial}}{{Efn|name="vz-docker-inside-ct"|Docker containers can run inside Virtuozzo containers.{{Cite web |url=http://www.odin.com/news/pr/release/article/parallels-virtuozzo-now-provides-native-support-for-docker/ |title=Parallels Virtuozzo now provides native support for Docker}}}} | {{Yes}} | {{Yes}} | |
Solaris Containers (Zones) | illumos (OpenSolaris), Solaris |{{Free|CDDL}}, | 2004 | {{Yes}} | {{Yes}} (ZFS) | {{Yes}} | {{Partial}}{{Efn|name="solaris-iolimit"|Yes with illumos{{Cite web |last=Pijewski |first=Bill |title=Our ZFS I/O Throttle |url=https://wdp.dtrace.org/2011/03/our-zfs-io-throttle/ |date=March 1, 2011 | website=wdp.dtrace.org}}}} | {{Yes}} | {{Yes}} | {{Yes}}{{Efn|name="crossbow"|See Solaris network virtualization and resource control for more details.}}[http://www.opensolaris.org/os/project/crossbow/faq/ Network virtualization and resource control (Crossbow) FAQ] {{Webarchive|url=https://web.archive.org/web/20080601182802/http://www.opensolaris.org/os/project/crossbow/faq/ |date=2008-06-01 }}{{Cite web |url=https://docs.oracle.com/cd/E36784_01/html/E36813/index.html |title=Managing network virtualization and network resources in Oracle® Solaris 11.2 |website=docs.oracle.com }} | {{Partial}}{{Efn|name="solaris-nested"|Only when top level is a KVM zone (illumos) or a kz zone (Oracle).}} | {{Partial}}{{Efn|name="kernelzone"|Starting in Solaris 11.3 Beta, Solaris Kernel Zones may use live migration.}}{{Efn|name="coldmig"|Cold migration (shutdown-move-restart) is implemented.}} | {{Yes|Yes}}{{Efn|name="solaris-E29024"|Non-global zones are restricted so they may not affect other zones via a capability-limiting approach. The global zone may administer the non-global zones.Oracle Solaris 11.1 administration, Oracle Solaris zones, Oracle Solaris 10 zones and resource management E29024.pdf, pp. 356–360. Available [http://www.oracle.com/technetwork/documentation/solaris-11-192991.html within an archive].}} |
FreeBSD jail
|{{Open source|BSD License}} | {{Yes}} | {{Yes}} (ZFS) | {{Yes}}{{Efn|Check the "allow.quotas" option and the "Jails and file systems" section on the [http://www.freebsd.org/cgi/man.cgi?query%3Djail&sektion%3D8 FreeBSD jail man page] for details.}} | {{Yes}} | {{Yes}} | {{Yes}} | {{Partial}}{{Cite web |url=http://www.7he.at/freebsd/vps/|title=VPS for FreeBSD |access-date=2016-02-20 }}{{Cite web |url=https://forums.freebsd.org/threads/34284/ |title=[Announcement] VPS // OS virtualization // alpha release |date=31 August 2012 |access-date=2016-02-20 }} | |
vkernel
|{{Open source|BSD Licence}} | 2006{{Cite web |author=Matthew Dillon |author-link=Matthew Dillon |year=2006 |url=http://bxr.su/d/sys/sys/vkernel.h |title=sys/vkernel.h |website=BSD cross reference |publisher=DragonFly BSD }} | {{Yes}}{{Cite web |url=http://mdoc.su/d/vkd.4 |title=vkd(4) — Virtual kernel disc |publisher=DragonFly BSD |quote="treats the disk image as copy-on-write." }} | {{N/A}} | {{Dunno}} | {{Yes}}{{Cite web |author=Sascha Wildner |date=2007-01-08 |url=http://bxr.su/d/share/man/man7/vkernel.7 |title=vkernel, vcd, vkd, vke — virtual kernel architecture |work=DragonFly miscellaneous information manual |publisher=DragonFly BSD}}
| {{Yes}}{{r|vkernel.7}} | {{Yes}}{{Cite web |url=http://mdoc.su/d/vke.4 |title=vkernel, vcd, vkd, vke - virtual kernel architecture |work=DragonFly On-Line Manual Pages |publisher=DragonFly BSD }} | {{Dunno}} | {{Dunno}} | {{Yes}} | |
sysjail
|{{Open source|BSD License}} | 2006–2009 | {{Yes}} | {{No}} | {{No}} | {{No}} | {{No}} | {{No}} | {{Yes}} | {{No}} | {{No}} | {{dunno}} | |
WPARs
|AIX |{{Proprietary|Commercial proprietary software}} | 2007 | {{Yes}} | {{No}} | {{Yes}} | {{Yes}} | {{Yes}} | {{Yes}} | {{Yes}}{{Efn|Available since TL 02.{{Cite web |url=http://www-01.ibm.com/support/docview.wss?uid=isg1fixinfo109461|title=IBM fix pack information for: WPAR network isolation - United States |website=ibm.com |date=21 July 2011 }}}} | {{No}} | {{Dunno}} | |
iCore Virtual Accounts
|{{Proprietary|Freeware}} | 2008 | {{Yes}} | {{No}} | {{Yes}} | {{No}} | {{No}} | {{No}} | {{No}} | {{Dunno}} | {{No}} | {{Dunno}} | |
Sandboxie
| Windows | {{open source|GNU GPLv3}} | 2004 | {{Yes}} | {{Yes}} | {{Partial}} | {{No}} | {{No}} | {{No}} | {{Partial}} | {{No}} | {{No}} | {{Yes}} | |
systemd-nspawn
| Linux | {{Open source|GNU LGPLv2.1+}} | 2010 | {{Yes}} | {{Yes}} | {{Yes}}{{Cite web |url=https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html#--property= |title=systemd-nspawn |website=www.freedesktop.org }}{{Cite web |url=https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/resource_management_guide/sec-modifying_control_groups |title=2.3. Modifying control groups Red Hat Enterprise Linux 7 |website=Red Hat Customer portal}} | {{Yes}} | {{Dunno}} | {{Dunno}} | {{Yes}} | |
Turbo
| Windows |{{Proprietary|Freemium}} | 2012 | {{Yes}} | {{No}} | {{No}} | {{No}} | {{No}} | {{No}} | {{Yes}} | {{No}} | {{No}} | {{Yes}} | |
rkt (rocket)
| Linux | {{Open source|Apache License 2.0}} | {{Yes}} | {{Yes}} | {{Yes}} | {{Yes}} | {{Yes}} | {{Yes}} | {{Yes}} | {{Dunno}} | {{Dunno}} | {{Yes}} |
Linux containers not listed above include:
- LXD, an alternative wrapper around LXC developed by Canonical{{Cite web |access-date=2021-02-11 |title=LXD |url=https://linuxcontainers.org/lxd/ |website=linuxcontainers.org }}
- Podman, a Kubernetes-ready, rootless drop-in replacement for Docker with support for multiple container image formats, including OCI and Docker images ([https://indico.cern.ch/event/757415/contributions/3421994/attachments/1855302/3047064/Podman_Rootless_Containers.pdf Rootless containers with Podman and fuse-overlayfs], CERN workshop, 2019-06-04)
- Charliecloud, a set of container tools used on HPC systems{{Cite web |url=https://hpc.github.io/charliecloud/ |access-date=4 October 2020 |title=Overview — Charliecloud 0.25 documentation }}
- Kata Containers, a MicroVM platform{{Cite web |url=https://katacontainers.io/ |title=Home |website=katacontainers.io}}
- Bottlerocket, a Linux-based open-source operating system purpose-built by Amazon Web Services for running containers on virtual machines or bare-metal hosts{{Cite web |url=https://aws.amazon.com/bottlerocket/ |title=Bottlerocket is a Linux-based operating system purpose-built to run containers }}
- Azure Linux, an open-source Linux distribution purpose-built by Microsoft Azure and similar to Fedora CoreOS
See also
- Container Linux
- Container orchestration
- Flatpak package manager
- Linux cgroups
- Linux namespaces
- Hypervisor
- Portable application creators
- Open Container Initiative
- Sandbox (software development)
- Separation kernel
- Serverless computing
- Snap package manager
- Storage hypervisor
- Virtual private server (VPS)
- Virtual resource partitioning
Notes
{{Notelist|30em}}
References
{{Reflist|30em}}
External links
- [https://www.kernelthread.com/publications/virtualization/ An introduction to virtualization] {{Webarchive|url=https://web.archive.org/web/20191128152118/http://www.kernelthread.com/publications/virtualization |date=2019-11-28 }}
- [https://wiki.openvz.org/Introduction_to_virtualization A short intro to three different virtualization techniques]
- [https://thijs.ai/papers/scheepers-virtualization-containerization.pdf Virtualization and containerization of application infrastructure: A comparison], June 22, 2015, by Mathijs Jeroen Scheepers
- [https://lwn.net/Articles/646054/ Containers and persistent data], LWN.net, May 28, 2015, by Josh Berkus
{{Virtualization software}}
{{DEFAULTSORT:Operating-system-level virtualization}}
Category:Operating system security