
Portal | Level: L2: Operations | Topics: Virtualization, Server Hardware, Linux Fundamentals | Domain: Datacenter & Hardware

Virtualization - Primer

Why This Matters

Containers get all the hype, but virtualization is the foundation underneath most of it. Your Kubernetes nodes run on VMs. Your CI runners are VMs. Your database tier probably runs on VMs. Even "bare metal" cloud instances often use lightweight hypervisors (AWS Nitro is a hypervisor). Understanding virtualization means understanding the layer that hosts everything else.

If you manage infrastructure — whether on-prem or cloud — you will configure, troubleshoot, migrate, and performance-tune virtual machines. This primer gives you the mechanics.


Hypervisor Types

 ┌───────────────────────────┐    ┌───────────────────────────┐
 │    Type 1 (Bare-Metal)    │    │      Type 2 (Hosted)      │
 │                           │    │                           │
 │  ┌──────┐ ┌──────┐        │    │  ┌──────┐ ┌──────┐        │
 │  │ VM 1 │ │ VM 2 │        │    │  │ VM 1 │ │ VM 2 │        │
 │  └──┬───┘ └──┬───┘        │    │  └──┬───┘ └──┬───┘        │
 │     │        │            │    │     │        │            │
 │  ┌──▼────────▼──┐         │    │  ┌──▼────────▼──┐         │
 │  │  Hypervisor  │         │    │  │  Hypervisor  │         │
 │  └──────┬───────┘         │    │  └──────┬───────┘         │
 │         │                 │    │         │                 │
 │  ┌──────▼───────┐         │    │  ┌──────▼───────┐         │
 │  │   Hardware   │         │    │  │   Host OS    │         │
 │  └──────────────┘         │    │  └──────┬───────┘         │
 │                           │    │  ┌──────▼───────┐         │
 │                           │    │  │   Hardware   │         │
 │                           │    │  └──────────────┘         │
 └───────────────────────────┘    └───────────────────────────┘
 Type   Examples                         Overhead   Use Case
 1      ESXi, KVM, Xen, Hyper-V          Low        Datacenters, cloud
 2      VirtualBox, VMware Workstation   Higher     Development, testing

KVM is technically a "Type 1.5" — it turns the Linux kernel itself into a hypervisor via a kernel module, giving it bare-metal performance with the flexibility of a hosted system.

Who made it: KVM was created by Avi Kivity at the Israeli startup Qumranet in 2006. It was merged into the Linux kernel mainline (v2.6.20) in February 2007, one of the fastest mainline merges for a feature of that size. Red Hat acquired Qumranet in 2008 and has been KVM's primary sponsor since. The key insight: instead of building a hypervisor from scratch, make the Linux kernel itself the hypervisor.


KVM + QEMU + libvirt — The Linux Stack

This trio is the standard Linux virtualization stack:

 Management (virt-manager, virsh, Cockpit, oVirt)
        │
 ┌──────▼──────┐
 │   libvirt   │  ← API + daemon (libvirtd)
 └──────┬──────┘
        │
 ┌──────▼──────┐
 │    QEMU     │  ← hardware emulation (userspace)
 └──────┬──────┘
        │
 ┌──────▼──────┐
 │     KVM     │  ← CPU virtualization (kernel module)
 └──────┬──────┘
        │
 ┌──────▼──────┐
 │  Hardware   │  ← VT-x/AMD-V extensions
 └─────────────┘

KVM provides hardware-accelerated CPU and memory virtualization via /dev/kvm.

QEMU emulates everything else: disks, network cards, USB, display. Without KVM, QEMU does full software emulation (slow). With KVM, QEMU delegates CPU work to hardware (fast).

libvirt provides a unified API for managing VMs. You talk to libvirt; libvirt talks to QEMU/KVM. This abstraction means your tools (virsh, virt-manager, Terraform) don't need to know QEMU command-line flags.
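Concretely, libvirt stores every VM definition as an XML document: virsh dumpxml web01 prints it, and virsh edit web01 opens it for editing. A trimmed sketch of what a definition looks like (names, sizes, and paths illustrative, not a complete definition):

```xml
<domain type='kvm'>
  <name>web01</name>
  <memory unit='GiB'>4</memory>
  <vcpu>2</vcpu>
  <os>
    <type arch='x86_64'>hvm</type>
  </os>
  <devices>
    <!-- qcow2 disk exposed to the guest as a VirtIO block device -->
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/web01.qcow2'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <!-- bridged NIC using the virtio-net model -->
    <interface type='bridge'>
      <source bridge='br0'/>
      <model type='virtio'/>
    </interface>
  </devices>
</domain>
```

Tools like virt-manager, Cockpit, and Terraform ultimately read and write this XML through the libvirt API rather than invoking QEMU directly.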

Checking Hardware Support

# Check for VT-x (Intel) or AMD-V (AMD)
grep -E '(vmx|svm)' /proc/cpuinfo

# Verify KVM module is loaded
lsmod | grep kvm

# Quick capability check
virt-host-validate

VM Lifecycle

 Define ──→ Create/Start ──→ Running ──→ Pause/Resume
                                │
                                ├──→  Shutdown (graceful)
                                ├──→  Destroy (force off)
                                ├──→  Suspend (save state to disk)
                                ├──→  Migrate (live, to another host)
                                └──→  Snapshot (point-in-time capture)
                                          │
                                          └──→  Revert

Creating a VM

# Using virt-install (the standard way)
virt-install \
  --name web01 \
  --ram 4096 \
  --vcpus 2 \
  --disk path=/var/lib/libvirt/images/web01.qcow2,size=40,format=qcow2 \
  --os-variant centos-stream9 \
  --network bridge=br0 \
  --graphics vnc,listen=0.0.0.0 \
  --cdrom /var/lib/libvirt/boot/CentOS-Stream-9-latest-x86_64-dvd1.iso \
  --noautoconsole

Core virsh Commands

virsh list --all              # List all VMs (running + stopped)
virsh start web01             # Start a VM
virsh shutdown web01          # Graceful shutdown (sends ACPI signal)
virsh destroy web01           # Force power off (like pulling the plug)
virsh reboot web01            # Graceful reboot
virsh suspend web01           # Pause (freeze in memory)
virsh resume web01            # Unpause

virsh console web01           # Serial console access (Ctrl+] to exit)
virsh vncdisplay web01        # Get VNC display number

virsh dominfo web01           # VM metadata
virsh domblklist web01        # List block devices
virsh domiflist web01         # List network interfaces

VirtIO — Paravirtualized Devices

Fully emulated devices (e.g., emulating an Intel e1000 NIC) carry overhead because the guest OS doesn't know it's virtual. VirtIO is a paravirtualized framework where the guest explicitly cooperates with the hypervisor.

 Device Type         Emulated          VirtIO            Performance Gap
 Network             e1000, rtl8139    virtio-net        2-10x
 Block storage       IDE, SATA         virtio-blk/scsi   2-5x
 Memory ballooning   N/A               virtio-balloon    N/A
 Random number       N/A               virtio-rng        N/A
 Console             Emulated serial   virtio-serial     Cleaner

Always use VirtIO devices in production. Most modern OS images include VirtIO drivers. Windows guests need the VirtIO driver ISO installed.

Under the hood: VirtIO works by establishing a shared memory ring buffer between guest and host. Instead of the guest writing to fake hardware registers that the hypervisor intercepts and translates, both sides read/write directly to shared memory. This eliminates the "trap and emulate" overhead that makes emulated devices slow.
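In the libvirt domain XML, the difference between an emulated and a paravirtualized device is usually a single model attribute. A sketch for a bridged NIC (bridge name illustrative):

```xml
<!-- Emulated: guest sees a fake Intel e1000 NIC (trap-and-emulate, slow) -->
<interface type='bridge'>
  <source bridge='br0'/>
  <model type='e1000'/>
</interface>

<!-- Paravirtualized: guest loads the virtio-net driver (shared ring, fast) -->
<interface type='bridge'>
  <source bridge='br0'/>
  <model type='virtio'/>
</interface>
```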


CPU Pinning and NUMA

CPU Pinning

By default, vCPUs float across physical CPUs. For latency-sensitive workloads, pin vCPUs to specific physical cores:

# Pin vCPU 0 to physical CPU 2, vCPU 1 to physical CPU 3
virsh vcpupin web01 0 2
virsh vcpupin web01 1 3

# Verify
virsh vcpupin web01
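Pinning applied with virsh affects the running VM only unless you add --config; the persistent form lives in the domain XML under <cputune>. A sketch matching the commands above (edit with virsh edit web01):

```xml
<vcpu placement='static'>2</vcpu>
<cputune>
  <vcpupin vcpu='0' cpuset='2'/>
  <vcpupin vcpu='1' cpuset='3'/>
</cputune>
```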

NUMA Awareness

Modern servers have Non-Uniform Memory Access (NUMA) — CPUs have "local" memory that's fast and "remote" memory that's slow:

 ┌───────────────────┐         ┌───────────────────┐
 │    NUMA Node 0    │         │    NUMA Node 1    │
 │                   │   QPI   │                   │
 │  CPU 0-7          │◄───────►│  CPU 8-15         │
 │  Local RAM: 64GB  │         │  Local RAM: 64GB  │
 └───────────────────┘         └───────────────────┘

# View NUMA topology
numactl --hardware
# or
virsh capabilities | grep -A 20 topology

# Pin a VM to NUMA node 0
virsh numatune web01 --mode strict --nodeset 0

Rule: Keep a VM's vCPUs and memory on the same NUMA node. Cross-node memory access adds 40-100ns latency per access.
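The persistent way to enforce this rule is in the domain XML: restrict the vCPUs to node 0's cores and the memory allocation to node 0. A sketch, assuming node 0 owns CPUs 0-7 as in the diagram above:

```xml
<vcpu placement='static' cpuset='0-7'>2</vcpu>
<numatune>
  <memory mode='strict' nodeset='0'/>
</numatune>
```

With mode='strict', memory allocation is confined to the nodeset rather than spilling over to a remote node.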

Debug clue: If a database VM has unexplained latency spikes, check NUMA placement first: numastat -p $(pgrep qemu). If "Other Node" memory is significant, the VM is accessing remote NUMA memory. This is one of the most common hidden performance killers in virtualized database workloads.


Memory Ballooning

The balloon driver lets the hypervisor reclaim memory from guests dynamically:

# Set maximum memory to 8GB, current to 4GB
virsh setmaxmem web01 8G --config
virsh setmem web01 4G --live

# Inflate balloon (reclaim memory from guest) — set to 2GB
virsh setmem web01 2G --live

# Deflate balloon (give memory back) — set to 6GB
virsh setmem web01 6G --live

The guest OS sees less available RAM when the balloon inflates. The kernel handles this by swapping or reclaiming cache.
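In the domain XML, the ceiling and the current balloon target map to two elements, and the balloon itself is a VirtIO device (values illustrative, matching the commands above):

```xml
<memory unit='GiB'>8</memory>                 <!-- ceiling: virsh setmaxmem -->
<currentMemory unit='GiB'>4</currentMemory>   <!-- balloon target: virsh setmem -->
<devices>
  <memballoon model='virtio'/>
</devices>
```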


Storage Formats

 Format   Thin Provision   Snapshots   Performance   Use Case
 raw      No               No          Best          High-performance I/O
 qcow2    Yes              Yes         Good          General purpose
 vmdk     Varies           Varies      Good          VMware compatibility

# Create a qcow2 image
qemu-img create -f qcow2 /var/lib/libvirt/images/web01.qcow2 40G

# Check image info
qemu-img info /var/lib/libvirt/images/web01.qcow2

# Convert between formats
qemu-img convert -f vmdk -O qcow2 input.vmdk output.qcow2

Snapshot Management

# Create a snapshot
virsh snapshot-create-as web01 --name "pre-upgrade" --description "Before kernel upgrade"

# List snapshots
virsh snapshot-list web01

# Revert to snapshot
virsh snapshot-revert web01 --snapshotname "pre-upgrade"

# Delete a snapshot
virsh snapshot-delete web01 --snapshotname "pre-upgrade"

Snapshots are not backups. They create a chain of differential disk images that degrades performance as the chain grows. Use them for short-term rollback points (hours, not weeks).

Gotcha: a common disaster is leaving VM snapshots in place for weeks. Each snapshot adds a layer to the disk chain, and I/O must traverse every layer to read data. Production VMs with 10+ snapshots can see 5-10x I/O degradation. Worse, consolidating (deleting) old snapshots requires rewriting the entire chain, which can take hours and cause an outage if disk space runs out mid-merge.


Live Migration

Move a running VM from one host to another with near-zero downtime:

# Prerequisites:
# - Shared storage (NFS, Ceph, GlusterFS) or use --copy-storage-all
# - Same CPU architecture/features on both hosts
# - Network connectivity between hosts
# - libvirtd running on both hosts

# Basic live migration
virsh migrate --live --persistent web01 qemu+ssh://dest-host/system

# With bandwidth limit (MiB/s)
virsh migrate --live --persistent --bandwidth 500 web01 qemu+ssh://dest-host/system

# Copy storage along with the VM (no shared storage needed)
virsh migrate --live --persistent --copy-storage-all web01 qemu+ssh://dest-host/system

Migration workflow:

  1. Memory pages copied to destination (iterative — re-copies dirty pages)
  2. VM briefly paused when dirty page rate converges
  3. Remaining state (CPU registers, device state) transferred
  4. VM resumes on destination

Typical downtime: 10-200ms for well-behaved workloads.


VMware ESXi — Quick Reference

If you work in enterprise environments, you'll likely encounter ESXi:

# ESXi CLI (ssh to host)
esxcli vm process list           # List running VMs
esxcli vm process kill --type=soft --world-id=12345  # Soft kill (most graceful of soft/hard/force)
vim-cmd vmsvc/getallvms          # List all registered VMs
vim-cmd vmsvc/power.on 42        # Power on VM by ID

# Datastore operations
esxcli storage filesystem list   # List datastores

Key differences from KVM:

  - Proprietary hypervisor (free tier available, full features require a vSphere license)
  - vCenter for multi-host management (equivalent to oVirt for KVM)
  - VMFS or vSAN for storage (vs qcow2/raw on local/shared FS)
  - vMotion = live migration equivalent


Key Takeaways

  1. KVM + QEMU + libvirt is the standard open-source virtualization stack.
  2. Always use VirtIO devices — emulated legacy devices waste CPU cycles.
  3. NUMA-aware placement matters on multi-socket servers — cross-node access kills latency.
  4. qcow2 for flexibility, raw for maximum I/O performance.
  5. Snapshots degrade performance over time — use them short-term and clean up.
  6. Live migration needs shared storage or explicit storage copy.
  7. Learn virsh — it's the CLI that every tool above libvirt eventually calls.
