Efficient Storage for Linux Virtualization

I was recently helping someone migrate some virtual machines from VMware to Linux. The virtual machines had been running on top of VMware Workstation, which was itself running as an application on Windows. The plan was to move the VMs to a Linux platform, taking advantage of the KVM (Kernel-based Virtual Machine) support built into the Linux kernel. KVM significantly improves performance because guest code runs directly on the host CPU using its hardware virtualization extensions, rather than being emulated instruction by instruction. QEMU (the Quick Emulator) is the hypervisor or virtual machine monitor, the user-space application controlling the VMs.

We were doing full system emulation on native hardware. That is, the actual hardware was AMD64, and the VM was being presented with a simulated AMD64 platform. We saw a qemu-kvm process using CPU and memory. That was the hypervisor.

The owner of the virtual machines got a pleasant surprise. It’s the same thing you see in Learning Tree’s Linux virtualization course. The storage can be even more space-efficient than you expect thanks to two layers of technology.

Copy-on-Write

Copy-on-write is an optimization method for storage.

Imagine that several of us are working on a project, and we all need a rather expensive reference book. We could buy one book and share it. If you need to make notes on a few pages, then you make your own copy of just those pages and write your notes there. It’s a large book, and most of the pages will never need to be copied, so the total amount of paper and time required will be far less than making a personal copy for each of us. That situation is a physical analogy to copy-on-write.

A copy-on-write virtual disk image can present a large disk to the guest while consuming host storage only for the blocks that have actually been written so far. The QEMU hypervisor supports qcow2 (QEMU Copy-On-Write version 2), the second major version of the format, which also supports multiple snapshots.
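
To make the book analogy concrete, here is a minimal sketch in Python. It is only a toy model of the idea, not how qcow2 is implemented. The class name CowOverlay and the page strings are made up for illustration.

```python
# Toy copy-on-write overlay. Hypothetical illustration only;
# qcow2 itself works on clusters inside an image file, not on a dict.
class CowOverlay:
    """Reads fall through to a shared base; writes go to a private overlay."""

    def __init__(self, base):
        self.base = base      # shared and never modified (the reference book)
        self.private = {}     # block number -> this user's own modified copy

    def read(self, block):
        # Prefer the private copy, otherwise read the shared original.
        return self.private.get(block, self.base[block])

    def write(self, block, data):
        # Only the block being changed is stored privately.
        self.private[block] = data


base = {n: f"page {n}" for n in range(1000)}   # one shared 1000-page book
alice = CowOverlay(base)
bob = CowOverlay(base)

alice.write(7, "page 7 + Alice's notes")

print(alice.read(7))        # Alice sees her annotated copy
print(bob.read(7))          # Bob still sees the original page
print(len(alice.private))   # number of pages Alice stores privately
```

Each user pays only for the pages they change, which is the same reason a fresh 16 GB virtual disk does not need 16 GB of host storage.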

Examining qcow2 Disk Images

We defined a new VM image, giving it a 16 GB disk. We started the VM with a Debian Linux DVD ISO image attached as its optical drive.

The installation kernel saw the virtual disk image as /dev/sda, an ordinary SATA or SCSI disk. The installation partitioned the disk, created an Ext4 file system, and installed the operating system. Once done, we rebooted off the virtual disk and checked things out.

Running the df -hT command, we saw that Debian (like most current distributions) prefers UUIDs. The root file system was mounted on a device named /dev/disk/by-uuid/2162c5f0-32ef-4609-8319-7222cfe0c66f. The interesting part was in the later columns:

Type Size Used Avail Use% Mounted on
ext4  16G 2.7G   12G  19% /

Even Smaller Yet!

We were using qcow2, so we expected the disk image file to be a little over 2.7 GB in size rather than the full 16 GB. Let’s go out to the host OS and look:

$ ls -lh
total 1.1G
-rw-r--r--.  1  qemu qemu 1.1G Jan  7 09:35 image.qcow2

That’s less than half the size of what we’re actually using within the VM! What’s happening? Let’s investigate. I’ve broken the one very long line of output from the second command, qemu-img check, across several lines to keep it readable.

$ file image.qcow2
image.qcow2: QEMU QCOW Image (v2), 17179869184 bytes
$ qemu-img check image.qcow2
47732/262144 = 18.21% allocated, \
                96.73% fragmented, \
                95.47% compressed clusters
Image end offset: 1170997248
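
The version and virtual size that file reports come from a small fixed header at the start of the image. Here is a rough Python sketch of decoding that header. To stay self-contained it parses a synthetic 32-byte header rather than the real file. The field values are assumptions chosen to match the output above, with cluster_bits set to 16, the value implied by 17179869184 bytes spread over 262144 clusters.

```python
import struct

QCOW_MAGIC = b"QFI\xfb"       # first four bytes of every QCOW image
HEADER_FORMAT = ">4sIQIIQ"    # big-endian: magic, version, backing file offset,
                              # backing file name length, cluster_bits, size


def read_qcow_header(data):
    """Decode the first 32 bytes of a qcow2 image."""
    magic, version, _backing_off, _backing_len, cluster_bits, size = struct.unpack(
        HEADER_FORMAT, data[:32]
    )
    if magic != QCOW_MAGIC:
        raise ValueError("not a QCOW image")
    return {
        "version": version,
        "cluster_size": 1 << cluster_bits,   # bytes per cluster
        "virtual_size": size,                # disk size the guest sees, in bytes
    }


# Synthetic header standing in for image.qcow2 (values assumed, see above).
sample = struct.pack(HEADER_FORMAT, QCOW_MAGIC, 2, 0, 0, 16, 17179869184)
info = read_qcow_header(sample)
print(info)
```

To try it on a real image, you could pass in the first 32 bytes read from the file opened in binary mode.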

qcow2 supports zlib compression, and over 95% of the allocated clusters in this image are compressed!
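
A little arithmetic on those numbers shows where the saving comes from. This is a back-of-the-envelope sketch: it treats the image end offset as the file size and ignores the space qcow2 uses for its own metadata, so the ratio is approximate.

```python
# Figures copied from the file and qemu-img check output above.
virtual_size = 17179869184      # bytes, as reported by file
total_clusters = 262144
allocated_clusters = 47732
image_end_offset = 1170997248   # bytes

GiB = 1024 ** 3

cluster_size = virtual_size // total_clusters          # bytes per cluster
allocated_bytes = allocated_clusters * cluster_size    # guest data that has been written
allocated_pct = 100 * allocated_clusters / total_clusters

print(f"cluster size:     {cluster_size // 1024} KiB")
print(f"allocated:        {allocated_pct:.2f}% of the virtual disk")
print(f"allocated data:   {allocated_bytes / GiB:.2f} GiB")
print(f"image file:       {image_end_offset / GiB:.2f} GiB")
print(f"file / allocated: {image_end_offset / allocated_bytes:.0%}")
```

That works out to roughly 2.9 GiB of allocated guest clusters stored in about 1.1 GiB of file, which agrees with what ls reported on the host.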

Great Space Savings, But What About Performance?

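Yes, the qemu-kvm hypervisor must run virtual disk I/O on compressed clusters through a compression module. But consider the speed advantage the CPU has over mechanical disks: reading fewer bytes from a slow disk and expanding them in memory can cost less than reading the same data uncompressed.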

Compression makes good sense for prototyping. For deployment, a raw image or even a dedicated hardware device might be preferred. You can use the qemu-img command to convert a qcow2 virtual disk image into another format.
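
As a sketch of that conversion, the following Python wrapper builds and optionally runs a qemu-img convert command. The output file name image.raw and the helper name convert_command are assumptions for illustration. The -f and -O options name the source and destination formats.

```python
import os
import shutil
import subprocess


def convert_command(src, dst, out_fmt="raw", src_fmt="qcow2"):
    """Build the qemu-img argument list that converts src into dst."""
    return ["qemu-img", "convert", "-f", src_fmt, "-O", out_fmt, src, dst]


cmd = convert_command("image.qcow2", "image.raw")
print(" ".join(cmd))

# Run the conversion only where qemu-img is installed and the source exists.
if shutil.which("qemu-img") and os.path.exists("image.qcow2"):
    subprocess.run(cmd, check=True)
```

Keep in mind that a raw image represents the whole 16 GB virtual disk, so make sure the destination file system has room for it.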

Check back next time and we’ll look into performance tuning for virtual machines!
