Xen Storage Management



Block Device

Storage 'published' as a block special file, usually somewhere under /dev (see mknod(1)). A block device acts like one huge file, possibly with extra features defined by the hardware or software behind it.
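To make "block type device file" concrete, here is a minimal Python sketch that checks whether a path names a block device by inspecting its file mode. The function name is_block_device is invented for this example, and /dev/hda is only a placeholder device name; os.stat and stat.S_ISBLK are standard library calls.

```python
import os
import stat


def is_block_device(path):
    """Return True if path exists and is a block special file."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path, or no permission to stat it: report "not a block device".
        return False
    return stat.S_ISBLK(mode)


if __name__ == "__main__":
    # /dev/hda is only an example name; substitute a device present on your host.
    for candidate in ("/dev/hda", "/dev/null"):
        print(candidate, is_block_device(candidate))
```

A character device such as /dev/null or a regular file yields False; only block special files yield True.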

Hardware Block Device

Domain0 access to real physical storage, be it a whole IDE disk, a partition, or a hardware RAID.

Software Block Device

Software volume management (evms, lvm) and software RAID (md) wrap hardware and other software block devices and expose them under new names, which is how they add functionality to the storage they manage. For example, a software RAID of /dev/hda and /dev/hdb might be accessed through an 'md' device named /dev/md0. Alternatively, it might be part of an lvm volume group which is managed by evms and named 'example'. In that case its name would be /dev/evms/example.

Network Block Device

Several kernel drivers export storage as a simple block device, which clients may use exactly as they would a regular hardware or software block device.

One can even combine all three classes. A Xen Project domain (or any unix host for that matter) could create a filesystem on an evms-managed volume created from a region of an lvm2 container whose storage regions are each software raid devices created from network block devices exported from other hosts. This sort of configuration could be used to create a cluster-wide block device which would tolerate some storage node failures.

Types of network block device

an extremely simple protocol available in the mainline Linux kernel
an enhanced nbd from RedHat, intended for cluster use
a heavily-engineered protocol and industry standard for accessing devices across an IP network

Network File System

Network filesystems differ from block devices in that, like regular filesystems, they implement unix filesystem semantics such as directories, symlinks, hardlinks, character and block device files, access control lists and locking mechanisms. Not all network filesystems implement these features equally, though.

There are many to choose from. Here are some candidates for use with Xen Project software:

mature, widely supported
developed by RedHat for clusters
successor to gfs, still in development
designed for high client:server ratios and high latency networks
descended from afs, still in development
descended (indirectly) from Coda, designed for large clusters

Use Volume Management

There are many reasons to use volume management for all block devices, and few disadvantages.

A volume manager can gracefully handle changes in hardware block device names. It scans devices as it discovers them and stitches them together or gives them names that do not depend on the access method. Guest domain configurations therefore need not be updated when, for example, /dev/hda suddenly becomes /dev/hde because the hard drive was moved from the ata66 controller on the old motherboard to the spiffy ultra133 controller which was just added to the host.

A volume manager can also facilitate hardware upgrades. It can allow the administrator to safely migrate data in use by a live guest domain from one device to another without impacting the operation of the guest.

Configuring Guest Access

Domain0 grants a guest domain access to host domain block devices with the 'disk' parameter in the domain's configuration:

disk = [ 'phy:dom0,domU,mode' ]

dom0: the Domain0 device path, in this case /dev/dom0. Paths are relative to /dev.
domU: how the host domain presents the device to the guest domain.
mode: 'r' for read-only, 'w' for read-write.
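
The three fields of a 'disk' entry can be assembled mechanically. The sketch below is not part of any Xen tool: phy_disk is a hypothetical helper, and the names vg0/guest1-root and sda1 are placeholders. It builds a string in the phy:dom0,domU,mode form and rejects any mode other than 'r' or 'w'.

```python
def phy_disk(dom0_dev, domu_dev, mode="w"):
    """Build one 'phy:dom0,domU,mode' entry for the disk list."""
    if mode not in ("r", "w"):
        raise ValueError("mode must be 'r' or 'w', got %r" % (mode,))
    # Paths are relative to /dev, so drop a leading /dev/ if one was given.
    prefix = "/dev/"
    if dom0_dev.startswith(prefix):
        dom0_dev = dom0_dev[len(prefix):]
    return "phy:%s,%s,%s" % (dom0_dev, domu_dev, mode)


# Hypothetical volume and guest device names, for illustration only.
disk = [phy_disk("/dev/vg0/guest1-root", "sda1", "w")]
```

Validating the mode up front turns a mistyped entry into an immediate error rather than a guest that fails to start.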

The path and filename of a block device in Domain0 will depend on the modules and userspace processes exposing the hardware and the (optional) presence of a /dev manager such as udev.


Guest domains can be given swap space by exporting any block device as a VBD, the same way one exports a device containing a filesystem. As with filesystem devices, it is helpful to manage swap devices with a volume manager to simplify device management. For example, if a guest domain's root, home and etc filesystems and its swap space were all in the same container, that container and all its contents could be transparently migrated to a new device, local or remote, without impacting the operation of the guest.
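
A guest configuration along these lines might export one volume for the root filesystem and another for swap. The volume group name vg0, the volume names, and the guest device names sda1 and sda2 are assumptions made for illustration; the syntax follows the 'disk' parameter described earlier.

```python
# Fragment of a guest domain configuration (hypothetical names throughout).
disk = [
    'phy:vg0/guest1-root,sda1,w',  # root filesystem
    'phy:vg0/guest1-swap,sda2,w',  # swap space, exported like any other VBD
]
```

Inside the guest, the second device would then be prepared with mkswap and enabled with swapon, as with any other swap device.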