
Xen paravirt_ops for upstream Linux kernel

What is paravirt_ops?

paravirt_ops (pv-ops for short) is a piece of Linux kernel infrastructure that allows the kernel to run paravirtualized on a hypervisor. It currently supports VMware's VMI, Rusty Russell's lguest, and, most interestingly, Xen.

The infrastructure allows you to compile a single kernel binary which will either boot native on bare hardware (or in hvm mode under Xen), or boot fully paravirtualized in any of the environments you've enabled in the kernel configuration.

It uses various techniques, such as binary patching, to make sure that the performance impact when running on bare hardware is effectively unmeasurable when compared to a non-paravirt_ops kernel.

At present paravirt_ops is available for x86_32, x86_64 and ia64 architectures.

Xen pv_ops (domU) support has been in mainline Linux since 2.6.23, and is the basis of all ongoing Linux/Xen development (the old Xenlinux patches officially ended with 2.6.18.x-xen, though various distros have their own forward-ports of them). Red Hat has decided to base all their future Xen-capable products on the in-kernel Xen support, starting with Fedora 9.

Current state

Xen/paravirt_ops has been in mainline Linux since 2.6.23, though 2.6.24 is probably the first usable version. Recent Linux kernels (2.6.27 and newer) work well for domU use. The Fedora 9, Fedora 10, Fedora 11 and Fedora 12 distributions include a pv_ops-based Xen domU kernel.

  • Features in 2.6.26:
    • x86-32 support
    • SMP
    • Console (hvc0)
    • Blockfront (xvdX)
    • Netfront
    • Balloon (reversible contraction only)
    • paravirtual framebuffer + mouse (pvfb)
    • 2.6.26 onwards pv domU is PAE-only (on x86-32)
  • Features added in 2.6.27:
    • x86-64 support
    • Save/restore/migration
    • Further pvfb enhancements
  • Features added in 2.6.28:
    • ia64 (itanium) pv_ops xen domU support
    • Various bug fixes and cleanups
    • Expand Xen blkfront for > 16 xvd devices
    • Implement CPU hotplugging
    • Add debugfs support
  • Features added in 2.6.29:
    • bugfixes
    • performance improvements
    • swiotlb (required for dom0 support)
  • Features added in 2.6.30:
    • bugfixes
  • Features added in 2.6.31:
    • bugfixes
  • Features added in 2.6.32:
    • bugfixes
  • Work in progress:
    • dom0 support, currently planned for Linux 2.6.34 (the latest pv_ops dom0 patches can be found in Jeremy's git tree, see instructions below)
    • pv-on-hvm driver support
    • Balloon expansion (using memory hotplug) to grow bigger than initial domU memory size
  • To be done:
    • Device hotplug
    • Other device drivers
    • kdump/kexec
    • blktap2 support (dom0)
    • pvscsi backend (dom0)
    • pvusb backend (dom0)
    • ...?

Using Xen/paravirt_ops

Building with domU support

  1. Get a current kernel. The latest kernel.org kernel is generally a good choice.
  2. Configure as normal; you can start with your current .config file
  3. If building a 32-bit kernel, make sure you have CONFIG_X86_PAE enabled (which is set by selecting CONFIG_HIGHMEM64G)
    • Non-PAE mode doesn't work in 2.6.25, and has been dropped altogether from 2.6.26 and newer kernel versions.
  4. Enable these core options:
    1. CONFIG_PARAVIRT_GUEST
    2. CONFIG_XEN
  5. And Xen pv device support
    1. CONFIG_HVC_DRIVER and CONFIG_HVC_XEN
    2. CONFIG_XEN_BLKDEV_FRONTEND
    3. CONFIG_XEN_NETDEV_FRONTEND
  6. And build as usual
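
Before building, it can be handy to confirm that the options listed above actually ended up in the .config. The following is a minimal sketch, not part of the kernel tree: the function name check_domu_config is made up for illustration, and it only looks at the six options named in steps 4 and 5 (for 32-bit builds you would also want to check CONFIG_X86_PAE yourself).

```shell
# Sketch: report which domU-related options are missing from a kernel .config.
# check_domu_config is a hypothetical helper name used only for this example.
check_domu_config() {
    config_file="$1"
    missing=""
    for opt in CONFIG_PARAVIRT_GUEST CONFIG_XEN CONFIG_HVC_DRIVER CONFIG_HVC_XEN \
               CONFIG_XEN_BLKDEV_FRONTEND CONFIG_XEN_NETDEV_FRONTEND; do
        # An enabled option appears as OPTION=y (built in) or OPTION=m (module).
        if ! grep -Eq "^${opt}=[ym]\$" "$config_file"; then
            missing="$missing $opt"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}

# Usage (from the top of the kernel source tree):
#   check_domu_config .config
```

The function prints "ok" when all six options are set to y or m, and otherwise lists the ones it could not find and returns a non-zero status.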

Running

The kernel build process will build two kernel images: arch/x86/boot/bzImage and vmlinux. They are two forms of the same kernel, and are functionally identical. However, only relatively recent versions of the Xen tools stack (post-Xen 3.2) support loading bzImage files, so with older tools you must use the vmlinux form of the kernel (gzipped, if you prefer). If you've built a modular kernel, then all the modules will be the same either way.

Some aspects of the kernel configuration have changed:

  • The console is now /dev/hvc0, so put "console=hvc0" on the kernel command line
  • Disk devices are always /dev/xvdX. If you want to dual-boot a system on both Xen and native, then it's best to use LVM, LABEL or UUID to refer to your filesystems in your /etc/fstab.
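
The two points above can be checked with a couple of small shell helpers. This is only a sketch under stated assumptions: the function names fstab_native_devices and has_hvc0_console are invented for this example, and the first one simply assumes that native disks show up as /dev/sdX or /dev/hdX in the first field of /etc/fstab.

```shell
# Sketch: list fstab entries that refer to native disk names (/dev/sdX, /dev/hdX).
# Under Xen these disks appear as /dev/xvdX, so such entries are candidates for
# conversion to LVM, LABEL= or UUID= references.
# fstab_native_devices is a hypothetical helper name.
fstab_native_devices() {
    awk '$1 ~ /^\/dev\/(sd|hd)[a-z]/ { print $1 }' "$1"
}

# Sketch: succeed if a kernel command line (passed as one string) contains
# console=hvc0. has_hvc0_console is a hypothetical helper name.
has_hvc0_console() {
    for arg in $1; do
        [ "$arg" = "console=hvc0" ] && return 0
    done
    return 1
}

# Usage:
#   fstab_native_devices /etc/fstab
#   has_hvc0_console "$(cat /proc/cmdline)" || echo "console=hvc0 is not set"
```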

Testing

Xen/paravirt_ops has not had wide use or testing, so any testing you do is extremely valuable. If you have an existing Xen configuration, try updating the kernel to a current pv-ops kernel and using it as you usually would; any feedback on how well that works (success or failure) would be very interesting. In particular, information about:

  • performance: better/worse/same?
  • bugs: outright crash, or something just not right?
  • missing features: what can't you live without?

Debugging

If you do encounter problems, getting as much information as possible is very helpful. If the domain crashes very early, before any output appears on the console, then booting with "earlyprintk=xen" should provide some useful information. Note that "earlyprintk=xen" only works for domU if the Xen hypervisor is built in debug mode! If you are running a debug build of the Xen hypervisor (set "debug = y" in Config.mk in the Xen source tree), you should get crash dumps on the Xen console. You can view those with "xm dmesg". Also, CTRL+O can be used to send SysRq (not really specific to pv_ops, but handy for kernel debugging).

Contributing

Xen/paravirt_ops is very much a work in progress, and there are still feature gaps compared to 2.6.18-xen. Many of these gaps are not a huge amount of work to fill in.

Devices

The Xen device model is more or less unchanged in the pv-ops kernel. Converting a driver from the xen-unstable or 2.6.18-xen tree should mostly be a matter of getting it to compile. There have been changes in the Linux device model between 2.6.18 and 2.6.26, so converting a driver will mostly be a matter of forward-porting to the new kernel, rather than any Xen specific issues.

CPU hotplug

All the mechanism should already be in place to support CPU hotplug; it should just be a matter of making it work.

Device hotplug

In principle this is already implemented and should work. I'm not sure, however, that it's all plumbed through properly, so that hot-adding a device generates the appropriate udev events to cause devices to appear.

Device unplug/module unload

The 2.6.18-xen patches don't really support device unplug (and driver module unload), mainly because of the difficulties in dealing with granted pages. This should be fixed in the pvops kernel. The main thing to implement is to make sure that on driver termination, rather than freeing granted pages back into the kernel heap, they should be added to a list; that list is polled by a kernel thread which periodically tries to ungrant the pages and return them to the kernel heap if successful.

Getting the current development version

All x86 Xen/pv-ops changes queued for upstream Linux are in Ingo Molnar's tip.git tree. You can get general information about fetching and using this tree in his README. The x86/xen topic branch contains most of the Xen-specific work, though changes in other branches may be necessary too. The auto-latest branch is the merged product of all the other topic branches.

Bleeding edge work, including Xen dom0 support

The current day-to-day development is happening in a git repository. This repo has numerous topic branches to track individual lines of development, and a couple of roll-up branches which contain everything merged together for easy compilation and running.

(The old Mercurial/hg repository and patch queue is deprecated and will no longer be updated.)

The latest master branch can be found from:

Status updates:

To check out a working tree, use:

$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git linux-2.6-xen
$ cd linux-2.6-xen

That will check out the 'xen/master' branch automatically, since that is the default.

Later, when you want to update the tree, use:

$ cd linux-2.6-xen
$ git pull

Then:

$ make menuconfig

NOTE 0: Make sure you have the correct CPU type (Processor Family) set in the kernel configuration. The Xen Dom0 options won't show up at all if the selected CPU is too old (that is, a CPU that doesn't support PAE; Pentium Pro was the first CPU to have PAE).

NOTE 1: If you're building a 32-bit kernel, you first need to enable PAE support, since Xen only supports 32-bit PAE kernels nowadays. The Xen kernel build options won't show up at all until you've enabled PAE for 32-bit builds (Processor type and features -> High Memory Support (64GB) -> PAE (Physical Address Extension) Support). PAE is not needed for 64-bit kernels.

NOTE 2: If building a 32-bit PAE dom0 kernel, make sure you have CONFIG_HIGHPTE=n. There's a known race/bug that causes dom0 kernel crashes if you have CONFIG_HIGHPTE=y.

NOTE 3: Xen dom0 support depends on ACPI support. Make sure you enable ACPI support or you won't see the Dom0 options at all.

With those prerequisites in place, enable the Xen Dom0 option in menuconfig:

 Symbol: XEN_DOM0 [=y]
   Prompt: Enable Xen privileged domain support
     Defined at arch/x86/xen/Kconfig:41
     Depends on: PARAVIRT_GUEST && XEN && X86_IO_APIC && ACPI
     Location:
       -> Processor type and features
         -> Paravirtualized guest support (PARAVIRT_GUEST [=y])
         -> Xen guest support (XEN [=y])
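
If the Dom0 option does not show up, one way to narrow it down is to check the dependencies from the "Depends on" line, plus the CONFIG_HIGHPTE caveat from the notes, directly against the .config. The following is a rough sketch; check_dom0_prereqs is a name made up for this example and it checks nothing beyond those five symbols.

```shell
# Sketch: check the XEN_DOM0 dependencies listed in the Kconfig output above,
# and flag CONFIG_HIGHPTE=y, which the notes above say crashes 32-bit PAE dom0.
# check_dom0_prereqs is a hypothetical helper name.
check_dom0_prereqs() {
    config_file="$1"
    status=0
    for opt in CONFIG_PARAVIRT_GUEST CONFIG_XEN CONFIG_X86_IO_APIC CONFIG_ACPI; do
        if ! grep -q "^${opt}=y\$" "$config_file"; then
            echo "not enabled: $opt"
            status=1
        fi
    done
    if grep -q '^CONFIG_HIGHPTE=y$' "$config_file"; then
        echo "must be disabled: CONFIG_HIGHPTE"
        status=1
    fi
    return "$status"
}

# Usage (from the top of the kernel source tree):
#   check_dom0_prereqs .config && echo "dom0 prerequisites look fine"
```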

For reference, the Xen config options of a working Dom0 (feel free to edit and explain any options that you use below to help others):

  • CONFIG_XEN=y
  • CONFIG_XEN_MAX_DOMAIN_MEMORY=32
  • CONFIG_XEN_SAVE_RESTORE=y
  • CONFIG_XEN_DOM0=y
  • CONFIG_XEN_PRIVILEGED_GUEST=y
  • CONFIG_XEN_PCI=y
  • CONFIG_PCI_XEN=y
  • CONFIG_XEN_BLKDEV_FRONTEND=m
  • CONFIG_NETXEN_NIC=m
  • CONFIG_XEN_NETDEV_FRONTEND=m
  • CONFIG_XEN_KBDDEV_FRONTEND=m
  • CONFIG_HVC_XEN=y
  • CONFIG_XEN_FBDEV_FRONTEND=m
  • CONFIG_XEN_BALLOON=y
  • CONFIG_XEN_SCRUB_PAGES=y
  • CONFIG_XEN_DEV_EVTCHN=y
  • CONFIG_XEN_BACKEND=y
  • CONFIG_XEN_BLKDEV_BACKEND=y
  • CONFIG_XEN_NETDEV_BACKEND=y
  • CONFIG_XENFS=y
  • CONFIG_XEN_COMPAT_XENFS=y
  • CONFIG_XEN_XENBUS_FRONTEND=m

The XENFS and XEN_COMPAT_XENFS config options are needed for /proc/xen support. If CONFIG_XEN_DEV_EVTCHN is compiled as a module, make sure to load the xen-evtchn.ko module or xend will not start.

You might also need to add the following line to /etc/fstab. Xen 3.4.2 and newer automatically mount /proc/xen when /etc/init.d/xend is started, so the xenfs mount entry is not needed on those systems:

none /proc/xen xenfs defaults 0 0

Working example grub.conf with VGA text console:

title        Xen 3.4, kernel 2.6.31
root         (hd0,0)
kernel       /boot/xen-3.4.gz dom0_mem=512M
module       /boot/vmlinuz-2.6.31 root=/dev/sda1 ro nomodeset
module       /boot/initrd.img-2.6.31

Working example grub.conf with serial console output (good for debugging since you can easily log the full kernel boot messages even if it crashes):

title        pv_ops dom0-test (2.6.31) with serial console
root         (hd0,0)
kernel       /xen-3.4.gz dom0_mem=1024M loglvl=all guest_loglvl=all sync_console console_to_ring com1=19200,8n1 console=com1
module       /vmlinuz-2.6.31 ro root=/dev/vg00/lv01 console=hvc0 earlyprintk=xen
module       /initrd-2.6.31.img

Xen requirements for using pv_ops dom0 kernel

Xen hypervisor and tools need to have support for pv_ops dom0 kernels. In general it means:

  • The ability for the Xen hypervisor to load and boot bzImage pv_ops dom0 kernel
  • The ability for the Xen tools to use the sysfs memory ballooning support provided by pv_ops dom0 kernel

These features are available in the official Xen 3.4 release (and later versions). Xen 3.5 development version (xen-unstable) has switched to using pv_ops dom0 kernel as a default. Some distributions have backported these patches/features to older Xen versions. See below for more information.

Linux distribution support for pv_ops dom0 kernels

The Fedora 11 Xen hypervisor package contains pv_ops dom0 kernel support, i.e. it is able to boot bzImage format dom0 kernels, and pv_ops sysfs memory ballooning support is included as well. These features/patches are backported from Xen 3.4 to Xen 3.3.1 in Fedora 11. Even though the hypervisor supports pv_ops dom0 kernels, Fedora 11 will NOT ship with a dom0 capable kernel included, because such a kernel is not available upstream at the time of release and feature freeze.

Fedora 12 includes Xen 3.4.1 hypervisor, which supports pv_ops dom0 kernels.

Xen 3.3.0 included in Fedora 10 does not support pv_ops dom0 kernels.

Other distributions: to run pv_ops dom0 kernels you need to have at least Xen 3.4 version, because bzImage format kernel support and pv_ops sysfs memory ballooning support were added during Xen 3.4 development. Xen 3.3.x does NOT contain these patches (unless backported, like in Fedora 11).

Which kernel image to boot as dom0 kernel from your custom built kernel source tree?

If you have a Xen hypervisor with bzImage dom0 kernel support, i.e. Xen 3.4 or later, or a Xen hypervisor with the bzImage patch backported (Fedora 11 Xen 3.3.1), use "linux-2.6-xen/arch/x86/boot/bzImage" as your dom0 kernel (exactly the same kernel image you use for baremetal Linux).

If you have a Xen hypervisor without bzImage dom0 kernel support, i.e. any official Xen release up to at least Xen 3.3.1, or most of the Xen versions shipped with Linux distributions (before 2009-03), use "linux-2.6-xen/vmlinux" as your dom0 kernel. (Note that "vmlinux" is huge, so you can also gzip it if you want to make it a bit smaller.)

Also read the previous paragraphs for other requirements.
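
The rule of thumb above can be written down as a small shell function. This is only a sketch: dom0_image_for_xen is a hypothetical name, it looks at the official Xen version number only, and it therefore gives the conservative answer for distribution builds with the bzImage patch backported (such as Fedora 11's Xen 3.3.1).

```shell
# Sketch: pick the dom0 kernel image to boot, given an official Xen version
# such as "3.3.1" or "3.4". Xen 3.4 and later can load bzImage; older official
# releases need vmlinux. Backported distro builds are NOT detected here.
# dom0_image_for_xen is a hypothetical helper name.
dom0_image_for_xen() {
    major="${1%%.*}"      # text before the first dot
    rest="${1#*.}"        # text after the first dot
    minor="${rest%%.*}"   # second version component
    if [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 4 ]; }; then
        echo "arch/x86/boot/bzImage"
    else
        echo "vmlinux"
    fi
}

# Usage:
#   dom0_image_for_xen 3.4.1
#   dom0_image_for_xen 3.3.0
```

The printed path is relative to the kernel source tree (linux-2.6-xen in the examples on this page).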

Are there other Xen dom0 kernels available?

Yes. See this wiki page for more information: http://wiki.xensource.com/xenwiki/XenDom0Kernels

Xend does not start when using pv_ops dom0 kernel?

In December 2009 the pv_ops dom0 kernel modules were renamed to have a "xen-" prefix, i.e. "evtchn.ko" became "xen-evtchn.ko".

This makes Xen 3.4.x xend fail to start, because it tries to load "evtchn.ko", which no longer exists. You need to load "xen-evtchn.ko" and then start xend. The Fedora 12 xen-3.4.2-2 rpms have this problem fixed.

Also make sure you have xenfs mounted on "/proc/xen"; that's needed as well.
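
Put together, the workaround looks roughly like the sketch below. The helper names run and prepare_xend and the DRY_RUN variable are made up for this example; the module name, the xenfs mount and the /etc/init.d/xend path are the ones mentioned on this page. Run it as root, or set DRY_RUN=1 to only print the commands.

```shell
# Sketch: load the renamed event channel module, mount xenfs, then start xend.
# run, prepare_xend and DRY_RUN are hypothetical names used for this example.
run() {
    # With DRY_RUN=1 the command is printed instead of executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

prepare_xend() {
    # xend from Xen 3.4.x looks for evtchn.ko, so load xen-evtchn by hand.
    run modprobe xen-evtchn
    # Mount xenfs on /proc/xen unless it is already mounted there.
    if ! grep -q ' /proc/xen xenfs ' /proc/mounts 2>/dev/null; then
        run mount -t xenfs none /proc/xen
    fi
    run /etc/init.d/xend start
}

# Usage:
#   DRY_RUN=1 prepare_xend    # show what would be done
#   prepare_xend              # actually do it (as root)
```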

Troubleshooting, what to do if the custom built pv_ops dom0 kernel doesn't work/boot?

You could try these example .config files:

64bit x86_64: http://pasik.reaktio.net/xen/pv_ops-dom0-debug/config-2.6.31.4-pvops_dom0-x86_64

32bit PAE: http://pasik.reaktio.net/xen/pv_ops-dom0-debug/config-2.6.31.5-pvops_dom0-x86_32

Those kernel configs are based on Fedora 11/12 kernel configuration, with some modifications. They've been tested to work on multiple systems.

Example of how to compile/build the pv_ops dom0 kernel:

cd linux-2.6-xen
make clean
cp -a .config .config-old
wget -O .config <http_url_to_config_file>
make oldconfig
make menuconfig    # only if you need to change something
make bzImage
make modules
make modules_install
# in the following lines replace "version" with the actual kernel version you're compiling.
cp -a .config /boot/config-version
cp -a System.map /boot/System.map-version
cp -a arch/x86/boot/bzImage /boot/vmlinuz-version
# And then generate initrd/initramfs image for your dom0 kernel, example for Fedora/RHEL/CentOS:
mkinitrd -f /boot/initrd-version.img version

Then edit /boot/grub/grub.conf and make sure you have a correct grub entry to boot the Xen hypervisor with the dom0 kernel (examples above).

In grub.conf it's a good idea to enable all the logging options for Xen ("loglvl=all guest_loglvl=all sync_console console_to_ring") and for the pv_ops dom0 kernel ("earlyprintk=xen"), and to set up a serial console so that you can see and capture the full boot messages from Xen and from the dom0 kernel in case the system doesn't start up properly or crashes.

So for debugging and testing you should be using a computer with a built-in serial port on the motherboard (com1), or add a PCI serial card if your motherboard lacks a built-in serial port. You can also use SOL (Serial Over LAN) for logging the Xen hypervisor and dom0 kernel messages. Most server-class machines have SOL available through their management processor or IPMI. A SOL device looks like a normal serial port to the OS/Xen, but enables you to connect to the serial console over a network, through the management processor.

If you're having problems booting using an IPMI SOL serial console, try this patch: http://lists.xensource.com/archives/html/xen-devel/2010-01/msg00773.html

Is there more information available on how to debug and troubleshoot using a serial console?

Please see the [XenSerialConsole] wiki page.

Contact

Please mail questions/answers/patches/etc to the Xen-devel mailing list.

Related Reference

Kernel.org Linux on Xen

Suggestion: This page should be merged with: Kernel.org Linux on Xen. Alternatively, one of the pages could be used for the high-level overview, theory, and quick status, and the other could be used for the "howto"-style instructions.

XenParavirtOps (last edited 2010-01-22 22:57:46 by PasiKarkkainen)