XenParavirtOps

What is paravirt_ops?

paravirt_ops (pv-ops for short) is a piece of Linux kernel infrastructure that allows the kernel to run paravirtualized on a hypervisor. It currently supports VMware's VMI, Rusty's lguest, and most interestingly, Xen.

The infrastructure allows you to compile a single kernel binary which will either boot native on bare hardware (or in hvm mode under Xen), or boot fully paravirtualized in any of the environments you've enabled in the kernel configuration, and lately also as Xen dom0.

It uses various techniques, such as binary patching, to make sure that the performance impact when running on bare hardware is effectively unmeasurable when compared to a non-paravirt_ops kernel.

At present paravirt_ops is available for x86_32, x86_64 and ia64 architectures.

Dom0 is the first guest that Xen boots. It is usually (almost always) the one that has the driver support. The other guests that are booted (HVM or PV) are called DomU.

Current state in Linux kernel and in distributions

[Updated Oct 26th 2011]

Linux 3.0 (and later) can run as a guest (domU) and as a host (dom0). All necessary backends (and frontends) are in the upstream kernel. Look at the features in the Roadmap section to get an idea of what is happening in upcoming kernel releases.

You can also visit the history section for the story of pvops.

Dom0 support was added in Linux v2.6.37, and the necessary backend drivers arrived with 3.0. Fedora 16 and Ubuntu 11.10 ship with the 3.0 kernel (or later), so the user just needs to install the hypervisor:

$ yum install xen

$ apt-get install xen

TODO: Verify the Ubuntu

Then reboot the machine and select Xen in the GRUB(2) menu. Also consult Category:FAQ and XenCommonProblems.

Are there other distributions that have Xen dom0 kernels available?

Yes. See these wiki pages for more information: XenDom0Kernels and XenKernelFeatures.

Help! I am having trouble

First off, did you look at the Category:FAQ and XenCommonProblems wiki pages?

Did not find anything relevant? Was it any of these:

I have graphics card (DRM/TTM/KMS/Xorg) related problems with the pv_ops dom0 kernel

Does the screen look like a checkerboard? Do you see errors from the Radeon or Nouveau driver? If so, please see the XenPVOPSDRM wiki page.

Dom0 console gets all weird and corrupted at the end of the boot process

Is the last line on the console something like "Setting console screen modes and fonts"? Then you might want to disable the "console-screen.sh" service from starting automatically, which should work around the problem.

Dom0 console doesn't show any output and remains blank

Unfortunately, a patch that enables the Xen dom0 kernel to use the VGA text console did not make it into the initial release of Linux 3.0.0. The patch was merged into Linux 3.1.0 and also into the 3.0.2 stable update.

Maybe it is something obvious? How do I make the bootup more verbose?

If you do encounter problems, getting as much information as possible is very helpful. If the domain crashes very early, before any output appears on the console, then booting with earlyprintk=xen should provide some useful information. Note that earlyprintk=xen only works for domU/dom0 if the Xen hypervisor is built in debug mode! If you are running a debug build of the Xen hypervisor (set "debug = y" in Config.mk in the Xen source tree), you should also get crash dumps on the Xen console, which you can view with "xm dmesg". Also, CTRL+O can be used to send SysRq (not really specific to pv_ops, but handy for kernel debugging).

OK, how do I add earlyprintk=xen and all that?

Edit /boot/grub/grub.conf (or /boot/grub2/grub.cfg) and make sure you have a correct grub entry to boot the Xen hypervisor with the dom0 kernel.

In grub.conf it's a good idea to enable all the logging options for Xen (loglvl=all guest_loglvl=all sync_console console_to_ring) and for the Linux pv_ops dom0 kernel (earlyprintk=xen debug loglevel=8), and to set up a serial console so that you can see and capture the full boot messages from Xen and from the dom0 kernel in case the system doesn't start up properly or crashes.

For the full list of Xen boot options, look in XenHypervisorBootOptions. Here is an example:


title        pv_ops dom0 debug (2.6.32.27) with serial console
root         (hd0,0)
kernel       /xen-4.0.gz dom0_mem=1024M loglvl=all guest_loglvl=all sync_console console_to_ring com1=115200,8n1 console=com1 lapic=debug apic_verbosity=debug apic=debug iommu=off
module       /vmlinuz-2.6.32.27 ro root=/dev/vg00/lv01 console=hvc0 earlyprintk=xen nomodeset initcall_debug debug loglevel=10
module       /initrd-2.6.32.27.img
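
When several of these options have to be added by hand, it is easy to miss one. As a minimal sketch (the helper name missing_opts is hypothetical and not part of Xen or GRUB), the following POSIX shell function prints every option from a wanted list that does not appear in a given command line string:

```shell
#!/bin/sh
# Hypothetical helper: print each wanted option that is absent from the
# command line passed as the first argument.
# Usage: missing_opts "<command line>" opt1 opt2 ...
missing_opts() {
    # Pad with spaces so that every option can be matched as a whole word.
    cmdline=" $1 "
    shift
    for opt in "$@"; do
        case "$cmdline" in
            *" $opt "*) ;;                 # option present, nothing to report
            *) printf '%s\n' "$opt" ;;     # option missing
        esac
    done
}

# Example (left commented out): check the running Linux kernel, whose
# command line is exposed in /proc/cmdline.
# missing_opts "$(cat /proc/cmdline)" earlyprintk=xen debug loglevel=8
```

This only covers the Linux side. The Xen options (loglvl=all and so on) live on the hypervisor line of the grub entry, so the same function would have to be fed that line instead.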


I have a funky serial console (IPMI/AMT, etc). Is there more information available on how to debug and troubleshoot using a serial console?

For debugging and testing you should be using a computer with a built-in serial port on the motherboard (com1), or add a PCI serial card if your motherboard lacks a built-in serial port. You can also use SOL (Serial Over LAN) for logging the Xen hypervisor and dom0 kernel messages. Most server-class machines have SOL available through their management processor or IPMI. An SOL device looks like a normal serial port to the OS/Xen, but lets you connect to the serial console over a network, through the management processor.

For more details (and examples) please see the XenSerialConsole wiki page.

Nothing is helping! Help!!

Before submitting bug reports to the xen-devel mailing list, please read:

further bugs] that we are aware of and working on.

information you NEED to provide so the problem can be diagnosed and debugged!

Please mail questions/answers/patches/etc to the Xen-devel mailing list (http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel).

Roadmap

Status updates:

  • Jeremy's view of the status of pv_ops dom0 kernel (June 2009):

http://lists.xensource.com/archives/html/xen-devel/2009-06/msg01193.html

  • Jeremy's roadmap update (August 2009):

http://lists.xensource.com/archives/html/xen-devel/2009-08/msg00510.html

  • Jeremy's status update (September 2009):

http://lists.xensource.com/archives/html/xen-devel/2009-09/msg00806.html

  • Presentation by Jeremy about pv_ops dom0 kernel at Xen Summit Asia 2009 (November 2009):

http://www.xen.org/files/xensummit_intel09/xensummit-asia-2009-talk.pdf

  • Short update (03 December 2009):

http://lists.xensource.com/archives/html/xen-devel/2009-12/msg00190.html

  • Status update (22 December 2009):

http://lists.xensource.com/archives/html/xen-devel/2009-12/msg01127.html

  • Status update (03 March 2010):

http://lists.xensource.com/archives/html/xen-devel/2010-03/msg00162.html

  • Status update from Xen Summit 2010 NA (April 2010):

http://www.slideshare.net/xen_com_mgr/xen-summit-amdpvopsupdate4

  • Status update from Xen Summit 2011 NA (August 2011):

http://xen.org/files/xensummit_santaclara11/aug2/5_KonradW_Update_on_Linux_PVOPS.pdf

Removing the Xen Linux kernel upstream delta

Linux distributions use Xen with the Linux kernel, but the kernel is typically patched with support for features and enhancements which are not yet upstream. This page documents the delta against the upstream Linux kernel and provides a detailed analysis and status update for each of the patches carried by each Linux distribution. The purpose of this page is for Linux distributions to describe each patch being carried forward in their delta, analyze whether it can be upstreamed and if so how, see who can or will work on it, and identify patches which can or should be dropped. The delta is very similar across distributions, but SUSE likely carries the largest one. Our end objective should be to distill which part of the delta is common and identify who is working on what. Quite a few patches are likely no longer needed, either because a proper replacement is already upstream or because they are no longer applicable; identifying these also needs to be done.

Distributions delta

  • openSUSE 13.1 kernel sources can be found in the [[1]] repository on the origin/openSUSE-13.1 branch. The Xen-specific changes are under the kernel-source/patches.xen/ directory.

  • Debian testing: TBD

Upstream delta details

Columns: Feature, pvops status, current solution, remark

pci-guestdev
  • pvops status: alternative mechanism is documented and exists upstream already
  • Current solution: SUSE patch pci-guestdev
  • Remark: PCI passthrough

mem-hotplug
  • pvops status: needs verification
  • Current solution: SUSE patch xen-mem-hotplug
  • Remark: Daniel Kiper will work on this in 2015

hypercall preemption
  • pvops status: being worked on
  • Current solution: SUSE patch xen-privcmd-hcall-preemption
  • Remark: Luis Rodriguez will work on this

multi page ring
  • pvops status: proper upstream solution obsoletes this: indirect descriptor
  • Current solution: SUSE patches xen-*multi-page-ring
  • Remark: can be nuked

IPV6 autoconf
  • pvops status: Luis R. Rodriguez working on it
  • Current solution: SUSE patch ipv6-no-autoconf
  • Remark: the work consists of a proper way to address adding the root bridge block feature, and also fixing bridging code to address this early without userspace; patches are undergoing review upstream

pvSCSI
  • pvops status: patches posted by Jürgen Groß, queued for upstream 3.18
  • Current solution: SUSE patch xen3-auto-xen-drivers.diff
  • Remark: -

500GB+ RAM support
  • pvops status: patches posted by Jürgen Groß, parts queued for upstream 3.18
  • Current solution: SUSE patches xen-x86_64-note-init-p2m, xen-x86_64-unmapped-initrd, xen-x86-bigmem
  • Remark: Jürgen Groß working on this

pvUSB
  • pvops status: Jürgen Groß working on it
  • Current solution: SUSE patch xen3-auto-xen-drivers.diff
  • Remark: Greg KH welcomes this

user mode pvclock
  • pvops status: Konrad Wilk will do it
  • Current solution: SUSE patch xen-x86_64-vread-pvclock
  • Remark: still being worked on

expose ballooning limits
  • pvops status: needs effort
  • Current solution: SUSE patch xen-balloon-max-target
  • Remark: sysfs parameter needs to be exposed, no takers yet

retaining tasklet in netback
  • pvops status: needs effort
  • Current solution: SUSE patch xen-netback-kernel-threads
  • Remark: David Vrabel will inspect how network latency is impacted without this; some folks have done some testing on this

add BLKIF_OP_PACKET
  • pvops status: needs effort
  • Current solution: SUSE patch xen-blkif-op-packet
  • Remark: Luis Rodriguez will work on this

CD-ROM command forwarding
  • pvops status: needs effort
  • Current solution: SUSE patch xen-blkfront-cdrom
  • Remark: Luis Rodriguez will work on this

CD-ROM removable media attribute
  • pvops status: needs effort
  • Current solution: SUSE patch xen-blkback-cdrom
  • Remark: Luis Rodriguez will work on this

CD-ROM avoid takeover in HVM
  • pvops status: needs effort
  • Current solution: SUSE patch xen-blkfront-hvm-no-cdrom
  • Remark: Luis Rodriguez will work on this

DCDBAS address translation
  • pvops status: needs effort
  • Current solution: SUSE patch xen-dcdbas
  • Remark: need verification from Dell - Doug Warzecha douglas_warzecha@dell.com

netback multicall for notifications
  • pvops status: needs effort
  • Current solution: SUSE patch xen-netback-notify-multi
  • Remark: multiple notifications should not be sent one by one, but via a multicall

interrupt trigger mode and polarity
  • pvops status: needs effort
  • Current solution: SUSE patch xen-setup-gsi
  • Remark: not applicable to netback anymore, can be removed

PAT support
  • pvops status: patches posted by Jürgen Groß
  • Current solution: -
  • Remark: especially for WC memory type. Konrad Wilk had a try, patches have not been accepted. Some rework needed. Jürgen Groß working on it.

fix balloon target driver to have a logarithmic scale of hitting the 'target'
  • pvops status: needs effort
  • Current solution: -
  • Remark: Dan and Jan talked about it and Jan posted a patch for this some time ago. http://lists.xensource.com/archives/html/xen-devel/2008-04/msg00143.html and http://lists.xensource.com/archives/html/xen-devel/2010-06/msg01076.html

netfront support skb coalescing
  • pvops status: needs effort, no takers yet
  • Current solution: -
  • Remark: borrow the ideas from Netchannel2 to coalesce the SKBs so that the backend does not have to jump all over the memory and instead can easily fetch the pages in a physically contiguous manner.

microcode update
  • pvops status: needs effort
  • Current solution: -
  • Remark: Jeremy wrote one http://oss.oracle.com/git/kwilk/xen.git/p=kwilk/xen.git;a=commit;h=0a49ceea0d032864a72a8744c82c3786a01f34f4 but sadly the upstream maintainer (Borislav, https://lkml.org/lkml/2011/1/30/108) was not too keen. He was thinking of making grub do the microcode update, but that seems to have run afoul of what Ingo [x86 maintainer] thinks (http://mid.gmane.org/20110525193601.GE17864@elte.hu) in a different thread. Currently only boot-time loading, done directly by the hypervisor from its own microcode blob, is possible. Runtime loading is not supported, and it is unclear whether to do it in the kernel or by making the microcode loader Xen-aware. Konrad will send patch to Luis Rodriguez.

MTRR support
  • pvops status: needs effort
  • Current solution: -
  • Remark: usually this is used for graphics cards, but the 10G Myantic drivers do enable it on their PCIe MMIO space to speed up performance (sets it to Write Combine). We need to implement (or perhaps backport from 2.6.18) the Xen MTRR code. Drivers should be able to use PAT instead of MTRR. PAT should be a proper replacement and otherwise hardware would take care of this; we should ask x86 maintainers to rip out MTRR support from the kernel. Drivers that currently require MTRR support should be modified to use PAT.

remove VM_IO vestiges
  • pvops status: needs effort, Jürgen Groß will look at it
  • Current solution: -
  • Remark: Linux 3.0 has dropped most of the _PAGE_IOMAP code. There is one usage left in xen_make_pte and it should be redone to do proper M2P and P2M without consulting the _PAGE_IOMAP flag. Potentially remove the _PAGE_IOMAP usage in the 1-1 phys code. _PAGE_IOMAP is used by the mmap batches code to map foreign patches. Can be done with M2P override perhaps? The PAGE_IOMAP was used to say MFN==PFN. The outstanding issue is with xenfs doing hypercalls at the behest of the tools - specifically mapping memory and providing the MFN value.

Device hotplug (MODULE_ALIAS)
  • pvops status: needs effort
  • Current solution: -
  • Remark: Konrad sent patch for hotplug upstream

Suspend event channel support for faster checkpointing in Remus FT
  • pvops status: needs effort
  • Current solution: -
  • Remark: Ian poked remus folks, remus folks should fix this otherwise

expected to be dead (under Xen) code cannot be easily verified to indeed be dead
  • pvops status: needs effort
  • Current solution: -
  • Remark: e.g. IOMMU, PCI ATS, PRI, and PASID, leaving the risk of bad interaction between hypervisor and Dom0 if a new, active user of that code appears and goes unnoticed. Kconfig option would be developed and patch carried over which won't go upstream. i915 DRM driver should be fixed to not use the Intel AGP code. Intel should be working on this. Konrad will poke them.

old hypervisor support
  • pvops status: needs effort
  • Current solution: -
  • Remark: running the pvops kernel on top of a pre-4.0.1 hypervisor (especially as Dom0) might not work. This should work. We should get folks to test this and, if issues are found, file a bug report.

blktap driver (what about blktap3?)
  • pvops status: needs effort
  • Current solution: -
  • Remark: performance and support of all possible blktap configurations via qemu/qdisk is to be verified.

oprofile support
  • pvops status: Boris Ostrovsky is working on it
  • Current solution: -
  • Remark: -

Changelog

Kernel Features added
2.6.26
  • x86-32 support
  • SMP
  • Console (hvc0)
  • Blockfront (xvdX)
  • Netfront
  • Balloon (reversible contraction only)
  • paravirtual framebuffer + mouse (pvfb)
  • 2.6.26 onwards pv domU is PAE-only (on x86-32)
2.6.27
  • x86-64 support
  • Save/restore/migration
  • Further pvfb enhancements
2.6.28
  • ia64 (itanium) pv_ops xen domU support
  • Various bug fixes and cleanups
  • Expand Xen blkfront for > 16 xvd devices
  • Implement CPU hotplugging
  • Add debugfs support
2.6.29
  • bugfixes
  • performance improvements
  • swiotlb (required for dom0 support)
2.6.30
  • bugfixes.
2.6.31
  • bugfixes.
2.6.32
  • bugfixes.
2.6.33
  • bugfixes.
  • save/restore/migration bugfixes. These bugfixes can also be found from the 2.6.32.10 update.
2.6.34
  • bugfixes.
2.6.35
  • bugfixes.
2.6.36
  • Xen-SWIOTLB (required for Xen PCI frontend driver and Xen dom0 support).
  • Xen PV-on-HVM optimized paravirtualized drivers for fully virtualized (HVM) guest VMs.
  • Xen VBD (Virtual Block Device) online dynamic resize support for resizing guest disks (xvd*) on-the-fly.
2.6.37
  • Core Xen dom0 support (no backend drivers yet).
  • Xen PCI frontend driver required for Xen PCI Passthru to PV guest/domU.
  • Enhanced PV-on-HVM drivers: pirq remappings. Deliver IRQs as Xen event channels for better performance. Requires Xen 4.1 (or newer) hypervisor.
2.6.38
  • Generic Xen dom0 backend bits, required by all xen backend drivers.
  • xen-gntdev driver (grant device).
  • Bugfixes.
2.6.39
  • xen-netback backend driver to be used in dom0 to serve virtual networks to VMs. Currently unoptimized, optimizations will be added in later kernel versions.
  • Many dom0 related bugfixes and improvements.
  • PV-on-HVM driver fixes and improvements (xen balloon driver and PV spinlocks support for HVM guests).
  • xen-gntalloc driver for userspace grant allocation between Xen domains.
  • xen-gntdev support for HVM guests.
  • Xen watchdog driver.
  • 1-1 identity mapping in P2M. Allows us to automatically figure out if a page is for an I/O hole or not based on the E820. Fixes some device drivers.
  • IRQ code rework to support dynamic IRQs so that we're not limited to running 155 VMs.
  • Balloon driver has been prepared for memory hotplug and gntalloc.
  • save/restore bugfixes.
  • Dom0 startup crash fix when certain CONFIG_ options were set.
  • Bug fixes in xen-kbdfront, xen-netfront, xen-pcifront, and xen-blkfront.
  • Handling of guest events is now round-robin, fixes starvation issue of later guests not having their services served.
  • Many cleanups and bugfixes.
3.0
  • xen-blkback backend driver to be used in dom0 to serve virtual block devices (disks) to VMs.
  • Bugfixes and cleanups.
  • In 3.0.2: Enable use of vga text console in Xen dom0.
3.1
  • xen-pciback backend driver to be used in dom0 to support PCI passthru to VMs.
  • support for VGA text console in dom0.
  • memory hotplug support for xen balloon driver (allows adding more memory to the VM online / on-the-fly).
  • self-balloon driver to decrease memory in the guest and make the swap pages be shuffled by tmem to be compressed/shared/etc.
  • tmem driver to shuffle file-system and swap pages between guests as appropriate.
  • Xen PCI glue code cleanup.
  • Xen MMU debugfs tracing API support.
  • blkback providing completion latency that follows the hardware's completion latency.
3.2
  • blkback emulating the 'feature-barrier' option.
  • blkback providing TRIM/DISCARD operations ('feature-discard')
  • syncing wall clock time from Dom0 to hypervisor
3.3
  • Support v2 of granttables
  • /dev/xen/privcmd is now usable instead of /proc/xen/privcmd
  • network backend can work in HVM
  • PCI graphic cards (ATI ES1000 for example) work now.
  • Multiple fixes in blkback, blkfront, xenbus, etc
  • PIRQ MFN bitmap is supported (meaning IRQ ack are done faster)
3.4
  • 'xenpm' works now if the xen-acpi-processor is loaded. The Xen ACPI processor is the driver that uploads ACPI power management data to the hypervisor.
  • blkback backend can work in HVM
  • pciback can do FLR, multiple fixes added
  • PV console works in HVM
3.5
  • Many performance enhancements (fewer MSR traps on AMD)
  • APIC IPI interface support, enabling the 'perf' tool in dom0
  • Additional HVM driver domain support
  • Memory reporting improvements
3.6
  • Fix a lot of bugs: systems with MP BIOS failing, systems with ACPI NUMA failing, FLR in xen-pciback leaving the devices unusable, 32-bit PCI sound cards in dom0 not working, crashes when using acpidump, crashes with CONFIG_MAX_DOMAIN_PAGES=512
  • Make the P2M interaction on MMIO ranges use less memory during booting (aka, Reuse existing P2M leafs).
  • Simplification and cleanups in the code base. Coverity fixes
  • Performance optimizations by caching TLS and GDT descriptors
  • Performance optimizations in PTE page manipulations
  • Xen MCE driver added (to see MCE events that Xen hypervisor gets)
  • Xen PCPU driver added (to online/offline physical CPUs via dom0)
3.7
  • Initial support for ARM working under Xen as both guest and initial domain.
  • Security fixes.
  • Fix RCU warning, add fallback code for old hypervisors, fix memory leaks in gntdev driver, fix some pvops calls failing, Fixes in xen-[kbd|fb|blk|net|hvc]-backend to deal with CLOSED transition
  • Allow xen/privcmd to use v2 of MMAPBATCH command (and fixes for it)
  • Support Xen backends working with paged-out grants (meaning they work with HVM guests that have their memory paged out)
  • Performance optimization in xen/privcmd for migrating guests.
  • Performance improvements when doing kdump for PVonHVM guests.
  • Xen DBGP driver added (USB EHCI debug driver)
  • FLR support in xen-pciback.
  • Support wildcards in xen-pciback.hide=(*) argument parsing.
  • Xen VGA EFI support, and keyboard shift status flag.
  • Late usage of Xen-SWIOTLB, allowing a PV PCI passthrough guest to boot without 'iommu=soft' as an argument, and late initialization of SWIOTLB.
  • Support more than 128GB in a PV guest.
  • Cleanups in the initial pagetable creation.
3.8
  • Persistent grants feature in the Xen block system, allowing greater performance.
  • More fixes in the Xen-pciback for wildcard parsing
  • Xen Processor Aggregator Device (PAD) added.
  • Optimizations for xen/privcmd for ARM and PVH via new hypercall (add_to_physmap_range)
  • Xen ARM can use the balloon driver
  • Fixes for vcpu onlining/offlining, grant table initialization, parsing of cpu onlining/offlining values, checks in xen-pciback, locking in gntdev, stack corruptions, xen_iret checks, xen-pciback DoSing dom0 with messages, and the mmap batch ioctl error path.
  • Further enhancements to allow PVHVM backend drivers (so moving dom0 functionality into guests)
3.9
  • Memory hotplug support
3.11
  • ACPI Sx handling
3.12
  • Correct usage of ticket locks
  • NMI injection and delivery
3.14
  • MSI-X handling in non-Dom0 domUs
3.15
  • Support of multi-vector MSI
3.17
  • EFI boot of Dom0

I want to build the Xen/paravirt_ops components myself

Novice

If you are using a 3.0 kernel or later, you already have it working and there is no need to compile a new kernel - unless you feel adventurous.

Expert

The top level view is that you need to:

  1. Get a current kernel. The latest kernel.org kernel is generally a good choice. Or you can use the development version.
  2. Configure as normal; you can start with your current .config file or use make defconfig.
  3. Then configure .config Xen options
  4. Build the kernel
  5. Update modules (optional)
  6. Build the Xen hypervisor
  7. Update services (optional)
  8. Update GRUB1 or GRUB2
  9. Reboot and run it
  10. Troubleshoot if necessary

Getting the mainline Linux kernel

To get a proper Xen mainline kernel, please check Mainline_Linux_Kernel_Configs.

Build Xen

Xen requirements for using pv_ops dom0 kernel

Xen hypervisor and tools need to have support for pv_ops dom0 kernels. In general it means:

  • The ability for the Xen hypervisor to load and boot bzImage pv_ops dom0 kernel.
  • The ability for the Xen tools to use the sysfs memory ballooning support provided by pv_ops dom0 kernel.
  • Current recommended 2.6.32.x version of pvops dom0 kernel requires new IOAPIC setup hypercall from Xen hypervisor. This means you need to have at least Xen version 4.0.1.

Using older Xen versions is known to be problematic, for example Xen 4.0.0 libraries have problems with recent 2.6.32.x kernels, making xend fail to start due to evtchn/gntdev device node creation issues. Using Xen 3.4.2 or older won't work at all, since old hypervisor versions lack the new required IOAPIC setup hypercall and boot will fail with IRQ related issues.

It's recommended to run the latest Xen 4.0.x version, at least Xen 4.0.1.

Which kernel image to boot as dom0 kernel from your custom built kernel source tree?

If you have a Xen hypervisor with bzImage dom0 kernel support, i.e. Xen 3.4 or later, use "linux-2.6-xen/arch/x86/boot/bzImage" as your dom0 kernel (exactly the same kernel image you use for bare-metal Linux).

If you have a Xen hypervisor without bzImage dom0 kernel support, i.e. any official Xen release up to at least Xen 3.3.1, or most of the Xen versions shipped with Linux distributions (before 2009-03), use "linux-2.6-xen/vmlinux" as your dom0 kernel. (Note that "vmlinux" is huge, so you can also gzip it if you want to make it a bit smaller.)

Also read the previous paragraphs for other requirements.
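
The version rules above can be condensed into a small shell sketch. The function names (version_ge, xen_meets_minimum, pick_dom0_image) are hypothetical helpers written for this page, and the thresholds are simply the ones quoted in the text: Xen 4.0.1 as the minimum for the recommended 2.6.32.x dom0 kernel, and Xen 3.4 as the first version with bzImage dom0 support.

```shell
#!/bin/sh
# Succeed if dotted version $1 is greater than or equal to $2.
# Components are compared numerically, missing components count as 0.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Succeed if the given Xen version is at least 4.0.1, the minimum stated
# above for the recommended 2.6.32.x pvops dom0 kernel.
xen_meets_minimum() {
    version_ge "$1" 4.0.1
}

# Print which kernel image (relative to the kernel source tree) to boot
# as dom0 for the given Xen version: bzImage from Xen 3.4 on, else vmlinux.
pick_dom0_image() {
    if version_ge "$1" 3.4; then
        echo arch/x86/boot/bzImage
    else
        echo vmlinux
    fi
}
```

For example, pick_dom0_image 3.3.1 prints vmlinux, while xen_meets_minimum 4.0.0 fails. How the installed Xen version is obtained is left open here, since that depends on the toolstack in use.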

Get the sources from the repository. Something like the following can be used, but the exact path may change:


 $ cd /usr/src
 $ git clone http://xenbits.xenproject.org/xen.git
 $ cd xen
 $ sudo make xen
 $ sudo make tools
 $ sudo make install-xen
 $ sudo make install-tools


If you encounter any errors NOT related to dependencies, report them to xen-devel; if you need help compiling, you will find it on xen-users. When the build finishes, you will find xen.gz in the dist folder; follow that symlink to your real image.

Update services

The Ubuntu way to register a service is this:


 $ sudo update-rc.d xendomains defaults 21 20

For Red Hat use chkconfig.

For Fedora 16 use systemctl:

 $ sudo systemctl start xenstored.service

Other System Updates

The XENFS and XEN_COMPAT_XENFS config options are needed for /proc/xen support. If CONFIG_XEN_DEV_EVTCHN is compiled as a module, make sure to load the xen-evtchn.ko module or xend will not start.
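
A quick way to confirm these options before rebooting is to look them up in the kernel's .config file. The sketch below is only an illustration: config_missing is a hypothetical helper, and the location of the config file (for instance the .config in the kernel build tree) is an assumption that varies between systems.

```shell
#!/bin/sh
# Hypothetical helper: print every listed option that is not set to y or m
# in the given kernel config file.
# Usage: config_missing <config file> CONFIG_FOO CONFIG_BAR ...
config_missing() {
    cfg=$1
    shift
    for opt in "$@"; do
        # A built-in option reads OPTION=y, a modular one OPTION=m.
        grep -Eq "^${opt}=(y|m)\$" "$cfg" || printf '%s\n' "$opt"
    done
}

# Example (left commented out), run from the kernel source tree:
# config_missing .config CONFIG_XENFS CONFIG_XEN_COMPAT_XENFS CONFIG_XEN_DEV_EVTCHN
```

An option reported by this check is either commented out ("is not set") or absent from the file altogether. An option set to m still requires the corresponding module to be loaded, as noted above for xen-evtchn.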

You might also need to add a line to /etc/fstab. Xen 3.4.2 and newer automatically mounts /proc/xen when /etc/init.d/xend is started, so there is no need to add a xenfs mount entry to /etc/fstab on those systems:


none /proc/xen xenfs defaults 0 0
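
Before adding that line it may be worth checking whether an equivalent entry is already present. The following is a minimal sketch; has_xenfs_entry is a hypothetical helper name, and it only looks at the mount point and filesystem type columns of an fstab-style file.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given fstab-style file has an
# uncommented entry that mounts a xenfs filesystem on /proc/xen.
has_xenfs_entry() {
    awk '$1 !~ /^#/ && $2 == "/proc/xen" && $3 == "xenfs" { found = 1 }
         END { exit (found ? 0 : 1) }' "$1"
}

# Example (left commented out):
# has_xenfs_entry /etc/fstab || echo "no xenfs entry in /etc/fstab"
```

The function only reports. Whether the entry is needed at all still depends on the Xen version, as described above.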


GRUB1 and GRUB2 (Booting)

Working grub.conf example with VGA text console:


title        Xen 4.0, dom0 Linux kernel 2.6.32.24
root         (hd0,0)
kernel       /boot/xen-4.0.gz dom0_mem=512M
module       /boot/vmlinuz-2.6.32.24 root=/dev/sda1 ro nomodeset
module       /boot/initrd.img-2.6.32.24


NOTE! You need to give the correct root= parameter, i.e. replace /dev/sda1 with your actual root device. Check your earlier grub kernel entries for the correct option. You also need the "nomodeset" option for the time being. It's good to look at your initial kernel boot configuration and follow it as closely as possible, then modify the 40_custom file to suit your Xen needs. Keep in mind that your initial kernel entry might not use /dev/sdX; if it uses the UUID notation, use that as well.

Here is the content extracted from grub.cfg for an Ubuntu/Fedora kernel using grub2:


menuentry 'GNU/Linux, with Linux 2.6.38-8-generic' --class gnu-linux --class gnu --class os {
    recordfail
    set gfxpayload=$linux_gfx_mode
    insmod part_msdos
    insmod ext2
    set root='(/dev/sda,msdos2)'
    search --no-floppy --fs-uuid --set=root 016e7c8a-4bdd-4873-92dd-d71171a49d6d
    linux    /boot/vmlinuz-2.6.38-8-generic root=UUID=016e7c8a-4bdd-4873-92dd-d71171a49d6d ro   quiet splash vt.handoff=7
    initrd    /boot/initrd.img-2.6.38-8-generic
}

Compare them and you will see the similarities. On this particular system the boot device is /dev/sda2, yet if that name is used in grub the system will not boot. Instead GRUB uses the UUID scheme to detect the devices, and that way the root argument can be passed to the Linux kernel image.

Here are the contents of 40_custom:


#!/bin/sh
exec tail -n +3 $0
# This file provides an easy way to add custom menu entries.  Simply type the
# menu entries you want to add after this comment.  Be careful not to change
# the 'exec tail' line above.
menuentry 'GNU/Linux, with Linux 2.6.32.40-pv' --class gnu-linux --class gnu --class os {
        recordfail
        insmod part_msdos
        insmod ext2
        set root='(/dev/sda,msdos2)'
        search --no-floppy --fs-uuid --set=root 016e7c8a-4bdd-4873-92dd-d71171a49d6d
        multiboot /boot/xen-4.2-unstable.gz
        module /boot/vmlinuz-2.6.32.40-pv placeholder root=UUID=016e7c8a-4bdd-4873-92dd-d71171a49d6d dom0_mem=1024 console=tty  quiet splash vt.handoff=7
        module /boot/initrd.img-2.6.32.40-pv
}

Xen MUST be started using multiboot; that line points to the image created when Xen was compiled. The next module line is the Linux kernel built from the kernel repository, which can be either Jeremy's tree or kernel.org. After the image name you append placeholder or dummy=dummy, because grub will strip the first argument from the module line. The last line is the RAM disk created after the kernel.

Here is another working example grub.conf with serial console output (good for debugging since you can easily log the full kernel boot messages even if it crashes):


title        pv_ops dom0 (2.6.32.24) with serial console
root         (hd0,0)
kernel       /xen-4.0.gz dom0_mem=1024M loglvl=all guest_loglvl=all sync_console console_to_ring com1=19200,8n1 console=com1
module       /vmlinuz-2.6.32.24 ro root=/dev/vg00/lv01 console=hvc0 earlyprintk=xen nomodeset
module       /initrd-2.6.32.24.img

For more information about using a serial console with Xen please check the XenSerialConsole wiki page.

Booting DomU guests

If you've built a modular kernel, then all the modules will be the same either way. Some aspects of the kernel configuration have changed:

  • The console is now /dev/hvc0, so put "console=hvc0" on the kernel command line
  • Disk devices are always /dev/xvdX. If you want to dual-boot a system on both Xen and native, then it's best to use LVM, LABEL or UUID to refer to your filesystems in your /etc/fstab.
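
To find the /etc/fstab entries that would be affected by the device renaming, one possible approach is to list the lines whose first field is a raw disk device name. This is a sketch under that assumption; raw_device_entries is a hypothetical helper, and it treats /dev/sdX, /dev/hdX and /dev/xvdX names as "raw", leaving LVM paths, LABEL= and UUID= entries alone.

```shell
#!/bin/sh
# Hypothetical helper: print the entries of an fstab-style file whose
# device field is a raw disk name such as /dev/sda1, /dev/hda2 or
# /dev/xvda1. LVM paths (/dev/vg00/lv01), LABEL= and UUID= are not matched.
raw_device_entries() {
    awk '$1 ~ /^\/dev\/(sd|hd|xvd)[a-z]/ { print }' "$1"
}

# Example (left commented out):
# raw_device_entries /etc/fstab
```

Any line printed is a candidate for conversion to LVM, LABEL or UUID notation if the same installation should boot both natively and under Xen.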

Running

Troubleshooting, what to do if the custom built pv_ops dom0 kernel doesn't work/boot?

Make sure you read the troubleshooting section. It has tons of good suggestions. Also consult the XenCommonProblems or Category:FAQ pages.

Historical contents

Timeline of pvops

Xen pv_ops (domU) support has been in mainline Linux since 2.6.23 (though it was probably first usable in 2.6.24), and is the basis of all on-going Linux/Xen development (the old Xenlinux patches officially ended with 2.6.18.x-xen, though various distros have their own forward-ports of them). Recent Linux kernels (2.6.27 and newer) are good for domU use. Starting from Fedora 9, all new Fedora versions include a pv_ops based Xen domU kernel. Ubuntu 10.04 ("Lucid Lynx") and Debian 6.0 ("Squeeze") also include a Xen PV domU kernel, as does Red Hat Enterprise Linux 6.0.

Contributing

[This is obsolete, but it might be worth reading] Lots of work has been done to close the feature gaps compared to 2.6.18-xen. Many of these have been done - but there are still some left.

Devices

[Done] The Xen device model is more or less unchanged in the pv-ops kernel. Converting a driver from the xen-unstable or 2.6.18-xen tree should mostly be a matter of getting it to compile. There have been changes in the Linux device model between 2.6.18 and 2.6.26, so converting a driver will mostly be a matter of forward-porting to the new kernel, rather than any Xen specific issues.

CPU hotplug

[Done] All the mechanism should already be in place to support CPU hotplug; it should just be a matter of making it work. [2011-07-30]: It works with 3.0.

Device hotplug

[Done] In principle this is already implemented and should work. I'm not sure, however, that it's all plumbed through properly, so that hot-adding a device generates the appropriate udev events to cause devices to appear.

Device unplug/module unload

[Done] The 2.6.18-xen patches don't really support device unplug (and driver module unload), mainly because of the difficulties in dealing with granted pages. This should be fixed in the pvops kernel. The main thing to implement is to make sure that on driver termination, rather than freeing granted pages back into the kernel heap, they should be added to a list; that list is polled by a kernel thread which periodically tries to ungrant the pages and return them to the kernel heap if successful.

Development of PVOPS in Ingo's tree

[This is obsolete, but it might be worth reading]

All x86 Xen/pv-ops changes queued for upstream Linus are in Ingo Molnar's tip.git tree (http://git.kernel.org/?p=linux/kernel/git/x86/linux-2.6-tip.git;a=summary). You can get general information about fetching and using this tree in his README (http://people.redhat.com/mingo/tip.git/README). The x86/xen topic branch (http://git.kernel.org/?p=linux/kernel/git/x86/linux-2.6-tip.git;a=shortlog;h=x86/xen) contains most of the Xen-specific work, though changes in other branches may be necessary too. The auto-latest branch (http://git.kernel.org/?p=linux/kernel/git/x86/linux-2.6-tip.git;a=shortlog;h=auto-latest) is the merged product of all the other topic branches.

Testing

[This is obsolete, but it might be worth reading]

Xen/paravirt_ops has not had wide use or testing, so any testing you do is extremely valuable. If you have an existing Xen configuration, try updating the kernel to a current pv-ops kernel and using it as you usually would; any feedback on how well that works (success or failure) would be very interesting. In particular, information about:

  • performance: better/worse/same?
  • bugs: outright crash, or something just not right?
  • missing features: what can't you live without?

Linux distribution support for pv_ops dom0 kernels

[This is obsolete, but it might be worth reading]

Fedora: Fedora 14 includes Xen 4.0.1 hypervisor and is able to run pvops dom0 kernel out-of-the-box. Fedora 14 does not ship with a dom0 capable kernel in the default distribution, but xendom0 kernel rpms are available from developer repositories. Fedora 13 and earlier versions ship with Xen 3.4.x and are not recommended for pvops dom0 usage, unless you update the Xen hypervisor to 4.0.x version. See this tutorial for more help: Fedora13Xen4Tutorial , and also check the Fedora dom0 wiki page: http://fedoraproject.org/wiki/Features/XenPvopsDom0 .

Debian: Debian 6.0 ("Squeeze") includes Xen 4.0.x hypervisor, and also dom0 capable kernel based on the pvops tree.

Other distributions: When using pvops dom0 kernel 2.6.32 or newer you need to have Xen hypervisor 4.0.1 or newer version.

2.6.32 and earlier kernels

[This is obsolete, but it might be worth reading] The kernel build process will build two kernel images: arch/x86/boot/bzImage and vmlinux. They are two forms of the same kernel, and are functionally identical. However, older versions of the Xen tools stack (pre-Xen 3.2) lack support for loading bzImage files, so with those you must use the vmlinux form of the kernel (gzipped, if you prefer).

Config files for 2.6.32

Example .config files:

64bit x86_64 (branch: xen/stable-2.6.32.x):

32bit PAE:

  • xen/stable-2.6.31.x (2.6.31.6):

http://pasik.reaktio.net/xen/kernel-config/config-2.6.31.6-pvops-dom0-xen-master-x86_32

  • xen/stable-2.6.32.x (2.6.32.10):

http://pasik.reaktio.net/xen/kernel-config/config-2.6.32.10-pvops-dom0-xen-stable-x86_32

  • xen/stable-2.6.32.x (2.6.32.27):

http://pasik.reaktio.net/xen/kernel-config/config-2.6.32.27-pvops-dom0-xen-stable-x86_32

Those kernel configs are based on Fedora 11/12 kernel configuration, with some modifications. They've been tested to work on multiple systems. Note that these .config files have various debugging options enabled which will decrease performance so don't use these .config files for performance testing!
