Overview

This page is also being used to list all projects for Google Summer of Code (GSOC) 2010. While the primary purpose of this page is Xen Cloud Platform (XCP) work items, we are also allowing other Xen projects to be listed for review by GSOC participants. Have a look at the XCP short term roadmap and the XCP proposals page.

This page lists XCP project suggestions, suitable for new people to have a go at.

Suggestions are rated in three dimensions, each rated on a scale of 1 to 5.

  • Size, identifying the amount of code which would need to be written;
  • Knowledge required, identifying how much knowledge of the existing XCP code is required (not necessarily programming experience); and
  • Impact, identifying the value of the benefits of an implementation to XCP.

For each suggestion, a contact is named who may be able to provide hints, advice or assistance relating to that area of XCP. Usually this is the originator of the suggestion. Being named as a contact is not a guarantee of help.

New suggestions or edits to existing suggestions are welcome.

Suggestions

Description | Contact | Size | Knowledge required | Impact
DRBD integration | dave.scott@eu.citrix.com | 2 | 2 | 3
OCFS 2 SM backend | dave.scott@eu.citrix.com | 2 | 2 | 3
parallax SM backend | andrew.warfield@citrix.com | 2 | 2 | 3
Create a query language for the CLI | jonathan.davies@eu.citrix.com | 4 | 1 | 2
Guest RPC mechanism | dave.scott@eu.citrix.com | 4 | 4 | 5
Convenient HTTP GET URL for consoles | dave.scott@eu.citrix.com | 1 | 1 | 2
Xen/XCP kernel/userland packages for Debian Ubuntu RHEL | marcus.granado@eu.citrix.com | 2 | 3 | 3
Move HTTP server into a separate process | jonathan.ludlam@eu.citrix.com | 3 | 1 | 4
Import and export improvements | jonathan.ludlam@eu.citrix.com | 3 | 4 | 4
Pool master in a VM | jonathan.ludlam@eu.citrix.com | 3 | 4 | 4
Improve SR-IOV support | jonathan.ludlam@eu.citrix.com | 2 | 5 | 5
Clone-on-attach | jonathan.ludlam@eu.citrix.com | 2 | 2 | 4
Expose new xen features | jonathan.ludlam@eu.citrix.com | 2 | 3 | 5
Database rollback | jonathan.ludlam@eu.citrix.com | 2 | 2 | 3
Procedure to remove database fields | dave.scott@eu.citrix.com | 3 | 2 | 3
Domain management daemon | dave.scott@eu.citrix.com | 5 | 5 | 5
Upgrade pool hosts in any order | dave.scott@eu.citrix.com | 5 | 4 | 5
Helper VMs | dave.scott@eu.citrix.com | 3 | 3 | 3
Automated pool configuration | jonathan.davies@eu.citrix.com | 2 | 2 | 4
Better error handling | jonathan.davies@eu.citrix.com | 2 | 1 | 1
Extend ocamldebug to support multiple threads | jonathan.knowles@eu.citrix.com | 4 | 4 | 4
Better parameterization of xapi | marcus.granado@eu.citrix.com | 2 | 2 | 3
Extra external authentication plugins | marcus.granado@eu.citrix.com | 2 | 2 | 4
Pool split | rob.hoes@citrix.com | 3 | 3 | 4
Better iSCSI multipathing | pasik@iki.fi | 2 | 3 | 3
HXEN Linux Port | stephen.spector@xen.org | ? | ? | ?
MacOS X fakeserver support | avsm2@cl.cam.ac.uk | ? | ? | ?
Bytecode Zamcov coverage testing | avsm2@cl.cam.ac.uk | ? | ? | ?
XenDbg - Xen Debugger support | kamala.narasimhan@citrix.com | ? | ? | ?
Perf - Virtualize PMU | jeremy@goop.org | 2-4 | 3 | 3-5
pciback domain | jeremy@goop.org | 4 | 4-5 | 5
unify graphics pass through in HVM guests | konrad.wilk@oracle.com | 1 | 2 | 3
pxe boot xcp using a stateless ramdisk root | pasik@iki.fi | 4 | 5 | 5

DRBD integration

The DRBD project website describes DRBD as

block devices designed as a building block to form high availability (HA) clusters. This is done by mirroring a whole block device via an assigned network. DRBD can be understood as network based raid-1.

XCP has a set of storage plugins for a range of storage types. Several of these operate on top of block devices, in particular:

  • iSCSI - used for iSCSI "LUN per VDI" applications e.g. where you want to attach an existing LUN to a guest
  • HBA - used for FC "LUN per VDI" applications e.g. where you want to attach an existing LUN to a guest
  • LVM - carves up a single local disk into guest disks, allows snapshot, fast clone etc.
  • LVMoiSCSI - carves up a single iSCSI LUN into guest disks, allows snapshot, fast clone etc
  • LVMoHBA - carves up a single FC LUN into guest disks, allows snapshot, fast clone etc

One way to integrate DRBD would be to take the LVM storage backend and modify it so that, instead of running directly over a local disk, it is indirected through a DRBD device. This would allow (e.g.) active/passive replication of the SR and some form of VM failover, without requiring a shared disk.

OCFS 2 SM backend

The OCFS2 website describes OCFS2 as

a POSIX-compliant shared-disk cluster file system for Linux capable of providing both high performance and high availability.

Two of the XCP storage plugins represent guest disks as .vhd files on a filesystem:

  • NFS - used when a shared NFS server is available
  • EXT - creates an ext3 filesystem on a local disk

An additional storage plugin could be created which represents guest disks as .vhd files on an OCFS2 filesystem.

parallax SM backend

The parallax website describes parallax as

a more flexible storage system is needed to support rapid virtual machine creation and state capture.
...
Parallax can perform very low overhead snapshots, and can quickly provision new volumes based on template images.

Similar to the DRBD and OCFS2 suggestions above, a new storage plugin could be created which interfaces XCP to a parallax service.

Create a query language for the CLI

Scope

Limited to CLI component.

Description

At present, the "xe" command-line interface (CLI) is a rather difficult tool to use to extract data from the database. A common task is to want to join across tables. For example, to destroy the VDIs on local storage, you have to use a sequence of commands, as follows:

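# Find the UUID of the SR named "Local storage"
sr=$(xe sr-list name-label="Local storage" params=uuid --minimal)
# --minimal prints a comma-separated list, so turn the commas into spaces
for vdi in $(xe vdi-list sr-uuid="$sr" managed=true params=uuid --minimal | sed 's/,/ /g')
do
  echo "vdi=$vdi"
  xe vdi-destroy uuid="$vdi"
done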

This is rather unclean. Better would be to allow the user to express all of this in one go. For example, in an SQL-like language:

SELECT vdi-destroy(vdi.uuid) FROM vdi JOIN sr ON vdi.sr-uuid=sr.uuid WHERE sr.name-label='Local storage'

Of course, SQL need not be the language of choice.

Getting started

  • Familiarise yourself with the current CLI.
  • Look at code in the xen-api.hg repository:
    • ocaml/xe-cli/newcli.ml (client-side)
    • ocaml/xapi/cli_operations.ml (server-side)
  • Consider whether to implement joins, projection and filtering on the server-side or the client-side.
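
To make the client-side option concrete, here is a minimal sketch of a join, filter and projection over records that have already been fetched. It assumes each record is a plain dictionary keyed by the field names used above (uuid, sr-uuid, name-label); the function name and the sample data are hypothetical, and this is not how the existing CLI represents records.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


def join_vdi_sr(
    vdis: List[Record],
    srs: List[Record],
    where: Callable[[Dict[str, Record]], bool],
    select: Callable[[Dict[str, Record]], str],
) -> List[str]:
    """Inner-join VDIs to SRs on vdi.sr-uuid = sr.uuid, then filter and project."""
    # Index SRs by uuid so the join is one dictionary lookup per VDI.
    sr_by_uuid = {sr["uuid"]: sr for sr in srs}
    results = []
    for vdi in vdis:
        sr = sr_by_uuid.get(vdi["sr-uuid"])
        if sr is None:
            continue  # inner join: skip VDIs whose SR was not fetched
        row = {"vdi": vdi, "sr": sr}
        if where(row):
            results.append(select(row))
    return results


# Hypothetical sample data standing in for the output of sr-list and vdi-list.
srs = [
    {"uuid": "sr-1", "name-label": "Local storage"},
    {"uuid": "sr-2", "name-label": "NFS"},
]
vdis = [
    {"uuid": "vdi-a", "sr-uuid": "sr-1"},
    {"uuid": "vdi-b", "sr-uuid": "sr-2"},
    {"uuid": "vdi-c", "sr-uuid": "sr-1"},
]

local_vdis = join_vdi_sr(
    vdis,
    srs,
    where=lambda row: row["sr"]["name-label"] == "Local storage",
    select=lambda row: row["vdi"]["uuid"],
)
```

A client-side join like this needs every record of both tables to be transferred first; a server-side implementation could avoid that at the cost of changes to xapi.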

Guest RPC mechanism

Scope

xenstore, xapi

Description

xapi uses xenstore to configure domains and some guests use xenstore to signal back a limited amount of information (e.g. OS version, IP address). These xenstore protocols are low-level, ad-hoc and very error-prone. Extending them for new applications is very tedious. A lot of the time we only want something a bit more like a procedure call (e.g. "ask the guest to execute this script"). It would be great if it were possible to register an RPC server skeleton inside the guest and associate it with a named function and then call it via the XenAPI.
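
As a rough illustration of the "named function" idea, the sketch below shows a guest-side registry that maps names to functions and dispatches JSON-encoded requests to them. The class name, the message fields ("function", "args", "status") and the use of JSON are all assumptions made for the example; the transport (xenstore or otherwise) and the XenAPI side are deliberately left out.

```python
import json
from typing import Any, Callable, Dict


class GuestRpcServer:
    """Hypothetical guest-side skeleton: a registry of named functions."""

    def __init__(self) -> None:
        self._functions: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        """Decorator that associates a function with a name callers can use."""
        def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
            self._functions[name] = fn
            return fn
        return decorator

    def handle(self, request: str) -> str:
        """Decode one request, call the named function, and encode the reply."""
        message = json.loads(request)
        name = message.get("function")
        fn = self._functions.get(name)
        if fn is None:
            return json.dumps({"status": "error", "error": f"unknown function: {name}"})
        try:
            result = fn(*message.get("args", []))
        except Exception as exc:  # report failures to the caller instead of crashing
            return json.dumps({"status": "error", "error": str(exc)})
        return json.dumps({"status": "ok", "result": result})


server = GuestRpcServer()


@server.register("os_version")
def os_version() -> str:
    return "example-os 1.0"  # placeholder value for the example


reply = json.loads(server.handle(json.dumps({"function": "os_version"})))
unknown = json.loads(server.handle(json.dumps({"function": "no_such_function"})))
```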

Getting started

  • For some of the ad-hoc stuff have a look in xen-api.hg
    • ocaml/xenops/device.ml

Convenient HTTP GET URL for consoles

Scope

Limited: some HTTP and HTML

Description

On XCP, console access is quite difficult. You have to either use the Javascript UI or manually log in via the API, get a session, talk the non-standard protocol to make a tunnel and then talk VNC/RFB.

Running 'xe console-list' lists URLs for consoles... but these are non-standard HTTP CONNECT URLs: cutting and pasting into a web-browser results in a 404 error. Instead we could register a GET handler for the console URL and return an HTML fragment referencing the Java VNC client. We would also get HTTP basic authentication for free. The result would be:

  • user pastes console URL into the location bar
  • user is asked for username and password
  • user gets console
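
The second step relies on standard HTTP basic authentication, where the browser sends an "Authorization: Basic ..." header containing base64("username:password"). The sketch below shows how a GET handler might decode that header; the function name is hypothetical, and checking the credentials against xapi is out of scope here.

```python
import base64
import binascii
from typing import Optional, Tuple


def parse_basic_auth(header_value: str) -> Optional[Tuple[str, str]]:
    """Return (username, password) from an Authorization header value, or None."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None  # not valid base64, or not valid UTF-8 once decoded
    username, separator, password = decoded.partition(":")
    if not separator:
        return None  # the credentials must contain a colon
    return username, password


# Build a header the way a browser would, then decode it again.
header = "Basic " + base64.b64encode(b"root:secret").decode("ascii")
credentials = parse_basic_auth(header)
```

If the header is missing or invalid, the handler would reply with status 401 and a WWW-Authenticate header, which is what makes the browser prompt for a username and password.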

Getting started

  • Look in xen-api.hg:
    • ocaml/xapi/console.ml

Xen/XCP kernel/userland packages for Debian Ubuntu RHEL

Scope

Build system around XAPI.

Description

  • Improve the XAPI build system to build packages for major distros.
  • Provide selected versions of these distros with Xen-enabled kernel packages.

Getting started

  • Build system

Move HTTP server into a separate process

Scope

Mainly the HTTP part of xapi

Description

The xapi process currently listens on the http port and deals with requests itself. To enable the use of other processes listening on the http port (for example, a storage service), the suggestion is to move the http server out into a separate process and then pass the connection (by passing the file descriptor) to the appropriate process. The authentication layer could then live in the http server and be shared by all services.

Getting started

The stdext library in xen-api-libs.hg already has a binding for 'sendmsg', the POSIX call used to send file descriptors (fds) between processes. An example of its use is in the fork-and-exec daemon 'fe' in xen-api-libs.hg. The stdext library also has some useful functions for helping with the daemonization procedure.
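
For readers unfamiliar with descriptor passing, the following sketch shows the underlying mechanism using Python's standard socket module (socket.send_fds and socket.recv_fds, available on Unix from Python 3.9) rather than the OCaml binding. For brevity both ends live in one process and a pipe stands in for an accepted HTTP connection; in the proposed design the two socket ends would belong to the HTTP server and the target service.

```python
import os
import socket


def pass_descriptor_demo() -> bytes:
    """Send a pipe's read end over a Unix socket and read through the received copy."""
    front, back = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    received = []
    try:
        # The descriptor travels as ancillary data next to a small payload.
        socket.send_fds(front, [b"fd"], [read_end])
        _payload, received, _flags, _address = socket.recv_fds(back, 16, 1)
        # The received descriptor is a new number referring to the same pipe.
        os.write(write_end, b"hello")
        return os.read(received[0], 5)
    finally:
        for fd in [read_end, write_end, *received]:
            os.close(fd)
        front.close()
        back.close()


data = pass_descriptor_demo()
```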

Import and export improvements

Scope

xapi and the nascent storage daemon. Careful thought has to be given to the design.

Description

Our current import/export file format is a tar archive containing an XML description of the VM metadata, together with the disk images. Each disk image is split into blocks of a fixed size (some number of megabytes), and blocks that are entirely zero are omitted from the tar. This could clearly be improved, particularly given that many of our disk images are backed by VHD files that are already sparse. However, VHD files are not the whole answer, since thought would have to be given to what should happen when a chain of VHDs makes up the VDI. Thought must also go into how an import to a non-VHD based SR would work.
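
The zero-block omission described above can be sketched as follows. The function names and the tiny block size are assumptions chosen for illustration; the real format stores blocks as tar entries and uses a block size measured in megabytes.

```python
from typing import Dict, Iterator, Tuple


def nonzero_blocks(image: bytes, block_size: int) -> Iterator[Tuple[int, bytes]]:
    """Yield (block index, data) for every block that is not entirely zero."""
    zero = bytes(block_size)
    for index, offset in enumerate(range(0, len(image), block_size)):
        block = image[offset:offset + block_size]
        # The final block may be short, so compare against a zero run of equal length.
        if block != zero[:len(block)]:
            yield index, block


def rebuild(blocks: Dict[int, bytes], block_size: int, total_size: int) -> bytes:
    """Recreate the image: start from all zeros and write back the stored blocks."""
    image = bytearray(total_size)
    for index, block in blocks.items():
        start = index * block_size
        image[start:start + len(block)] = block
    return bytes(image)


# A 14-byte "disk" with a 4-byte block size: blocks 0 and 2 are all zero.
disk = b"\x00" * 4 + b"abcd" + b"\x00" * 4 + b"ef"
exported = dict(nonzero_blocks(disk, 4))
imported = rebuild(exported, 4, len(disk))
```

Note that the importer needs the total image size from the metadata, because trailing zero blocks leave no trace in the archive.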

Getting started

Examine the code in import.ml and export.ml.

Pool master in a VM

Scope

xapi, maybe dom0

Description

The pool master is currently a designated physical host. If that host dies, a new host must be designated to be the new master. An alternative design is to have the pool master be xapi running in a VM. There are several advantages to this:

  1. If the host on which the master VM is running dies, the master VM can simply be started on another host, thus preserving (for example) the IP of the master
  2. Currently our control domain is limited to one vCPU. The master VM however could be multi-cpu, and hence CPU intensive tasks such as HA planning would be less problematic.
  3. The master VM could be migrated, hence not tied to a particular host.

Getting started

The SDK can be used as an example. The pool join logic will need to be made more permissive, and logic to ensure the master isn't selected as an appropriate host for VMs would need to be implemented.

Improve SR-IOV support

Scope

xenops, metrics (rrd), guest agent

Description

Hardware that supports SR-IOV is now becoming available on server-class machines, and Xen already has support for it. Xapi has also been shown to support it, but only as a prototype, with the metrics code being particularly fragile. The code for this prototype is available, but it is crude and makes some assumptions about the internal state of the guest. A better design is needed, which might involve modifications to the guest agent, and that design then needs to be implemented.

Getting started

The code for the prototype can be made available.

Clone-on-attach

To be filled in

Expose new xen features

Scope

xen-api.hg, libxenlight

Description

For XCP, we have intentionally been quite restrictive in what features we expose to users - preventing people from using some features that Xen is capable of. It would be nice to expose these features (walled off from normal use via an 'experimental features' type restriction). This would also serve as a good start at the integration of libxenlight.

Getting started

It's unlikely that the first iteration would link directly against libxenlight, as it's a very new project and we don't want to expose xapi to any memory corruption issues that a new C project may have. Therefore it might be prudent to either use the libxenlight CLI (xl) or link the library into a separate process and define an RPC API to talk to it.

Database rollback

Scope

database layer in xapi, and possible API definitions

Description

The database is a flat XML file, and any update simply overwrites the previous database. It would be nice to be able to 'tag' a known good database and roll back to it if anything unfortunate happens.

Of course, the database is not totally self-contained, and necessarily contains information about the outside world, e.g. disk images, network settings and so forth. Thus this feature would need a step to synchronise the database with the world. Some of this is already done, e.g. in dbsync, but some extra work would be needed, e.g. to make sure SR.scan is called, and to synchronise the network state (one way or the other).

It's not known at this stage whether this is a good approach or not - determining this would be part of the project.
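
At the file level, tagging and rolling back could be as simple as the sketch below, which copies the database file to a tagged sibling and later copies it back. The function names, the file name and the naming scheme for tagged copies are hypothetical, and the sketch ignores both atomicity and the re-synchronisation with the outside world discussed above.

```python
import shutil
import tempfile
from pathlib import Path


def tag_database(db_path: Path, tag: str) -> Path:
    """Copy the current database to a sibling file named after the tag."""
    snapshot = db_path.with_name(f"{db_path.name}.{tag}")
    shutil.copy2(db_path, snapshot)
    return snapshot


def roll_back(db_path: Path, tag: str) -> None:
    """Overwrite the live database with the copy saved under the tag."""
    snapshot = db_path.with_name(f"{db_path.name}.{tag}")
    shutil.copy2(snapshot, db_path)


# Demonstrate on a throwaway file rather than a real xapi database.
with tempfile.TemporaryDirectory() as tmp:
    db = Path(tmp) / "state.db"
    db.write_text("<database generation='1'/>")
    tag_database(db, "known-good")
    db.write_text("<database generation='2'/>")  # a later, unwanted update
    roll_back(db, "known-good")
    restored = db.read_text()
```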

Getting started

A good first step would be to try to make a (probably non-exhaustive) list of the database tables/rows/fields that represent external state not under xapi's direct control, and assess feasibility of the approach.

Procedure to remove database fields

To be filled in

Domain management daemon

To be filled in

Upgrade pool hosts in any order

To be filled in

Helper VMs

To be filled in

Automated pool configuration

Scope

Does not need to interfere with any code in xen-api.hg; could be implemented as a stand-alone API or CLI client.

Description

It is non-trivial to configure a pool of XCP hosts. Configuring a pool often consists of installing or upgrading hosts, configuring networking (bonds, management interfaces, etc.) and storage repositories. Unfortunately, many of these steps are fraught with potential difficulties. For example, the order in which hosts are upgraded is important, and the order in which hosts are joined to a pool and network bonds are created is important.

Typically, a sysadmin has a good idea of what configuration they want, and implements this configuration through a combination of XenCenter and the CLI. For casual administrators, the overhead of learning the CLI is large, and the job of setting up the pool may be a one-off task. It would be very useful to reduce the overhead of configuring a pool.

Design a means by which an administrator can specify a description of the configuration and use an (off-host?) tool to achieve this configuration. The tool could make use of the CLI and API, issuing the relevant calls in the appropriate order. An advanced version could also integrate PXE support to control host installation.
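
One possible core for such a tool is a dependency graph of steps that is sorted into a valid execution order. The sketch below uses graphlib.TopologicalSorter from the Python standard library (Python 3.9 or later). The step names and the dependencies between them are purely illustrative assumptions; working out the real ordering constraints is part of the project.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def plan(steps: Dict[str, Set[str]]) -> List[str]:
    """Return the steps in an order where every prerequisite comes first.

    `steps` maps each step name to the set of steps that must finish before it.
    A cycle in the description raises graphlib.CycleError.
    """
    return list(TopologicalSorter(steps).static_order())


# Hypothetical description of a two-host pool; the dependencies are NOT
# verified XCP ordering rules, only an example of the data structure.
desired = {
    "install host-a": set(),
    "install host-b": set(),
    "join host-b to pool": {"install host-a", "install host-b"},
    "create bond": {"join host-b to pool"},
    "create shared SR": {"create bond"},
}

order = plan(desired)
```

Each step in the resulting list would then be translated into the relevant CLI or API calls.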

Getting started

Think about the steps involved in joining hosts to a pool such that they have a two-way bond.

Better error handling

Scope

The entire OCaml codebase.

Description

A regular source of bugs stems from incorrect assumptions about lengths of lists. It is common for the function List.hd to be executed on an empty list, giving rise to a 'Failure(hd)' exception. Usually, this exception is uncaught and rises to the top-level of xapi. It is usually hard to diagnose because the exception message is not very descriptive.

Instead, every place in the code where List.hd or List.tl is called should check whether the list is empty and, if it is, raise an appropriate exception with a descriptive error message.
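
The pattern is sketched below in Python rather than OCaml, purely for illustration: a checked "head" helper that names what was expected instead of failing with a generic error. The helper name, the exception type and the message wording are all hypothetical.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


class EmptyListError(Exception):
    """Raised when code needed at least one element but found none."""


def head(items: Sequence[T], what: str) -> T:
    """Return the first element, or fail with a message that says what was missing."""
    if not items:
        raise EmptyListError(f"expected at least one {what}, but the list was empty")
    return items[0]


first = head(["vbd-1", "vbd-2"], "VBD for the VM")

try:
    head([], "VBD for the VM")
    message = ""
except EmptyListError as exc:
    message = str(exc)
```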

Getting started

Look at all the places where List.hd or List.tl are called where their argument is not guaranteed to be a non-empty list.

Extend ocamldebug to support multiple threads

Scope

Extension of the OCaml distribution.

Description

The OCaml distribution comes with a debugger called OCamlDebug. While OCamlDebug comes with support for remote debugging (over a network connection), it has no support for multiple threads.

Since Xapi makes significant use of threading, it's not currently possible to use OCamlDebug to debug remote Xapi instances. As a workaround for the lack of a proper debugger, developers often resort to inserting logging statements in their code. However, this method requires developers to recompile their code and restart the Xapi instance of interest, often losing the state they had hoped to inspect in the process.

A successful project should achieve the following:

  • extend the over-the-wire protocol between OCamlDebug and OCamlRun to support multiple threads
  • extend OCamlRun and OCamlDebug to support this protocol
  • generate a set of patches to the OCaml distribution. We could keep these patches internally (not desirable, due to the maintenance overhead), or we could contribute them to the OCaml project (much more desirable).

Getting started

Examine how multi-threaded debugging works in other languages.

Read up on and/or reverse-engineer the existing OCamlDebug protocol.

Better parameterization of xapi

Scope

The entire XAPI codebase.

Description

  • Expose most XAPI internal hard-coded constants in easy-to-modify plaintext configuration files in Dom0.
    • e.g. timeout constants, strings
  • Create an OCaml xen-api-lib module to read from such configuration files.
  • Add a synchronization/refresh mechanism (inotify?).
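
As a sketch of what reading such a file might involve, the example below parses a simple key=value format with '#' comments and falls back to built-in defaults. The file format, the key names and the default values are assumptions made for illustration; the real module would be written in OCaml and would also need the refresh mechanism.

```python
from typing import Dict

# Hypothetical constants that are currently hard-coded, with their defaults.
DEFAULTS = {
    "http_timeout_seconds": "30",
    "db_flush_interval_seconds": "5",
}


def parse_config(text: str, defaults: Dict[str, str]) -> Dict[str, str]:
    """Parse key=value lines; keys missing from the file keep their defaults."""
    settings = dict(defaults)
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding space
        if not line:
            continue
        key, separator, value = line.partition("=")
        if not separator:
            raise ValueError(f"line {line_number}: expected key=value, got {raw!r}")
        settings[key.strip()] = value.strip()
    return settings


settings = parse_config("# local overrides\nhttp_timeout_seconds = 60\n", DEFAULTS)
```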

Getting started

  • Have a look at all the timeout constants in the database and HTTP connection logic inside XAPI.

Extra external authentication plugins

Scope

External authentication framework in ocaml/auth.

Description

  • Create new interesting External Authentication Plugins (EAP) for XAPI
    • Improve PAM EAP (e.g. use a new /etc/pam.d/xapi-pam-eap instead of xapi's /etc/pam.d/xapi)
    • Samba EAP (either directly or via PAM EAP)
    • Radius EAP (either directly or via PAM EAP)
    • a bash-hookable EAP (that could be extended/customized via bash in Dom0)

Getting started

  • See the files in ocaml/auth for the Likewise/AD EAP and the existing PAM EAP.
  • Modify the OCaml PAM bindings in xen-api-lib.hg/pam to accept a different PAM configuration file as a parameter from PAM EAP.

Pool split

Scope

  • Pool join and eject code

Description

Currently, the only way to get a host out of a pool is by using Pool.eject, which brings the host back into a state resembling a fresh install. This means that it is not possible to remove a host from a pool while keeping its state and any VMs running on it. It may be useful to be able to split a pool into two (or more) smaller pools while maintaining all metadata and VMs.

Points to think about:

  • How to choose the masters of the new pools.
  • What to do with shared storage. To which pool should halted VMs that are on shared storage go?

As a further extension, consider how to join two (or more) pools that contain more than one host. Currently, one would have to first break down one of the pools to individual hosts (again losing all state), and then add these hosts to the other pool.

Better iSCSI multipathing

Scope

  • Enable iSCSI multipathing when all the ports/interfaces are in the same IP subnet as the iSCSI target array (for example Dell Equallogic iSCSI SAN storage).

Description

Currently it's not possible to establish multiple separate iSCSI sessions through multiple eth-interfaces if all the interfaces have IP addresses from the same subnet. Upstream open-iscsi has supported the 'ifaces' feature for some time, so you can bind each open-iscsi 'iface' to a specific eth-interface. This allows you to establish separate iSCSI sessions from every NIC even when they are in the same IP subnet. RHEL 5.3 and newer support this functionality.

Example of this open-iscsi 'ifaces' functionality: http://pasik.reaktio.net/open-iscsi-multiple-ifaces-test.txt
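
The sketch below builds, without running them, the kind of iscsiadm command lines involved in binding one open-iscsi iface to each NIC and then discovering targets through all of them. The iscsiadm options follow the open-iscsi documentation as best understood here and should be checked against the installed version; the function name, the iface names and the portal address are hypothetical.

```python
from typing import List


def iface_setup_commands(nics: List[str], portal: str) -> List[List[str]]:
    """Return argv lists that create one iface per NIC and discover via all of them."""
    commands: List[List[str]] = []
    iface_names: List[str] = []
    for index, nic in enumerate(nics):
        iface = f"iface{index}"
        iface_names.append(iface)
        # Create the iface record, then bind it to the network interface.
        commands.append(["iscsiadm", "-m", "iface", "-I", iface, "--op=new"])
        commands.append([
            "iscsiadm", "-m", "iface", "-I", iface, "--op=update",
            "-n", "iface.net_ifacename", "-v", nic,
        ])
    # Run sendtargets discovery through every iface so each NIC gets its own session.
    discovery = ["iscsiadm", "-m", "discovery", "-t", "st", "-p", portal]
    for iface in iface_names:
        discovery += ["-I", iface]
    commands.append(discovery)
    return commands


# 192.0.2.10 is a documentation-only address used as a placeholder portal.
commands = iface_setup_commands(["eth0", "eth1"], "192.0.2.10")
```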

MacOS X fakeserver support

Scope

  • Port the "fakeserver" to run under MacOS X, and possibly other related UNIX-style operating systems where Xen doesn't work (e.g. OpenBSD)

Description

XAPI only runs on Linux at the moment, but for development purposes with the toolstack, it would be useful to run the toolstack and "fakeserver" on MacOS X, with VMs being represented by stub callbacks (and possibly use the fake VNC server which is also in the tree).

XenDbg - Xen Debugger Support

Goal: Help expedite troubleshooting issues, find faulting code easily etc. To that end adding better debugging support will help. A debugger may not necessarily help track down timing/some interrupt related issues but for the rest it should certainly be helpful.

Why target this for a summer project? We can break down this topic into more modular sub topics to be worked out in different stages by different folks. Also, once we build a sound base and provide an interface for creating add-ons/plug-ins/extensions, others in the community can easily add those extensions to better expose the under the hood data they care about most without destabilizing the core debugging functionality.

Components/Tasks breakdown:

Here is a quick/rough breakdown of tasks/components at a very high level (first cut) -

Target changes:

This would include additions to the hypervisor itself for it to be debugged. Likely to be too intrusive for upstream; so we might want to maintain it as patches that would easily apply against all major releases and latest xen-unstable.

Host changes/GUI debugger:

A GUI debugger, home grown or built on top of an existing one; preferably home grown as it would provide more flexibility in the long run. Phase 1 could be very basic with break into, step-in, registers, callstack etc. Phase 2 could special case and better handle dom0, guests etc.

Debugging medium:

Phase 1 could just include serial debugging. Firewire and USB support could be added in subsequent phases.

Symbol Engine:

Better symbol handling to hunt down the right symbols and show callstack with appropriate symbol and proper source correlation with disassembler etc.

Core Xen debugging commands:

Under the hood information about Xen core components like memory management info can be exposed through core Xen debugging commands. Most of the current key handler implementation would fall under this.

Xen debugging extension interface:

Provide an interface that can read from and write to memory locations, provide symbol info, basic register info etc., so that others can build extensions on top of the core commands.

Sample Extensions:

Example usage of debugging extension interface.

Live debugging support:

Support to run on target and glean minimal information.

Crash dump generation and analysis:

Support for this could be added in subsequent phases.

Perf - Virtualize PMU

The goal of this project is to get Linux's "perf" performance measuring infrastructure working in a Xen domain, and to make it available to the usermode "perf" tool. Ideally this will be available to both PV and HVM guests.

The project would consist of several parts:

  • Xen: examine the existing PMU support, and determine whether it is sufficient, or extend it appropriately.
  • Linux: Interface the existing kernel perf infrastructure to use Xen's ABI.
  • (Bonus) Add virtualization-specific events, such as stolen time and other hypervisor scheduler events
  • (Bonus) Add system-wide performance monitoring, so that interdomain effects can also be observed
  • (Bonus) Use the tools to evaluate pvops performance, and improve it

This project will require an understanding of the CPU's PMU hardware features, some aspects of Xen's internals, and the architecture of Linux's performance measuring infrastructure. Ideally the outcome will be a set of clean patches which can be applied to mainline Xen and Linux development.

pciback domain

The goal of this project is to develop a special-purpose "pciback" domain. This would be similar to other special-purpose domains, such as for xenstore, qemu stub, etc. Its purpose would be to manage all the PCI devices (including platform hardware such as bus bridges, interrupt routing, etc) on the system, and to act as a PCI backend for other domains. In this model, "dom0" or driver domains would still exist, but rather than directly accessing hardware they would get the devices passed through pcifront.

This has several outcomes:

  • This simplifies modifications to a PV kernel, because it eliminates most of the difference between a "dom0" and a "domU" kernel.
  • Goes a long way to allowing HVM domains to be driver domains (with the addition of HVM backend drivers)
  • Removes dom0 from the trusted base, at least with respect to hardware access (the pciback domain and Xen would manage VT-d and related hardware features; dom0 would be no different from any other domain with passed-through devices)

The steps to complete this would be:

  • Work out what base OS to use for pciback (minios?)
  • Work out what's needed to drive all the platform hardware (port PCI code from Linux?)
  • Implement pciback within this framework
  • Work out how to boot the system (how do we bring both dom0 and pciback up at the same time?)
  • Evaluate performance (does this add additional context switches? Does it matter? Are there any surprises?)
  • Is it worthwhile? Does the maintenance burden of a new special purpose domain outweigh the benefits?

unify graphics pass through in HVM guests

The biggest problem with passing a graphics card to an HVM guest is that the guest device drivers expect the card to be in a reset state. Normally the BIOS does that, but running the graphics card in an HVM guest presents a twist to the problem: the card has already been set up, so a subsequent BIOS reset is required. Depending on the card this can take the form of the D3 PCI state, Function Level Reset, some proprietary mechanism, or calls from the option ROM to the Bochs BIOS. To compound the problem, having QEMU load the option BIOS isn't that easy. A mechanism to do this exists for Intel Integrated Graphics (IGD) controllers, but it has not been extended to other cards (NVidia, ATI, etc).

The steps involved are to:

  • Work out a generic framework for extracting ROM images and passing them to QEMU
  • Figure out the required reset mechanisms
  • Update QEMU to have appropriate PnP BIOS calls in place if required.

Advantages of this:

  • Allow unrestricted access to graphics cards.
  • 2D and 3D operations in guests.
  • Pave the way for other PCI cards to be passed in (fibre)

pxe boot xcp using a stateless ramdisk root

This feature allows you to run xcp hosts without any local disks attached. The xcp dom0 filesystem is stored on a ramdisk root, which is loaded from a tftp server during pxe boot. This method doesn't require a dedicated iscsi root disk for each xcp host.

Steps:

  • figure out all the files in the current xcp dom0 having some state information or per-host customized configuration data in them.
  • create a prototype xcp dom0 ramdisk root image that can be pxe booted.
  • plan how to centrally manage all the configuration/state files.
  • modify the ramdisk image to work with central configuration management tool.
  • create scripts that can be used to create and update the ramdisk root images.
  • write documentation about how to set up the pxe-based automatic xcp provisioning system.
  • implement actual virtual machine appliance that can be used as the pxe/boot server for stateless xcp hosts, and also possibly as the state/configuration manager.
  • NFS-root might be an easy option to start experimenting with this feature.

This feature will require a lot more actual planning. VirtualIron (based on Xen) implemented this feature already years ago, so we could check their dom0 images for tips/help.

Benefits

  • quick and simple provisioning of new xcp physical hosts: power on the server, and the server will automatically register itself to the management server after pxe booting xcp. No software to install manually.
  • quick and simple software updates: just reboot the physical xcp host, and it'll automatically load (pxe boot) the new version of hypervisor/kernel/toolstack.

XAPI project suggestions (last edited 2010-05-27 14:31:29 by DavidScott)