At the September 2006 Xen Summit in San Jose, California, several of us met after the session on testing. It was clear before, during, and after the session that many people were doing exactly the same sorts of testing of Xen. Duplicating that much effort across the community makes little sense. This page therefore attempts to collect all available information on Xen testing so we can cooperate and collaborate -- and spread the work around so that no one has to do everything.
Testing Xen is enormously complex. Attempting to ensure complete testing of the code requires us to test Xen and the Linux kernel as modified for Xen. Factor into that the number of paravirtual and HVM guests possible, and the phenomenal breadth of hardware that can be used, and the combinatorics are staggering.
Hence, this proposal: we will employ a strategy of black-box testing. We will test as much as we can of the various APIs and ABIs available; this gives us the flexibility to test independently of Xen itself (i.e., the tests are not tied to a particular version or instance of Xen), and it gives us the flexibility to contemplate other hypervisors and host OSs in the future -- as long as they abide by the defined interfaces.
Testing, not Certification
The goal is to determine whether Xen works correctly. The goal is explicitly not to certify Xen -- we have no desire to be involved in any sort of 'branding' or 'marketing' exercise around whether one can say 'Xen Inside'. Our only desire is to determine whether Xen does what it claims to do.
In order to organize things, testing is broken down into these categories:
- Functional: is Xen doing what it should be doing? The tests in this category determine proper operation; e.g., does 'xm create' actually create and boot a guest domain?
- Regression: having fixed a bug, can we demonstrate the bug is still fixed? Ideally, for every bug filed, there will be a test created to demonstrate it, and the regression test suite will grow over time.
- Performance: does Xen operate quickly enough, or at least within expected boundaries? There are two aspects to this: (1) Xen itself, and (2) the platform running Xen. The former means determining things like 'is the difference in disk I/O speed between native Linux, dom0, and domU a reasonable percentage?'; the latter must be left to the platform vendors, who can judge whether Xen runs as quickly as they expect on their systems.
- Stress: these tests answer questions such as 'how many domUs can I actually run?' and 'how many domains can I migrate at the same time?'. Again, there will be limits inherent to Xen itself, and there will be limits specific to various hardware platforms. The latter are the concern of the platform vendors; this effort is concerned with the former.
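As a concrete illustration of the functional category, a black-box check of 'xm create' boils down to asking 'xm list' whether the new domain is actually up. The sketch below exercises only the parsing logic: the domain name 'testvm' and the stubbed-in sample output are assumptions for illustration, and on a real Xen host the stub would be replaced by a live call to 'xm list'.

```shell
#!/bin/sh
# Black-box functional check: after "xm create", does the guest actually
# show up in "xm list" in a running (r) or blocked (b) state?

DOMAIN="testvm"   # illustrative name, not a real convention

# Sample "xm list" output, stubbed in so the parsing logic can run
# without a Xen host.  On a live system: OUTPUT=$(xm list)
OUTPUT='Name                         ID Mem(MiB) VCPUs State  Time(s)
Domain-0                      0      512     1 r-----   123.4
testvm                        1      256     1 -b----     5.0'

# Pull out the State column (5th field) for our domain.
STATE=$(printf '%s\n' "$OUTPUT" | awk -v d="$DOMAIN" '$1 == d { print $5 }')

case "$STATE" in
    *r*|*b*) echo "PASS: $DOMAIN is up (state $STATE)" ;;
    "")      echo "FAIL: $DOMAIN not found"; exit 1 ;;
    *)       echo "FAIL: $DOMAIN in unexpected state $STATE"; exit 1 ;;
esac
```

The same skeleton works for other functional checks: run an 'xm' command, then interrogate the observable state it should have changed, never the internals.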
The tests to be run, how they are to be run, and the criteria for determining whether or not the tests passed are contained in these tables:
- XenTest/KeyToColumns -- how to interpret the columns in the above tables
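The regression category above calls for one test per fixed bug, with the suite growing over time. A minimal driver for such a suite might look like the sketch below; the directory layout, the 'regression.d' name, and the one-script-per-bug convention are assumptions here, not an existing XenTest standard. Each per-bug script exits 0 if the bug is still fixed.

```shell
#!/bin/sh
# Sketch of a regression-suite driver: one executable script per fixed
# bug lives under $SUITE_DIR; each exits 0 if the bug remains fixed.

SUITE_DIR="${SUITE_DIR:-./regression.d}"
pass=0; fail=0

# Create a throwaway example test so the driver can be demonstrated.
mkdir -p "$SUITE_DIR"
cat > "$SUITE_DIR/bug-0001.sh" <<'EOF'
#!/bin/sh
# Placeholder for a real reproduction of a fixed bug.
exit 0
EOF
chmod +x "$SUITE_DIR/bug-0001.sh"

# Run every test and tally the results.
for t in "$SUITE_DIR"/*.sh; do
    if "$t"; then
        pass=$((pass + 1)); echo "PASS: $t"
    else
        fail=$((fail + 1)); echo "FAIL: $t"
    fi
done
echo "ran $((pass + fail)) tests: $pass passed, $fail failed"
```

Because the driver knows nothing about what each script does, filing a bug and committing its reproduction script are all it takes to grow the suite.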