Remus provides transparent high availability to ordinary virtual machines running on Xen. It does this by continually live migrating a copy of a running VM to a backup server, which automatically activates if the primary server fails. Key features:
- The backup VM is an exact copy of the primary VM (disk/memory/network). When failure occurs, the VM continues running on the backup host as if failure had never occurred.
- The backup is completely up-to-date: even active TCP sessions are maintained without interruption.
- Protection is transparent: existing guests can be protected without modifying them in any way.
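The Remus NSDI 2008 paper listed under the research papers explains how the backup can lag slightly behind and still preserve active TCP sessions. The primary runs in short epochs, and network output produced during an epoch is held back until the backup has acknowledged the checkpoint for that epoch. The following is a toy Python model of that idea only. The class and method names (`ToyPrimary`, `ToyBackup`, `run_epoch`, `checkpoint`) are hypothetical, and nothing here reflects the actual Remus code.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBackup:
    """Holds the last checkpoint that was fully received."""
    state: dict = field(default_factory=dict)

    def receive_checkpoint(self, snapshot: dict) -> bool:
        self.state = dict(snapshot)  # keep a private copy of the snapshot
        return True                  # acknowledgement back to the primary


@dataclass
class ToyPrimary:
    backup: ToyBackup
    state: dict = field(default_factory=dict)
    pending_output: list = field(default_factory=list)   # packets held back
    released_output: list = field(default_factory=list)  # packets clients have seen

    def run_epoch(self, updates: dict, packets: list) -> None:
        """Run speculatively: change state and queue output, release nothing."""
        self.state.update(updates)
        self.pending_output.extend(packets)

    def checkpoint(self) -> None:
        """Send state to the backup; release buffered output once acknowledged."""
        if self.backup.receive_checkpoint(self.state):
            self.released_output.extend(self.pending_output)
            self.pending_output.clear()


backup = ToyBackup()
primary = ToyPrimary(backup)

primary.run_epoch({"counter": 1}, ["reply-1"])
primary.checkpoint()
# Assume the primary fails here, after running an epoch but before checkpointing it.
primary.run_epoch({"counter": 2}, ["reply-2"])
```

In this model the backup holds `{"counter": 1}` and clients have only ever seen `reply-1`, so resuming from the backup cannot contradict anything that was externally visible. The unreleased `reply-2` is simply lost with the failed epoch.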
Host (dom0) requirements
- Xen hypervisor and tools with Remus support (included with Xen 4.0+)
- Note: Remus is not included with XCP, XenServer, or some pre-packaged Linux distributions of Xen. Check your distribution; you may need to build Xen from source.
- Xen dom0 kernel that meets the Remus dom0 requirements
- Shared storage is not required
- DRBD storage is supported, allowing faster, automatic re-synchronization after a failed host is brought back online
- Otherwise, to bring a failed node back online, the VM must be shut down to perform disk re-synchronization
Guest (domU) requirements
- Xen PV guests that meet the Remus PV domU requirements
- Xen HVM guests don't require any changes for Remus
Installation varies slightly depending upon the host platform, so please see the guides below for examples.
- Install Xen 4.2.1 with Remus and DRBD on Ubuntu 12.10
- Install Xen 4.1.4 with Remus and DRBD on Ubuntu 12.10
Using DRBD instead of blktap2 for storage replication allows quick re-synchronization of the disk backend after a failed host is back online. Because storage (re)synchronization happens online, while the VM is operational, there is no need to shut down the VM. Once storage is synchronized, Remus can be started, stopped, and restarted on a running VM at any time.
However, DRBD must be custom built with support for protocol D (see the install guides above); the standard packaged versions of DRBD are not suitable.
Note that DRBD will be operated in dual primary mode, which carries a number of risks and management issues. Please research this topic to be aware of the potential complications.
- Install DRBD 8.3.11 (remus version) on Debian Squeeze/Ubuntu 10.04
- Install DRBD 8.3.11 (remus version) on Ubuntu 12.10
- For PV domUs, Remus performs best with "suspend event channel" kernel support. Without it, Remus can still run almost any PV domU, but in a degraded-performance mode. This kernel support is not widely available (it is not currently available in Ubuntu, for example), but it is available with OpenSUSE. See Remus PV domU requirements for more information.
In Xen 4.0.0:
- Xen hypervisor and tools have Remus support.
- Only linux-2.6.18-xen is supported as Xen dom0 kernel with Remus.
- If using a PV domU you need to run linux-2.6.18-xen as domU kernel.
In Xen 4.0.1:
- Pvops dom0 kernel support for Remus was added in Xen 4.0.1-rc4 and is therefore available in the Xen 4.0.1 final release. A Linux 2.6.32-based pvops dom0 kernel can be used with Remus.
- PV domU kernel still needs to be linux-2.6.18-xen.
In Xen 4.2:
- Many bugfixes to Remus.
- Remus support for pvops domU kernels: Linux 126.96.36.199 and later upstream kernel.org versions are now supported as PV domU kernels, in addition to Jeremy's xen.git xen/stable-2.6.32.x branch.
- For better Remus performance, use a domU kernel with "suspend event channel" support, which means linux-2.6.18-xen or any of the xenlinux forward ports (the Novell SLES11 SP1 2.6.32 kernel, for example). Pvops domU kernels don't have suspend event channel support yet.
- Checkpoint compression for less data to transfer between hosts.
In Xen 4.4:
- Experimental support for xenlight (xl). Per the man page, there is no support for network or disk buffering at the moment.
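As an illustration of driving the experimental xl support from a script, the sketch below builds an `xl remus` command line without executing it. It assumes the `xl remus [OPTIONS] domain-id host` form and the `-i` (checkpoint interval in milliseconds) and `-u` (disable checkpoint compression) options as documented in the xl man page; verify these against the man page of the installed Xen version. The helper name `build_xl_remus_command`, the domain name `guest1`, and the host `backup.example.org` are placeholders.

```python
import shlex


def build_xl_remus_command(domain, backup_host, interval_ms=None, compress=True):
    """Build the argv list for an 'xl remus' invocation (nothing is executed)."""
    argv = ["xl", "remus"]
    if interval_ms is not None:
        if interval_ms <= 0:
            raise ValueError("interval_ms must be a positive number of milliseconds")
        argv += ["-i", str(interval_ms)]  # assumed option: checkpoint interval
    if not compress:
        argv.append("-u")                 # assumed option: no checkpoint compression
    argv += [domain, backup_host]
    return argv


cmd = build_xl_remus_command("guest1", "backup.example.org", interval_ms=100)
print(shlex.join(cmd))  # xl remus -i 100 guest1 backup.example.org
```

Returning a list rather than a string lets the caller pass it directly to `subprocess.run` without shell quoting issues.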
Xen 4.5 plans (per the 4.5 Development Update):
- Significant improvements to Remus are currently under review, including better xl integration, better DRBD support, stability and performance improvements, and enablement of COLO support
- COLO support in Xen (see COLO XPDS13 video, COLO XPDS13 presentation)
- The "xl remus" command has a similar, but not identical, syntax to the traditional xend-based "remus" command. See Remus_Toolkit_Differences. Per the man page, "xl remus" still has limitations in the 4.4 release.
- Research papers about Remus:
- Remus - NSDI 2008: design and fundamental concepts.
- SecondSite - VEE 2012: Remus in the wide area, with checkpoint compression, replication, and storage re-synchronization using DRBD.
- Configuring and installing Remus (Xend version) with DRBD, Xen 4.1.2 under Debian Squeeze/Ubuntu 10.04: http://remusha.wikidot.com/
- Video demonstration of Remus: http://joburg.eu/en/video/2jV4lOgFJMY/Screen-cast-Remus-High-Availability-based-on-Xen-Hypervisor
- Source code for the Remus version of DRBD 8.3.11: http://remusha.wikidot.com/local--files/configuring-and-installing-remus/drbd-8.3.11-remus.tar.gz