
SAN based EVMS-HA SLES 10 setup

This small howto describes the setup and configuration of Xen installations in a Fibre Channel/iSCSI SAN environment. It assumes you have some basic knowledge of Heartbeat 2, Xen and EVMS. If not, please read up on the basics on their respective websites.

  • EVMS-HA is also possible using a 2-node DRBD configuration. This is not (yet) described here. The essential difference is that DRBD with HA also requires the actual failover of resources, whereas with a SAN this is not required, as I/O access should always be possible from every node (hence the fencing with EVMS-HA, to prevent problems and inconsistency).

Why is this example based on Novell SLES 10? Working with SANs usually implies that business continuity is important, and commercial support is a handy "addon" when things go wrong. This setup will probably work on other distributions as well.

Install SLES10 dom0

Install plain SLES10 copies and make sure you select the Xen software at installation time. EVMS and HA software should also be installed. X is not required, but recommended if you want to use the YaST Xen GUI. Do not apply EVMS or LVM to your local disks; keep them on the plain old DOS partition structure instead. Change /boot/grub/menu.lst so that it boots the Xen kernel. On a new install, this simply requires changing "default 0" into "default 1". Now reboot the system. Don't forget to update the system before continuing.
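The menu.lst change can also be scripted. The following is only a sketch: the function name set_xen_default is made up for this example, and it assumes the Xen kernel really is the second GRUB entry (index 1), so check the file by hand first.

```shell
# Sketch: switch the GRUB default entry from 0 to 1.
# Assumption: the Xen kernel is the second entry (index 1) in menu.lst.
set_xen_default() {
    menu="$1"
    cp "$menu" "$menu.bak"    # keep a backup next to the original
    # Rewrite only a line that reads exactly "default 0"
    sed 's/^default[[:space:]]\{1,\}0[[:space:]]*$/default 1/' "$menu.bak" > "$menu"
}

# Usage (as root):
#   set_xen_default /boot/grub/menu.lst
```

The backup stays in menu.lst.bak, so a wrong guess about the entry order is easy to undo.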

SAN operations

Make sure you present the same LUNs to all participating dom0s and that they can perform I/O on the LUNs simultaneously! Preferably give the LUNs the same name on each dom0. This is easier with an FC SAN (i.e. /dev/disk/by-id/WWN-LUN) than with iSCSI, which might require some udev rules (out of scope here for now).
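A quick way to check that every dom0 sees the same names is to list the by-id entries on each node and compare the results. This is a rough sketch; the helper name list_lun_ids and the file names in the usage comment are invented for this example.

```shell
# Sketch: list whole-disk entries under /dev/disk/by-id, sorted,
# leaving out the per-partition links (names ending in -partN).
list_lun_ids() {
    dir="${1:-/dev/disk/by-id}"
    ls "$dir" | grep -v -- '-part[0-9][0-9]*$' | sort
}

# Usage: run on every dom0, copy the files to one node and compare
#   list_lun_ids > /tmp/luns.$(hostname)
#   diff /tmp/luns.dom0-1 /tmp/luns.dom0-2
```

Local disks will show up in the listing too, so expect differences for those; only the SAN LUN entries need to match.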

EVMS configuration files

This example uses EVMS only to administer the SAN LUNs that back your Xen domains. To keep local and Xen disk management strictly separated, exclude the local disks from the EVMS-HA cluster. Change /etc/evms.conf in the following way:

sysfs_devices { 
  include = [ * ] 
  exclude =  [ cciss* ]
   ... 
}
 ... 
csm { 
  admin_mode = no 
}

Replace "cciss*" with your local disk(s). admin_mode has little to do with administration; it is meant for recovery and maintenance when things have gone bad, so turn it off. More on this in the EVMS user guide.

Heartbeat 2 configuration

Set up Heartbeat on all nodes to enable EVMS cluster awareness. Apart from the usual IP and node configuration (which can be done using YaST in SLES10), adding just 2 lines to /etc/ha.d/ha.cf will do:

respawn root /sbin/evmsd
apiauth evms uid=hacluster,root

A full configuration would look like this:

autojoin any
crm true
auto_failback off 
ucast xenbr0 172.16.0.1
node dom0-1
node dom0-2
respawn root /sbin/evmsd
apiauth evms uid=hacluster,root
  • ucast xenbr0 172.16.0.1 might seem a little odd here, and it is. It applies specifically (and only) to bonded network interfaces in your dom0! Multicast for HA simply doesn't seem to work (yet) over the software bridge. So with bonding, use ucast to each node. Also, xenbr0 tends to work better than other interfaces.

Start the cluster node by node (slowly, give each one the time to come up and discover its neighbours):

/etc/init.d/heartbeat start

Make sure both nodes sync and come up (keep an eye on /var/log/messages). You can use the crmadmin tool to query the state of the master and the nodes. cl_status is also useful for checking link and daemon status.
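The checks above can be bundled into a small helper. This is a sketch only: the function name cluster_status is made up, and it relies on the cl_status subcommands hbstatus, listnodes and nodestatus and on the crmadmin options -D and -N as shipped with Heartbeat 2.

```shell
# Sketch: print a short cluster overview on the local node.
cluster_status() {
    cl_status hbstatus || return 1    # stop here if heartbeat is not running
    for n in $(cl_status listnodes); do
        echo "$n: $(cl_status nodestatus "$n")"
    done
    crmadmin -D    # show the designated coordinator (the "master")
    crmadmin -N    # list the nodes known to the CRM
}

# Usage (as root, on any node):
#   cluster_status
```

If a node does not show up as active, give it some more time and check /var/log/messages before starting the next one.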

EVMS Cluster

* When all nodes are up and running, start evmsgui (or evmsn, whichever you prefer) on one of the nodes. If you click the Settings menu and find the option "Node administered" enabled, congratulations: you've got a cluster-aware EVMS!

* Create a container with the Cluster Segment Manager and select your attached SAN storage objects (for ease of demonstration, the SAN storage is named sdb and sdc here). Choose any node name, set the type to shared storage and name the container c_sanstorage, for example.

* You can pass the SAN disks straight through as EVMS volumes (see the disk list in the available objects). Don't do that, as these volumes will be fixed in size (the size originally presented by the SAN). Use EVMS for storage management instead. For this, create another container, this time with the LVM2 Region Manager, and put all the objects from the CCM c_sanstorage in it (the objects will have names like c_sanstorage/sdb, c_sanstorage/sdc, ...). Choose the extent size at will and name the container vgsan, for instance.

* Go to the Regions tab and create regions from the LVM2 freespace, named and sized at will: domu01stor, domu02stor, ...

* Save the configuration. All settings will now be applied and you will find your correctly sized and named volumes in /dev/evms/c_sanstorage/

* Close the evmsgui/evmsn tool and partition the evms disks:

fdisk /dev/evms/c_sanstorage/domu01stor

* Partitioning might require starting the evmsgui/evmsn tool again and saving, in order for the partitions to appear like this:

/dev/evms/c_sanstorage/domu01storp1
/dev/evms/c_sanstorage/domu01storp2

* Format the partitions (ext2/3 and mkswap). Now it's easy to mount the EVMS device in question locally on a temporary mount point:

mount /dev/evms/c_sanstorage/domu01storp2 /mnt/tmp
  • Note that experienced EVMS users can also perform the partitioning and formatting steps with the EVMS tools: just add a DOS Segment Manager on the EVMS volume in question, size the segments at will and format them.
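The format step mentioned above could look like the sketch below. It assumes a layout the howto does not spell out: p1 is used as swap and p2 as an ext3 root filesystem (p2 is the partition that gets mounted above). The function name format_domu_volume is made up for this example.

```shell
# Sketch: format the two partitions of one domU volume.
# Assumption: p1 = swap, p2 = root filesystem (ext3).
format_domu_volume() {
    vol="$1"
    mkswap "${vol}p1"       # swap space for the domU
    mke2fs -j "${vol}p2"    # -j adds a journal, i.e. creates ext3
}

# Usage (as root):
#   format_domu_volume /dev/evms/c_sanstorage/domu01stor
```

Both commands destroy whatever is on the partitions, so double check the volume name before running this.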

* Copy an existing unpacked Xen domU onto it (e.g. one of the jailtime images). This will result in a full file structure inside the EVMS volume:

/mnt/tmp/bin
/mnt/tmp/boot
/mnt/tmp/etc
 ...

* Unmount the volume again (umount /mnt/tmp).

* Modify the Xen configuration file(s) so that they point to (and boot from) your EVMS volumes.
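As an illustration, a domU configuration pointing at the EVMS volumes might be generated as follows. Treat this as a sketch under assumptions: the helper name write_domu_config, the kernel and ramdisk paths, the memory size and the sda1/sda2 device names inside the domU are all made up for this example; only the /dev/evms/c_sanstorage/ paths and xenbr0 come from this howto.

```shell
# Sketch: write a minimal xm configuration for a domU on EVMS volumes.
# Assumptions: kernel/ramdisk paths, memory size and the sda1/sda2
# names are hypothetical; p1 is swap and p2 is the root filesystem.
write_domu_config() {
    name="$1"
    out="$2"
    cat > "$out" <<EOF
name    = "$name"
memory  = 512
kernel  = "/boot/vmlinuz-xen"
ramdisk = "/boot/initrd-xen"
root    = "/dev/sda2 ro"
disk    = [ 'phy:/dev/evms/c_sanstorage/${name}storp1,sda1,w',
            'phy:/dev/evms/c_sanstorage/${name}storp2,sda2,w' ]
vif     = [ 'bridge=xenbr0' ]
EOF
}

# Usage:
#   write_domu_config domu01 /etc/xen/vm/domu01
```

Use the same file on every dom0, so that each node can start the domain from the shared volumes.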

Known issues

  • For the EVMS configuration to be updated on all nodes, you have to select each node in "Node administered" and save for each node (only then will the correct devices be created on that node).
  • This could be a structural question, but... being cluster aware, shouldn't the EVMS-HA combination (with CCM) provide locking on volumes created beneath the CCM? It is perfectly possible to corrupt data by mounting an EVMS volume on node 2 while that volume is also mounted on node 1. Some kind of locking should step in:

dom0-2# mount /dev/evms/c_sanstorage/domu01storp2 /mnt/tmp
failure: volume domu01storp2 is already mounted on node dom0-1
  • Or something along those lines. It has nothing to do with the shared/private CCM type; the issue also exists with private CCMs. To top that, with a private CCM each node sees the CCM as owned by itself, not by its neighbour!

Both issues are rather essential with Xen domains: you wouldn't want to boot the same domain twice (one copy running on dom0-1 and another on dom0-2), as data corruption is guaranteed!

Addition

After some research and help from Novell I figured out the following:

  • The EVMS-HA combination provides transparent block mapping, but does NOT provide locking (as CLVM does)! Hence you'd have to go for a bloated file-backed Xen system on an OCFS2 filesystem... or use the following to prevent the same domU from running twice:

The high availability program Heartbeat 2 has a Xen resource agent (OCF RA). Currently it features a dynamic restart of a resource (domU) if a physical HA node (dom0) goes down. Live migration is supported in Heartbeat 2 and can be tested with SLES10 SP1 RC4 (I have not tested it myself yet, will update on the procedure asap).
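For illustration, a domU could be handed to Heartbeat roughly like this. It is an untested sketch: the helper name xen_primitive_xml, the resource id and the configuration path are invented, and it assumes the Heartbeat 2 CIB layout (instance_attributes/attributes/nvpair) and the xmfile parameter of the Xen resource agent.

```shell
# Sketch: print a CIB primitive that lets Heartbeat 2 manage one domU.
# Assumptions: the resource id and the path to the xm file are
# hypothetical; xmfile is the parameter read by the Xen OCF RA.
xen_primitive_xml() {
    name="$1"
    xmfile="$2"
    cat <<EOF
<primitive id="$name" class="ocf" provider="heartbeat" type="Xen">
  <instance_attributes id="${name}_attrs">
    <attributes>
      <nvpair id="${name}_xmfile" name="xmfile" value="$xmfile"/>
    </attributes>
  </instance_attributes>
</primitive>
EOF
}

# Usage (on one node, as root):
#   xen_primitive_xml domu01 /etc/xen/vm/domu01 > /tmp/domu01.xml
#   cibadmin -C -o resources -x /tmp/domu01.xml
```

Once Heartbeat manages the domain, start and stop it through the cluster only; a manual xm create on a second dom0 would bring back exactly the double-boot problem described above.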

EVMS-HAwSAN-SLES10 (last edited 2007-05-29 07:09:38 by subspawn)