Migrate Xen to KVM


Migrate Paravirtualized Xen to KVM under RHEL

Posted by Roel Gloudemans on 9 July 2009

http://www.gloudemans.info/migrate-paravirtualized-xen-to-kvm-under-rhel/


Update July 11, 2009: Re-registering VMs at RHN uses an extra entitlement with the RHEL5.4 Beta
Update July 15, 2009: Swap usage, clock and disk cache of the virtual machine
Update July 16, 2009: Replace virsh create with virsh define & start to create a managed domain and not a transient one
Update September 2, 2009: Re-registering with RHN works
Update September 2, 2009: RHEL5.4 has been released. Added a note about services on the physical host
Update September 6, 2009: Updating TimeKeeping and Hugepages
Update April 1, 2010: Hugepages configuration for RHEL 5.5
Update July 3, 2010: Make Hugepages mountpoint persistent
RedHat Enterprise Linux version 5.4 is out. It heralds the arrival of KVM as RedHat's official hypervisor. RedHat will be supporting Xen for the rest of the RHEL5 life cycle, so for the moment, there is no need to migrate to KVM.

However, migrating to KVM has some advantages. For one, KVM looks simpler from the outside; another is that it works with a normal kernel, meaning that all drivers that work on a normal kernel work here as well. This not only encompasses display drivers, but CPU scaling (dynamically adapting the speed of the CPU) as well. That is not only very "green", but makes a difference in your or the company's wallet as well.

RedHat put a lot of work into making Xen easier to manage in RHEL5.0-5.3. As a result, Xen uses a single disk image from which it can boot. The format of this image is the same as for KVM. One would suspect that migrating from one hypervisor to the other would be easy, and it is. This blog describes a step-by-step scenario on how to do it.

The starting situation is a RHEL5.3 physical host with RHEL5.3 paravirtualized guests. The guests have two network interfaces: one bridged to the physical network interface, and one bridged to a dummy network interface for an internal host network. Note that the minimum requirement to run with virtio is RHEL5.3.

Note:
I had some trouble with SELinux in the RHEL 5.4 beta. It is related to the attributes on /var/lib/libvirt. I do not use this directory to store the images, but I use raw LVM volumes. To get my system running again, I just disabled SELinux.
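A minimal way to do that on RHEL, assuming the stock /etc/selinux/config layout, is something like this (permissive mode is usually enough for testing; SELINUX=disabled turns it off completely but only takes effect after a reboot):

setenforce 0                                                             # switch to permissive mode immediately
sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config    # persist across reboots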

Configure the virtio drivers
Open /etc/modprobe.conf in the editor. In our case /etc/modprobe.conf contains the following lines:

alias eth0 xennet
alias eth1 xennet
alias scsi_hostadapter xenblk

change it to

alias eth0 virtio_net
alias eth1 virtio_net
alias scsi_hostadapter virtio_blk

Now add the virtio drivers to the kernel boot image (modify this line to mirror the latest kernel version):

mkinitrd -f --with=virtio_blk --with=virtio_pci --builtin=xenblk initrd-2.6.18-128.1.16.el5.img 2.6.18-128.1.16.el5
The --builtin option is only necessary when currently running under a Xen kernel in paravirtualized mode.

Internal clock
The internal clock of KVM is less stable than the clock under Xen. Heavy loads have been known to cause clock drift. There are two workarounds:

  • Boot with divider=10 notsc (see earlier) and start ntpd at boot (chkconfig --level 2345 ntpd on; configure ntp first)
  • Use the -no-kvm-pit-reinjection option with qemu-kvm. One of the improvements in the final RHEL5.4 release is that libvirt now seems to add this option by default, so everything should work out of the box. You still need to start ntp, though.


Also see ( https://bugzilla.redhat.com/show_bug.cgi?id=507834 )
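The first workaround boils down to adding the two parameters to the kernel line in the guest's /boot/grub/menu.lst and enabling ntpd inside the guest. A minimal sketch, assuming a standard grub setup in the guest (kernel version and root device are illustrative, reusing the version from the mkinitrd example above):

kernel /vmlinuz-2.6.18-128.1.16.el5 ro root=/dev/VolGroup00/LogVol00 divider=10 notsc

chkconfig --level 2345 ntpd on
service ntpd start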

Now shut down the virtual system (shutdown -h now)

Updating the host
The physical host needs some updating as well. First, before you start, make sure all virtual systems are stopped (xm list) and that you are logged on as root. If RHEL5.4 has already been released, yum will update the system to this version automatically. If not, the system needs to be subscribed to the RHEL5.4 beta channel. You can do this at RedHat Network, if your system is subscribed to RHN. Also make sure the system has access to the Virtual Platform beta channel. Aside from the updates, some new packages need to be installed as well, and all virtualization services must be disabled at boot time until we are ready with the configuration work.

yum clean all #for safety
yum update
yum install kernel kvm kvm-tools kmod-kvm kvm-qemu-img bridge-utils
chkconfig --level 2345 xend off
chkconfig --level 2345 xendomains off
chkconfig --level 2345 rhn-virtualization-host off

Edit /boot/grub/menu.lst and set the default boot kernel to the newest non-xen kernel (see example grub config)
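A hedged sketch of what the relevant part of /boot/grub/menu.lst might look like afterwards (kernel versions and root device are illustrative); the point is that default= selects the non-xen entry:

default=0
timeout=5
# Entry 0: the new non-xen kernel
title Red Hat Enterprise Linux Server (2.6.18-164.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-164.el5 ro root=/dev/VolGroup00/LogVol00
        initrd /initrd-2.6.18-164.el5.img
# Entry 1: the old xen kernel, kept as a fallback
title Red Hat Enterprise Linux Server (2.6.18-164.el5xen)
        root (hd0,0)
        kernel /xen.gz-2.6.18-164.el5
        module /vmlinuz-2.6.18-164.el5xen ro root=/dev/VolGroup00/LogVol00
        module /initrd-2.6.18-164.el5xen.img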

Network configuration
By default, only a network that is connected via NAT to the outside world is created. There are three options: leave it as is (but check that the IP range does not conflict with anything on the local network), change the IP range, or convert it to a host-only network. I left the NAT network but adapted the IP range, and created a new network for host-only networking. Be sure to change the uuid of the new network. The format of the uuid should not change; just change some of the hex digits [0-9a-f] in the uuid string.
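If the uuidgen utility is available on the host (it normally is on RHEL), it can generate a fresh value to paste into the <uuid> element instead of editing digits by hand:

uuidgen    # prints a new random UUID; paste it into the <uuid> element of the copied network definition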

/etc/libvirt/qemu/networks/default.xml
<network>
  <name>default</name>
  <uuid>cc06c2a2-0766-45ee-baaa-896e04c7a3be</uuid>
  <forward mode="nat"/>
  <bridge name="virbr0" stp="on" forwardDelay="0"/>
  <ip address="a.b.c.d" netmask="255.255.255.0">
    <dhcp>
      <range start="a.b.c.e" end="a.b.c.f"/>
    </dhcp>
  </ip>
</network>
/etc/libvirt/qemu/networks/hostonly.xml
<network>
  <name>hostonly</name>
  <uuid>04255669-803e-d8f6-352a-086fa45ae09d</uuid>
  <bridge name="virbr1" stp="on" forwardDelay="0"/>
  <ip address="a.b.g.h" netmask="255.255.255.0">
    <dhcp>
      <range start="a.b.g.i" end="a.b.g.j"/>
    </dhcp>
  </ip>
</network>


The host-only network should be started at boot, so  ln -s /etc/libvirt/qemu/networks/hostonly.xml /etc/libvirt/qemu/networks/autostart/ . Note that this network will replace the network coupled to the dummy0 interface, so dummy0 should not start up after a reboot. To do this, move  /etc/sysconfig/network-scripts/ifcfg-dummy0  to a safe location, or edit it and change the ONBOOT option from "yes" to "no".

Note:
If you run any services on the physical host that are bound to the network interface of the host-only network, you need to watch the boot order. Most services are started before libvirtd, but the virtual bridges only exist after libvirtd has been started, so any service started before libvirtd will not be able to bind to the virbrX interfaces. Named (bind), for instance, binds to the interfaces. If you use the host-only network to reach a nameserver on the physical host, you need to restart named after boot (of the physical host), or the guests cannot reach the nameserver.

The bridged network is a bit more complex. Use the configuration file of eth0 as a basis: cp /etc/sysconfig/network-scripts/ifcfg-eth0 /etc/sysconfig/network-scripts/ifcfg-br0. The IP configuration moves from eth0 to the new bridge: eth0 keeps only its device name, hardware address and ONBOOT setting and gets a BRIDGE=br0 line, while br0 gets TYPE=Bridge plus the IP settings (and no HWADDR). The end result looks like this:

/etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
HWADDR=ab:cd:ef:gh:ij:kl
BRIDGE=br0
ONBOOT=yes

/etc/sysconfig/network-scripts/ifcfg-br0
DEVICE=br0
TYPE=Bridge
BOOTPROTO=static
BROADCAST=a.b.c.255
IPADDR=a.b.c.d
NETMASK=255.255.255.0
NETWORK=a.b.c.0
ONBOOT=yes
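Once the host's network has been restarted (or after the reboot later in this howto), the bridge setup can be sanity-checked with the bridge-utils package installed earlier; a quick sketch:

brctl show          # eth0 should be listed as a port of br0
ip addr show br0    # the host's IP address should now live on br0, not on eth0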


Now br0 can be used as a bridge interface. All traffic over the bridge interface is subject to filtering by iptables. I think this is a great feature, since it allows you to centralize firewalling on each host. Even better, the firewall rules are not susceptible to change if a virtual machine is ever compromised. However, Xen worked in a different fashion, and our Xen-based images already have their own firewall rules. To skip the host's firewall rules for bridged traffic, do:


echo net.bridge.bridge-nf-call-ip6tables = 0 >> /etc/sysctl.conf
echo net.bridge.bridge-nf-call-iptables = 0 >> /etc/sysctl.conf
echo net.bridge.bridge-nf-call-arptables = 0 >> /etc/sysctl.conf
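Settings appended to /etc/sysctl.conf only take effect at boot; to apply them to the running host right away, something like the following should work:

sysctl -p    # re-read /etc/sysctl.conf so the bridge-nf settings apply without a reboot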

Swap usage and caching
If your physical machine is only running virtual machines and memory is not oversubscribed (all VMs together use no more than 80-90% of total memory), you might want to limit swap usage. Since the kernel sees each VM as a process, the rules for processes apply: pages that are not referenced for a while are paged out to swap, freeing up memory for other processes or for cache. That speeds up whatever is actively being used, but for a VM it is unwanted behavior. On a dedicated host nothing else runs, and I don't want my VMs' memory being cached on the host, since caching already happens inside the VM. Double caching gives inconsistent performance behavior, let alone the effects when the host crashes.

There are two ways to put a stop to paging and swapping. The first is not to create a swap file at all. The second is to set the kernel swappiness parameter to a low value. I've set it to 0.

echo vm.swappiness = 0 >> /etc/sysctl.conf
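The value can also be inspected and changed on the running system, for example:

cat /proc/sys/vm/swappiness    # current value
sysctl -w vm.swappiness=0      # apply immediately, without waiting for a reboot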


See the virtual machine config file on how to turn off disk caching for virtual machines.

Converting the virtual machine configuration file
There are two ways of converting to KVM. The easiest one is to use virt-manager and create a new virtual machine with exactly the same details as the old one, but point it to a different virtual disk (the smallest possible) to prevent overwriting any existing data. Then stop the machine (no need to really install anything) and change the configuration file in /etc/libvirt/qemu by hand to point at the right disk image. This method requires you to reboot first, or else the configuration tools won't see the networks we just created.

The other method is to convert the virtual machine definition by hand. Below is an old Xen definition file (/etc/xen/test1):

name = "test1"
uuid = "4a07fde8-f244-2a6d-9603-85ff2179a9bb"
maxmem = 512
memory = 512
vcpus = 2
bootloader = "/usr/bin/pygrub"
on_poweroff = "destroy"
on_reboot = "restart"
on_crash = "restart"
vfb = [ "type=vnc,vncunused=1,keymap=en-us" ]
disk = [ "tap:aio:/var/lib/xen/images/test1.img,xvda,w" ]
vif = [ "mac=00:16:3e:1a:d0:96,bridge=xenbr0", "mac=00:16:3e:1a:d0:97,bridge=xenbr1" ]


This information can be converted into a KVM configuration file (/etc/libvirt/qemu/test1.xml). Take care to use the same MAC addresses for the network interfaces, or else they won't be recognized when the virtual machine is booted. Also make sure the serial and console arguments do not point to the same serial port for multiple VMs. You could use  virsh list  and  virsh dumpxml  as a starting point; however, you must do this before starting with this howto.
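For example, while the guest is still running under Xen, its libvirt definition could be saved for reference (the output path is illustrative; the dumped file still describes a Xen paravirtualized domain, so it is only useful as a source for the UUID, MAC addresses and disk paths). The converted KVM definition for test1 is shown below it.

virsh list                                 # note the name or ID of the running guest
virsh dumpxml test1 > /root/test1-xen.xml  # keep a copy of the old definition for reference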


<domain type="kvm"><name>test1</name><uuid>48156322-4e0c-b658-b80a-1bf3b608b49d</uuid><memory>524288</memory><currentmemory>524288</currentmemory><vcpu>2</vcpu><os><type arch="x86_64" machine="pc">hvm</type><boot dev="hd"/></os><features><acpi/><apic/><pae/></features><clock offset="utc"/><on_poweroff>destroy</on_poweroff><on_reboot>restart</on_reboot><on_crash>restart</on_crash><devices><emulator>/usr/libexec/qemu-kvm</emulator><disk type="file" device="disk"><driver name="qemu" cache="none"/><source file="/var/lib/xen/images/test1.img"/><target dev="vda" bus="virtio"/></disk><interface type="bridge"><mac address="00:16:3e:1a:d0:96"/><source bridge="br0"/><model type="virtio"/></interface><interface type="network"><mac address="00:16:3e:1a:d0:97"/><source network="hostonly"/><model type="virtio"/></interface><serial type="pty"><source path="/dev/pts/2"/><target port="0"/></serial><console type="pty"><source path="/dev/pts/2"/><target port="0"/></console><input type="mouse" bus="ps2"/><graphics type="vnc" port="-1" autoport="yes" keymap="en-us"/></devices></domain>


If you are using a partition as a virtual disk, the Xen configuration  disk = [ "phy:/dev/vgvm/lvmyvolume,xvda,w" ]  translates to:

 

<disk device="disk" type="block"><driver cache="none"/><source dev="/dev/vgvm/lvmyvolume"/><target dev="vda" bus="virtio"/></disk>


If you want to bind the virtual CPUs to physical ones, use the following vcpu syntax; cpuset is a comma-separated list of physical CPU numbers, and the element's content is the number of virtual CPUs:


<vcpu cpuset="cpu1,cpu2,cpu3">virtual cpus</vcpu>
for example
<vcpu cpuset="0,1">4</vcpu>


Also see  http://libvirt.org/formatdomain.html . If you want to verify that the XML file is correct, use the  virt-xml-validate  command.
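For example, to check the definition written above:

virt-xml-validate /etc/libvirt/qemu/test1.xml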
Now reboot the host.

Starting the virtual machines
You can now start the virtual machines by using the virsh command. Open a console directly after starting the domain to monitor boot progress. You also might want to have the machines start automatically when the host boots (virsh autostart).

virsh define /etc/libvirt/qemu/[mymachine.xml]
virsh list
virsh start [mymachines ID]
virsh console [mymachines ID]
virsh autostart [mymachines ID]

Improving Performance with Hugepages
Note:
There could be some unwanted interaction with SELinux here. If you run into problems, either don't use Hugepages or turn SELinux off

KVM uses 4kB memory pages by default, just like any other process. One of the main differences between an average process and a KVM virtual machine process is the amount of memory allocated to it. Virtual machines normally use hundreds of megabytes or even gigabytes of memory. This means a lot of overhead when the CPU switches between virtual machines, since large memory tables need to be updated each time.

RHEL 5.4 and Hugepages
Linux also has Hugepages: special memory pages that are 1, 2 or 4 MB in size, which shortens the list of memory pages dramatically and can improve performance by up to 10%. Sadly, support for Hugepages hasn't been implemented in libvirt yet. There is work on it in Fedora 12, but I don't expect to see those developments in RHEL5. There is a way, however. First let's start by reserving the Hugepages. The file /proc/meminfo contains the Hugepage size of the system somewhere in its last lines.
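For example (the numbers shown are illustrative; on most x86_64 systems the hugepage size is 2048 kB):

grep Huge /proc/meminfo
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB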

Now calculate the number of Hugepages needed for the virtual machines and add at least 6 extra pages for each virtual machine. If you do not reserve enough pages, your virtual machines won't start: KVM uses some additional pages when starting up a VM, so if you don't add those 6 pages, the last VM will not start. Add the total number of Hugepages to your kernel configuration by doing:

echo vm.nr_hugepages = XXXX >> /etc/sysctl.conf
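As an illustration (assuming a 2 MB hugepage size): a guest with 512 MB of memory needs 512 / 2 = 256 pages, plus the 6 extra pages, so 262 pages; two such guests would need roughly 2 x 262 = 524 hugepages (XXXX = 524).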


Make the Hugepages accessible to KVM


mkdir /hugepages
echo hugetlbfs /hugepages hugetlbfs defaults 0 0 >> /etc/fstab


Now the Hugepages are set up (they become accessible after a system reboot). Let's rig libvirt so the Hugepages are actually used after a system reboot. To do this we need to move the  qemu-kvm  binary and replace it with a script of our own. The binary is located in  /usr/libexec . Execute  mv /usr/libexec/qemu-kvm /usr/libexec/qemu-kvm2 . Now create the script  /usr/libexec/qemu-kvm  with the following contents:


#!/bin/bash
exec /usr/libexec/qemu-kvm2 -mem-path /hugepages "$@"
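The wrapper also needs to be executable; if SELinux is enabled, it presumably needs the same security context as the original binary as well:

chmod 755 /usr/libexec/qemu-kvm
restorecon /usr/libexec/qemu-kvm    # only relevant if SELinux is enabled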


Now reboot the system and start your virtual machines like normal.
Note:
Be careful when updating the kvm package, which provides /usr/libexec/qemu-kvm. An update will overwrite our script, so you need to reapply the change after each update.

RHEL 5.5 and Hugepages
RHEL 5.5 has native support for Hugepages. First make sure that the libhugetlbfs package is installed. Then execute the huge_page_setup_helper.py command and answer the questions.

[root@aurora ~]# rpm -qa | grep huge
libhugetlbfs-1.3-7.el5
libhugetlbfs-1.3-7.el5
[root@aurora ~]# huge_page_setup_helper.py
Current configuration:
* Total System Memory......: 7909 MB
* Shared Mem Max Mapping...: 7100 MB
* System Huge Page Size....: 2 MB
* Number of Huge Pages.....: 3550
* Total size of Huge Pages.: 7100 MB
* Remaining System Memory..: 809 MB
* Huge Page User Group.....: root (0)

How much memory would you like to allocate for huge pages? (input in MB, unless postfixed with GB):


Now add the Hugepages mountpoint to /etc/fstab:

mkdir /dev/hugepages
echo hugetlbfs /dev/hugepages hugetlbfs defaults 0 0 >> /etc/fstab


On the next reboot there will be a problem, as /dev is governed by udev. That means that the hugepages mountpoint disappears automatically on reboot. To fix this, a patch must be applied to /sbin/start_udev. To make sure this patch stays in place, even after a udev update, a script has been created that checks whether the patch has been applied and, if not, applies it.

Download the patch here and place it in /usr/local/bin
Download the init script here and place it in /etc/init.d

Then do:

chkconfig --add libvirt_hugepages
chkconfig libvirt_hugepages on


As a last step, add the following to the virtual machine XML config files in  /etc/libvirt/qemu  (on the same level as <memory>):


<memoryBacking><hugepages/></memoryBacking>

Now reboot the system and the virtual machines should be started using Hugepages memory. You can verify this by looking at the qemu-kvm command in the process list; it should now contain a -mem-path parameter. If the Hugepages mountpoint is added after the system has rebooted, restart libvirtd, or else libvirt won't see the Hugepages.
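Two rough ways to check this from the host:

ps -ef | grep '[q]emu-kvm'    # the guest's command line should now contain -mem-path
grep Huge /proc/meminfo       # HugePages_Free should drop while the guests are running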