From: Andreas Olsowski
Subject: slow live migration / xc_restore on xen4 pvops
Date: Tue, 01 Jun 2010 23:17:31 +0200
Message-ID: <4C0578EB.2040800@uni.leuphana.de>
In-Reply-To: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com>
To: xen-devel@lists.xensource.com

Hi,

in preparation for our soon-to-arrive central storage array I wanted to test live migration and Remus replication and stumbled upon a problem.

When migrating a test VM (512 MB RAM, idle) between my 3 servers, two of them are extremely slow in "receiving" the VM. There is little to no CPU utilization from xc_restore until shortly before the migration is complete. The same goes for xm restore.

The xend.log contains:

[2010-06-01 21:16:27 5211] DEBUG (XendCheckpoint:286) restore:shadow=0x0, _static_max=0x20000000, _static_min=0x0,
[2010-06-01 21:16:27 5211] DEBUG (XendCheckpoint:305) [xc_restore]: /usr/lib/xen/bin/xc_restore 48 43 1 2 0 0 0 0
[2010-06-01 21:16:27 5211] INFO (XendCheckpoint:423) xc_domain_restore start: p2m_size = 20000
[2010-06-01 21:16:27 5211] INFO (XendCheckpoint:423) Reloading memory pages: 0%
[2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal error: Error when reading batch size
[2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal error: error when buffering batch, finishing

That is the point at which receiving a VM via live migration finally finishes; you can see the large gap in the timestamps. The VM is perfectly fine after that, it just takes way too long.

First off, let me explain my server setup; detailed information on trying to narrow down the error follows.

I have 3 servers running Xen 4 with 2.6.31.13-pvops as kernel; it's the current kernel from Jeremy's xen/master git branch. The guests are running vanilla 2.6.32.11 kernels.

The 3 servers differ slightly in hardware: two are Dell PE 2950 and one is a Dell R710. The 2950s have 2 quad-core Xeons each (L5335 and L5410), the R710 has 2 quad-core Xeon E5520s. All machines have 24 GB of RAM. They are called "tarballerina" (E5520), "xenturio1" (L5335) and "xenturio2" (L5410).

Currently I use tarballerina for testing purposes, but I don't consider anything in my setup "stable". xenturio1 has 27 guests running, xenturio2 has 25. No guest does anything that would even put a dent into the systems' performance (LDAP servers, RADIUS, department webservers, etc.).

I created a test VM on my current central iSCSI storage, called "hatest", that just idles around and has 2 VCPUs and 512 MB of RAM.
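(For anyone who wants to reproduce the numbers below: they were all taken with plain time(1) on the respective dom0. A loop roughly like the following should do, with the domain name and dump path adjusted to your own setup; the ones here are just from mine:)

  # repeat a few timed save/restore cycles of an idle test VM
  # DOMAIN and DUMP are placeholders for whatever VM/file you use
  DOMAIN=saverestore-t
  DUMP=/var/saverestore-t.mem
  for i in 1 2 3; do
      time xm save "$DOMAIN" "$DUMP"
      time xm restore "$DUMP"
  done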
First I tested xm save/restore:

tarballerina:~# time xm restore /var/saverestore-t.mem
real    0m13.227s
user    0m0.090s
sys     0m0.023s

xenturio1:~# time xm restore /var/saverestore-x1.mem
real    4m15.173s
user    0m0.138s
sys     0m0.029s

When migrating to xenturio1 or 2, the migration takes 181 to 278 seconds; when migrating to tarballerina it takes roughly 30 seconds:

tarballerina:~# time xm migrate --live hatest 10.0.1.98
real    3m57.971s
user    0m0.086s
sys     0m0.029s

xenturio1:~# time xm migrate --live hatest 10.0.1.100
real    0m43.588s
user    0m0.123s
sys     0m0.034s

--- attempt at narrowing it down ---

My first guess was that, since tarballerina had almost no guests running that did anything, it could be an issue of memory usage by the tapdisk2 processes (each dom0 has been mem-set to 4096M). I then started almost all the VMs that I have on tarballerina:

tarballerina:~# time xm save saverestore-t /var/saverestore-t.mem
real    0m2.884s

tarballerina:~# time xm restore /var/saverestore-t.mem
real    0m15.594s

I tried this several times; sometimes it took 30+ seconds.

Then I started 2 VMs that run load- and IO-generating processes (stress, dd, openssl encryption, md5sum). But this didn't affect xm restore performance either, it still was quite fast:

tarballerina:~# time xm save saverestore-t /var/saverestore-t.mem
real    0m7.476s
user    0m0.101s
sys     0m0.022s

tarballerina:~# time xm restore /var/saverestore-t.mem
real    0m45.544s
user    0m0.094s
sys     0m0.022s

I tried several times again; restore took 17 to 45 seconds.

Then I tried migrating the test VM to tarballerina again: still fast, in spite of the several running VMs, including the load- and IO-generating ones (this had eaten up almost all available RAM).

CPU times for xc_restore according to the target machine's "top":

tarballerina -> xenturio1: 0:05:xx, CPU 2-4%, near the end 40%
xenturio1 -> tarballerina: 0:04:xx, CPU 4-8%, near the end 54%

tarballerina:~# time xm migrate --live hatest 10.0.1.98
real    3m29.779s
user    0m0.102s
sys     0m0.017s

xenturio1:~# time xm migrate --live hatest 10.0.1.100
real    0m28.386s
user    0m0.154s
sys     0m0.032s

So my attempt at narrowing the problem down failed: it's neither the free memory of the dom0 nor the load, IO or memory that the other domUs utilize.

--- end attempt ---

More info (xm list, meminfo, table with migration times, etc.) on my setup can be found here:
http://andiolsi.rz.uni-lueneburg.de/node/37

There was another guy who has the same error in his logfile; this might or might not be related:
http://lists.xensource.com/archives/html/xen-users/2010-05/msg00318.html

Further information can be given, should demand for it arise.

With best regards

---
Andreas Olsowski
Leuphana Universität Lüneburg
System- und Netzwerktechnik
Rechenzentrum, Geb 7, Raum 15
Scharnhorststr. 1
21335 Lüneburg

Tel: ++49 4131 / 6771309
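PS: In case more data is needed, I could for example sample xc_restore on the receiving host during the slow phase with something roughly like the following (and/or attach strace -p to the xc_restore pid) and post the output:

  # run on the receiving dom0 while the incoming migration is stalled;
  # samples CPU usage and wait channel of xc_restore every 5 seconds
  while pgrep -x xc_restore >/dev/null; do
      ps -C xc_restore -o pid,pcpu,time,wchan,args
      sleep 5
  done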