From mboxrd@z Thu Jan 1 00:00:00 1970
From: "AkshayKumar Mehta"
Subject: XCP
Date: Tue, 1 Jun 2010 10:49:45 -0700
Message-ID: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1789925968=="
Content-class: urn:content-classes:message
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen-devel@lists.xensource.com
Cc: Pradeep Padala
List-Id: xen-devel@lists.xenproject.org

This is a multi-part message in MIME format.

--===============1789925968==
Content-class: urn:content-classes:message
Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01CB01B2.D2AF240D"

This is a multi-part message in MIME format.

------_=_NextPart_001_01CB01B2.D2AF240D
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Hi there,

We are using the latest version of XCP on 6 hosts. When we issue a VM.start or VM.start_on XML-RPC call, it returns:

{'Status': 'Failure', 'ErrorDescription': ['SESSION_INVALID', 'OpaqueRef:cfb6df14-387d-40a1-cc27-d5962cba7712']}

However, if I put VM.start in a loop, it succeeds after maybe 20-30 tries.
But VM.start_on does not succeed even after 70 tries.
One more observation: VM.clone succeeds after 7-8 tries.
VM.hard_shutdown works fine.

Can you guide me on this issue?
Akshay

------_=_NextPart_001_01CB01B2.D2AF240D
Content-Type: text/html; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

------_=_NextPart_001_01CB01B2.D2AF240D--

--===============1789925968==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--===============1789925968==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jonathan Ludlam
Subject: Re: XCP
Date: Tue, 1 Jun 2010 20:06:34 +0100
Message-ID: <4BB2088A-1981-4A69-8E3F-DB52EAE75B6F@eu.citrix.com>
References: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0850431297=="
In-Reply-To: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com>
Content-Language: en-US
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: AkshayKumar Mehta
Cc: Pradeep Padala, "xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

--===============0850431297==
Content-Language: en-US
Content-Type: multipart/alternative; boundary="_000_4BB2088A19814A698E3FDB52EAE75B6Feucitrixcom_"

--_000_4BB2088A19814A698E3FDB52EAE75B6Feucitrixcom_
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

IIRC, there was a fairly nasty session caching bug that could cause issues like this. It's been fixed since, and the upcoming 0.5 release (coming in a week or so) should include the fix.

Cheers,

Jon

On 1 Jun 2010, at 18:49, AkshayKumar Mehta wrote:

Hi there,

We are using latest version of XCP on 6 hosts. While issuing VM.start or VM.start_on xmlrpc functional call, it says:

{'Status': 'Failure', 'ErrorDescription': ['SESSION_INVALID', 'OpaqueRef:cfb6df14-387d-40a1-cc27-d5962cba7712']}

However if I put VM.start in a loop maybe after 20-30 tries it succeeds.
But VM.start_on does not succeed even after 70 tries.
One more observation - VM.clone succeeds after 7-8 tries
VM.hard_shutdown works fine

Can you guide me on this issue,
Akshay

--_000_4BB2088A19814A698E3FDB52EAE75B6Feucitrixcom_
Content-Type: text/html; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
--_000_4BB2088A19814A698E3FDB52EAE75B6Feucitrixcom_--

--===============0850431297==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "AkshayKumar Mehta"
Subject: RE: XCP
Date: Tue, 1 Jun 2010 12:15:40 -0700
Message-ID: <2FD61F37AFF16D4DB46149330E4273C702FF96D7@dcl-ex.dcml.docomolabs-usa.com>
References: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com> <4BB2088A-1981-4A69-8E3F-DB52EAE75B6F@eu.citrix.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1397388739=="
Content-class: urn:content-classes:message
In-Reply-To: <4BB2088A-1981-4A69-8E3F-DB52EAE75B6F@eu.citrix.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Jonathan Ludlam
Cc: Pradeep Padala, xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

This is a multi-part message in MIME format.

--===============1397388739==
Content-class: urn:content-classes:message
Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01CB01BE.D35B7F64"

This is a multi-part message in MIME format.

------_=_NextPart_001_01CB01BE.D35B7F64
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Hi Jonathan Ludlam,

Thanks! Please let me know how to update to 0.5 without disturbing the existing configuration.

Thanks again for your quick reply.
Akshay

________________________________
From: Jonathan Ludlam [mailto:Jonathan.Ludlam@eu.citrix.com]
Sent: Tuesday, June 01, 2010 12:07 PM
To: AkshayKumar Mehta
Cc: xen-devel@lists.xensource.com; Pradeep Padala
Subject: Re: [Xen-devel] XCP

IIRC, there was a fairly nasty session caching bug that could cause issues like this. It's been fixed since, and the upcoming 0.5 release (coming in a week or so) should fix it.

Cheers,

Jon

On 1 Jun 2010, at 18:49, AkshayKumar Mehta wrote:

Hi there,

We are using latest version of XCP on 6 hosts. While issuing VM.start or VM.start_on xmlrpc functional call, it says:

{'Status': 'Failure', 'ErrorDescription': ['SESSION_INVALID', 'OpaqueRef:cfb6df14-387d-40a1-cc27-d5962cba7712']}

However if I put VM.start in a loop maybe after 20-30 tries it succeeds.
But VM.start_on does not succeed even after 70 tries.
One more observation - VM.clone succeeds after 7-8 tries
VM.hard_shutdown works fine

Can you guide me on this issue,
Akshay

------_=_NextPart_001_01CB01BE.D35B7F64
Content-Type: text/html; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

------_=_NextPart_001_01CB01BE.D35B7F64--

--===============1397388739==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andreas Olsowski
Subject: slow live magration / xc_restore on xen4 pvops
Date: Tue, 01 Jun 2010 23:17:31 +0200
Message-ID: <4C0578EB.2040800@uni.leuphana.de>
References: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: quoted-printable
In-Reply-To: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

Hi,

In preparation for our soon-to-arrive central storage array I wanted to test live migration and Remus replication, and stumbled upon a problem.
When migrating a test VM (512 MB RAM, idle) between my 3 servers, two of them are extremely slow in "receiving" the VM. There is little to no CPU utilization from xc_restore until shortly before the migration is complete.
The same goes for xm restore.
By the time receiving a VM via live migration finally finishes, the xend.log contains:
[2010-06-01 21:16:27 5211] DEBUG (XendCheckpoint:286) restore:shadow=0x0, _static_max=0x20000000, _static_min=0x0,
[2010-06-01 21:16:27 5211] DEBUG (XendCheckpoint:305) [xc_restore]: /usr/lib/xen/bin/xc_restore 48 43 1 2 0 0 0 0
[2010-06-01 21:16:27 5211] INFO (XendCheckpoint:423) xc_domain_restore start: p2m_size = 20000
[2010-06-01 21:16:27 5211] INFO (XendCheckpoint:423) Reloading memory pages: 0%
[2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal error: Error when reading batch size
[2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal error: error when buffering batch, finishing

You can see the large gap in the timestamps.
The VM is perfectly fine after that, it just takes way too long.

First off let me explain my server setup; detailed information on my attempts to narrow down the error follows.
I have 3 servers running xen4 with 2.6.31.13-pvops as kernel, it's the current kernel from Jeremy's xen/master git branch.
The guests are running vanilla 2.6.32.11 kernels.

The 3 servers differ slightly in hardware: two are Dell PE 2950 and one is a Dell R710. The 2950s have 2 quad-core Xeon CPUs (L5335 and L5410), the R710 has 2 quad-core Xeon E5520.
All machines have 24 GB of RAM.

They are called "tarballerina" (E5520), "xenturio1" (L5335) and "xenturio2" (L5410).

Currently I use tarballerina for testing purposes, but I don't consider anything in my setup "stable".
xenturio1 has 27 guests running, xenturio2 25.
No guest does anything that would even put a dent into the system's performance (LDAP servers, RADIUS, department webservers, etc.).

I created a test VM called "hatest" on my current central iSCSI storage. It idles around and has 2 VCPUs and 512 MB of RAM.

First I tested xm save/restore:
tarballerina:~# time xm restore /var/saverestore-t.mem
real 0m13.227s
user 0m0.090s
sys 0m0.023s
xenturio1:~# time xm restore /var/saverestore-x1.mem
real 4m15.173s
user 0m0.138s
sys 0m0.029s

When migrating to xenturio1 or 2, the migration takes 181 to 278 seconds; when migrating to tarballerina it takes roughly 30 seconds:
tarballerina:~# time xm migrate --live hatest 10.0.1.98
real 3m57.971s
user 0m0.086s
sys 0m0.029s
xenturio1:~# time xm migrate --live hatest 10.0.1.100
real 0m43.588s
user 0m0.123s
sys 0m0.034s

--- attempt of narrowing it down ----
My first guess was that since tarballerina had almost no guests running that did anything, it could be an issue of memory usage by the tapdisk2 processes (each dom0 has been mem-set to 4096M).
I then started almost all VMs that I have on tarballerina:
tarballerina:~# time xm save saverestore-t /var/saverestore-t.mem
real 0m2.884s
tarballerina:~# time xm restore /var/saverestore-t.mem
real 0m15.594s

I tried this several times, sometimes it took 30+ seconds.

Then I started 2 VMs that run load- and IO-generating processes (stress, dd, openssl encryption, md5sum).
But this didn't affect xm restore performance, it was still quite fast:
tarballerina:~# time xm save saverestore-t /var/saverestore-t.mem
real 0m7.476s
user 0m0.101s
sys 0m0.022s
tarballerina:~# time xm restore /var/saverestore-t.mem
real 0m45.544s
user 0m0.094s
sys 0m0.022s

I tried several times again, restore took 17 to 45 seconds.

Then I tried migrating the test VM to tarballerina again; still fast, in spite of several VMs running, including the load- and IO-generating ones.
This ate almost all available RAM.
CPU times for xc_restore according to the target machine's "top":
tarballerina -> xenturio1: 0:05:xx, cpu 2-4%, near the end 40%.
xenturio1 -> tarballerina: 0:04:xx, cpu 4-8%, near the end 54%.

tarballerina:~# time xm migrate --live hatest 10.0.1.98
real 3m29.779s
user 0m0.102s
sys 0m0.017s
xenturio1:~# time xm migrate --live hatest 10.0.1.100
real 0m28.386s
user 0m0.154s
sys 0m0.032s

So my attempt at narrowing the problem down failed: it's neither the free memory of the dom0 nor the load, IO or memory the other domUs utilize.
---end attempt---

More info (xm list, meminfo, table with migration times, etc.) on my setup can be found here:
http://andiolsi.rz.uni-lueneburg.de/node/37

There was another guy who has the same error in his logfile, this might be unrelated or not:
http://lists.xensource.com/archives/html/xen-users/2010-05/msg00318.html

Further information can be given, should demand for it arise.

With best regards

---
Andreas Olsowski
Leuphana Universität Lüneburg
System- und Netzwerktechnik
Rechenzentrum, Geb 7, Raum 15
Scharnhorststr. 1
21335 Lüneburg

Tel: ++49 4131 / 6771309

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Keir Fraser
Subject: Re: slow live magration / xc_restore on xen4 pvops
Date: Wed, 2 Jun 2010 08:11:31 +0100
References: <4C0578EB.2040800@uni.leuphana.de>
Mime-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: quoted-printable
In-Reply-To: <4C0578EB.2040800@uni.leuphana.de>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Andreas Olsowski, "xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

Hi Andreas,

This is an interesting bug, to be sure. I think you need to modify the restore code to get a better idea of what's going on. The file in the Xen tree is tools/libxc/xc_domain_restore.c. You will see it contains many DBGPRINTF and DPRINTF calls, some of which are commented out, and some of which may 'log' at too low a priority level to make it to the log file.
For your purposes you might change them to ERROR calls as they will definitely get properly logged. One area of possible concern is that our read function (RDEXACT, which is a macro mapping to rdexact) was modified for Remus to have a select() call with a timeout of 1000ms. Do I entirely trust it? Not when we have the inexplicable behaviour that you're seeing. So you might try mapping RDEXACT() to read_exact() instead (which is what we already do when building for __MINIOS__).

This all assumes you know your way around C code at least a little bit.

 -- Keir

On 01/06/2010 22:17, "Andreas Olsowski" wrote:

> [...]

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andreas Olsowski
Subject: Re: slow live magration / xc_restore on xen4 pvops
Date: Wed, 2 Jun 2010 17:46:45 +0200
Message-ID: <20100602174645.9b37b6b1.andreas.olsowski@uni.leuphana.de>
References: <4C0578EB.2040800@uni.leuphana.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

Hi Keir,

I changed all DPRINTF calls to ERROR, and the commented-out // DPRINTF calls to ERROR as well.
There are no DBGPRINTF calls in my xc_domain_restore.c though.

This is the new xend.log output; of course in this case the "ERROR Internal error:" is actually debug output.

xenturio1:~# tail -f /var/log/xen/xend.log
[2010-06-02 15:44:19 5468] DEBUG (XendCheckpoint:286) restore:shadow=0x0, _static_max=0x20000000, _static_min=0x0,
[2010-06-02 15:44:19 5468] DEBUG (XendCheckpoint:305) [xc_restore]: /usr/lib/xen/bin/xc_restore 50 51 1 2 0 0 0 0
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423) ERROR Internal error: xc_domain_restore start: p2m_size = 20000
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423)
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423) ERROR Internal error: Reloading memory pages: 0%
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423)
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423) ERROR Internal error: reading batch of -7 pages
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423)
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423) ERROR Internal error: reading batch of 1024 pages
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423)
[2010-06-02 15:49:02 5468] INFO (XendCheckpoint:423) ERROR Internal error: reading batch of 1024 pages
[2010-06-02 15:49:02 5468] INFO (XendCheckpoint:423)
[2010-06-02 15:49:02 5468] INFO (XendCheckpoint:423) ERROR Internal error: reading batch of 1024 pages
[2010-06-02 15:49:02 5468] INFO (XendCheckpoint:423)
[2010-06-02 15:49:03 5468] INFO (XendCheckpoint:423) ERROR Internal error: reading batch of 1024 pages
...
[2010-06-02 15:49:09 5468] INFO (XendCheckpoint:423) ERROR Internal err100%
...

One can see the time gap between the first and the following memory batch reads.
After that, restoration works as expected.
You might notice that there is "0%" and then "100%" with no steps in between, whereas xc_save does show intermediate steps. Is that intentional, or maybe another symptom of the same problem?

As for the read_exact stuff:
tarballerina:/usr/src/xen-4.0.0# find . -type f -iname \*.c -exec grep -H RDEXACT {} \;
tarballerina:/usr/src/xen-4.0.0# find . -type f -iname \*.c -exec grep -H rdexact {} \;

There are no RDEXACT/rdexact matches in my Xen source code.

In a few hours I will shut down all virtual machines on one of the hosts experiencing slow xc_restores, maybe reboot it, and check if xc_restore is any faster without load or utilization on the machine.

I'll check in with results later.


On Wed, 2 Jun 2010 08:11:31 +0100
Keir Fraser wrote:

> Hi Andreas,
>
> This is an interesting bug, to be sure. I think you need to modify the
> restore code to get a better idea of what's going on. The file in the Xen
> tree is tools/libxc/xc_domain_restore.c. You will see it contains many
> DBGPRINTF and DPRINTF calls, some of which are commented out, and some of
> which may 'log' at too low a priority level to make it to the log file. For
> your purposes you might change them to ERROR calls as they will definitely
> get properly logged. One area of possible concern is that our read function
> (RDEXACT, which is a macro mapping to rdexact) was modified for Remus to
> have a select() call with a timeout of 1000ms. Do I entirely trust it? Not
> when we have the inexplicable behaviour that you're seeing. So you might try
> mapping RDEXACT() to read_exact() instead (which is what we already do when
> building for __MINIOS__).
>
> This all assumes you know your way around C code at least a little bit.
>
> -- Keir

--
Andreas Olsowski
Leuphana Universität Lüneburg
System- und Netzwerktechnik
Rechenzentrum, Geb 7, Raum 15
Scharnhorststr. 1
21335 Lüneburg

Tel: ++49 4131 / 6771309

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Keir Fraser
Subject: Re: slow live magration / xc_restore on xen4 pvops
Date: Wed, 2 Jun 2010 16:55:19 +0100
References: <20100602174645.9b37b6b1.andreas.olsowski@uni.leuphana.de>
Mime-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
In-Reply-To: <20100602174645.9b37b6b1.andreas.olsowski@uni.leuphana.de>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Andreas Olsowski, "xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

On 02/06/2010 16:46, "Andreas Olsowski" wrote:

> One can see the timegap between the first and the following memory batch
> reads.
> After that restoration works as expected.
> You might notice, that you have "0%" and then "100%" and no steps inbetween,
> whereas with xc_save you have, is that intentional or maybe another symptom
> for the same problem?

Does the log look similar for a restore on a fast system (except the timestamps of course)?

> as for the read_exact stuff:
> tarballerina:/usr/src/xen-4.0.0# find . -type f -iname \*.c -exec grep -H
> RDEXACT {} \;
> tarballerina:/usr/src/xen-4.0.0# find . -type f -iname \*.c -exec grep -H
> rdexact {} \;
>
> There are no RDEXACT/rdexact matches in my xen source code.

Ah, because you're using 4.0. Well, I wouldn't worry about it just now anyway. It may be more fruitful to continue looking for a concrete behavioural difference between a fast and slow restore, apart from merely timing, by inspecting logs.

 -- Keir

> In a few hours i will shutdown all virtual machines on one of the hosts
> experiencing slow xc_restores, maybe reboot it and check if xc_restore is any
> faster without load or utilization on the machine.
>
> Ill check in with results later.
>
>
> On Wed, 2 Jun 2010 08:11:31 +0100
> Keir Fraser wrote:
>
>> Hi Andreas,
>>
>> This is an interesting bug, to be sure. I think you need to modify the
>> restore code to get a better idea of what's going on. The file in the Xen
>> tree is tools/libxc/xc_domain_restore.c. You will see it contains many
>> DBGPRINTF and DPRINTF calls, some of which are commented out, and some of
>> which may 'log' at too low a priority level to make it to the log file. For
>> your purposes you might change them to ERROR calls as they will definitely
>> get properly logged. One area of possible concern is that our read function
>> (RDEXACT, which is a macro mapping to rdexact) was modified for Remus to
>> have a select() call with a timeout of 1000ms. Do I entirely trust it? Not
>> when we have the inexplicable behaviour that you're seeing. So you might try
>> mapping RDEXACT() to read_exact() instead (which is what we already do when
>> building for __MINIOS__).
>>
>> This all assumes you know your way around C code at least a little bit.
>>
>> -- Keir
>

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ian Jackson
Subject: Re: slow live magration / xc_restore on xen4 pvops
Date: Wed, 2 Jun 2010 17:18:57 +0100
Message-ID: <19462.33905.936222.605434@mariner.uk.xensource.com>
References: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com> <4C0578EB.2040800@uni.leuphana.de>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <4C0578EB.2040800@uni.leuphana.de>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Andreas Olsowski
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

Andreas Olsowski writes ("[Xen-devel] slow live magration / xc_restore on xen4 pvops"):
> [2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal
> error: Error when reading batch size
> [2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal
> error: error when buffering batch, finishing

These errors, and the slowness of migrations, are caused by changes made to support Remus. Previously, a migration would be regarded as complete as soon as the final information including CPU states was received at the migration target. xc_domain_restore would return immediately at that point.

Since the Remus patches, xc_domain_restore waits until it gets an IO error, and also has a very short timeout which induces IO errors if nothing is received before the timeout expires. This is correct in the Remus case but wrong in the normal case.
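
[Editorial sketch] To make the timeout behaviour described above concrete, here is a minimal illustration of a select()-guarded "read exactly N bytes" helper. It is not the libxc rdexact() code; the name read_exact_timeout, its signature and its return convention are assumptions made for this example. The point it shows is that a quiet connection turns into a read failure once the timeout expires, which is what a Remus receiver wants and what a plain restore does not.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the libxc implementation): read exactly `size`
 * bytes from `fd`, waiting at most `timeout_ms` for each chunk of data.
 * Returns 0 on success and -1 on error, EOF or timeout.
 * On timeout errno is set to ETIMEDOUT; on EOF errno is set to 0.
 */
static int read_exact_timeout(int fd, void *buf, size_t size, int timeout_ms)
{
    unsigned char *p = buf;
    size_t done = 0;

    assert(buf != NULL && timeout_ms >= 0);

    while (done < size) {
        fd_set rfds;
        struct timeval tv;
        int ready;
        ssize_t n;

        /* select() may modify both arguments, so rebuild them each round. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        ready = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (ready < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: just retry */
            return -1;
        }
        if (ready == 0) {
            errno = ETIMEDOUT;   /* nothing arrived in time */
            return -1;
        }

        n = read(fd, p + done, size - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0) {
            errno = 0;           /* peer closed the connection */
            return -1;
        }
        done += (size_t)n;
    }
    return 0;
}
```

With a helper of this shape, the caller decides whether a timeout means "the sender has gone away, finish with what we have" (the Remus reading) or "something is wrong" (the plain migration reading); the helper itself cannot tell the two apart.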

The code should be changed so that xc_domain_restore
 (a) takes an explicit parameter for the IO timeout, which
     should default to something much longer than the 100ms or so of
     the Remus case, and
 (b) gets told whether
    (i) it should return immediately after receiving the "tail"
        which contains the CPU state; or
    (ii) it should attempt to keep reading after receiving the "tail"
        and only return when the connection fails.

In the case (b)(i), which should be the usual case, the behaviour should be that which we would get if changeset 20406:0f893b8f7c15 was reverted. The offending code is mostly this, from 20406:

+    // DPRINTF("Buffered checkpoint\n");
+
+    if ( pagebuf_get(&pagebuf, io_fd, xc_handle, dom) ) {
+        ERROR("error when buffering batch, finishing\n");
+        goto finish;
+    }
+    memset(&tmptail, 0, sizeof(tmptail));
+    if ( buffer_tail(&tmptail, io_fd, max_vcpu_id, vcpumap,
+                     ext_vcpucontext) < 0 ) {
+        ERROR ("error buffering image tail, finishing");
+        goto finish;
+    }
+    tailbuf_free(&tailbuf);
+    memcpy(&tailbuf, &tmptail, sizeof(tailbuf));
+
+    goto loadpages;
+
+  finish:

Ian.

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ian Jackson
Subject: Re: slow live magration / xc_restore on xen4 pvops
Date: Wed, 2 Jun 2010 17:20:26 +0100
Message-ID: <19462.33994.142221.554740@mariner.uk.xensource.com>
References: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com> <4C0578EB.2040800@uni.leuphana.de> <19462.33905.936222.605434@mariner.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <19462.33905.936222.605434@mariner.uk.xensource.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Andreas Olsowski, xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

I wrote:
> These errors, and the slowness of migrations, [...]
Actually looking at your log it has a 4min30 delay in it which is quite striking and well beyond the kind of delay which ought to occur due to the problem I just wrote about. Does the same problem happen with xl ? Ian. From mboxrd@z Thu Jan 1 00:00:00 1970 From: Keir Fraser Subject: Re: slow live magration / xc_restore on xen4 pvops Date: Wed, 2 Jun 2010 17:24:46 +0100 Message-ID: References: <19462.33905.936222.605434@mariner.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <19462.33905.936222.605434@mariner.uk.xensource.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Ian Jackson , Andreas Olsowski Cc: Brendan, "xen-devel@lists.xensource.com" , Cully List-Id: xen-devel@lists.xenproject.org On 02/06/2010 17:18, "Ian Jackson" wrote: > Andreas Olsowski writes ("[Xen-devel] slow live magration / xc_restore on xen4 > pvops"): >> [2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal >> error: Error when reading batch size >> [2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal >> error: error when buffering batch, finishing > > These errors, and the slowness of migrations, are caused by changes > made to support Remus. Previously, a migration would be regarded as > complete as soon as the final information including CPU states was > received at the migration target. xc_domain_restore would return > immediately at that point. This probably needs someone with Remus knowledge to take a look, to keep all cases working correctly. I'll Cc Brendan. It'd be good to get this fixed for a 4.0.1 in a few weeks. -- Keir > Since the Remus patches, xc_domain_restore waits until it gets an IO > error, and also has a very short timeout which induces IO errors if > nothing is received if there is no timeout. 
This is correct in the > Remus case but wrong in the normal case. > > The code should be changed so that xc_domain_restore > (a) takes an explicit parameter for the IO timeout, which > should default to something much longer than the 100ms or so of > the Remus case, and > (b) gets told whether > (i) it should return immediately after receiving the "tail" > which contains the CPU state; or > (ii) it should attempt to keep reading after receiving the "tail" > and only return when the connection fails. > > In the case (b)(i), which should be the usual case, the behaviour > should be that which we would get if changeset 20406:0f893b8f7c15 was > reverted. The offending code is mostly this, from 20406: > > + // DPRINTF("Buffered checkpoint\n"); > + > + if ( pagebuf_get(&pagebuf, io_fd, xc_handle, dom) ) { > + ERROR("error when buffering batch, finishing\n"); > + goto finish; > + } > + memset(&tmptail, 0, sizeof(tmptail)); > + if ( buffer_tail(&tmptail, io_fd, max_vcpu_id, vcpumap, > + ext_vcpucontext) < 0 ) { > + ERROR ("error buffering image tail, finishing"); > + goto finish; > + } > + tailbuf_free(&tailbuf); > + memcpy(&tailbuf, &tmptail, sizeof(tailbuf)); > + > + goto loadpages; > + > + finish: > > Ian. 
> > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel From mboxrd@z Thu Jan 1 00:00:00 1970 From: Brendan Cully Subject: Re: slow live magration / xc_restore on xen4 pvops Date: Wed, 2 Jun 2010 09:27:45 -0700 Message-ID: <20100602162745.GA27542@kremvax.cs.ubc.ca> References: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com> <4C0578EB.2040800@uni.leuphana.de> <19462.33905.936222.605434@mariner.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Return-path: Content-Disposition: inline In-Reply-To: <19462.33905.936222.605434@mariner.uk.xensource.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Ian Jackson Cc: xen-devel@lists.xensource.com, Andreas Olsowski List-Id: xen-devel@lists.xenproject.org On Wednesday, 02 June 2010 at 17:18, Ian Jackson wrote: > Andreas Olsowski writes ("[Xen-devel] slow live magration / xc_restore on xen4 pvops"): > > [2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal > > error: Error when reading batch size > > [2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal > > error: error when buffering batch, finishing > > These errors, and the slowness of migrations, are caused by changes > made to support Remus. Previously, a migration would be regarded as > complete as soon as the final information including CPU states was > received at the migration target. xc_domain_restore would return > immediately at that point. > > Since the Remus patches, xc_domain_restore waits until it gets an IO > error, and also has a very short timeout which induces IO errors if > nothing is received if there is no timeout. This is correct in the > Remus case but wrong in the normal case. 
>
> The code should be changed so that xc_domain_restore
>  (a) takes an explicit parameter for the IO timeout, which
>      should default to something much longer than the 100ms or so of
>      the Remus case, and
>  (b) gets told whether
>     (i) it should return immediately after receiving the "tail"
>         which contains the CPU state; or
>     (ii) it should attempt to keep reading after receiving the "tail"
>         and only return when the connection fails.

I'm going to have a look at this today, but the way the code was
originally written I don't believe this should have been a problem:

1. reads are only supposed to be able to time out after the entire
   first checkpoint has been received (IOW this wouldn't kick in until
   normal migration had already completed)

2. in normal migration, the sender should close the fd after sending
   all data, immediately triggering an IO error on the receiver and
   completing the restore.

I did try to avoid disturbing regular live migration as much as
possible when I wrote the code. I suspect some other regression has
crept in, and I'll investigate.
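The two points above can be illustrated with a minimal sketch. This is not code from libxc: the names read_exact_timed and rd_status are hypothetical, and the only assumption is a POSIX system with select() and read(). The helper blocks indefinitely when given a negative timeout (the plain migration case before a full checkpoint has arrived) and otherwise bounds each wait for data (the Remus case). It reports a closed sender as end-of-file, separately from a timeout, so a caller could treat the two differently.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes; these names are not taken from libxc. */
enum rd_status { RD_OK = 0, RD_EOF = 1, RD_TIMEOUT = 2, RD_ERROR = 3 };

/*
 * Read exactly len bytes from fd into buf.
 *
 * timeout_ms <  0: never time out, just block in read().
 * timeout_ms >= 0: wait at most timeout_ms for each chunk of data.
 *
 * Assumes fd is below FD_SETSIZE, as select() requires.
 */
static enum rd_status read_exact_timed(int fd, void *buf, size_t len,
                                       int timeout_ms)
{
    unsigned char *p = buf;
    size_t done = 0;

    assert(buf != NULL || len == 0);

    while (done < len) {
        ssize_t n;

        if (timeout_ms >= 0) {
            fd_set rfds;
            struct timeval tv;
            int rc;

            FD_ZERO(&rfds);
            FD_SET(fd, &rfds);
            tv.tv_sec = timeout_ms / 1000;
            tv.tv_usec = (timeout_ms % 1000) * 1000;

            rc = select(fd + 1, &rfds, NULL, NULL, &tv);
            if (rc < 0) {
                if (errno == EINTR)
                    continue;       /* interrupted, wait again */
                return RD_ERROR;
            }
            if (rc == 0)
                return RD_TIMEOUT;  /* nothing arrived in time */
        }

        n = read(fd, p + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return RD_ERROR;
        }
        if (n == 0)
            return RD_EOF;          /* sender closed its end of the fd */
        done += (size_t)n;
    }
    return RD_OK;
}
```

With a helper of this shape, a restore loop could pass a negative timeout until the first complete image has been received and only then switch to a short one, which is the behaviour point 1 describes. Whether the real code behaves this way is exactly what is under investigation in this thread.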
From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andreas Olsowski Subject: Re: slow live magration / xc_restore on xen4 pvops Date: Thu, 03 Jun 2010 00:59:58 +0200 Message-ID: <4C06E26E.3030404@uni.leuphana.de> References: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com> <4C0578EB.2040800@uni.leuphana.de> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <4C0578EB.2040800@uni.leuphana.de> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: xen-devel@lists.xensource.com List-Id: xen-devel@lists.xenproject.org

I did some further research and shut down all virtual machines on xenturio1. After that I got the following (3 runs; xm save takes ~5 seconds, and user and sys are always negligible, so I removed those to reduce text):

xenturio1:~# time xm restore /var/saverestore-x1.mem
real 0m25.349s 0m27.456s 0m27.208s

So the fact that there were running machines did impact the performance of xc_restore.

I proceeded to create 20 "dummy" vms with 1 GB RAM and 4 vCPUs each (dom0 has 4096M fixed, 24 GB total available):

xenturio1:~# for i in {1..20} ; do echo creating dummy$i ; xt vm create dummy$i -vlan 27 -mem 1024 -cpus 4 ; done
creating dummy1
vm/create> successfully created vm 'dummy1'
....
creating dummy20
vm/create> successfully created vm 'dummy20'

and started them:

for i in {1..20} ; do echo starting dummy$i ; xm start dummy$i ; done

So my memory allocation should now be 100% (4 GB dom0, 20 GB domUs), but why did I have 512 MB to spare for "saverestore-x1"? Oh well, onwards.

Once again I ran a save/restore, 3 times to be sure (I edited the additional results into the output).
With 20 running vms:
xenturio1:~# time xm restore /var/saverestore-x1.mem
real 1m16.375s 0m31.306s 1m10.214s

With 16 running vms:
xenturio1:~# time xm restore /var/saverestore-x1.mem
real 1m49.741s 1m38.696s 0m55.615s

With 12 running vms:
xenturio1:~# time xm restore /var/saverestore-x1.mem
real 1m3.101s 2m4.254s 1m27.193s

With 8 running vms:
xenturio1:~# time xm restore /var/saverestore-x1.mem
real 0m36.867s 0m43.513s 0m33.199s

With 4 running vms:
xenturio1:~# time xm restore /var/saverestore-x1.mem
real 0m40.454s 0m44.929s 1m7.215s

Keep in mind, those domUs don't do anything at all, they just idle. What is going on there? The results seem completely random: running more domUs can be faster than running fewer. How is that even possible?

So I deleted the dummyXs and started the productive domUs again, in 3 steps to take further measurements:

after first batch:
xenturio1:~# time xm restore /var/saverestore-x1.mem
real 0m23.968s 1m22.133s 1m24.420s

after second batch:
xenturio1:~# time xm restore /var/saverestore-x1.mem
real 1m54.310s 1m11.340s 1m47.643s

after third batch:
xenturio1:~# time xm restore /var/saverestore-x1.mem
real 1m52.065s 1m34.517s 2m8.644s 1m25.473s 1m35.943s 1m45.074s 1m48.407s 1m18.277s 1m18.931s 1m27.458s

So my current guess is that xc_restore speed depends on the amount of used memory, or rather on how much is being grabbed by running processes. Does that make any sense?

But if that is so, explain this: I started 3 vms running "stress", which produce:

load average: 30.94, 30.04, 21.00
Mem: 5909844k total, 4020480k used, 1889364k free, 288k buffers

But still:
tarballerina:~# time xm restore /var/saverestore-t.mem
real 0m38.654s

Why doesn't xc_restore slow down on tarballerina, no matter what I do?

Again: all 3 machines have 24 GB RAM, 2x quad-core Xeons, and dom0 is fixed to 4096M RAM. All use the same xen4 sources and the same kernels with the same configs.

Is the Xeon E5520 with DDR3 really this much faster than the L5335 and L5410 with DDR2?
If someone were to tell me that this is expected behaviour, I wouldn't like it, but at least I could accept it. Are machines with plenty of CPU and memory utilization not a good measurement in this or any case?

I think tomorrow night I will migrate all machines from xenturio1 to tarballerina, but first I have to verify that all VLANs are available, which I cannot do right now.

---
Andreas

From mboxrd@z Thu Jan 1 00:00:00 1970 From: Brendan Cully Subject: Re: slow live magration / xc_restore on xen4 pvops Date: Wed, 2 Jun 2010 18:04:19 -0700 Message-ID: <20100603010418.GB2028@kremvax.cs.ubc.ca> References: <19462.33905.936222.605434@mariner.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Return-path: Content-Disposition: inline In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Keir Fraser Cc: Andreas Olsowski , "xen-devel@lists.xensource.com" , Ian Jackson List-Id: xen-devel@lists.xenproject.org On Wednesday, 02 June 2010 at 17:24, Keir Fraser wrote: > On 02/06/2010 17:18, "Ian Jackson" wrote: > > > Andreas Olsowski writes ("[Xen-devel] slow live magration / xc_restore on xen4 > > pvops"): > >> [2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal > >> error: Error when reading batch size > >> [2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal > >> error: error when buffering batch, finishing > > > > These errors, and the slowness of migrations, are caused by changes > > made to support Remus. Previously, a migration would be regarded as > > complete as soon as the final information including CPU states was > > received at the migration target. xc_domain_restore would return > > immediately at that point. > > This probably needs someone with Remus knowledge to take a look, to keep all > cases working correctly. I'll Cc Brendan. It'd be good to get this fixed for > a 4.0.1 in a few weeks. 
I've done a bit of profiling of the restore code and observed the slowness here too. It looks to me like it's probably related to superpage changes. The big hit appears to be at the front of the restore process during calls to allocate_mfn_list, under the normal_page case. It looks like we're calling xc_domain_memory_populate_physmap once per page here, instead of batching the allocation? I haven't had time to investigate further today, but I think this is the culprit. > > -- Keir > > > Since the Remus patches, xc_domain_restore waits until it gets an IO > > error, and also has a very short timeout which induces IO errors if > > nothing is received if there is no timeout. This is correct in the > > Remus case but wrong in the normal case. > > > > The code should be changed so that xc_domain_restore > > (a) takes an explicit parameter for the IO timeout, which > > should default to something much longer than the 100ms or so of > > the Remus case, and > > (b) gets told whether > > (i) it should return immediately after receiving the "tail" > > which contains the CPU state; or > > (ii) it should attempt to keep reading after receiving the "tail" > > and only return when the connection fails. > > > > In the case (b)(i), which should be the usual case, the behaviour > > should be that which we would get if changeset 20406:0f893b8f7c15 was > > reverted. The offending code is mostly this, from 20406: > > > > + // DPRINTF("Buffered checkpoint\n"); > > + > > + if ( pagebuf_get(&pagebuf, io_fd, xc_handle, dom) ) { > > + ERROR("error when buffering batch, finishing\n"); > > + goto finish; > > + } > > + memset(&tmptail, 0, sizeof(tmptail)); > > + if ( buffer_tail(&tmptail, io_fd, max_vcpu_id, vcpumap, > > + ext_vcpucontext) < 0 ) { > > + ERROR ("error buffering image tail, finishing"); > > + goto finish; > > + } > > + tailbuf_free(&tailbuf); > > + memcpy(&tailbuf, &tmptail, sizeof(tailbuf)); > > + > > + goto loadpages; > > + > > + finish: > > > > Ian. 
> > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xensource.com > > http://lists.xensource.com/xen-devel > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel > From mboxrd@z Thu Jan 1 00:00:00 1970 From: "AkshayKumar Mehta" Subject: RE: XCP Date: Wed, 2 Jun 2010 20:03:22 -0700 Message-ID: <2FD61F37AFF16D4DB46149330E4273C702FF99D8@dcl-ex.dcml.docomolabs-usa.com> References: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com> <4BB2088A-1981-4A69-8E3F-DB52EAE75B6F@eu.citrix.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1556130466==" Return-path: Content-class: urn:content-classes:message In-Reply-To: <4BB2088A-1981-4A69-8E3F-DB52EAE75B6F@eu.citrix.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Jonathan Ludlam Cc: Pradeep Padala , xen-devel@lists.xensource.com List-Id: xen-devel@lists.xenproject.org This is a multi-part message in MIME format. --===============1556130466== Content-class: urn:content-classes:message Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01CB02C9.539C4817" This is a multi-part message in MIME format. ------_=_NextPart_001_01CB02C9.539C4817 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Hi Jonathan Ludlam, Is it possible for me to download from repository - Unstable version is fine till the release comes by.=20 Let me know the steps/link to repository. 
Akshay =20 ________________________________ From: Jonathan Ludlam [mailto:Jonathan.Ludlam@eu.citrix.com]=20 Sent: Tuesday, June 01, 2010 12:07 PM To: AkshayKumar Mehta Cc: xen-devel@lists.xensource.com; Pradeep Padala Subject: Re: [Xen-devel] XCP =20 IIRC, there was a fairly nasty session caching bug that could cause issues like this. It's been fixed since, and the upcoming 0.5 release (coming in a week or so) should fix it. =20 Cheers, =20 Jon =20 =20 =20 On 1 Jun 2010, at 18:49, AkshayKumar Mehta wrote: Hi there, =20 We are using latest version of XCP on 6 hosts. While issuing VM.start or VM.start_on xmlrpc functional call , it says : =20 =20 {'Status': 'Failure', 'ErrorDescription': ['SESSION_INVALID', 'OpaqueRef:cfb6df14-387d-40a1-cc27-d5962cba7712']} =20 However if I put VM.start in a loop maybe after 20-30 tries it succeeds . But VM.start_on does not succeed even after 70 tries. One more observation - VM.clone succeeds after 7-8 tries VM.hard_shutdown works fine =20 Can you guide me on this issue, Akshay =20 =20 =20 =20 ------_=_NextPart_001_01CB02C9.539C4817 Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable

------_=_NextPart_001_01CB02C9.539C4817-- --===============1556130466== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --===============1556130466==-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: Brendan Cully Subject: Re: slow live magration / xc_restore on xen4 pvops Date: Wed, 2 Jun 2010 21:31:43 -0700 Message-ID: <20100603043142.GA52378@zanzibar.domain.invalid> References: <19462.33905.936222.605434@mariner.uk.xensource.com> <20100603010418.GB2028@kremvax.cs.ubc.ca> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Return-path: Content-Disposition: inline In-Reply-To: <20100603010418.GB2028@kremvax.cs.ubc.ca> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: keir.fraser@eu.citrix.com, Ian.Jackson@eu.citrix.com, andreas.olsowski@uni.leuphana.de, xen-devel@lists.xensource.com List-Id: xen-devel@lists.xenproject.org On Wednesday, 02 June 2010 at 18:04, Brendan Cully wrote: > On Wednesday, 02 June 2010 at 17:24, Keir Fraser wrote: > > On 02/06/2010 17:18, "Ian Jackson" wrote: > > > > > Andreas Olsowski writes ("[Xen-devel] slow live magration / xc_restore on xen4 > > > pvops"): > > >> [2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal > > >> error: Error when reading batch size > > >> [2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal > > >> error: error when buffering batch, finishing > > > > > > These errors, and the slowness of migrations, are caused by changes > > > made to support Remus. Previously, a migration would be regarded as > > > complete as soon as the final information including CPU states was > > > received at the migration target. 
xc_domain_restore would return > > > immediately at that point. > > > > This probably needs someone with Remus knowledge to take a look, to keep all > > cases working correctly. I'll Cc Brendan. It'd be good to get this fixed for > > a 4.0.1 in a few weeks. > > I've done a bit of profiling of the restore code and observed the > slowness here too. It looks to me like it's probably related to > superpage changes. The big hit appears to be at the front of the > restore process during calls to allocate_mfn_list, under the > normal_page case. It looks like we're calling > xc_domain_memory_populate_physmap once per page here, instead of > batching the allocation? I haven't had time to investigate further > today, but I think this is the culprit. By the way, this only seems to matter on pvops -- restore is still pretty quick on 2.6.18. I'm somewhat surprised that there'd be any significant difference in allocating guest memory between the two kernels (isn't this almost entirely Xen's responsibility?), but it does explain why this wasn't noticed until recently. > > > > > > > Since the Remus patches, xc_domain_restore waits until it gets an IO > > > error, and also has a very short timeout which induces IO errors if > > > nothing is received if there is no timeout. This is correct in the > > > Remus case but wrong in the normal case. > > > > > > The code should be changed so that xc_domain_restore > > > (a) takes an explicit parameter for the IO timeout, which > > > should default to something much longer than the 100ms or so of > > > the Remus case, and > > > (b) gets told whether > > > (i) it should return immediately after receiving the "tail" > > > which contains the CPU state; or > > > (ii) it should attempt to keep reading after receiving the "tail" > > > and only return when the connection fails. > > > > > > In the case (b)(i), which should be the usual case, the behaviour > > > should be that which we would get if changeset 20406:0f893b8f7c15 was > > > reverted. 
The offending code is mostly this, from 20406: > > > > > > + // DPRINTF("Buffered checkpoint\n"); > > > + > > > + if ( pagebuf_get(&pagebuf, io_fd, xc_handle, dom) ) { > > > + ERROR("error when buffering batch, finishing\n"); > > > + goto finish; > > > + } > > > + memset(&tmptail, 0, sizeof(tmptail)); > > > + if ( buffer_tail(&tmptail, io_fd, max_vcpu_id, vcpumap, > > > + ext_vcpucontext) < 0 ) { > > > + ERROR ("error buffering image tail, finishing"); > > > + goto finish; > > > + } > > > + tailbuf_free(&tailbuf); > > > + memcpy(&tailbuf, &tmptail, sizeof(tailbuf)); > > > + > > > + goto loadpages; > > > + > > > + finish: > > > > > > Ian. > > > > > > _______________________________________________ > > > Xen-devel mailing list > > > Xen-devel@lists.xensource.com > > > http://lists.xensource.com/xen-devel > > > > > > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xensource.com > > http://lists.xensource.com/xen-devel > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel > From mboxrd@z Thu Jan 1 00:00:00 1970 From: Keir Fraser Subject: Re: slow live magration / xc_restore on xen4 pvops Date: Thu, 3 Jun 2010 06:47:52 +0100 Message-ID: References: <20100603010418.GB2028@kremvax.cs.ubc.ca> Mime-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20100603010418.GB2028@kremvax.cs.ubc.ca> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Brendan Cully Cc: Andreas Olsowski , "xen-devel@lists.xensource.com" , Ian Jackson , "Zhai, Edwin" List-Id: xen-devel@lists.xenproject.org On 03/06/2010 02:04, "Brendan Cully" wrote: > I've done a bit of profiling of the restore code and observed the > slowness here too. 
It looks to me like it's probably related to > superpage changes. The big hit appears to be at the front of the > restore process during calls to allocate_mfn_list, under the > normal_page case. It looks like we're calling > xc_domain_memory_populate_physmap once per page here, instead of > batching the allocation? I haven't had time to investigate further > today, but I think this is the culprit. Ccing Edwin Zhai. He wrote the superpage logic for domain restore. -- Keir From mboxrd@z Thu Jan 1 00:00:00 1970 From: Brendan Cully Subject: Re: slow live magration / xc_restore on xen4 pvops Date: Wed, 2 Jun 2010 23:45:50 -0700 Message-ID: <20100603064545.GB52378@zanzibar.kublai.com> References: <20100603010418.GB2028@kremvax.cs.ubc.ca> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="nFreZHaLTZJo0R7j" Return-path: Content-Disposition: inline In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Keir Fraser Cc: Ian Jackson , Jeremy Fitzhardinge , "xen-devel@lists.xensource.com" , Andreas Olsowski , "Zhai, Edwin" List-Id: xen-devel@lists.xenproject.org --nFreZHaLTZJo0R7j Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Thursday, 03 June 2010 at 06:47, Keir Fraser wrote: > On 03/06/2010 02:04, "Brendan Cully" wrote: > > > I've done a bit of profiling of the restore code and observed the > > slowness here too. It looks to me like it's probably related to > > superpage changes. The big hit appears to be at the front of the > > restore process during calls to allocate_mfn_list, under the > > normal_page case. It looks like we're calling > > xc_domain_memory_populate_physmap once per page here, instead of > > batching the allocation? I haven't had time to investigate further > > today, but I think this is the culprit. > > Ccing Edwin Zhai. He wrote the superpage logic for domain restore. 
Here's some data on the slowdown going from 2.6.18 to pvops dom0:

I wrapped the call to allocate_mfn_list in uncanonicalize_pagetable
to measure the time to do the allocation.

kernel, min call time, max call time
2.6.18, 4 us, 72 us
pvops, 202 us, 10696 us (!)

It looks like pvops is dramatically slower to perform the
xc_domain_memory_populate_physmap call!

I'll attach the patch and raw data below.

--nFreZHaLTZJo0R7j
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="alloc-profile.diff"

diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
--- a/tools/libxc/xc_domain_restore.c
+++ b/tools/libxc/xc_domain_restore.c
@@ -24,6 +24,7 @@
 #include
 #include
+#include
 #include "xg_private.h"
 #include "xg_save_restore.h"
@@ -500,6 +501,9 @@
     unsigned long pfn;
     uint64_t pte;
     struct domain_info_context *dinfo = &ctx->dinfo;
+    unsigned int allocated = 0;
+    struct timeval tvs, tve;
+    int elapsed, minelapsed = -1, maxelapsed = -1;
     pte_last = PAGE_SIZE / ((ctx->pt_levels == 2)? 4 : 8);
@@ -520,9 +524,17 @@
             if ( ctx->p2m[pfn] == INVALID_P2M_ENTRY )
             {
                 unsigned long force_pfn = superpages ? FORCE_SP_MASK : pfn;
+                allocated++;
+                gettimeofday(&tvs, NULL);
                 if (allocate_mfn_list(xc_handle, dom, ctx, 1, &pfn, &force_pfn, superpages) != 0)
                     return 0;
+                gettimeofday(&tve, NULL);
+                elapsed = (tve.tv_sec - tvs.tv_sec) * 1000000 + (tve.tv_usec - tvs.tv_usec);
+                if (minelapsed < 0 || elapsed < minelapsed)
+                    minelapsed = elapsed;
+                if (maxelapsed < 0 || elapsed > maxelapsed)
+                    maxelapsed = elapsed;
             }
             pte &= ~MADDR_MASK_X86;
             pte |= (uint64_t)ctx->p2m[pfn] << PAGE_SHIFT;
@@ -532,6 +544,8 @@
         else
             ((uint64_t *)page)[i] = (uint64_t)pte;
     }
+    if (allocated)
+        DPRINTF("xdr: allocated %u pages (min alloc time %d us, max %d)\n", allocated, minelapsed, maxelapsed);
     return 1;
 }

--nFreZHaLTZJo0R7j
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="2.6.18-alloc"
Content-Transfer-Encoding: quoted-printable

[2010-06-02 23:37:19 5394] DEBUG (XendCheckpoint:305) [xc_restore]: /usr/li=
b/xen/bin/xc_restore 4 2 1 2 0 0 0 0
[2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xc_domain_restore star=
t: p2m_size =3D 10800
[2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) Reloading memory pages=
: 0%
[2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag=
es (min alloc time 4 us, max 24)
[2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag=
es (min alloc time 4 us, max 7)
[2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag=
es (min alloc time 4 us, max 11)
[2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag=
es (min alloc time 4 us, max 12)
[2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag=
es (min alloc time 4 us, max 7)
[2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag=
es (min alloc time 4 us, max 16)
[2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag=
es (min alloc time 4 us, max 10)
[2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag=
es (min alloc time 4
us, max 10) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 49) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 10) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 7) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 6) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 10) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 10) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 10) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 6) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 11) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 6) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 56) [2010-06-02 
23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 6) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 11) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 6) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 6) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 52) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 10) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 6) [2010-06-02 23:37:19 5394] INFO 
(XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 6) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 72) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 7) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 7) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 10) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 18) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 47) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 14) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 10) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 6) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 10) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 6) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: 
allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 7) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 54) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 10) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 6) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 13) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 6) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 49) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc 
time 4 us, max 6) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 10) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 10) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 18) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 49) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 11) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 10) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 14) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 7) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 11) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 6) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 10) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) 
[2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 10) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 50) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 7) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 11) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 7) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 10) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 13) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 10) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 6) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 8) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 51) [2010-06-02 23:37:19 5394] 
INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 11) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 9) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 7) [2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 4 us, max 11) [2010-06-02 23:37:22 5394] INFO (XendCheckpoint:423) ERROR Internal error: = Error when reading batch size [2010-06-02 23:37:22 5394] INFO (XendCheckpoint:423) ERROR Internal error: = error when buffering batch, finishing [2010-06-02 23:37:22 5394] INFO (XendCheckpoint:423)=20 [2010-06-02 23:37:22 5394] INFO (XendCheckpoint:423) =08=08=08=08100% [2010-06-02 23:37:22 5394] INFO (XendCheckpoint:423) Memory reloaded (0 pag= es) [2010-06-02 23:37:22 5394] INFO (XendCheckpoint:423) read VCPU 0 [2010-06-02 23:37:22 5394] INFO (XendCheckpoint:423) Completed checkpoint l= oad [2010-06-02 23:37:22 5394] DEBUG (XendCheckpoint:394) store-mfn 17534 [2010-06-02 23:37:22 5394] INFO (XendCheckpoint:423) Domain ready to be bui= lt. 
[2010-06-02 23:37:22 5394] DEBUG (XendCheckpoint:394) console-mfn 17533 [2010-06-02 23:37:22 5394] INFO (XendCheckpoint:423) Restore exit with rc= =3D0 --nFreZHaLTZJo0R7j Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=pvops-alloc Content-Transfer-Encoding: quoted-printable [2010-06-02 23:32:32 5409] DEBUG (XendCheckpoint:305) [xc_restore]: /usr/li= b/xen/bin/xc_restore 16 2 1 2 0 0 0 0 [2010-06-02 23:32:32 5409] INFO (XendCheckpoint:423) xc_domain_restore star= t: p2m_size =3D 10800 [2010-06-02 23:32:32 5409] INFO (XendCheckpoint:423) Reloading memory pages= : 0% [2010-06-02 23:32:33 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 212 us, max 271) [2010-06-02 23:32:33 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 1250) [2010-06-02 23:32:33 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 208 us, max 965) [2010-06-02 23:32:33 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 215 us, max 1000) [2010-06-02 23:32:33 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 210 us, max 1023) [2010-06-02 23:32:33 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 206 us, max 1001) [2010-06-02 23:32:34 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 209 us, max 986) [2010-06-02 23:32:34 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 984) [2010-06-02 23:32:34 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 950) [2010-06-02 23:32:34 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 215 us, max 967) [2010-06-02 23:32:34 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 213 us, max 948) [2010-06-02 23:32:34 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 212 us, max 974) [2010-06-02 23:32:34 5409] INFO 
(XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 211 us, max 951) [2010-06-02 23:32:34 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 215 us, max 1042) [2010-06-02 23:32:34 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 213 us, max 953) [2010-06-02 23:32:35 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 215 us, max 953) [2010-06-02 23:32:35 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 211 us, max 6859) [2010-06-02 23:32:35 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 976) [2010-06-02 23:32:35 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 215 us, max 985) [2010-06-02 23:32:35 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 211 us, max 6841) [2010-06-02 23:32:35 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 202 us, max 1272) [2010-06-02 23:32:35 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 211 us, max 296) [2010-06-02 23:32:35 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 211 us, max 276) [2010-06-02 23:32:36 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 271) [2010-06-02 23:32:36 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 287) [2010-06-02 23:32:36 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 266) [2010-06-02 23:32:36 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 215 us, max 289) [2010-06-02 23:32:36 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 211 us, max 296) [2010-06-02 23:32:36 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 215 us, max 281) [2010-06-02 23:32:36 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 210 us, 
max 677) [2010-06-02 23:32:36 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 269) [2010-06-02 23:32:37 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 216 us, max 10696) [2010-06-02 23:32:37 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 209 us, max 484) [2010-06-02 23:32:37 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 210 us, max 290) [2010-06-02 23:32:37 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 213 us, max 307) [2010-06-02 23:32:37 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 209 us, max 7151) [2010-06-02 23:32:37 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 253) [2010-06-02 23:32:37 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 207 us, max 297) [2010-06-02 23:32:37 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 213 us, max 293) [2010-06-02 23:32:37 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 212 us, max 8070) [2010-06-02 23:32:38 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 296) [2010-06-02 23:32:38 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 254) [2010-06-02 23:32:38 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 213 us, max 299) [2010-06-02 23:32:38 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 210 us, max 7929) [2010-06-02 23:32:38 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 8251) [2010-06-02 23:32:38 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 653) [2010-06-02 23:32:38 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 211 us, max 4304) [2010-06-02 23:32:38 5409] INFO (XendCheckpoint:423) xdr: 
allocated 512 pag= es (min alloc time 213 us, max 269) [2010-06-02 23:32:39 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 211 us, max 278) [2010-06-02 23:32:39 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 4072) [2010-06-02 23:32:39 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 259) [2010-06-02 23:32:39 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 208 us, max 3910) [2010-06-02 23:32:39 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 209 us, max 268) [2010-06-02 23:32:39 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 213 us, max 255) [2010-06-02 23:32:39 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 216 us, max 257) [2010-06-02 23:32:39 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 209 us, max 255) [2010-06-02 23:32:40 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 215 us, max 259) [2010-06-02 23:32:40 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 216 us, max 255) [2010-06-02 23:32:40 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 253) [2010-06-02 23:32:40 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 215 us, max 280) [2010-06-02 23:32:40 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 216 us, max 258) [2010-06-02 23:32:40 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 212 us, max 265) [2010-06-02 23:32:40 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 210 us, max 621) [2010-06-02 23:32:40 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 215 us, max 259) [2010-06-02 23:32:40 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 209 us, max 252) [2010-06-02 23:32:41 
5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 284) [2010-06-02 23:32:41 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 213 us, max 255) [2010-06-02 23:32:41 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 215 us, max 258) [2010-06-02 23:32:41 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 208 us, max 295) [2010-06-02 23:32:41 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 206 us, max 253) [2010-06-02 23:32:41 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 209 us, max 2750) [2010-06-02 23:32:41 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 212 us, max 275) [2010-06-02 23:32:41 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 211 us, max 261) [2010-06-02 23:32:42 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 211 us, max 5924) [2010-06-02 23:32:42 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 206 us, max 601) [2010-06-02 23:32:42 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 589) [2010-06-02 23:32:42 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 211 us, max 285) [2010-06-02 23:32:42 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 212 us, max 2656) [2010-06-02 23:32:42 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 210 us, max 255) [2010-06-02 23:32:42 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 215 us, max 629) [2010-06-02 23:32:42 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 213 us, max 254) [2010-06-02 23:32:42 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 217 us, max 2866) [2010-06-02 23:32:43 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 
215 us, max 257) [2010-06-02 23:32:43 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 216 us, max 275) [2010-06-02 23:32:43 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 213 us, max 807) [2010-06-02 23:32:43 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 213 us, max 304) [2010-06-02 23:32:43 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 210 us, max 262) [2010-06-02 23:32:43 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 212 us, max 264) [2010-06-02 23:32:43 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 252) [2010-06-02 23:32:43 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 212 us, max 257) [2010-06-02 23:32:44 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 212 us, max 253) [2010-06-02 23:32:44 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 205 us, max 256) [2010-06-02 23:32:44 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 251) [2010-06-02 23:32:44 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 215 us, max 301) [2010-06-02 23:32:44 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 212 us, max 257) [2010-06-02 23:32:44 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 211 us, max 2724) [2010-06-02 23:32:44 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 209 us, max 641) [2010-06-02 23:32:44 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 212 us, max 257) [2010-06-02 23:32:44 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 215 us, max 260) [2010-06-02 23:32:45 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 215 us, max 258) [2010-06-02 23:32:45 5409] INFO (XendCheckpoint:423) xdr: 
allocated 512 pag= es (min alloc time 211 us, max 255) [2010-06-02 23:32:45 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 210 us, max 268) [2010-06-02 23:32:45 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 302) [2010-06-02 23:32:45 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 213 us, max 470) [2010-06-02 23:32:45 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 205 us, max 2353) [2010-06-02 23:32:45 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 209 us, max 275) [2010-06-02 23:32:45 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 211 us, max 256) [2010-06-02 23:32:46 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 207 us, max 262) [2010-06-02 23:32:46 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 207 us, max 263) [2010-06-02 23:32:46 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 213 us, max 257) [2010-06-02 23:32:46 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 269) [2010-06-02 23:32:46 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 215 us, max 263) [2010-06-02 23:32:46 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 211 us, max 428) [2010-06-02 23:32:46 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 213 us, max 614) [2010-06-02 23:32:46 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 213 us, max 256) [2010-06-02 23:32:46 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 207 us, max 258) [2010-06-02 23:32:47 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 214 us, max 257) [2010-06-02 23:32:47 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 216 us, max 253) [2010-06-02 23:32:47 
5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 213 us, max 263) [2010-06-02 23:32:47 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 213 us, max 285) [2010-06-02 23:32:47 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 216 us, max 382) [2010-06-02 23:32:47 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 212 us, max 267) [2010-06-02 23:32:47 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 209 us, max 266) [2010-06-02 23:32:47 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pag= es (min alloc time 213 us, max 258) [2010-06-02 23:32:51 5409] INFO (XendCheckpoint:423) ERROR Internal error: = Error when reading batch size [2010-06-02 23:32:51 5409] INFO (XendCheckpoint:423) ERROR Internal error: = error when buffering batch, finishing [2010-06-02 23:32:51 5409] INFO (XendCheckpoint:423)=20 [2010-06-02 23:32:51 5409] INFO (XendCheckpoint:423) =08=08=08=08100% [2010-06-02 23:32:51 5409] INFO (XendCheckpoint:423) Memory reloaded (0 pag= es) [2010-06-02 23:32:51 5409] INFO (XendCheckpoint:423) read VCPU 0 [2010-06-02 23:32:51 5409] INFO (XendCheckpoint:423) Completed checkpoint l= oad [2010-06-02 23:32:51 5409] INFO (XendCheckpoint:423) Domain ready to be bui= lt. 
[2010-06-02 23:32:51 5409] INFO (XendCheckpoint:423) Restore exit with rc= =3D0

--nFreZHaLTZJo0R7j
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--nFreZHaLTZJo0R7j--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeremy Fitzhardinge
Subject: Re: slow live magration / xc_restore on xen4 pvops
Date: Wed, 02 Jun 2010 23:53:42 -0700
Message-ID: <4C075176.5010603@goop.org>
References: <20100603010418.GB2028@kremvax.cs.ubc.ca> <20100603064545.GB52378@zanzibar.kublai.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20100603064545.GB52378@zanzibar.kublai.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: keir.fraser@eu.citrix.com, andreas.olsowski@uni.leuphana.de, xen-devel@lists.xensource.com, Ian.Jackson@eu.citrix.com, edwin.zhai@intel.com
List-Id: xen-devel@lists.xenproject.org

On 06/02/2010 11:45 PM, Brendan Cully wrote:
> On Thursday, 03 June 2010 at 06:47, Keir Fraser wrote:
>
>> On 03/06/2010 02:04, "Brendan Cully" wrote:
>>
>>
>>> I've done a bit of profiling of the restore code and observed the
>>> slowness here too. It looks to me like it's probably related to
>>> superpage changes. The big hit appears to be at the front of the
>>> restore process during calls to allocate_mfn_list, under the
>>> normal_page case. It looks like we're calling
>>> xc_domain_memory_populate_physmap once per page here, instead of
>>> batching the allocation? I haven't had time to investigate further
>>> today, but I think this is the culprit.
>>>
>> Ccing Edwin Zhai. He wrote the superpage logic for domain restore.
>>
> Here's some data on the slowdown going from 2.6.18 to pvops dom0:
>
> I wrapped the call to allocate_mfn_list in uncanonicalize_pagetable
> to measure the time to do the allocation.
>
> kernel, min call time, max call time
> 2.6.18, 4 us, 72 us
> pvops, 202 us, 10696 us (!)
>
> It looks like pvops is dramatically slower to perform the
> xc_domain_memory_populate_physmap call!
>

That appears to be implemented as a raw hypercall, so the kernel has
very little to do with it. The only thing I can see there that might be
relevant is that the mlock hypercalls could be slow for some reason?

    J

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Brendan Cully
Subject: Re: slow live magration / xc_restore on xen4 pvops
Date: Wed, 2 Jun 2010 23:55:43 -0700
Message-ID: <20100603065542.GC52378@zanzibar.kublai.com>
References: <20100603010418.GB2028@kremvax.cs.ubc.ca> <20100603064545.GB52378@zanzibar.kublai.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To: <20100603064545.GB52378@zanzibar.kublai.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: keir.fraser@eu.citrix.com, andreas.olsowski@uni.leuphana.de, xen-devel@lists.xensource.com, Ian.Jackson@eu.citrix.com, edwin.zhai@intel.com, jeremy@goop.org
List-Id: xen-devel@lists.xenproject.org

On Wednesday, 02 June 2010 at 23:45, Brendan Cully wrote:
> On Thursday, 03 June 2010 at 06:47, Keir Fraser wrote:
> > On 03/06/2010 02:04, "Brendan Cully" wrote:
> >
> > > I've done a bit of profiling of the restore code and observed the
> > > slowness here too. It looks to me like it's probably related to
> > > superpage changes. The big hit appears to be at the front of the
> > > restore process during calls to allocate_mfn_list, under the
> > > normal_page case. It looks like we're calling
> > > xc_domain_memory_populate_physmap once per page here, instead of
> > > batching the allocation? I haven't had time to investigate further
> > > today, but I think this is the culprit.
> >
> > Ccing Edwin Zhai. He wrote the superpage logic for domain restore.
>
> Here's some data on the slowdown going from 2.6.18 to pvops dom0:
>
> I wrapped the call to allocate_mfn_list in uncanonicalize_pagetable
> to measure the time to do the allocation.
>
> kernel, min call time, max call time
> 2.6.18, 4 us, 72 us
> pvops, 202 us, 10696 us (!)
>
> It looks like pvops is dramatically slower to perform the
> xc_domain_memory_populate_physmap call!

Looking at changeset 20841:

  Allow certain performance-critical hypercall wrappers to register data
  buffers via a new interface which allows them to be 'bounced' into a
  pre-mlock'ed page-sized per-thread data area. This saves the cost of
  mlock/munlock on every such hypercall, which can be very expensive on
  modern kernels.

...maybe the lock_pages call in xc_memory_op (called from
xc_domain_memory_populate_physmap) has gotten very expensive?
Especially considering this hypercall is now issued once per page.

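For anyone who wants to re-derive the min/max figures from the attached xend logs, here is a rough Python sketch. It is an illustration only: the names XDR_LINE, summarize and per_page_floor_us are made up for this example, it assumes the quoted-printable soft line breaks ("pag=" / "es") in the attachments have already been decoded, and the lower-bound helper assumes every page in a batch costs at least the logged minimum. The sample lines are copied from the two attachments.

```python
import re

# Pattern for the "xdr:" lines in the attached logs, after the
# quoted-printable soft breaks ("pag=" / "es") have been decoded.
XDR_LINE = re.compile(
    r"xdr: allocated (\d+) pages \(min alloc time (\d+) us, max (\d+)\)"
)


def summarize(lines):
    """Return (batches, overall_min_us, overall_max_us) over all xdr lines."""
    mins, maxes = [], []
    for line in lines:
        match = XDR_LINE.search(line)
        if match:  # progress and error lines do not match and are skipped
            mins.append(int(match.group(2)))
            maxes.append(int(match.group(3)))
    if not mins:
        return (0, None, None)
    return (len(mins), min(mins), max(maxes))


def per_page_floor_us(pages, min_call_us):
    """Lower bound on allocation time, ASSUMING each page costs >= min_call_us."""
    return pages * min_call_us


# A few lines copied from the two attachments (2.6.18 dom0, then pvops dom0).
LOG_2_6_18 = [
    "[2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pages (min alloc time 4 us, max 72)",
    "[2010-06-02 23:37:19 5394] INFO (XendCheckpoint:423) xdr: allocated 512 pages (min alloc time 4 us, max 9)",
]
LOG_PVOPS = [
    "[2010-06-02 23:32:32 5409] INFO (XendCheckpoint:423) Reloading memory pages: 0%",
    "[2010-06-02 23:32:35 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pages (min alloc time 202 us, max 1272)",
    "[2010-06-02 23:32:37 5409] INFO (XendCheckpoint:423) xdr: allocated 512 pages (min alloc time 216 us, max 10696)",
]

for name, log in (("2.6.18", LOG_2_6_18), ("pvops", LOG_PVOPS)):
    batches, lo, hi = summarize(log)
    print(name, batches, lo, hi, per_page_floor_us(512, lo))
```

Under that assumption, a 512-page batch on pvops costs at least 512 x 202 us = 103,424 us (about 0.1 s), against 512 x 4 us = 2,048 us on 2.6.18, which is why the per-page hypercall matters so much more on the slower path.
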
From mboxrd@z Thu Jan 1 00:00:00 1970
From: Keir Fraser
Subject: Re: slow live magration / xc_restore on xen4 pvops
Date: Thu, 3 Jun 2010 08:12:17 +0100
Message-ID:
References: <20100603065542.GC52378@zanzibar.kublai.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20100603065542.GC52378@zanzibar.kublai.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Brendan Cully , "andreas.olsowski@uni.leuphana.de" , "xen-devel@lists.xensource.com" , Ian
List-Id: xen-devel@lists.xenproject.org

On 03/06/2010 07:55, "Brendan Cully" wrote:

>> kernel, min call time, max call time
>> 2.6.18, 4 us, 72 us
>> pvops, 202 us, 10696 us (!)
>>
>> It looks like pvops is dramatically slower to perform the
>> xc_domain_memory_populate_physmap call!
>
> Looking at changeset 20841:
>
> Allow certain performance-critical hypercall wrappers to register data
> buffers via a new interface which allows them to be 'bounced' into a
> pre-mlock'ed page-sized per-thread data area. This saves the cost of
> mlock/munlock on every such hypercall, which can be very expensive on
> modern kernels.
>
> ...maybe the lock_pages call in xc_memory_op (called from
> xc_domain_memory_populate_physmap) has gotten very expensive?
> Especially considering this hypercall is now issued once per page.

Maybe there are two issues here then. I mean, there's slow, and there's 10ms
for a presumably in-core kernel operation, which is rather mad.

Getting our batching back for 4k allocations is the most critical thing
though, of course.
-- Keir

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Zhai, Edwin"
Subject: Re: slow live magration / xc_restore on xen4 pvops
Date: Thu, 03 Jun 2010 16:58:26 +0800
Message-ID: <4C076EB2.9030108@intel.com>
References: <20100603010418.GB2028@kremvax.cs.ubc.ca> <20100603064545.GB52378@zanzibar.kublai.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------050307000604070608040606"
Return-path:
In-Reply-To: <20100603064545.GB52378@zanzibar.kublai.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: keir.fraser@eu.citrix.com, andreas.olsowski@uni.leuphana.de, xen-devel@lists.xensource.com, Ian.Jackson@eu.citrix.com, edwin.zhai@intel.com, jeremy@goop.org
List-Id: xen-devel@lists.xenproject.org

This is a multi-part message in MIME format.
--------------050307000604070608040606
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

I assume this is a PV domU rather than HVM, right?

1. We need to check whether superpages are the culprit, using SP_check1.patch.

2. If that fixes the problem, we need to check further where the extra cost
comes from: the speculative algorithm, or the superpage population
hypercall, using SP_check2.patch.

If SP_check2.patch works, the culprit is the new allocation hypercall (so
guest creation also suffers); otherwise, it is the speculative algorithm.

Does that make sense?

Thanks,
edwin

Brendan Cully wrote:
> On Thursday, 03 June 2010 at 06:47, Keir Fraser wrote:
>
>> On 03/06/2010 02:04, "Brendan Cully" wrote:
>>
>>
>>> I've done a bit of profiling of the restore code and observed the
>>> slowness here too. It looks to me like it's probably related to
>>> superpage changes. The big hit appears to be at the front of the
>>> restore process during calls to allocate_mfn_list, under the
>>> normal_page case. It looks like we're calling
>>> xc_domain_memory_populate_physmap once per page here, instead of
>>> batching the allocation? I haven't had time to investigate further
>>> today, but I think this is the culprit.
>>>
>> Ccing Edwin Zhai. He wrote the superpage logic for domain restore.
>>
>
> Here's some data on the slowdown going from 2.6.18 to pvops dom0:
>
> I wrapped the call to allocate_mfn_list in uncanonicalize_pagetable
> to measure the time to do the allocation.
>
> kernel, min call time, max call time
> 2.6.18, 4 us, 72 us
> pvops, 202 us, 10696 us (!)
>
> It looks like pvops is dramatically slower to perform the
> xc_domain_memory_populate_physmap call!
>
> I'll attach the patch and raw data below.
>

--
best rgds,
edwin

--------------050307000604070608040606
Content-Type: text/plain; name="SP_check1.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="SP_check1.patch"

diff -r 4ab68bf4c37e tools/libxc/xc_domain_restore.c
--- a/tools/libxc/xc_domain_restore.c Thu Jun 03 07:30:54 2010 +0100
+++ b/tools/libxc/xc_domain_restore.c Thu Jun 03 16:30:30 2010 +0800
@@ -1392,6 +1392,8 @@ int xc_domain_restore(xc_interface *xch,
     if ( hvm )
         superpages = 1;
 
+    superpages = 0;
+
     if ( read_exact(io_fd, &dinfo->p2m_size, sizeof(unsigned long)) )
     {
         PERROR("read: p2m_size");

--------------050307000604070608040606
Content-Type: text/plain; name="SP_check2.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="SP_check2.patch"

diff -r 4ab68bf4c37e tools/libxc/xc_domain_restore.c
--- a/tools/libxc/xc_domain_restore.c Thu Jun 03 07:30:54 2010 +0100
+++ b/tools/libxc/xc_domain_restore.c Thu Jun 03 16:48:38 2010 +0800
@@ -248,6 +248,7 @@ static int allocate_mfn_list(xc_interfac
     if ( super_page_populated(xch, ctx, pfn) )
         goto normal_page;
 
+#if 0
     pfn &= ~(SUPERPAGE_NR_PFNS - 1);
     mfn = pfn;
 
@@ -263,6 +264,7 @@ static int allocate_mfn_list(xc_interfac
     DPRINTF("No 2M page available for pfn 0x%lx, fall back to 4K page.\n",
             pfn);
     ctx->no_superpage_mem = 1;
 
+#endif
 normal_page:
     if ( !batch_buf )

--------------050307000604070608040606
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--------------050307000604070608040606--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ian Jackson
Subject: Re: slow live magration / xc_restore on xen4 pvops
Date: Thu, 3 Jun 2010 11:01:55 +0100
Message-ID: <19463.32147.268104.94905@mariner.uk.xensource.com>
References: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com> <4C0578EB.2040800@uni.leuphana.de> <19462.33905.936222.605434@mariner.uk.xensource.com> <20100602162745.GA27542@kremvax.cs.ubc.ca>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20100602162745.GA27542@kremvax.cs.ubc.ca>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Brendan Cully
Cc: "xen-devel@lists.xensource.com" , Andreas Olsowski
List-Id: xen-devel@lists.xenproject.org

Brendan Cully writes ("Re: [Xen-devel] slow live magration / xc_restore on xen4 pvops"):
> 2. in normal migration, the sender should close the fd after sending
> all data, immediately triggering an IO error on the receiver and
> completing the restore.

This is not true. In normal migration, the fd is used by the
machinery which surrounds xc_domain_restore (in xc_save and also in xl
or xend). In any case it would be quite wrong for a library function
like xc_domain_restore to eat the fd.

It's not necessary for xc_domain_restore to behave this way in all
cases; all that's needed is parameters to tell it how to behave.

> I did try to avoid disturbing regular live migration as much as
> possible when I wrote the code. I suspect some other regression has
> crept in, and I'll investigate.

The short timeout is another regression. A normal live migration or
restore should not fall over just because no data is available for
100ms.

Ian.

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jonathan Ludlam
Subject: Re: XCP
Date: Thu, 3 Jun 2010 11:24:52 +0100
Message-ID: <89D54B65-B3ED-4A4A-B1FB-D044F00DBE85@eu.citrix.com>
References: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com> <4BB2088A-1981-4A69-8E3F-DB52EAE75B6F@eu.citrix.com> <2FD61F37AFF16D4DB46149330E4273C702FF99D8@dcl-ex.dcml.docomolabs-usa.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="_005_89D54B65B3ED4A4AB1FBD044F00DBE85eucitrixcom_"
Return-path:
In-Reply-To: <2FD61F37AFF16D4DB46149330E4273C702FF99D8@dcl-ex.dcml.docomolabs-usa.com>
Content-Language: en-US
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: AkshayKumar Mehta
Cc: Padala , Pradeep, "xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

--_005_89D54B65B3ED4A4AB1FBD044F00DBE85eucitrixcom_
Content-Type: multipart/alternative; boundary="_000_89D54B65B3ED4A4AB1FBD044F00DBE85eucitrixcom_"

--_000_89D54B65B3ED4A4AB1FBD044F00DBE85eucitrixcom_
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

There aren't any binaries at the moment, but you could build your own from
the public repositories - I sent a script out on the 25th March that will
compile everything up. You'll need to replace the xapi and fe executables
(xapi is output in xen-api.hg/ocaml/xapi/ and fe is found in
xen-api-libs.hg/forking_executioner/)

I'll reattach the script to this mail.
Cheers,

Jon

--_000_89D54B65B3ED4A4AB1FBD044F00DBE85eucitrixcom_
Content-Type: text/html; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
= --_000_89D54B65B3ED4A4AB1FBD044F00DBE85eucitrixcom_-- --_005_89D54B65B3ED4A4AB1FBD044F00DBE85eucitrixcom_ Content-Type: application/octet-stream; name="build.sh" Content-Description: build.sh Content-Disposition: attachment; filename="build.sh"; size=2492; creation-date="Thu, 03 Jun 2010 11:24:53 GMT"; modification-date="Thu, 03 Jun 2010 11:24:53 GMT" Content-Transfer-Encoding: base64 IyEvYmluL2Jhc2gNCg0KIw0KIyBTaW1wbGlzdGljIHNjcmlwdCB0byBpbnN0YWxsIGJ1aWxkIGVu dmlyb25tZW50IGZvciBYQ1AgeGFwaSBpbiBhIENlbnRPUyBWTQ0KIyANCiMgTm90ZSB0aGF0IHRo aXMgZG9lcyAqbm90KiBidWlsZCBYZW4gbm9yIGEga2VybmVsIHN1aXRhYmxlIGZvciBYQ1AgLSBm b3INCiMgWGVuIGEgNjQgYml0IGJ1aWxkIGVudmlyb25tZW50IGlzIHJlcXVpcmVkLiANCiMgDQoj IEFzc3VtZXMgcm9vdCBwcml2aWxlZ2VzLiBPbmx5IHRlc3RlZCBvbiBhIHZhbmlsbGEgQ2VudE9T IDUuNCBpbnN0YWxsDQojIA0Kc2V0IC1lDQoNCiMgRmlyc3RseSBpbnN0YWxsIHRoZSByZXF1aXJl ZCBycG1zIGFuZCBtZXJjdXJpYWwNCiMgVGhlIHBhcnRpY3VsYXIgdmVyc2lvbiBvZiBtZXJjdXJp YWwgaXMgdW5pbXBvcnRhbnQNCg0KeXVtIGluc3RhbGwgLXkgZ2NjIGF1dG9jb25mIGF1dG9tYWtl IHRldGV4IGdob3N0c2NyaXB0IGphdmEtMS42LjAtb3BlbmpkayBqYXZhLTEuNi4wLW9wZW5qZGst ZGV2ZWwgYW50IHBhbS1kZXZlbCBweXRob24tZGV2ZWwgemxpYi1kZXZlbCBvcGVuc3NsLWRldmVs IGRldjg2DQpycG0gLWUgbGliYWlvDQp3Z2V0IGh0dHA6Ly9tZXJjdXJpYWwuc2VsZW5pYy5jb20v cmVsZWFzZS9tZXJjdXJpYWwtMS4zLjEudGFyLmd6DQp0YXIgeHpmIG1lcmN1cmlhbC0xLjMuMS50 YXIuZ3oNCmNkIG1lcmN1cmlhbC0xLjMuMQ0KbWFrZSBpbnN0YWxsDQpjZCAuLg0KDQojIFNldCB1 cCBiYXNoX3Byb2ZpbGUgZm9yIG1lcmN1cmlhbCBhbmQgZm9yIHhhcGkNCmNhdCA+PiB+Ly5iYXNo X3Byb2ZpbGUgPDwgRU9GDQpQWVRIT05QQVRIPS91c3IvbG9jYWwvbGliL3B5dGhvbjIuNC9zaXRl LXBhY2thZ2VzOiR7UFlUSE9OUEFUSH0NCmV4cG9ydCBQWVRIT05QQVRIDQpleHBvcnQgUEFUSD0k e1BBVEh9Oi9vcHQveGVuc291cmNlL2Jpbg0KRU9GDQoNCi4gfi8uYmFzaF9wcm9maWxlDQoNCiMg RW5hYmxlIG1lcmN1cmlhbCBwYXRjaCBxdWV1ZSBleHRlbnNpb24NCmNhdCA+PiB+Ly5oZ3JjIDw8 IEVPRg0KW2V4dGVuc2lvbnNdDQpoZ2V4dC5tcSA9IA0KRU9GDQoNCiMgTm93IGNsb25lIHRoZSB4 ZW4gcmVwb3NpdG9yeS4gV2UgbmVlZCBhIGZldyBsaWJyYXJpZXMgDQojIGZvciB4YXBpIHRvIGxp 
bmsgYWdhaW5zdC4NCmhnIGNsb25lIGh0dHA6Ly94ZW5iaXRzLnhlbnNvdXJjZS5jb20veGVuLTMu NC10ZXN0aW5nLmhnDQpjZCB4ZW4tMy40LXRlc3RpbmcuaGcNCmhnIHN0cmlwIFJFTEVBU0UtMy40 LjINCmNkIC5oZw0KaGcgY2xvbmUgaHR0cDovL3hlbmJpdHMueGVuLm9yZy94YXBpL3hlbi0zLjQu cHEuaGcgcGF0Y2hlcw0KY2QgLi4NCmhnIHFwdXNoIC1hDQpleHBvcnQgV0dFVD13Z2V0DQptYWtl IGRpc3QtdG9vbHMNCmNkIGRpc3QvaW5zdGFsbA0KIyBSc3luYyByZW1vdmVzIHRoZSAvZXRjL2lu aXQuZCBzeW1saW5rIHVubGVzcyB3ZSBkbyB0aGUNCiMgbmV4dCAyIGxpbmVzDQpta2RpciBldGMv cmMuZA0KbXYgZXRjL2luaXQuZCBldGMvcmMuZA0KcnN5bmMgLWF2IC4gLw0KY2QgLi4vLi4vLi4N Cg0KIyBDbG9uZSB0aGUgcmVwb3NpdG9yeSB3aXRoIG9jYW1sIGFuZCBhIGZldyBsaWJyYXJpZXMN CmhnIGNsb25lIGh0dHA6Ly94ZW5iaXRzLnhlbi5vcmcveGFwaS94ZW4tZGlzdC1vY2FtbC5oZw0K Y2QgeGVuLWRpc3Qtb2NhbWwuaGcNCm1ha2UNCmNkIC4uDQoNCiMgeGVuLWFwaS1saWJzLmhnDQpo ZyBjbG9uZSBodHRwOi8veGVuYml0cy54ZW4ub3JnL3hhcGkveGVuLWFwaS1saWJzLmhnDQpjZCB4 ZW4tYXBpLWxpYnMuaGcNCmNobW9kIDc1NSBhdXRvZ2VuLnNoDQouL2F1dG9nZW4uc2gNCi4vY29u ZmlndXJlDQouL3JlYnVpbGQNCmNkIC4uDQoNCiMgeGVuLWFwaS5oZw0KaGcgY2xvbmUgaHR0cDov L3hlbmJpdHMueGVuLm9yZy94YXBpL3hlbi1hcGkuaGcNCmNkIHhlbi1hcGkuaGcNCm1ha2UNCmNk IC4uDQoNCiMgQXQgdGhpcyBwb2ludCwgd2UndmUgYnVpbHQgYSB4YXBpIHRoYXQgY2FuIGJlIHB1 dCBvbnRvIGFuIFhDUCBob3N0DQojIFdlIGNhbiBhbHNvIHJ1biBpdCBsb2NhbGx5IGluIFNESyBt b2RlLiBUaGlzIGlzIHVzZWZ1bCBmb3IgdGVzdGluZw0KIyBVbmNvbW1lbnQgdGhpcyBiaXQgaWYg eW91IHdhbnQgdGhpcyBzZXQgdXANCg0KI2NkIHhlbi1hcGkuaGcNCiNtYWtlIGluc3RhbGwNCiNj ZCBkaXN0L3N0YWdpbmcNCiNyc3luYyAtYXYgLiAvDQojY2QgLi4vLi4NCiNjYXQgPiAvZXRjL3hl bnNvdXJjZS1pbnZlbnRvcnkuc2tlbCA8PCBFT0YNCiNQUk9EVUNUX0JSQU5EPSdYZW5DbG91ZFBs YXRmb3JtJw0KI1BST0RVQ1RfTkFNRT0neGNwJw0KI1BST0RVQ1RfVkVSU0lPTj0nMC4xLjEnDQoj QlVJTERfTlVNQkVSPScxJw0KI0VPRg0KI21vZHByb2JlIGR1bW15IA0KIy4vc2NyaXB0cy9pbml0 LmQtc2RraW5pdCBzdGFydA0KIy9ldGMvaW5pdC5kL2ZlIHN0YXJ0DQojL2V0Yy9pbml0LmQveGVu c2VydmljZXMgc3RhcnQNCiMvZXRjL2luaXQuZC94YXBpIHN0YXJ0DQo= --_005_89D54B65B3ED4A4AB1FBD044F00DBE85eucitrixcom_ Content-Type: text/html; name="ATT00001..htm" Content-Description: ATT00001..htm Content-Disposition: attachment; 
filename="ATT00001..htm"; size=9640; creation-date="Thu, 03 Jun 2010 11:24:53 GMT"; modification-date="Thu, 03 Jun 2010 11:24:53 GMT" Content-Transfer-Encoding: base64 PGh0bWw+PGhlYWQ+PC9oZWFkPjxib2R5IHN0eWxlPSJ3b3JkLXdyYXA6IGJyZWFrLXdvcmQ7IC13 ZWJraXQtbmJzcC1tb2RlOiBzcGFjZTsgLXdlYmtpdC1saW5lLWJyZWFrOiBhZnRlci13aGl0ZS1z cGFjZTsgIj48ZGl2PjwvZGl2PjxkaXY+PGJyPjwvZGl2PjxkaXY+PGRpdj48ZGl2Pk9uIDMgSnVu IDIwMTAsIGF0IDA0OjAzLCBBa3NoYXlLdW1hciBNZWh0YSB3cm90ZTo8L2Rpdj48YnIgY2xhc3M9 IkFwcGxlLWludGVyY2hhbmdlLW5ld2xpbmUiPjxibG9ja3F1b3RlIHR5cGU9ImNpdGUiPjxvOnNt YXJ0dGFndHlwZSBuYW1lc3BhY2V1cmk9InVybjpzY2hlbWFzLW1pY3Jvc29mdC1jb206b2ZmaWNl OnNtYXJ0dGFncyIgbmFtZT0iUGVyc29uTmFtZSI+DQo8IS0tW2lmICFtc29dPg0KPHN0eWxlPg0K c3QxXDoqe2JlaGF2aW9yOnVybCgjZGVmYXVsdCNpZW9vdWkpIH0NCjwvc3R5bGU+DQo8IVtlbmRp Zl0tLT4NCjxzdHlsZT4NCjwhLS0NCiAvKiBGb250IERlZmluaXRpb25zICovDQogQGZvbnQtZmFj ZQ0KCXtmb250LWZhbWlseTpIZWx2ZXRpY2E7DQoJcGFub3NlLTE6MiAxMSA2IDQgMiAyIDIgMiAy IDQ7fQ0KQGZvbnQtZmFjZQ0KCXtmb250LWZhbWlseToiTVMgTWluY2hvIjsNCglwYW5vc2UtMToy IDIgNiA5IDQgMiA1IDggMyA0O30NCkBmb250LWZhY2UNCgl7Zm9udC1mYW1pbHk6VGFob21hOw0K CXBhbm9zZS0xOjIgMTEgNiA0IDMgNSA0IDQgMiA0O30NCkBmb250LWZhY2UNCgl7Zm9udC1mYW1p bHk6IlxATVMgTWluY2hvIjsNCglwYW5vc2UtMToyIDIgNiA5IDQgMiA1IDggMyA0O30NCiAvKiBT dHlsZSBEZWZpbml0aW9ucyAqLw0KIHAuTXNvTm9ybWFsLCBsaS5Nc29Ob3JtYWwsIGRpdi5Nc29O b3JtYWwNCgl7bWFyZ2luOjBpbjsNCgltYXJnaW4tYm90dG9tOi4wMDAxcHQ7DQoJZm9udC1zaXpl OjEyLjBwdDsNCglmb250LWZhbWlseToiVGltZXMgTmV3IFJvbWFuIjt9DQphOmxpbmssIHNwYW4u TXNvSHlwZXJsaW5rDQoJe2NvbG9yOmJsdWU7DQoJdGV4dC1kZWNvcmF0aW9uOnVuZGVybGluZTt9 DQphOnZpc2l0ZWQsIHNwYW4uTXNvSHlwZXJsaW5rRm9sbG93ZWQNCgl7Y29sb3I6cHVycGxlOw0K CXRleHQtZGVjb3JhdGlvbjp1bmRlcmxpbmU7fQ0Kc3Bhbi5FbWFpbFN0eWxlMTgNCgl7bXNvLXN0 eWxlLXR5cGU6cGVyc29uYWwtcmVwbHk7DQoJZm9udC1mYW1pbHk6QXJpYWw7DQoJY29sb3I6bmF2 eTt9DQpAcGFnZSBTZWN0aW9uMQ0KCXtzaXplOjguNWluIDExLjBpbjsNCgltYXJnaW46MS4waW4g MS4yNWluIDEuMGluIDEuMjVpbjt9DQpkaXYuU2VjdGlvbjENCgl7cGFnZTpTZWN0aW9uMTt9DQot 
LT4NCjwvc3R5bGU+DQo8IS0tW2lmIGd0ZSBtc28gOV0+PHhtbD4NCiA8bzpzaGFwZWRlZmF1bHRz IHY6ZXh0PSJlZGl0IiBzcGlkbWF4PSIxMDI2IiAvPg0KPC94bWw+PCFbZW5kaWZdLS0+PCEtLVtp ZiBndGUgbXNvIDldPjx4bWw+DQogPG86c2hhcGVsYXlvdXQgdjpleHQ9ImVkaXQiPg0KICA8bzpp ZG1hcCB2OmV4dD0iZWRpdCIgZGF0YT0iMSIgLz4NCiA8L286c2hhcGVsYXlvdXQ+PC94bWw+PCFb ZW5kaWZdLS0+DQoNCg0KPGRpdiBsYW5nPSJFTi1VUyIgbGluaz0iYmx1ZSIgdmxpbms9InB1cnBs ZSIgc3R5bGU9IndvcmQtd3JhcDogYnJlYWstd29yZDsNCi13ZWJraXQtbmJzcC1tb2RlOiBzcGFj ZTstd2Via2l0LWxpbmUtYnJlYWs6IGFmdGVyLXdoaXRlLXNwYWNlIj4NCg0KPGRpdiBjbGFzcz0i U2VjdGlvbjEiPjxwIGNsYXNzPSJNc29Ob3JtYWwiPjxmb250IHNpemU9IjIiIGNvbG9yPSJuYXZ5 IiBmYWNlPSJBcmlhbCI+PHNwYW4gc3R5bGU9ImZvbnQtc2l6ZToNCjEwLjBwdDtmb250LWZhbWls eTpBcmlhbDtjb2xvcjpuYXZ5Ij5IaTwvc3Bhbj48L2ZvbnQ+PGZvbnQgc2l6ZT0iMiIgZmFjZT0i VGFob21hIj48c3BhbiBzdHlsZT0iZm9udC1zaXplOjEwLjBwdDtmb250LWZhbWlseTpUYWhvbWEi PiBKb25hdGhhbiBMdWRsYW0sPG86cD48L286cD48L3NwYW4+PC9mb250PjwvcD48cCBjbGFzcz0i TXNvTm9ybWFsIj48Zm9udCBzaXplPSIyIiBmYWNlPSJUYWhvbWEiPjxzcGFuIHN0eWxlPSJmb250 LXNpemU6MTAuMHB0Ow0KZm9udC1mYW1pbHk6VGFob21hIj5JcyBpdCBwb3NzaWJsZSBmb3IgbWUg dG8gZG93bmxvYWQgZnJvbSByZXBvc2l0b3J5IJYNClVuc3RhYmxlIHZlcnNpb24gaXMgZmluZSB0 aWxsIHRoZSByZWxlYXNlIGNvbWVzIGJ5LiA8bzpwPjwvbzpwPjwvc3Bhbj48L2ZvbnQ+PC9wPjxw IGNsYXNzPSJNc29Ob3JtYWwiPjxmb250IHNpemU9IjIiIGZhY2U9IlRhaG9tYSI+PHNwYW4gc3R5 bGU9ImZvbnQtc2l6ZToxMC4wcHQ7DQpmb250LWZhbWlseTpUYWhvbWEiPkxldCBtZSBrbm93IHRo ZSBzdGVwcy9saW5rIHRvIHJlcG9zaXRvcnkuPG86cD48L286cD48L3NwYW4+PC9mb250PjwvcD48 cCBjbGFzcz0iTXNvTm9ybWFsIj48Zm9udCBzaXplPSIyIiBmYWNlPSJUYWhvbWEiPjxzcGFuIHN0 eWxlPSJmb250LXNpemU6MTAuMHB0Ow0KZm9udC1mYW1pbHk6VGFob21hIj5Ba3NoYXk8L3NwYW4+ PC9mb250Pjxmb250IHNpemU9IjIiIGNvbG9yPSJuYXZ5IiBmYWNlPSJBcmlhbCI+PHNwYW4gc3R5 bGU9ImZvbnQtc2l6ZToxMC4wcHQ7Zm9udC1mYW1pbHk6QXJpYWw7Y29sb3I6bmF2eSI+PG86cD48 L286cD48L3NwYW4+PC9mb250PjwvcD48cCBjbGFzcz0iTXNvTm9ybWFsIj48Zm9udCBzaXplPSIy IiBjb2xvcj0ibmF2eSIgZmFjZT0iQXJpYWwiPjxzcGFuIHN0eWxlPSJmb250LXNpemU6DQoxMC4w 
cHQ7Zm9udC1mYW1pbHk6QXJpYWw7Y29sb3I6bmF2eSI+PG86cD4mbmJzcDs8L286cD48L3NwYW4+ PC9mb250PjwvcD4NCg0KPGRpdj4NCg0KPGRpdiBjbGFzcz0iTXNvTm9ybWFsIiBhbGlnbj0iY2Vu dGVyIiBzdHlsZT0idGV4dC1hbGlnbjpjZW50ZXIiPjxmb250IHNpemU9IjMiIGZhY2U9IlRpbWVz IE5ldyBSb21hbiI+PHNwYW4gc3R5bGU9ImZvbnQtc2l6ZToxMi4wcHQiPg0KDQo8aHIgc2l6ZT0i MiIgd2lkdGg9IjEwMCUiIGFsaWduPSJjZW50ZXIiIHRhYmluZGV4PSItMSI+DQoNCjwvc3Bhbj48 L2ZvbnQ+PC9kaXY+PHAgY2xhc3M9Ik1zb05vcm1hbCI+PGI+PGZvbnQgc2l6ZT0iMiIgZmFjZT0i VGFob21hIj48c3BhbiBzdHlsZT0iZm9udC1zaXplOjEwLjBwdDsNCmZvbnQtZmFtaWx5OlRhaG9t YTtmb250LXdlaWdodDpib2xkIj5Gcm9tOjwvc3Bhbj48L2ZvbnQ+PC9iPjxmb250IHNpemU9IjIi IGZhY2U9IlRhaG9tYSI+PHNwYW4gc3R5bGU9ImZvbnQtc2l6ZToxMC4wcHQ7Zm9udC1mYW1pbHk6 VGFob21hIj4gSm9uYXRoYW4gTHVkbGFtDQpbbWFpbHRvOkpvbmF0aGFuLkx1ZGxhbUBldS5jaXRy aXguY29tXSA8YnI+DQo8Yj48c3BhbiBzdHlsZT0iZm9udC13ZWlnaHQ6Ym9sZCI+U2VudDo8L3Nw YW4+PC9iPiBUdWVzZGF5LCBKdW5lIDAxLCAyMDEwIDEyOjA3DQpQTTxicj4NCjxiPjxzcGFuIHN0 eWxlPSJmb250LXdlaWdodDpib2xkIj5Ubzo8L3NwYW4+PC9iPiA8c3QxOnBlcnNvbm5hbWUgdzpz dD0ib24iPkFrc2hheUt1bWFyDQogTWVodGE8L3N0MTpwZXJzb25uYW1lPjxicj4NCjxiPjxzcGFu IHN0eWxlPSJmb250LXdlaWdodDpib2xkIj5DYzo8L3NwYW4+PC9iPiA8YSBocmVmPSJtYWlsdG86 eGVuLWRldmVsQGxpc3RzLnhlbnNvdXJjZS5jb20iPnhlbi1kZXZlbEBsaXN0cy54ZW5zb3VyY2Uu Y29tPC9hPjsNCjxzdDE6cGVyc29ubmFtZSB3OnN0PSJvbiI+UHJhZGVlcCBQYWRhbGE8L3N0MTpw ZXJzb25uYW1lPjxicj4NCjxiPjxzcGFuIHN0eWxlPSJmb250LXdlaWdodDpib2xkIj5TdWJqZWN0 Ojwvc3Bhbj48L2I+IFJlOiBbWGVuLWRldmVsXSBYQ1A8L3NwYW4+PC9mb250PjxvOnA+PC9vOnA+ PC9wPg0KDQo8L2Rpdj48cCBjbGFzcz0iTXNvTm9ybWFsIj48Zm9udCBzaXplPSIzIiBmYWNlPSJU aW1lcyBOZXcgUm9tYW4iPjxzcGFuIHN0eWxlPSJmb250LXNpemU6DQoxMi4wcHQiPjxvOnA+Jm5i c3A7PC9vOnA+PC9zcGFuPjwvZm9udD48L3A+PHAgY2xhc3M9Ik1zb05vcm1hbCI+PGZvbnQgc2l6 ZT0iMyIgZmFjZT0iVGltZXMgTmV3IFJvbWFuIj48c3BhbiBzdHlsZT0iZm9udC1zaXplOg0KMTIu MHB0Ij5JSVJDLCB0aGVyZSB3YXMgYSBmYWlybHkgbmFzdHkgc2Vzc2lvbiBjYWNoaW5nIGJ1ZyB0 aGF0IGNvdWxkIGNhdXNlDQppc3N1ZXMgbGlrZSB0aGlzLiBJdCdzIGJlZW4gZml4ZWQgc2luY2Us 
IGFuZCB0aGUgdXBjb21pbmcgMC41IHJlbGVhc2UgKGNvbWluZw0KaW4gYSB3ZWVrIG9yIHNvKSBz aG91bGQgZml4IGl0LjxvOnA+PC9vOnA+PC9zcGFuPjwvZm9udD48L3A+DQoNCjxkaXY+PHAgY2xh c3M9Ik1zb05vcm1hbCI+PGZvbnQgc2l6ZT0iMyIgZmFjZT0iVGltZXMgTmV3IFJvbWFuIj48c3Bh biBzdHlsZT0iZm9udC1zaXplOg0KMTIuMHB0Ij48bzpwPiZuYnNwOzwvbzpwPjwvc3Bhbj48L2Zv bnQ+PC9wPg0KDQo8L2Rpdj4NCg0KPGRpdj48cCBjbGFzcz0iTXNvTm9ybWFsIj48Zm9udCBzaXpl PSIzIiBmYWNlPSJUaW1lcyBOZXcgUm9tYW4iPjxzcGFuIHN0eWxlPSJmb250LXNpemU6DQoxMi4w cHQiPkNoZWVycyw8bzpwPjwvbzpwPjwvc3Bhbj48L2ZvbnQ+PC9wPg0KDQo8L2Rpdj4NCg0KPGRp dj48cCBjbGFzcz0iTXNvTm9ybWFsIj48Zm9udCBzaXplPSIzIiBmYWNlPSJUaW1lcyBOZXcgUm9t YW4iPjxzcGFuIHN0eWxlPSJmb250LXNpemU6DQoxMi4wcHQiPjxvOnA+Jm5ic3A7PC9vOnA+PC9z cGFuPjwvZm9udD48L3A+DQoNCjwvZGl2Pg0KDQo8ZGl2PjxwIGNsYXNzPSJNc29Ob3JtYWwiPjxm b250IHNpemU9IjMiIGZhY2U9IlRpbWVzIE5ldyBSb21hbiI+PHNwYW4gc3R5bGU9ImZvbnQtc2l6 ZToNCjEyLjBwdCI+Sm9uPG86cD48L286cD48L3NwYW4+PC9mb250PjwvcD4NCg0KPC9kaXY+DQoN CjxkaXY+PHAgY2xhc3M9Ik1zb05vcm1hbCI+PGZvbnQgc2l6ZT0iMyIgZmFjZT0iVGltZXMgTmV3 IFJvbWFuIj48c3BhbiBzdHlsZT0iZm9udC1zaXplOg0KMTIuMHB0Ij48bzpwPiZuYnNwOzwvbzpw Pjwvc3Bhbj48L2ZvbnQ+PC9wPg0KDQo8ZGl2PjxwIGNsYXNzPSJNc29Ob3JtYWwiPjxmb250IHNp emU9IjMiIGZhY2U9IlRpbWVzIE5ldyBSb21hbiI+PHNwYW4gc3R5bGU9ImZvbnQtc2l6ZToNCjEy LjBwdCI+PG86cD4mbmJzcDs8L286cD48L3NwYW4+PC9mb250PjwvcD4NCg0KPC9kaXY+DQoNCjxk aXY+PHAgY2xhc3M9Ik1zb05vcm1hbCI+PGZvbnQgc2l6ZT0iMyIgZmFjZT0iVGltZXMgTmV3IFJv bWFuIj48c3BhbiBzdHlsZT0iZm9udC1zaXplOg0KMTIuMHB0Ij48bzpwPiZuYnNwOzwvbzpwPjwv c3Bhbj48L2ZvbnQ+PC9wPg0KDQo8ZGl2Pg0KDQo8ZGl2PjxwIGNsYXNzPSJNc29Ob3JtYWwiPjxm b250IHNpemU9IjMiIGZhY2U9IlRpbWVzIE5ldyBSb21hbiI+PHNwYW4gc3R5bGU9ImZvbnQtc2l6 ZToNCjEyLjBwdCI+T24gMSBKdW4gMjAxMCwgYXQgMTg6NDksIDxzdDE6cGVyc29ubmFtZSB3OnN0 PSJvbiI+QWtzaGF5S3VtYXIgTWVodGE8L3N0MTpwZXJzb25uYW1lPg0Kd3JvdGU6PG86cD48L286 cD48L3NwYW4+PC9mb250PjwvcD4NCg0KPC9kaXY+PHAgY2xhc3M9Ik1zb05vcm1hbCI+PGZvbnQg c2l6ZT0iMyIgZmFjZT0iVGltZXMgTmV3IFJvbWFuIj48c3BhbiBzdHlsZT0iZm9udC1zaXplOg0K 
MTIuMHB0Ij48YnI+DQo8YnI+DQo8bzpwPjwvbzpwPjwvc3Bhbj48L2ZvbnQ+PC9wPg0KDQo8c3Bh biBzdHlsZT0ib3JwaGFuczogMjt3aWRvd3M6IDI7LXdlYmtpdC1ib3JkZXItaG9yaXpvbnRhbC1z cGFjaW5nOiAwcHg7DQotd2Via2l0LWJvcmRlci12ZXJ0aWNhbC1zcGFjaW5nOiAwcHg7LXdlYmtp dC10ZXh0LWRlY29yYXRpb25zLWluLWVmZmVjdDogbm9uZTsNCi13ZWJraXQtdGV4dC1zaXplLWFk anVzdDogYXV0bzstd2Via2l0LXRleHQtc3Ryb2tlLXdpZHRoOiAwcHg7d29yZC1zcGFjaW5nOg0K MHB4Ij4NCg0KPGRpdiBsaW5rPSJibHVlIiB2bGluaz0icHVycGxlIj4NCg0KPGRpdj4NCg0KPGRp dj48cCBjbGFzcz0iTXNvTm9ybWFsIj48Zm9udCBzaXplPSIyIiBmYWNlPSJBcmlhbCI+PHNwYW4g c3R5bGU9ImZvbnQtc2l6ZToxMC4wcHQ7DQpmb250LWZhbWlseTpBcmlhbCI+SGkgdGhlcmUsPHUx OnA+PC91MTpwPjwvc3Bhbj48L2ZvbnQ+PG86cD48L286cD48L3A+DQoNCjwvZGl2Pg0KDQo8ZGl2 PjxwIGNsYXNzPSJNc29Ob3JtYWwiPjxmb250IHNpemU9IjIiIGZhY2U9IkFyaWFsIj48c3BhbiBz dHlsZT0iZm9udC1zaXplOjEwLjBwdDsNCmZvbnQtZmFtaWx5OkFyaWFsIj48dTE6cD4mbmJzcDs8 L3UxOnA+PC9zcGFuPjwvZm9udD48bzpwPjwvbzpwPjwvcD4NCg0KPC9kaXY+DQoNCjxkaXY+PHAg Y2xhc3M9Ik1zb05vcm1hbCI+PGZvbnQgc2l6ZT0iMiIgZmFjZT0iQXJpYWwiPjxzcGFuIHN0eWxl PSJmb250LXNpemU6MTAuMHB0Ow0KZm9udC1mYW1pbHk6QXJpYWwiPldlIGFyZSB1c2luZyBsYXRl c3QgdmVyc2lvbiBvZiBYQ1Agb24gNiBob3N0cy4gV2hpbGUgaXNzdWluZw0KVk0uc3RhcnQgb3Ig Vk0uc3RhcnRfb24geG1scnBjIGZ1bmN0aW9uYWwgY2FsbCAsIGl0IHNheXMgOjx1MTpwPjwvdTE6 cD48L3NwYW4+PC9mb250PjxvOnA+PC9vOnA+PC9wPg0KDQo8L2Rpdj4NCg0KPGRpdj48cCBjbGFz cz0iTXNvTm9ybWFsIj48Zm9udCBzaXplPSIyIiBmYWNlPSJBcmlhbCI+PHNwYW4gc3R5bGU9ImZv bnQtc2l6ZToxMC4wcHQ7DQpmb250LWZhbWlseTpBcmlhbCI+PHUxOnA+Jm5ic3A7PC91MTpwPjwv c3Bhbj48L2ZvbnQ+PG86cD48L286cD48L3A+DQoNCjwvZGl2Pg0KDQo8ZGl2PjxwIGNsYXNzPSJN c29Ob3JtYWwiPjxmb250IHNpemU9IjIiIGZhY2U9IkFyaWFsIj48c3BhbiBzdHlsZT0iZm9udC1z aXplOjEwLjBwdDsNCmZvbnQtZmFtaWx5OkFyaWFsIj48dTE6cD4mbmJzcDs8L3UxOnA+PC9zcGFu PjwvZm9udD48bzpwPjwvbzpwPjwvcD4NCg0KPC9kaXY+DQoNCjxkaXY+PHAgY2xhc3M9Ik1zb05v cm1hbCI+PGZvbnQgc2l6ZT0iMiIgZmFjZT0iQXJpYWwiPjxzcGFuIHN0eWxlPSJmb250LXNpemU6 MTAuMHB0Ow0KZm9udC1mYW1pbHk6QXJpYWwiPnsnU3RhdHVzJzogJ0ZhaWx1cmUnLCAnRXJyb3JE 
ZXNjcmlwdGlvbic6DQpbJ1NFU1NJT05fSU5WQUxJRCcsICdPcGFxdWVSZWY6Y2ZiNmRmMTQtMzg3 ZC00MGExLWNjMjctZDU5NjJjYmE3NzEyJ119PHUxOnA+PC91MTpwPjwvc3Bhbj48L2ZvbnQ+PG86 cD48L286cD48L3A+DQoNCjwvZGl2Pg0KDQo8ZGl2PjxwIGNsYXNzPSJNc29Ob3JtYWwiPjxmb250 IHNpemU9IjIiIGZhY2U9IkFyaWFsIj48c3BhbiBzdHlsZT0iZm9udC1zaXplOjEwLjBwdDsNCmZv bnQtZmFtaWx5OkFyaWFsIj48dTE6cD4mbmJzcDs8L3UxOnA+PC9zcGFuPjwvZm9udD48bzpwPjwv bzpwPjwvcD4NCg0KPC9kaXY+DQoNCjxkaXY+PHAgY2xhc3M9Ik1zb05vcm1hbCI+PGZvbnQgc2l6 ZT0iMiIgZmFjZT0iQXJpYWwiPjxzcGFuIHN0eWxlPSJmb250LXNpemU6MTAuMHB0Ow0KZm9udC1m YW1pbHk6QXJpYWwiPkhvd2V2ZXIgaWYgSSBwdXQgVk0uc3RhcnQgaW4gYSBsb29wIG1heWJlIGFm dGVyJm5ic3A7IDIwLTMwDQp0cmllcyBpdCBzdWNjZWVkcyAuPHUxOnA+PC91MTpwPjwvc3Bhbj48 L2ZvbnQ+PG86cD48L286cD48L3A+DQoNCjwvZGl2Pg0KDQo8ZGl2PjxwIGNsYXNzPSJNc29Ob3Jt YWwiPjxmb250IHNpemU9IjIiIGZhY2U9IkFyaWFsIj48c3BhbiBzdHlsZT0iZm9udC1zaXplOjEw LjBwdDsNCmZvbnQtZmFtaWx5OkFyaWFsIj5CdXQgVk0uc3RhcnRfb24gZG9lcyBub3Qgc3VjY2Vl ZCBldmVuIGFmdGVyIDcwIHRyaWVzLjx1MTpwPjwvdTE6cD48L3NwYW4+PC9mb250PjxvOnA+PC9v OnA+PC9wPg0KDQo8L2Rpdj4NCg0KPGRpdj48cCBjbGFzcz0iTXNvTm9ybWFsIj48Zm9udCBzaXpl PSIyIiBmYWNlPSJBcmlhbCI+PHNwYW4gc3R5bGU9ImZvbnQtc2l6ZToxMC4wcHQ7DQpmb250LWZh bWlseTpBcmlhbCI+T25lIG1vcmUgb2JzZXJ2YXRpb24gLSZuYnNwOyBWTS5jbG9uZSBzdWNjZWVk cyBhZnRlciA3LTgNCnRyaWVzPHUxOnA+PC91MTpwPjwvc3Bhbj48L2ZvbnQ+PG86cD48L286cD48 L3A+DQoNCjwvZGl2Pg0KDQo8ZGl2PjxwIGNsYXNzPSJNc29Ob3JtYWwiPjxmb250IHNpemU9IjIi IGZhY2U9IkFyaWFsIj48c3BhbiBzdHlsZT0iZm9udC1zaXplOjEwLjBwdDsNCmZvbnQtZmFtaWx5 OkFyaWFsIj5WTS5oYXJkX3NodXRkb3duIHdvcmtzIGZpbmU8dTE6cD48L3UxOnA+PC9zcGFuPjwv Zm9udD48bzpwPjwvbzpwPjwvcD4NCg0KPC9kaXY+DQoNCjxkaXY+PHAgY2xhc3M9Ik1zb05vcm1h bCI+PGZvbnQgc2l6ZT0iMiIgZmFjZT0iQXJpYWwiPjxzcGFuIHN0eWxlPSJmb250LXNpemU6MTAu MHB0Ow0KZm9udC1mYW1pbHk6QXJpYWwiPjx1MTpwPiZuYnNwOzwvdTE6cD48L3NwYW4+PC9mb250 PjxvOnA+PC9vOnA+PC9wPg0KDQo8L2Rpdj4NCg0KPGRpdj48cCBjbGFzcz0iTXNvTm9ybWFsIj48 Zm9udCBzaXplPSIyIiBmYWNlPSJBcmlhbCI+PHNwYW4gc3R5bGU9ImZvbnQtc2l6ZToxMC4wcHQ7 
DQpmb250LWZhbWlseTpBcmlhbCI+Q2FuIHlvdSBndWlkZSBtZSBvbiB0aGlzIGlzc3VlLDx1MTpw PjwvdTE6cD48L3NwYW4+PC9mb250PjxvOnA+PC9vOnA+PC9wPg0KDQo8L2Rpdj4NCg0KPGRpdj48 cCBjbGFzcz0iTXNvTm9ybWFsIj48Zm9udCBzaXplPSIyIiBmYWNlPSJBcmlhbCI+PHNwYW4gc3R5 bGU9ImZvbnQtc2l6ZToxMC4wcHQ7DQpmb250LWZhbWlseTpBcmlhbCI+QWtzaGF5PHUxOnA+PC91 MTpwPjwvc3Bhbj48L2ZvbnQ+PG86cD48L286cD48L3A+DQoNCjwvZGl2Pg0KDQo8ZGl2PjxwIGNs YXNzPSJNc29Ob3JtYWwiPjxmb250IHNpemU9IjIiIGZhY2U9IkFyaWFsIj48c3BhbiBzdHlsZT0i Zm9udC1zaXplOjEwLjBwdDsNCmZvbnQtZmFtaWx5OkFyaWFsIj48dTE6cD4mbmJzcDs8L3UxOnA+ PC9zcGFuPjwvZm9udD48bzpwPjwvbzpwPjwvcD4NCg0KPC9kaXY+DQoNCjxkaXY+PHAgY2xhc3M9 Ik1zb05vcm1hbCI+PGZvbnQgc2l6ZT0iMiIgZmFjZT0iQXJpYWwiPjxzcGFuIHN0eWxlPSJmb250 LXNpemU6MTAuMHB0Ow0KZm9udC1mYW1pbHk6QXJpYWwiPjx1MTpwPiZuYnNwOzwvdTE6cD48L3Nw YW4+PC9mb250PjxvOnA+PC9vOnA+PC9wPg0KDQo8L2Rpdj4NCg0KPGRpdj48cCBjbGFzcz0iTXNv Tm9ybWFsIj48Zm9udCBzaXplPSIyIiBmYWNlPSJBcmlhbCI+PHNwYW4gc3R5bGU9ImZvbnQtc2l6 ZToxMC4wcHQ7DQpmb250LWZhbWlseTpBcmlhbCI+PHUxOnA+Jm5ic3A7PC91MTpwPjwvc3Bhbj48 L2ZvbnQ+PG86cD48L286cD48L3A+DQoNCjwvZGl2Pg0KDQo8L2Rpdj48cCBjbGFzcz0iTXNvTm9y bWFsIj48Zm9udCBzaXplPSI0IiBmYWNlPSJIZWx2ZXRpY2EiPjxzcGFuIHN0eWxlPSJmb250LXNp emU6MTMuNXB0Ow0KZm9udC1mYW1pbHk6SGVsdmV0aWNhIj4mbHQ7QVRUMDAwMDEuLnR4dCZndDs8 bzpwPjwvbzpwPjwvc3Bhbj48L2ZvbnQ+PC9wPg0KDQo8L2Rpdj4NCg0KPC9zcGFuPjwvZGl2Pg0K DQo8cCBjbGFzcz0iTXNvTm9ybWFsIj48Zm9udCBzaXplPSIzIiBmYWNlPSJUaW1lcyBOZXcgUm9t YW4iPjxzcGFuIHN0eWxlPSJmb250LXNpemU6DQoxMi4wcHQiPjxvOnA+Jm5ic3A7PC9vOnA+PC9z cGFuPjwvZm9udD48L3A+DQoNCjwvZGl2Pg0KDQo8L2Rpdj4NCg0KPC9kaXY+DQoNCjwvZGl2Pg0K DQoNCjwvbzpzbWFydHRhZ3R5cGU+PC9ibG9ja3F1b3RlPjwvZGl2Pjxicj48L2Rpdj48L2JvZHk+ PC9odG1sPg== --_005_89D54B65B3ED4A4AB1FBD044F00DBE85eucitrixcom_ Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --_005_89D54B65B3ED4A4AB1FBD044F00DBE85eucitrixcom_-- 
From mboxrd@z Thu Jan 1 00:00:00 1970
From: Brendan Cully
Subject: Re: slow live magration / xc_restore on xen4 pvops
Date: Thu, 3 Jun 2010 08:03:05 -0700
Message-ID: <20100603150305.GA53591@zanzibar.domain.invalid>
References: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com> <4C0578EB.2040800@uni.leuphana.de> <19462.33905.936222.605434@mariner.uk.xensource.com> <20100602162745.GA27542@kremvax.cs.ubc.ca> <19463.32147.268104.94905@mariner.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To: <19463.32147.268104.94905@mariner.uk.xensource.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Ian Jackson
Cc: "xen-devel@lists.xensource.com" , Andreas Olsowski
List-Id: xen-devel@lists.xenproject.org

On Thursday, 03 June 2010 at 11:01, Ian Jackson wrote:
> Brendan Cully writes ("Re: [Xen-devel] slow live magration / xc_restore on xen4 pvops"):
> > 2. in normal migration, the sender should close the fd after sending
> > all data, immediately triggering an IO error on the receiver and
> > completing the restore.
>
> This is not true. In normal migration, the fd is used by the
> machinery which surrounds xc_domain_restore (in xc_save and also in xl
> or xend). In any case it would be quite wrong for a library function
> like xc_domain_restore to eat the fd.

The sender closes the fd, as it always has. xc_domain_restore has
always consumed the entire contents of the fd, because the qemu tail
has no length header under normal migration. There's no behavioral
difference here that I can see.

> It's not necessary for xc_domain_restore to behave this way in all
> cases; all that's needed is parameters to tell it how to behave.

I have no objection to a more explicit interface. The current form is
simply Remus trying to be as invisible as possible to the rest of the
tool stack.

> > I did try to avoid disturbing regular live migration as much as
> > possible when I wrote the code. I suspect some other regression has
> > crept in, and I'll investigate.
>
> The short timeout is another regression. A normal live migration or
> restore should not fall over just because no data is available for
> 100ms.

(the timeout is 1s, by the way).

For some reason you clipped the bit of my previous message where I say
this doesn't happen:

1. reads are only supposed to be able to time out after the entire
first checkpoint has been received (IOW this wouldn't kick in until
normal migration had already completed)

Let's take a look at read_exact_timed in xc_domain_restore:

    if ( completed ) {
        /* expect a heartbeat every HEARBEAT_MS ms maximum */
        tv.tv_sec = HEARTBEAT_MS / 1000;
        tv.tv_usec = (HEARTBEAT_MS % 1000) * 1000;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        len = select(fd + 1, &rfds, NULL, NULL, &tv);
        if ( !FD_ISSET(fd, &rfds) ) {
            fprintf(stderr, "read_exact_timed failed (select returned %zd)\n", len);
            return -1;
        }
    }

'completed' is not set until the first entire checkpoint (i.e., the
entirety of non-Remus migration) has completed. So, no issue.

I see no evidence that Remus has anything to do with the live
migration performance regression discussed in this thread, and I
haven't seen any other reported issues either. I think the mlock issue
is a much more likely candidate.
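The gating described above can be sketched in isolation. This is a simplified stand-in, not the libxc function: read_exact_gated, its timeout_enabled parameter (which plays the role of the 'completed' flag, passed explicitly instead of living in a static variable) and timeout_ms are names assumed for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read exactly 'size' bytes from 'fd' into 'buf'.
 *
 * When 'timeout_enabled' is zero the read blocks for as long as it
 * takes. When it is non-zero, each chunk must become readable within
 * 'timeout_ms' milliseconds, otherwise the read gives up.
 *
 * Returns 0 on success, -1 on error, timeout or premature EOF.
 */
static int read_exact_gated(int fd, void *buf, size_t size,
                            int timeout_enabled, long timeout_ms)
{
    size_t offset = 0;

    while (offset < size) {
        if (timeout_enabled) {
            struct timeval tv;
            fd_set rfds;
            int ready;

            /* select() may modify tv, so set it up on every pass. */
            tv.tv_sec = timeout_ms / 1000;
            tv.tv_usec = (timeout_ms % 1000) * 1000;
            FD_ZERO(&rfds);
            FD_SET(fd, &rfds);

            ready = select(fd + 1, &rfds, NULL, NULL, &tv);
            if (ready < 0 && errno == EINTR)
                continue;
            if (ready <= 0)
                return -1;      /* select error or timeout */
        }

        ssize_t len = read(fd, (char *)buf + offset, size - offset);
        if (len < 0 && errno == EINTR)
            continue;
        if (len <= 0)
            return -1;          /* read error or EOF before 'size' bytes */
        offset += (size_t)len;
    }
    return 0;
}
```

Passing the gate as a parameter is one way to get the "more explicit interface" mentioned earlier in this message: the caller decides when a timeout applies, and the function does what its signature says.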
From mboxrd@z Thu Jan 1 00:00:00 1970
From: Keir Fraser
Subject: Re: slow live magration / xc_restore on xen4 pvops
Date: Thu, 3 Jun 2010 16:18:39 +0100
Message-ID:
References: <20100603150305.GA53591@zanzibar.domain.invalid>
Mime-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20100603150305.GA53591@zanzibar.domain.invalid>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Brendan Cully , Ian Jackson
Cc: "xen-devel@lists.xensource.com" , Andreas Olsowski , "Zhai, Edwin"
List-Id: xen-devel@lists.xenproject.org

On 03/06/2010 16:03, "Brendan Cully" wrote:

> I see no evidence that Remus has anything to do with the live
> migration performance regression discussed in this thread, and I
> haven't seen any other reported issues either. I think the mlock issue
> is a much more likely candidate.

I agree it's probably lack of batching plus expensive mlocks. The
performance difference between different machines under test is either
because one runs out of 2MB superpage extents before the other (for some
reason) or because mlock operations are for some reason much more likely to
take a slow path in the kernel (possibly including disk i/o).

We need to get batching back, and Edwin is on the case for that: I hope
Andreas will try out Edwin's patch to work towards that. We can also reduce
mlock cost by mlocking some domain_restore arrays across the entire restore
operation, I should imagine.
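As a rough illustration of the last suggestion, the sketch below contrasts pinning a buffer around every operation with pinning it once for the whole run. It is not taken from libxc: work_fn, run_lock_per_call, run_lock_once and fill_and_count are hypothetical names, and the "work" is a plain memset standing in for a hypercall that needs the buffer resident.

```c
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical unit of work that needs 'buf' to stay resident. */
typedef void (*work_fn)(void *buf, size_t len, void *opaque);

/* Pin and unpin around every operation.
 * Returns the number of mlock() calls made, or -1 if mlock() fails
 * (for example because RLIMIT_MEMLOCK is too low). */
static long run_lock_per_call(void *buf, size_t len, size_t iterations,
                              work_fn work, void *opaque)
{
    long locks = 0;

    for (size_t i = 0; i < iterations; i++) {
        if (mlock(buf, len) != 0)
            return -1;
        locks++;
        work(buf, len, opaque);
        munlock(buf, len);
    }
    return locks;
}

/* Pin once, run every operation, unpin once.
 * Returns 1 (the single mlock() call), or -1 if mlock() fails. */
static long run_lock_once(void *buf, size_t len, size_t iterations,
                          work_fn work, void *opaque)
{
    if (mlock(buf, len) != 0)
        return -1;
    for (size_t i = 0; i < iterations; i++)
        work(buf, len, opaque);
    munlock(buf, len);
    return 1;
}

/* Example work item: fill the buffer and count invocations. */
static void fill_and_count(void *buf, size_t len, void *opaque)
{
    size_t *calls = opaque;

    memset(buf, 0xA5, len);
    (*calls)++;
}
```

If each mlock/munlock pair is as expensive as the timings earlier in the thread suggest, the second form pays that cost once per restore rather than once per call.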
-- Keir

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ian Jackson
Subject: Re: slow live magration / xc_restore on xen4 pvops
Date: Thu, 3 Jun 2010 18:15:48 +0100
Message-ID: <19463.58180.892314.230322@mariner.uk.xensource.com>
References: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com> <4C0578EB.2040800@uni.leuphana.de> <19462.33905.936222.605434@mariner.uk.xensource.com> <20100602162745.GA27542@kremvax.cs.ubc.ca> <19463.32147.268104.94905@mariner.uk.xensource.com> <20100603150305.GA53591@zanzibar.domain.invalid>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20100603150305.GA53591@zanzibar.domain.invalid>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Brendan Cully
Cc: "xen-devel@lists.xensource.com" , Andreas Olsowski
List-Id: xen-devel@lists.xenproject.org

Brendan Cully writes ("Re: [Xen-devel] slow live magration / xc_restore on xen4 pvops"):
> The sender closes the fd, as it always has. xc_domain_restore has
> always consumed the entire contents of the fd, because the qemu tail
> has no length header under normal migration. There's no behavioral
> difference here that I can see.

No, that is not the case. Look for example at "save" in
XendCheckpoint.py in xend, where the save code:
  1. Converts the domain config to sxp and writes it to the fd
  2. Calls xc_save (which calls xc_domain_save)
  3. Writes the qemu save file to the fd

> I have no objection to a more explicit interface. The current form is
> simply Remus trying to be as invisible as possible to the rest of the
> tool stack.

My complaint is that that is not currently the case.

> 1. reads are only supposed to be able to time out after the entire
> first checkpoint has been received (IOW this wouldn't kick in until
> normal migration had already completed)

OMG I hadn't noticed that you had introduced a static variable for
that; I had assumed that "read_exact_timed" was roughly what it said
on the tin.

I think I shall stop now before I become more rude.

Ian.

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "AkshayKumar Mehta"
Subject: RE: XCP
Date: Thu, 3 Jun 2010 10:20:44 -0700
Message-ID: <2FD61F37AFF16D4DB46149330E4273C702FF9A51@dcl-ex.dcml.docomolabs-usa.com>
References: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com> <4BB2088A-1981-4A69-8E3F-DB52EAE75B6F@eu.citrix.com> <2FD61F37AFF16D4DB46149330E4273C702FF99D8@dcl-ex.dcml.docomolabs-usa.com> <89D54B65-B3ED-4A4A-B1FB-D044F00DBE85@eu.citrix.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1468354788=="
Return-path:
Content-class: urn:content-classes:message
In-Reply-To: <89D54B65-B3ED-4A4A-B1FB-D044F00DBE85@eu.citrix.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Jonathan Ludlam
Cc: Pradeep Padala , xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

This is a multi-part message in MIME format.
--===============1468354788==
Content-class: urn:content-classes:message
Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01CB0341.1ADA4CF7"

This is a multi-part message in MIME format.
------_=_NextPart_001_01CB0341.1ADA4CF7
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Thanks! I have already downloaded them.
Will build it and deployed it Akshay =20 ________________________________ From: Jonathan Ludlam [mailto:Jonathan.Ludlam@eu.citrix.com]=20 Sent: Thursday, June 03, 2010 3:25 AM To: AkshayKumar Mehta Cc: xen-devel@lists.xensource.com; Pradeep Padala Subject: Re: [Xen-devel] XCP =20 There aren't any binaries at the moment, but you could build your own from the public repositories - I sent a script out on the 25th March that will compile everything up. You'll need to replace the xapi and fe executables (xapi is output in xen-api.hg/ocaml/xapi/ and fe is found in xen-api-libs.hg/forking_executioner/) =20 I'll reattach the script to this mail. =20 Cheers, =20 Jon =20 ------_=_NextPart_001_01CB0341.1ADA4CF7 Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable

------_=_NextPart_001_01CB0341.1ADA4CF7--

--===============1468354788==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--===============1468354788==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Brendan Cully
Subject: Re: slow live magration / xc_restore on xen4 pvops
Date: Thu, 3 Jun 2010 10:29:27 -0700
Message-ID: <20100603172927.GA4817@kremvax.cs.ubc.ca>
References: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com> <4C0578EB.2040800@uni.leuphana.de> <19462.33905.936222.605434@mariner.uk.xensource.com> <20100602162745.GA27542@kremvax.cs.ubc.ca> <19463.32147.268104.94905@mariner.uk.xensource.com> <20100603150305.GA53591@zanzibar.domain.invalid> <19463.58180.892314.230322@mariner.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To: <19463.58180.892314.230322@mariner.uk.xensource.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Ian Jackson
Cc: "xen-devel@lists.xensource.com" , Andreas Olsowski
List-Id: xen-devel@lists.xenproject.org

On Thursday, 03 June 2010 at 18:15, Ian Jackson wrote:
> Brendan Cully writes ("Re: [Xen-devel] slow live magration / xc_restore on xen4 pvops"):
> > The sender closes the fd, as it always has. xc_domain_restore has
> > always consumed the entire contents of the fd, because the qemu tail
> > has no length header under normal migration. There's no behavioral
> > difference here that I can see.
>
> No, that is not the case. Look for example at "save" in
> XendCheckpoint.py in xend, where the save code:
> 1. Converts the domain config to sxp and writes it to the fd
> 2. Calls xc_save (which calls xc_domain_save)
> 3. Writes the qemu save file to the fd

4. (in XendDomain) closed the fd. Again, this is the _sender_. I fail
to see your point.

> > I have no objection to a more explicit interface. The current form is
> > simply Remus trying to be as invisible as possible to the rest of the
> > tool stack.
>
> My complaint is that that is not currently the case.
>
> > 1. reads are only supposed to be able to time out after the entire
> > first checkpoint has been received (IOW this wouldn't kick in until
> > normal migration had already completed)
>
> OMG I hadn't noticed that you had introduced a static variable for
> that; I had assumed that "read_exact_timed" was roughly what it said
> on the tin.
>
> I think I shall stop now before I become more rude.

Feel free to reply if you have an actual Remus-caused regression
instead of FUD based on misreading the code. I'd certainly be
interested in fixing something real.

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ian Jackson
Subject: Re: slow live magration / xc_restore on xen4 pvops
Date: Thu, 3 Jun 2010 19:02:02 +0100
Message-ID: <19463.60954.355702.523915@mariner.uk.xensource.com>
References: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com> <4C0578EB.2040800@uni.leuphana.de> <19462.33905.936222.605434@mariner.uk.xensource.com> <20100602162745.GA27542@kremvax.cs.ubc.ca> <19463.32147.268104.94905@mariner.uk.xensource.com> <20100603150305.GA53591@zanzibar.domain.invalid> <19463.58180.892314.230322@mariner.uk.xensource.com> <20100603172927.GA4817@kremvax.cs.ubc.ca>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20100603172927.GA4817@kremvax.cs.ubc.ca>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Brendan Cully
Cc: "xen-devel@lists.xensource.com" , Andreas Olsowski
List-Id: xen-devel@lists.xenproject.org

Brendan Cully writes ("Re: [Xen-devel] slow live magration / xc_restore on xen4 pvops"):
> On Thursday, 03 June 2010 at 18:15, Ian Jackson wrote:
> > No, that is not the case. Look for example at "save" in
> > XendCheckpoint.py in xend, where the save code:
> > 1. Converts the domain config to sxp and writes it to the fd
> > 2. Calls xc_save (which calls xc_domain_save)
> > 3. Writes the qemu save file to the fd
>
> 4. (in XendDomain) closed the fd. Again, this is the _sender_. I fail
> to see your point.

In the receiver this corresponds to the qemu savefile being read from
the fd, after xc_domain_restore has returned. So the fd remains
readable after xc_domain_restore and the save image data sent by
xc_domain_save and received by xc_domain_restore is self-delimiting.

In Xen 3.4 this is easily seen in XendCheckpoint.py, where the
corresponding receive logic is clearly visible. In Xen 4.x this is
different because of the Remus patches.

Ian.

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Keir Fraser
Subject: Re: slow live magration / xc_restore on xen4 pvops
Date: Wed, 9 Jun 2010 14:32:35 +0100
Message-ID:
References: <4C076EB2.9030108@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <4C076EB2.9030108@intel.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: "Zhai, Edwin" , "andreas.olsowski@uni.leuphana.de" , "xen-devel@lists.xensource.com" , Ian
Cc: Dave McCracken , Dave McCracken
List-Id: xen-devel@lists.xenproject.org

Edwin, Dave,

The issue is clearly that xc_domain_restore now only ever issues
populate_physmap requests for a single extent at a time. This might be
okay when allocating superpages, but that is rarely the case for PV
guests (depends on a rare domain config parameter) and is not guaranteed
for HVM guests either.
The resulting performance is unacceptable, especially when the kernel's
underlying mlock() is slow.

It looks to me like the root cause is Dave McCracken's patch
xen-unstable:19639, which Edwin Zhai's patch xen-unstable:20126 merely
builds upon.

Ultimately I don't care who fixes it, but I would like a fix for 4.0.1
which releases in the next few weeks, and if I have to do it myself I
will simply hack out the above two changesets. I'd rather have domain
restore working in reasonable time than the relatively small performance
boost of guest superpage mappings.

Thanks,
Keir

On 03/06/2010 09:58, "Zhai, Edwin" wrote:

> I assume this is PV domU rather than HVM, right?
>
> 1. we need check if super page is the culprit by SP_check1.patch.
>
> 2. if this can fix this problem, we need further check where the extra
> costs comes: the speculative algorithm, or the super page population
> hypercall by SP_check2.patch
>
> If SP_check2.patch works, the culprit is the new allocation hypercall(so
> guest creation also suffer); Else, the speculative algorithm.
>
> Does it make sense?
>
> Thanks,
> edwin
>
>
> Brendan Cully wrote:
>> On Thursday, 03 June 2010 at 06:47, Keir Fraser wrote:
>>
>>> On 03/06/2010 02:04, "Brendan Cully" wrote:
>>>
>>>
>>>> I've done a bit of profiling of the restore code and observed the
>>>> slowness here too. It looks to me like it's probably related to
>>>> superpage changes. The big hit appears to be at the front of the
>>>> restore process during calls to allocate_mfn_list, under the
>>>> normal_page case. It looks like we're calling
>>>> xc_domain_memory_populate_physmap once per page here, instead of
>>>> batching the allocation? I haven't had time to investigate further
>>>> today, but I think this is the culprit.
>>>>
>>> Ccing Edwin Zhai. He wrote the superpage logic for domain restore.
>>>
>>
>> Here's some data on the slowdown going from 2.6.18 to pvops dom0:
>>
>> I wrapped the call to allocate_mfn_list in uncanonicalize_pagetable
>> to measure the time to do the allocation.
>>
>> kernel, min call time, max call time
>> 2.6.18, 4 us, 72 us
>> pvops, 202 us, 10696 us (!)
>>
>> It looks like pvops is dramatically slower to perform the
>> xc_domain_memory_populate_physmap call!
>>
>> I'll attach the patch and raw data below.
>>

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Keir Fraser
Subject: Re: slow live magration / xc_restore on xen4 pvops
Date: Thu, 10 Jun 2010 10:27:29 +0100
Message-ID:
References: <4C06E26E.3030404@uni.leuphana.de>
Mime-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <4C06E26E.3030404@uni.leuphana.de>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Andreas Olsowski , "xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

Andreas,

You can check whether this is fixed by the latest fixes in
http://xenbits.xensource.com/xen-4.0-testing.hg. You should only need to
rebuild and reinstall tools/libxc.

Thanks,
Keir

On 02/06/2010 23:59, "Andreas Olsowski" wrote:

> I did some further research now and shut down all virtual machines on
> xenturio1, after that i got (3 runs):
> (xm save takes ~5 seconds , user and sys are always negligible so i
> removed those to reduce text)
>
> xenturio1:~# time xm restore /var/saverestore-x1.mem
> real 0m25.349s 0m27.456s 0m27.208s
>
> So the fact that there were running machines did impact performance of
> xc_restore.
>
> I proceeded to start create 20 "dummy" vms with 1gig ram and 4vpcus
> each (dom0 has 4096M fixed, 24gig total available):
> xenturio1:~# for i in {1..20} ; do echo creating dummy$i ; xt vm create
> dummy$i -vlan 27 -mem 1024 -cpus 4 ; done
> creating dummy1
> vm/create> successfully created vm 'dummy1'
> ....
> creating dummy20
> vm/create> successfully created vm 'dummy20'
>
> and started them
> for i in {1..20} ; do echo starting dummy$i ; xm start dummy$i ; done
>
> So my memory allocation should now be 100% (4gig dom0 20gig domUs), but
> why did i have 512megs to spare for "saverestore-x1"? Oh well, onwards.
>
> Once again i ran a save/restore, 3 times to be sure (edited the
> additional results in output).
>
> With 20 running vms:
> xenturio1:~# time xm restore /var/saverestore-x1.mem
> real 1m16.375s 0m31.306s 1m10.214s
>
> With 16 running vms:
> xenturio1:~# time xm restore /var/saverestore-x1.mem
> real 1m49.741s 1m38.696s 0m55.615s
>
> With 12 running vms:
> xenturio1:~# time xm restore /var/saverestore-x1.mem
> real 1m3.101s 2m4.254s 1m27.193s
>
> With 8 running vms:
> xenturio1:~# time xm restore /var/saverestore-x1.mem
> real 0m36.867s 0m43.513s 0m33.199s
>
> With 4 running vms:
> xenturio1:~# time xm restore /var/saverestore-x1.mem
> real 0m40.454s 0m44.929s 1m7.215s
>
> Keep in mind, those dumUs dont do anything at all, they just idle.
> What is going on there the results seem completely random, running more
> domUs can be faster than running less? How is that even possible?
>
> So i deleted the dummyXs and started the productive domUs again, in 3
> steps to take further measurements:
>
>
> after first batch:
> xenturio1:~# time xm restore /var/saverestore-x1.mem
> real 0m23.968s 1m22.133s 1m24.420s
>
> after second batch:
> xenturio1:~# time xm restore /var/saverestore-x1.mem
> real 1m54.310s 1m11.340s 1m47.643s
>
> after third batch:
> xenturio1:~# time xm restore /var/saverestore-x1.mem
> real 1m52.065s 1m34.517s 2m8.644s 1m25.473s 1m35.943s 1m45.074s
> 1m48.407s 1m18.277s 1m18.931s 1m27.458s
>
> So my current guess is, xc_restore speed depends on the amount of used
> memory or rather how much is beeing grabbed by running processes. Does
> that make any sense?
>
> But if that is so, explain:
> I started 3 vms running "stress" that do:
> load average: 30.94, 30.04, 21.00
> Mem: 5909844k total, 4020480k used, 1889364k free, 288k buffers
>
> But still:
> tarballerina:~# time xm restore /var/saverestore-t.mem
> real 0m38.654s
>
> Why doesnt xc_restore slow down on tarballerina, no matter what i do?
> Again: all 3 machines have 24gigs ram, 2x quad xeons and dom0 is fixed
> to 4096M ram.
> all use the same xen4 sources, the same kernels with the same configs.
>
> Is the Xeon E5520 with DDR3 really this much faster than the L5335 and
> L5410 with DDR2?
>
> If someone were to tell me, that this is expected behaviour i wouldnt
> like it, but at least i could accept it.
> Are machines doing plenty of cpu and memory utilizaton not a good
> measurement in this or any case?
>
> I think tomorrow night i will migrate all machines from xenturio1 to
> tarballerina, but first i have to verify that all vlans are available,
> that i cannot do right now.
>
> ---
>
> Andreas
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "AkshayKumar Mehta"
Subject: RE: XCP - iisues with XCP .5
Date: Mon, 30 Aug 2010 18:33:29 -0700
Message-ID: <2FD61F37AFF16D4DB46149330E4273C703353E53@dcl-ex.dcml.docomolabs-usa.com>
References: <2FD61F37AFF16D4DB46149330E4273C702FF9687@dcl-ex.dcml.docomolabs-usa.com> <4BB2088A-1981-4A69-8E3F-DB52EAE75B6F@eu.citrix.com> <2FD61F37AFF16D4DB46149330E4273C702FF99D8@dcl-ex.dcml.docomolabs-usa.com> <89D54B65-B3ED-4A4A-B1FB-D044F00DBE85@eu.citrix.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0324822647=="
Return-path:
Content-class: urn:content-classes:message
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: AkshayKumar Mehta , Jonathan Ludlam
Cc: Pradeep Padala , xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

This is a multi-part message in MIME format.

--===============0324822647==
Content-class: urn:content-classes:message
Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01CB48AC.849004AF"

This is a multi-part message in MIME format.

------_=_NextPart_001_01CB48AC.849004AF
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Hi Jon,

I am facing some issues with XCP .5:

1. The slave machine frequently hangs, and on reboot it loses its NIC
information. Running ifconfig shows the interfaces, and I can also ping
the network.

2. It fails to migrate VMs (though I get a Success in the status of the
returned object).

Question:

If I point a fresh installation of XCP master and slaves (a new pool) to
a pre-existing repository, how can I get the old VMs from the vhd files
in the new setup? (This is the only pool utilizing the store repository.)

regards

Akshay

________________________________
From: AkshayKumar Mehta
Sent: Thursday, June 03, 2010 10:21 AM
To: 'Jonathan Ludlam'
Cc: xen-devel@lists.xensource.com; Pradeep Padala
Subject: RE: [Xen-devel] XCP

Thanks! I have already downloaded them. Will build it and deploy it.

Akshay

________________________________
From: Jonathan Ludlam [mailto:Jonathan.Ludlam@eu.citrix.com]
Sent: Thursday, June 03, 2010 3:25 AM
To: AkshayKumar Mehta
Cc: xen-devel@lists.xensource.com; Pradeep Padala
Subject: Re: [Xen-devel] XCP

There aren't any binaries at the moment, but you could build your own
from the public repositories - I sent a script out on the 25th March
that will compile everything up. You'll need to replace the xapi and fe
executables (xapi is output in xen-api.hg/ocaml/xapi/ and fe is found in
xen-api-libs.hg/forking_executioner/)

I'll reattach the script to this mail.

Cheers,

Jon

------_=_NextPart_001_01CB48AC.849004AF
Content-Type: text/html; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

------_=_NextPart_001_01CB48AC.849004AF--

--===============0324822647==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--===============0324822647==--