From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Jan Beulich"
Subject: Re: Performance difference between Xen versions
Date: Fri, 29 Apr 2011 17:10:09 +0100
Message-ID: <4DBAFF01020000780003EEFD@vpn.id2.novell.com>
References: <4DBAAFF1.8080001@ts.fujitsu.com>
In-Reply-To: <4DBAAFF1.8080001@ts.fujitsu.com>
To: Juergen Gross
Cc: "xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

>>> On 29.04.11 at 14:32, Juergen Gross wrote:
> Hi,
>
> comparing performance of different Xen versions with BS2000 as HVM guest
> showed some weird data I'd like to understand.
>
> All measurements were done on an Intel Xeon E7220 box. We used a
> disk-benchmark and found the cpu utilization was much higher with Xen 4.0
> compared to Xen 3.3. I did some more investigation and narrowed things
> down to calls of the hypervisor (implicit or explicit).
>
> Following is a table with timing data for different low-level functions,
> all timing values are tsc ticks obtained via rdtsc:
>
>  Xen 3.3   Xen 4.0  Function
>       88       165  just the measurement overhead
>      176       330  rdtsc-instruction + cli/sti
>     5896     11044  lapic timer query
>     7381     13519  setting lapic timer
>     4653      8987  reload of cr3
>     3124      5709  invlpg instruction
>   792253    792264  wbinvd instruction
>      748      1375  int + iret
>     5203      9317  hypervisor yield call
> 12598102  12597882  memory access loop
>
> All operations involving the hypervisor take nearly twice the time on 4.0.
> Operations not involving the hypervisor (wbinvd and memory access loop) are
> the same on both systems (this rules out the possibility of different rdtsc
> behavior).
>
> Is there any easy explanation for this?
> Both Xen versions are from SLES (SLES11 or SLES11 SP1).

I think cpufreq handling was off by default in 3.3, and is on by default
on 4.0. Try turning this off, or using the performance governor.

>
> Juergen
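
For reference, the "nearly twice" figure can be checked row by row from the quoted table. The following is only a minimal sketch: the tick counts are copied from the table above, while the names `TIMINGS` and `slowdown` are made up for illustration and are not part of any Xen tooling.

```python
# TSC tick counts copied from the quoted table: (Xen 3.3, Xen 4.0).
# TIMINGS and slowdown() are illustrative names, not an existing API.
TIMINGS = {
    "just the measurement overhead": (88, 165),
    "rdtsc-instruction + cli/sti": (176, 330),
    "lapic timer query": (5896, 11044),
    "setting lapic timer": (7381, 13519),
    "reload of cr3": (4653, 8987),
    "invlpg instruction": (3124, 5709),
    "wbinvd instruction": (792253, 792264),
    "int + iret": (748, 1375),
    "hypervisor yield call": (5203, 9317),
    "memory access loop": (12598102, 12597882),
}


def slowdown(timings):
    """Return {function: Xen 4.0 ticks / Xen 3.3 ticks} for each row."""
    return {name: new / old for name, (old, new) in timings.items()}


ratios = slowdown(TIMINGS)
for name, ratio in ratios.items():
    print(f"{ratio:5.2f}x  {name}")
```

The eight slower rows all land between about 1.8x and 1.9x, while wbinvd and the memory access loop stay at 1.00x. A near-constant factor like that would be consistent with the cpufreq explanation (a lower core clock measured against a TSC that keeps ticking at a fixed rate), though the numbers alone do not prove it.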