* Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell
@ 2013-07-09 15:27 Lars Kurth
2013-07-09 15:40 ` Thanos Makatos
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Lars Kurth @ 2013-07-09 15:27 UTC (permalink / raw)
To: xen-devel@lists.xen.org
Not sure whether anyone has seen this:
http://www.phoronix.com/scan.php?page=article&item=intel_haswell_virtualization
Some of the comments are interesting, but not really as negative as they
used to be. In any case, it may make sense to have a quick look
Lars
* Re: Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell
2013-07-09 15:27 Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell Lars Kurth
@ 2013-07-09 15:40 ` Thanos Makatos
2013-07-09 15:53 ` Ian Murray
2013-07-09 15:54 ` Gordan Bobic
2013-07-09 16:52 ` Alex Bligh
2 siblings, 1 reply; 13+ messages in thread
From: Thanos Makatos @ 2013-07-09 15:40 UTC (permalink / raw)
To: lars.kurth@xen.org, xen-devel@lists.xen.org
> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> bounces@lists.xen.org] On Behalf Of Lars Kurth
> Sent: 09 July 2013 16:28
> To: xen-devel@lists.xen.org
> Subject: [Xen-devel] Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell
>
> Not sure whether anyone has seen this:
> http://www.phoronix.com/scan.php?page=article&item=intel_haswell_virtualization
>
> Some of the comments are interesting, but not really as negative as
> they used to be. In any case, it may make sense to have a quick look
>
> Lars
They use PostMark for their disk I/O tests, which is an ancient benchmark.
* Re: Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell
2013-07-09 15:40 ` Thanos Makatos
@ 2013-07-09 15:53 ` Ian Murray
2013-07-09 15:56 ` Thanos Makatos
0 siblings, 1 reply; 13+ messages in thread
From: Ian Murray @ 2013-07-09 15:53 UTC (permalink / raw)
To: Thanos Makatos, lars.kurth@xen.org, xen-devel@lists.xen.org
----- Original Message -----
> From: Thanos Makatos <thanos.makatos@citrix.com>
> To: "lars.kurth@xen.org" <lars.kurth@xen.org>; "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
> Cc:
> Sent: Tuesday, 9 July 2013, 16:40
> Subject: Re: [Xen-devel] Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell
>
>> -----Original Message-----
>> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
>> bounces@lists.xen.org] On Behalf Of Lars Kurth
>> Sent: 09 July 2013 16:28
>> To: xen-devel@lists.xen.org
>> Subject: [Xen-devel] Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell
>>
>> Not sure whether anyone has seen this:
>> http://www.phoronix.com/scan.php?page=article&item=intel_haswell_virtualization
>>
>> Some of the comments are interesting, but not really as negative as
>> they used to be. In any case, it may make sense to have a quick look
>>
>> Lars
>
> They use PostMark for their disk I/O tests, which is an ancient benchmark.
is that a good or a bad thing? If so, why?
* Re: Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell
2013-07-09 15:27 Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell Lars Kurth
2013-07-09 15:40 ` Thanos Makatos
@ 2013-07-09 15:54 ` Gordan Bobic
2013-07-11 10:53 ` Dario Faggioli
2013-07-09 16:52 ` Alex Bligh
2 siblings, 1 reply; 13+ messages in thread
From: Gordan Bobic @ 2013-07-09 15:54 UTC (permalink / raw)
To: lars.kurth; +Cc: xen-devel
On Tue, 09 Jul 2013 16:27:31 +0100, Lars Kurth <lars.kurth@xen.org>
wrote:
> Not sure whether anyone has seen this:
>
> http://www.phoronix.com/scan.php?page=article&item=intel_haswell_virtualization
>
> Some of the comments are interesting, but not really as negative as
> they used to be. In any case, it may make sense to have a quick look
Relative figures, at least in terms of ordering, are similar to what I
found the last time I did a similar test:
http://www.altechnative.net/2012/08/04/virtual-performance-part-1-vmware/
My test was harsher, though, because it exposed more of the context
switching and inter-core (and worse, inter-die since I tested on a
C2Q) migration overheads.
The process migration overheads are _expensive_ - I found that on bare
metal, pinning CPU/RAM-intensive processes to cores made a ~20%
difference to overall throughput on a C2Q class CPU (no shared caches
between the two dies made it worse). I expect 4.3.x will be a
substantial improvement with NUMA awareness improvements to the
scheduler (looking forward to trying it this weekend).
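As a rough illustration of that kind of pinning, a minimal Python sketch
(Linux-only; the core numbers and the busy-loop workload are just
placeholders) that keeps each CPU-bound worker on a fixed core so the
scheduler cannot migrate it between dies - comparing its runtime against
an unpinned run gives a feel for the migration overhead:

import os
import time
from multiprocessing import Process

def busy(iterations):
    # purely CPU-bound placeholder workload
    x = 0
    for i in range(iterations):
        x += i * i
    return x

def pinned_worker(core, iterations):
    os.sched_setaffinity(0, {core})  # pin this process to a single core (Linux only)
    busy(iterations)

if __name__ == "__main__":
    start = time.time()
    workers = [Process(target=pinned_worker, args=(core, 50000000))
               for core in (0, 1, 2, 3)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print("pinned run took %.2fs" % (time.time() - start))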
Shame Phoronix didn't test PV performance; in my tests that made
a huge difference and put Xen firmly ahead of the competition.
Gordan
* Re: Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell
2013-07-09 15:53 ` Ian Murray
@ 2013-07-09 15:56 ` Thanos Makatos
2013-07-09 16:14 ` Gordan Bobic
0 siblings, 1 reply; 13+ messages in thread
From: Thanos Makatos @ 2013-07-09 15:56 UTC (permalink / raw)
To: Ian Murray, lars.kurth@xen.org, xen-devel@lists.xen.org
> >> Not sure whether anyone has seen this:
> >>
> >> http://www.phoronix.com/scan.php?page=article&item=intel_haswell_virtualization
> >>
> >> Some of the comments are interesting, but not really as negative as
> >> they used to be. In any case, it may make sense to have a quick look
> >>
> >> Lars
> >
> > They use PostMark for their disk I/O tests, which is an ancient
> > benchmark.
>
> is that a good or a bad thing? If so, why?
IMO it's a bad thing because it's far from a representative benchmark, which can lead to wrong conclusions when evaluating I/O performance.
* Re: Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell
2013-07-09 15:56 ` Thanos Makatos
@ 2013-07-09 16:14 ` Gordan Bobic
2013-07-09 16:21 ` Thanos Makatos
0 siblings, 1 reply; 13+ messages in thread
From: Gordan Bobic @ 2013-07-09 16:14 UTC (permalink / raw)
To: Thanos Makatos; +Cc: Ian Murray, lars.kurth@xen.org, xen-devel
On Tue, 9 Jul 2013 15:56:51 +0000, Thanos Makatos
<thanos.makatos@citrix.com> wrote:
>> >> Not sure whether anyone has seen this:
>> >>
>> >> http://www.phoronix.com/scan.php?page=article&item=intel_haswell_virtualization
>> >>
>> >> Some of the comments are interesting, but not really as negative as
>> >> they used to be. In any case, it may make sense to have a quick look
>> >>
>> >> Lars
>> >>
>> > They use PostMark for their disk I/O tests, which is an ancient
>> > benchmark.
>>
>> is that a good or a bad thing? If so, why?
>
> IMO it's a bad thing because it's far from a representative
> benchmark, which can lead to wrong conclusions when evaluating I/O
> performance.
Ancient doesn't mean non-representative. A good file-system benchmark
is a tricky one to come up with because most FS-es are good at some
things and bad at others. If you really want to test the virtualization
overhead on FS I/O, the only sane way to test it is by putting the
FS on the host's RAM disk and testing from there. That should
expose the full extent of the overhead, subject to the same
caveat about different FS-es being better at different load types.
Personally I'm in favour of redneck-benchmarks that easily push
the whole stack to saturation point (e.g. highly parallel kernel
compile) since those cannot be cheated. But generically speaking,
the only way to get a worthwhile measure is to create a custom
benchmark that tests your specific application to saturation
point. Any generic/synthetic benchmark will provide results
that are almost certainly going to be misleading for any
specific real-world load you are planning to run on your
system.
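For what it's worth, the sort of thing I have in mind is sketched below
in Python: copy a kernel tree onto tmpfs (so disk stays out of the
picture) and time a heavily parallel build. The paths, tmpfs size and -j
factor are just assumptions, and the mount step needs root:

import os
import subprocess
import time

RAMDISK = "/mnt/ramdisk"        # assumed mount point (must already exist)
KERNEL_SRC = "/usr/src/linux"   # assumed pre-unpacked kernel tree

# Put the whole build on tmpfs so the benchmark saturates CPU/memory, not disk.
subprocess.run(["mount", "-t", "tmpfs", "-o", "size=8g", "tmpfs", RAMDISK], check=True)
subprocess.run(["cp", "-a", KERNEL_SRC, RAMDISK], check=True)

build_dir = os.path.join(RAMDISK, os.path.basename(KERNEL_SRC))
jobs = str((os.cpu_count() or 4) * 2)   # oversubscribe so every core stays busy

subprocess.run(["make", "defconfig"], cwd=build_dir, check=True)
start = time.time()
subprocess.run(["make", "-j", jobs], cwd=build_dir, check=True)
print("parallel kernel build took %.1fs" % (time.time() - start))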
For example, on a read-only MySQL load (read-only because it simplified
testing: no need to rebuild huge data sets between runs, just drop all
the caches), in a custom application performance test that I carried out
for a client, ESX showed a ~40% throughput degradation over bare metal
(8 cores/server, 16 SQL threads cat-ing select-filtered general-log
extracts, load generator running in the same VM). And the test machines
(both physical and virtual) had enough RAM in them that they were only
disk I/O bound for the first 2-3 minutes of the test (which took the
best part of an hour to complete), which goes to show that disk I/O
bottlenecks are good at covering up overheads elsewhere.
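A minimal sketch of that kind of load generator (not the actual harness;
the query files, credentials and database name here are made up): 16
threads, each feeding a pre-filtered file of SELECTs into the mysql
client, with the whole run timed end to end.

import subprocess
import threading
import time

QUERY_FILES = ["/data/queries/thread%02d.sql" % i for i in range(16)]
DATABASE = "testdb"

def run_queries(path):
    # Equivalent of cat-ing a general-log extract into the mysql CLI.
    with open(path, "rb") as f:
        subprocess.run(["mysql", "--user=bench", "--password=secret", DATABASE],
                       stdin=f, stdout=subprocess.DEVNULL, check=True)

start = time.time()
threads = [threading.Thread(target=run_queries, args=(p,)) for p in QUERY_FILES]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("16-thread read-only run took %.1fs" % (time.time() - start))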
Gordan
* Re: Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell
2013-07-09 16:14 ` Gordan Bobic
@ 2013-07-09 16:21 ` Thanos Makatos
2013-07-09 16:26 ` Gordan Bobic
0 siblings, 1 reply; 13+ messages in thread
From: Thanos Makatos @ 2013-07-09 16:21 UTC (permalink / raw)
To: Gordan Bobic; +Cc: Ian Murray, lars.kurth@xen.org, xen-devel@lists.xen.org
> > IMO it's a bad thing because it's far from a representative benchmark,
> > which can lead to wrong conclusions when evaluating I/O performance.
>
> Ancient doesn't mean non-representative. A good file-system benchmark
In this particular case it is: PostMark is a single-threaded application that performs read and write operations on a fixed set of files, at an unrealistically low directory depth; modern I/O workloads exhibit much more complicated behaviour than this.
> is a tricky one to come up with because most FS-es are good at some
> things and bad at others. If you really want to test the virtualization
> overhead on FS I/O, the only sane way to test it is by putting the FS
> on the host's RAM disk and testing from there. That should expose the
> full extent of the overhead, subject to the same caveat about
> different FS-es being better at different load types.
>
> Personally I'm in favour of redneck-benchmarks that easily push the
> whole stack to saturation point (e.g. highly parallel kernel
> compile) since those cannot be cheated. But generically speaking, the
> only way to get a worthwhile measure is to create a custom benchmark
> that tests your specific application to saturation point. Any
> generic/synthetic benchmark will provide results that are almost
> certainly going to be misleading for any specific real-world load you
> are planning to run on your system.
>
> For example, on a read-only MySQL load (read-only because it simplified
> testing: no need to rebuild huge data sets between runs, just drop all
> the caches), in a custom application performance test that I carried out
> for a client, ESX showed a ~40% throughput degradation over bare metal
> (8 cores/server, 16 SQL threads cat-ing select-filtered general-log
> extracts, load generator running in the same VM). And the test machines
> (both physical and virtual) had enough RAM in them that they were only
> disk I/O bound for the first 2-3 minutes of the test (which took the
> best part of an hour to complete), which goes to show that disk I/O
> bottlenecks are good at covering up overheads elsewhere.
>
> Gordan
* Re: Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell
2013-07-09 16:21 ` Thanos Makatos
@ 2013-07-09 16:26 ` Gordan Bobic
0 siblings, 0 replies; 13+ messages in thread
From: Gordan Bobic @ 2013-07-09 16:26 UTC (permalink / raw)
To: Thanos Makatos; +Cc: Ian Murray, lars.kurth@xen.org, xen-devel
On Tue, 9 Jul 2013 16:21:52 +0000, Thanos Makatos
<thanos.makatos@citrix.com> wrote:
>> > IMO it's a bad thing because it's far from a representative benchmark,
>> > which can lead to wrong conclusions when evaluating I/O performance.
>>
>> Ancient doesn't mean non-representative. A good file-system
>> benchmark
>
> In this particular case it is: PostMark is a single-threaded
> application that performs read and write operations on a fixed set of
> files, at an unrealistically low directory depth; modern I/O
> workloads
> exhibit much more complicated behaviour than this.
Unless you are running a mail server. Granted, running multiple
postmarks in parallel might be a better test on today's many-core
servers, but it'd likely make little or no difference on a
disk-I/O-bound test.
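Something along these lines would do it - a hedged Python sketch that
assumes a postmark binary on the path which accepts a command file (the
file and transaction counts are placeholders), one instance per working
directory:

import os
import subprocess
import time

INSTANCES = 8
COMMANDS = """set location {workdir}
set number 5000
set transactions 20000
run
quit
"""

procs = []
start = time.time()
for i in range(INSTANCES):
    workdir = "/tmp/pm%d" % i
    os.makedirs(workdir, exist_ok=True)
    cmd_file = os.path.join(workdir, "pm.cfg")
    with open(cmd_file, "w") as f:
        f.write(COMMANDS.format(workdir=workdir))
    # Each PostMark instance runs against its own directory tree.
    procs.append(subprocess.Popen(["postmark", cmd_file], stdout=subprocess.DEVNULL))
for p in procs:
    p.wait()
print("%d parallel PostMark runs took %.1fs" % (INSTANCES, time.time() - start))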
Gordan
* Re: Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell
2013-07-09 15:27 Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell Lars Kurth
2013-07-09 15:40 ` Thanos Makatos
2013-07-09 15:54 ` Gordan Bobic
@ 2013-07-09 16:52 ` Alex Bligh
2 siblings, 0 replies; 13+ messages in thread
From: Alex Bligh @ 2013-07-09 16:52 UTC (permalink / raw)
To: lars.kurth, xen-devel; +Cc: Alex Bligh
--On 9 July 2013 16:27:31 +0100 Lars Kurth <lars.kurth@xen.org> wrote:
> Not sure whether anyone has seen this:
> http://www.phoronix.com/scan.php?page=article&item=intel_haswell_virtualization
>
> Some of the comments are interesting, but not really as negative as they
> used to be. In any case, it may make sense to have a quick look
Last time I looked at the Phoronix benchmarks, they were using the default
disk caching with Xen and QEMU, and these were not identical. From memory,
KVM was using writethrough and Xen was using no caching.
This one says "Xen and KVM virtualization were setup through virt-manager".
I don't know whether that evens things out, as I don't use it.
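For what it's worth, something like this rough Python sketch (the domain
names are placeholders) would show what virt-manager actually configured,
by pulling the cache= attribute off each disk driver in the libvirt XML:

import subprocess
import xml.etree.ElementTree as ET

for domain in ("xen-guest", "kvm-guest"):          # placeholder domain names
    xml_text = subprocess.run(["virsh", "dumpxml", domain],
                              capture_output=True, text=True, check=True).stdout
    root = ET.fromstring(xml_text)
    for disk in root.findall(".//devices/disk"):
        driver = disk.find("driver")
        target = disk.find("target")
        dev = target.get("dev") if target is not None else "?"
        cache = driver.get("cache") if driver is not None else None
        print("%s %s: cache=%s" % (domain, dev, cache or "(hypervisor default)"))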
--
Alex Bligh
* Re: Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell
2013-07-09 15:54 ` Gordan Bobic
@ 2013-07-11 10:53 ` Dario Faggioli
2013-07-11 16:23 ` George Dunlap
0 siblings, 1 reply; 13+ messages in thread
From: Dario Faggioli @ 2013-07-11 10:53 UTC (permalink / raw)
To: Gordan Bobic; +Cc: lars.kurth, xen-devel
On Tue, 2013-07-09 at 16:54 +0100, Gordan Bobic wrote:
> The process migration overheads are _expensive_
>
Indeed!
> - I found that on bare
> metal, pinning CPU/RAM-intensive processes to cores made a ~20%
> difference to overall throughput on a C2Q class CPU (no shared caches
> between the two dies made it worse). I expect 4.3.x will be a
> substantial improvement with NUMA awareness improvements to the
> scheduler (looking forward to trying it this weekend).
>
Well, yes, something good could be expected, although the actual
improvement will depend on the number of involved VMs, their sizes, the
workload they're running, etc.
When I tried to use kernel compile as a benchmark for the NUMA effects,
it did not turn out that useful to me (and that's why I switched to
SpecJBB), but perhaps it was me that was doing something wrong...
Anyway, if you do anything like this, please, do let us know here (and,
please, Cc me :-P).
Thanks and Regards,
Dario
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
* Re: Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell
2013-07-11 10:53 ` Dario Faggioli
@ 2013-07-11 16:23 ` George Dunlap
2013-07-11 16:27 ` Dario Faggioli
0 siblings, 1 reply; 13+ messages in thread
From: George Dunlap @ 2013-07-11 16:23 UTC (permalink / raw)
To: Dario Faggioli; +Cc: Gordan Bobic, Lars Kurth, xen-devel@lists.xen.org
On Thu, Jul 11, 2013 at 11:53 AM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
> On Tue, 2013-07-09 at 16:54 +0100, Gordan Bobic wrote:
>> The process migration overheads are _expensive_
>>
> Indeed!
>
>> - I found that on bare
>> metal, pinning CPU/RAM-intensive processes to cores made a ~20%
>> difference to overall throughput on a C2Q class CPU (no shared caches
>> between the two dies made it worse). I expect 4.3.x will be a
>> substantial improvement with NUMA awareness improvements to the
>> scheduler (looking forward to trying it this weekend).
>>
> Well, yes, something good could be expected, although the actual
> improvement will depend on the number of involved VMs, their sizes, the
> workload they're running, etc.
>
> When I tried to use kernel compile as a benchmark for the NUMA effects,
> it did not turn out that useful to me (and that's why I switched to
> SpecJBB), but perhaps it was me that was doing something wrong...
In my experience, kernel-build has excellent memory locality. One
effect is that the effect of nested paging on TLB time is almost nil;
I'm not surprised that the caches make the effect of NUMA almost nil
as well.
-George
* Re: Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell
2013-07-11 16:23 ` George Dunlap
@ 2013-07-11 16:27 ` Dario Faggioli
2013-07-11 17:49 ` Gordan Bobic
0 siblings, 1 reply; 13+ messages in thread
From: Dario Faggioli @ 2013-07-11 16:27 UTC (permalink / raw)
To: George Dunlap; +Cc: Gordan Bobic, Lars Kurth, xen-devel@lists.xen.org
On Thu, 2013-07-11 at 17:23 +0100, George Dunlap wrote:
> On Thu, Jul 11, 2013 at 11:53 AM, Dario Faggioli
> > When I tried to use kernel compile as a benchmark for the NUMA effects,
> > it did not turn out that useful to me (and that's why I switched to
> > SpecJBB), but perhaps it was me that was doing something wrong...
>
> In my experience, kernel-build has excellent memory locality. One
> effect is that the effect of nested paging on TLB time is almost nil;
> I'm not surprised that the caches make the effect of NUMA almost nil
> as well.
>
Not to mention I/O, unless you set up a ramfs-backed build
environment. Again, when I tried, that was my intention, but perhaps I
failed right at that... Gordan, what about you?
Dario
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
* Re: Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell
2013-07-11 16:27 ` Dario Faggioli
@ 2013-07-11 17:49 ` Gordan Bobic
0 siblings, 0 replies; 13+ messages in thread
From: Gordan Bobic @ 2013-07-11 17:49 UTC (permalink / raw)
To: Dario Faggioli; +Cc: George Dunlap, Lars Kurth, xen-devel@lists.xen.org
On 07/11/2013 05:27 PM, Dario Faggioli wrote:
> On Thu, 2013-07-11 at 17:23 +0100, George Dunlap wrote:
>> On Thu, Jul 11, 2013 at 11:53 AM, Dario Faggioli
>>> When I tried to use kernel compile as a benchmark for the NUMA effects,
>>> it did not turn out that useful to me (and that's why I switched to
>>> SpecJBB), but perhaps it was me that was doing something wrong...
>>
>> In my experience, kernel-build has excellent memory locality. One
>> effect is that the effect of nested paging on TLB time is almost nil;
>> I'm not surprised that the caches make the effect of NUMA almost nil
>> as well.
>>
> Not to mention I/O, unless you set up a ramfs-backed build
> environment. Again, when I tried, that was my intention, but perhaps I
> failed right at that... Gordan, what about you?
IIRC in my tests the disk I/O was relatively minimal. If you read the
details here:
http://www.altechnative.net/2012/08/04/virtual-performance-part-1-vmware/
you may notice that I actually primed the test by catting everything to
/dev/null, so all the reads should have been coming from the page cache.
I didn't have enough RAM in the machine (only 8GB) to fit all the
produced binaries in tmpfs at the time.
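In case it helps anyone reproduce it, a rough Python equivalent of that
cat-to-/dev/null priming step (the source tree path is just an example):
read every file once so later reads come from the page cache.

import os

def prime_page_cache(tree):
    total = 0
    for dirpath, _dirnames, filenames in os.walk(tree):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    while True:
                        chunk = f.read(1 << 20)    # 1 MiB at a time
                        if not chunk:
                            break
                        total += len(chunk)
            except OSError:
                pass                               # skip unreadable files
    return total

print("primed %.1f MiB" % (prime_page_cache("/usr/src/linux") / float(1 << 20)))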
I don't think this had a large impact, though - the iowait time was
about 0% all the time because there were plenty of threads that had
productive compiling work to do while some were waiting to commit to
disk. Since this was on a C2Q, there was no NUMA in play, so if I had to
guess at the major cause of performance degradation, it would be related
to context switching; having said that, I didn't get around to doing any
in-depth profiling to be able to tell for sure. (Speaking of which, how
would one go about profiling things at the bare-metal hypervisor level?)
I will re-run the test on a new machine at some point and see how it
compares, and this time I will have enough RAM for the whole lot to fit.
Gordan