From mboxrd@z Thu Jan 1 00:00:00 1970
From: Gordan Bobic
Subject: Re: Xen 4.2.2 / KVM / VirtualBox benchmark on Haswell
Date: Thu, 11 Jul 2013 18:49:00 +0100
Message-ID: <51DEF00C.9080400@bobich.net>
References: <51DC2BE3.7000009@xen.org> <1373540028.12772.31.camel@Solace> <1373560062.12772.48.camel@Solace>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <1373560062.12772.48.camel@Solace>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Dario Faggioli
Cc: George Dunlap, Lars Kurth, "xen-devel@lists.xen.org"
List-Id: xen-devel@lists.xenproject.org

On 07/11/2013 05:27 PM, Dario Faggioli wrote:
> On gio, 2013-07-11 at 17:23 +0100, George Dunlap wrote:
>> On Thu, Jul 11, 2013 at 11:53 AM, Dario Faggioli
>>> When I tried to use kernel compile as a benchmark for the NUMA
>>> effects, it did not turn out that useful to me (and that's why I
>>> switched to SpecJBB), but perhaps it was me that was doing
>>> something wrong...
>>
>> In my experience, kernel-build has excellent memory locality. One
>> effect is that the impact of nested paging on TLB time is almost
>> nil; I'm not surprised that the caches make the effect of NUMA
>> almost nil as well.
>>
> Not to mention I/O, unless you set up a ramfs-backed build
> environment. Again, when I tried, that was my intention, but perhaps
> I failed right at that... Gordan, what about you?

IIRC, the disk I/O in my tests was relatively minimal. If you read the
details here:

http://www.altechnative.net/2012/08/04/virtual-performance-part-1-vmware/

you may notice that I primed the test by catting everything to
/dev/null, so all the reads should have been coming from the page
cache (a rough sketch of that priming pass is appended at the end of
this message). I didn't have enough RAM in the machine (only 8GB) at
the time to also fit all the produced binaries in tmpfs.

I don't think that had a large impact, though: iowait stayed at about
0% throughout, because there were always plenty of threads with
productive compiling work to do while others were waiting to commit
to disk.

Since this was on a C2Q, there was no NUMA in play, so if I had to
guess at the major cause of the performance degradation, I would say
it was related to context switching; having said that, I never got
around to doing any in-depth profiling to be able to tell for sure.
(Speaking of which, how would one go about profiling things at the
bare-metal hypervisor level?)

I will re-run the test on a new machine at some point and see how it
compares, and this time I will have enough RAM for the whole lot to
fit.

Gordan
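
[Appended for reference: a minimal Python sketch of the cache-priming
pass described above. It is only an illustration of the idea (the
actual priming was a plain shell "cat everything to /dev/null" loop),
and the source-tree path below is a hypothetical placeholder.]

#!/usr/bin/env python
# Warm the page cache by reading every file under a source tree and
# discarding the data (same idea as
# `find . -type f -exec cat {} + > /dev/null`).

import os

TREE = "/usr/src/linux"  # hypothetical: point at the tree being built

def prime_page_cache(root):
    """Read every regular file under root in 1 MiB chunks."""
    files_read = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    # The data is thrown away; the point is only to
                    # pull the blocks into the page cache before the
                    # timed compile starts.
                    while f.read(1024 * 1024):
                        pass
                files_read += 1
            except (IOError, OSError):
                pass  # skip unreadable entries (sockets, dangling links)
    return files_read

if __name__ == "__main__":
    print("primed %d files under %s" % (prime_page_cache(TREE), TREE))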