Re: More benchmarks with flatten topology in the Linux kernel

From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Dario Faggioli <dario.faggioli@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>,
	Andrew Cooper <Andrew.Cooper3@citrix.com>,
	Juergen Gross <jgross@ssue.com>,
	David Vrabel <david.vrabel@citrix.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>
Subject: Re: More benchmarks with flatten topology in the Linux kernel
Date: Tue, 27 Oct 2015 16:44:17 -0400
Message-ID: <20151027204417.GF4849@l.oracle.com>
In-Reply-To: <1445444764.3009.188.camel@citrix.com>

On Wed, Oct 21, 2015 at 06:26:04PM +0200, Dario Faggioli wrote:
> Hi everyone,
> 
> I managed running again the benchmarks I had already showed off here:

Hey!

Thank you for doing that.
> 
>  [PATCH RFC] xen: if on Xen, "flatten" the scheduling domain hierarchy
>  https://lkml.org/lkml/2015/8/18/302
> 
> Basically, this is about Linux guests using topology information for
> scheduling, while they just don't make any sense when on Xen as (unless
> static and guest-lifetime long pinning is used) vCPUs do move around!
> 
> Some more context is also available here:
> 
>  http://lists.xen.org/archives/html/xen-devel/2015-07/msg03241.html
> 
> This email is still about numbers obtained by running things in Dom0,
> and without overloading the host pCPUs at the Xen level (i.e., I'm
> using nr. dom0 vCPUs == nr. host pCPUs).
> 
> With respect to previous round:
>  - I've added results for hackbench
>  - I've run the benches with both my patch[0] and Juergen's patch[1]. 
>    My patch is 'dariof', in the spreadsheet; Juergen's is 'jgross'.
> 
> Here are the numbers:
> 
>  https://docs.google.com/spreadsheets/d/17djcVV3FkmHmv1FKFBe9CQFnNgVumnM2U64MNvjzAn8/edit?usp=sharing
> 
> (If anyone has issues with googledocs, tell me, and I'll try
> cutting-&-pasting in email, as I did the other time.)
> 
> A few comments:
>  * both the patches bring performance improvements. The only 
>    regression seems to happen in hackbench, when running with -g1. 
>    That is certainly not the typical use case of the benchmark, but we 
>    certainly can try figuring out better what happens in that case;
>  * the two patches were supposed to provide almost identical results, 
>    and they actually do that, in most cases (e.g., all the instances 
>    of Unixbench);
>  * when there are differences, it is hard to see a trend, or, in 
>    general, to identify a possible reason by looking at differences 
>    between the patches themselves, at least as far as these data are 
>    concerned. In fact, in the "make xen" case, for instance, 'jgross'
>    is better when building with -j20 and -j24, while 'dariof' is
>    better when building with -j48 and -j62 (the host having 48 pCPUs).
>    In the hackbench case, 'dariof' is better in the least concurrent
>    case, 'jgross' is better in the other three.
>    This all may well be due to some different and independent 
>    factor... Perhaps, a little bit more of investigation is necessary 
>    (and I'm up for it).
> 
> IMO, future steps are:
>  a) running benchmarks in a guest
>  b) running benchmarks in more guests, and when overloading at the Xen 
>     level (i.e., having more vCPUs around than the host has pCPUs)
>  c) tracing and/or collecting stats (e.g., from perf and xenalyze)
> 
> I'm already working on a) and b).
> 
> As far as which approach (mine or Juergen's) to adopt, I'm not sure,
> and it does not seem to make much difference, at least from the
> performance point of view. I don't have any particular issues with
> Juergen's patch, apart from the fact that I'm not yet sure how it makes
> the scheduling domains creation code behave. I can look into that and
> report.
> 
> Also, this is all for PV guests. Any thoughts on what the best route
> would be for HVM ones?

Perhaps the same? What I presume we want is for each CPU to look
exactly the same from the scheduling perspective. That is, there
should be no penalty in moving a task from one CPU to another. Right
now, though, the Linux scheduler will not move certain tasks. This is
due to how the topology looks on baremetal, where moving certain
tasks is prohibitive (say, moving a task from one core to another core
costs more than moving it from a core to its SMT sibling).
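
As a rough illustration of that point, here is a toy model (not kernel
code). The Cpu type, the migration_cost() helper and the cost numbers are
all made up for the example. It only shows what "flat" means here: with
the hierarchy in place, a cross-core move looks more expensive than a
move to an SMT sibling, while with a flattened topology every move looks
the same.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    """A logical CPU identified by its core and SMT thread (hypothetical model)."""
    core: int
    thread: int


# Made-up relative costs, chosen only so that cross-core > SMT sibling.
COST_SMT_SIBLING = 1
COST_OTHER_CORE = 2


def migration_cost(src: Cpu, dst: Cpu, flat: bool = False) -> int:
    """Return the modelled cost of moving a task from src to dst.

    With flat=True every pair of distinct CPUs is treated as equidistant,
    which is the behaviour a flattened topology is meant to give a guest.
    """
    if src == dst:
        return 0
    if flat:
        return COST_SMT_SIBLING
    if src.core != dst.core:
        return COST_OTHER_CORE
    return COST_SMT_SIBLING


if __name__ == "__main__":
    a, sibling, other = Cpu(0, 0), Cpu(0, 1), Cpu(1, 0)
    for flat in (False, True):
        print(f"flat={flat}: sibling={migration_cost(a, sibling, flat)} "
              f"other core={migration_cost(a, other, flat)}")
```

On Xen the hierarchical costs are misleading, since the vCPUs behind
those "cores" and "threads" can be moved to different pCPUs at any time
unless they are pinned.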


> 
> [0] http://pastebin.com/KF5WyPKz
> [1] http://pastebin.com/xSFLbLwn
> 
> Regards,
> Dario
> -- 
> <<This happens because I choose it to happen!>> (Raistlin Majere)
> -----------------------------------------------------------------
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
>
