linuxppc-dev.lists.ozlabs.org archive mirror
From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
To: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>,
	Oliver OHalloran <oliveroh@au1.ibm.com>,
	Michael Neuling <mikey@linux.ibm.com>,
	Michael Ellerman <michaele@au1.ibm.com>,
	Anton Blanchard <anton@au1.ibm.com>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	Nick Piggin <npiggin@au1.ibm.com>
Subject: Re: [PATCH 02/11] powerpc/smp: Merge Power9 topology with Power topology
Date: Mon, 20 Jul 2020 13:40:52 +0530
Message-ID: <20200720081052.GF21103@linux.vnet.ibm.com>
In-Reply-To: <20200717054436.GB25851@in.ibm.com>

* Gautham R Shenoy <ego@linux.vnet.ibm.com> [2020-07-17 11:14:36]:

> Hi Srikar,
> 
> On Tue, Jul 14, 2020 at 10:06:15AM +0530, Srikar Dronamraju wrote:
> > A new sched_domain_topology_level was added just for Power9. However, the
> > same can be achieved by merging powerpc_topology with power9_topology,
> > which makes the code simpler, especially when adding a new sched
> > domain.
> > 
> > Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
> > Cc: Michael Ellerman <michaele@au1.ibm.com>
> > Cc: Nick Piggin <npiggin@au1.ibm.com>
> > Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
> > Cc: Nathan Lynch <nathanl@linux.ibm.com>
> > Cc: Michael Neuling <mikey@linux.ibm.com>
> > Cc: Anton Blanchard <anton@au1.ibm.com>
> > Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
> > Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
> > Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> > ---
> >  arch/powerpc/kernel/smp.c | 33 ++++++++++-----------------------
> >  1 file changed, 10 insertions(+), 23 deletions(-)
> > 
> > diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> > index 680c0edcc59d..069ea4b21c6d 100644
> > --- a/arch/powerpc/kernel/smp.c
> > +++ b/arch/powerpc/kernel/smp.c
> > @@ -1315,7 +1315,7 @@ int setup_profiling_timer(unsigned int multiplier)
> >  }
> > 
> >  #ifdef CONFIG_SCHED_SMT
> > -/* cpumask of CPUs with asymetric SMT dependancy */
> > +/* cpumask of CPUs with asymmetric SMT dependency */
> >  static int powerpc_smt_flags(void)
> >  {
> >  	int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
> > @@ -1328,14 +1328,6 @@ static int powerpc_smt_flags(void)
> >  }
> >  #endif
> > 
> > -static struct sched_domain_topology_level powerpc_topology[] = {
> > -#ifdef CONFIG_SCHED_SMT
> > -	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
> > -#endif
> > -	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
> > -	{ NULL, },
> > -};
> > -
> >  /*
> >   * P9 has a slightly odd architecture where pairs of cores share an L2 cache.
> >   * This topology makes it *much* cheaper to migrate tasks between adjacent cores
> > @@ -1353,7 +1345,13 @@ static int powerpc_shared_cache_flags(void)
> >   */
> >  static const struct cpumask *shared_cache_mask(int cpu)
> >  {
> > -	return cpu_l2_cache_mask(cpu);
> > +	if (shared_caches)
> > +		return cpu_l2_cache_mask(cpu);
> > +
> > +	if (has_big_cores)
> > +		return cpu_smallcore_mask(cpu);
> > +
> > +	return cpu_smt_mask(cpu);
> >  }
> > 
> >  #ifdef CONFIG_SCHED_SMT
> > @@ -1363,7 +1361,7 @@ static const struct cpumask *smallcore_smt_mask(int cpu)
> >  }
> >  #endif
> > 
> > -static struct sched_domain_topology_level power9_topology[] = {
> > +static struct sched_domain_topology_level powerpc_topology[] = {
> 
> 
> >  #ifdef CONFIG_SCHED_SMT
> >  	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
> >  #endif
> > @@ -1388,21 +1386,10 @@ void __init smp_cpus_done(unsigned int max_cpus)
> >  #ifdef CONFIG_SCHED_SMT
> >  	if (has_big_cores) {
> >  		pr_info("Big cores detected but using small core scheduling\n");
> > -		power9_topology[0].mask = smallcore_smt_mask;
> >  		powerpc_topology[0].mask = smallcore_smt_mask;
> >  	}
> >  #endif
> > -	/*
> > -	 * If any CPU detects that it's sharing a cache with another CPU then
> > -	 * use the deeper topology that is aware of this sharing.
> > -	 */
> > -	if (shared_caches) {
> > -		pr_info("Using shared cache scheduler topology\n");
> > -		set_sched_topology(power9_topology);
> > -	} else {
> > -		pr_info("Using standard scheduler topology\n");
> > -		set_sched_topology(powerpc_topology);
> 
> 
> Ok, so we will go with the three level topology by default (SMT,
> CACHE, DIE) and will rely on the sched-domain creation code to
> degenerate the CACHE domain in case SMT and CACHE have the same set of
> CPUs (e.g. on POWER8).
> 

Right.
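
To make the degeneration case concrete, here is a minimal userspace sketch. It is not kernel code: the toy_* names, the 64-bit mask type and the cores_per_l2 field are made up for illustration, and the has_big_cores branch of the patch is left out. It only models the idea that the CACHE level falls back to the SMT span when shared_caches is not set, so both levels cover the same CPUs and the CACHE level carries no extra information. The scheduler's real degeneration check also looks at domain flags, which this sketch ignores.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model: one bit per CPU, at most 64 CPUs. Not a kernel cpumask. */
typedef uint64_t toy_mask_t;

/* Hypothetical description of a machine; field names are illustrative. */
struct toy_topology {
	unsigned int threads_per_core;	/* SMT width of one core */
	unsigned int cores_per_l2;	/* cores sharing one L2 cache */
	bool shared_caches;		/* mirrors the flag named in the patch */
};

/* Mask with 'count' consecutive bits set, starting at bit 'first'. */
static toy_mask_t toy_range_mask(unsigned int first, unsigned int count)
{
	toy_mask_t bits = (count >= 64) ? ~(toy_mask_t)0
					: (((toy_mask_t)1 << count) - 1);

	return bits << first;
}

/* CPUs that are SMT siblings of 'cpu'. */
static toy_mask_t toy_smt_mask(const struct toy_topology *t, unsigned int cpu)
{
	unsigned int first = cpu - (cpu % t->threads_per_core);

	return toy_range_mask(first, t->threads_per_core);
}

/* CACHE level span: the L2 group if caches are shared, else the SMT span. */
static toy_mask_t toy_cache_mask(const struct toy_topology *t, unsigned int cpu)
{
	unsigned int width;

	if (!t->shared_caches)
		return toy_smt_mask(t, cpu);

	width = t->threads_per_core * t->cores_per_l2;
	return toy_range_mask(cpu - (cpu % width), width);
}

/* True when the CACHE level spans exactly the SMT siblings of 'cpu'. */
static bool toy_cache_level_degenerates(const struct toy_topology *t,
					unsigned int cpu)
{
	return toy_cache_mask(t, cpu) == toy_smt_mask(t, cpu);
}
```

With threads_per_core = 8 and shared_caches = false (a POWER8-like layout), the CACHE span equals the SMT span for every CPU. With threads_per_core = 4, cores_per_l2 = 2 and shared_caches = true (a POWER9-like layout), the CACHE span is twice as wide and the level is kept.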

> From a cleanup perspective this is better, since we won't have to
> worry about defining multiple topology structures, but from a
> performance point of view, wouldn't we now pay an extra penalty of
> degenerating the CACHE domains on POWER8 kind of systems, each time
> a CPU comes online?
> 

So we either end up adding a topology definition for each new topology
we support, or we have to take the extra penalty.

But going ahead

> Do we know how bad it is ? If the degeneration takes a few extra
> microseconds, that should be ok I suppose.
> 

It certainly will add to the penalty, though I haven't captured
per-degeneration statistics. However, I ran an experiment where I run
ppc64_cpu --smt=8, followed by ppc64_cpu --smt=1, in a loop of 100 iterations.
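
A loop like the one described could be timed with something along these lines. This is only a sketch under assumptions: the mail does not say how the numbers were collected or what unit they are in, so wall-clock seconds via clock_gettime() is a guess, and the function names are made up. The two ppc64_cpu invocations are the ones named above; they need root and the powerpc-utils package.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Difference between two timestamps, in seconds. */
static double elapsed_seconds(struct timespec start, struct timespec end)
{
	return (double)(end.tv_sec - start.tv_sec) +
	       (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Run one shell command; return wall-clock seconds, or -1.0 on failure. */
static double time_command(const char *cmd)
{
	struct timespec start, end;

	if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
		return -1.0;
	if (system(cmd) != 0)
		return -1.0;
	if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
		return -1.0;

	return elapsed_seconds(start, end);
}

/* Toggle SMT up and down 'iterations' times, printing one sample per line. */
static int run_toggle_loop(int iterations)
{
	for (int i = 0; i < iterations; i++) {
		double up = time_command("ppc64_cpu --smt=8");
		double down = time_command("ppc64_cpu --smt=1");

		if (up < 0.0 || down < 0.0)
			return -1;
		printf("%d smt=8 %.2f smt=1 %.2f\n", i, up, down);
	}
	return 0;
}
```

Calling run_toggle_loop(100) from main() would produce 100 samples per direction, which can then be summarised into min, max, median, average and standard deviation.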

On a POWER8 system with 256 CPUs and 8 nodes:

Architecture:        ppc64le
Byte Order:          Little Endian
CPU(s):              256
On-line CPU(s) list: 0-255
Thread(s) per core:  8
Core(s) per socket:  4
Socket(s):           8
NUMA node(s):        8
Model:               2.1 (pvr 004b 0201)
Model name:          POWER8 (architected), altivec supported
Hypervisor vendor:   pHyp
Virtualization type: para
L1d cache:           64K
L1i cache:           32K
L2 cache:            512K
L3 cache:            8192K
NUMA node0 CPU(s):   0-31
NUMA node1 CPU(s):   32-63
NUMA node2 CPU(s):   64-95
NUMA node3 CPU(s):   96-127
NUMA node4 CPU(s):   128-159
NUMA node5 CPU(s):   160-191
NUMA node6 CPU(s):   192-223
NUMA node7 CPU(s):   224-255

ppc64_cpu --smt=1
    N           Min           Max        Median           Avg        Stddev
x 100         38.17         53.78         46.81       46.6766     2.8421603

x 100         41.34         58.24         48.35       47.9649     3.6866087

ppc64_cpu --smt=8
    N           Min           Max        Median           Avg        Stddev
x 100         57.43         75.88         60.61       61.0246      2.418685

x 100         58.21         79.24         62.59       63.3326     3.4094558
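
The rows above are not labelled. Assuming the first row of each pair is without the patch and the second is with it (an assumption, the mail does not say so), the relative change in the averages can be worked out with a small helper such as this one. The function name is made up for illustration.

```c
#include <stdio.h>

/* Relative change of 'patched_avg' over 'baseline_avg', as a fraction. */
static double relative_overhead(double baseline_avg, double patched_avg)
{
	return (patched_avg - baseline_avg) / baseline_avg;
}

/* Print the change for both directions, using the Avg column above. */
static void print_overheads(void)
{
	/* Assumption: first row = baseline, second row = with the patch. */
	printf("smt=1: %.2f%%\n", 100.0 * relative_overhead(46.6766, 47.9649));
	printf("smt=8: %.2f%%\n", 100.0 * relative_overhead(61.0246, 63.3326));
}
```

Under that assumption the averages differ by roughly 2.8% for --smt=1 and 3.8% for --smt=8. In both cases the absolute difference in averages (about 1.3 and 2.3) is smaller than the reported standard deviations.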

But once we clean up, we could add ways to fix up topologies so that we
reverse the overhead.

-- 
Thanks and Regards
Srikar Dronamraju
