From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
To: Valentin Schneider <valentin.schneider@arm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>,
Gautham R Shenoy <ego@linux.vnet.ibm.com>,
Michael Neuling <mikey@neuling.org>,
Peter Zijlstra <peterz@infradead.org>,
LKML <linux-kernel@vger.kernel.org>,
Nicholas Piggin <npiggin@gmail.com>,
Oliver O'Halloran <oohall@gmail.com>,
Jordan Niethe <jniethe5@gmail.com>,
linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH v4 09/10] Powerpc/smp: Create coregroup domain
Date: Wed, 29 Jul 2020 11:43:55 +0530
Message-ID: <20200729061355.GA14603@linux.vnet.ibm.com>
In-Reply-To: <jhjr1sviswg.mognet@arm.com>
* Valentin Schneider <valentin.schneider@arm.com> [2020-07-28 16:03:11]:
Hi Valentin,
Thanks for looking into the patches.
> On 27/07/20 06:32, Srikar Dronamraju wrote:
> > Add percpu coregroup maps and masks to create coregroup domain.
> > If a coregroup doesn't exist, the coregroup domain will be degenerated
> > in favour of SMT/CACHE domain.
> >
>
> So there's at least one arm64 platform out there with the same "pairs of
> cores share L2" thing (Ampere eMAG), and that lives quite happily with the
> default scheduler topology (SMT/MC/DIE). Each pair of core gets its MC
> domain, and the whole system is covered by DIE.
>
> Now arguably it's not a perfect representation; DIE doesn't have
> SD_SHARE_PKG_RESOURCES so the highest level sd_llc can point to is MC. That
> will impact all callsites using cpus_share_cache(): in the eMAG case, only
> pairs of cores will be seen as sharing cache, even though *all* cores share
> the same L3.
>
Okay, it's good to know that we have a chip which is similar to P9 in
topology.
> I'm trying to paint a picture of what the P9 topology looks like (the one
> you showcase in your cover letter) to see if there are any similarities;
> from what I gather in [1], wikichips and your cover letter, with P9 you can
> have something like this in a single DIE (somewhat unsure about L3 setup;
> it looks to be distributed?)
>
> +---------------------------------------------------------------------+
> |                                 L3                                  |
> +---------------+-+---------------+-+---------------+-+---------------+
> |      L2       | |      L2       | |      L2       | |      L2       |
> +------+-+------+ +------+-+------+ +------+-+------+ +------+-+------+
> |  L1  | |  L1  | |  L1  | |  L1  | |  L1  | |  L1  | |  L1  | |  L1  |
> +------+ +------+ +------+ +------+ +------+ +------+ +------+ +------+
> |4 CPUs| |4 CPUs| |4 CPUs| |4 CPUs| |4 CPUs| |4 CPUs| |4 CPUs| |4 CPUs|
> +------+ +------+ +------+ +------+ +------+ +------+ +------+ +------+
>
> Which would lead to (ignoring the whole SMT CPU numbering shenanigans)
>
> NUMA     [                                                ...
> DIE      [                                             ]
> MC       [         ] [         ] [         ] [         ]
> BIGCORE  [         ] [         ] [         ] [         ]
> SMT      [   ] [   ] [   ] [   ] [   ] [   ] [   ] [   ]
>          00-03 04-07 08-11 12-15 16-19 20-23 24-27 28-31  <other node here>
>
What you have summed up is exactly what a P9 topology looks like. I don't
think I could have explained it better than this.
> This however has MC == BIGCORE; what makes it you can have different spans
> for these two domains? If it's not too much to ask, I'd love to have a P9
> topology diagram.
>
> [1]: 20200722081822.GG9290@linux.vnet.ibm.com
At this time the current topology would be good enough, i.e. BIGCORE would
always be equal to MC. However, in future we could have chips with fewer or
more CPUs in the LLC than in a BIGCORE, or we could have granular or split
L3 caches within a DIE. In such a case BIGCORE != MC.

Also, in the current P9 itself, two neighbouring core-pairs form a quad.
Cache latency within a quad is better than latency to a distant core-pair,
and cache latency within a core-pair is far better than latency within a
quad. So if we have only 4 threads running on a DIE, all of them accessing
the same cache lines, then we would probably benefit if all the tasks were
to run within the quad, aka MC/Coregroup.

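To make the locality levels above concrete, here is a minimal userspace
sketch (not kernel code). It assumes the linear CPU numbering from the quoted
diagram: 4 CPUs per core, 2 cores per core-pair and 2 core-pairs per quad.
The constants, the enum and the helper names are all hypothetical and are
not taken from the patch series.

```c
#include <assert.h>

/* Assumed layout, following the quoted diagram (not read from firmware). */
enum {
	THREADS_PER_CORE = 4,	/* 4 CPUs per core */
	CORES_PER_PAIR   = 2,	/* two cores form a core-pair */
	PAIRS_PER_QUAD   = 2,	/* two neighbouring core-pairs form a quad */
};

/* Closest level shared by two CPUs, ordered from best to worst latency. */
enum locality { SAME_CORE, SAME_CORE_PAIR, SAME_QUAD, DISTANT };

static int core_of(int cpu) { return cpu / THREADS_PER_CORE; }
static int pair_of(int cpu) { return core_of(cpu) / CORES_PER_PAIR; }
static int quad_of(int cpu) { return pair_of(cpu) / PAIRS_PER_QUAD; }

/* Classify how close two CPUs are under the assumed numbering. */
static enum locality cpu_locality(int a, int b)
{
	assert(a >= 0 && b >= 0);

	if (core_of(a) == core_of(b))
		return SAME_CORE;
	if (pair_of(a) == pair_of(b))
		return SAME_CORE_PAIR;
	if (quad_of(a) == quad_of(b))
		return SAME_QUAD;
	return DISTANT;
}
```

Under these assumptions CPUs 0 and 8 land in the same quad while CPUs 0 and
16 do not, so four tasks sharing cache lines would ideally be kept within
CPUs 00-15 rather than spread across the DIE.
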
I have found that some latency-sensitive benchmarks benefit from having a
grouping at quad level (using kernel hacks and not backed by firmware
changes). Gautham also found similar results in his experiments, but he
only used binding within the stock kernel.

I am not setting SD_SHARE_PKG_RESOURCES in the MC/Coregroup sd_flags, as
the MC domain need not be the LLC domain on Power.
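
The effect of leaving the flag unset can be illustrated with a small
userspace model of the LLC selection discussed above: walk up from the
lowest domain and keep the highest level that still carries the flag,
stopping at the first level without it. This is only a sketch under stated
assumptions. The struct, the flag value, the level names and their flag
assignment are hypothetical and simplified, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit for this model only; not the kernel's value. */
#define TOY_SHARE_PKG_RESOURCES 0x1u

struct toy_domain {
	const char *name;
	unsigned int flags;
};

/*
 * Walk from the lowest level upwards and return the index of the highest
 * level that still has the flag, stopping at the first level without it.
 * Returns -1 if the lowest level lacks the flag or there are no levels.
 */
static int toy_llc_level(const struct toy_domain *levels, size_t n)
{
	int llc = -1;

	assert(levels != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!(levels[i].flags & TOY_SHARE_PKG_RESOURCES))
			break;
		llc = (int)i;
	}
	return llc;
}

/* Assumed flag assignment for illustration: MC/Coregroup left unset. */
static const struct toy_domain toy_power_levels[] = {
	{ "SMT",   TOY_SHARE_PKG_RESOURCES },
	{ "CACHE", TOY_SHARE_PKG_RESOURCES },
	{ "MC",    0 },
	{ "DIE",   0 },
};
```

With this assumed assignment the model stops at index 1 (CACHE), so adding
an MC/Coregroup level without the flag does not move the LLC level.
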
--
Thanks and Regards
Srikar Dronamraju