From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: Nathan Lynch <nathanl@linux.ibm.com>,
Daniel Henrique Barboza <danielhb413@gmail.com>,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [RFC PATCH 8/8] powerpc/papr_scm: Use FORM2 associativity details
Date: Tue, 15 Jun 2021 12:35:17 +0530
Message-ID: <87a6nrobf6.fsf@linux.ibm.com>
In-Reply-To: <YMhKEJ9WSlapuSE6@yekko>

David Gibson <david@gibson.dropbear.id.au> writes:
> On Tue, Jun 15, 2021 at 11:27:50AM +0530, Aneesh Kumar K.V wrote:
>> David Gibson <david@gibson.dropbear.id.au> writes:
>>
>> > On Mon, Jun 14, 2021 at 10:10:03PM +0530, Aneesh Kumar K.V wrote:
>> >> FORM2 introduces a concept of secondary domain which is identical to the
>> >> concept of FORM1 primary domain. Use secondary domain as the numa node
>> >> when using persistent memory device. For DAX kmem use the logical domain
>> >> id introduced in FORM2. This new numa node
>> >>
>> >> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> >> ---
>> >> arch/powerpc/mm/numa.c | 28 +++++++++++++++++++++++
>> >> arch/powerpc/platforms/pseries/papr_scm.c | 26 +++++++++++++--------
>> >> arch/powerpc/platforms/pseries/pseries.h | 1 +
>> >> 3 files changed, 45 insertions(+), 10 deletions(-)
>> >>
>> >> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
>> >> index 86cd2af014f7..b9ac6d02e944 100644
>> >> --- a/arch/powerpc/mm/numa.c
>> >> +++ b/arch/powerpc/mm/numa.c
>> >> @@ -265,6 +265,34 @@ static int associativity_to_nid(const __be32 *associativity)
>> >>  	return nid;
>> >>  }
>> >>
>> >> +int get_primary_and_secondary_domain(struct device_node *node, int *primary, int *secondary)
>> >> +{
>> >> +	int secondary_index;
>> >> +	const __be32 *associativity;
>> >> +
>> >> +	if (!numa_enabled) {
>> >> +		*primary = NUMA_NO_NODE;
>> >> +		*secondary = NUMA_NO_NODE;
>> >> +		return 0;
>> >> +	}
>> >> +
>> >> +	associativity = of_get_associativity(node);
>> >> +	if (!associativity)
>> >> +		return -ENODEV;
>> >> +
>> >> +	if (of_read_number(associativity, 1) >= primary_domain_index) {
>> >> +		*primary = of_read_number(&associativity[primary_domain_index], 1);
>> >> +		secondary_index = of_read_number(&distance_ref_points[1], 1);
>> >
>> > Secondary ID is always the second reference point, but primary depends
>> > on the length of resources? That seems very weird.
>>
>> primary_domain_index is distance_ref_point[0]. With Form2 we would find
>> both primary and secondary domain ID same for all resources other than
>> persistent memory device. The usage w.r.t. persistent memory is
>> explained in patch 7.
>
> Right, I misunderstood
>
>>
>> With Form2 the primary domainID and secondary domainID are used to identify the NUMA nodes
>> the kernel should use when using persistent memory devices.
>
> This seems kind of bogus. With Form1, the primary/secondary ID are a
> sort of hierarchy of distance (things with same primary ID are very
> close, things with same secondary are kinda-close, etc.). With Form2,
> it's referring to their effective node for different purposes.
>
> Using the same terms for different meanings seems unnecessarily
> confusing.

They are essentially domainIDs. Their interpretation differs between
Form1 and Form2, hence I kept referring to them as primary and
secondary domainID. Any suggestions on what to name them with Form2?

>
>> Persistent memory devices
>> can also be used as regular memory using DAX KMEM driver and primary domainID indicates
>> the numa node number OS should use when using these devices as regular memory. Secondary
>> domainID is the numa node number that should be used when using this device as
>> persistent memory.
>
> It's weird to me that you'd want to consider them in different nodes
> for those different purposes.

 ---------------------------------------
|                            NUMA node0 |
|    ProcA -----> MEMA                  |
|      |                                |
|      |                                |
|      -------------------> PMEMB       |
|                                       |
 ---------------------------------------

 ---------------------------------------
|                            NUMA node1 |
|                                       |
|    ProcB -------> MEMC                |
|      |                                |
|      -------------------> PMEMD       |
|                                       |
|                                       |
 ---------------------------------------

For a topology like the above, an application running on ProcA wants to
find the persistent memory mount local to its NUMA node. Hence, when
using it as a pmem fsdax mount or devdax device, we want PMEMB to have
the associativity of NUMA node0 and PMEMD to have the associativity of
NUMA node1. But when we want to use it as memory via the dax kmem
driver, we want both PMEMB and PMEMD to appear as memory-only NUMA
nodes at a distance that is derived from the latency of the media.

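To make the two roles concrete, here is a rough userspace sketch of the
lookup and the per-usage node selection described above. It is not the
patch code: the associativity array is modelled in host byte order, and
`enum pmem_usage`, `lookup_domains()`, `pick_node()` and `NO_NODE` are
hypothetical names used only for illustration. The extra bounds check on
the secondary index is also an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_NODE (-1)	/* stand-in for the kernel's NUMA_NO_NODE */

/* How the persistent memory region is consumed (hypothetical enum). */
enum pmem_usage {
	PMEM_AS_PERSISTENT,	/* fsdax mount or devdax device */
	PMEM_AS_KMEM,		/* used as regular memory via dax kmem */
};

/*
 * Host-order model of an associativity array: assoc[0] is the number of
 * entries that follow and assoc[1..assoc[0]] are domain IDs.
 * Returns 0 on success, -1 if either index lies beyond the array.
 */
int lookup_domains(const uint32_t *assoc, uint32_t primary_index,
		   uint32_t secondary_index, int *primary, int *secondary)
{
	assert(assoc != NULL && primary != NULL && secondary != NULL);

	*primary = NO_NODE;
	*secondary = NO_NODE;

	/* Assumption of this sketch: both indexes must be in range. */
	if (assoc[0] < primary_index || assoc[0] < secondary_index)
		return -1;

	*primary = (int)assoc[primary_index];
	*secondary = (int)assoc[secondary_index];
	return 0;
}

/*
 * Primary domainID: node to use when the device is regular memory.
 * Secondary domainID: node to use when the device is persistent memory.
 */
int pick_node(enum pmem_usage usage, int primary, int secondary)
{
	return usage == PMEM_AS_KMEM ? primary : secondary;
}
```

With a made-up array such as { 4, 9, 9, 0, 40 }, primary index 4 and
secondary index 3, the device would be placed on node 40 for dax kmem
and on node 0 for fsdax/devdax.
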
>
>> In the latter case, we are interested in the locality of the
>> device to an established numa node. In the above example, if the last row represents a
>> persistent memory device/resource, NUMA node number 40 will be used when using the device
>> as regular memory and NUMA node number 0 will be the device numa node when using it as
>> a persistent memory device.
>
> I don't really get what you mean by "locality of the device to an
> established numa node". Or at least how that's different from
> anything else we're handling here.
-aneesh