All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Nathan Lynch <nathanl@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Subject: Re: [RFC PATCH 1/2] powerpc/numa: Introduce logical numa id
Date: Mon, 10 Aug 2020 00:10:56 +0530	[thread overview]
Message-ID: <87eeof4q87.fsf@linux.ibm.com> (raw)
In-Reply-To: <6d880a50-09c4-d591-c88c-09fd77512ad3@linux.ibm.com>

"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:

> On 8/8/20 2:15 AM, Nathan Lynch wrote:
>> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
>>> On 8/7/20 9:54 AM, Nathan Lynch wrote:
>>>> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
>>>>> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
>>>>> index e437a9ac4956..6c659aada55b 100644
>>>>> --- a/arch/powerpc/mm/numa.c
>>>>> +++ b/arch/powerpc/mm/numa.c
>>>>> @@ -221,25 +221,51 @@ static void initialize_distance_lookup_table(int nid,
>>>>>    	}
>>>>>    }
>>>>>    
>>>>> +static u32 nid_map[MAX_NUMNODES] = {[0 ... MAX_NUMNODES - 1] =  NUMA_NO_NODE};
>>>>
>>>> It's odd to me to use MAX_NUMNODES for this array when it's going to be
>>>> indexed not by Linux's logical node IDs but by the platform-provided
>>>> domain number, which has no relation to MAX_NUMNODES.
>>>
>>>
>>> I didn't want to dynamically allocate this. We could fetch
>>> "ibm,max-associativity-domains" to find the size for that. The current
>>> code do assume  firmware group id to not exceed MAX_NUMNODES. Hence kept
>>> the array size to be MAX_NUMNODEs. I do agree that it is confusing. May
>>> be we can do #define MAX_AFFINITY_DOMAIN MAX_NUMNODES?
>> 
>> Well, consider:
>> 
>> - ibm,max-associativity-domains can change at runtime with LPM. This
>>    doesn't happen in practice yet, but we should probably start thinking
>>    about how to support that.
>> - The domain numbering isn't clearly specified to have any particular
>>    properties such as beginning at zero or a contiguous range.
>> 
>> While the current code likely contains assumptions contrary to these
>> points, a change such as this is an opportunity to think about whether
>> those assumptions can be reduced or removed. In particular I think it
>> would be good to gracefully degrade when the number of NUMA affinity
>> domains can exceed MAX_NUMNODES. Using the platform-supplied domain
>> numbers to directly index Linux data structures will make that
>> impossible.
>> 
>> So, maybe genradix or even xarray wouldn't actually be overengineering
>> here.
>> 
>
> One of the challenges with such a data structure is that we initialize 
> the nid_map before the slab is available. This means a memblock based 
> allocation and we would end up implementing such a sparse data structure 
> ourselves here.
>
> As you mentioned above, since we know that hypervisor as of now limits 
> the max affinity domain id below ibm,max-associativity-domains we are 
> good with an array-like nid_map we have here. This keeps the code simpler.
>
> This will also allow us to switch to a more sparse data structure as you 
> requested here in the future because the main change that is pushed in 
> this series is the usage of firmare_group_id_to_nid(). The details of 
> the data structure we use to keep track of that mapping are pretty much 
> internal to that function.

How about this? This makes it not a direct index. But it do limit the
search to max numa node on the system. 

static int domain_id_map[MAX_NUMNODES] = {[0 ... MAX_NUMNODES - 1] =  -1 };

static int __affinity_domain_to_nid(int domain_id, int max_nid)
{
	int i;

	for (i = 0; i < max_nid; i++) {
		if (domain_id_map[i] == domain_id)
			return i;
	}
	return NUMA_NO_NODE;
}

int affinity_domain_to_nid(struct affinity_domain *domain)
{
	int nid, domain_id;
	static int last_nid = 0;
	static DEFINE_SPINLOCK(node_id_lock);

	domain_id = domain->id;
	/*
	 * For PowerNV we don't change the node id. This helps to avoid
	 * confusion w.r.t the expected node ids. On pseries, node numbers
	 * are virtualized. Hence do logical node id for pseries.
	 */
	if (!firmware_has_feature(FW_FEATURE_LPAR))
		return domain_id;

	if (domain_id ==  -1 || last_nid == MAX_NUMNODES)
		return NUMA_NO_NODE;

	nid = __affinity_domain_to_nid(domain_id, last_nid);

	if (nid == NUMA_NO_NODE) {
		spin_lock(&node_id_lock);
		/*  recheck with lock held */
		nid = __affinity_domain_to_nid(domain_id, last_nid);
		if (nid == NUMA_NO_NODE) {
			nid = last_nid++;
			domain_id_map[nid] = domain_id;
		}
		spin_unlock(&node_id_lock);
	}

	return nid;
}

  reply	other threads:[~2020-08-09 18:42 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-31 11:19 [RFC PATCH 1/2] powerpc/numa: Introduce logical numa id Aneesh Kumar K.V
2020-07-31 11:22 ` [RFC PATCH 2/2] powerpc/powernv/cpufreq: Don't assume chip id is same as Linux node id Aneesh Kumar K.V
2020-08-04  7:47   ` Gautham R Shenoy
2020-08-01  5:20 ` [RFC PATCH 1/2] powerpc/numa: Introduce logical numa id Srikar Dronamraju
2020-08-02 14:21   ` Aneesh Kumar K.V
2020-08-04  7:25     ` Srikar Dronamraju
2020-08-06 10:44       ` Aneesh Kumar K.V
2020-08-10  8:05         ` Srikar Dronamraju
2020-08-07  4:24 ` Nathan Lynch
2020-08-07  5:02   ` Aneesh Kumar K.V
2020-08-07 20:45     ` Nathan Lynch
2020-08-09 14:12       ` Aneesh Kumar K.V
2020-08-09 18:40         ` Aneesh Kumar K.V [this message]
2020-08-13 22:53           ` Nathan Lynch

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87eeof4q87.fsf@linux.ibm.com \
    --to=aneesh.kumar@linux.ibm.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=nathanl@linux.ibm.com \
    --cc=srikar@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.