All of lore.kernel.org
 help / color / mirror / Atom feed
From: Gautham R Shenoy <ego@linux.vnet.ibm.com>
To: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>,
	"Gautham R. Shenoy" <ego@linux.vnet.ibm.com>,
	Michael Neuling <mikey@neuling.org>,
	Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-kernel@vger.kernel.org, Nicholas Piggin <npiggin@gmail.com>,
	linuxppc-dev@lists.ozlabs.org,
	Valentin Schneider <valentin.schneider@arm.com>
Subject: Re: [PATCH 1/3] powerpc/smp: Parse ibm,thread-groups with multiple properties
Date: Tue, 8 Dec 2020 22:55:40 +0530	[thread overview]
Message-ID: <20201208172540.GA14206@in.ibm.com> (raw)
In-Reply-To: <20201207121042.GH528281@linux.vnet.ibm.com>

Hello Srikar,

Thanks for taking a look at the patch.

On Mon, Dec 07, 2020 at 05:40:42PM +0530, Srikar Dronamraju wrote:
> * Gautham R. Shenoy <ego@linux.vnet.ibm.com> [2020-12-04 10:18:45]:
> 
> > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> <snipped>
> 
> > 
> >  static int parse_thread_groups(struct device_node *dn,
> > -			       struct thread_groups *tg,
> > -			       unsigned int property)
> > +			       struct thread_groups_list *tglp)
> >  {
> > -	int i;
> > -	u32 thread_group_array[3 + MAX_THREAD_LIST_SIZE];
> > +	int i = 0;
> > +	u32 *thread_group_array;
> >  	u32 *thread_list;
> >  	size_t total_threads;
> > -	int ret;
> > +	int ret = 0, count;
> > +	unsigned int property_idx = 0;
> 
> NIT:
> tglx mentions in one of his recent comments to try keep a reverse fir tree
> ordering of variables where possible.

I suppose you mean moving the longer local variable declarations to to
the top and shorter ones to the bottom. Thanks. Will fix this.


> 
> > 
> > +	count = of_property_count_u32_elems(dn, "ibm,thread-groups");
> > +	thread_group_array = kcalloc(count, sizeof(u32), GFP_KERNEL);
> >  	ret = of_property_read_u32_array(dn, "ibm,thread-groups",
> > -					 thread_group_array, 3);
> > +					 thread_group_array, count);
> >  	if (ret)
> > -		return ret;
> > -
> > -	tg->property = thread_group_array[0];
> > -	tg->nr_groups = thread_group_array[1];
> > -	tg->threads_per_group = thread_group_array[2];
> > -	if (tg->property != property ||
> > -	    tg->nr_groups < 1 ||
> > -	    tg->threads_per_group < 1)
> > -		return -ENODATA;
> > +		goto out_free;
> > 
> > -	total_threads = tg->nr_groups * tg->threads_per_group;
> > +	while (i < count && property_idx < MAX_THREAD_GROUP_PROPERTIES) {
> > +		int j;
> > +		struct thread_groups *tg = &tglp->property_tgs[property_idx++];
> 
> NIT: same as above.

Ok.
> 
> > 
> > -	ret = of_property_read_u32_array(dn, "ibm,thread-groups",
> > -					 thread_group_array,
> > -					 3 + total_threads);
> > -	if (ret)
> > -		return ret;
> > +		tg->property = thread_group_array[i];
> > +		tg->nr_groups = thread_group_array[i + 1];
> > +		tg->threads_per_group = thread_group_array[i + 2];
> > +		total_threads = tg->nr_groups * tg->threads_per_group;
> > +
> > +		thread_list = &thread_group_array[i + 3];
> > 
> > -	thread_list = &thread_group_array[3];
> > +		for (j = 0; j < total_threads; j++)
> > +			tg->thread_list[j] = thread_list[j];
> > +		i = i + 3 + total_threads;
> 
> 	Can't we simply use memcpy instead?

We could. But this one makes it more explicit.


> 
> > +	}
> > 
> > -	for (i = 0 ; i < total_threads; i++)
> > -		tg->thread_list[i] = thread_list[i];
> > +	tglp->nr_properties = property_idx;
> > 
> > -	return 0;
> > +out_free:
> > +	kfree(thread_group_array);
> > +	return ret;
> >  }
> > 
> >  /*
> > @@ -805,24 +827,39 @@ static int get_cpu_thread_group_start(int cpu, struct thread_groups *tg)
> >  	return -1;
> >  }
> > 
> > -static int init_cpu_l1_cache_map(int cpu)
> > +static int init_cpu_cache_map(int cpu, unsigned int cache_property)
> > 
> >  {
> >  	struct device_node *dn = of_get_cpu_node(cpu, NULL);
> > -	struct thread_groups tg = {.property = 0,
> > -				   .nr_groups = 0,
> > -				   .threads_per_group = 0};
> > +	struct thread_groups *tg = NULL;
> >  	int first_thread = cpu_first_thread_sibling(cpu);
> >  	int i, cpu_group_start = -1, err = 0;
> > +	cpumask_var_t *mask;
> > +	struct thread_groups_list *cpu_tgl = &tgl[cpu];
> 
> NIT: same as 1st comment.

Sure, will fix this.

> 
> > 
> >  	if (!dn)
> >  		return -ENODATA;
> > 
> > -	err = parse_thread_groups(dn, &tg, THREAD_GROUP_SHARE_L1);
> > -	if (err)
> > -		goto out;
> > +	if (!(cache_property == THREAD_GROUP_SHARE_L1))
> > +		return -EINVAL;
> > 
> > -	cpu_group_start = get_cpu_thread_group_start(cpu, &tg);
> > +	if (!cpu_tgl->nr_properties) {
> > +		err = parse_thread_groups(dn, cpu_tgl);
> > +		if (err)
> > +			goto out;
> > +	}
> > +
> > +	for (i = 0; i < cpu_tgl->nr_properties; i++) {
> > +		if (cpu_tgl->property_tgs[i].property == cache_property) {
> > +			tg = &cpu_tgl->property_tgs[i];
> > +			break;
> > +		}
> > +	}
> > +
> > +	if (!tg)
> > +		return -EINVAL;
> > +
> > +	cpu_group_start = get_cpu_thread_group_start(cpu, tg);
> 
> This whole hunk should be moved to a new function and called before
> init_cpu_cache_map. It will simplify the logic to great extent.

I suppose you are referring to the part where we select the correct
tg. Yeah, that can move to a different helper.

> 
> > 
> >  	if (unlikely(cpu_group_start == -1)) {
> >  		WARN_ON_ONCE(1);
> > @@ -830,11 +867,12 @@ static int init_cpu_l1_cache_map(int cpu)
> >  		goto out;
> >  	}
> > 
> > -	zalloc_cpumask_var_node(&per_cpu(cpu_l1_cache_map, cpu),
> > -				GFP_KERNEL, cpu_to_node(cpu));
> > +	mask = &per_cpu(cpu_l1_cache_map, cpu);
> > +
> > +	zalloc_cpumask_var_node(mask, GFP_KERNEL, cpu_to_node(cpu));
> > 
> 
> This hunk (and the next hunk) should be moved to next patch.
>

The next patch is only about introducing  THREAD_GROUP_SHARE_L2. Hence
I put in any other code in this patch, since it seems to be a logical
place to collate whatever we have in a generic form.



> >  	for (i = first_thread; i < first_thread + threads_per_core; i++) {
> > -		int i_group_start = get_cpu_thread_group_start(i, &tg);
> > +		int i_group_start = get_cpu_thread_group_start(i, tg);
> > 
> >  		if (unlikely(i_group_start == -1)) {
> >  			WARN_ON_ONCE(1);
> > @@ -843,7 +881,7 @@ static int init_cpu_l1_cache_map(int cpu)
> >  		}
> > 
> >  		if (i_group_start == cpu_group_start)
> > -			cpumask_set_cpu(i, per_cpu(cpu_l1_cache_map, cpu));
> > +			cpumask_set_cpu(i, *mask);
> >  	}
> > 
> >  out:
> > @@ -924,7 +962,7 @@ static int init_big_cores(void)
> >  	int cpu;
> > 
> >  	for_each_possible_cpu(cpu) {
> > -		int err = init_cpu_l1_cache_map(cpu);
> > +		int err = init_cpu_cache_map(cpu, THREAD_GROUP_SHARE_L1);
> > 
> >  		if (err)
> >  			return err;
> > -- 
> > 1.9.4
> > 
> 
> -- 
> Thanks and Regards
> Srikar Dronamraju

WARNING: multiple messages have this Message-ID (diff)
From: Gautham R Shenoy <ego@linux.vnet.ibm.com>
To: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>,
	Anton Blanchard <anton@ozlabs.org>,
	Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Michael Neuling <mikey@neuling.org>,
	Nicholas Piggin <npiggin@gmail.com>,
	Nathan Lynch <nathanl@linux.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Valentin Schneider <valentin.schneider@arm.com>,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/3] powerpc/smp: Parse ibm,thread-groups with multiple properties
Date: Tue, 8 Dec 2020 22:55:40 +0530	[thread overview]
Message-ID: <20201208172540.GA14206@in.ibm.com> (raw)
In-Reply-To: <20201207121042.GH528281@linux.vnet.ibm.com>

Hello Srikar,

Thanks for taking a look at the patch.

On Mon, Dec 07, 2020 at 05:40:42PM +0530, Srikar Dronamraju wrote:
> * Gautham R. Shenoy <ego@linux.vnet.ibm.com> [2020-12-04 10:18:45]:
> 
> > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> <snipped>
> 
> > 
> >  static int parse_thread_groups(struct device_node *dn,
> > -			       struct thread_groups *tg,
> > -			       unsigned int property)
> > +			       struct thread_groups_list *tglp)
> >  {
> > -	int i;
> > -	u32 thread_group_array[3 + MAX_THREAD_LIST_SIZE];
> > +	int i = 0;
> > +	u32 *thread_group_array;
> >  	u32 *thread_list;
> >  	size_t total_threads;
> > -	int ret;
> > +	int ret = 0, count;
> > +	unsigned int property_idx = 0;
> 
> NIT:
> tglx mentions in one of his recent comments to try keep a reverse fir tree
> ordering of variables where possible.

I suppose you mean moving the longer local variable declarations to to
the top and shorter ones to the bottom. Thanks. Will fix this.


> 
> > 
> > +	count = of_property_count_u32_elems(dn, "ibm,thread-groups");
> > +	thread_group_array = kcalloc(count, sizeof(u32), GFP_KERNEL);
> >  	ret = of_property_read_u32_array(dn, "ibm,thread-groups",
> > -					 thread_group_array, 3);
> > +					 thread_group_array, count);
> >  	if (ret)
> > -		return ret;
> > -
> > -	tg->property = thread_group_array[0];
> > -	tg->nr_groups = thread_group_array[1];
> > -	tg->threads_per_group = thread_group_array[2];
> > -	if (tg->property != property ||
> > -	    tg->nr_groups < 1 ||
> > -	    tg->threads_per_group < 1)
> > -		return -ENODATA;
> > +		goto out_free;
> > 
> > -	total_threads = tg->nr_groups * tg->threads_per_group;
> > +	while (i < count && property_idx < MAX_THREAD_GROUP_PROPERTIES) {
> > +		int j;
> > +		struct thread_groups *tg = &tglp->property_tgs[property_idx++];
> 
> NIT: same as above.

Ok.
> 
> > 
> > -	ret = of_property_read_u32_array(dn, "ibm,thread-groups",
> > -					 thread_group_array,
> > -					 3 + total_threads);
> > -	if (ret)
> > -		return ret;
> > +		tg->property = thread_group_array[i];
> > +		tg->nr_groups = thread_group_array[i + 1];
> > +		tg->threads_per_group = thread_group_array[i + 2];
> > +		total_threads = tg->nr_groups * tg->threads_per_group;
> > +
> > +		thread_list = &thread_group_array[i + 3];
> > 
> > -	thread_list = &thread_group_array[3];
> > +		for (j = 0; j < total_threads; j++)
> > +			tg->thread_list[j] = thread_list[j];
> > +		i = i + 3 + total_threads;
> 
> 	Can't we simply use memcpy instead?

We could. But this one makes it more explicit.


> 
> > +	}
> > 
> > -	for (i = 0 ; i < total_threads; i++)
> > -		tg->thread_list[i] = thread_list[i];
> > +	tglp->nr_properties = property_idx;
> > 
> > -	return 0;
> > +out_free:
> > +	kfree(thread_group_array);
> > +	return ret;
> >  }
> > 
> >  /*
> > @@ -805,24 +827,39 @@ static int get_cpu_thread_group_start(int cpu, struct thread_groups *tg)
> >  	return -1;
> >  }
> > 
> > -static int init_cpu_l1_cache_map(int cpu)
> > +static int init_cpu_cache_map(int cpu, unsigned int cache_property)
> > 
> >  {
> >  	struct device_node *dn = of_get_cpu_node(cpu, NULL);
> > -	struct thread_groups tg = {.property = 0,
> > -				   .nr_groups = 0,
> > -				   .threads_per_group = 0};
> > +	struct thread_groups *tg = NULL;
> >  	int first_thread = cpu_first_thread_sibling(cpu);
> >  	int i, cpu_group_start = -1, err = 0;
> > +	cpumask_var_t *mask;
> > +	struct thread_groups_list *cpu_tgl = &tgl[cpu];
> 
> NIT: same as 1st comment.

Sure, will fix this.

> 
> > 
> >  	if (!dn)
> >  		return -ENODATA;
> > 
> > -	err = parse_thread_groups(dn, &tg, THREAD_GROUP_SHARE_L1);
> > -	if (err)
> > -		goto out;
> > +	if (!(cache_property == THREAD_GROUP_SHARE_L1))
> > +		return -EINVAL;
> > 
> > -	cpu_group_start = get_cpu_thread_group_start(cpu, &tg);
> > +	if (!cpu_tgl->nr_properties) {
> > +		err = parse_thread_groups(dn, cpu_tgl);
> > +		if (err)
> > +			goto out;
> > +	}
> > +
> > +	for (i = 0; i < cpu_tgl->nr_properties; i++) {
> > +		if (cpu_tgl->property_tgs[i].property == cache_property) {
> > +			tg = &cpu_tgl->property_tgs[i];
> > +			break;
> > +		}
> > +	}
> > +
> > +	if (!tg)
> > +		return -EINVAL;
> > +
> > +	cpu_group_start = get_cpu_thread_group_start(cpu, tg);
> 
> This whole hunk should be moved to a new function and called before
> init_cpu_cache_map. It will simplify the logic to great extent.

I suppose you are referring to the part where we select the correct
tg. Yeah, that can move to a different helper.

> 
> > 
> >  	if (unlikely(cpu_group_start == -1)) {
> >  		WARN_ON_ONCE(1);
> > @@ -830,11 +867,12 @@ static int init_cpu_l1_cache_map(int cpu)
> >  		goto out;
> >  	}
> > 
> > -	zalloc_cpumask_var_node(&per_cpu(cpu_l1_cache_map, cpu),
> > -				GFP_KERNEL, cpu_to_node(cpu));
> > +	mask = &per_cpu(cpu_l1_cache_map, cpu);
> > +
> > +	zalloc_cpumask_var_node(mask, GFP_KERNEL, cpu_to_node(cpu));
> > 
> 
> This hunk (and the next hunk) should be moved to next patch.
>

The next patch is only about introducing  THREAD_GROUP_SHARE_L2. Hence
I put in any other code in this patch, since it seems to be a logical
place to collate whatever we have in a generic form.



> >  	for (i = first_thread; i < first_thread + threads_per_core; i++) {
> > -		int i_group_start = get_cpu_thread_group_start(i, &tg);
> > +		int i_group_start = get_cpu_thread_group_start(i, tg);
> > 
> >  		if (unlikely(i_group_start == -1)) {
> >  			WARN_ON_ONCE(1);
> > @@ -843,7 +881,7 @@ static int init_cpu_l1_cache_map(int cpu)
> >  		}
> > 
> >  		if (i_group_start == cpu_group_start)
> > -			cpumask_set_cpu(i, per_cpu(cpu_l1_cache_map, cpu));
> > +			cpumask_set_cpu(i, *mask);
> >  	}
> > 
> >  out:
> > @@ -924,7 +962,7 @@ static int init_big_cores(void)
> >  	int cpu;
> > 
> >  	for_each_possible_cpu(cpu) {
> > -		int err = init_cpu_l1_cache_map(cpu);
> > +		int err = init_cpu_cache_map(cpu, THREAD_GROUP_SHARE_L1);
> > 
> >  		if (err)
> >  			return err;
> > -- 
> > 1.9.4
> > 
> 
> -- 
> Thanks and Regards
> Srikar Dronamraju

  reply	other threads:[~2020-12-08 17:28 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-04  4:48 [PATCH 0/3] Extend Parsing "ibm, thread-groups" for Shared-L2 information Gautham R. Shenoy
2020-12-04  4:48 ` [PATCH 0/3] Extend Parsing "ibm,thread-groups" " Gautham R. Shenoy
2020-12-04  4:48 ` [PATCH 1/3] powerpc/smp: Parse ibm, thread-groups with multiple properties Gautham R. Shenoy
2020-12-04  4:48   ` [PATCH 1/3] powerpc/smp: Parse ibm,thread-groups " Gautham R. Shenoy
2020-12-07 12:10   ` Srikar Dronamraju
2020-12-07 12:10     ` Srikar Dronamraju
2020-12-08 17:25     ` Gautham R Shenoy [this message]
2020-12-08 17:25       ` Gautham R Shenoy
2020-12-09  3:59       ` [PATCH 1/3] powerpc/smp: Parse ibm, thread-groups " Michael Ellerman
2020-12-09  3:59         ` [PATCH 1/3] powerpc/smp: Parse ibm,thread-groups " Michael Ellerman
2020-12-09  8:35       ` Srikar Dronamraju
2020-12-09  8:35         ` Srikar Dronamraju
2020-12-09  9:05         ` Gautham R Shenoy
2020-12-09  9:05           ` Gautham R Shenoy
2020-12-04  4:48 ` [PATCH 2/3] powerpc/smp: Add support detecting thread-groups sharing L2 cache Gautham R. Shenoy
2020-12-04  4:48   ` Gautham R. Shenoy
2020-12-07 12:40   ` Srikar Dronamraju
2020-12-07 12:40     ` Srikar Dronamraju
2020-12-08 17:42     ` Gautham R Shenoy
2020-12-08 17:42       ` Gautham R Shenoy
2020-12-09  9:14       ` Srikar Dronamraju
2020-12-09  9:14         ` Srikar Dronamraju
2020-12-04  4:48 ` [PATCH 3/3] powerpc/cacheinfo: Print correct cache-sibling map/list for " Gautham R. Shenoy
2020-12-04  4:48   ` Gautham R. Shenoy
2020-12-04 10:32   ` kernel test robot
2020-12-07 13:11   ` Srikar Dronamraju
2020-12-07 13:11     ` Srikar Dronamraju
2020-12-08 17:56     ` Gautham R Shenoy
2020-12-08 17:56       ` Gautham R Shenoy
2020-12-09  8:39       ` Srikar Dronamraju
2020-12-09  8:39         ` Srikar Dronamraju
2020-12-09  9:07         ` Gautham R Shenoy
2020-12-09  9:07           ` Gautham R Shenoy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201208172540.GA14206@in.ibm.com \
    --to=ego@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mikey@neuling.org \
    --cc=nathanl@linux.ibm.com \
    --cc=npiggin@gmail.com \
    --cc=peterz@infradead.org \
    --cc=srikar@linux.vnet.ibm.com \
    --cc=svaidy@linux.vnet.ibm.com \
    --cc=valentin.schneider@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.