linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH v2] workqueue: Fix edge cases for calc of pool's cpumask
@ 2017-06-06 14:09 Michael Bringmann
  2017-06-06 14:20 ` Tejun Heo
  0 siblings, 1 reply; 4+ messages in thread
From: Michael Bringmann @ 2017-06-06 14:09 UTC (permalink / raw)
  To: Tejun Heo, Lai Jiangshan, linux-kernel, linuxppc-dev
  Cc: Michael Bringmann from Kernel Team, Nathan Fontenot


On NUMA systems with dynamic processors, the content of the cpumask
may change over time.  As new processors are added via DLPAR operations,
workqueues are created for them.  Depending upon the order in which CPUs
are added/removed, we may run into problems with the content of the
cpumask used by the workqueues.  This patch deals with situations where
the online cpumask for a node is a proper superset of the possible cpumask
for the node.  It also deals with edge cases where the order in which
CPUs are removed from/added to the online cpumask may leave the set for a
node empty, requiring execution by CPUs on another node.

In these and other cases, the patch attempts to ensure that a valid,
usable cpumask is used to set up newly created pools for workqueues.

Signed-off-by: Tejun Heo <tj@kernel.org> & Michael Bringmann <mwb@linux.vnet.ibm.com>
---
Changes in V2:
  -- Merge in additional logic fixes provided by Tejun Heo.
  -- Revise safety check added to get_unbound_pool to trigger only
     in the event of a dangerously invalid cpumask attribute.
---
 kernel/workqueue.c |   15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index c74bf39..460de61 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3366,6 +3366,9 @@ static struct worker_pool *get_unbound_pool(const struct workqueue_attrs *attrs)
 	copy_workqueue_attrs(pool->attrs, attrs);
 	pool->node = target_node;
 
+	if (!cpumask_weight(pool->attrs->cpumask))
+		cpumask_copy(pool->attrs->cpumask, cpumask_of(smp_processor_id()));
+
 	/*
 	 * no_numa isn't a worker_pool attribute, always clear it.  See
 	 * 'struct workqueue_attrs' comments for detail.
@@ -3559,13 +3562,13 @@ static struct pool_workqueue *alloc_unbound_pwq(struct workqueue_struct *wq,
  * stable.
  *
  * Return: %true if the resulting @cpumask is different from @attrs->cpumask,
- * %false if equal.
+ * %false if equal.  On %false return, the content of @cpumask is undefined.
  */
 static bool wq_calc_node_cpumask(const struct workqueue_attrs *attrs, int node,
 				 int cpu_going_down, cpumask_t *cpumask)
 {
 	if (!wq_numa_enabled || attrs->no_numa)
-		goto use_dfl;
+		return false;
 
 	/* does @node have any online CPUs @attrs wants? */
 	cpumask_and(cpumask, cpumask_of_node(node), attrs->cpumask);
@@ -3573,15 +3576,13 @@ static bool wq_calc_node_cpumask(const struct workqueue_attrs *attrs, int node,
 		cpumask_clear_cpu(cpu_going_down, cpumask);
 
 	if (cpumask_empty(cpumask))
-		goto use_dfl;
+		return false;
 
 	/* yeap, return possible CPUs in @node that @attrs wants */
 	cpumask_and(cpumask, attrs->cpumask, wq_numa_possible_cpumask[node]);
-	return !cpumask_equal(cpumask, attrs->cpumask);
 
-use_dfl:
-	cpumask_copy(cpumask, attrs->cpumask);
-	return false;
+	return !cpumask_empty(cpumask) &&
+		!cpumask_equal(cpumask, attrs->cpumask);
 }
 
 /* install @pwq into @wq's numa_pwq_tbl[] for @node and return the old pwq */


* Re: [PATCH v2] workqueue: Fix edge cases for calc of pool's cpumask
  2017-06-06 14:09 [PATCH v2] workqueue: Fix edge cases for calc of pool's cpumask Michael Bringmann
@ 2017-06-06 14:20 ` Tejun Heo
  2017-06-06 14:34   ` Michael Bringmann
  0 siblings, 1 reply; 4+ messages in thread
From: Tejun Heo @ 2017-06-06 14:20 UTC (permalink / raw)
  To: Michael Bringmann
  Cc: Lai Jiangshan, linux-kernel, linuxppc-dev,
	Michael Bringmann from Kernel Team, Nathan Fontenot

Hello, Michael.

It would have been better to continue debugging in the prev thread.
This still seems incorrect for the same reason as before.

On Tue, Jun 06, 2017 at 09:09:40AM -0500, Michael Bringmann wrote:
> On NUMA systems with dynamic processors, the content of the cpumask
> may change over time.  As new processors are added via DLPAR operations,
> workqueues are created for them.  Depending upon the order in which CPUs
> are added/removed, we may run into problems with the content of the
> cpumask used by the workqueues.  This patch deals with situations where
> the online cpumask for a node is a proper superset of possible cpumask
> for the node.  It also deals with edge cases where the order in which
> CPUs are removed/added from the online cpumask may leave the set for a
> node empty, and require execution by CPUs on another node.
> 
> In these and other cases, the patch attempts to ensure that a valid,
> usable cpumask is used to set up newly created pools for workqueues.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org> & Michael Bringmann <mwb@linux.vnet.ibm.com>

Heh, you can't add sob's for other people.  For partial attributions,
you can just note in the description.

> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index c74bf39..460de61 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -3366,6 +3366,9 @@ static struct worker_pool *get_unbound_pool(const struct workqueue_attrs *attrs)
>  	copy_workqueue_attrs(pool->attrs, attrs);
>  	pool->node = target_node;
>  
> +	if (!cpumask_weight(pool->attrs->cpumask))
> +		cpumask_copy(pool->attrs->cpumask, cpumask_of(smp_processor_id()));

So, this is still wrong.

>  	/*
>  	 * no_numa isn't a worker_pool attribute, always clear it.  See
>  	 * 'struct workqueue_attrs' comments for detail.
> @@ -3559,13 +3562,13 @@ static struct pool_workqueue *alloc_unbound_pwq(struct workqueue_struct *wq,
>   * stable.
>   *
>   * Return: %true if the resulting @cpumask is different from @attrs->cpumask,
> - * %false if equal.
> + * %false if equal.  On %false return, the content of @cpumask is undefined.
>   */
>  static bool wq_calc_node_cpumask(const struct workqueue_attrs *attrs, int node,
>  				 int cpu_going_down, cpumask_t *cpumask)
>  {
>  	if (!wq_numa_enabled || attrs->no_numa)
> -		goto use_dfl;
> +		return false;
>  
>  	/* does @node have any online CPUs @attrs wants? */
>  	cpumask_and(cpumask, cpumask_of_node(node), attrs->cpumask);
> @@ -3573,15 +3576,13 @@ static bool wq_calc_node_cpumask(const struct workqueue_attrs *attrs, int node,
>  		cpumask_clear_cpu(cpu_going_down, cpumask);
>  
>  	if (cpumask_empty(cpumask))
> -		goto use_dfl;
> +		return false;
>  
>  	/* yeap, return possible CPUs in @node that @attrs wants */
>  	cpumask_and(cpumask, attrs->cpumask, wq_numa_possible_cpumask[node]);
> -	return !cpumask_equal(cpumask, attrs->cpumask);
>  
> -use_dfl:
> -	cpumask_copy(cpumask, attrs->cpumask);
> -	return false;
> +	return !cpumask_empty(cpumask) &&
> +		!cpumask_equal(cpumask, attrs->cpumask);

And this part doesn't really change that.

CPUs going offline or online shouldn't change their relation to
wq_numa_possible_cpumask.  I wonder whether the arch code is changing
CPU id <-> NUMA node mapping on CPU on/offlining.  x86 used to do that
too and got recently modified.  Can you see whether that's the case?

Thanks.

-- 
tejun


* Re: [PATCH v2] workqueue: Fix edge cases for calc of pool's cpumask
  2017-06-06 14:20 ` Tejun Heo
@ 2017-06-06 14:34   ` Michael Bringmann
  2017-06-06 14:38     ` Tejun Heo
  0 siblings, 1 reply; 4+ messages in thread
From: Michael Bringmann @ 2017-06-06 14:34 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Lai Jiangshan, linux-kernel, linuxppc-dev,
	Michael Bringmann from Kernel Team, Nathan Fontenot



On 06/06/2017 09:20 AM, Tejun Heo wrote:
> Hello, Michael.
> 
> It would have been better to continue debugging in the prev thread.
> This still seems incorrect for the same reason as before.
> 
> On Tue, Jun 06, 2017 at 09:09:40AM -0500, Michael Bringmann wrote:
>> On NUMA systems with dynamic processors, the content of the cpumask
>> may change over time.  As new processors are added via DLPAR operations,
>> workqueues are created for them.  Depending upon the order in which CPUs
>> are added/removed, we may run into problems with the content of the
>> cpumask used by the workqueues.  This patch deals with situations where
>> the online cpumask for a node is a proper superset of possible cpumask
>> for the node.  It also deals with edge cases where the order in which
>> CPUs are removed/added from the online cpumask may leave the set for a
>> node empty, and require execution by CPUs on another node.
>>
>> In these and other cases, the patch attempts to ensure that a valid,
>> usable cpumask is used to set up newly created pools for workqueues.
>>
>> Signed-off-by: Tejun Heo <tj@kernel.org> & Michael Bringmann <mwb@linux.vnet.ibm.com>
> 
> Heh, you can't add sob's for other people.  For partial attributions,
> you can just note in the description.

Sorry for the error.
> 
>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
>> index c74bf39..460de61 100644
>> --- a/kernel/workqueue.c
>> +++ b/kernel/workqueue.c
>> @@ -3366,6 +3366,9 @@ static struct worker_pool *get_unbound_pool(const struct workqueue_attrs *attrs)
>>  	copy_workqueue_attrs(pool->attrs, attrs);
>>  	pool->node = target_node;
>>  
>> +	if (!cpumask_weight(pool->attrs->cpumask))
>> +		cpumask_copy(pool->attrs->cpumask, cpumask_of(smp_processor_id()));
> 
> So, this is still wrong.

It only catches the case where something has already gone wrong earlier.  The
alternative in this case would be,

	BUG_ON(!cpumask_weight(pool->attrs->cpumask));

> 
>>  	/*
>>  	 * no_numa isn't a worker_pool attribute, always clear it.  See
>>  	 * 'struct workqueue_attrs' comments for detail.
>> @@ -3559,13 +3562,13 @@ static struct pool_workqueue *alloc_unbound_pwq(struct workqueue_struct *wq,
>>   * stable.
>>   *
>>   * Return: %true if the resulting @cpumask is different from @attrs->cpumask,
>> - * %false if equal.
>> + * %false if equal.  On %false return, the content of @cpumask is undefined.
>>   */
>>  static bool wq_calc_node_cpumask(const struct workqueue_attrs *attrs, int node,
>>  				 int cpu_going_down, cpumask_t *cpumask)
>>  {
>>  	if (!wq_numa_enabled || attrs->no_numa)
>> -		goto use_dfl;
>> +		return false;
>>  
>>  	/* does @node have any online CPUs @attrs wants? */
>>  	cpumask_and(cpumask, cpumask_of_node(node), attrs->cpumask);
>> @@ -3573,15 +3576,13 @@ static bool wq_calc_node_cpumask(const struct workqueue_attrs *attrs, int node,
>>  		cpumask_clear_cpu(cpu_going_down, cpumask);
>>  
>>  	if (cpumask_empty(cpumask))
>> -		goto use_dfl;
>> +		return false;
>>  
>>  	/* yeap, return possible CPUs in @node that @attrs wants */
>>  	cpumask_and(cpumask, attrs->cpumask, wq_numa_possible_cpumask[node]);
>> -	return !cpumask_equal(cpumask, attrs->cpumask);
>>  
>> -use_dfl:
>> -	cpumask_copy(cpumask, attrs->cpumask);
>> -	return false;
>> +	return !cpumask_empty(cpumask) &&
>> +		!cpumask_equal(cpumask, attrs->cpumask);
> 
> And this part doesn't really change that.
> 
> CPUs going offline or online shouldn't change their relation to
> wq_numa_possible_cpumask.  I wonder whether the arch code is changing
> CPU id <-> NUMA node mapping on CPU on/offlining.  x86 used to do that
> too and got recently modified.  Can you see whether that's the case?

The bug that I see does not appear to be related to a change in the CPU/node
mapping -- the CPUs are not changing their place when going offline/online.
Rather, new CPUs are being hot-added to the system (i.e. they were not present
at boot), and the node to which they are being added had no CPUs at boot.
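
As a rough illustration of that scenario, here is a minimal userspace sketch
(not kernel code).  It assumes that the per-node possible mask is computed once
at boot and is not updated on hot-add; that is my reading of the failure, not
something confirmed in this thread.  The names mask_t, node_online,
node_possible, calc_node_mask and simulate_hot_add are hypothetical stand-ins,
with a 64-bit integer in place of cpumask_t.  The function mirrors the
pre-patch logic of wq_calc_node_cpumask() with NUMA enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_NODES 2

typedef uint64_t mask_t;	/* bit n set == CPU n is in the mask */

/* Hypothetical stand-ins for kernel state. */
static mask_t node_online[NR_NODES];	/* models cpumask_of_node(node) */
static mask_t node_possible[NR_NODES];	/* models wq_numa_possible_cpumask[node] */

/*
 * Models the pre-patch wq_calc_node_cpumask(), assuming wq_numa_enabled
 * is true and attrs->no_numa is false.  Pass cpu_going_down < 0 for "none".
 */
static bool calc_node_mask(mask_t attrs, int node, int cpu_going_down,
			   mask_t *out)
{
	assert(node >= 0 && node < NR_NODES);

	/* does @node have any online CPUs @attrs wants? */
	*out = node_online[node] & attrs;
	if (cpu_going_down >= 0)
		*out &= ~((mask_t)1 << cpu_going_down);

	if (*out == 0) {
		/* use_dfl: fall back to the full attrs mask */
		*out = attrs;
		return false;
	}

	/* return possible CPUs in @node that @attrs wants */
	*out = attrs & node_possible[node];
	return *out != attrs;
}

/* Node 1 has no CPUs at boot; CPU 4 is then hot-added to it. */
static bool simulate_hot_add(mask_t *out)
{
	node_online[0] = node_possible[0] = 0x0f;	/* CPUs 0-3 at boot */
	node_online[1] = node_possible[1] = 0;		/* empty at boot */

	/* hot-add: online mask changes, possible mask assumed stale */
	node_online[1] |= (mask_t)1 << 4;

	return calc_node_mask(0xff, 1, -1, out);
}
```

Under those assumptions, simulate_hot_add() reports that the mask differs from
attrs, yet the mask it hands back is empty: the online check passes because of
CPU 4, but the intersection with the stale possible mask has no bits set.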

> 
> Thanks.
> 

Thanks.

-- 
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line  363-5196
External: (512) 286-5196
Cell:       (512) 466-0650
mwb@linux.vnet.ibm.com


* Re: [PATCH v2] workqueue: Fix edge cases for calc of pool's cpumask
  2017-06-06 14:34   ` Michael Bringmann
@ 2017-06-06 14:38     ` Tejun Heo
  0 siblings, 0 replies; 4+ messages in thread
From: Tejun Heo @ 2017-06-06 14:38 UTC (permalink / raw)
  To: Michael Bringmann
  Cc: Lai Jiangshan, linux-kernel, linuxppc-dev,
	Michael Bringmann from Kernel Team, Nathan Fontenot

Hello,

On Tue, Jun 06, 2017 at 09:34:05AM -0500, Michael Bringmann wrote:
> >> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> >> index c74bf39..460de61 100644
> >> --- a/kernel/workqueue.c
> >> +++ b/kernel/workqueue.c
> >> @@ -3366,6 +3366,9 @@ static struct worker_pool *get_unbound_pool(const struct workqueue_attrs *attrs)
> >>  	copy_workqueue_attrs(pool->attrs, attrs);
> >>  	pool->node = target_node;
> >>  
> >> +	if (!cpumask_weight(pool->attrs->cpumask))
> >> +		cpumask_copy(pool->attrs->cpumask, cpumask_of(smp_processor_id()));
> > 
> > So, this is still wrong.
> 
> It only catches the case where something has already gone wrong earlier.  The
> alternative in this case would be,
> 
> 	BUG_ON(!cpumask_weight(pool->attrs->cpumask));

I'm kinda confused.  So the problems you see across hotplugs go away
without the above change, and the above is "just in case"?

> >> @@ -3559,13 +3562,13 @@ static struct pool_workqueue *alloc_unbound_pwq(struct workqueue_struct *wq,
> >>   * stable.
> >>   *
> >>   * Return: %true if the resulting @cpumask is different from @attrs->cpumask,
> >> - * %false if equal.
> >> + * %false if equal.  On %false return, the content of @cpumask is undefined.
> >>   */
> >>  static bool wq_calc_node_cpumask(const struct workqueue_attrs *attrs, int node,
> >>  				 int cpu_going_down, cpumask_t *cpumask)
> >>  {
> >>  	if (!wq_numa_enabled || attrs->no_numa)
> >> -		goto use_dfl;
> >> +		return false;
> >>  
> >>  	/* does @node have any online CPUs @attrs wants? */
> >>  	cpumask_and(cpumask, cpumask_of_node(node), attrs->cpumask);
> >> @@ -3573,15 +3576,13 @@ static bool wq_calc_node_cpumask(const struct workqueue_attrs *attrs, int node,
> >>  		cpumask_clear_cpu(cpu_going_down, cpumask);
> >>  
> >>  	if (cpumask_empty(cpumask))
> >> -		goto use_dfl;
> >> +		return false;
> >>  
> >>  	/* yeap, return possible CPUs in @node that @attrs wants */
> >>  	cpumask_and(cpumask, attrs->cpumask, wq_numa_possible_cpumask[node]);
> >> -	return !cpumask_equal(cpumask, attrs->cpumask);
> >>  
> >> -use_dfl:
> >> -	cpumask_copy(cpumask, attrs->cpumask);
> >> -	return false;
> >> +	return !cpumask_empty(cpumask) &&
> >> +		!cpumask_equal(cpumask, attrs->cpumask);
> > 
> > And this part doesn't really change that.
> > 
> > CPUs going offline or online shouldn't change their relation to
> > wq_numa_possible_cpumask.  I wonder whether the arch code is changing
> > CPU id <-> NUMA node mapping on CPU on/offlining.  x86 used to do that
> > too and got recently modified.  Can you see whether that's the case?
> 
> The bug that I see does not appear to be related to a change in the CPU/node
> mapping -- the CPUs are not changing their place when going offline/online.
> Rather, new CPUs are being hot-added to the system (i.e. they were not present
> at boot), and the node to which they are being added had no CPUs at boot.

Can you please post the messages with the debug patch from the prev
thread?  In fact, let's please continue on that thread.  I'm having a
hard time following what's going wrong with the code.

Thanks.

-- 
tejun

