From: Michael Bringmann <mwb@linux.vnet.ibm.com>
To: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>,
linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
Michael Bringmann from Kernel Team <mbringm@us.ibm.com>,
Nathan Fontenot <nfont@linux.vnet.ibm.com>
Subject: Re: [PATCH v2] workqueue: Fix edge cases for calc of pool's cpumask
Date: Tue, 6 Jun 2017 09:34:05 -0500
Message-ID: <a6fbf0fe-e6a4-dd99-f329-614a6de99555@linux.vnet.ibm.com>
In-Reply-To: <20170606142009.GA18318@htj.duckdns.org>
On 06/06/2017 09:20 AM, Tejun Heo wrote:
> Hello, Michael.
>
> It would have been better to continue debugging in the prev thread.
> This still seems incorrect for the same reason as before.
>
> On Tue, Jun 06, 2017 at 09:09:40AM -0500, Michael Bringmann wrote:
>> On NUMA systems with dynamic processors, the content of the cpumask
>> may change over time. As new processors are added via DLPAR operations,
>> workqueues are created for them. Depending upon the order in which CPUs
>> are added/removed, we may run into problems with the content of the
>> cpumask used by the workqueues. This patch deals with situations where
>> the online cpumask for a node is a proper superset of possible cpumask
>> for the node. It also deals with edge cases where the order in which
>> CPUs are removed/added from the online cpumask may leave the set for a
>> node empty, and require execution by CPUs on another node.
>>
>> In these and other cases, the patch attempts to ensure that a valid,
>> usable cpumask is used to set up newly created pools for workqueues.
>>
>> Signed-off-by: Tejun Heo <tj@kernel.org> & Michael Bringmann <mwb@linux.vnet.ibm.com>
>
> Heh, you can't add sob's for other people. For partial attributions,
> you can just note in the description.
Sorry for the error.
>
>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
>> index c74bf39..460de61 100644
>> --- a/kernel/workqueue.c
>> +++ b/kernel/workqueue.c
>> @@ -3366,6 +3366,9 @@ static struct worker_pool *get_unbound_pool(const struct workqueue_attrs *attrs)
>> copy_workqueue_attrs(pool->attrs, attrs);
>> pool->node = target_node;
>>
>> + if (!cpumask_weight(pool->attrs->cpumask))
>> + cpumask_copy(pool->attrs->cpumask, cpumask_of(smp_processor_id()));
>
> So, this is still wrong.
It only triggers if something has already gone wrong earlier. The alternative
in this case would be,
BUG_ON(!cpumask_weight(pool->attrs->cpumask));
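To make the intent of that guard concrete, here is a minimal userspace
sketch, not kernel code. It assumes a toy 64-bit mask in place of
struct cpumask; mask_t and fixup_empty_mask() are hypothetical names used
only for illustration, and this_cpu must be below 64.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t mask_t;	/* toy cpumask: bit N set == CPU N in the mask */

/*
 * Hypothetical userspace analogue of the guard added to get_unbound_pool():
 * if the pool's mask is empty, fall back to the CPU we are running on.
 * Returns true when the fallback was needed, i.e. when an earlier step
 * had already produced an unusable mask.
 */
static bool fixup_empty_mask(mask_t *mask, unsigned int this_cpu)
{
	if (*mask != 0)
		return false;		/* mask is usable, leave it alone */

	/* models cpumask_copy(mask, cpumask_of(smp_processor_id())) */
	*mask = (mask_t)1 << this_cpu;
	return true;
}
```

A true return is the case the BUG_ON() variant would stop on instead of
papering over.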
>
>> /*
>> * no_numa isn't a worker_pool attribute, always clear it. See
>> * 'struct workqueue_attrs' comments for detail.
>> @@ -3559,13 +3562,13 @@ static struct pool_workqueue *alloc_unbound_pwq(struct workqueue_struct *wq,
>> * stable.
>> *
>> * Return: %true if the resulting @cpumask is different from @attrs->cpumask,
>> - * %false if equal.
>> + * %false if equal. On %false return, the content of @cpumask is undefined.
>> */
>> static bool wq_calc_node_cpumask(const struct workqueue_attrs *attrs, int node,
>> int cpu_going_down, cpumask_t *cpumask)
>> {
>> if (!wq_numa_enabled || attrs->no_numa)
>> - goto use_dfl;
>> + return false;
>>
>> /* does @node have any online CPUs @attrs wants? */
>> cpumask_and(cpumask, cpumask_of_node(node), attrs->cpumask);
>> @@ -3573,15 +3576,13 @@ static bool wq_calc_node_cpumask(const struct workqueue_attrs *attrs, int node,
>> cpumask_clear_cpu(cpu_going_down, cpumask);
>>
>> if (cpumask_empty(cpumask))
>> - goto use_dfl;
>> + return false;
>>
>> /* yeap, return possible CPUs in @node that @attrs wants */
>> cpumask_and(cpumask, attrs->cpumask, wq_numa_possible_cpumask[node]);
>> - return !cpumask_equal(cpumask, attrs->cpumask);
>>
>> -use_dfl:
>> - cpumask_copy(cpumask, attrs->cpumask);
>> - return false;
>> + return !cpumask_empty(cpumask) &&
>> + !cpumask_equal(cpumask, attrs->cpumask);
>
> And this part doesn't really change that.
>
> CPUs going offline or online shouldn't change their relation to
> wq_numa_possible_cpumask. I wonder whether the arch code is changing
> CPU id <-> NUMA node mapping on CPU on/offlining. x86 used to do that
> too and got recently modified. Can you see whether that's the case?
The bug that I see does not appear to be related to a change in the CPU/node
mapping -- the CPUs are not changing nodes when going offline/online. Rather,
new CPUs are being hot-added to the system (i.e. they were not present at
boot), and the node to which they are being added had no CPUs at boot.
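As an illustration of that scenario, here is a small userspace model, not
kernel code. It mirrors the flow of the unpatched wq_calc_node_cpumask()
using plain 64-bit masks. The arrays node_online[] and node_possible[] are
hypothetical stand-ins for cpumask_of_node() and wq_numa_possible_cpumask[],
and the sketch assumes the per-node possible mask is not updated when a CPU
is hot-added.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t mask_t;	/* toy cpumask: bit N set == CPU N */

#define NR_NODES 2

/* Hypothetical stand-ins for cpumask_of_node() and wq_numa_possible_cpumask[]. */
static mask_t node_online[NR_NODES];
static mask_t node_possible[NR_NODES];

/* Same steps as the unpatched function; cpu_going_down < 0 means "none". */
static bool calc_node_cpumask(mask_t attrs, int node, int cpu_going_down,
			      mask_t *out)
{
	assert(node >= 0 && node < NR_NODES);

	/* does @node have any online CPUs @attrs wants? */
	*out = node_online[node] & attrs;
	if (cpu_going_down >= 0)
		*out &= ~((mask_t)1 << cpu_going_down);

	if (*out == 0) {
		*out = attrs;		/* use_dfl */
		return false;
	}

	/* return possible CPUs in @node that @attrs wants */
	*out = attrs & node_possible[node];
	return *out != attrs;
}

/* Node 1 has no CPUs at boot; CPU 4 is then hot-added to it. */
static bool demo_hot_add_to_empty_node(mask_t *result)
{
	mask_t attrs = 0xff;			/* workqueue allows CPUs 0-7 */

	node_online[0] = node_possible[0] = 0x0f;	/* CPUs 0-3 on node 0 */
	node_online[1] = node_possible[1] = 0;		/* node 1 empty at boot */

	/* assumption: only the online mask learns about the new CPU */
	node_online[1] |= (mask_t)1 << 4;

	return calc_node_cpumask(attrs, 1, -1, result);
}
```

In this model demo_hot_add_to_empty_node() returns true (the mask differs
from attrs), but the mask it hands back has no bits set: the online check
passes, and the intersection with the stale possible mask is empty.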
>
> Thanks.
>
Thanks.
--
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line 363-5196
External: (512) 286-5196
Cell: (512) 466-0650
mwb@linux.vnet.ibm.com