From: Schspa Shi <schspa@gmail.com>
To: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: tj@kernel.org, linux-kernel@vger.kernel.org,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH] workqueue: Use active mask for new worker when pool is DISASSOCIATED
Date: Wed, 13 Jul 2022 19:22:06 +0800 [thread overview]
Message-ID: <m2y1wxi5qh.fsf@gmail.com> (raw)
In-Reply-To: <0320c5f9-cbda-1652-1f97-24d1a22fb298@gmail.com>
Lai Jiangshan <jiangshanlai@gmail.com> writes:
> CC Peter.
> Peter has changed the CPU binding code in workqueue.c.
>
> I'm not understanding the problem enough, if kthread_bind_mask() is buggy
> in workqueue.c, it would be buggy in other places too.
>
It's not the bug of to use kthread_bind_mask(), other than we set the
bad pool->attrs->cpumask to this kthread.
>
> On 2022/7/7 17:05, Schspa Shi wrote:
>
>> - if (worker->rescue_wq)
>> - set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
>> + if (worker->rescue_wq) {
>> + if (pool->flags & POOL_DISASSOCIATED)
>> + set_cpus_allowed_ptr(worker->task, cpu_active_mask);
>> + else
>> + set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
>> + }
>>
>
> For unbound pools (which also has POOL_DISASSOCIATED), pool->attrs->cpumask
> should be used if pool->attrs->cpumask has active cpu.
>
In this case pool->attrs->cpumask have no active cpu, the cpu for this
pool have offlined already.
The bug will occurs when the cpu have called the workqueue_offline_cpu
form cpu unplug, and create a new worker which will running on a offline
cpu.
>
>> +
>> + mutex_lock(&wq_pool_attach_mutex);
>> + if ((pool->flags & POOL_DISASSOCIATED)) {
>> + /* We can't call get_online_cpus, there will be deadlock
>> + * cpu_active_mask will no change, because we have
>> + * wq_pool_attach_mutex hold.
>> + **/
>> + kthread_bind_mask(worker->task, cpu_active_mask);
>> + } else {
>> + kthread_bind_mask(worker->task, pool->attrs->cpumask);
>> + }
>> + mutex_unlock(&wq_pool_attach_mutex);
>
>
> For unbound pools, pool->attrs->cpumask should be used if pool->attrs->cpumask
> has active cpu.
>
> wq_pool_attach_mutex is held here and in worker_attach_to_pool() which smells bad.
>
Yes, this will be changed, I have make a new patch , to move the
thread bind to worker_attach_to_pool, via set_cpus_allowed_ptr.
>
>
> The change is complex. And if kthread_bind_mask() can't work as expected here,
> the change I prefer would be:
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 4056f2a3f9d5..1ad8aef5fe98 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1862,6 +1862,12 @@ static void worker_attach_to_pool(struct worker *worker,
> {
> mutex_lock(&wq_pool_attach_mutex);
>
> + /*
> + * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
> + * online CPUs. It'll be re-applied when any of the CPUs come up.
> + */
> + set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> +
This will succeed in this case, set_cpus_allowed_ptr will use
cpu_online_mask to verify the cpumask is valid, but in this case,
the cpu_state is between cpu_online and cpu_active. And this
modification doesn't solve the problem I'm having.
> /*
> * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
> * stable across this function. See the comments above the flag
> @@ -1872,9 +1877,6 @@ static void worker_attach_to_pool(struct worker *worker,
> else
> kthread_set_per_cpu(worker->task, pool->cpu);
>
> - if (worker->rescue_wq)
> - set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> -
> list_add_tail(&worker->node, &pool->workers);
> worker->pool = pool;
>
> @@ -1952,7 +1954,7 @@ static struct worker *create_worker(struct worker_pool *pool)
> goto fail;
>
> set_user_nice(worker->task, pool->attrs->nice);
> - kthread_bind_mask(worker->task, pool->attrs->cpumask);
> + worker->flags |= PF_NO_SETAFFINITY;
>
> /* successful, attach the worker to the pool */
> worker_attach_to_pool(worker, pool);
> @@ -4270,7 +4272,7 @@ static int init_rescuer(struct workqueue_struct *wq)
> }
>
> wq->rescuer = rescuer;
> - kthread_bind_mask(rescuer->task, cpu_possible_mask);
> + rescuer->flags |= PF_NO_SETAFFINITY;
> wake_up_process(rescuer->task);
>
> return 0;
>
>
> It is untested. It effectively reverts the commit 640f17c82460e
> ("workqueue: Restrict affinity change to rescuer").
> It avoids using kthread_bind_mask().
I will upload a new patchset to remove the extra &wq_pool_attach_mutex
and add a timing diagram to make this question clearer.
--
BRs
Schspa Shi
next prev parent reply other threads:[~2022-07-13 12:00 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-07 9:05 [PATCH] workqueue: Use active mask for new worker when pool is DISASSOCIATED Schspa Shi
2022-07-13 2:52 ` [workqueue] 1a0a67f5ef: phoronix-test-suite.fio.SequentialRead.IO_uring.Yes.No.1MB.DefaultTestDirectory.mb_s -17.7% regression kernel test robot
2022-07-13 2:52 ` kernel test robot
2022-07-13 6:09 ` Schspa Shi
2022-07-13 6:09 ` Schspa Shi
2022-07-13 9:52 ` [PATCH] workqueue: Use active mask for new worker when pool is DISASSOCIATED Lai Jiangshan
2022-07-13 11:22 ` Schspa Shi [this message]
2022-07-14 14:39 ` Peter Zijlstra
2022-07-14 15:17 ` Schspa Shi
2022-07-30 3:49 ` Schspa Shi
2022-08-01 3:56 ` Lai Jiangshan
2022-08-01 4:42 ` Schspa Shi
2022-08-01 8:48 ` Lai Jiangshan
2022-08-01 9:32 ` Schspa Shi
2022-08-07 13:52 ` Schspa Shi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=m2y1wxi5qh.fsf@gmail.com \
--to=schspa@gmail.com \
--cc=jiangshanlai@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.