From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>, Dave Hansen <dave@sr71.net>,
LKML <linux-kernel@vger.kernel.org>,
Dave Jones <davej@redhat.com>,
dhillf@gmail.com, Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu
Date: Fri, 12 Apr 2013 02:49:50 +0530 [thread overview]
Message-ID: <516728F6.4090701@linux.vnet.ibm.com> (raw)
In-Reply-To: <alpine.LFD.2.02.1304112158040.21884@ionos>
On 04/12/2013 02:17 AM, Thomas Gleixner wrote:
> Srivatsa,
>
> On Fri, 12 Apr 2013, Srivatsa S. Bhat wrote:
>> On 04/09/2013 08:08 PM, Thomas Gleixner wrote:
>>> Add a new task state (TASK_PARKED) which prevents other wakeups and
>>> use this state explicitely for the unpark wakeup.
>>>
>>
>> Again, I think this is unnecessary. We are good as long as no one other
>> than the unpark code can kick the kthreads out of the loop in the park
>> code. Now that I understand the race you explained above, why not just
>> fix that race itself by reversing the ordering of clear(SHOULD_PARK)
>> and bind_to(CPU2)? That way, even if someone else wakes up the per-cpu
>> kthread, it will just remain confined to the park code, as intended.
>
> In theory.
>
>> A patch like below should do it IMHO. I guess I'm being a little too
>> persistent, sorry!
>
> No it's not about being persistent, you're JUST too much into voodoo
> programming instead of going for the straight forward and robust
> solutions.
>
> Darn, I hate it as much as everyone else to introduce a new task
> state, but that state allows us to make guarantees and gives us
> semantical clarity. A parked thread is parked and can only be woken up
> by the unpark code. That's clear semantics and not a magic construct
> which will work in most cases and for the remaining ones (See below)
> it will give us problems which are way harder to decode than the ones
> we tried to fix with that magic.
>
>> diff --git a/kernel/kthread.c b/kernel/kthread.c
>> index 691dc2e..9512fc5 100644
>> --- a/kernel/kthread.c
>> +++ b/kernel/kthread.c
>> @@ -308,6 +308,15 @@ struct task_struct *kthread_create_on_cpu(int (*threadfn)(void *data),
>> to_kthread(p)->cpu = cpu;
>> /* Park the thread to get it out of TASK_UNINTERRUPTIBLE state */
>> kthread_park(p);
>> +
>> + /*
>> + * Wait for p->on_rq to be reset to 0, to ensure that the per-cpu
>> + * migration thread (which belongs to the stop_task sched class)
>> + * doesn't run until the cpu is actually onlined and the thread is
>> + * unparked.
>> + */
>> + if (!wait_task_inactive(p, TASK_INTERRUPTIBLE))
>> + WARN_ON(1);
>
> Yay, we rely on TASK_INTERRUPTIBLE state with a task which already has
> references outside the creation code.
I doubt that. We have not even onlined the CPU, how would any else even
_know_ that we created this kthread??
*per_cpu_ptr(ht->store, cpu) = tsk; is executed _after_ returning from
this function.
The problem with ksoftirqd is very clear - we unpark threads _after_ we
online the CPU. So, in between the 2 steps, somebody on that CPU can call
__do_softirq(), leading to the race you described in your cover-letter.
That's why I tried to fix that race.
> And then we _HOPE_ that nothing
> wakes it up _BEFORE_ we do something else.
>
Nothing can wake it up, because no one is aware of the newly created
kthread.
> Aside of that, you are still insisting to enforce that for every per
> cpu thread even if the only one which needs that at this point are
> thos which have a create() callback (i.e. the migration thread). And
> next week you figure out that this is a performance impact on bringing
> up large machines....
>
Making this wait call specific to those kthreads with the ->create callback
won't be that much of a big deal, IMHO. But see below, I'm not going to
insist on going with my suggestions.
>> /**
>> * kthread_unpark - unpark a thread created by kthread_create().
>> * @k: thread created by kthread_create().
>> @@ -337,18 +357,29 @@ void kthread_unpark(struct task_struct *k)
>> struct kthread *kthread = task_get_live_kthread(k);
>>
>> if (kthread) {
>> + /*
>> + * Per-cpu kthreads such as ksoftirqd can get woken up by
>> + * other events. So after binding the thread, ensure that
>> + * it goes off the CPU atleast once, by parking it again.
>> + * This way, we can ensure that it will run on the correct
>> + * CPU on subsequent wakeup.
>> + */
>> + if (test_bit(KTHREAD_IS_PER_CPU, &kthread->flags)) {
>> + __kthread_bind(k, kthread->cpu);
>> + clear_bit(KTHREAD_IS_PARKED, &kthread->flags);
>
> And how is that f*cking different from the previous code?
>
> CPU0 CPU1 CPU2
> wakeup(T) -> run on CPU1 (last cpu)
>
> switch_to(T)
>
> __kthread_bind(T, CPU2)
>
> clear(KTHREAD_IS_PARKED)
>
> leave loop due to !KTHREAD_IS_PARKED
How?? The task will leave the loop only when we clear
SHOULD_PARK, not when we clear IS_PARKED. So it won't
leave the loop here. It will cause the kthread to
perform a fresh complete() for the waiting kthread_park()
on CPU0.
>
> BUG(wrong cpu) <--- VOODOO FAILURE
>
> kthread_park(T) <-- VOODOO TOO LATE
>
No, the purpose of clear(IS_PARKED) followed by __kthread_park() is to
ensure that the task gets *descheduled* atleast once after we did the
kthread_bind(). And that's because we can't use set_cpus_allowed_ptr() to
migrate a running kthread (because the kthread could be the migration
thread). So instead, we use kthread_bind() and depend on sleep->wakeup
to put the task on the right CPU.
> You can turn around the order of clearing/setting the flags as much as
> you want, I'm going to punch an hole in it.
>
> TASK_PARKED is the very obvious and robust solution which fixes _ALL_
> of the corner cases, at least as far as I can imagine them. And
> robustness rules at least in my world.
>
Yes, I agree that it is robust and has clear semantics. No doubt about
that. So I won't insist on going with my suggestions.
Regards,
Srivatsa S. Bhat
next prev parent reply other threads:[~2013-04-11 21:22 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-04-05 21:43 kernel BUG at kernel/smpboot.c:134! Dave Hansen
2013-04-06 7:12 ` Srivatsa S. Bhat
2013-04-06 8:31 ` Thomas Gleixner
2013-04-07 9:20 ` Thomas Gleixner
2013-04-07 9:50 ` Borislav Petkov
2013-04-08 9:24 ` Thomas Gleixner
2013-04-08 11:55 ` Borislav Petkov
2013-04-08 12:17 ` Thomas Gleixner
2013-04-09 14:38 ` [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu Thomas Gleixner
2013-04-09 15:55 ` Dave Hansen
2013-04-09 18:43 ` Thomas Gleixner
2013-04-09 19:30 ` Thomas Gleixner
2013-04-09 20:38 ` Dave Hansen
2013-04-09 20:54 ` Dave Hansen
2013-04-10 8:29 ` Thomas Gleixner
2013-04-10 10:51 ` Thomas Gleixner
2013-04-10 19:41 ` Dave Hansen
2013-04-11 10:19 ` Thomas Gleixner
2013-04-11 10:48 ` Srivatsa S. Bhat
2013-04-11 11:43 ` Srivatsa S. Bhat
2013-04-11 11:59 ` Srivatsa S. Bhat
2013-04-11 12:51 ` Thomas Gleixner
2013-04-11 12:54 ` Thomas Gleixner
2013-04-11 13:46 ` Thomas Gleixner
2013-04-11 18:07 ` Dave Hansen
2013-04-11 19:48 ` Thomas Gleixner
2013-04-10 14:03 ` [PATCH] CPU hotplug, smpboot: Fix crash in smpboot_thread_fn() Srivatsa S. Bhat
2013-04-11 8:10 ` Thomas Gleixner
2013-04-11 10:19 ` Srivatsa S. Bhat
2013-04-11 19:16 ` [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu Srivatsa S. Bhat
2013-04-11 20:47 ` Thomas Gleixner
2013-04-11 21:19 ` Srivatsa S. Bhat [this message]
2013-04-12 10:59 ` Thomas Gleixner
2013-04-12 11:26 ` Srivatsa S. Bhat
2013-04-15 19:49 ` Dave Hansen
2013-04-12 10:41 ` Peter Zijlstra
2013-04-12 12:32 ` [tip:core/urgent] " tip-bot for Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=516728F6.4090701@linux.vnet.ibm.com \
--to=srivatsa.bhat@linux.vnet.ibm.com \
--cc=bp@alien8.de \
--cc=dave@sr71.net \
--cc=davej@redhat.com \
--cc=dhillf@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.