From: ebiederm@xmission.com (Eric W. Biederman)
To: Oleg Nesterov <oleg@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Hellwig <hch@lst.de>, Ingo Molnar <mingo@elte.hu>,
Pavel Emelyanov <xemul@openvz.org>,
Vitaliy Gusev <vgusev@openvz.org>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/4] kthreads: rework kthread_stop()
Date: Mon, 02 Feb 2009 09:57:36 -0800
Message-ID: <m1wsc82033.fsf@fess.ebiederm.org>
In-Reply-To: <20090201102117.GA5728@redhat.com> (Oleg Nesterov's message of "Sun, 1 Feb 2009 11:21:17 +0100")
Oleg Nesterov <oleg@redhat.com> writes:
> On 01/31, Rusty Russell wrote:
>>
>> On Friday 30 January 2009 23:20:58 Oleg Nesterov wrote:
>> > On 01/30, Oleg Nesterov wrote:
>> > >
>> > > With this patch kthread() allocates all necessary data (struct kthread)
>> > > on its own stack, globals kthread_stop_xxx are deleted. ->vfork_done
>> > > is used as a pointer into "struct kthread", this means kthread_stop()
>> > > can easily wait for kthread's exit.
>>
>> > struct kthread {
>> > int should_stop;
>> > struct completion exited;
>> > };
>>
>> Mildly prefer bool in new code.
>
> OK, and
>
>> > #define to_kthread(tsk) \
>> > container_of((tsk)->vfork_done, struct kthread, exited)
>>
>> This needs a comment. Especially since to_xxx(yyy) is usually simply a
>> container_of(yyy, xxx, member). This one is special.
>
> OK, I'll send the cleanup patch.
>
>> > int kthread_stop(struct task_struct *k)
>> > {
>> > struct kthread *kthread;
>> > int ret;
>> >
>> > trace_sched_kthread_stop(k);
>> > get_task_struct(k);
>> >
>> > kthread = to_kthread(k);
>> > barrier(); /* it might have exited */
>> > if (k->vfork_done != NULL) {
>> > kthread->should_stop = 1;
>> > wake_up_process(k);
>> > wait_for_completion(&kthread->exited);
>> > }
>> > ret = k->exit_code;
>>
>> I don't think this works. How does do_exit() preserve a stack var, other
>> than for a few cycles longer? Sure, the vfork_done will be OK, but this code
>> here will not be. I think you'd need a get_task_struct(current) before the
>> do_exit(ret)
>
> I think this works ;)
>
> This stack frame can't disappear until
> __put_task_struct()->...->free_thread_info().
> So, if you have a reference to task_struct, then it is safe to dereference
> to_kthread(task).
Correct.
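A rough user-space sketch of that lifetime argument, for illustration only. Every name below (pin_task, pin_get, pin_put, pin_demo) is a hypothetical stand-in, not a kernel type or function; the point is just that the stack is released by the last put, not by the thread exiting.

```c
#include <stdlib.h>

/* Hypothetical stand-in for task_struct plus its stack allocation. */
struct pin_task {
	int usage;		/* reference count */
	void *stack;		/* stands in for the thread's stack */
	int stack_freed;	/* set once the stack has been released */
};

static void pin_get(struct pin_task *t)
{
	t->usage++;
}

/* Drops one reference; the stack goes away only with the last one. */
static void pin_put(struct pin_task *t)
{
	if (--t->usage == 0) {
		free(t->stack);
		t->stack = NULL;
		t->stack_freed = 1;
	}
}

/* Returns 1 if the stack outlived the thread's exit and was freed
 * only when the stopper dropped its reference. */
static int pin_demo(void)
{
	struct pin_task t = { 1, NULL, 0 };	/* the thread's own reference */
	int live_after_exit, freed_after_put;

	t.stack = malloc(64);
	pin_get(&t);		/* stopper pins the task first */
	pin_put(&t);		/* thread exits and drops its reference */
	live_after_exit = !t.stack_freed;
	pin_put(&t);		/* stopper is done with the task */
	freed_after_put = t.stack_freed;
	return live_after_exit && freed_after_put;
}
```

In this model anything stored on the stack, such as a struct kthread, stays reachable between the first pin_put() and the second.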
> Before this patch, kthread_stop() can only be used when we know that kthread
> must not exit on its own. And with this patch we are safe in this case, note
> that kthread_stop() does get_task_struct() before it sets ->should_stop = 1.
> And this also pins the memory pointed to by to_kthread().
>> (the case where the kthread fn calls do_exit() is fine: you're
>> not allowed to call kthread stop on such threads).
>
> This was not allowed, but now this is fine. Please look at the 4/4 patch.
> But, in that case you must pin the task_struct after kthread_create(),
> otherwise (with or without this patch) you just can't use this task_struct
> in any way.
To finish converting everything from kernel_thread() to kthreads, we
need to be able to call kthread_stop() on threads that call do_exit().
That kthread_stop() cannot be called on such threads is currently a major
deficiency of the kthread API.
>> In which case using vfork_done is really just a convenience pointer inside
>> struct task_struct to stash the struct kthread. And that's horribly ugly,
>> which is why I stuck with a simple global. Changing to a linked-list of things
>> to stop would avoid the deadlock you mentioned where a kthread stops another
>> kthread.
>
> Well, this patch overloads ->vfork_done, and I agree this is a bit ugly.
> But what you suggest (if I understand correctly) is more complex, and doesn't
> have any advantages, imho.
No. This patch does not overload or really abuse vfork_done.
vfork_done is named a bit misleadingly; it should arguably be called
mm_done. vfork_done (if set) always points to a completion that will
be completed when do_exit() -> mm_release() is called.

What we want is some completion that is signalled from inside do_exit().
mm_release() already completes such a completion, and the code already
exists.

If mm_release() did not do tsk->vfork_done = NULL, then the entire
test of if (k->vfork_done != NULL) would be unnecessary.
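To make the relationship concrete, here is a minimal user-space sketch. The vf_* names and VF_CONTAINER_OF are hypothetical stand-ins, and vf_mm_release() only models the two steps discussed here (clear the pointer, complete the completion), not the real function.

```c
#include <stddef.h>

/* Hypothetical equivalent of the kernel's container_of(). */
#define VF_CONTAINER_OF(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct vf_completion { int done; };

struct vf_kthread {
	int should_stop;
	struct vf_completion exited;
};

struct vf_task {
	struct vf_completion *vfork_done;
};

/* The "special" lookup: it goes through the pointer stored in the
 * task, not through the task itself. */
static struct vf_kthread *vf_to_kthread(struct vf_task *tsk)
{
	return VF_CONTAINER_OF(tsk->vfork_done, struct vf_kthread, exited);
}

/* Models only the part of mm_release() under discussion. */
static void vf_mm_release(struct vf_task *tsk)
{
	struct vf_completion *done = tsk->vfork_done;

	if (done) {
		tsk->vfork_done = NULL;	/* the reason for the NULL test */
		done->done = 1;		/* stands in for complete() */
	}
}

/* Returns 1 if the lookup finds the struct and the exit path both
 * completes the completion and clears the pointer. */
static int vf_demo(void)
{
	struct vf_kthread self = { 0, { 0 } };
	struct vf_task tsk = { &self.exited };
	int found = (vf_to_kthread(&tsk) == &self);

	vf_mm_release(&tsk);
	return found && self.exited.done == 1 && tsk.vfork_done == NULL;
}
```

After vf_mm_release() the task no longer leads back to the struct, so a caller that needs it has to have read the pointer beforehand.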
Oleg, on that note, we should not need a barrier at all. We should be
able to simply say:

	cmplp = k->vfork_done;
	if (cmplp) {
		/* if vfork_done is NULL we have passed mm_release */
		kthread = container_of(cmplp, struct kthread, exited);
		kthread->should_stop = 1;
		wake_up_process(k);
		wait_for_completion(&kthread->exited);
	}
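A single-threaded user-space model of that snippet, as a sketch only. The sk_* names are hypothetical, sk_wake_and_exit() collapses "the woken thread sees should_stop and runs do_exit()" into one call, and the model says nothing about compiler or CPU reordering. It only shows that the locally saved pointer still reaches the completion after the task's own pointer has been cleared.

```c
#include <stddef.h>

#define SK_CONTAINER_OF(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct sk_completion { int done; };
struct sk_kthread { int should_stop; struct sk_completion exited; };
struct sk_task {
	struct sk_completion *vfork_done;
	int exit_code;
};

/* Stand-in for the woken thread noticing should_stop and exiting. */
static void sk_wake_and_exit(struct sk_task *k, struct sk_kthread *kt, int code)
{
	struct sk_completion *done = k->vfork_done;

	if (!kt->should_stop || !done)
		return;
	k->exit_code = code;
	k->vfork_done = NULL;	/* cleared on the exit path */
	done->done = 1;		/* stands in for complete() */
}

static int sk_kthread_stop(struct sk_task *k, int code_on_exit)
{
	struct sk_completion *cmplp = k->vfork_done;	/* read exactly once */

	if (cmplp) {
		struct sk_kthread *kt =
			SK_CONTAINER_OF(cmplp, struct sk_kthread, exited);

		kt->should_stop = 1;
		sk_wake_and_exit(k, kt, code_on_exit);
		/* k->vfork_done is NULL by now; only cmplp still works */
		if (!kt->exited.done)
			return -1;	/* a real wait would block here */
	}
	return k->exit_code;
}

/* Returns 1 if both the live and the already-exited case behave. */
static int sk_demo(void)
{
	struct sk_kthread kt = { 0, { 0 } };
	struct sk_task live = { &kt.exited, 0 };
	struct sk_task gone = { NULL, 3 };	/* already past the exit path */

	return sk_kthread_stop(&live, 7) == 7 && kt.should_stop == 1 &&
	       live.vfork_done == NULL && sk_kthread_stop(&gone, 9) == 3;
}
```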
Thinking about it, I wish we had some place to store a pointer
that would not be cleared, so we could remove that whole confusing
conditional. I just looked through task_struct and there doesn't
appear to be anything promising.

Perhaps we could rename vfork_done to mm_done and not clear it in
mm_release().
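A sketch of what that might look like, again as a single-threaded user-space model. The mm_done field and every md_* name are hypothetical, and the model assumes the struct kthread stays valid because the caller holds a reference to the task. With the pointer never cleared, the stop path needs no conditional: a wait on an already completed completion simply returns.

```c
#include <stddef.h>

#define MD_CONTAINER_OF(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct md_completion { int done; };
struct md_kthread { int should_stop; struct md_completion exited; };
struct md_task {
	struct md_completion *mm_done;	/* hypothetical: never cleared */
	int exit_code;
};

/* Hypothetical exit path: completes but leaves the pointer alone. */
static void md_mm_release(struct md_task *t)
{
	t->mm_done->done = 1;
}

/* Stand-in for the thread noticing should_stop and exiting. */
static void md_wake(struct md_task *k, struct md_kthread *kt, int code)
{
	if (kt->exited.done || !kt->should_stop)
		return;		/* already exited, nothing to wake */
	k->exit_code = code;
	md_mm_release(k);
}

static int md_kthread_stop(struct md_task *k, int code_on_exit)
{
	struct md_kthread *kt =
		MD_CONTAINER_OF(k->mm_done, struct md_kthread, exited);

	kt->should_stop = 1;
	md_wake(k, kt, code_on_exit);
	/* a real wait would return at once if done is already set */
	return kt->exited.done ? k->exit_code : -1;
}

/* Returns 1 if both the live and the already-exited case behave. */
static int md_demo(void)
{
	struct md_kthread a = { 0, { 0 } }, b = { 0, { 1 } };
	struct md_task live = { &a.exited, 0 };
	struct md_task gone = { &b.exited, 3 };	/* exited earlier, code 3 */

	return md_kthread_stop(&live, 7) == 7 &&
	       md_kthread_stop(&gone, 9) == 3;
}
```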
Eric