From: ebiederm@xmission.com (Eric W. Biederman)
To: Oleg Nesterov <oleg@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Hellwig <hch@lst.de>, Ingo Molnar <mingo@elte.hu>,
Pavel Emelyanov <xemul@openvz.org>,
Vitaliy Gusev <vgusev@openvz.org>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/4] kthreads: rework kthread_stop()
Date: Mon, 02 Feb 2009 09:57:36 -0800
Message-ID: <m1wsc82033.fsf@fess.ebiederm.org>
In-Reply-To: <20090201102117.GA5728@redhat.com> (Oleg Nesterov's message of "Sun, 1 Feb 2009 11:21:17 +0100")

Oleg Nesterov <oleg@redhat.com> writes:
> On 01/31, Rusty Russell wrote:
>>
>> On Friday 30 January 2009 23:20:58 Oleg Nesterov wrote:
>> > On 01/30, Oleg Nesterov wrote:
>> > >
>> > > With this patch kthread() allocates all necessary data (struct kthread)
>> > > on its own stack, globals kthread_stop_xxx are deleted. ->vfork_done
>> > > is used as a pointer into "struct kthread", this means kthread_stop()
>> > > can easily wait for kthread's exit.
>>
>> > struct kthread {
>> > 	int should_stop;
>> > 	struct completion exited;
>> > };
>>
>> Mildly prefer bool in new code.
>
> OK, and
>
>> > #define to_kthread(tsk) \
>> > 	container_of((tsk)->vfork_done, struct kthread, exited)
>>
>> This needs a comment. Especially since to_xxx(yyy) is usually simply a
>> container_of(yyy, xxx, member). This one is special.
>
> OK, I'll send the cleanup patch.
>
>> > int kthread_stop(struct task_struct *k)
>> > {
>> > 	struct kthread *kthread;
>> > 	int ret;
>> >
>> > 	trace_sched_kthread_stop(k);
>> > 	get_task_struct(k);
>> >
>> > 	kthread = to_kthread(k);
>> > 	barrier(); /* it might have exited */
>> > 	if (k->vfork_done != NULL) {
>> > 		kthread->should_stop = 1;
>> > 		wake_up_process(k);
>> > 		wait_for_completion(&kthread->exited);
>> > 	}
>> > 	ret = k->exit_code;
>>
>> I don't think this works. How does do_exit() preserve a stack var, other
>> than for a few cycles longer? Sure, the vfork_done will be OK, but this code
>> here will not be. I think you'd need a get_task_struct(current) before the
>> do_exit(ret)
>
> I think this works ;)
>
> This stack frame can't disappear until
> __put_task_struct()->...->free_thread_info().
> So, if you have a reference to task_struct, then it is safe to dereference
> to_kthread(task).

Correct.

> Before this patch, kthread_stop() can only be used when we know that kthread
> must not exit on its own. And with this patch we are safe in this case, note
> that kthread_stop() does get_task_struct() before it sets ->should_stop = 1.
> And this also pins the memory pointed to by to_kthread().
>> (the case where the kthread fn calls do_exit() is fine: you're
>> not allowed to call kthread stop on such threads).
>
> This was not allowed, but now this is fine. Please look at the 4/4 patch.
> But, in that case you must pin the task_struct after kthread_create(),
> otherwise (with or without this patch) you just can't use this task_struct
> in any way.

To finish converting everything from kernel_thread to kthreads,
we need to be able to call kthread_stop on threads that call do_exit.
That kthread_stop cannot be called on such threads is currently a major
deficiency of the kthread API.

>> In which case using vfork_done is really just a convenience pointer inside
>> struct task_struct to stash the struct kthread. And that's horribly ugly,
>> which is why I stuck with a simple global. Changing to a linked-list of
>> things to stop would avoid the deadlock you mentioned where a kthread stops
>> another kthread.
>
> Well, this patch overloads ->vfork_done, and I agree this is a bit ugly.
> But what you suggest (if I understand correctly) is more complex, and doesn't
> have any advantages, imho.

No. This patch does not overload or really abuse vfork_done.

vfork_done is named a bit misleadingly. It should arguably be called
mm_done. vfork_done (if set) always points to a completion that will
be completed when do_exit() -> mm_release() is called.

What we want is some completion that is completed from inside of do_exit.
The one completed in mm_release happens to be such a completion, and the
code already exists.

If mm_release did not do: tsk->vfork_done = NULL then the entire
test of if (k->vfork_done != NULL) would be unnecessary.

Oleg, on that note, we should not need a barrier at all. We should be
able to simply say:

	cmplp = k->vfork_done;
	if (cmplp) {
		/* if vfork_done is NULL we have passed mm_release */
		kthread = container_of(cmplp, struct kthread, exited);
		kthread->should_stop = 1;
		wake_up_process(k);
		wait_for_completion(&kthread->exited);
	}
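
To make the pointer arithmetic concrete, here is a minimal userspace sketch
of that lookup, not the kernel code. The struct completion and struct task
types, mock_exit() and kthread_to_stop() are hypothetical stand-ins invented
for illustration; only struct kthread, vfork_done and container_of come from
the patch under discussion. It is single threaded, so it says nothing about
compiler reloads or memory ordering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct completion. */
struct completion {
	bool done;
};

/* Per-thread state from the patch; in the kernel it lives on the kthread's stack. */
struct kthread {
	bool should_stop;
	struct completion exited;
};

/* Hypothetical cut-down task_struct: only the fields this sketch touches. */
struct task {
	struct completion *vfork_done;
	int exit_code;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Mock of the exit path: like mm_release(), clear vfork_done and then
 * complete whatever it pointed to.
 */
static void mock_exit(struct task *t, int code)
{
	struct completion *c;

	assert(t != NULL);
	c = t->vfork_done;
	t->exit_code = code;
	if (c) {
		t->vfork_done = NULL;
		c->done = true;
	}
}

/*
 * Read vfork_done once into a local and derive the struct kthread from
 * that local. Returns NULL if the task has already passed mock_exit().
 */
static struct kthread *kthread_to_stop(struct task *t)
{
	struct completion *cmplp = t->vfork_done;

	if (!cmplp)
		return NULL;
	return container_of(cmplp, struct kthread, exited);
}
```

With a struct kthread whose exited member is installed as vfork_done,
kthread_to_stop() returns that struct kthread before mock_exit() runs and
NULL afterwards, which is the case the conditional exists for.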

Thinking about it, I wish we had someplace we could store a pointer
that would not be cleared, so we could remove that whole confusing
conditional. I just looked through task_struct and there doesn't
appear to be anything promising.

Perhaps we could rename vfork_done to mm_done and not clear it in
mm_release.

Eric