Re: [PATCH 3/4] kthreads: rework kthread_stop()

From: ebiederm@xmission.com (Eric W. Biederman)
To: Oleg Nesterov <oleg@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Hellwig <hch@lst.de>, Ingo Molnar <mingo@elte.hu>,
	Pavel Emelyanov <xemul@openvz.org>,
	Vitaliy Gusev <vgusev@openvz.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/4] kthreads: rework kthread_stop()
Date: Mon, 02 Feb 2009 09:57:36 -0800
Message-ID: <m1wsc82033.fsf@fess.ebiederm.org>
In-Reply-To: <20090201102117.GA5728@redhat.com> (Oleg Nesterov's message of "Sun, 1 Feb 2009 11:21:17 +0100")

Oleg Nesterov <oleg@redhat.com> writes:

> On 01/31, Rusty Russell wrote:
>>
>> On Friday 30 January 2009 23:20:58 Oleg Nesterov wrote:
>> > On 01/30, Oleg Nesterov wrote:
>> > >
>> > > With this patch kthread() allocates all necessary data (struct kthread)
>> > > on its own stack, globals kthread_stop_xxx are deleted. ->vfork_done
>> > > is used as a pointer into "struct kthread", this means kthread_stop()
>> > > can easily wait for kthread's exit.
>> 
>> > struct kthread {
>> > 	int should_stop;
>> > 	struct completion exited;
>> > };
>> 
>> Mildly prefer bool in new code.
>
> OK, and
>
>> > #define to_kthread(tsk)	\
>> > 	container_of((tsk)->vfork_done, struct kthread, exited)
>> 
>> This needs a comment.  Especially since to_xxx(yyy) is usually simply a
>> container_of(yyy, xxx, member).  This one is special.
>
> OK, I'll send the cleanup patch.
>
>> > int kthread_stop(struct task_struct *k)
>> > {
>> > 	struct kthread *kthread;
>> > 	int ret;
>> >
>> > 	trace_sched_kthread_stop(k);
>> > 	get_task_struct(k);
>> >
>> > 	kthread = to_kthread(k);
>> > 	barrier(); /* it might have exited */
>> > 	if (k->vfork_done != NULL) {
>> > 		kthread->should_stop = 1;
>> > 		wake_up_process(k);
>> > 		wait_for_completion(&kthread->exited);
>> > 	}
>> > 	ret = k->exit_code;
>>
>> I don't think this works.  How does do_exit() preserve a stack var, other
>> than for a few cycles longer?  Sure, the vfork_done will be OK, but this code
>> here will not be.  I think you'd need a get_task_struct(current) before the
>> do_exit(ret)
>
> I think this works ;)
>
> This stack frame can't disappear until
> __put_task_struct()->...->free_thread_info().
> So, if you have a reference to task_struct, then it is safe to dereference
> to_kthread(task).

Correct.
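
To make the lifetime argument concrete, here is a minimal userspace
sketch, not kernel code.  The names model_task_create(),
model_get_task() and model_put_task() are hypothetical stand-ins for
the task_struct reference counting.  The only point is that the memory
standing in for the thread stack is released by the final put, not when
the thread exits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace model only; this is not the kernel's task_struct. */
struct task {
	int usage;    /* reference count */
	void *stack;  /* stands in for the thread stack holding struct kthread */
};

/* Create a task that starts with one reference, owned by the task itself. */
static struct task *model_task_create(size_t stack_size)
{
	struct task *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->stack = malloc(stack_size);
	if (!t->stack) {
		free(t);
		return NULL;
	}
	t->usage = 1;
	return t;
}

/* Take an extra reference, as kthread_stop() does before touching the task. */
static void model_get_task(struct task *t)
{
	assert(t->usage > 0);
	t->usage++;
}

/* Drop a reference; returns true if this was the last one and memory was freed. */
static bool model_put_task(struct task *t)
{
	assert(t->usage > 0);
	if (--t->usage > 0)
		return false;
	free(t->stack);
	free(t);
	return true;
}
```

In this model, a caller that did model_get_task() can still read data
placed in t->stack after the task has dropped its own reference.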

> Before this patch, kthread_stop() can only be used when we know that kthread
> must not exit on its own. And with this patch we are safe in this case, note
> that kthread_stop() does get_task_struct() before it sets ->should_stop = 1.
> And this also pins the memory pointed to by to_kthread().

>> (the case where the kthread fn calls do_exit() is fine: you're
>> not allowed to call kthread stop on such threads).
>
> This was not allowed, but now this is fine. Please look at the 4/4 patch.
> But, in that case you must pin the task_struct after kthread_create(),
> otherwise (with or without this patch) you just can't use this task_struct
> in any way.

To finish converting everything from kernel_thread() to kthreads, we
need to be able to call kthread_stop() on threads that call do_exit().
That kthread_stop() cannot be called on such threads is currently a
major deficiency of the kthread API.

>> In which case using vfork_done is really just a convenience pointer inside
>> struct task_struct to stash the struct kthread.  And that's horribly ugly,
>> which is why I stuck with a simple global.  Changing to a linked-list of
>> things to stop would avoid the deadlock you mentioned where a kthread stops
>> another kthread.
>
> Well, this patch overloads ->vfork_done, and I agree this is a bit ugly.
> But what you suggest (if I understand correctly) is more complex, and doesn't
> have any advantages, imho.

No.  This patch does not overload or really abuse vfork_done.
vfork_done is named a bit misleadingly.  It should arguably be called
mm_done.  vfork_done (if set) always points to a completion that will
be completed when do_exit() -> mm_release() is called.

What we want is a completion that is signalled from inside do_exit().
mm_release() already provides one, and the code already exists.

If mm_release() did not do tsk->vfork_done = NULL, then the entire
if (k->vfork_done != NULL) test would be unnecessary.

Oleg, on that note, we should not need a barrier at all.  We should be
able to simply say:

cmplp = k->vfork_done;
/* If vfork_done is NULL, the thread has already passed mm_release(). */
if (cmplp) {
	kthread = container_of(cmplp, struct kthread, exited);
	kthread->should_stop = 1;
	wake_up_process(k);
	wait_for_completion(&kthread->exited);
}
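
The same read-once pattern can be tried outside the kernel.  The
following is a single-threaded userspace sketch, not the real
implementation: struct completion, struct task, model_mm_release() and
model_request_stop() are simplified, hypothetical stand-ins, and
nothing here sleeps or wakes anything.  It only shows that the pointer
is read once into a local, and that container_of() is applied only
after that local is known to be non-NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins; these are NOT the kernel definitions. */
struct completion {
	bool done;
};

struct kthread {
	bool should_stop;
	struct completion exited;
};

struct task {
	struct completion *vfork_done; /* NULL once the exit path has run */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Models the exit path: clear the pointer, then signal the completion. */
static void model_mm_release(struct task *t)
{
	struct completion *c = t->vfork_done;

	if (c) {
		t->vfork_done = NULL;
		c->done = true;
	}
}

/*
 * Models the stop request.  Returns true if a stop was requested,
 * false if the task had already gone through model_mm_release().
 */
static bool model_request_stop(struct task *t)
{
	struct completion *cmplp;
	struct kthread *kt;

	assert(t != NULL);
	cmplp = t->vfork_done;	/* read the pointer exactly once */
	if (!cmplp)
		return false;

	kt = container_of(cmplp, struct kthread, exited);
	kt->should_stop = true;
	return true;
}
```

Because container_of() is only applied to the local copy, there is no
pointer arithmetic on a NULL vfork_done in this version.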

Thinking about it, I wish we had someplace to store a pointer that
would not be cleared, so we could remove that whole confusing
conditional.  I just looked through task_struct and there doesn't
appear to be anything promising.

Perhaps we could rename vfork_done to mm_done and not clear it in
mm_release().
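
A rough userspace sketch of what that would buy us, assuming a
hypothetical mm_done field that is set once and never cleared.  None of
these names exist in the kernel today; model_mm_release(),
model_to_kthread() and model_request_stop() are illustrative only, and
the sketch assumes the task is a kthread whose mm_done was set at
creation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins; these are NOT the kernel definitions. */
struct completion {
	bool done;
};

struct kthread {
	bool should_stop;
	struct completion exited;
};

struct task {
	struct completion *mm_done; /* hypothetical: set once, never cleared */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Models the exit path: signal the completion but keep the pointer. */
static void model_mm_release(struct task *t)
{
	if (t->mm_done)
		t->mm_done->done = true;
}

/* Valid before and after exit, as long as the backing memory is pinned. */
static struct kthread *model_to_kthread(struct task *t)
{
	assert(t->mm_done != NULL);
	return container_of(t->mm_done, struct kthread, exited);
}

/*
 * No NULL test is needed.  Returns true if the task had already exited,
 * which is the case where a real wait would return immediately.
 */
static bool model_request_stop(struct task *t)
{
	struct kthread *kt = model_to_kthread(t);

	kt->should_stop = true;
	return kt->exited.done;
}
```

This only holds up if the caller pins the memory behind mm_done, which
in the patch means holding a reference on the task_struct.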

Eric
