Re: [PATCH 1/4] forget_original_parent: split out the un-ptrace part

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Oleg Nesterov <oleg@redhat.com>
To: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	"Metzger, Markus T" <markus.t.metzger@intel.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/4] forget_original_parent: split out the un-ptrace part
Date: Wed, 25 Feb 2009 21:44:54 +0100	[thread overview]
Message-ID: <20090225204454.GA11842@redhat.com> (raw)
In-Reply-To: <20090225003408.1DA81FC380@magilla.sf.frob.com>

On 02/24, Roland McGrath wrote:
>
> > > --- a/kernel/ptrace.c
> > > +++ b/kernel/ptrace.c
> > > @@ -534,7 +534,7 @@ repeat:
> > >  		 * Set the ptrace bit in the process ptrace flags.
> > >  		 * Then link us on our parent's ptraced list.
> > >  		 */
> > > -		if (!ret) {
> > > +		if (!ret && !(current->real_parent->flags & PF_EXITING)) {
> > >  			current->ptrace |= PT_PTRACED;
> >
> > Yes sure.
> >
> > But this means exit_ptrace() must always take tasklist, otherwise we
> > don't have the necessary barriers.
>
> Really?
>
> 	exit_signals(tsk);  /* sets PF_EXITING */
> 	/*
> 	 * tsk->flags are checked in the futex code to protect against
> 	 * an exiting task cleaning up the robust pi futexes.
> 	 */
> 	smp_mb();
>
> This is an exactly analogous use, isn't it?  So exit_ptrace() just has to
> follow this same existing barrier.  Right?

Yes, we do have the barrier between "flags |= PF_EXITING" and
"if (list_empty(ptraced))" in exit_ptrace(), but it is not enough.

Because the exiting ->real_parent can both set PF_EXITING and return
from exit_ptrace() (without taking tasklist because it sees ->ptraced
is empty) right after the child checks ->real_parent->flags & PF_EXITING.

I am still thinking what can we do here (and btw my apologies for delay,
some stupid reasons distract me).

> > But from the _pure theoretical_ pov, it is not correct to assume that
> > list_empty(&tracer->ptraced) == T means that current can not be used
> > somehow as tracee->parent. Another subthread can release a dead tracee.
>
> I don't follow how that's relevant.  If list_empty(), then it was empty or
> is becoming empty.  It can't then become nonempty again (because the thread
> doing the check is the only one that adds to that list).  That's all we're
> assuming.
>
> > For example, list_empty(&tracer->ptraced) == T doesn't mean that the
> > STOREs to this task_struct are finished, list_del_init(->ptrace_entry)
> > can still be in progress.
>
> Sure, but so what?  The check is to verify that some new list_del* (and
> related cleanup work, of course) doesn't need to be *started*.

Well. I am starting to regret I mentioned this "problem" ;) Because even
if I am right (it is very possible I am not), this all is _absolutely_
theoretical. Let me try again to explain what I meant.

First of all, in theory write_lock_irq() does not imply rcu_read_lock().

Now let's suppose that the exiting task T does exit_ptrace(), sees the
empty ->ptraced list, and then can do release_task()->call_rcu(put_task_struct)
without taking the tasklist_lock on this path.

Let's also suppose that we race with another sub-thread which reaps
a zombie tracee and does __ptrace_unlink()->list_del_init(ptrace_entry).
__list_del does 1) next->prev = prev and 2) prev->next = next.

Let's suppose 2 is completed, but 1 is not.

T checks list_empty(->ptraced) and sees head->next == head. It proceeds
and calls call_rcu(put_task_struct).

Since (in theory!) we do not have rcu_read_lock(), it is possible that
task_struct is already freed when 1 write to the memory.

But actually I meant that this is not really safe "in general". Let's
suppose we change __ptrace_unlink() so that it does, say,
BUG_ON(child->parent->exit_state != 0) before untracing. Yes, sure,
this is ugly, but correct. Or BUG_ON(!child->parent->signal), or
whatever else.

But this is only correct because T takes tasklist before it actually
starts to "destroy" itself.

In short, my point is: even if exit_ptrace() sees list_empty(->ptraced),
it is possible that the just-untraced tracee "looks" at us and expects
that the former tracer is "alive" enough.

Oleg.

     prev parent reply	other threads:[~2009-02-25 20:47 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-02-11 21:12 [PATCH 1/4] forget_original_parent: split out the un-ptrace part Oleg Nesterov
2009-02-20  2:27 ` Roland McGrath
2009-02-23 16:46   ` Oleg Nesterov
2009-02-23 18:26     ` Oleg Nesterov
2009-02-23 18:57     ` Oleg Nesterov
2009-02-25  0:34     ` Roland McGrath
2009-02-25 20:44       ` Oleg Nesterov [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090225204454.GA11842@redhat.com \
    --to=oleg@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=ebiederm@xmission.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=markus.t.metzger@intel.com \
    --cc=roland@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.