All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH ptrace] ptrace: fix signal->wait_chldexit usage in task_clear_group_stop_trapping()
Date: Sun, 8 May 2011 15:35:43 +0200	[thread overview]
Message-ID: <20110508133543.GA4820@redhat.com> (raw)
In-Reply-To: <20110506095222.GB8824@htj.dyndns.org>

On 05/06, Tejun Heo wrote:
>
> GROUP_STOP_TRAPPING waiting mechanism piggybacks on
> signal->wait_chldexit which is primarily used to implement waiting for
> wait(2) and friends.  When do_wait() waits on signal->wait_chldexit,
> it uses a custom wake up callback, child_wait_callback(), which
> expects the child task which is waking up the parent to be passed in
> as @key to filter out spurious wakeups.
>
> task_clear_group_stop_trapping() used __wake_up_sync() which uses NULL
> @key causing the following oops if the parent was doing do_wait().
>
>   BUG: unable to handle kernel NULL pointer dereference at 00000000000002d8
>   IP: [<ffffffff810499f9>] child_wait_callback+0x29/0x80

Argh! Shame on me. Somehow I convinced myself that this needs the cleanup
but safe because we are not going to wake up the TASK_INTTERRUPTIBLE tasks
sleeping in do_wait(). Somehow I forgot that wait_queue_t->func() is called
anyway without checking "mode". I even specially mentioned this should be
safe during the review ;)

> I still think it's a mistake to piggyback on wait_chldexit for this.

Agreed. Previously I thought we should teach __wake_up_parent() to wake
up the TASK_UNINTERRUPTIBLE tasks, now I think this is not needed.

> Given the relative low frequency of ptrace use, we would be much
> better off leaving already complex wait_chldexit alone and using bit
> waitqueue.

Well, I don't think so. wait_on_bit() looks as unnecessary complication
to me. See below.

> --- work.orig/kernel/signal.c
> +++ work/kernel/signal.c
> @@ -238,8 +238,8 @@ static void task_clear_group_stop_trappi
>  {
>  	if (unlikely(task->group_stop & GROUP_STOP_TRAPPING)) {
>  		task->group_stop &= ~GROUP_STOP_TRAPPING;
> -		__wake_up_sync(&task->parent->signal->wait_chldexit,
> -			       TASK_UNINTERRUPTIBLE, 1);
> +		__wake_up_sync_key(&task->parent->signal->wait_chldexit,
> +				   TASK_UNINTERRUPTIBLE, 1, task);
>  	}
>  }

I believe this is correct and should fix the problem.


But. Why do we need signal->wait_chldexit or bit waitqueue at all?
Previously this was needed because wait_event(!GROUP_STOP_TRAPPING)
was called from ptrace_check_attach(), and the tracer can do anything
after ptrace_attach() which sets GROUP_STOP_TRAPPING.

With the current code we know that GROUP_STOP_TRAPPING means: the
tracer can't go away from ptrace_attach() until we clear this bit.

How about the patch below instead?


Hmm. This is minor and off-topic, but perhaps it makes sense to move
the code which sets/waits GROUP_STOP_TRAPPING from ptrace_attach() to
the separate function, it can be called outside of tasklist_lock and
cred_guard_mutex.

And. Could you remind why ptrace_attach() does signal_wake_up() instead
of wake_up_state(TASK_STOPPED) ? OK, in general we shouldn't set
GROUP_STOP_PENDING without TIF_SIGPENDING, but in this case?

Oleg.

--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -256,9 +256,16 @@ unlock_tasklist:
 unlock_creds:
 	mutex_unlock(&task->signal->cred_guard_mutex);
 out:
-	if (wait_trap)
-		wait_event(current->signal->wait_chldexit,
-			   !(task->group_stop & GROUP_STOP_TRAPPING));
+	if (wait_trap) {
+		for (;;) {
+			set_current_state(TASK_UNINTERRUPTIBLE);
+			if (!(task->group_stop & GROUP_STOP_TRAPPING))
+				break;
+			schedule();
+		}
+		__set_current_state(TASK_RUNNING);
+	}
+
 	return retval;
 }
 
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -238,8 +238,7 @@ static void task_clear_group_stop_trappi
 {
 	if (unlikely(task->group_stop & GROUP_STOP_TRAPPING)) {
 		task->group_stop &= ~GROUP_STOP_TRAPPING;
-		__wake_up_sync(&task->parent->signal->wait_chldexit,
-			       TASK_UNINTERRUPTIBLE, 1);
+		wake_up_process(task->parent);
 	}
 }
 


  reply	other threads:[~2011-05-08 13:36 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-06  9:52 [PATCH ptrace] ptrace: fix signal->wait_chldexit usage in task_clear_group_stop_trapping() Tejun Heo
2011-05-08 13:35 ` Oleg Nesterov [this message]
2011-05-08 14:02   ` Tejun Heo
2011-05-08 15:34     ` Oleg Nesterov
2011-05-08 15:40       ` Oleg Nesterov
2011-05-08 15:55       ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110508133543.GA4820@redhat.com \
    --to=oleg@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.