linux-kernel.vger.kernel.org archive mirror
* [PATCHSET ptrace] ptrace: prepare for PTRACE_SEIZE/INTERRUPT
@ 2011-05-13 15:46 Tejun Heo
  2011-05-13 15:46 ` [PATCH 1/9] job control: reorganize wait_task_stopped() Tejun Heo
                   ` (8 more replies)
  0 siblings, 9 replies; 24+ messages in thread
From: Tejun Heo @ 2011-05-13 15:46 UTC (permalink / raw)
  To: oleg, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, bdonlan

Hello,

This patchset is an extension of the preparation patches extracted from
the "ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop
notification" patchset[1].

Trivial and reviewed ones are collected at the beginning so that they
can be applied to the ptrace tree.  The latter half concentrates on
improving TRAPPING handling without actually implementing any new
ptrace request.

This patchset makes the following changes to TRAPPING wait.

* TRAPPING is cleared when an action which requires cancellation
  happens, and the fallback clearing at the end of do_signal_stop() is
  removed.  This doesn't require adding any extra explicit
  task_clear_jobctl_trapping() call sites; jobctl flag clearing
  automatically handles cancellation of TRAPPING.

* Instead of signal->wait_chldexit, a bit waitqueue is used.  This
  removes unnecessary complexity and the TRAPPING wait's dependency on
  the parent/child hierarchy.  Note that this also fixes the
  wrong-waitqueue bug Oleg pointed out in wait_trapping().

* The TRAPPING wait is moved from PTRACE_ATTACH to wait_task_stopped()
  and ptrace_check_attach().  In both cases, the TRAPPING wait uses
  restart_syscall() to retry.  This simplifies the code and, combined
  with the next change, makes TRAPPING much safer.

* TRAPPING now uses interruptible sleep.  This makes it way safer even
  if we get something wrong - at least, the tracer can be easily
  killed no matter what.  Also, this makes the TRAPPING wait behave
  just like other syscall retries, and mixing TRAPPING waits with
  freezing becomes much easier.

Other than making the transient TASK_RUNNING during TRAPPING visible
through /proc and using automatically restarted interruptible sleep,
this patchset doesn't make any userland-visible changes.

This patchset contains the following nine patches.

  0001-job-control-reorganize-wait_task_stopped.patch
  0002-job-control-rename-signal-group_stop-and-flags-to-jo.patch
  0003-ptrace-ptrace_check_attach-rename-kill-to-ignore_sta.patch
  0004-ptrace-relocate-set_current_state-TASK_TRACED-in-ptr.patch
  0005-job-control-introduce-JOBCTL_PENDING_MASK-and-task_c.patch
  0006-job-control-make-task_clear_jobctl_pending-clear-TRA.patch
  0007-ptrace-use-bit_waitqueue-for-TRAPPING-instead-of-wai.patch
  0008-ptrace-move-JOBCTL_TRAPPING-wait-to-wait-2-and-ptrac.patch
  0009-ptrace-make-TRAPPING-wait-interruptible.patch

0001-0004 are reviewed or mostly trivial prep patches.

0005-0006 move clearing of TRAPPING to the actions requiring it.

0007 makes TRAPPING wait use bit waitqueue.

0008-0009 move the TRAPPING wait to wait(2) and ptrace_check_attach()
and make it interruptible.

This patchset is on top of the current ptrace branch[2] - 40ae717d1e
"ptrace: fix signal->wait_chldexit usage in
task_clear_group_stop_trapping()" - and is available in the following
git branch.

  git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-ptrace-seize-prep

diffstat follows.

 fs/exec.c              |    2 
 include/linux/ptrace.h |    3 -
 include/linux/sched.h  |   19 ++++--
 kernel/exit.c          |   46 +++++++++++++---
 kernel/ptrace.c        |   92 ++++++++++++++++++++++++++++-----
 kernel/signal.c        |  134 +++++++++++++++++++++++++------------------------
 6 files changed, 200 insertions(+), 96 deletions(-)

Thanks.

--
tejun

[1] http://thread.gmane.org/gmane.linux.kernel/1136930
[2] git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc.git ptrace

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/9] job control: reorganize wait_task_stopped()
  2011-05-13 15:46 [PATCHSET ptrace] ptrace: prepare for PTRACE_SEIZE/INTERRUPT Tejun Heo
@ 2011-05-13 15:46 ` Tejun Heo
  2011-05-16 11:56   ` Oleg Nesterov
  2011-05-13 15:46 ` [PATCH 2/9] job control: rename signal->group_stop and flags to jobctl and rearrange flags Tejun Heo
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2011-05-13 15:46 UTC (permalink / raw)
  To: oleg, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, bdonlan, Tejun Heo

wait_consider_task() tested task_stopped_code() without acquiring
siglock and, if a stop condition existed, called wait_task_stopped()
and directly returned the result.  This patch moves the initial
task_stopped_code() test into wait_task_stopped() and makes
wait_consider_task() fall through to wait_task_continued() on 0 return.

This is for the following two reasons.

* Because the initial task_stopped_code() test is done without
  acquiring siglock, it may race against SIGCONT generation.  The
  stopped condition might have been replaced by continued state by the
  time wait_task_stopped() acquired siglock.  This may lead to
  unexpected failure of WNOHANG waits.

  This reorganization addresses this single race case but there are
  other cases - TASK_RUNNING -> TASK_STOPPED transition and EXIT_*
  transitions.

* Scheduled ptrace updates require changes to the initial test which
  would fit better inside wait_task_stopped().

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/exit.c |   30 +++++++++++++++++++++++-------
 1 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 5cbc83e..3383793 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1377,11 +1377,23 @@ static int *task_stopped_code(struct task_struct *p, bool ptrace)
 	return NULL;
 }
 
-/*
- * Handle sys_wait4 work for one task in state TASK_STOPPED.  We hold
- * read_lock(&tasklist_lock) on entry.  If we return zero, we still hold
- * the lock and this task is uninteresting.  If we return nonzero, we have
- * released the lock and the system call should return.
+/**
+ * wait_task_stopped - Wait for %TASK_STOPPED or %TASK_TRACED
+ * @wo: wait options
+ * @ptrace: is the wait for ptrace
+ * @p: task to wait for
+ *
+ * Handle sys_wait4() work for %p in state %TASK_STOPPED or %TASK_TRACED.
+ *
+ * CONTEXT:
+ * read_lock(&tasklist_lock), which is released if return value is
+ * non-zero.  Also, grabs and releases @p->sighand->siglock.
+ *
+ * RETURNS:
+ * 0 if wait condition didn't exist and search for other wait conditions
+ * should continue.  Non-zero return, -errno on failure and @p's pid on
+ * success, implies that tasklist_lock is released and wait condition
+ * search should terminate.
  */
 static int wait_task_stopped(struct wait_opts *wo,
 				int ptrace, struct task_struct *p)
@@ -1397,6 +1409,9 @@ static int wait_task_stopped(struct wait_opts *wo,
 	if (!ptrace && !(wo->wo_flags & WUNTRACED))
 		return 0;
 
+	if (!task_stopped_code(p, ptrace))
+		return 0;
+
 	exit_code = 0;
 	spin_lock_irq(&p->sighand->siglock);
 
@@ -1607,8 +1622,9 @@ static int wait_consider_task(struct wait_opts *wo, int ptrace,
 	 * Wait for stopped.  Depending on @ptrace, different stopped state
 	 * is used and the two don't interact with each other.
 	 */
-	if (task_stopped_code(p, ptrace))
-		return wait_task_stopped(wo, ptrace, p);
+	ret = wait_task_stopped(wo, ptrace, p);
+	if (ret)
+		return ret;
 
 	/*
 	 * Wait for continued.  There's only one continued state and the
-- 
1.7.1



* [PATCH 2/9] job control: rename signal->group_stop and flags to jobctl and rearrange flags
  2011-05-13 15:46 [PATCHSET ptrace] ptrace: prepare for PTRACE_SEIZE/INTERRUPT Tejun Heo
  2011-05-13 15:46 ` [PATCH 1/9] job control: reorganize wait_task_stopped() Tejun Heo
@ 2011-05-13 15:46 ` Tejun Heo
  2011-05-13 15:46 ` [PATCH 3/9] ptrace: ptrace_check_attach(): rename @kill to @ignore_state and add comments Tejun Heo
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Tejun Heo @ 2011-05-13 15:46 UTC (permalink / raw)
  To: oleg, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, bdonlan, Tejun Heo

signal->group_stop currently hosts mostly group stop related flags;
however, it's gonna be used for wider purposes and the GROUP_STOP_
flag prefix becomes confusing.  Rename signal->group_stop to
signal->jobctl and rename all GROUP_STOP_* flags to JOBCTL_*.

Also, reassign JOBCTL_TRAPPING to bit 21 to better accommodate future
additions.

This doesn't cause any functional change.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/exec.c             |    2 +-
 include/linux/sched.h |   16 ++++----
 kernel/ptrace.c       |   12 +++---
 kernel/signal.c       |   91 +++++++++++++++++++++++++------------------------
 4 files changed, 61 insertions(+), 60 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 8328beb..fd6a4fb 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1659,7 +1659,7 @@ static int zap_process(struct task_struct *start, int exit_code)
 
 	t = start;
 	do {
-		task_clear_group_stop_pending(t);
+		task_clear_jobctl_stop_pending(t);
 		if (t != current && t->mm) {
 			sigaddset(&t->pending.signal, SIGKILL);
 			signal_wake_up(t, 1);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 3f53c25..408f2b9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1260,7 +1260,7 @@ struct task_struct {
 	int exit_state;
 	int exit_code, exit_signal;
 	int pdeath_signal;  /*  The signal sent when the parent dies  */
-	unsigned int group_stop;	/* GROUP_STOP_*, siglock protected */
+	unsigned int jobctl;	/* JOBCTL_*, siglock protected */
 	/* ??? */
 	unsigned int personality;
 	unsigned did_exec:1;
@@ -1778,15 +1778,15 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
 #define used_math() tsk_used_math(current)
 
 /*
- * task->group_stop flags
+ * task->jobctl flags
  */
-#define GROUP_STOP_SIGMASK	0xffff    /* signr of the last group stop */
-#define GROUP_STOP_PENDING	(1 << 16) /* task should stop for group stop */
-#define GROUP_STOP_CONSUME	(1 << 17) /* consume group stop count */
-#define GROUP_STOP_TRAPPING	(1 << 18) /* switching from STOPPED to TRACED */
-#define GROUP_STOP_DEQUEUED	(1 << 19) /* stop signal dequeued */
+#define JOBCTL_STOP_SIGMASK	0xffff    /* signr of the last group stop */
+#define JOBCTL_STOP_DEQUEUED	(1 << 16) /* stop signal dequeued */
+#define JOBCTL_STOP_PENDING	(1 << 17) /* task should stop for group stop */
+#define JOBCTL_STOP_CONSUME	(1 << 18) /* consume group stop count */
+#define JOBCTL_TRAPPING		(1 << 21) /* switching to TRACED */
 
-extern void task_clear_group_stop_pending(struct task_struct *task);
+extern void task_clear_jobctl_stop_pending(struct task_struct *task);
 
 #ifdef CONFIG_PREEMPT_RCU
 
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 512bd01..30de03b 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -76,13 +76,13 @@ void __ptrace_unlink(struct task_struct *child)
 	spin_lock(&child->sighand->siglock);
 
 	/*
-	 * Reinstate GROUP_STOP_PENDING if group stop is in effect and
+	 * Reinstate JOBCTL_STOP_PENDING if group stop is in effect and
 	 * @child isn't dead.
 	 */
 	if (!(child->flags & PF_EXITING) &&
 	    (child->signal->flags & SIGNAL_STOP_STOPPED ||
 	     child->signal->group_stop_count))
-		child->group_stop |= GROUP_STOP_PENDING;
+		child->jobctl |= JOBCTL_STOP_PENDING;
 
 	/*
 	 * If transition to TASK_STOPPED is pending or in TASK_TRACED, kick
@@ -90,7 +90,7 @@ void __ptrace_unlink(struct task_struct *child)
 	 * is in TASK_TRACED; otherwise, we might unduly disrupt
 	 * TASK_KILLABLE sleeps.
 	 */
-	if (child->group_stop & GROUP_STOP_PENDING || task_is_traced(child))
+	if (child->jobctl & JOBCTL_STOP_PENDING || task_is_traced(child))
 		signal_wake_up(child, task_is_traced(child));
 
 	spin_unlock(&child->sighand->siglock);
@@ -226,7 +226,7 @@ static int ptrace_attach(struct task_struct *task)
 	spin_lock(&task->sighand->siglock);
 
 	/*
-	 * If the task is already STOPPED, set GROUP_STOP_PENDING and
+	 * If the task is already STOPPED, set JOBCTL_STOP_PENDING and
 	 * TRAPPING, and kick it so that it transits to TRACED.  TRAPPING
 	 * will be cleared if the child completes the transition or any
 	 * event which clears the group stop states happens.  We'll wait
@@ -243,7 +243,7 @@ static int ptrace_attach(struct task_struct *task)
 	 * in and out of STOPPED are protected by siglock.
 	 */
 	if (task_is_stopped(task)) {
-		task->group_stop |= GROUP_STOP_PENDING | GROUP_STOP_TRAPPING;
+		task->jobctl |= JOBCTL_STOP_PENDING | JOBCTL_TRAPPING;
 		signal_wake_up(task, 1);
 		wait_trap = true;
 	}
@@ -258,7 +258,7 @@ unlock_creds:
 out:
 	if (wait_trap)
 		wait_event(current->signal->wait_chldexit,
-			   !(task->group_stop & GROUP_STOP_TRAPPING));
+			   !(task->jobctl & JOBCTL_TRAPPING));
 	return retval;
 }
 
diff --git a/kernel/signal.c b/kernel/signal.c
index 35c5f4b..68c2361 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -124,7 +124,7 @@ static inline int has_pending_signals(sigset_t *signal, sigset_t *blocked)
 
 static int recalc_sigpending_tsk(struct task_struct *t)
 {
-	if ((t->group_stop & GROUP_STOP_PENDING) ||
+	if ((t->jobctl & JOBCTL_STOP_PENDING) ||
 	    PENDING(&t->pending, &t->blocked) ||
 	    PENDING(&t->signal->shared_pending, &t->blocked)) {
 		set_tsk_thread_flag(t, TIF_SIGPENDING);
@@ -224,27 +224,28 @@ static inline void print_dropped_signal(int sig)
 }
 
 /**
- * task_clear_group_stop_trapping - clear group stop trapping bit
+ * task_clear_jobctl_trapping - clear jobctl trapping bit
  * @task: target task
  *
- * If GROUP_STOP_TRAPPING is set, a ptracer is waiting for us.  Clear it
- * and wake up the ptracer.  Note that we don't need any further locking.
- * @task->siglock guarantees that @task->parent points to the ptracer.
+ * If JOBCTL_TRAPPING is set, a ptracer is waiting for us to enter TRACED.
+ * Clear it and wake up the ptracer.  Note that we don't need any further
+ * locking.  @task->siglock guarantees that @task->parent points to the
+ * ptracer.
  *
  * CONTEXT:
  * Must be called with @task->sighand->siglock held.
  */
-static void task_clear_group_stop_trapping(struct task_struct *task)
+static void task_clear_jobctl_trapping(struct task_struct *task)
 {
-	if (unlikely(task->group_stop & GROUP_STOP_TRAPPING)) {
-		task->group_stop &= ~GROUP_STOP_TRAPPING;
+	if (unlikely(task->jobctl & JOBCTL_TRAPPING)) {
+		task->jobctl &= ~JOBCTL_TRAPPING;
 		__wake_up_sync_key(&task->parent->signal->wait_chldexit,
 				   TASK_UNINTERRUPTIBLE, 1, task);
 	}
 }
 
 /**
- * task_clear_group_stop_pending - clear pending group stop
+ * task_clear_jobctl_stop_pending - clear pending group stop
  * @task: target task
  *
  * Clear group stop states for @task.
@@ -252,19 +253,19 @@ static void task_clear_group_stop_trapping(struct task_struct *task)
  * CONTEXT:
  * Must be called with @task->sighand->siglock held.
  */
-void task_clear_group_stop_pending(struct task_struct *task)
+void task_clear_jobctl_stop_pending(struct task_struct *task)
 {
-	task->group_stop &= ~(GROUP_STOP_PENDING | GROUP_STOP_CONSUME |
-			      GROUP_STOP_DEQUEUED);
+	task->jobctl &= ~(JOBCTL_STOP_PENDING | JOBCTL_STOP_CONSUME |
+			  JOBCTL_STOP_DEQUEUED);
 }
 
 /**
  * task_participate_group_stop - participate in a group stop
  * @task: task participating in a group stop
  *
- * @task has GROUP_STOP_PENDING set and is participating in a group stop.
+ * @task has %JOBCTL_STOP_PENDING set and is participating in a group stop.
  * Group stop states are cleared and the group stop count is consumed if
- * %GROUP_STOP_CONSUME was set.  If the consumption completes the group
+ * %JOBCTL_STOP_CONSUME was set.  If the consumption completes the group
  * stop, the appropriate %SIGNAL_* flags are set.
  *
  * CONTEXT:
@@ -277,11 +278,11 @@ void task_clear_group_stop_pending(struct task_struct *task)
 static bool task_participate_group_stop(struct task_struct *task)
 {
 	struct signal_struct *sig = task->signal;
-	bool consume = task->group_stop & GROUP_STOP_CONSUME;
+	bool consume = task->jobctl & JOBCTL_STOP_CONSUME;
 
-	WARN_ON_ONCE(!(task->group_stop & GROUP_STOP_PENDING));
+	WARN_ON_ONCE(!(task->jobctl & JOBCTL_STOP_PENDING));
 
-	task_clear_group_stop_pending(task);
+	task_clear_jobctl_stop_pending(task);
 
 	if (!consume)
 		return false;
@@ -604,7 +605,7 @@ int dequeue_signal(struct task_struct *tsk, sigset_t *mask, siginfo_t *info)
 		 * is to alert stop-signal processing code when another
 		 * processor has come along and cleared the flag.
 		 */
-		current->group_stop |= GROUP_STOP_DEQUEUED;
+		current->jobctl |= JOBCTL_STOP_DEQUEUED;
 	}
 	if ((info->si_code & __SI_MASK) == __SI_TIMER && info->si_sys_private) {
 		/*
@@ -809,7 +810,7 @@ static int prepare_signal(int sig, struct task_struct *p, int from_ancestor_ns)
 		rm_from_queue(SIG_KERNEL_STOP_MASK, &signal->shared_pending);
 		t = p;
 		do {
-			task_clear_group_stop_pending(t);
+			task_clear_jobctl_stop_pending(t);
 			rm_from_queue(SIG_KERNEL_STOP_MASK, &t->pending);
 			wake_up_state(t, __TASK_STOPPED);
 		} while_each_thread(p, t);
@@ -925,7 +926,7 @@ static void complete_signal(int sig, struct task_struct *p, int group)
 			signal->group_stop_count = 0;
 			t = p;
 			do {
-				task_clear_group_stop_pending(t);
+				task_clear_jobctl_stop_pending(t);
 				sigaddset(&t->pending.signal, SIGKILL);
 				signal_wake_up(t, 1);
 			} while_each_thread(p, t);
@@ -1160,7 +1161,7 @@ int zap_other_threads(struct task_struct *p)
 	p->signal->group_stop_count = 0;
 
 	while_each_thread(p, t) {
-		task_clear_group_stop_pending(t);
+		task_clear_jobctl_stop_pending(t);
 		count++;
 
 		/* Don't bother with already dead threads */
@@ -1738,7 +1739,7 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	 * clear now.  We act as if SIGCONT is received after TASK_TRACED
 	 * is entered - ignore it.
 	 */
-	if (why == CLD_STOPPED && (current->group_stop & GROUP_STOP_PENDING))
+	if (why == CLD_STOPPED && (current->jobctl & JOBCTL_STOP_PENDING))
 		gstop_done = task_participate_group_stop(current);
 
 	current->last_siginfo = info;
@@ -1751,12 +1752,12 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	set_current_state(TASK_TRACED);
 
 	/*
-	 * We're committing to trapping.  Clearing GROUP_STOP_TRAPPING and
+	 * We're committing to trapping.  Clearing JOBCTL_TRAPPING and
 	 * transition to TASK_TRACED should be atomic with respect to
-	 * siglock.  This hsould be done after the arch hook as siglock is
+	 * siglock.  This should be done after the arch hook as siglock is
 	 * released and regrabbed across it.
 	 */
-	task_clear_group_stop_trapping(current);
+	task_clear_jobctl_trapping(current);
 
 	spin_unlock_irq(&current->sighand->siglock);
 	read_lock(&tasklist_lock);
@@ -1792,9 +1793,9 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 		 *
 		 * If @gstop_done, the ptracer went away between group stop
 		 * completion and here.  During detach, it would have set
-		 * GROUP_STOP_PENDING on us and we'll re-enter TASK_STOPPED
-		 * in do_signal_stop() on return, so notifying the real
-		 * parent of the group stop completion is enough.
+		 * JOBCTL_STOP_PENDING on us and we'll re-enter
+		 * TASK_STOPPED in do_signal_stop() on return, so notifying
+		 * the real parent of the group stop completion is enough.
 		 */
 		if (gstop_done)
 			do_notify_parent_cldstop(current, false, why);
@@ -1856,14 +1857,14 @@ static int do_signal_stop(int signr)
 {
 	struct signal_struct *sig = current->signal;
 
-	if (!(current->group_stop & GROUP_STOP_PENDING)) {
-		unsigned int gstop = GROUP_STOP_PENDING | GROUP_STOP_CONSUME;
+	if (!(current->jobctl & JOBCTL_STOP_PENDING)) {
+		unsigned int gstop = JOBCTL_STOP_PENDING | JOBCTL_STOP_CONSUME;
 		struct task_struct *t;
 
-		/* signr will be recorded in task->group_stop for retries */
-		WARN_ON_ONCE(signr & ~GROUP_STOP_SIGMASK);
+		/* signr will be recorded in task->jobctl for retries */
+		WARN_ON_ONCE(signr & ~JOBCTL_STOP_SIGMASK);
 
-		if (!likely(current->group_stop & GROUP_STOP_DEQUEUED) ||
+		if (!likely(current->jobctl & JOBCTL_STOP_DEQUEUED) ||
 		    unlikely(signal_group_exit(sig)))
 			return 0;
 		/*
@@ -1890,19 +1891,19 @@ static int do_signal_stop(int signr)
 		else
 			WARN_ON_ONCE(!task_ptrace(current));
 
-		current->group_stop &= ~GROUP_STOP_SIGMASK;
-		current->group_stop |= signr | gstop;
+		current->jobctl &= ~JOBCTL_STOP_SIGMASK;
+		current->jobctl |= signr | gstop;
 		sig->group_stop_count = 1;
 		for (t = next_thread(current); t != current;
 		     t = next_thread(t)) {
-			t->group_stop &= ~GROUP_STOP_SIGMASK;
+			t->jobctl &= ~JOBCTL_STOP_SIGMASK;
 			/*
 			 * Setting state to TASK_STOPPED for a group
 			 * stop is always done with the siglock held,
 			 * so this check has no races.
 			 */
 			if (!(t->flags & PF_EXITING) && !task_is_stopped(t)) {
-				t->group_stop |= signr | gstop;
+				t->jobctl |= signr | gstop;
 				sig->group_stop_count++;
 				signal_wake_up(t, 0);
 			}
@@ -1943,23 +1944,23 @@ retry:
 
 		spin_lock_irq(&current->sighand->siglock);
 	} else {
-		ptrace_stop(current->group_stop & GROUP_STOP_SIGMASK,
+		ptrace_stop(current->jobctl & JOBCTL_STOP_SIGMASK,
 			    CLD_STOPPED, 0, NULL);
 		current->exit_code = 0;
 	}
 
 	/*
-	 * GROUP_STOP_PENDING could be set if another group stop has
+	 * JOBCTL_STOP_PENDING could be set if another group stop has
 	 * started since being woken up or ptrace wants us to transit
 	 * between TASK_STOPPED and TRACED.  Retry group stop.
 	 */
-	if (current->group_stop & GROUP_STOP_PENDING) {
-		WARN_ON_ONCE(!(current->group_stop & GROUP_STOP_SIGMASK));
+	if (current->jobctl & JOBCTL_STOP_PENDING) {
+		WARN_ON_ONCE(!(current->jobctl & JOBCTL_STOP_SIGMASK));
 		goto retry;
 	}
 
 	/* PTRACE_ATTACH might have raced with task killing, clear trapping */
-	task_clear_group_stop_trapping(current);
+	task_clear_jobctl_trapping(current);
 
 	spin_unlock_irq(&current->sighand->siglock);
 
@@ -2078,8 +2079,8 @@ relock:
 		if (unlikely(signr != 0))
 			ka = return_ka;
 		else {
-			if (unlikely(current->group_stop &
-				     GROUP_STOP_PENDING) && do_signal_stop(0))
+			if (unlikely(current->jobctl & JOBCTL_STOP_PENDING) &&
+			    do_signal_stop(0))
 				goto relock;
 
 			signr = dequeue_signal(current, &current->blocked,
@@ -2253,7 +2254,7 @@ void exit_signals(struct task_struct *tsk)
 	signotset(&unblocked);
 	retarget_shared_pending(tsk, &unblocked);
 
-	if (unlikely(tsk->group_stop & GROUP_STOP_PENDING) &&
+	if (unlikely(tsk->jobctl & JOBCTL_STOP_PENDING) &&
 	    task_participate_group_stop(tsk))
 		group_stop = CLD_STOPPED;
 out:
-- 
1.7.1



* [PATCH 3/9] ptrace: ptrace_check_attach(): rename @kill to @ignore_state and add comments
  2011-05-13 15:46 [PATCHSET ptrace] ptrace: prepare for PTRACE_SEIZE/INTERRUPT Tejun Heo
  2011-05-13 15:46 ` [PATCH 1/9] job control: reorganize wait_task_stopped() Tejun Heo
  2011-05-13 15:46 ` [PATCH 2/9] job control: rename signal->group_stop and flags to jobctl and rearrange flags Tejun Heo
@ 2011-05-13 15:46 ` Tejun Heo
  2011-05-13 15:46 ` [PATCH 4/9] ptrace: relocate set_current_state(TASK_TRACED) in ptrace_stop() Tejun Heo
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Tejun Heo @ 2011-05-13 15:46 UTC (permalink / raw)
  To: oleg, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, bdonlan, Tejun Heo

PTRACE_INTERRUPT is going to be added, and it should also skip the
task_is_traced() check in ptrace_check_attach().  Rename @kill to
@ignore_state and make it bool.  Add a function comment while at it.

This patch doesn't introduce any behavior difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/ptrace.h |    2 +-
 kernel/ptrace.c        |   24 +++++++++++++++++++-----
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index a1147e5..974e187 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -105,7 +105,7 @@ extern long arch_ptrace(struct task_struct *child, long request,
 extern int ptrace_readdata(struct task_struct *tsk, unsigned long src, char __user *dst, int len);
 extern int ptrace_writedata(struct task_struct *tsk, char __user *src, unsigned long dst, int len);
 extern void ptrace_disable(struct task_struct *);
-extern int ptrace_check_attach(struct task_struct *task, int kill);
+extern int ptrace_check_attach(struct task_struct *task, bool ignore_state);
 extern int ptrace_request(struct task_struct *child, long request,
 			  unsigned long addr, unsigned long data);
 extern void ptrace_notify(int exit_code);
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 30de03b..5cd77a8 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -96,10 +96,24 @@ void __ptrace_unlink(struct task_struct *child)
 	spin_unlock(&child->sighand->siglock);
 }
 
-/*
- * Check that we have indeed attached to the thing..
+/**
+ * ptrace_check_attach - check whether ptracee is ready for ptrace operation
+ * @child: ptracee to check for
+ * @ignore_state: don't check whether @child is currently %TASK_TRACED
+ *
+ * Check whether @child is being ptraced by %current and ready for further
+ * ptrace operations.  If @ignore_state is %false, @child also should be in
+ * %TASK_TRACED state and on return the child is guaranteed to be traced
+ * and not executing.  If @ignore_state is %true, @child can be in any
+ * state.
+ *
+ * CONTEXT:
+ * Grabs and releases tasklist_lock and @child->sighand->siglock.
+ *
+ * RETURNS:
+ * 0 on success, -ESRCH if %child is not ready.
  */
-int ptrace_check_attach(struct task_struct *child, int kill)
+int ptrace_check_attach(struct task_struct *child, bool ignore_state)
 {
 	int ret = -ESRCH;
 
@@ -118,13 +132,13 @@ int ptrace_check_attach(struct task_struct *child, int kill)
 		 */
 		spin_lock_irq(&child->sighand->siglock);
 		WARN_ON_ONCE(task_is_stopped(child));
-		if (task_is_traced(child) || kill)
+		if (task_is_traced(child) || ignore_state)
 			ret = 0;
 		spin_unlock_irq(&child->sighand->siglock);
 	}
 	read_unlock(&tasklist_lock);
 
-	if (!ret && !kill)
+	if (!ret && !ignore_state)
 		ret = wait_task_inactive(child, TASK_TRACED) ? 0 : -ESRCH;
 
 	/* All systems go.. */
-- 
1.7.1



* [PATCH 4/9] ptrace: relocate set_current_state(TASK_TRACED) in ptrace_stop()
  2011-05-13 15:46 [PATCHSET ptrace] ptrace: prepare for PTRACE_SEIZE/INTERRUPT Tejun Heo
                   ` (2 preceding siblings ...)
  2011-05-13 15:46 ` [PATCH 3/9] ptrace: ptrace_check_attach(): rename @kill to @ignore_state and add comments Tejun Heo
@ 2011-05-13 15:46 ` Tejun Heo
  2011-05-16 11:57   ` Oleg Nesterov
  2011-05-13 15:46 ` [PATCH 5/9] job control: introduce JOBCTL_PENDING_MASK and task_clear_jobctl_pending() Tejun Heo
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2011-05-13 15:46 UTC (permalink / raw)
  To: oleg, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, bdonlan, Tejun Heo

In ptrace_stop(), after the arch hook is done, the task state and
jobctl bits are updated while holding siglock.  The ordering
requirement there is that TASK_TRACED is set before JOBCTL_TRAPPING is
cleared, to prevent a ptracer waiting on TRAPPING from waking up
before TRACED is actually set and seeing TASK_RUNNING in wait(2).

Move set_current_state(TASK_TRACED) to the top of the block and
reorganize comments.  This makes the ordering more obvious
(TASK_TRACED before other updates) and helps future updates to group
stop participation.

This patch doesn't cause any functional change.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/signal.c |   28 +++++++++++++---------------
 1 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 68c2361..26a30b8 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1733,6 +1733,18 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	}
 
 	/*
+	 * We're committing to trapping.  TRACED should be visible before
+	 * TRAPPING is cleared; otherwise, the tracer might fail do_wait().
+	 * Also, transition to TRACED and updates to ->jobctl should be
+	 * atomic with respect to siglock and should be done after the arch
+	 * hook as siglock is released and regrabbed across it.
+	 */
+	set_current_state(TASK_TRACED);
+
+	current->last_siginfo = info;
+	current->exit_code = exit_code;
+
+	/*
 	 * If @why is CLD_STOPPED, we're trapping to participate in a group
 	 * stop.  Do the bookkeeping.  Note that if SIGCONT was delievered
 	 * while siglock was released for the arch hook, PENDING could be
@@ -1742,21 +1754,7 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	if (why == CLD_STOPPED && (current->jobctl & JOBCTL_STOP_PENDING))
 		gstop_done = task_participate_group_stop(current);
 
-	current->last_siginfo = info;
-	current->exit_code = exit_code;
-
-	/*
-	 * TRACED should be visible before TRAPPING is cleared; otherwise,
-	 * the tracer might fail do_wait().
-	 */
-	set_current_state(TASK_TRACED);
-
-	/*
-	 * We're committing to trapping.  Clearing JOBCTL_TRAPPING and
-	 * transition to TASK_TRACED should be atomic with respect to
-	 * siglock.  This should be done after the arch hook as siglock is
-	 * released and regrabbed across it.
-	 */
+	/* entering a trap, clear TRAPPING */
 	task_clear_jobctl_trapping(current);
 
 	spin_unlock_irq(&current->sighand->siglock);
-- 
1.7.1



* [PATCH 5/9] job control: introduce JOBCTL_PENDING_MASK and task_clear_jobctl_pending()
  2011-05-13 15:46 [PATCHSET ptrace] ptrace: prepare for PTRACE_SEIZE/INTERRUPT Tejun Heo
                   ` (3 preceding siblings ...)
  2011-05-13 15:46 ` [PATCH 4/9] ptrace: relocate set_current_state(TASK_TRACED) in ptrace_stop() Tejun Heo
@ 2011-05-13 15:46 ` Tejun Heo
  2011-05-13 15:46 ` [PATCH 6/9] job control: make task_clear_jobctl_pending() clear TRAPPING automatically Tejun Heo
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Tejun Heo @ 2011-05-13 15:46 UTC (permalink / raw)
  To: oleg, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, bdonlan, Tejun Heo

This patch introduces JOBCTL_PENDING_MASK and replaces
task_clear_jobctl_stop_pending() with task_clear_jobctl_pending()
which takes an extra @mask argument.

JOBCTL_PENDING_MASK is currently equal to JOBCTL_STOP_PENDING but
future patches will add more bits.  recalc_sigpending_tsk() is updated
to use JOBCTL_PENDING_MASK instead.

task_clear_jobctl_pending() takes @mask, which is a subset of
JOBCTL_PENDING_MASK, and clears the relevant jobctl bits.  If
JOBCTL_STOP_PENDING is set, the other STOP bits are cleared together.
All task_clear_jobctl_stop_pending() users are updated to call
task_clear_jobctl_pending() with JOBCTL_STOP_PENDING, which is
functionally identical to task_clear_jobctl_stop_pending().

This patch doesn't cause any functional change.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/exec.c             |    2 +-
 include/linux/sched.h |    5 ++++-
 kernel/signal.c       |   27 +++++++++++++++++----------
 3 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index fd6a4fb..c654664 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1659,7 +1659,7 @@ static int zap_process(struct task_struct *start, int exit_code)
 
 	t = start;
 	do {
-		task_clear_jobctl_stop_pending(t);
+		task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
 		if (t != current && t->mm) {
 			sigaddset(&t->pending.signal, SIGKILL);
 			signal_wake_up(t, 1);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 408f2b9..86f21e7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1786,7 +1786,10 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
 #define JOBCTL_STOP_CONSUME	(1 << 18) /* consume group stop count */
 #define JOBCTL_TRAPPING		(1 << 21) /* switching to TRACED */
 
-extern void task_clear_jobctl_stop_pending(struct task_struct *task);
+#define JOBCTL_PENDING_MASK	JOBCTL_STOP_PENDING
+
+extern void task_clear_jobctl_pending(struct task_struct *task,
+				      unsigned int mask);
 
 #ifdef CONFIG_PREEMPT_RCU
 
diff --git a/kernel/signal.c b/kernel/signal.c
index 26a30b8..e67eaa0 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -124,7 +124,7 @@ static inline int has_pending_signals(sigset_t *signal, sigset_t *blocked)
 
 static int recalc_sigpending_tsk(struct task_struct *t)
 {
-	if ((t->jobctl & JOBCTL_STOP_PENDING) ||
+	if ((t->jobctl & JOBCTL_PENDING_MASK) ||
 	    PENDING(&t->pending, &t->blocked) ||
 	    PENDING(&t->signal->shared_pending, &t->blocked)) {
 		set_tsk_thread_flag(t, TIF_SIGPENDING);
@@ -245,18 +245,25 @@ static void task_clear_jobctl_trapping(struct task_struct *task)
 }
 
 /**
- * task_clear_jobctl_stop_pending - clear pending group stop
+ * task_clear_jobctl_pending - clear jobctl pending bits
  * @task: target task
+ * @mask: pending bits to clear
  *
- * Clear group stop states for @task.
+ * Clear @mask from @task->jobctl.  @mask must be subset of
+ * %JOBCTL_PENDING_MASK.  If %JOBCTL_STOP_PENDING is being cleared, other
+ * STOP bits are cleared together.
  *
  * CONTEXT:
  * Must be called with @task->sighand->siglock held.
  */
-void task_clear_jobctl_stop_pending(struct task_struct *task)
+void task_clear_jobctl_pending(struct task_struct *task, unsigned int mask)
 {
-	task->jobctl &= ~(JOBCTL_STOP_PENDING | JOBCTL_STOP_CONSUME |
-			  JOBCTL_STOP_DEQUEUED);
+	BUG_ON(mask & ~JOBCTL_PENDING_MASK);
+
+	if (mask & JOBCTL_STOP_PENDING)
+		mask |= JOBCTL_STOP_CONSUME | JOBCTL_STOP_DEQUEUED;
+
+	task->jobctl &= ~mask;
 }
 
 /**
@@ -282,7 +289,7 @@ static bool task_participate_group_stop(struct task_struct *task)
 
 	WARN_ON_ONCE(!(task->jobctl & JOBCTL_STOP_PENDING));
 
-	task_clear_jobctl_stop_pending(task);
+	task_clear_jobctl_pending(task, JOBCTL_STOP_PENDING);
 
 	if (!consume)
 		return false;
@@ -810,7 +817,7 @@ static int prepare_signal(int sig, struct task_struct *p, int from_ancestor_ns)
 		rm_from_queue(SIG_KERNEL_STOP_MASK, &signal->shared_pending);
 		t = p;
 		do {
-			task_clear_jobctl_stop_pending(t);
+			task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
 			rm_from_queue(SIG_KERNEL_STOP_MASK, &t->pending);
 			wake_up_state(t, __TASK_STOPPED);
 		} while_each_thread(p, t);
@@ -926,7 +933,7 @@ static void complete_signal(int sig, struct task_struct *p, int group)
 			signal->group_stop_count = 0;
 			t = p;
 			do {
-				task_clear_jobctl_stop_pending(t);
+				task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
 				sigaddset(&t->pending.signal, SIGKILL);
 				signal_wake_up(t, 1);
 			} while_each_thread(p, t);
@@ -1161,7 +1168,7 @@ int zap_other_threads(struct task_struct *p)
 	p->signal->group_stop_count = 0;
 
 	while_each_thread(p, t) {
-		task_clear_jobctl_stop_pending(t);
+		task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
 		count++;
 
 		/* Don't bother with already dead threads */
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 6/9] job control: make task_clear_jobctl_pending() clear TRAPPING automatically
  2011-05-13 15:46 [PATCHSET ptrace] ptrace: prepare for PTRACE_SEIZE/INTERRUPT Tejun Heo
                   ` (4 preceding siblings ...)
  2011-05-13 15:46 ` [PATCH 5/9] job control: introduce JOBCTL_PENDING_MASK and task_clear_jobctl_pending() Tejun Heo
@ 2011-05-13 15:46 ` Tejun Heo
  2011-05-16 12:25   ` Oleg Nesterov
  2011-05-13 15:46 ` [PATCH 7/9] ptrace: use bit_waitqueue for TRAPPING instead of wait_chldexit Tejun Heo
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2011-05-13 15:46 UTC (permalink / raw)
  To: oleg, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, bdonlan, Tejun Heo

JOBCTL_TRAPPING indicates that the ptracer is waiting for the tracee to
(re)transit into TRACED.  task_clear_jobctl_trapping() must be called
when either the tracee enters TRACED or the transition is cancelled for
some reason.  The former is achieved by explicitly calling
task_clear_jobctl_trapping() in ptrace_stop() and the latter by calling
it at the end of do_signal_stop().

Calling task_clear_jobctl_trapping() at the end of do_signal_stop()
limits the scope in which TRAPPING can be used and is fragile in that
seemingly unrelated changes to the tracee's control flow can lead to a
stuck TRAPPING.

We already have task_clear_jobctl_pending() calls on those cancelling
events to clear JOBCTL_STOP_PENDING.  Cancellations can be handled by
making those call sites use JOBCTL_PENDING_MASK instead and updating
task_clear_jobctl_pending() such that task_clear_jobctl_trapping() is
called automatically if no stop/trap is pending.

This patch makes the above changes and removes the fallback
task_clear_jobctl_trapping() call from do_signal_stop().

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/exec.c       |    2 +-
 kernel/signal.c |   13 ++++++++-----
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index c654664..6ef8958 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1659,7 +1659,7 @@ static int zap_process(struct task_struct *start, int exit_code)
 
 	t = start;
 	do {
-		task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
+		task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
 		if (t != current && t->mm) {
 			sigaddset(&t->pending.signal, SIGKILL);
 			signal_wake_up(t, 1);
diff --git a/kernel/signal.c b/kernel/signal.c
index e67eaa0..91e57d3 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -253,6 +253,9 @@ static void task_clear_jobctl_trapping(struct task_struct *task)
  * %JOBCTL_PENDING_MASK.  If %JOBCTL_STOP_PENDING is being cleared, other
  * STOP bits are cleared together.
  *
+ * If clearing of @mask leaves no stop or trap pending, this function calls
+ * task_clear_jobctl_trapping().
+ *
  * CONTEXT:
  * Must be called with @task->sighand->siglock held.
  */
@@ -264,6 +267,9 @@ void task_clear_jobctl_pending(struct task_struct *task, unsigned int mask)
 		mask |= JOBCTL_STOP_CONSUME | JOBCTL_STOP_DEQUEUED;
 
 	task->jobctl &= ~mask;
+
+	if (!(task->jobctl & JOBCTL_PENDING_MASK))
+		task_clear_jobctl_trapping(task);
 }
 
 /**
@@ -933,7 +939,7 @@ static void complete_signal(int sig, struct task_struct *p, int group)
 			signal->group_stop_count = 0;
 			t = p;
 			do {
-				task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
+				task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
 				sigaddset(&t->pending.signal, SIGKILL);
 				signal_wake_up(t, 1);
 			} while_each_thread(p, t);
@@ -1168,7 +1174,7 @@ int zap_other_threads(struct task_struct *p)
 	p->signal->group_stop_count = 0;
 
 	while_each_thread(p, t) {
-		task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
+		task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
 		count++;
 
 		/* Don't bother with already dead threads */
@@ -1964,9 +1970,6 @@ retry:
 		goto retry;
 	}
 
-	/* PTRACE_ATTACH might have raced with task killing, clear trapping */
-	task_clear_jobctl_trapping(current);
-
 	spin_unlock_irq(&current->sighand->siglock);
 
 	tracehook_finish_jctl();
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 7/9] ptrace: use bit_waitqueue for TRAPPING instead of wait_chldexit
  2011-05-13 15:46 [PATCHSET ptrace] ptrace: prepare for PTRACE_SEIZE/INTERRUPT Tejun Heo
                   ` (5 preceding siblings ...)
  2011-05-13 15:46 ` [PATCH 6/9] job control: make task_clear_jobctl_pending() clear TRAPPING automatically Tejun Heo
@ 2011-05-13 15:46 ` Tejun Heo
  2011-05-13 15:46 ` [PATCH 8/9] ptrace: move JOBCTL_TRAPPING wait to wait(2) and ptrace_check_attach() Tejun Heo
  2011-05-13 15:46 ` [PATCH 9/9] ptrace: make TRAPPING wait interruptible Tejun Heo
  8 siblings, 0 replies; 24+ messages in thread
From: Tejun Heo @ 2011-05-13 15:46 UTC (permalink / raw)
  To: oleg, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, bdonlan, Tejun Heo

ptracer->signal->wait_chldexit was used to wait for TRAPPING; however,
->wait_chldexit was already complicated enough with waker-side
filtering without the TRAPPING wait added on top of it.  Also, it
unnecessarily made TRAPPING clearing depend on the current ptrace
relationship - if the ptracee is detached, the wakeup is lost.

There is no reason to use signal->wait_chldexit here.  We're just
waiting for the JOBCTL_TRAPPING bit to clear and, given the relatively
infrequent use of ptrace, bit_waitqueue can serve it perfectly.

This patch makes JOBCTL_TRAPPING wait use bit_waitqueue instead of
signal->wait_chldexit.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/ptrace.c |   10 ++++++++--
 kernel/signal.c |    3 +--
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 5cd77a8..41fd45e 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -24,6 +24,12 @@
 #include <linux/regset.h>
 
 
+static int ptrace_trapping_sleep_fn(void *flags)
+{
+	schedule();
+	return 0;
+}
+
 /*
  * ptrace a task: make the debugger its new parent and
  * move it to the ptrace list.
@@ -271,8 +277,8 @@ unlock_creds:
 	mutex_unlock(&task->signal->cred_guard_mutex);
 out:
 	if (wait_trap)
-		wait_event(current->signal->wait_chldexit,
-			   !(task->jobctl & JOBCTL_TRAPPING));
+		wait_on_bit(&task->jobctl, ilog2(JOBCTL_TRAPPING),
+			    ptrace_trapping_sleep_fn, TASK_UNINTERRUPTIBLE);
 	return retval;
 }
 
diff --git a/kernel/signal.c b/kernel/signal.c
index 91e57d3..fff3036 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -239,8 +239,7 @@ static void task_clear_jobctl_trapping(struct task_struct *task)
 {
 	if (unlikely(task->jobctl & JOBCTL_TRAPPING)) {
 		task->jobctl &= ~JOBCTL_TRAPPING;
-		__wake_up_sync_key(&task->parent->signal->wait_chldexit,
-				   TASK_UNINTERRUPTIBLE, 1, task);
+		wake_up_bit(&task->jobctl, ilog2(JOBCTL_TRAPPING));
 	}
 }
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 8/9] ptrace: move JOBCTL_TRAPPING wait to wait(2) and ptrace_check_attach()
  2011-05-13 15:46 [PATCHSET ptrace] ptrace: prepare for PTRACE_SEIZE/INTERRUPT Tejun Heo
                   ` (6 preceding siblings ...)
  2011-05-13 15:46 ` [PATCH 7/9] ptrace: use bit_waitqueue for TRAPPING instead of wait_chldexit Tejun Heo
@ 2011-05-13 15:46 ` Tejun Heo
  2011-05-14 14:22   ` [PATCH UPDATED " Tejun Heo
  2011-05-13 15:46 ` [PATCH 9/9] ptrace: make TRAPPING wait interruptible Tejun Heo
  8 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2011-05-13 15:46 UTC (permalink / raw)
  To: oleg, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, bdonlan, Tejun Heo

Currently, JOBCTL_TRAPPING is used by PTRACE_ATTACH and SEIZE to hide
TASK_STOPPED -> TRACED transition from the ptracer.  If the tracee is
in group stop, TRAPPING is set, the tracee is kicked and the tracer
waits for the transition to complete before completing the attach.
This prevents the tracer from seeing the tracee mid-transition.

The transition is visible only through wait(2) and following ptrace(2)
requests.  Without TRAPPING, WNOHANG which should succeed right after
attach (when tracer knows tracee was stopped) might fail and likewise
for the following ptrace requests.

TRAPPING will also be used to implement ptrace notification re-traps,
which can be initiated by tasks other than tracer.  To allow this,
this patch moves TRAPPING wait from attach completion path to
operations which are actually affected by the transition - wait(2) and
following ptrace(2) requests.

As reliably checking and modifying the TASK_STOPPED/TRACED transition
together with JOBCTL_TRAPPING requires siglock, and both the ptrace and
wait paths are already holding tasklist_lock and siglock where the
TRAPPING check is needed, ptrace_wait_trapping() assumes both locks are
held on entry and releases them if it actually had to wait for TRAPPING.

Both the wait and ptrace paths are updated to retry the operation after
the TRAPPING wait.  Note that wait_task_stopped() now always grabs
siglock for ptrace waits.  This could be avoided with a
"task_stopped_code() -> rmb() -> TRAPPING -> rmb() ->
task_stopped_code()" sequence, but given that ptrace isn't particularly
sensitive to performance or scalability, the simpler implementation
seems better.

Both ptrace(2) and wait(2) use restart_syscall() to retry after
waiting for TRAPPING.  This simplifies the implementation and will be
useful when TRAPPING sleep is converted to be interruptible.

Note that, after this change, PTRACE_ATTACH may return before the
transition completes and the ptracer might see the tracee in the
transient TASK_RUNNING state via /proc/PID/stat; however, wait(2) and
the following ptrace requests behave correctly regardless.  This is a
userland-visible behavior change.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/ptrace.h |    1 +
 kernel/exit.c          |   18 ++++++++++++++++--
 kernel/ptrace.c        |   48 +++++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 60 insertions(+), 7 deletions(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 974e187..6b359cd 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -105,6 +105,7 @@ extern long arch_ptrace(struct task_struct *child, long request,
 extern int ptrace_readdata(struct task_struct *tsk, unsigned long src, char __user *dst, int len);
 extern int ptrace_writedata(struct task_struct *tsk, char __user *src, unsigned long dst, int len);
 extern void ptrace_disable(struct task_struct *);
+extern bool ptrace_wait_trapping(struct task_struct *child);
 extern int ptrace_check_attach(struct task_struct *task, bool ignore_state);
 extern int ptrace_request(struct task_struct *child, long request,
 			  unsigned long addr, unsigned long data);
diff --git a/kernel/exit.c b/kernel/exit.c
index 3383793..b57cb14 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1409,15 +1409,29 @@ static int wait_task_stopped(struct wait_opts *wo,
 	if (!ptrace && !(wo->wo_flags & WUNTRACED))
 		return 0;
 
-	if (!task_stopped_code(p, ptrace))
+	/*
+	 * For ptrace waits, we can't reliably check whether wait condition
+	 * exists without grabbing siglock due to JOBCTL_TRAPPING
+	 * transitions.  A task might be temporarily in TASK_RUNNING while
+	 * trapping which should be transparent to the ptracer.
+	 *
+	 * Note that we can avoid unconditionally grabbing siglock by
+	 * wrapping TRAPPING test with two rmb's; however, let's stick with
+	 * simpler implementation for now.
+	 */
+	if (!ptrace && !(p->signal->flags & SIGNAL_STOP_STOPPED))
 		return 0;
 
 	exit_code = 0;
 	spin_lock_irq(&p->sighand->siglock);
 
 	p_code = task_stopped_code(p, ptrace);
-	if (unlikely(!p_code))
+	if (unlikely(!p_code)) {
+		/* if trapping, wait for it and restart the whole process */
+		if (ptrace && ptrace_wait_trapping(p))
+			return restart_syscall();
 		goto unlock_sig;
+	}
 
 	exit_code = *p_code;
 	if (!exit_code)
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 41fd45e..03349b5 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -30,6 +30,47 @@ static int ptrace_trapping_sleep_fn(void *flags)
 	return 0;
 }
 
+/**
+ * ptrace_wait_trapping - wait for ptracee to finish %TASK_TRACED/STOPPED transition
+ * @child: child to wait for
+ *
+ * There are cases where ptracer needs to ask the ptracee to [re]enter
+ * %TASK_TRACED which involves the tracee going through %TASK_RUNNING
+ * briefly, which could affect operation of ptrace(2) and wait(2).
+ *
+ * %JOBCTL_TRAPPING is used to hide such transitions from the ptracer.
+ * It's set when such transition is initiated by the ptracer and cleared on
+ * completion.  Operations which may be affected should call this function
+ * to make sure no transition is in progress before proceeding.
+ *
+ * This function checks whether @child is TRAPPING and, if so, waits for
+ * the transition to complete.
+ *
+ * CONTEXT:
+ * read_lock(&tasklist_lock) and spin_lock_irq(&child->sighand->siglock).
+ * On %true return, both locks are released and the function might have
+ * slept.
+ *
+ * RETURNS:
+ * %false if @child wasn't trapping and nothing happened.  %true if waited
+ * for trapping transition and released both locks.
+ */
+bool ptrace_wait_trapping(struct task_struct *child)
+	__releases(&child->sighand->siglock)
+	__releases(&tasklist_lock)
+{
+	if (likely(!(child->jobctl & JOBCTL_TRAPPING)))
+		return false;
+
+	spin_unlock_irq(&child->sighand->siglock);
+	get_task_struct(child);
+	read_unlock(&tasklist_lock);
+	wait_on_bit(&child->jobctl, ilog2(JOBCTL_TRAPPING),
+		    ptrace_trapping_sleep_fn, TASK_UNINTERRUPTIBLE);
+	put_task_struct(child);
+	return true;
+}
+
 /*
  * ptrace a task: make the debugger its new parent and
  * move it to the ptrace list.
@@ -140,6 +181,8 @@ int ptrace_check_attach(struct task_struct *child, bool ignore_state)
 		WARN_ON_ONCE(task_is_stopped(child));
 		if (task_is_traced(child) || ignore_state)
 			ret = 0;
+		else if (ptrace_wait_trapping(child))
+			return restart_syscall();
 		spin_unlock_irq(&child->sighand->siglock);
 	}
 	read_unlock(&tasklist_lock);
@@ -203,7 +246,6 @@ bool ptrace_may_access(struct task_struct *task, unsigned int mode)
 
 static int ptrace_attach(struct task_struct *task)
 {
-	bool wait_trap = false;
 	int retval;
 
 	audit_ptrace(task);
@@ -265,7 +307,6 @@ static int ptrace_attach(struct task_struct *task)
 	if (task_is_stopped(task)) {
 		task->jobctl |= JOBCTL_STOP_PENDING | JOBCTL_TRAPPING;
 		signal_wake_up(task, 1);
-		wait_trap = true;
 	}
 
 	spin_unlock(&task->sighand->siglock);
@@ -276,9 +317,6 @@ unlock_tasklist:
 unlock_creds:
 	mutex_unlock(&task->signal->cred_guard_mutex);
 out:
-	if (wait_trap)
-		wait_on_bit(&task->jobctl, ilog2(JOBCTL_TRAPPING),
-			    ptrace_trapping_sleep_fn, TASK_UNINTERRUPTIBLE);
 	return retval;
 }
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 9/9] ptrace: make TRAPPING wait interruptible
  2011-05-13 15:46 [PATCHSET ptrace] ptrace: prepare for PTRACE_SEIZE/INTERRUPT Tejun Heo
                   ` (7 preceding siblings ...)
  2011-05-13 15:46 ` [PATCH 8/9] ptrace: move JOBCTL_TRAPPING wait to wait(2) and ptrace_check_attach() Tejun Heo
@ 2011-05-13 15:46 ` Tejun Heo
  8 siblings, 0 replies; 24+ messages in thread
From: Tejun Heo @ 2011-05-13 15:46 UTC (permalink / raw)
  To: oleg, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, bdonlan, Tejun Heo

With the JOBCTL_TRAPPING waits moved to wait_task_stopped() and
ptrace_check_attach(), and both using full syscall retries after
waiting, the TRAPPING wait can switch to interruptible sleep.

As all transitions are interlocked and all cancellation events
(supposedly) clear TRAPPING, this doesn't change the actual behavior,
but it makes the TRAPPING wait mechanism much more forgiving when
something goes wrong and allows using TRAPPING waits across freezing
points.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/ptrace.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 03349b5..6e79eab 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -27,7 +27,7 @@
 static int ptrace_trapping_sleep_fn(void *flags)
 {
 	schedule();
-	return 0;
+	return signal_pending(current);
 }
 
 /**
@@ -44,7 +44,11 @@ static int ptrace_trapping_sleep_fn(void *flags)
  * to make sure no transition is in progress before proceeding.
  *
  * This function checks whether @child is TRAPPING and, if so, waits for
- * the transition to complete.
+ * the transition to complete.  Interruptible sleep is used for waiting and
+ * %true will be returned regardless of why it is woken up.  On %true
+ * return, callers should ensure that the whole operation is restarted
+ * using the syscall restart mechanism so that operations like freezing or
+ * killing don't get blocked by TRAPPING waits.
  *
  * CONTEXT:
  * read_lock(&tasklist_lock) and spin_lock_irq(&child->sighand->siglock).
@@ -66,7 +70,7 @@ bool ptrace_wait_trapping(struct task_struct *child)
 	get_task_struct(child);
 	read_unlock(&tasklist_lock);
 	wait_on_bit(&child->jobctl, ilog2(JOBCTL_TRAPPING),
-		    ptrace_trapping_sleep_fn, TASK_UNINTERRUPTIBLE);
+		    ptrace_trapping_sleep_fn, TASK_INTERRUPTIBLE);
 	put_task_struct(child);
 	return true;
 }
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH UPDATED 8/9] ptrace: move JOBCTL_TRAPPING wait to wait(2) and ptrace_check_attach()
  2011-05-13 15:46 ` [PATCH 8/9] ptrace: move JOBCTL_TRAPPING wait to wait(2) and ptrace_check_attach() Tejun Heo
@ 2011-05-14 14:22   ` Tejun Heo
  2011-05-16 12:11     ` Oleg Nesterov
  0 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2011-05-14 14:22 UTC (permalink / raw)
  To: oleg, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, bdonlan

Currently, JOBCTL_TRAPPING is used by PTRACE_ATTACH and SEIZE to hide
TASK_STOPPED -> TRACED transition from the ptracer.  If the tracee is
in group stop, TRAPPING is set, the tracee is kicked and the tracer
waits for the transition to complete before completing the attach.
This prevents the tracer from seeing the tracee mid-transition.

The transition is visible only through wait(2) and following ptrace(2)
requests.  Without TRAPPING, WNOHANG which should succeed right after
attach (when tracer knows tracee was stopped) might fail and likewise
for the following ptrace requests.

TRAPPING will also be used to implement ptrace notification re-traps,
which can be initiated by tasks other than tracer.  To allow this,
this patch moves TRAPPING wait from attach completion path to
operations which are actually affected by the transition - wait(2) and
following ptrace(2) requests.

As reliably checking and modifying the TASK_STOPPED/TRACED transition
together with JOBCTL_TRAPPING requires siglock, and both the ptrace and
wait paths are already holding tasklist_lock and siglock where the
TRAPPING check is needed, ptrace_wait_trapping() assumes both locks are
held on entry and releases them if it actually had to wait for TRAPPING.

Both the wait and ptrace paths are updated to retry the operation after
the TRAPPING wait.  Note that wait_task_stopped() now always grabs
siglock for ptrace waits.  This could be avoided with a
"task_stopped_code() -> rmb() -> TRAPPING -> rmb() ->
task_stopped_code()" sequence, but given that ptrace isn't particularly
sensitive to performance or scalability, the simpler implementation
seems better.

Both ptrace(2) and wait(2) use restart_syscall() to retry after
waiting for TRAPPING.  This simplifies the implementation and will be
useful when TRAPPING sleep is converted to be interruptible.

Note that, after this change, PTRACE_ATTACH may return before the
transition completes and the ptracer might see the tracee in the
transient TASK_RUNNING state via /proc/PID/stat; however, wait(2) and
the following ptrace requests behave correctly regardless.  This is a
userland-visible behavior change.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
Forgot to update comment in ptrace_attach().  Nothing changes other
than comment update.  Will update git tree accordingly.

Thanks.

 include/linux/ptrace.h |    1 
 kernel/exit.c          |   18 ++++++++++++--
 kernel/ptrace.c        |   62 ++++++++++++++++++++++++++++++++++++++-----------
 3 files changed, 66 insertions(+), 15 deletions(-)

Index: work/kernel/ptrace.c
===================================================================
--- work.orig/kernel/ptrace.c
+++ work/kernel/ptrace.c
@@ -30,6 +30,47 @@ static int ptrace_trapping_sleep_fn(void
 	return 0;
 }
 
+/**
+ * ptrace_wait_trapping - wait for ptracee to finish %TASK_TRACED/STOPPED transition
+ * @child: child to wait for
+ *
+ * There are cases where ptracer needs to ask the ptracee to [re]enter
+ * %TASK_TRACED which involves the tracee going through %TASK_RUNNING
+ * briefly, which could affect operation of ptrace(2) and wait(2).
+ *
+ * %JOBCTL_TRAPPING is used to hide such transitions from the ptracer.
+ * It's set when such transition is initiated by the ptracer and cleared on
+ * completion.  Operations which may be affected should call this function
+ * to make sure no transition is in progress before proceeding.
+ *
+ * This function checks whether @child is TRAPPING and, if so, waits for
+ * the transition to complete.
+ *
+ * CONTEXT:
+ * read_lock(&tasklist_lock) and spin_lock_irq(&child->sighand->siglock).
+ * On %true return, both locks are released and the function might have
+ * slept.
+ *
+ * RETURNS:
+ * %false if @child wasn't trapping and nothing happened.  %true if waited
+ * for trapping transition and released both locks.
+ */
+bool ptrace_wait_trapping(struct task_struct *child)
+	__releases(&child->sighand->siglock)
+	__releases(&tasklist_lock)
+{
+	if (likely(!(child->jobctl & JOBCTL_TRAPPING)))
+		return false;
+
+	spin_unlock_irq(&child->sighand->siglock);
+	get_task_struct(child);
+	read_unlock(&tasklist_lock);
+	wait_on_bit(&child->jobctl, ilog2(JOBCTL_TRAPPING),
+		    ptrace_trapping_sleep_fn, TASK_UNINTERRUPTIBLE);
+	put_task_struct(child);
+	return true;
+}
+
 /*
  * ptrace a task: make the debugger its new parent and
  * move it to the ptrace list.
@@ -140,6 +181,8 @@ int ptrace_check_attach(struct task_stru
 		WARN_ON_ONCE(task_is_stopped(child));
 		if (task_is_traced(child) || ignore_state)
 			ret = 0;
+		else if (ptrace_wait_trapping(child))
+			return restart_syscall();
 		spin_unlock_irq(&child->sighand->siglock);
 	}
 	read_unlock(&tasklist_lock);
@@ -203,7 +246,6 @@ bool ptrace_may_access(struct task_struc
 
 static int ptrace_attach(struct task_struct *task)
 {
-	bool wait_trap = false;
 	int retval;
 
 	audit_ptrace(task);
@@ -249,15 +291,13 @@ static int ptrace_attach(struct task_str
 	 * If the task is already STOPPED, set JOBCTL_STOP_PENDING and
 	 * TRAPPING, and kick it so that it transits to TRACED.  TRAPPING
 	 * will be cleared if the child completes the transition or any
-	 * event which clears the group stop states happens.  We'll wait
-	 * for the transition to complete before returning from this
-	 * function.
+	 * event which clears the group stop states happens.
 	 *
-	 * This hides STOPPED -> RUNNING -> TRACED transition from the
-	 * attaching thread but a different thread in the same group can
-	 * still observe the transient RUNNING state.  IOW, if another
-	 * thread's WNOHANG wait(2) on the stopped tracee races against
-	 * ATTACH, the wait(2) may fail due to the transient RUNNING.
+	 * This is to hide STOPPED -> RUNNING -> TRACED transition from
+	 * wait(2) and ptrace(2).  If called before the transition is
+	 * complete, both will wait for TRAPPING to be cleared and retry,
+	 * thus hiding the transition from userland; however, the transient
+	 * RUNNING state is still visible through /proc.
 	 *
 	 * The following task_is_stopped() test is safe as both transitions
 	 * in and out of STOPPED are protected by siglock.
@@ -265,7 +305,6 @@ static int ptrace_attach(struct task_str
 	if (task_is_stopped(task)) {
 		task->jobctl |= JOBCTL_STOP_PENDING | JOBCTL_TRAPPING;
 		signal_wake_up(task, 1);
-		wait_trap = true;
 	}
 
 	spin_unlock(&task->sighand->siglock);
@@ -276,9 +315,6 @@ unlock_tasklist:
 unlock_creds:
 	mutex_unlock(&task->signal->cred_guard_mutex);
 out:
-	if (wait_trap)
-		wait_on_bit(&task->jobctl, ilog2(JOBCTL_TRAPPING),
-			    ptrace_trapping_sleep_fn, TASK_UNINTERRUPTIBLE);
 	return retval;
 }
 
Index: work/kernel/exit.c
===================================================================
--- work.orig/kernel/exit.c
+++ work/kernel/exit.c
@@ -1409,15 +1409,29 @@ static int wait_task_stopped(struct wait
 	if (!ptrace && !(wo->wo_flags & WUNTRACED))
 		return 0;
 
-	if (!task_stopped_code(p, ptrace))
+	/*
+	 * For ptrace waits, we can't reliably check whether wait condition
+	 * exists without grabbing siglock due to JOBCTL_TRAPPING
+	 * transitions.  A task might be temporarily in TASK_RUNNING while
+	 * trapping which should be transparent to the ptracer.
+	 *
+	 * Note that we can avoid unconditionally grabbing siglock by
+	 * wrapping TRAPPING test with two rmb's; however, let's stick with
+	 * simpler implementation for now.
+	 */
+	if (!ptrace && !(p->signal->flags & SIGNAL_STOP_STOPPED))
 		return 0;
 
 	exit_code = 0;
 	spin_lock_irq(&p->sighand->siglock);
 
 	p_code = task_stopped_code(p, ptrace);
-	if (unlikely(!p_code))
+	if (unlikely(!p_code)) {
+		/* if trapping, wait for it and restart the whole process */
+		if (ptrace && ptrace_wait_trapping(p))
+			return restart_syscall();
 		goto unlock_sig;
+	}
 
 	exit_code = *p_code;
 	if (!exit_code)
Index: work/include/linux/ptrace.h
===================================================================
--- work.orig/include/linux/ptrace.h
+++ work/include/linux/ptrace.h
@@ -105,6 +105,7 @@ extern long arch_ptrace(struct task_stru
 extern int ptrace_readdata(struct task_struct *tsk, unsigned long src, char __user *dst, int len);
 extern int ptrace_writedata(struct task_struct *tsk, char __user *src, unsigned long dst, int len);
 extern void ptrace_disable(struct task_struct *);
+extern bool ptrace_wait_trapping(struct task_struct *child);
 extern int ptrace_check_attach(struct task_struct *task, bool ignore_state);
 extern int ptrace_request(struct task_struct *child, long request,
 			  unsigned long addr, unsigned long data);

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/9] job control: reorganize wait_task_stopped()
  2011-05-13 15:46 ` [PATCH 1/9] job control: reorganize wait_task_stopped() Tejun Heo
@ 2011-05-16 11:56   ` Oleg Nesterov
  0 siblings, 0 replies; 24+ messages in thread
From: Oleg Nesterov @ 2011-05-16 11:56 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan,
	bdonlan

On 05/13, Tejun Heo wrote:
>
> wait_task_stopped() tested task_stopped_code() without acquiring
> siglock and, if stop condition existed, called wait_task_stopped() and
> directly returned the result.  This patch moves the initial
> task_stopped_code() testing into wait_task_stopped() and make
> wait_consider_task() fall through to wait_task_continue() on 0 return.

FYI, I already applied this one.

Oleg.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/9] ptrace: relocate set_current_state(TASK_TRACED) in ptrace_stop()
  2011-05-13 15:46 ` [PATCH 4/9] ptrace: relocate set_current_state(TASK_TRACED) in ptrace_stop() Tejun Heo
@ 2011-05-16 11:57   ` Oleg Nesterov
  2011-05-16 13:16     ` Tejun Heo
  0 siblings, 1 reply; 24+ messages in thread
From: Oleg Nesterov @ 2011-05-16 11:57 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan,
	bdonlan

On 05/13, Tejun Heo wrote:
>
> In ptrace_stop(), after arch hook is done, the task state and jobctl
> bits are updated while holding siglock.  The ordering requirement
> there is that TASK_TRACED is set before JOBCTL_TRAPPING is cleared to
> prevent the ptracer waiting on TRAPPING from waking up before TRACED is
> actually set and seeing TASK_RUNNING in wait(2).
>
> Move set_current_state(TASK_TRACED) to the top of the block and
> reorganize comments.  This makes the ordering more obvious
> (TASK_TRACED before other updates)

I am a bit confused... please see below.

> and helps future updates to group
> stop participation.

OK, so I assume we need this change.

> This patch doesn't cause any functional change.

Agreed, so the patch looks fine.


But the comment looks a bit confusing to me. This is fine, I almost never
read them ;) I'd just like to ensure I didn't miss something.

> @@ -1733,6 +1733,18 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
>  	}
>
>  	/*
> +	 * We're committing to trapping.  TRACED should be visible before
> +	 * TRAPPING is cleared

This looks as if you explain the barrier in set_current_state(). And,
btw, why can't we use __set_current_state() here ?

And. not only TRACED, at least ->exit_code should be visible as well.

IOW. It is not that TRACED should be visible before jobctl &= ~JOBCTL_TRAPPING,
we should correctly update the tracee before __wake_up_sync_key(), and I assume
this is what the comment says.

Correct?

Oleg.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH UPDATED 8/9] ptrace: move JOBCTL_TRAPPING wait to wait(2) and ptrace_check_attach()
  2011-05-14 14:22   ` [PATCH UPDATED " Tejun Heo
@ 2011-05-16 12:11     ` Oleg Nesterov
  2011-05-16 13:36       ` Tejun Heo
  0 siblings, 1 reply; 24+ messages in thread
From: Oleg Nesterov @ 2011-05-16 12:11 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan,
	bdonlan

On 05/14, Tejun Heo wrote:
>
> @@ -1409,15 +1409,29 @@ static int wait_task_stopped(struct wait
>  	if (!ptrace && !(wo->wo_flags & WUNTRACED))
>  		return 0;
>
> -	if (!task_stopped_code(p, ptrace))
> +	/*
> +	 * For ptrace waits, we can't reliably check whether wait condition
> +	 * exists without grabbing siglock due to JOBCTL_TRAPPING
> +	 * transitions.  A task might be temporarily in TASK_RUNNING while
> +	 * trapping which should be transparent to the ptracer.
> +	 *
> +	 * Note that we can avoid unconditionally grabbing siglock by
> +	 * wrapping TRAPPING test with two rmb's; however, let's stick with
> +	 * simpler implementation for now.
> +	 */
> +	if (!ptrace && !(p->signal->flags & SIGNAL_STOP_STOPPED))
>  		return 0;
>
>  	exit_code = 0;
>  	spin_lock_irq(&p->sighand->siglock);
>
>  	p_code = task_stopped_code(p, ptrace);
> -	if (unlikely(!p_code))
> +	if (unlikely(!p_code)) {
> +		/* if trapping, wait for it and restart the whole process */
> +		if (ptrace && ptrace_wait_trapping(p))
> +			return restart_syscall();

Hmm. I didn't even know we have restart_syscall()... It is a bit fragile,
it assumes recalc_sigpending() is not possible during return from syscall.
In particular this means recalc_sigpending() must not be called in irq.
OK, this seems to be true.

Anyway, restart_syscall() is not right for do_wait(), especially with the
next patch. If the caller was woken by the real signal which has a handler,
we should not restart without SA_RESTART.


It is very hard to review this series. Without the further changes, it is
not clear why we need these preparations. IIUC, ptrace_wait_trapping()
is only needed because we are going to re-trap. Otherwise we could always
wait in ptrace_attach() afaics.

I am still worried we are losing the tight control over JOBCTL_TRAPPING.
6/9 contributes to this too.

Oleg.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 6/9] job control: make task_clear_jobctl_pending() clear TRAPPING automatically
  2011-05-13 15:46 ` [PATCH 6/9] job control: make task_clear_jobctl_pending() clear TRAPPING automatically Tejun Heo
@ 2011-05-16 12:25   ` Oleg Nesterov
  2011-05-16 13:24     ` Tejun Heo
  0 siblings, 1 reply; 24+ messages in thread
From: Oleg Nesterov @ 2011-05-16 12:25 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan,
	bdonlan

On 05/13, Tejun Heo wrote:
>
> @@ -264,6 +267,9 @@ void task_clear_jobctl_pending(struct task_struct *task, unsigned int mask)
>  		mask |= JOBCTL_STOP_CONSUME | JOBCTL_STOP_DEQUEUED;
>
>  	task->jobctl &= ~mask;
> +
> +	if (!(task->jobctl & JOBCTL_PENDING_MASK))
> +		task_clear_jobctl_trapping(task);
>  }

So, SIGCONT clears JOBCTL_TRAPPING and wakes up the tracer.

I can't really understand this without seeing the next changes, but
it seems this makes some things worse, although I am not sure.

For example. PTRACE_SEIZE should guarantee the tracee will trap and
report. However, if the tracee is stopped during attach, we can race
with SIGCONT. The previous version had the similar problem afaics, but
it was easy (I think) to fix. Now that SIGCONT clears JOBCTL_TRAPPING
we need more complications.

Oleg.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/9] ptrace: relocate set_current_state(TASK_TRACED) in ptrace_stop()
  2011-05-16 11:57   ` Oleg Nesterov
@ 2011-05-16 13:16     ` Tejun Heo
  2011-05-16 15:51       ` Oleg Nesterov
  0 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2011-05-16 13:16 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan,
	bdonlan

Hey, Oleg.

On Mon, May 16, 2011 at 01:57:11PM +0200, Oleg Nesterov wrote:
> > and helps future updates to group stop participation.
> 
> OK, so I assume we need this change.

We don't necessarily need it but it makes things prettier later.

> But the comment looks a bit confusing to me. This is fine, I almost never
> read them ;) I'd just like to ensure I didn't miss something.

Oleg, IIRC, those comments were taken from your email pointing out
that set_current_state() needs to happen before clearing of TRAPPING,
so, if you're confused, I'm confused too. :-)

> > +	 * We're committing to trapping.  TRACED should be visible before
> > +	 * TRAPPING is cleared
> 
> This looks as if you explain the barrier in set_current_state(). And,
> btw, why can't we use __set_current_state() here ?
> 
> And. not only TRACED, at least ->exit_code should be visible as well.

The racy part was task_is_stopped_or_traced() in task_stopped_code()
and the value of exit_code doesn't matter at that point.  So, we need
at least smp_wmb() between __set_current_state() and clearing
TRAPPING.

> IOW. It is not that TRACED should be visible before jobctl &= ~JOBCTL_TRAPPING,
> we should correctly update the tracee before __wake_up_sync_key(), and I assume
> this is what the comment says.
> 
> Correct?

All we need to update on the tracee is tracee->state and
~JOBCTL_TRAPPING and __wake_up_sync_key() can be considered single
operation.  One doesn't make sense without the other.  Anyways, if you
wanna update the comment, please go ahead.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 6/9] job control: make task_clear_jobctl_pending() clear TRAPPING automatically
  2011-05-16 12:25   ` Oleg Nesterov
@ 2011-05-16 13:24     ` Tejun Heo
  2011-05-16 16:00       ` Oleg Nesterov
  0 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2011-05-16 13:24 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan,
	bdonlan

Hello,

On Mon, May 16, 2011 at 02:25:35PM +0200, Oleg Nesterov wrote:
> On 05/13, Tejun Heo wrote:
> >
> > @@ -264,6 +267,9 @@ void task_clear_jobctl_pending(struct task_struct *task, unsigned int mask)
> >  		mask |= JOBCTL_STOP_CONSUME | JOBCTL_STOP_DEQUEUED;
> >
> >  	task->jobctl &= ~mask;
> > +
> > +	if (!(task->jobctl & JOBCTL_PENDING_MASK))
> > +		task_clear_jobctl_trapping(task);
> >  }
> 
> So, SIGCONT clears JOBCTL_TRAPPING and wakes up the tracer.

If JOBCTL_TRAPPING is set && JOBCTL_STOP_PENDING was the only pending
condition.

> I can't really understand this without seeing the next changes, but
> it seems this makes some things worse, although I am not sure.

It's a safety mechanism.  We shouldn't have TRAPPING set when no
stop/trap is pending and the above establishes that invariant and will
also be used to simplify exception cases where all pending conditions
are cleared (tracee being killed).  It's conceptually much easier to
grasp that way - you clear all pending conditions when the task dies
and the mechanism to wait for trap transition is cleared together.

> For example. PTRACE_SEIZE should guarantee the tracee will trap and
> report. However, if the tracee is stopped during attach, we can race
> with SIGCONT. The previous version had the similar problem afaics, but
> it was easy (I think) to fix. Now that SIGCONT clears JOBCTL_TRAPPING
> we need more complications.

This problem doesn't exist anymore.  SEIZE directly schedules
JOBCTL_TRAP_STOP instead of using JOBCTL_STOP_PENDING, so it actually
works exactly as it should.

1. If tracee was stopped and tracer attaches, tracee should re-trap
   and tracer should wait for TRAPPING.

2. If tracee was not stopped and tracer attaches, tracee traps to
   STOP.

3. If tracee was stopped and tracer attaches but SIGCONT is received
   before TRAPPING is done, TRAPPING is cleared and tracer and tracee
   are in the same conditions as #2, which is exactly how they should
   be.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH UPDATED 8/9] ptrace: move JOBCTL_TRAPPING wait to wait(2) and ptrace_check_attach()
  2011-05-16 12:11     ` Oleg Nesterov
@ 2011-05-16 13:36       ` Tejun Heo
  2011-05-16 16:04         ` Oleg Nesterov
  0 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2011-05-16 13:36 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan,
	bdonlan

Hello,

On Mon, May 16, 2011 at 02:11:42PM +0200, Oleg Nesterov wrote:
> Hmm. I didn't even know we have restart_syscall()... It is a bit fragile,
> it assumes recalc_sigpending() is not possible during return from syscall.
> In particular this means recalc_sigpending() must not be called in irq.
> OK, this seems to be true.

Yes, it is quite fragile.  I think we can make it more reliable instead
of requiring each path which might clear TIF_SIGPENDING to be careful
about not hitting it, but that's a separate issue and we need the
syscall restart mechanism anyway for other purposes.

> Anyway, restart_syscall() is not right for do_wait(), especially with the
> next patch. If the caller was woken by the real signal which has a handler,
> we should not restart without SA_RESTART.

I don't think it really matters and might even be incorrect if we do
that.  e.g. we would introduce -EINTR failure to WNOHANG waits.  These
waits and restarts are transient and should be transparent.  Maybe if
the ptracer is ptraced, syscall entry trap can give away that the
syscall was retried when SA_RESTART wasn't set but I'd be happy to
ignore that.

The only case I can think of where this could be visible is a sleeping
wait(2) which, after being woken up, checks the state again and gets a
signal while waiting for TRAPPING.  In this case, yeap, we should fail
with -EINTR.

Hmmm... so, I suppose, we should either return -ERESTARTNOINTR or
-ERESTARTSYS depending on WNOHANG.  How does that sound?

> It is very hard to review this series. Without the further changes, it is
> not clear why we need these preparations. IIUC, ptrace_wait_trapping()
> is only needed because we are going to re-trap. Otherwise we could always
> wait in ptrace_attach() afaics.

I think the patches in this series should be able to stand on their
own.  It's relaxing TRAPPING constraints and making it
generally safer.  With these changes, SEIZE/INTERRUPT/notification
implementation becomes quite straight-forward.

> I am still worried we are losing the tight control over JOBCTL_TRAPPING.
> 6/9 contributes to this too.

Setting of TRAPPING is scary.  Clearing isn't.  The worst that can
happen is unexpected failure of WNOHANG wait or ptrace request - even
that isn't gonna happen if the application is sensible enough to use
sleeping wait(2) after attaching.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/9] ptrace: relocate set_current_state(TASK_TRACED) in ptrace_stop()
  2011-05-16 13:16     ` Tejun Heo
@ 2011-05-16 15:51       ` Oleg Nesterov
  2011-05-16 15:59         ` Tejun Heo
  0 siblings, 1 reply; 24+ messages in thread
From: Oleg Nesterov @ 2011-05-16 15:51 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan,
	bdonlan

On 05/16, Tejun Heo wrote:
>
> Hey, Oleg.
>
> On Mon, May 16, 2011 at 01:57:11PM +0200, Oleg Nesterov wrote:
> > > and helps future updates to group stop participation.
> >
> > OK, so I assume we need this change.
>
> We don't necessarily need it but it makes things prettier later.
>
> > But the comment looks a bit confusing to me. This is fine, I almost never
> > read them ;) I'd just like to ensure I didn't miss something.
>
> Oleg, IIRC, those comments were taken from your email pointing out
> that set_current_state() needs to happen before clearing of TRAPPING,
> so, if you're confused, I'm confused too. :-)

So, we are both confused. Great!

> > > +	 * We're committing to trapping.  TRACED should be visible before
> > > +	 * TRAPPING is cleared
> >
> > This looks as if you explain the barrier in set_current_state(). And,
> > btw, why can't we use __set_current_state() here ?
> >
> > And. not only TRACED, at least ->exit_code should be visible as well.
>
> The racy part was task_is_stopped_or_traced() in task_stopped_code()
> and the value of exit_code doesn't matter at that point.

Why exit_code doesn't matter? task_stopped_code() needs
task_is_stopped_or_traced() && exit_code != 0. Both changes should be
visible.

> So, we need
> at least smp_wmb() between __set_current_state() and clearing
> TRAPPING.

I don't think so. Please see below,

> > IOW. It is not that TRACED should be visible before jobctl &= ~JOBCTL_TRAPPING,
> > we should correctly update the tracee before __wake_up_sync_key(), and I assume
> > this is what the comment says.
> >
> > Correct?
>
> All we need to update on the tracee is tracee->state and
> ~JOBCTL_TRAPPING and __wake_up_sync_key() can be considered single
> operation.

Yes! IOW, it is safe to reorder the memory operations which change ->state,
->exit_code, and ->jobctl. The only important thing is that we should not
wake up the tracer before we change them.


And if I remember correctly this was the problem, the early patches did
something like

	task_clear_jobctl_trapping();
	set_current_state(TASK_TRACED);

Oleg.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/9] ptrace: relocate set_current_state(TASK_TRACED) in ptrace_stop()
  2011-05-16 15:51       ` Oleg Nesterov
@ 2011-05-16 15:59         ` Tejun Heo
  2011-05-16 16:34           ` Oleg Nesterov
  0 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2011-05-16 15:59 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan,
	bdonlan

Hello,

On Mon, May 16, 2011 at 05:51:58PM +0200, Oleg Nesterov wrote:
> > The racy part was task_is_stopped_or_traced() in task_stopped_code()
> > and the value of exit_code doesn't matter at that point.
> 
> Why exit_code doesn't matter? task_stopped_code() needs
> task_is_stopped_or_traced() && exit_code != 0. Both changes should be
> visible.

Because the actual exit_code is checked only after grabbing siglock.
As long as task_is_stopped_or_traced() is true, ptracer will grab
siglock and then check exit_code, so TASK_* is the only thing which
matters.  Fragile, if you ask me - we don't have a proper mechanism
around lockless do_wait() checks and synchronizations are just
scattered around the code.

> > All we need to update on the tracee is tracee->state and
> > ~JOBCTL_TRAPPING and __wake_up_sync_key() can be considered single
> > operation.
> 
> Yes! IOW, it is safe to reorder the memory operations which change ->state,
> ->exit_code, and ->jobctl. The only important thing is that we should not
> wake up the tracer before we change them.
> 
> And if I remember correctly this was the problem, the early patches did
> something like
> 
> 	task_clear_jobctl_trapping();
> 	set_current_state(TASK_TRACED);

Right, try_to_wake_up() already contains smp_wmb().  We'll be fine
with __set_current_state().  Can we do it in a later patch?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 6/9] job control: make task_clear_jobctl_pending() clear TRAPPING automatically
  2011-05-16 13:24     ` Tejun Heo
@ 2011-05-16 16:00       ` Oleg Nesterov
  2011-05-16 16:09         ` Tejun Heo
  0 siblings, 1 reply; 24+ messages in thread
From: Oleg Nesterov @ 2011-05-16 16:00 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan,
	bdonlan

On 05/16, Tejun Heo wrote:
>
> Hello,
>
> On Mon, May 16, 2011 at 02:25:35PM +0200, Oleg Nesterov wrote:
> > On 05/13, Tejun Heo wrote:
> > >
> > > @@ -264,6 +267,9 @@ void task_clear_jobctl_pending(struct task_struct *task, unsigned int mask)
> > >  		mask |= JOBCTL_STOP_CONSUME | JOBCTL_STOP_DEQUEUED;
> > >
> > >  	task->jobctl &= ~mask;
> > > +
> > > +	if (!(task->jobctl & JOBCTL_PENDING_MASK))
> > > +		task_clear_jobctl_trapping(task);
> > >  }
> >
> > So, SIGCONT clears JOBCTL_TRAPPING and wakes up the tracer.
>
> If JOBCTL_TRAPPING is set && JOBCTL_STOP_PENDING was the only pending
> condition.
>
> > I can't really understand this without seeing the next changes, but
> > it seems this makes some things worse, although I am not sure.
>
> It's a safety mechanism.  We shouldn't have TRAPPING set when no
> stop/trap is pending and the above establishes that invariant

Hmm. I thought that SIGCONT should add the new TRAPPING... My head spins.

> > For example. PTRACE_SEIZE should guarantee the tracee will trap and
> > report. However, if the tracee is stopped during attach, we can race
> > with SIGCONT. The previous version had the similar problem afaics, but
> > it was easy (I think) to fix. Now that SIGCONT clears JOBCTL_TRAPPING
> > we need more complications.
>
> This problem doesn't exist anymore.

OK. Of course I do not understand your explanation right now, I do not
see the code, but I trust you ;)


My only point, I still think that it is better to not apply these
preparations right now, without the next SEIZE/etc changes.

Oleg.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH UPDATED 8/9] ptrace: move JOBCTL_TRAPPING wait to wait(2) and ptrace_check_attach()
  2011-05-16 13:36       ` Tejun Heo
@ 2011-05-16 16:04         ` Oleg Nesterov
  0 siblings, 0 replies; 24+ messages in thread
From: Oleg Nesterov @ 2011-05-16 16:04 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan,
	bdonlan

On 05/16, Tejun Heo wrote:
>
> On Mon, May 16, 2011 at 02:11:42PM +0200, Oleg Nesterov wrote:
> > Anyway, restart_syscall() is not right for do_wait(), especially with the
> > next patch. If the caller was woken by the real signal which has a handler,
> > we should not restart without SA_RESTART.
>
> I don't think it really matters and might even be incorrect if we do
> that.  e.g. we would introduce -EINTR failure to WNOHANG waits.

Agreed.

> The only case that I think of where this could be visible is sleeping
> wait(2), checking the state again after being woken up and gets signal
> while waiting for TRAPPING.  In this case, yeap, we should fail with
> -EINTR.

Yes, -EINTR or restart if SA_RESTART.

Oleg.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 6/9] job control: make task_clear_jobctl_pending() clear TRAPPING automatically
  2011-05-16 16:00       ` Oleg Nesterov
@ 2011-05-16 16:09         ` Tejun Heo
  0 siblings, 0 replies; 24+ messages in thread
From: Tejun Heo @ 2011-05-16 16:09 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan,
	bdonlan

Hello,

On Mon, May 16, 2011 at 06:00:19PM +0200, Oleg Nesterov wrote:
> > It's a safety mechanism.  We shouldn't have TRAPPING set when no
> > stop/trap is pending and the above establishes that invariant
> 
> Hmm. I thought that SIGCONT should add the new TRAPPING... My head spins.

So, it's two-fold, but we better talk about this with actual patches.

* SIGCONT always clears pending group stop - ie. JOBCTL_STOP_PENDING.
  If clearing of that flag leaves no group stop / trap condition
  pending (group stop pending was the only condition), it
  automatically clears TRAPPING.

* SIGCONT schedules JOBCTL_TRAP_STOP (STOP trap site for ptracer) and
  schedules TRAPPING, if tracee is already in TRAP_STOP for
  notification.  Note that this condition doesn't interact with the
  above action.

> My only point, I still think that it is better to not apply these
> preparations right now, without the next SEIZE/etc changes.

Sure thing.  As long as you don't see anything obviously wrong in the
prep patches, the rest of the series should be easier to review.

Anyways, will post soon.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/9] ptrace: relocate set_current_state(TASK_TRACED) in ptrace_stop()
  2011-05-16 15:59         ` Tejun Heo
@ 2011-05-16 16:34           ` Oleg Nesterov
  0 siblings, 0 replies; 24+ messages in thread
From: Oleg Nesterov @ 2011-05-16 16:34 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan,
	bdonlan

On 05/16, Tejun Heo wrote:
>
> Hello,
>
> On Mon, May 16, 2011 at 05:51:58PM +0200, Oleg Nesterov wrote:
> > > The racy part was task_is_stopped_or_traced() in task_stopped_code()
> > > and the value of exit_code doesn't matter at that point.
> >
> > Why exit_code doesn't matter? task_stopped_code() needs
> > task_is_stopped_or_traced() && exit_code != 0. Both changes should be
> > visible.
>
> Because the actual exit_code is checked only after grabbing siglock.

OK, this is true after the next 8/9 patch.

> As long as task_is_stopped_or_traced() is true, ptracer will grab
> siglock

Hmm. No? task_is_stopped_or_traced() is also checked under ->siglock?

Confused.

> > > All we need to update on the tracee is tracee->state and
> > > ~JOBCTL_TRAPPING and __wake_up_sync_key() can be considered single
> > > operation.
> >
> > Yes! IOW, it is safe to reorder the memory operations which change ->state,
> > ->exit_code, and ->jobctl. The only important thing is that we should not
> > wake up the tracer before we change them.
> >
> > And if I remember correctly this was the problem, the early patches did
> > something like
> >
> > 	task_clear_jobctl_trapping();
> > 	set_current_state(TASK_TRACED);
>
> Right, try_to_wake_up() already contains smp_wmb().

Well, I do not think try_to_wake_up()->smp_wmb() is needed in this case,
wait_queue_head_t->lock helps. This wmb() is needed to ensure we do not
change (or even read) task->state before the preceding LOADs complete.
It is not needed for wait_event-like code.

> We'll be fine
> with __set_current_state().  Can we do it in a later patch?

Sure, this is minor and needs a separate patch anyway.

Oleg.


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2011-05-16 16:36 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-13 15:46 [PATCHSET ptrace] ptrace: prepare for PTRACE_SEIZE/INTERRUPT Tejun Heo
2011-05-13 15:46 ` [PATCH 1/9] job control: reorganize wait_task_stopped() Tejun Heo
2011-05-16 11:56   ` Oleg Nesterov
2011-05-13 15:46 ` [PATCH 2/9] job control: rename signal->group_stop and flags to jobctl and rearrange flags Tejun Heo
2011-05-13 15:46 ` [PATCH 3/9] ptrace: ptrace_check_attach(): rename @kill to @ignore_state and add comments Tejun Heo
2011-05-13 15:46 ` [PATCH 4/9] ptrace: relocate set_current_state(TASK_TRACED) in ptrace_stop() Tejun Heo
2011-05-16 11:57   ` Oleg Nesterov
2011-05-16 13:16     ` Tejun Heo
2011-05-16 15:51       ` Oleg Nesterov
2011-05-16 15:59         ` Tejun Heo
2011-05-16 16:34           ` Oleg Nesterov
2011-05-13 15:46 ` [PATCH 5/9] job control: introduce JOBCTL_PENDING_MASK and task_clear_jobctl_pending() Tejun Heo
2011-05-13 15:46 ` [PATCH 6/9] job control: make task_clear_jobctl_pending() clear TRAPPING automatically Tejun Heo
2011-05-16 12:25   ` Oleg Nesterov
2011-05-16 13:24     ` Tejun Heo
2011-05-16 16:00       ` Oleg Nesterov
2011-05-16 16:09         ` Tejun Heo
2011-05-13 15:46 ` [PATCH 7/9] ptrace: use bit_waitqueue for TRAPPING instead of wait_chldexit Tejun Heo
2011-05-13 15:46 ` [PATCH 8/9] ptrace: move JOBCTL_TRAPPING wait to wait(2) and ptrace_check_attach() Tejun Heo
2011-05-14 14:22   ` [PATCH UPDATED " Tejun Heo
2011-05-16 12:11     ` Oleg Nesterov
2011-05-16 13:36       ` Tejun Heo
2011-05-16 16:04         ` Oleg Nesterov
2011-05-13 15:46 ` [PATCH 9/9] ptrace: make TRAPPING wait interruptible Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).