Re: A peculiarity in ptrace/waitpid behavior

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Re: A peculiarity in ptrace/waitpid behavior
       [not found] <CAJt8pk-+UGsmAzA8cTn3deWfSrDAy__Yh=bqi4_NRqJVhg63JQ@mail.gmail.com>
@ 2015-03-20 16:25 ` Oleg Nesterov
       [not found]   ` <CAJt8pk8-=xV_Tofr2W7jT8kLX-GseEeL5d2+w0U4zv2QqnP6rQ@mail.gmail.com>
  0 siblings, 1 reply; 6+ messages in thread
From: Oleg Nesterov @ 2015-03-20 16:25 UTC (permalink / raw)
  To: Pavel Labath; +Cc: linux-kernel

Hi Pavel,

let me add lkml, we should not discuss this offlist.

On 03/20, Pavel Labath wrote:
>
> 1) we get a waitpid() notification that the tracee got SIGUSR1
> 2) we do a ptrace(GETSIGINFO) to get more info
> 3) eventually we decide to restart the tracee with PTRACE_CONT, passing it
> SIGUSR1
> 4) immediately after that we get another waitpid notification, again with
> SIGUSR1, even though the thread had received no additional signals
> 5) we again try to a GETSIGINFO, however this time it fails with ESRCH.
> Therefore, we assume that the thread has died

I found a similar bug by code inspection some time ago. I even have
a fix, but I need to think more... And I even wrote the test-case ;)
see below.

But so far I can't say if you hit the same problem or not. If you can
reproduce the problem, perhaps I can send you debugging patch?

Oleg.

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <assert.h>

#define tkill(pid, sig)	\
	syscall(__NR_tkill, pid, sig)

void run_test(void)
{
	int pid, stat;

	pid = fork();
	if (!pid) {
		assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);
		raise(SIGSTOP);
		assert(0);
	}

	assert(pid == wait(&stat) && stat == 0x137f);

	tkill(pid, SIGTRAP);	/* should not be reported */
	tkill(pid, SIGKILL);
	assert(pid == wait(&stat));
	if (stat == 0x9)
		return;

	printf("unexpected wait: stat=%x\n", stat);
	kill(0, SIGKILL);
}

int main(void)
{
	int i = 8; /* random */

	while (--i)
		if (!fork())
			break;

	for (;;)
		run_test();

	return 0;
}


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: A peculiarity in ptrace/waitpid behavior
       [not found]   ` <CAJt8pk8-=xV_Tofr2W7jT8kLX-GseEeL5d2+w0U4zv2QqnP6rQ@mail.gmail.com>
@ 2015-03-20 18:53     ` Pavel Labath
  2015-03-21 18:57     ` Oleg Nesterov
  1 sibling, 0 replies; 6+ messages in thread
From: Pavel Labath @ 2015-03-20 18:53 UTC (permalink / raw)
  Cc: linux-kernel

Sending again, this time as plain text (I hope)...

On 20 March 2015 at 18:46, Pavel Labath <labath@google.com> wrote:
>
> Hi,
>
> thanks for the super quick response. :)
>
> I am at home now, so I don't have access to the same machine to run the test. I will run it on monday and let you know.
>
> Meanwhile, I have tried running your test on my home machine, and it is indeed reporting "unexpected wait: stat=57f". If I understand correctly, that means the wait has reported sigtrap even though the tracee was in ptrace-stop.
>
> I can imagine that something similar is happening in our case. Since PTRACE_CONT and waitpid calls are happening in different threads, I can't positively say which one has occurred sooner. So far I have assumed the sequence was PTRACE_CONT -> waitpid -> PTRACE_SIGINFO. However, if wait can return even though the process is stopped then a possible sequence of events is waitpid -> PTRACE_CONT -> PTRACE_SIGINFO, in which case it is not surprising that the last call fails. One difference I see though is that in our test, we are not sending any additional signals to the thread in question (at least we shouldn't be sending them, but we are sending some signals to other threads in the same process). Do you think it could still be the same issue?
>
> I would be happy to test your patch. I don't think I can patch the kernel on my work machine directly, but I think I might be able to set up some sort of a test environment to try it out.
>
> regards,
> pavel
>
>
> On 20 March 2015 at 16:25, Oleg Nesterov <oleg@redhat.com> wrote:
>>
>> Hi Pavel,
>>
>> let me add lkml, we should not discuss this offlist.
>>
>> On 03/20, Pavel Labath wrote:
>> >
>> > 1) we get a waitpid() notification that the tracee got SIGUSR1
>> > 2) we do a ptrace(GETSIGINFO) to get more info
>> > 3) eventually we decide to restart the tracee with PTRACE_CONT, passing it
>> > SIGUSR1
>> > 4) immediately after that we get another waitpid notification, again with
>> > SIGUSR1, even though the thread had received no additional signals
>> > 5) we again try to a GETSIGINFO, however this time it fails with ESRCH.
>> > Therefore, we assume that the thread has died
>>
>> I found a similar bug by code inspection some time ago. I even have
>> a fix, but I need to think more... And I even wrote the test-case ;)
>> see below.
>>
>> But so far I can't say if you hit the same problem or not. If you can
>> reproduce the problem, perhaps I can send you debugging patch?
>>
>> Oleg.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: A peculiarity in ptrace/waitpid behavior
       [not found]   ` <CAJt8pk8-=xV_Tofr2W7jT8kLX-GseEeL5d2+w0U4zv2QqnP6rQ@mail.gmail.com>
  2015-03-20 18:53     ` Pavel Labath
@ 2015-03-21 18:57     ` Oleg Nesterov
  2015-03-22 15:33       ` Oleg Nesterov
  2015-03-23  9:42       ` Pavel Labath
  1 sibling, 2 replies; 6+ messages in thread
From: Oleg Nesterov @ 2015-03-21 18:57 UTC (permalink / raw)
  To: Pavel Labath; +Cc: linux-kernel

On 03/20, Pavel Labath wrote:
>
> One difference I see though is that in
> our test, we are not sending any additional signals to the thread in
> question (at least we shouldn't be sending them, but we are sending some
> signals to other threads in the same process). Do you think it could still
> be the same issue?

Not sure...

And. I found another race, which looks more promising wrt your description.
ptrace_resume() sets ->exit_code before it wakes the tracee up. If the
tracer's sub-thread calls wait() right after that, it can wrongly see
task_stopped_code(tracee, true) != 0, as if the tracee reports its
->exit_code.

> I would be happy to test your patch. I don't think I can patch the kernel
> on my work machine directly, but I think I might be able to set up some
> sort of a test environment to try it out.

Thanks! could you try the patch below? It won't help my test-case, but
_perhaps_ it can fix the problem you hit?

And a couple of questions just in case...

Which kernel version? Although probably this doesn't matter, this race
is very-very old.

Let me return to your description,

	1) we get a waitpid() notification that the tracee got SIGUSR1
	2) we do a ptrace(GETSIGINFO) to get more info
	3) eventually we decide to restart the tracee with PTRACE_CONT, passing it
	   SIGUSR1
	4) immediately after that we get another waitpid notification, again with
	   SIGUSR1,

Does this "waitpid notification" mean that _another_ thread returns
from waitpid() ?

And status == (SIGUSR1 << 8) | 0x7f , yes? IOW, is WIFSTOPPED() true?

Oleg.

--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -724,8 +724,10 @@ static int ptrace_resume(struct task_struct *child, long request,
 		user_disable_single_step(child);
 	}
 
+	spin_lock_irq(&child->sighand->siglock);
 	child->exit_code = data;
 	wake_up_state(child, __TASK_TRACED);
+	spin_unlock_irq(&child->sighand->siglock);
 
 	return 0;
 }


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: A peculiarity in ptrace/waitpid behavior
  2015-03-21 18:57     ` Oleg Nesterov
@ 2015-03-22 15:33       ` Oleg Nesterov
  2015-03-23  9:42       ` Pavel Labath
  1 sibling, 0 replies; 6+ messages in thread
From: Oleg Nesterov @ 2015-03-22 15:33 UTC (permalink / raw)
  To: Pavel Labath; +Cc: linux-kernel, Josh Stone

On 03/21, Oleg Nesterov wrote:
>
> On 03/20, Pavel Labath wrote:
> >
> > One difference I see though is that in
> > our test, we are not sending any additional signals to the thread in
> > question (at least we shouldn't be sending them, but we are sending some
> > signals to other threads in the same process). Do you think it could still
> > be the same issue?
>
> Not sure...
>
> And. I found another race, which looks more promising wrt your description.
> ptrace_resume() sets ->exit_code before it wakes the tracee up. If the
> tracer's sub-thread calls wait() right after that, it can wrongly see
> task_stopped_code(tracee, true) != 0, as if the tracee reports its
> ->exit_code.
>
> > I would be happy to test your patch. I don't think I can patch the kernel
> > on my work machine directly, but I think I might be able to set up some
> > sort of a test environment to try it out.
>
> Thanks! could you try the patch below? It won't help my test-case, but
> _perhaps_ it can fix the problem you hit?
>
> And a couple of questions just in case...
>
> Which kernel version? Although probably this doesn't matter, this race
> is very-very old.
>
> Let me return to your description,
>
> 	1) we get a waitpid() notification that the tracee got SIGUSR1
> 	2) we do a ptrace(GETSIGINFO) to get more info
> 	3) eventually we decide to restart the tracee with PTRACE_CONT, passing it
> 	   SIGUSR1
> 	4) immediately after that we get another waitpid notification, again with
> 	   SIGUSR1,
>
> Does this "waitpid notification" mean that _another_ thread returns
> from waitpid() ?
>
> And status == (SIGUSR1 << 8) | 0x7f , yes? IOW, is WIFSTOPPED() true?


Please see the updated (comments + optimization) patch below. Note that
it has the test-case, so I'll send it in any case.

But it would be nice to know if it fixes your problem or not.

Josh, this reminds me the obsure bug report we discussed some time ago,
perhaps this is the same thing...

Oleg.

------------------------------------------------------------------------------
Subject: [PATCH] ptrace: fix race between ptrace_resume() and wait_task_stopped()

ptrace_resume() is called when the tracee is still __TASK_TRACED. We set
tracee->exit_code and then wake_up_state() changes tracee->state. If the
tracer's sub-thread does wait() in between, task_stopped_code(ptrace => T)
wrongly looks like another report from tracee.

This confuses debugger, and since wait_task_stopped() clears ->exit_code
the tracee can miss a signal.

Test-case:

	#include <stdio.h>
	#include <unistd.h>
	#include <sys/wait.h>
	#include <sys/ptrace.h>
	#include <pthread.h>
	#include <assert.h>

	int pid;

	void *waiter(void *arg)
	{
		int stat;

		for (;;) {
			assert(pid == wait(&stat));
			assert(WIFSTOPPED(stat));
			if (WSTOPSIG(stat) == SIGHUP)
				continue;

			assert(WSTOPSIG(stat) == SIGCONT);
			printf("ERR! extra/wrong report:%x\n", stat);
		}
	}

	int main(void)
	{
		pthread_t thread;

		pid = fork();
		if (!pid) {
			assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);
			for (;;)
				kill(getpid(), SIGHUP);
		}

		assert(pthread_create(&thread, NULL, waiter, NULL) == 0);

		for (;;)
			ptrace(PTRACE_CONT, pid, 0, SIGCONT);

		return 0;
	}

Note for stable: the bug is very old, but without 9899d11f6544 "ptrace:
ensure arch_ptrace/ptrace_request can never race with SIGKILL" the fix
should use lock_task_sighand(child).

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Pavel Labath <labath@google.com>
Cc: <stable@vger.kernel.org>
---
 kernel/ptrace.c |   20 ++++++++++++++++++++
 1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 1eb9d90..5009263 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -697,6 +697,8 @@ static int ptrace_peek_siginfo(struct task_struct *child,
 static int ptrace_resume(struct task_struct *child, long request,
 			 unsigned long data)
 {
+	bool need_siglock;
+
 	if (!valid_signal(data))
 		return -EIO;
 
@@ -724,8 +726,26 @@ static int ptrace_resume(struct task_struct *child, long request,
 		user_disable_single_step(child);
 	}
 
+	/*
+	 * Change ->exit_code and ->state under siglock to avoid the race
+	 * with wait_task_stopped() in between; a non-zero ->exit_code will
+	 * wrongly look like another report from tracee.
+	 *
+	 * Note that we need siglock even if ->exit_code == data and/or this
+	 * status was not reported yet, the new status must not be cleared by
+	 * wait_task_stopped() after resume.
+	 *
+	 * If data == 0 we do not care if wait_task_stopped() reports the old
+	 * status and clears the code too; this can't race with the tracee, it
+	 * takes siglock after resume.
+	 */
+	need_siglock = data && !thread_group_empty(current);
+	if (need_siglock)
+		spin_lock_irq(&child->sighand->siglock);
 	child->exit_code = data;
 	wake_up_state(child, __TASK_TRACED);
+	if (need_siglock)
+		spin_unlock_irq(&child->sighand->siglock);
 
 	return 0;
 }
-- 
1.5.5.1



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: A peculiarity in ptrace/waitpid behavior
  2015-03-21 18:57     ` Oleg Nesterov
  2015-03-22 15:33       ` Oleg Nesterov
@ 2015-03-23  9:42       ` Pavel Labath
  2015-03-24 16:53         ` Pavel Labath
  1 sibling, 1 reply; 6+ messages in thread
From: Pavel Labath @ 2015-03-23  9:42 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: linux-kernel

On 21 March 2015 at 18:57, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/20, Pavel Labath wrote:
>>
>> One difference I see though is that in
>> our test, we are not sending any additional signals to the thread in
>> question (at least we shouldn't be sending them, but we are sending some
>> signals to other threads in the same process). Do you think it could still
>> be the same issue?
>
> Not sure...
>
> And. I found another race, which looks more promising wrt your description.
> ptrace_resume() sets ->exit_code before it wakes the tracee up. If the
> tracer's sub-thread calls wait() right after that, it can wrongly see
> task_stopped_code(tracee, true) != 0, as if the tracee reports its
> ->exit_code.
Yes, that sounds very much like what is happening. :)


>
>> I would be happy to test your patch. I don't think I can patch the kernel
>> on my work machine directly, but I think I might be able to set up some
>> sort of a test environment to try it out.
>
> Thanks! could you try the patch below? It won't help my test-case, but
> _perhaps_ it can fix the problem you hit?
Thanks for working on this. I will get on testing this. It will take a
while though, because the failure is quite rare and usually only shows
up in a full test suite run, which takes over 5 minutes. :)

>
> And a couple of questions just in case...
>
> Which kernel version? Although probably this doesn't matter, this race
> is very-very old.
Linux labath 3.13.0-46-generic #79-Ubuntu SMP Tue Mar 10 20:06:50 UTC
2015 x86_64 x86_64 x86_64 GNU/Linux
It should be the kernel shipped with Ubuntu 14.04 LTS.

>
> Let me return to your description,
>
>         1) we get a waitpid() notification that the tracee got SIGUSR1
>         2) we do a ptrace(GETSIGINFO) to get more info
>         3) eventually we decide to restart the tracee with PTRACE_CONT, passing it
>            SIGUSR1
>         4) immediately after that we get another waitpid notification, again with
>            SIGUSR1,
>
> Does this "waitpid notification" mean that _another_ thread returns
> from waitpid() ?
Yes, we have one thread doing waitpid() processing in a loop, and
another thread is issuing PTRACE requests.

> And status == (SIGUSR1 << 8) | 0x7f , yes? IOW, is WIFSTOPPED() true?

Exactly. The following is an excerpt from our log, which I was able to capture.
548 1426700116.644772000 [7295/729a]: NativeProcessLinux::Resume()
resuming thread = 29346 with signal SIGUSR1
554 1426700116.644889000 [7295/729c]:
MonitorChildProcessThreadFunction ::waitpid (pid = -29339, &status,
options = 1073741824)...
555 1426700116.644896000 [7295/729c]:
MonitorChildProcessThreadFunction ::waitpid (pid = -29339, &status,
options = 1073741824) => pid = 29346, status = 0x00000a7f (STOPPED),
signal = 10, exit_state = 0
556 1426700116.644906000 [7295/729a]: NativeProcessLinux::Resume()
resuming thread = 29346 result = true

The PTRACE_CONT call happens somewhere between lines 548 and 556,
while the waitpid calls happens between 554 and 555. So it is quite
possible that they are called simultaneously...


Thanks for working on this.

cheers,
pl

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: A peculiarity in ptrace/waitpid behavior
  2015-03-23  9:42       ` Pavel Labath
@ 2015-03-24 16:53         ` Pavel Labath
  0 siblings, 0 replies; 6+ messages in thread
From: Pavel Labath @ 2015-03-24 16:53 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: linux-kernel

Hi,

I have tested your patch and I can confirm that the error is gone when
running a patched kernel. Thanks for fixing this in such a short
notice.

I am still seeing one very rare failure where the SIGUSR does not
appear to be reported. However, I will need to dig around this a bit
more to make sure there is no error on our end.

Now I am thinking about how to work around these bugs, as our code
will need to run on unpatched kernels as well. As for this
ptrace/waitpid race, I think I will just refactor the code to make
wait and ptrace calls on the same thread. This should sidestep the
race, right?

Regarding your bug, I am not exactly sure what are the implications.
Could you briefly describe the situations in which this behavior can
occur? Am I correct in understanding that this is always a race
between a SIGKILL and another non-lethal signal? And that the SIGKILL
will be (eventually) reported?

cheers,
pl

On 23 March 2015 at 09:42, Pavel Labath <labath@google.com> wrote:
> On 21 March 2015 at 18:57, Oleg Nesterov <oleg@redhat.com> wrote:
>> On 03/20, Pavel Labath wrote:
>>>
>>> One difference I see though is that in
>>> our test, we are not sending any additional signals to the thread in
>>> question (at least we shouldn't be sending them, but we are sending some
>>> signals to other threads in the same process). Do you think it could still
>>> be the same issue?
>>
>> Not sure...
>>
>> And. I found another race, which looks more promising wrt your description.
>> ptrace_resume() sets ->exit_code before it wakes the tracee up. If the
>> tracer's sub-thread calls wait() right after that, it can wrongly see
>> task_stopped_code(tracee, true) != 0, as if the tracee reports its
>> ->exit_code.
> Yes, that sounds very much like what is happening. :)
>
>
>>
>>> I would be happy to test your patch. I don't think I can patch the kernel
>>> on my work machine directly, but I think I might be able to set up some
>>> sort of a test environment to try it out.
>>
>> Thanks! could you try the patch below? It won't help my test-case, but
>> _perhaps_ it can fix the problem you hit?
> Thanks for working on this. I will get on testing this. It will take a
> while though, because the failure is quite rare and usually only shows
> up in a full test suite run, which takes over 5 minutes. :)
>
>>
>> And a couple of questions just in case...
>>
>> Which kernel version? Although probably this doesn't matter, this race
>> is very-very old.
> Linux labath 3.13.0-46-generic #79-Ubuntu SMP Tue Mar 10 20:06:50 UTC
> 2015 x86_64 x86_64 x86_64 GNU/Linux
> It should be the kernel shipped with Ubuntu 14.04 LTS.
>
>>
>> Let me return to your description,
>>
>>         1) we get a waitpid() notification that the tracee got SIGUSR1
>>         2) we do a ptrace(GETSIGINFO) to get more info
>>         3) eventually we decide to restart the tracee with PTRACE_CONT, passing it
>>            SIGUSR1
>>         4) immediately after that we get another waitpid notification, again with
>>            SIGUSR1,
>>
>> Does this "waitpid notification" mean that _another_ thread returns
>> from waitpid() ?
> Yes, we have one thread doing waitpid() processing in a loop, and
> another thread is issuing PTRACE requests.
>
>> And status == (SIGUSR1 << 8) | 0x7f , yes? IOW, is WIFSTOPPED() true?
>
> Exactly. The following is an excerpt from our log, which I was able to capture.
> 548 1426700116.644772000 [7295/729a]: NativeProcessLinux::Resume()
> resuming thread = 29346 with signal SIGUSR1
> 554 1426700116.644889000 [7295/729c]:
> MonitorChildProcessThreadFunction ::waitpid (pid = -29339, &status,
> options = 1073741824)...
> 555 1426700116.644896000 [7295/729c]:
> MonitorChildProcessThreadFunction ::waitpid (pid = -29339, &status,
> options = 1073741824) => pid = 29346, status = 0x00000a7f (STOPPED),
> signal = 10, exit_state = 0
> 556 1426700116.644906000 [7295/729a]: NativeProcessLinux::Resume()
> resuming thread = 29346 result = true
>
> The PTRACE_CONT call happens somewhere between lines 548 and 556,
> while the waitpid calls happens between 554 and 555. So it is quite
> possible that they are called simultaneously...
>
>
> Thanks for working on this.
>
> cheers,
> pl

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2015-03-24 16:54 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <CAJt8pk-+UGsmAzA8cTn3deWfSrDAy__Yh=bqi4_NRqJVhg63JQ@mail.gmail.com>
2015-03-20 16:25 ` A peculiarity in ptrace/waitpid behavior Oleg Nesterov
     [not found]   ` <CAJt8pk8-=xV_Tofr2W7jT8kLX-GseEeL5d2+w0U4zv2QqnP6rQ@mail.gmail.com>
2015-03-20 18:53     ` Pavel Labath
2015-03-21 18:57     ` Oleg Nesterov
2015-03-22 15:33       ` Oleg Nesterov
2015-03-23  9:42       ` Pavel Labath
2015-03-24 16:53         ` Pavel Labath

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).