public inbox for linux-kernel@vger.kernel.org
From: Oleg Nesterov <oleg@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	Pavel Labath <labath@google.com>
Cc: Josh Stone <jistone@redhat.com>, Pedro Alves <palves@redhat.com>,
	Vince Harron <vharron@google.com>,
	linux-kernel@vger.kernel.org
Subject: [PATCH 0/2] ptrace: fix race between ptrace_resume() and wait_task_stopped()
Date: Tue, 24 Mar 2015 19:54:00 +0100
Message-ID: <20150324185400.GA11826@redhat.com>

On 03/24, Pavel Labath wrote:
>
> I have tested your patch and I can confirm that the error is gone when
> running a patched kernel.

Great! Thanks a lot for the detailed/clear report and testing.

So I am sending this fix plus another patch. 2/2 is a "while at it" change:
ptrace_detach() can resume the tracee with the new code too, so it makes
sense to add a comment and remove the outdated logic.

> I am still seeing one very rare failure where the SIGUSR does not
> appear to be reported. However, I will need to dig around this a bit
> more to make sure there is no error on our end.

Hmm, perhaps we have (yet) another bug... please let me know if/when
you have more details.

> Now I am thinking about how to work around these bugs, as our code
> will need to run on unpatched kernels as well. As for this
> ptrace/waitpid race, I think I will just refactor the code to make
> wait and ptrace calls on the same thread. This should sidestep the
> race, right?

Yes sure, this will hide the problem.
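
A minimal sketch of that workaround, assuming a tracer that forks its own
tracee: every waitpid() and ptrace() call is made from one thread, so a
resume cannot overlap a wait issued elsewhere. The helper name
trace_one_signal() and the use of SIGUSR1 are illustrative only, not taken
from the code under discussion.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a tracee, resume it once, and return the
 * signal it stops with next, or -1 on error. All wait and ptrace calls
 * are issued by the calling thread.
 */
int trace_one_signal(void)
{
	int status;
	int sig = -1;
	pid_t r;
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Tracee: ask to be traced, then stop for the tracer. */
		if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
			_exit(1);
		raise(SIGSTOP);
		raise(SIGUSR1);	/* reported to the tracer as a stop */
		_exit(0);
	}

	/* Initial stop caused by SIGSTOP. */
	if (waitpid(pid, &status, 0) == pid && WIFSTOPPED(status)) {
		/* Resume from the same thread; data == 0 drops SIGSTOP. */
		if (ptrace(PTRACE_CONT, pid, NULL, NULL) == 0 &&
		    waitpid(pid, &status, 0) == pid && WIFSTOPPED(status))
			sig = WSTOPSIG(status);
	}

	/* Clean up: kill the tracee and reap it. */
	kill(pid, SIGKILL);
	do {
		r = waitpid(pid, &status, 0);
	} while (r == pid && !WIFEXITED(status) && !WIFSIGNALED(status));

	return sig;
}
```

In a multithreaded debugger the same effect could be had by funnelling all
requests for a given tracee through one dedicated thread.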

> Regarding your bug, I am not exactly sure what are the implications.
> Could you briefly describe the situations in which this behavior can
> occur? Am I correct in understanding that this is always a race
> between a SIGKILL and another non-lethal signal? And that the SIGKILL
> will be (eventually) reported?

No, SIGKILL may never be reported. But note that ptrace_stop() does

	set_current_state(TASK_TRACED);

	current->last_siginfo = info;
	current->exit_code = exit_code;

and this is another case where wait_task_stopped() can consume/report this
exit_code even if the tracee won't actually stop because it is killed.

Usually this is not that bad; we can pretend that it was killed after the
stop. Still, this can confuse a debugger that sends SIGKILL to the stopped
tracee. We need more fatal_signal_pending() checks in ptrace_stop(), and in
fact in get_signal() too, I think. The problem is that we need other
cleanups here. But fortunately this problem is minor.
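
On the debugger side, one defensive pattern (a sketch only, not something
these patches require) is to treat any stop reported after sending SIGKILL
as possibly stale and keep waiting until the exit itself is reported. The
helper name kill_and_reap() is hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: send SIGKILL to a child (traced or not) and wait
 * until its termination is reported. Stop reports seen in between are
 * ignored, since the task may already be dying. Returns 0 and stores the
 * final wait status in *status_out, or returns -1 on error.
 */
int kill_and_reap(pid_t pid, int *status_out)
{
	int status;

	if (kill(pid, SIGKILL) == -1 && errno != ESRCH)
		return -1;

	for (;;) {
		pid_t r = waitpid(pid, &status, 0);

		if (r == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		if (WIFEXITED(status) || WIFSIGNALED(status)) {
			*status_out = status;
			return 0;
		}
		/* WIFSTOPPED: do not act on it, keep waiting for the exit. */
	}
}
```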

Oleg.


Thread overview: 3+ messages
2015-03-24 18:54 Oleg Nesterov [this message]
2015-03-24 18:54 ` [PATCH 1/2] ptrace: fix race between ptrace_resume() and wait_task_stopped() Oleg Nesterov
2015-03-24 18:54 ` [PATCH 2/2] ptrace: ptrace_detach() can no longer race with SIGKILL Oleg Nesterov
