public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Guenter Roeck <linux@roeck-us.net>
Cc: Vovo Yang <vovoy@google.com>, Ingo Molnar <mingo@kernel.org>,
	linux-kernel@vger.kernel.org
Subject: Re: Threads stuck in zap_pid_ns_processes()
Date: Thu, 01 Jun 2017 14:36:38 -0500	[thread overview]
Message-ID: <87tw3z8pq1.fsf@xmission.com> (raw)
In-Reply-To: <20170601184549.GA28522@roeck-us.net> (Guenter Roeck's message of "Thu, 1 Jun 2017 11:45:49 -0700")

Guenter Roeck <linux@roeck-us.net> writes:

> On Thu, Jun 01, 2017 at 12:08:58PM -0500, Eric W. Biederman wrote:
>> Guenter Roeck <linux@roeck-us.net> writes:
>> >
>> > I think you nailed it. If I drop CLONE_NEWPID from the reproducer I get
>> > a zombie process.
>> >
>> > I guess the only question left is if zap_pid_ns_processes() should (or could)
>> > somehow detect that situation and return instead of waiting forever.
>> > What do you think ?
>> 
>> Any chance you can point me at the chromium code that is performing the
>> ptrace?
>> 
>> I want to conduct a review of the kernel semantics to see if the current
>> semantics make it unnecessarily easy to get into hang situations.  If
>> the semantics make it really easy to get into a hang situation I want
>> to see if there is anything we can do to delicately change the semantics
>> to avoid the hangs without breaking existing userspace.
>> 
> The internal bug should be accessible to you.
>
> https://bugs.chromium.org/p/chromium/issues/detail?id=721298&desc=2
>
> It has some additional information, and points to the following code in Chrome.
>
> https://cs.chromium.org/chromium/src/breakpad/src/client/linux/minidump_writer/linux_ptrace_dumper.cc?rcl=47e51739fd00badbceba5bc26b8abc8bbd530989&l=85
>
> With the information we have, I don't really have a good idea what we could or
> should change in Chrome to make the problem disappear, so I just concluded that
> we'll have to live with the forever-sleeping task.

I believe I see what is happening.  The code makes the assumption that a
thread will stay stopped and will not go away once ptrace attach
completes.

Unfortunately if someone sends SIGKILL to the process or exec sends
SIGKILL to the individual thread then PTRACE_DETACH will fail.

At which point you can use waitpid to reap the zombie and detach
from the thread.

So I think the forever-sleeping can be fixed with something as simple
as changing ResumeThread to say:

// Resumes a thread by detaching from it.
static bool ResumeThread(pid_t pid) {
  if (sys_ptrace(PTRACE_DETACH, pid, NULL, NULL) >= 0)
  	return true;
  /* Someone killed the thread? */
  return waitpid(pid, NULL, 0) == pid;
}

It almost certainly makes sense to fix PTRACE_DETACH in the kernel to
allow this case to work.  And odds are good that we could make that
change without breaking anyone.  So it is worth a try.

Eric

  reply	other threads:[~2017-06-01 19:43 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-11 17:11 Threads stuck in zap_pid_ns_processes() Guenter Roeck
2017-05-11 17:31 ` Eric W. Biederman
2017-05-11 18:35   ` Guenter Roeck
2017-05-11 20:23     ` Eric W. Biederman
2017-05-11 20:48       ` Guenter Roeck
2017-05-11 21:39         ` Eric W. Biederman
2017-05-11 20:21   ` Guenter Roeck
2017-05-11 21:25     ` Eric W. Biederman
2017-05-11 22:47       ` Guenter Roeck
2017-05-11 23:19         ` Eric W. Biederman
2017-05-12  9:30           ` Vovo Yang
2017-05-12 13:26             ` Eric W. Biederman
2017-05-12 16:52               ` Guenter Roeck
2017-05-12 17:33                 ` Eric W. Biederman
2017-05-12 17:55                   ` [REVIEW][PATCH] pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes Eric W. Biederman
2017-05-12 19:33                     ` Guenter Roeck
2017-05-12 19:43                   ` Threads stuck in zap_pid_ns_processes() Guenter Roeck
2017-05-12 20:03                     ` Eric W. Biederman
2017-05-13 14:34                       ` Guenter Roeck
2017-05-13 18:21                         ` Eric W. Biederman
2017-06-01 17:08                         ` Eric W. Biederman
2017-06-01 18:45                           ` Guenter Roeck
2017-06-01 19:36                             ` Eric W. Biederman [this message]
2017-06-01 21:43                               ` Guenter Roeck
2017-06-02  1:06                                 ` Eric W. Biederman
2017-05-12  3:42         ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87tw3z8pq1.fsf@xmission.com \
    --to=ebiederm@xmission.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@roeck-us.net \
    --cc=mingo@kernel.org \
    --cc=vovoy@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox