public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Guenter Roeck <linux@roeck-us.net>
Cc: Vovo Yang <vovoy@google.com>, Ingo Molnar <mingo@kernel.org>,
	linux-kernel@vger.kernel.org
Subject: Re: Threads stuck in zap_pid_ns_processes()
Date: Thu, 01 Jun 2017 20:06:46 -0500	[thread overview]
Message-ID: <87efv38aft.fsf@xmission.com> (raw)
In-Reply-To: <20170601214339.GB8803@roeck-us.net> (Guenter Roeck's message of "Thu, 1 Jun 2017 14:43:39 -0700")

Guenter Roeck <linux@roeck-us.net> writes:

> On Thu, Jun 01, 2017 at 02:36:38PM -0500, Eric W. Biederman wrote:
>> Guenter Roeck <linux@roeck-us.net> writes:
>> 
>> > On Thu, Jun 01, 2017 at 12:08:58PM -0500, Eric W. Biederman wrote:
>> >> Guenter Roeck <linux@roeck-us.net> writes:
>> >> >
>> >> > I think you nailed it. If I drop CLONE_NEWPID from the reproducer I get
>> >> > a zombie process.
>> >> >
>> >> > I guess the only question left is if zap_pid_ns_processes() should (or could)
>> >> > somehow detect that situation and return instead of waiting forever.
>> >> > What do you think ?
>> >> 
>> >> Any chance you can point me at the chromium code that is performing the
>> >> ptrace?
>> >> 
>> >> I want to conduct a review of the kernel semantics to see if the current
>> >> semantics make it unnecessarily easy to get into hang situations.  If
>> >> the semantics make it really easy to get into a hang situation I want
>> >> to see if there is anything we can do to delicately change the semantics
>> >> to avoid the hangs without breaking existing userspace.
>> >> 
>> > The internal bug should be accessible to you.
>> >
>> > https://bugs.chromium.org/p/chromium/issues/detail?id=721298&desc=2
>> >
>> > It has some additional information, and points to the following code in Chrome.
>> >
>> > https://cs.chromium.org/chromium/src/breakpad/src/client/linux/minidump_writer/linux_ptrace_dumper.cc?rcl=47e51739fd00badbceba5bc26b8abc8bbd530989&l=85
>> >
>> > With the information we have, I don't really have a good idea what we could or
>> > should change in Chrome to make the problem disappear, so I just concluded that
>> > we'll have to live with the forever-sleeping task.
>> 
>> I believe I see what is happening.  The code makes the assumption that a
>> thread will stay stopped and will not go away once ptrace attach
>> completes.
>> 
>> Unfortunately if someone sends SIGKILL to the process or exec sends
>> SIGKILL to the individual thread then PTRACE_DETACH will fail.
>> 
>> At which point you can use waitpid to reap the zombie and detach
>> from the thread.
>> 
>> So I think the forever-sleeping can be fixed with something as simple
>> as changing ResumeThread to say:
>> 
>> // Resumes a thread by detaching from it.
>> static bool ResumeThread(pid_t pid) {
>>   if (sys_ptrace(PTRACE_DETACH, pid, NULL, NULL) >= 0)
>>   	return true;
>>   /* Someone killed the thread? */
>>   return waitpid(pid, NULL, 0) == pid;
>> }
>> 
>> It almost certainly makes sense to fix PTRACE_DETACH in the kernel to
>> allow this case to work.  And odds are good that we could make that
>> change without breaking anyone.  So it is worth a try.
>> 
>
> Do I interpret this correctly as "the above code should work, but currently
> doesn't" ?

I added the early exit and the fallback waitpid clause.  So I am saying
with a trivial modification the code can be made to work.

Eric

  reply	other threads:[~2017-06-02  1:13 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-11 17:11 Threads stuck in zap_pid_ns_processes() Guenter Roeck
2017-05-11 17:31 ` Eric W. Biederman
2017-05-11 18:35   ` Guenter Roeck
2017-05-11 20:23     ` Eric W. Biederman
2017-05-11 20:48       ` Guenter Roeck
2017-05-11 21:39         ` Eric W. Biederman
2017-05-11 20:21   ` Guenter Roeck
2017-05-11 21:25     ` Eric W. Biederman
2017-05-11 22:47       ` Guenter Roeck
2017-05-11 23:19         ` Eric W. Biederman
2017-05-12  9:30           ` Vovo Yang
2017-05-12 13:26             ` Eric W. Biederman
2017-05-12 16:52               ` Guenter Roeck
2017-05-12 17:33                 ` Eric W. Biederman
2017-05-12 17:55                   ` [REVIEW][PATCH] pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes Eric W. Biederman
2017-05-12 19:33                     ` Guenter Roeck
2017-05-12 19:43                   ` Threads stuck in zap_pid_ns_processes() Guenter Roeck
2017-05-12 20:03                     ` Eric W. Biederman
2017-05-13 14:34                       ` Guenter Roeck
2017-05-13 18:21                         ` Eric W. Biederman
2017-06-01 17:08                         ` Eric W. Biederman
2017-06-01 18:45                           ` Guenter Roeck
2017-06-01 19:36                             ` Eric W. Biederman
2017-06-01 21:43                               ` Guenter Roeck
2017-06-02  1:06                                 ` Eric W. Biederman [this message]
2017-05-12  3:42         ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87efv38aft.fsf@xmission.com \
    --to=ebiederm@xmission.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@roeck-us.net \
    --cc=mingo@kernel.org \
    --cc=vovoy@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox