From mboxrd@z Thu Jan  1 00:00:00 1970
From: ebiederm@xmission.com (Eric W. Biederman)
To: Guenter Roeck
Cc: Vovo Yang, Ingo Molnar, linux-kernel@vger.kernel.org
Subject: Re: Threads stuck in zap_pid_ns_processes()
Date: Fri, 12 May 2017 12:33:01 -0500
Message-ID: <874lwqyo8i.fsf@xmission.com>
In-Reply-To: <20170512165214.GA12960@roeck-us.net> (Guenter Roeck's message
 of "Fri, 12 May 2017 09:52:14 -0700")
References: <20170511171108.GB15063@roeck-us.net>
 <87shkbfggm.fsf@xmission.com> <20170511202104.GA14720@roeck-us.net>
 <87y3u3axx8.fsf@xmission.com> <20170511224724.GB15676@roeck-us.net>
 <8760h79e22.fsf@xmission.com> <8760h66wak.fsf@xmission.com>
 <20170512165214.GA12960@roeck-us.net>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Sender: linux-kernel-owner@vger.kernel.org
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org

Guenter Roeck writes:

> Hi Eric,
>
> On Fri, May 12, 2017 at 08:26:27AM -0500, Eric W. Biederman wrote:
>> Vovo Yang writes:
>>
>> > On Fri, May 12, 2017 at 7:19 AM, Eric W. Biederman wrote:
>> >> Guenter Roeck writes:
>> >>
>> >>> What I know so far is
>> >>> - We see this condition on a regular basis in the field. Regular is
>> >>>   relative, of course - let's say maybe 1 in a million Chromebooks
>> >>>   per day reports a crash because of it. That is not that many,
>> >>>   but it adds up.
>> >>> - We are able to reproduce the problem with a performance benchmark
>> >>>   which opens 100 chrome tabs. While that is a lot, it should not
>> >>>   result in a kernel hang/crash.
>> >>> - Vovo provided the test code last night. I don't know if this is
>> >>>   exactly what is observed in the benchmark, or how it relates to
>> >>>   the benchmark in the first place, but it is the first time we are
>> >>>   actually able to reliably create a condition where the problem
>> >>>   is seen.
>> >>
>> >> Thank you. It will be interesting to hear what is happening in the
>> >> chrome performance benchmark that triggers this.
>> >>
>> > What's happening in the benchmark:
>> > 1. A chrome renderer process was created with CLONE_NEWPID
>> > 2. The process crashed
>> > 3. The Chrome breakpad service calls ptrace(PTRACE_ATTACH, ..) to
>> >    attach to every thread of the crashed process to dump info
>> > 4. When breakpad detaches from the crashed process, the crashed
>> >    process gets stuck in zap_pid_ns_processes()
>>
>> Very interesting, thank you.
>>
>> So the question is specifically which interaction is causing this.
>>
>> In the test case provided it was a sibling task in the pid namespace
>> dying and not being reaped, which may be what is happening with
>> breakpad. So far I have yet to see a kernel bug, but I won't rule
>> one out.
>>
> I am trying to understand what you are looking for. I would have
> thought that both the test application and the Chrome functionality
> described above show that there are situations where
> zap_pid_ns_processes() can get stuck and cause hung task timeouts in
> conjunction with the use of ptrace().
>
> Your last sentence seems to suggest that you believe the kernel might
> be doing what it is expected to do. Assuming this is the case, what
> else would you like to see? A test application which matches the
> Chrome use case exactly? We can try to provide that, but I don't
> entirely understand how that would change the situation. After all,
> we already know that it is possible to get a thread into this
> condition, and we already have one means to reproduce it.
>
> Replacing TASK_UNINTERRUPTIBLE with TASK_INTERRUPTIBLE works for both
> the test application and the Chrome benchmark. The thread is still
> stuck in zap_pid_ns_processes(), but it is now in S (sleep) state
> instead of D, and no longer results in a hung task timeout. It
> remains in that state until the parent process terminates. I am not
> entirely happy with it, since the processes are still stuck and may
> pile up over time, but at least it solves the immediate problem
> for us.
>
> The question now is what to do with that solution. We can of course
> apply it locally to Chrome OS, but I would rather have it upstream -
> especially since we have to assume that any user of Chrome on Linux,
> or more generically anyone using ptrace in conjunction with
> CLONE_NEWPID, may experience the same problem. Right now I have no
> idea how to get there, though. Can you provide some guidance?

Apologies for not being clear. I intend to send Linus a pull request
with the TASK_UNINTERRUPTIBLE to TASK_INTERRUPTIBLE change in the next
week or so, with a Cc stable and an appropriate Fixes tag, so the fix
can be backported.

I already have a more comprehensive change queued that I will probably
merge for 4.13, but it just changes what kind of zombies you see; it
won't remove the ``stuck'' zombies.

So what I am looking for now is: why are things getting stuck in your
benchmark?
- Is it a userspace bug? In which case we can figure out what userspace
  (aka breakpad) needs to do to avoid the problem.
- Is it a kernel bug with ptrace? There have been a lot of little
  subtle bugs with ptrace over the years, so one more would not
  surprise me.

So I am just looking to make certain we fix the root issue, not just
the hung task timeout warning.

Eric