From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1757226AbZKTBeh (ORCPT ); Thu, 19 Nov 2009 20:34:37 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1756379AbZKTBeg (ORCPT ); Thu, 19 Nov 2009 20:34:36 -0500
Received: from mx1.redhat.com ([209.132.183.28]:51667 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1755611AbZKTBee (ORCPT ); Thu, 19 Nov 2009 20:34:34 -0500
Date: Fri, 20 Nov 2009 02:29:30 +0100
From: Oleg Nesterov
To: Nick Piggin
Cc: Linux Kernel Mailing List, Roland McGrath
Subject: Re: Zombie process when ptracing
Message-ID: <20091120012930.GA3985@redhat.com>
References: <20091119102543.GB5602@wotan.suse.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20091119102543.GB5602@wotan.suse.de>
User-Agent: Mutt/1.5.18 (2008-05-17)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 11/19, Nick Piggin wrote:
>
> Running recent git kernel, I have a process stuck in Z state
>
> bash ? 0000000000000000 0 3188 3187 0x00000000
>  ffff88012e24fec8 0000000000000046 0000000000000000 0000000000000012
>  ffff88012e24fec8 ffff88012e24e000 ffff88012e24ffd8 ffff88012e24e000
>  000000000000efc8 ffff88012e24e000 ffff88012ea82090 ffff88012ff78640
> Call Trace:
>  [] ? proc_clear_tty+0x5e/0x70
>  [] ? exit_ptrace+0xb8/0x140
>  [] do_exit+0x58a/0x7c0
>  [] do_group_exit+0x3d/0xb0
>  [] sys_exit_group+0x12/0x20
>  [] system_call_fastpath+0x16/0x1b
>
> This was after stracing a few test programs.
>
> It also seems to have lost job control (^C) at the same time.

This can happen if the tracer (strace) itself hangs, zombies should go
away once the tracer is killed. Or its ->real_parent is stopped or
hangs... (I assume you didn't strace /sbin/init)

But,

> Hmm, and the kernel just paniced with an nmi lockup while I was
> trying to get more info.

this probably means we have a kernel bug ;)

If you see a zombie again, could you look at its /proc/pid/status?
And of course, which programs did you trace and how? It would be great
if we can reproduce the problem.

Oleg.
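
For anyone wanting to do the /proc/pid/status check suggested above, here is a minimal sketch in Python of pulling out the fields that matter for a stuck zombie: State, PPid (the parent) and TracerPid (the ptracer, 0 if none). The helper names parse_status, summarize and read_status are hypothetical and only for illustration; nothing here comes from the kernel tree or from strace.

```python
from pathlib import Path


def parse_status(text):
    """Parse the 'Key:<tab>value' lines of a /proc/<pid>/status dump into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip anything that is not a 'Key: value' line
            fields[key.strip()] = value.strip()
    return fields


def summarize(fields):
    """Pick out the fields relevant to a zombie that is not being reaped."""
    state = fields.get("State", "")
    return {
        "state": state,
        "is_zombie": state.startswith("Z"),          # e.g. "Z (zombie)"
        "ppid": int(fields.get("PPid", 0)),          # parent pid
        "tracer_pid": int(fields.get("TracerPid", 0)),  # 0 means not traced
    }


def read_status(pid):
    """Read and parse /proc/<pid>/status (Linux only; hypothetical helper)."""
    return parse_status(Path(f"/proc/{pid}/status").read_text())
```

Calling summarize(read_status(pid)) on the zombie's pid gives a small dict. If is_zombie is true and tracer_pid is nonzero, the next thing to look at would be whether that tracer process is itself hung or stopped.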