Date: Thu, 15 Oct 2015 17:34:13 +0200
From: Frederic Weisbecker
To: Oleg Nesterov
Cc: Andrew Morton, Rik van Riel, Christoph Lameter, Tejun Heo, Rusty Russell, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/1] Revert "kmod: handle UMH_WAIT_PROC from system unbound workqueue"
Message-ID: <20151015153411.GD12822@lerouge>
In-Reply-To: <20151015151819.GA22187@redhat.com>

On Thu, Oct 15, 2015 at 05:18:19PM +0200, Oleg Nesterov wrote:
> On 10/15, Frederic Weisbecker wrote:
> >
> > On Wed, Oct 14, 2015 at 08:52:09PM +0200, Oleg Nesterov wrote:
> > > This reverts commit bb304a5c6fc63d8506cd9741a3a5f35b73605625.
> > >
> > > Because this patch leads to kthread zombies.
> > >
> > > call_usermodehelper_exec_sync() does fork() + wait() with "unignored"
> > > SIGCHLD. What we have missed is that this worker thread can have other
> > > children previously forked by call_usermodehelper_exec_work() without
> > > UMH_WAIT_PROC. If such a child exits in between it becomes a zombie
> > > and nobody can reap it (unless/until this worker thread exits too).
> >
> > I missed that indeed.
>
> Heh me too ;)
>
> > But then when we create the async thread with
> > UMH_NO_WAIT, who reaps it? It's created by the workqueue which never
> > exits.
>
> It is auto-reaped because SIGCHLD is ignored. And this is why
> bb304a5c6fc6 is wrong; it can die while UMH_WAIT_PROC case waits
> for the new child.

Oooh, that's subtle!

>
> > And on other cases, who buries the sync thread?
>
> The same.
>
> Please see V2 I sent. I'll try to send more cleanups soon to make
> this all more explicit.

Ok.

> Oleg.
>
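
The auto-reap behaviour discussed above can be reproduced from userspace. The following is a minimal sketch, not the kernel code: it uses ordinary processes in place of kthreads, and the function name async_child_fate and its parameter unignore_sigchld are hypothetical, chosen only for this illustration. It assumes Linux (or another system where ignoring SIGCHLD makes exited children disappear without a wait).

```python
import os
import signal


def async_child_fate(unignore_sigchld):
    """Return "auto-reaped" or "zombie" for a child that exits in the background.

    unignore_sigchld=True models a worker that switches SIGCHLD back to the
    default disposition (to fork + wait for another child) before the
    background child exits.
    """
    previous = signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    try:
        release_r, release_w = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: block until the parent closes its write end, then exit.
            try:
                os.close(release_w)
                os.read(release_r, 1)
            finally:
                os._exit(0)
        os.close(release_r)

        if unignore_sigchld:
            signal.signal(signal.SIGCHLD, signal.SIG_DFL)

        os.close(release_w)  # EOF on the pipe lets the child exit now
        try:
            # Used only to observe the outcome. In the scenario from the
            # thread nobody makes this call, so the zombie would linger.
            os.waitpid(pid, 0)
            return "zombie"
        except ChildProcessError:
            return "auto-reaped"
    finally:
        if previous is not None:
            signal.signal(signal.SIGCHLD, previous)
```

Calling async_child_fate(False) should report "auto-reaped", since the child exits while SIGCHLD is ignored. Calling async_child_fate(True) should report "zombie", which corresponds to the window in which an earlier child dies while the worker has SIGCHLD unignored for the UMH_WAIT_PROC case.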