From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1756242Ab2BQAfr (ORCPT ); Thu, 16 Feb 2012 19:35:47 -0500
Received: from mail.linuxfoundation.org ([140.211.169.12]:39367 "EHLO
        mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1753800Ab2BQAfq (ORCPT );
        Thu, 16 Feb 2012 19:35:46 -0500
Date: Thu, 16 Feb 2012 16:35:44 -0800
From: Andrew Morton
To: Oleg Nesterov
Cc: apw@canonical.com, arjan@linux.intel.com, fhrbata@redhat.com,
        john.johansen@canonical.com, penguin-kernel@I-love.SAKURA.ne.jp,
        rientjes@google.com, rusty@rustcorp.com.au, tj@kernel.org,
        linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/4] introduce complete_vfork_done()
Message-Id: <20120216163544.4e41e5a5.akpm@linux-foundation.org>
In-Reply-To: <20120216172647.GB30393@redhat.com>
References: <20120214164709.GA21178@redhat.com>
        <20120214164914.GF21185@redhat.com>
        <20120215123049.6e938eed.akpm@linux-foundation.org>
        <20120216150429.GB11953@redhat.com>
        <20120216172626.GA30393@redhat.com>
        <20120216172647.GB30393@redhat.com>
X-Mailer: Sylpheed 3.0.2 (GTK+ 2.20.1; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 16 Feb 2012 18:26:47 +0100 Oleg Nesterov wrote:

> No functional changes.
>
> Move the clear-and-complete-vfork_done code into the new trivial
> helper, complete_vfork_done().
>
> ...
>
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1915,7 +1915,6 @@ static int coredump_wait(int exit_code, struct core_state *core_state)
>  {
>          struct task_struct *tsk = current;
>          struct mm_struct *mm = tsk->mm;
> -        struct completion *vfork_done;
>          int core_waiters = -EBUSY;
>
>          init_completion(&core_state->startup);
> @@ -1934,11 +1933,8 @@ static int coredump_wait(int exit_code, struct core_state *core_state)
>           * Make sure nobody is waiting for us to release the VM,
>           * otherwise we can deadlock when we wait on each other
>           */
> -        vfork_done = tsk->vfork_done;
> -        if (vfork_done) {
> -                tsk->vfork_done = NULL;
> -                complete(vfork_done);
> -        }
> +        if (tsk->vfork_done)
> +                complete_vfork_done(tsk);
>
>          if (core_waiters)
>                  wait_for_completion(&core_state->startup);
>
> ...
>
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -667,6 +667,14 @@ struct mm_struct *mm_access(struct task_struct *task, unsigned int mode)
>          return mm;
>  }
>
> +void complete_vfork_done(struct task_struct *tsk)
> +{
> +        struct completion *vfork_done = tsk->vfork_done;
> +
> +        tsk->vfork_done = NULL;
> +        complete(vfork_done);
> +}
> +
>  /* Please note the differences between mmput and mm_release.
>   * mmput is called whenever we stop holding onto a mm_struct,
>   * error success whatever.
> @@ -682,8 +690,6 @@ struct mm_struct *mm_access(struct task_struct *task, unsigned int mode)
>   */
>  void mm_release(struct task_struct *tsk, struct mm_struct *mm)
>  {
> -        struct completion *vfork_done = tsk->vfork_done;
> -
>          /* Get rid of any futexes when releasing the mm */
>  #ifdef CONFIG_FUTEX
>          if (unlikely(tsk->robust_list)) {
> @@ -703,11 +709,8 @@ void mm_release(struct task_struct *tsk, struct mm_struct *mm)
>          /* Get rid of any cached register state */
>          deactivate_mm(tsk, mm);
>
> -        /* notify parent sleeping on vfork() */
> -        if (vfork_done) {
> -                tsk->vfork_done = NULL;
> -                complete(vfork_done);
> -        }
> +        if (tsk->vfork_done)
> +                complete_vfork_done(tsk);

This all looks somewhat smelly.

- Why do we zero tsk->vfork_done in this manner?  It *looks* like
  it's done to prevent the kernel from running complete() twice against
  a single task in a race situation.  If this is the case then it's
  pretty lame, isn't it?  We'd need external locking to firm that up
  and I'm not seeing it.

- Moving the test for non-null tsk->vfork_done into
  complete_vfork_done() would simplify things a bit?

- The complete_vfork_done() interface isn't wonderful.  What prevents
  tsk from getting freed?  Presumably the caller must have pinned it in
  some fashion?  Or must hold some lock?  Or it's always run against
  `current', in which case it would be clearer to not pass the
  task_struct arg at all?
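
To illustrate the second point, here is a rough sketch of the helper with
the NULL test folded in.  It is an untested-in-kernel userspace mock:
`struct completion', `struct task_struct' and complete() below are cut-down
stand-ins invented for the example, not the kernel definitions, and the
sketch does nothing about the locking or task-lifetime questions above.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for illustration only, NOT the kernel types. */
struct completion {
	unsigned int done;
};

struct task_struct {
	struct completion *vfork_done;
};

/* Mock of complete(): counts signals and, like the real one, needs x != NULL. */
static void complete(struct completion *x)
{
	assert(x != NULL);
	x->done++;
}

/* The NULL test lives in the helper, so callers no longer open-code it. */
static void complete_vfork_done(struct task_struct *tsk)
{
	struct completion *vfork_done = tsk->vfork_done;

	if (vfork_done) {
		tsk->vfork_done = NULL;
		complete(vfork_done);
	}
}
```

With that, both coredump_wait() and mm_release() could presumably call
complete_vfork_done(tsk) unconditionally, and a second call on the same
task becomes a no-op (absent races, which this does not address).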