From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754571AbbIRQ0i (ORCPT ); Fri, 18 Sep 2015 12:26:38 -0400 Received: from e39.co.us.ibm.com ([32.97.110.160]:35244 "EHLO e39.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752265AbbIRQ0h (ORCPT ); Fri, 18 Sep 2015 12:26:37 -0400 X-Helo: d03dlp03.boulder.ibm.com X-MailFrom: paulmck@linux.vnet.ibm.com X-RcptTo: linux-kernel@vger.kernel.org Date: Fri, 18 Sep 2015 09:00:38 -0700 From: "Paul E. McKenney" To: Peter Zijlstra Cc: Oleg Nesterov , Dmitry Vyukov , ebiederm@xmission.com, Al Viro , Andrew Morton , Ingo Molnar , mhocko@suse.cz, LKML , ktsan@googlegroups.com, Kostya Serebryany , Andrey Konovalov , Alexander Potapenko , Hans Boehm Subject: Re: [PATCH] kernel: fix data race in put_pid Message-ID: <20150918160038.GO4029@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <1442496268-47803-1-git-send-email-dvyukov@google.com> <20150917160837.GA26050@redhat.com> <20150917174456.GA30178@redhat.com> <20150917180919.GA32116@redhat.com> <20150918085156.GS3816@twins.programming.kicks-ass.net> <20150918132844.GA10241@redhat.com> <20150918134630.GW3816@twins.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150918134630.GW3816@twins.programming.kicks-ass.net> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 15091816-0033-0000-0000-000005EFE615 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Sep 18, 2015 at 03:46:30PM +0200, Peter Zijlstra wrote: > On Fri, Sep 18, 2015 at 03:28:44PM +0200, Oleg Nesterov wrote: > > On 09/18, Peter Zijlstra wrote: > > > > > > On Thu, Sep 17, 2015 at 08:09:19PM +0200, Oleg Nesterov wrote: > > > > > > > I need to recheck, but afaics this is not possible. This optimization > > > > is fine, but probably needs a comment. > > > > > > For sure, this code doesn't make any sense to me. > > > > So yes, after a sleep I am starting to agree that in theory this fast-path > > check is wrong. I'll write another email.. > > This other mail will include a patch adding comments to pid.c ? That > code didn't want to make sense to me this morning. > > > > As an alternative patch, could we not do: > > > > > > void put_pid(struct pid *pid) > > > { > > > struct pid_namespace *ns; > > > > > > if (!pid) > > > return; > > > > > > ns = pid->numbers[pid->level].ns; > > > if ((atomic_read(&pid->count) == 1) || > > > atomic_dec_and_test(&pid->count)) { > > > > > > + smp_read_barrier_depends(); /* ctrl-dep */ > > > > Not sure... Firstly it is not clear what this barrier pairs with. And I > > have to admit that I can not understand if _CTRL() logic applies here. > > The same for atomic_read_ctrl(). > > The control dependency barrier pairs with the full barrier of > atomic_dec_and_test. > > So the two put_pid() instances: > > CPU0 CPU1 > > pid->foo = 1; > atomic_dec_and_test() == false atomic_read_ctrl() == 1 > kmem_cache_free(pid) > > CPU0 will modify a pid field and decrement, but not reach 0. > CPU1 finds we're the last, but must also be able to observe our foo > store such that we can rest assured it is complete before we free the > storage. > > The freeing of pid, on CPU1, is stores, these must not happen before we > satisfy the freeing condition, iow a load-store barrier, which is what > the control dependency provides. > > > OK, please forget about put_pid() for the moment. Suppose we have > > > > X = 1; > > synchronize_sched(); > > Y = 1; > > > > Or > > X = 1; > > call_rcu_sched( func => { Y = 1; } ); > > > > > > > > Now. In theory this this code is wrong: > > > > if (Y) { > > BUG_ON(X == 0); > > } > > > > But this is correct: > > > > if (Y) { > > rcu_read_lock_sched(); > > rcu_read_unlock_sched(); > > BUG_ON(X == 0); > > } > > > > So perhaps something like this > > > > /* > > * Comment to explain it is eq to read_lock + read_unlock, > > * in a sense that this guarantees a full barrier wrt to > > * the previous synchronize_sched(). > > */ > > #define rcu_read_barrier_sched() barrier() > > > > make sense? > > > > > > And again, I simply can't understand if this code > > > > if (READ_ONCE_CTRL(Y)) > > BUG_ON(X == 0); > > > > to me it does _not_ look correct in theory. > > So control dependencies provide a load-store barrier. Your examples > above rely on a load-load barrier; BUG_ON(X == 0) is a load. > > kmem_cache_free() OTOH is stores (we must modify the free list). And any reads are bogus, so ordering with writes suffices. Good! Thanx, Paul