From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752173AbbIRNbn (ORCPT ); Fri, 18 Sep 2015 09:31:43 -0400 Received: from mx1.redhat.com ([209.132.183.28]:52531 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751984AbbIRNbm (ORCPT ); Fri, 18 Sep 2015 09:31:42 -0400 Date: Fri, 18 Sep 2015 15:28:44 +0200 From: Oleg Nesterov To: Peter Zijlstra Cc: Dmitry Vyukov , ebiederm@xmission.com, Al Viro , Andrew Morton , Ingo Molnar , Paul McKenney , mhocko@suse.cz, LKML , ktsan@googlegroups.com, Kostya Serebryany , Andrey Konovalov , Alexander Potapenko , Hans Boehm Subject: Re: [PATCH] kernel: fix data race in put_pid Message-ID: <20150918132844.GA10241@redhat.com> References: <1442496268-47803-1-git-send-email-dvyukov@google.com> <20150917160837.GA26050@redhat.com> <20150917174456.GA30178@redhat.com> <20150917180919.GA32116@redhat.com> <20150918085156.GS3816@twins.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150918085156.GS3816@twins.programming.kicks-ass.net> User-Agent: Mutt/1.5.18 (2008-05-17) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 09/18, Peter Zijlstra wrote: > > On Thu, Sep 17, 2015 at 08:09:19PM +0200, Oleg Nesterov wrote: > > > I need to recheck, but afaics this is not possible. This optimization > > is fine, but probably needs a comment. > > For sure, this code doesn't make any sense to me. So yes, after a sleep I am starting to agree that in theory this fast-path check is wrong. I'll write another email.. > As an alternative patch, could we not do: > > void put_pid(struct pid *pid) > { > struct pid_namespace *ns; > > if (!pid) > return; > > ns = pid->numbers[pid->level].ns; > if ((atomic_read(&pid->count) == 1) || > atomic_dec_and_test(&pid->count)) { > > + smp_read_barrier_depends(); /* ctrl-dep */ Not sure... Firstly it is not clear what this barrier pairs with. And I have to admit that I can not understand if _CTRL() logic applies here. The same for atomic_read_ctrl(). OK, please forget about put_pid() for the moment. Suppose we have X = 1; synchronize_sched(); Y = 1; Or X = 1; call_rcu_sched( func => { Y = 1; } ); Now. In theory this this code is wrong: if (Y) { BUG_ON(X == 0); } But this is correct: if (Y) { rcu_read_lock_sched(); rcu_read_unlock_sched(); BUG_ON(X == 0); } So perhaps something like this /* * Comment to explain it is eq to read_lock + read_unlock, * in a sense that this guarantees a full barrier wrt to * the previous synchronize_sched(). */ #define rcu_read_barrier_sched() barrier() make sense? And again, I simply can't understand if this code if (READ_ONCE_CTRL(Y)) BUG_ON(X == 0); to me it does _not_ look correct in theory. Oleg.