From: "Paul E. McKenney"
Subject: Re: [PATCH 3/3] fs: rcu protect inode hash lookups
Date: Tue, 2 Nov 2010 05:11:13 -0700
Message-ID: <20101102121113.GG2664@linux.vnet.ibm.com>
References: <1288589624-15251-1-git-send-email-david@fromorbit.com>
    <1288589624-15251-4-git-send-email-david@fromorbit.com>
    <1288604287.2660.94.camel@edumazet-laptop>
    <20101101134451.GN2715@dastard>
    <1288625362.2660.108.camel@edumazet-laptop>
    <20101102000152.GO2715@dastard>
In-Reply-To: <20101102000152.GO2715@dastard>
Reply-To: paulmck@linux.vnet.ibm.com
To: Dave Chinner
Cc: Eric Dumazet, viro@ZenIV.linux.org.uk, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org

On Tue, Nov 02, 2010 at 11:01:52AM +1100, Dave Chinner wrote:
> On Mon, Nov 01, 2010 at 04:29:22PM +0100, Eric Dumazet wrote:
> > On Tuesday, 2 November 2010 at 00:44 +1100, Dave Chinner wrote:
> >
> > > Perhaps you should rename that file "slab_destroy_by_rcu-tips.txt",
> > > because the current name seems unrelated to the contents. :/
> > >
> >
> > Hmm, I don't know, this doc really is about the nulls thing.
>
> Ok, now I understand - there's a new list type that I didn't know
> about called hlist_nulls. Not surprising - it's not documented
> anywhere.
>
> Maybe explicitly describing what the list_null pattern is or a
> pointer to include/linux/list_nulls.h might be appropriate, because
> I managed to read that documentation and not realise that it was
> referring to a specific type of list that was already implemented
> rather than a simple marker technique.
>
> > This stuff also addressed one problem I forgot to tell you about:
> > During a lookup, you might find an item that is moved to another
> > chain by another cpu, so your lookup is redirected to another chain.
> > You can miss your target.
> >
>
> So, to go to per-chain locks as per the proposed
> bit-lock-on-the-low-bit-of-the-head-pointer infrastructure, we'll
> have to cross that with the hlist_null code that plays low-bit
> pointer games for detecting the end of the chain.
>
> That's just messy - another hash chain specific scalability hackup.
> My dislike of using hash tables for unbounded caches is not
> improved by this....

Indeed, using call_rcu() or synchronize_rcu() guarantees that the identity
of a given RCU-protected structure will not change while you remain in a
given RCU read-side critical section.  In contrast, SLAB_DESTROY_BY_RCU
only guarantees that the type of the object will remain the same.  So this
is the usual complexity/speed tradeoff.  Use of call_rcu() gives the freed
memory more time to grow cache-cold, and synchronize_rcu() further blocks
the caller for several milliseconds, but both allow much simpler identity
checks, often permitting you to dispense entirely with identity checks.
In contrast, SLAB_DESTROY_BY_RCU avoids cache-coldness, but requires much
more careful identity checks.

> > You must find a way to detect such a thing to restart the lookup at
> > the beginning (of the right chain). Either you stick the chain
> > number in a new inode field (that costs extra memory), or you use
> > the 'nulls' value that can let you know if you ended your lookup
> > in the right chain.
>
> The chain is determined by hashing the inode number. Perhaps a
> simple enough test is to hash the last inode number on a cache miss
> and if that doesn't match the hash of the lookup key we redo the
> search. That seems to me like it will avoid needing to play games
> with termination markers - does that sound reasonable?
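
A minimal userspace sketch of the check proposed in the quoted text above:
on a miss, hash the last inode number seen and ask for a restart if it maps
to a different chain than the lookup key. The names struct node, hash_ino(),
walk_chain(), and NBUCKETS are hypothetical, and the hash is a placeholder.
The sketch is single-threaded and leaves out RCU, memory barriers, and
locking, so it models only the restart decision, not a safe kernel lookup.

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKETS 8u	/* hypothetical table size, a power of two */

struct node {
	unsigned long ino;	/* lookup key (inode number) */
	struct node *next;	/* next entry in the hash chain */
};

/* Placeholder hash: the low bits of the inode number pick the chain. */
static unsigned int hash_ino(unsigned long ino)
{
	return (unsigned int)(ino & (NBUCKETS - 1));
}

enum walk_result { WALK_HIT, WALK_MISS, WALK_RETRY };

/*
 * Walk one chain looking for ino.  On a miss, hash the last entry that
 * was visited; if it belongs to a different chain than ino does, the
 * walk was redirected and the caller has to restart the search.
 */
static enum walk_result walk_chain(struct node *head, unsigned long ino,
				   struct node **found)
{
	unsigned int want = hash_ino(ino);
	struct node *last = NULL;
	struct node *n;

	assert(found != NULL);
	*found = NULL;

	for (n = head; n != NULL; n = n->next) {
		if (n->ino == ino) {
			*found = n;
			return WALK_HIT;
		}
		last = n;
	}

	/* An empty chain has no last entry, so there is nothing to check. */
	if (last != NULL && hash_ino(last->ino) != want)
		return WALK_RETRY;

	return WALK_MISS;
}
```

On WALK_RETRY the caller would start again from the head of the chain
selected by hash_ino(ino).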

Yes, as long as you do the test under a lock that prevents the inode's
identity from changing.  Not sure what you should do about hash collisions,
though.  Perhaps recheck the incoming pointer?

                                                        Thanx, Paul
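
The lock-then-recheck condition in the reply above can be sketched in
userspace as follows. This is an illustration under stated assumptions, not
the patch's code: struct obj, obj_lock(), obj_unlock(), and lock_if_still()
are hypothetical names, and a C11 atomic_flag spinlock stands in for
whatever per-inode lock the real lookup would take.

```c
#include <stdatomic.h>
#include <stddef.h>

struct obj {
	atomic_flag lock;	/* stand-in for a per-object spinlock */
	unsigned long ino;	/* identity; changes if the slot is reused */
};

static void obj_lock(struct obj *o)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&o->lock,
						 memory_order_acquire))
		;
}

static void obj_unlock(struct obj *o)
{
	atomic_flag_clear_explicit(&o->lock, memory_order_release);
}

/*
 * Take the object's lock, then recheck its identity.  Returns the
 * object, still locked, if it is still the one the caller looked up;
 * otherwise drops the lock and returns NULL so the caller can retry.
 */
static struct obj *lock_if_still(struct obj *o, unsigned long ino)
{
	obj_lock(o);
	if (o->ino != ino) {
		obj_unlock(o);
		return NULL;
	}
	return o;
}
```

The recheck only means something because it happens while the lock is
held; comparing o->ino before taking the lock would leave a window in
which the object could be reused.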