Re: [PATCH 3/3] fs: rcu protect inode hash lookups

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	viro@ZenIV.linux.org.uk, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/3] fs: rcu protect inode hash lookups
Date: Tue, 2 Nov 2010 05:11:13 -0700
Message-ID: <20101102121113.GG2664@linux.vnet.ibm.com>
In-Reply-To: <20101102000152.GO2715@dastard>

On Tue, Nov 02, 2010 at 11:01:52AM +1100, Dave Chinner wrote:
> On Mon, Nov 01, 2010 at 04:29:22PM +0100, Eric Dumazet wrote:
> > On Tuesday, 2 November 2010 at 00:44 +1100, Dave Chinner wrote:
> > 
> > > Perhaps you should rename that file "slab_destroy_by_rcu-tips.txt",
> > > because the current name seems unrelated to the contents. :/
> > > 
> > 
> > Hmm, I don't know, this doc really is about the nulls thing.
> 
> Ok, now I understand - there's a new list type that I didn't know
> about called hlist_nulls. Not surprising - it's not documented
> anywhere.
> 
> Maybe explicitly describing what the list_null pattern is or a
> pointer to include/linux/list_nulls.h might be appropriate, because
> I managed to read that documentation and not realise that it was
> referring to a specific type of list that was already implemented
> rather than a simple marker technique.
> 
> > This stuff also addressed one problem I forgot to tell you about: During
> > a lookup, you might find an item that is moved to another chain by
> > another cpu, so your lookup is redirected to another chain. You can miss
> > your target.
> 
> <groan>
> 
> So, to go to per-chain locks as per the proposed bit-lock-on-the-
> low-bit-of-the-head-pointer infrastructure, we'll have to cross that
> with the hlist_null code that plays low-bit pointer games for
> detecting the end of the chain.
> 
> That's just messy - another hash chain specific scalability hackup.
> My dislike of using hash tables for unbounded caches is not
> improved by this....

Indeed, using call_rcu() or synchronize_rcu() guarantees that the identity
of a given RCU-protected structure will not change while you remain in a
given RCU read-side critical section.  In contrast, SLAB_DESTROY_BY_RCU
only guarantees that the type of the object will remain the same.  So this
is the usual complexity/speed tradeoff.  Use of call_rcu() gives the freed
memory more time to grow cache-cold, and synchronize_rcu() additionally
blocks the caller for several milliseconds, but both allow much simpler
identity checks, often permitting you to dispense with them entirely.

SLAB_DESTROY_BY_RCU, on the other hand, avoids cache-coldness, but
requires much more careful identity checks.
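
For illustration only, here is a minimal userspace sketch of the sort of
careful identity check that SLAB_DESTROY_BY_RCU calls for: take a reference
only if the object is still live, then recheck the key, because the memory
might have been recycled for a different object of the same type.  The names
toy_obj, toy_get_unless_zero(), toy_put() and toy_claim() are hypothetical,
and C11 atomics stand in for the kernel's primitives.  This is not the code
from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object; only its type is stable under type-safe reuse. */
struct toy_obj {
	unsigned long key;	/* identity, may change if the object is recycled */
	atomic_int refcnt;	/* 0 means the object has been freed */
};

/* Take a reference unless the count has already dropped to zero. */
static bool toy_get_unless_zero(struct toy_obj *obj)
{
	int old = atomic_load(&obj->refcnt);

	while (old != 0) {
		/* On failure, old is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&obj->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference previously taken. */
static void toy_put(struct toy_obj *obj)
{
	int old = atomic_fetch_sub(&obj->refcnt, 1);

	assert(old > 0);
	(void)old;
}

/*
 * Claim an object found during a lockless walk.  Returns false if the
 * object was freed, or was reused under a different key after we found it.
 */
static bool toy_claim(struct toy_obj *obj, unsigned long key)
{
	if (!toy_get_unless_zero(obj))
		return false;		/* freed: caller restarts the lookup */
	if (obj->key != key) {
		toy_put(obj);		/* recycled under a new identity */
		return false;
	}
	return true;
}
```

A caller that gets false back from toy_claim() would restart its lookup
from the head of the chain.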

> > You must find a way to detect such a thing to restart the lookup at
> > the beginning (of the right chain).  Either you stick the chain
> > number in a new inode field (that costs extra memory), or you use
> > the 'nulls' value that can let you know if you ended your lookup
> > in the right chain.
> 
> The chain is determined by hashing the inode number. Perhaps a
> simple enough test is to hash the last inode number on a cache miss
> and if that doesn't match the hash of the lookup key we redo the
> search. That seems to me like it will avoid needing to play games
> with termination markers - does that sound reasonable?

As long as you do the test under a lock that prevents the inode's identity
from changing.  Not sure what you should do about hash collisions, though.
Perhaps recheck the incoming pointer?
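
To make the proposed test concrete, here is a rough single-threaded sketch
of "hash the last inode number on a miss and compare it with the hash of
the lookup key".  Everything in it (toy_inode, toy_hash(), toy_walk(),
toy_miss_is_reliable(), the table size) is hypothetical, and it leaves out
the lock that would be needed to keep the last inode's identity from
changing while it is examined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NBUCKETS 8u		/* hypothetical table size */

struct toy_inode {
	unsigned long ino;
	struct toy_inode *next;
};

/* Stand-in hash: the chain is chosen from the inode number alone. */
static unsigned int toy_hash(unsigned long ino)
{
	return (unsigned int)(ino % TOY_NBUCKETS);
}

/* Walk a chain looking for key, remembering the last node visited. */
static struct toy_inode *toy_walk(struct toy_inode *head, unsigned long key,
				  struct toy_inode **last)
{
	struct toy_inode *pos;

	assert(last != NULL);
	*last = NULL;
	for (pos = head; pos != NULL; pos = pos->next) {
		*last = pos;
		if (pos->ino == key)
			return pos;
	}
	return NULL;
}

/*
 * After a miss, decide whether the walk ended in the chain it started in.
 * A mismatch means the walk was redirected and the lookup must be redone.
 * Real code would have to read last->ino under a lock.
 */
static bool toy_miss_is_reliable(const struct toy_inode *last,
				 unsigned long key)
{
	if (last == NULL)
		return true;	/* empty chain: the walk never left it */
	return toy_hash(last->ino) == toy_hash(key);
}
```

If toy_miss_is_reliable() returns false, the caller would redo the search
from the head of the chain selected by toy_hash(key).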

							Thanx, Paul
