public inbox for netdev@vger.kernel.org
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Nick Piggin <npiggin@suse.de>
Cc: netdev@vger.kernel.org
Subject: Re: rt hash table / rt hash locks question
Date: Wed, 16 Jun 2010 14:27:38 +0200
Message-ID: <1276691258.2632.55.camel@edumazet-laptop>
In-Reply-To: <20100616104633.GW6138@laptop>

On Wednesday, 16 June 2010 at 20:46 +1000, Nick Piggin wrote:
> I'm just converting this scalable dentry/inode hash table to a more
> compact form. I was previously using a dumb spinlock per bucket,
> but this doubles the size of the tables so isn't production quality.
> 

Yes, we had this in the past (one rwlock or spinlock per hash chain),
and it was not very good with LOCKDEP on.

> What I've done at the moment is to use a bit_spinlock in bit 0 of each
> list pointer of the table. Bit spinlocks are now pretty nice because
> we can do __bit_spin_unlock() which gives non-atomic store with release
> ordering, so it should be almost as fast as spinlock.
> 
> But I look at rt hash and it seems you use a small hash on the side
> for spinlocks. So I wonder, pros for each:
> 
> - bitlocks have effectively zero storage
    Yes, but a mask is needed to get the head pointer. Special care must
also be taken when inserting or deleting a node in the chain, to keep
this bit set.
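
To make the masking and "keep the bit set" points concrete, here is a
minimal userspace sketch using C11 atomics. It is not the kernel
bit_spinlock API: struct bucket, bucket_lock(), bucket_unlock(),
bucket_first() and bucket_insert_locked() are hypothetical names chosen
for illustration, and the spin loop has no backoff.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define LOCK_BIT ((uintptr_t)1)

struct node {
	struct node *next;
	int key;
};

/* One bucket: pointer to the first node, with bit 0 used as the lock. */
struct bucket {
	_Atomic uintptr_t head;
};

static void bucket_lock(struct bucket *b)
{
	/* Spin until we are the one who flips bit 0 from 0 to 1. */
	while (atomic_fetch_or_explicit(&b->head, LOCK_BIT,
					memory_order_acquire) & LOCK_BIT)
		;
}

static void bucket_unlock(struct bucket *b)
{
	/*
	 * Only the lock holder changes the word while the bit is set, so a
	 * plain load followed by a release store is enough (the idea behind
	 * a non-atomic unlock with release ordering).
	 */
	uintptr_t v = atomic_load_explicit(&b->head, memory_order_relaxed);

	atomic_store_explicit(&b->head, v & ~LOCK_BIT, memory_order_release);
}

static struct node *bucket_first(struct bucket *b)
{
	/* Mask off the lock bit to recover the real head pointer. */
	uintptr_t v = atomic_load_explicit(&b->head, memory_order_acquire);

	return (struct node *)(v & ~LOCK_BIT);
}

/* Caller must hold the bucket lock. */
static void bucket_insert_locked(struct bucket *b, struct node *n)
{
	/* Node alignment is what leaves bit 0 free for the lock. */
	assert(((uintptr_t)n & LOCK_BIT) == 0);

	n->next = bucket_first(b);
	/* Publish the new head with the lock bit still set. */
	atomic_store_explicit(&b->head, (uintptr_t)n | LOCK_BIT,
			      memory_order_release);
}
```

Every reader has to go through something like bucket_first(), and every
writer has to OR the lock bit back in when it stores a new head, which
is the extra care mentioned above.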

> - bitlocks hit the same cacheline that the hash walk hits.
    yes
> - in RCU list, locked hash walks usually followed by hash modification,
>   bitlock should have brought in the line for exclusive.
    But we usually perform a read-only lookup, _then_ take the lock, to
perform a new lookup before the insert. So at the time we take the
bitlock, the cache line is in shared state. With spinlocks, we always
use exclusive mode, but on a separate cache line...
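
The lookup, lock, lookup-again sequence can be sketched in userspace C
as below. This is only an illustration under assumptions: the names
(struct chain, chain_find(), chain_find_or_insert()) are hypothetical,
an atomic_flag stands in for the spinlock, and acquire loads stand in
for the RCU read side, with no handling of deletion or memory
reclamation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct entry {
	struct entry *_Atomic next;
	int key;
};

struct chain {
	struct entry *_Atomic head;
	atomic_flag lock;	/* stand-in for the chain's spinlock */
};

static void chain_init(struct chain *c)
{
	atomic_init(&c->head, NULL);
	atomic_flag_clear(&c->lock);
}

/* Read-only walk: takes no lock and performs no stores. */
static struct entry *chain_find(struct chain *c, int key)
{
	struct entry *e = atomic_load_explicit(&c->head, memory_order_acquire);

	while (e && e->key != key)
		e = atomic_load_explicit(&e->next, memory_order_acquire);
	return e;
}

/* Returns the entry that ends up in the chain for e->key. */
static struct entry *chain_find_or_insert(struct chain *c, struct entry *e)
{
	struct entry *found;

	assert(e != NULL);

	found = chain_find(c, e->key);		/* 1. lockless lookup */
	if (found)
		return found;

	while (atomic_flag_test_and_set_explicit(&c->lock,
						 memory_order_acquire))
		;				/* 2. take the lock */

	found = chain_find(c, e->key);		/* 3. look again under the lock */
	if (!found) {
		struct entry *old = atomic_load_explicit(&c->head,
							 memory_order_relaxed);

		atomic_store_explicit(&e->next, old, memory_order_relaxed);
		atomic_store_explicit(&c->head, e, memory_order_release);
		found = e;
	}
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
	return found;
}
```

Step 1 only reads the chain. With a bitlock living in the head word,
step 2 would be the first store to that same line; with a separate
spinlock, step 2 touches a different line.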

> - bitlock number of locks scales with hash size
    Yes, but concurrency is more a function of the number of online
CPUs, given we use jhash.
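
As a rough sketch of the "small hash on the side" idea: the lock table
is sized from the CPU count rather than from the number of buckets, and
a bucket index is folded onto a lock index with a mask. The sizing
factor of 8 and the names lock_table_size() and lock_index() are
assumptions made for this example, not values taken from the route
cache code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sizing rule: a few locks per CPU, rounded up to a power
 * of two so that a mask can replace a modulo.
 */
static size_t lock_table_size(unsigned int ncpus)
{
	size_t want = (size_t)ncpus * 8;	/* factor is an assumption */
	size_t n = 1;

	while (n < want)
		n <<= 1;
	return n;
}

/* Map a hash bucket index to the index of the lock that protects it. */
static size_t lock_index(size_t bucket, size_t nlocks)
{
	/* The mask below is only valid for a power-of-two table. */
	assert(nlocks != 0 && (nlocks & (nlocks - 1)) == 0);
	return bucket & (nlocks - 1);
}
```

With 32 locks, buckets 5 and 37 share a lock. Since the hash spreads
keys evenly, how often two CPUs collide on one lock depends mostly on
how many CPUs are active, not on how large the bucket array is.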

> - spinlocks may be slightly better at the cacheline level (bitops
>   sometimes require explicit load which may not acquire exclusive
>   line on some archs). On x86 and ll/sc architectures, this shouldn't
>   be a problem.
    Yes, and you can add fairness (if the ticket spinlock variant is
used), but on the route cache I really doubt it can make a difference.

> - spinlocks better debugging (could be overcome with a LOCKDEP
>   option to revert to spinlocks, but a bit ugly).
    Definitely a good thing.

> - in practice, contention due to aliasing in buckets to lock mapping
>   is probably fairly minor.
    Agreed.
> 
> Net code is obviously tested and tuned well, but instinctively I would
> have thought bitlocks are the better way to go. Any comments on this?

Well, to be honest, this code is rather old, and at the time I wrote it,
bitlocks were probably not available.

You can add:

- One downside of the hashed spinlocks is X86_INTERNODE_CACHE_SHIFT
being 12 on X86_VSMP: all locks are probably in the same internode
block :(

- Another downside is that all locks are currently on a single NUMA
node, since we kmalloc() them in one contiguous chunk.
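
A quick worked check of the internode-block point, as a sketch: a shift
of 12 means 4096-byte blocks, so a contiguous array of small locks
spans very few of them. The lock counts and the 4-byte lock size used
in the note below are assumptions for illustration, and the array is
assumed to start on a block boundary; internode_blocks() is a
hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

#define INTERNODE_CACHE_SHIFT 12	/* value quoted above for X86_VSMP */

/*
 * Number of internode blocks covered by a contiguous array of nlocks
 * locks of lock_size bytes each, assuming the array starts on a block
 * boundary.
 */
static size_t internode_blocks(size_t nlocks, size_t lock_size)
{
	size_t block = (size_t)1 << INTERNODE_CACHE_SHIFT;

	assert(lock_size > 0);
	return (nlocks * lock_size + block - 1) / block;
}
```

For example, 256 locks of 4 bytes occupy 1024 bytes, a single 4096-byte
block, and even 4096 such locks would cover only 4 blocks.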

So I guess it would be worth trying :)



Thread overview (3+ messages):
2010-06-16 10:46 rt hash table / rt hash locks question Nick Piggin
2010-06-16 12:27 ` Eric Dumazet [this message]
2010-06-16 12:49   ` Nick Piggin
