public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Daniel Phillips <phillips@bonn-fries.net>
To: "Brian J. Watson" <Brian.J.Watson@compaq.com>, Andi Kleen <ak@suse.de>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: Common hash table implementation
Date: Sun, 22 Jul 2001 00:57:05 +0200	[thread overview]
Message-ID: <01072200570501.02679@starship> (raw)
In-Reply-To: <oupitgqjxoi.fsf@pigdrop.muc.suse.de> <3B58B186.4D23D1A3@compaq.com>
In-Reply-To: <3B58B186.4D23D1A3@compaq.com>

On Saturday 21 July 2001 00:32, Brian J. Watson wrote:
> Andi Kleen wrote:
> > hash tables like the inode hash are twice as big as needed because
> > each bucket is two pointers instead of one]
>
> I agree. Hash tables such as inode_hashtable and dentry_hashtable are
> half as efficient under stress as they would otherwise be, because
> they use an array of list_heads.

Whoops, yes.  The buffer_head pprev strategy gives both the 
double linked list and a table vector of single links, at the expense 
of a few extra tests.

For a one-size-fits-all hash strategy the question is, double-linked of 
singled-linked?  The advantage of a double linked list for a hash is 
quick, generic deletion.  Using the pprev strategy the disadvantage of 
double links in the table vector can be eliminated.  The main advantage 
of the single link is space savings both in the table and in the 
structures.  The disadvantage is having to do an extra bucket search on 
deletion, though this is partially offset by fewer pointer operations 
to perform insertion or deletion.

It's hard to say which is fastest.  It might turn out that the single 
linked strategy is actually faster, it really depends on typical depth 
of the buckets.  A double-linked list deletion needs to follow two 
pointers and set two links, the single-linked list only one of each.  
There's a similar difference for insertion.  So the extra overhead of 
insertion and deletion can be set off against say, three or four 
iterations of the bucket search loop.  If buckets average six to eight 
elements deep, it's a tie.

A more subtle cost of the double-link approach is the slight extra 
cache pressure due to the enlarged objects.  This cost is incurred 
continously as the objects are used, it might very well add up to more 
than the cost of the final list traversal for the delete in the 
single-linked case.

How about implementing both strategies, using your generic interface to 
make them look the same?  And then seeing which is fastest in practice.

--
Daniel

  reply	other threads:[~2001-07-21 22:53 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <oupitgqjxoi.fsf@pigdrop.muc.suse.de>
2001-07-20 22:32 ` Common hash table implementation Brian J. Watson
2001-07-21 22:57   ` Daniel Phillips [this message]
2001-07-18  0:57 Brian J. Watson
2001-07-18  1:34 ` Larry McVoy
2001-07-18 13:46   ` Daniel Phillips
2001-07-21  0:24     ` Brian J. Watson
2001-07-21 20:25       ` Daniel Phillips
2001-07-22 10:18         ` Richard Guenther
2001-07-23 14:36           ` Daniel Phillips
2001-07-22 16:37         ` Larry McVoy
2001-07-23 14:24           ` Daniel Phillips
2001-07-22 23:34         ` Eyal Lebedinsky
2001-07-24 12:57           ` Daniel Phillips
2001-07-22  2:23   ` Rusty Russell
2001-07-24 12:28     ` Daniel Phillips
2001-07-18  9:48 ` Richard Guenther

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=01072200570501.02679@starship \
    --to=phillips@bonn-fries.net \
    --cc=Brian.J.Watson@compaq.com \
    --cc=ak@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox