Re: [PATCH] add b+tree library

From: "Jörn Engel" <joern@logfs.org>
To: Theodore Tso <tytso@mit.edu>
Cc: Andi Kleen <andi@firstfloor.org>,
	Johannes Berg <johannes@sipsolutions.net>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux Kernel list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] add b+tree library
Date: Sat, 10 Jan 2009 23:01:35 +0100
Message-ID: <20090110220135.GF20611@logfs.org>
In-Reply-To: <20090110212740.GE31579@mit.edu>

On Sat, 10 January 2009 16:27:40 -0500, Theodore Tso wrote:
> On Sat, Jan 10, 2009 at 09:23:16PM +0100, Jörn Engel wrote:
> > 
> > Key difference is the number of cachelines you need to find a particular
> > entry.  rbtrees have a fanout of sqrt(3), so for a million elements (to
> > pick a random example) you need about 25 cachelines with rbtrees and
> > about 5-16 with btrees.  Closer to 5 if keys and pointers are small and
> > cachelines are large, closer to 16 if keys and pointers are large and
> > cachelines are small.
> 
> Three questions....  is the number of cachelines in use going to make a
> measurable difference for your use case in the filesystem?  If the
> operation is going to involve disk access, trying to optimize for
> better cacheline utilization may not be the highest-priority thing to
> worry about.

I don't really expect a big difference, even if the filesystem is
intended for flash, not disks.  Other overhead will dominate the
picture.  The situation may be different for Johannes, though.

> If you have a million elements, and assuming each element is but 4
> bytes (which seems unlikely; very likely you'd be indexing at least
> 8-12 bytes of data, right?) we're talking about 4 megabytes of
> non-swappable kernel memory.  Is that likely to happen in your use
> case?

A million was picked because a) it is easy to calculate with and b) it
is sufficiently (insanely) large to illustrate the effect.  It is not
likely at all in my case.  With 1000 elements, which is much more
realistic, you can just halve the numbers above.
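
As a rough illustration of that arithmetic, the sketch below counts how
many levels a lookup has to descend for a given fanout and entry count,
treating one level as one cacheline.  The helper name levels_needed() and
the btree fanout of 16 are assumptions made for this example; only the
sqrt(3) figure for rbtrees comes from the discussion above.

```c
/*
 * Smallest number of levels h such that fanout^h >= nr_entries.
 * Hypothetical helper for illustration; it is not part of the patch.
 */
static unsigned int levels_needed(double fanout, double nr_entries)
{
	unsigned int levels = 0;
	double reach = 1.0;	/* entries reachable with 'levels' levels */

	if (fanout <= 1.0)
		return 0;	/* a fanout this small never terminates; reject it */

	while (reach < nr_entries) {
		reach *= fanout;
		levels++;
	}
	return levels;
}
```

Because whole levels are counted, results round up: a fanout of
1.7320508 (sqrt(3)) with a million entries gives 26, in line with "about
25", and an assumed fanout of 16 gives 5.  Dropping to 1000 entries
gives 13 for the sqrt(3) case, which is the halving mentioned above.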

> Finally, are b+trees so much better than rbtrees that it would be
> worthwhile to just *replace* rbtrees with b+trees?  Or is the problem
> the overhead issue if the number of entries in an rbtree is relatively
> small?

Maybe and no.  The overhead for near-empty or completely empty trees is
fairly low.  At one point in time I had one btree for every indirect
block and every inode in the filesystem.  As a result, struct btree_head
contains just two pointers and an int.
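
To make the size argument concrete, here is a minimal userspace sketch of
a head with two pointers and an int.  The field names, the _sketch
suffixes and the opaque struct mempool_sketch are assumptions chosen for
illustration; the text above only states the number and kind of members.

```c
#include <stddef.h>

struct mempool_sketch;	/* opaque stand-in for a kernel memory pool */

/* Two pointers and an int: cheap enough to embed one per object. */
struct btree_head_sketch {
	unsigned long *node;		/* root node, NULL while the tree is empty */
	struct mempool_sketch *mempool;	/* where node memory would come from */
	int height;			/* 0 for an empty tree */
};

/* An empty tree needs no allocation at all, only this initialisation. */
static void btree_head_sketch_init(struct btree_head_sketch *head,
				   struct mempool_sketch *pool)
{
	head->node = NULL;
	head->mempool = pool;
	head->height = 0;
}
```

On common ABIs such a struct is no larger than three pointers, so having
one per inode or per indirect block costs little while the tree is empty.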

One key difference is that rbtrees maintain the tree within objects and
btrees maintain the tree externally.  So btrees have to allocate memory
on insertion, where rbtrees have the necessary memory as part of the
object.  With mempools the memory allocation should be reasonably safe,
so maybe this is a bit of a red herring now.
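
The difference in where node memory comes from can be sketched as
follows.  This uses a plain unbalanced binary search tree for both
variants, so it is neither an rbtree nor a btree; all type and function
names are hypothetical, and malloc() stands in for whatever allocator
(for instance a mempool) the kernel code would use.

```c
#include <stddef.h>
#include <stdlib.h>

/* Intrusive: the link is embedded in the object, as rb_node is. */
struct tree_link {
	struct tree_link *left, *right;
};

struct object {
	unsigned long key;
	struct tree_link link;	/* tree membership costs no extra allocation */
};

#define link_to_object(l) \
	((struct object *)((char *)(l) - offsetof(struct object, link)))

/* Cannot fail: all memory needed is already part of 'obj'. */
static void intrusive_insert(struct tree_link **root, struct object *obj)
{
	struct tree_link **pos = root;

	while (*pos) {
		if (obj->key < link_to_object(*pos)->key)
			pos = &(*pos)->left;
		else
			pos = &(*pos)->right;
	}
	obj->link.left = obj->link.right = NULL;
	*pos = &obj->link;
}

/* External: the tree owns its nodes, which merely point at the values. */
struct ext_node {
	unsigned long key;
	void *val;
	struct ext_node *left, *right;
};

/* Can fail: a node has to be allocated before anything is linked in. */
static int external_insert(struct ext_node **root, unsigned long key, void *val)
{
	struct ext_node **pos = root;
	struct ext_node *node = malloc(sizeof(*node));

	if (!node)
		return -1;
	node->key = key;
	node->val = val;
	node->left = node->right = NULL;

	while (*pos)
		pos = key < (*pos)->key ? &(*pos)->left : &(*pos)->right;
	*pos = node;
	return 0;
}
```

The intrusive variant returns void because it has no failure path, while
the external one has to report allocation failure to its caller.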

Another difference is the locking.  The current implementation
completely ignores locking and depends on the callers to serialize
access to the btree.

Keeping all that in mind, I believe many rbtree users could be
converted.

Jörn

-- 
The competent programmer is fully aware of the strictly limited size of
his own skull; therefore he approaches the programming task in full
humility, and among other things he avoids clever tricks like the plague.
-- Edsger W. Dijkstra
