All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: "Jörn Engel" <joern@logfs.org>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [RFC] B+Tree library
Date: Tue, 28 Oct 2008 12:25:57 +1100	[thread overview]
Message-ID: <20081028012557.GP18495@disturbed> (raw)
In-Reply-To: <20081026124643.GA1328@logfs.org>

On Sun, Oct 26, 2008 at 01:46:44PM +0100, Jörn Engel wrote:
> The idea of using btrees in the kernel is not exactly new.  They have a
> number of advantages over rbtrees and hashtables, but also a number of
> drawbacks.  More importantly, logfs already makes use of them and -
> since I don't want any incompatible code to get merged and suffer the
> trouble it creates - I would like to discuss my implementation and where
> it makes sense and where it doesn't.
> 
> General advantages of btrees are memory density and efficient use of
> cachelines.  Hashtables are either too small and degrade into linked
> list performance, or they are too large and waste memory.  With changing
> workloads, both may be true on the same system.  Rbtrees have a bad
> fanout of less than 2 (they are not actually balanced binary trees),
> hence reading a fairly large number of cachelines to each lookup.
> 
> Main disadvantage of btrees is that they are complicated, come in a
> gazillion subtly different variant that differ mainly in the balance
> between read efficiency and write efficiency.  Comparing btrees against
> anything is a bit like comparing apples and random fruits.
> 
> This implementation is extremely simple.  It splits nodes when they
> overflow.  It does not move elements to neighboring nodes.  It does not
> try fancy 2:3 splits.  It does not even merge nodes when they shrink,
> making degenerate cases possible.  And it requires callers to do
> tree-global locking.  In effect, it will be hard to find anything less
> sophisticated.
> 
> The one aspect where my implementation is actually nice is in allowing
> variable key length.  Btree keys are interpreted as an array of unsigned
> long.  So by passing the correct geometry to the core functions, it is
> possible to handle 32bit, 64bit or 128bit btrees, which logfs uses.  If
> so desired, any other weird data format can be used as well (Zach, are
> you reading this?).
> 
> So would something like this be merged once some users are identified?
> Would it be useful for anything but logfs?  Or am I plain nuts?

I think a btree library would be useful - there are places where
people are using rb-trees instead of btree simply because it's
easier to use the rbtree than it is to implement a btree library.
I can think of several places I could use such a library for
in-memory extent representation....

That being said, I haven't had a chance to look at that code yet....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2008-10-28  1:26 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-10-26 12:46 [RFC] B+Tree library Jörn Engel
2008-10-28  1:25 ` Dave Chinner [this message]
2008-10-30 17:43 ` Pavel Machek
2008-10-30 17:58   ` Jörn Engel
2008-10-30 19:14     ` Pavel Machek
2008-10-30 20:20       ` Jörn Engel
2008-10-31  6:38     ` Christian Borntraeger
2008-10-31  7:35       ` Jörn Engel
2008-10-31  9:16         ` Geert Uytterhoeven
2008-10-31  9:20           ` Jörn Engel
2008-10-31 10:35 ` Johannes Berg
2008-10-31 11:26   ` Jörn Engel
2008-10-31 11:32     ` Johannes Berg
2008-10-31 12:54       ` Jörn Engel
2008-10-31 13:07         ` Johannes Berg
2008-10-31 13:15           ` Jörn Engel
2008-11-01 15:59         ` [RFC] B+Tree library V2 Jörn Engel
2008-11-05 19:57           ` Johannes Berg
2008-11-05 20:06             ` Jörn Engel
2008-11-05 20:12               ` Johannes Berg
2008-11-05 20:21                 ` Jörn Engel
2008-11-05 20:25                   ` Johannes Berg
2008-11-07  7:52                     ` Jörn Engel
2009-01-08  0:57           ` Johannes Berg
2009-01-08 16:24             ` Jörn Engel
2009-01-08 16:34               ` Johannes Berg
2009-01-08 19:40                 ` Jörn Engel
2009-01-08 16:50               ` Johannes Berg
2009-01-08 19:46                 ` Jörn Engel
2009-01-08 17:10               ` Johannes Berg
2009-01-08 20:02                 ` Jörn Engel
2009-01-08 20:18                   ` Johannes Berg
2009-01-08 21:09                     ` Jörn Engel
2008-10-31 13:16 ` [RFC] B+Tree library Johannes Berg
2008-10-31 13:29   ` Jörn Engel
2008-10-31 13:45   ` Bert Wesarg
2008-10-31 15:18   ` Tim Gardner
2008-10-31 15:35     ` Jörn Engel
2008-10-31 20:17 ` Sean Young
2008-10-31 23:36   ` Jörn Engel
2008-11-01 10:17     ` Sean Young

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20081028012557.GP18495@disturbed \
    --to=david@fromorbit.com \
    --cc=joern@logfs.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.