From: Dave Chinner <david@fromorbit.com>
To: "Jörn Engel" <joern@logfs.org>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [RFC] B+Tree library
Date: Tue, 28 Oct 2008 12:25:57 +1100 [thread overview]
Message-ID: <20081028012557.GP18495@disturbed> (raw)
In-Reply-To: <20081026124643.GA1328@logfs.org>
On Sun, Oct 26, 2008 at 01:46:44PM +0100, Jörn Engel wrote:
> The idea of using btrees in the kernel is not exactly new. They have a
> number of advantages over rbtrees and hashtables, but also a number of
> drawbacks. More importantly, logfs already makes use of them and -
> since I don't want any incompatible code to get merged and suffer the
> trouble it creates - I would like to discuss my implementation and where
> it makes sense and where it doesn't.
>
> General advantages of btrees are memory density and efficient use of
> cachelines. Hashtables are either too small and degrade into linked
> list performance, or they are too large and waste memory. With changing
> workloads, both may be true on the same system. Rbtrees have a bad
> fanout of less than 2 (they are not actually balanced binary trees),
> hence reading a fairly large number of cachelines to each lookup.
>
> Main disadvantage of btrees is that they are complicated, come in a
> gazillion subtly different variant that differ mainly in the balance
> between read efficiency and write efficiency. Comparing btrees against
> anything is a bit like comparing apples and random fruits.
>
> This implementation is extremely simple. It splits nodes when they
> overflow. It does not move elements to neighboring nodes. It does not
> try fancy 2:3 splits. It does not even merge nodes when they shrink,
> making degenerate cases possible. And it requires callers to do
> tree-global locking. In effect, it will be hard to find anything less
> sophisticated.
>
> The one aspect where my implementation is actually nice is in allowing
> variable key length. Btree keys are interpreted as an array of unsigned
> long. So by passing the correct geometry to the core functions, it is
> possible to handle 32bit, 64bit or 128bit btrees, which logfs uses. If
> so desired, any other weird data format can be used as well (Zach, are
> you reading this?).
>
> So would something like this be merged once some users are identified?
> Would it be useful for anything but logfs? Or am I plain nuts?
I think a btree library would be useful - there are places where
people are using rb-trees instead of btree simply because it's
easier to use the rbtree than it is to implement a btree library.
I can think of several places I could use such a library for
in-memory extent representation....
That being said, I haven't had a chance to look at that code yet....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2008-10-28 1:26 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-10-26 12:46 [RFC] B+Tree library Jörn Engel
2008-10-28 1:25 ` Dave Chinner [this message]
2008-10-30 17:43 ` Pavel Machek
2008-10-30 17:58 ` Jörn Engel
2008-10-30 19:14 ` Pavel Machek
2008-10-30 20:20 ` Jörn Engel
2008-10-31 6:38 ` Christian Borntraeger
2008-10-31 7:35 ` Jörn Engel
2008-10-31 9:16 ` Geert Uytterhoeven
2008-10-31 9:20 ` Jörn Engel
2008-10-31 10:35 ` Johannes Berg
2008-10-31 11:26 ` Jörn Engel
2008-10-31 11:32 ` Johannes Berg
2008-10-31 12:54 ` Jörn Engel
2008-10-31 13:07 ` Johannes Berg
2008-10-31 13:15 ` Jörn Engel
2008-11-01 15:59 ` [RFC] B+Tree library V2 Jörn Engel
2008-11-05 19:57 ` Johannes Berg
2008-11-05 20:06 ` Jörn Engel
2008-11-05 20:12 ` Johannes Berg
2008-11-05 20:21 ` Jörn Engel
2008-11-05 20:25 ` Johannes Berg
2008-11-07 7:52 ` Jörn Engel
2009-01-08 0:57 ` Johannes Berg
2009-01-08 16:24 ` Jörn Engel
2009-01-08 16:34 ` Johannes Berg
2009-01-08 19:40 ` Jörn Engel
2009-01-08 16:50 ` Johannes Berg
2009-01-08 19:46 ` Jörn Engel
2009-01-08 17:10 ` Johannes Berg
2009-01-08 20:02 ` Jörn Engel
2009-01-08 20:18 ` Johannes Berg
2009-01-08 21:09 ` Jörn Engel
2008-10-31 13:16 ` [RFC] B+Tree library Johannes Berg
2008-10-31 13:29 ` Jörn Engel
2008-10-31 13:45 ` Bert Wesarg
2008-10-31 15:18 ` Tim Gardner
2008-10-31 15:35 ` Jörn Engel
2008-10-31 20:17 ` Sean Young
2008-10-31 23:36 ` Jörn Engel
2008-11-01 10:17 ` Sean Young
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20081028012557.GP18495@disturbed \
--to=david@fromorbit.com \
--cc=joern@logfs.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox