Re: Fractal Tree Indexing over B-Trees?

linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Josef Bacik <josef@redhat.com>
To: Zach Brown <zab@zabbo.net>
Cc: Josef Bacik <josef@redhat.com>,
	Danny Piccirillo <danny.piccirillo@member.fsf.org>,
	linux-btrfs@vger.kernel.org
Subject: Re: Fractal Tree Indexing over B-Trees?
Date: Wed, 28 Mar 2012 16:13:06 -0400	[thread overview]
Message-ID: <20120328201306.GD1941@localhost.localdomain> (raw)
In-Reply-To: <4F736B6F.2080805@zabbo.net>

On Wed, Mar 28, 2012 at 03:50:07PM -0400, Zach Brown wrote:
> 
> >but lets say O(log N/2) where N is the number of elements in the row.
> >So in the situation I describe you are looking at having to do minimum
> >of 29 reads, one for each row,
> 
> Hmm.
> 
> Levels are powers of two and are either full or empty.  So the total
> item count tells you which levels are full or empty.
> 
> (gdb) print/t 1000000000
> $3 = 111011100110101100101000000000
> 
> So no, definitely not reading 29 levels.
>

You are still going to have to have at least 29 levels to accomodate 1 billion
objects, though they won't all be full (sorry I missed the must be full or empty
bit).  So it looks like we'll have to actually search what 13 rows right?  So
still more rows than a b-tree, and again you are talking about binary searching
each row's blocks and then following its forward pointer down to the next one,
so maybe not 100's of times slower than a b-tree, but we're not talking about a
5-10% difference either, probably still measured in multiples.

> After realizing this, I have to be honest: I'm not finding your
> hand-wavey dismissal of the technology convincing at all. :)
> 

Hah my math was a little off I'll grant you but the basic idea still stands,
once we start getting into larger and larger rows we're going to have to binary
search just to find the right _block_ to use, which in my mind is much more of a
problem than having to binary search inside of multiple blocks.  At the very
least as the number of objects grows the time it takes to find something in the
tree increases exponentially.

> There's a *ton* of detail that they don't describe about how to actually
> translate the logical notion of power-of-two node sizes into coherent
> block IO that scales.  I don't doubt that it's possible.

I imagine there is, but based on what little information they've shown I don't
see how it's a hands down win against b-trees.  If anything we're talking about
having to solve really complex problems in order to get any sort of good
performance out of this thing.  Thanks,

Josef

next prev parent reply	other threads:[~2012-03-28 20:13 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-28 14:25 Fractal Tree Indexing over B-Trees? Danny Piccirillo
2012-03-28 15:10 ` C Anthony Risinger
2012-03-28 18:42 ` Josef Bacik
2012-03-28 18:45   ` Jeff Mahoney
2012-03-28 18:57   ` Josef Bacik
2012-03-28 19:50     ` Zach Brown
2012-03-28 20:13       ` Josef Bacik [this message]
2012-03-28 20:29         ` Zach Brown
2012-03-28 20:44         ` Niels de Carpentier
2012-03-28 20:53           ` Josef Bacik
2012-03-28 21:14             ` Niels de Carpentier
2013-07-04 17:48               ` Kẏra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120328201306.GD1941@localhost.localdomain \
    --to=josef@redhat.com \
    --cc=danny.piccirillo@member.fsf.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=zab@zabbo.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).