From: Josef Bacik <josef@redhat.com>
To: Danny Piccirillo <danny.piccirillo@member.fsf.org>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Fractal Tree Indexing over B-Trees?
Date: Wed, 28 Mar 2012 14:42:04 -0400 [thread overview]
Message-ID: <20120328184204.GB1941@localhost.localdomain> (raw)
In-Reply-To: <loom.20120328T160525-464@post.gmane.org>
On Wed, Mar 28, 2012 at 02:25:39PM +0000, Danny Piccirillo wrote:
> The case has been made on Phoronix for F-Trees: They makes use hard
> drive speeds, not (relatively slow) access times; beat SSD's; and scale
> perfectly across multiple cores with hundreds of millions of entries.
>
> http://en.wikipedia.org/wiki/TokuDB#Fractal_tree_indexes
>
> How TokuDB Fractal Tree Databases Work
>
> Via: http://www.phoronix.com/scan.php?page=news_item&px=MTA3NjM
>
> Time for someone to get started on ftrfs? Or can it be implemented
> in Btrfs?
> https://bugzilla.kernel.org/show_bug.cgi?id=43004
>
This is dumber shit than usual out of phoronix. I did some just basic
calculations for worst case scenarios, let's say we max out btrfs's 8 level
limit, so we have 7 levels of nodes and then our leaves. Lets just for
simplicity sake say we can fit 1 billion objects into this kind of tree, and for
each node/leaf we can only hold 10 objects. (Keep in mind these are just random
numbers I'm pulling out of my ass and may not work out right with such a large
tree, just work with me here).
So worst case scenario doing a binary search for an object with 10 objects is 4
comparisons, so with a full 8 level btree we're doing 4 comparisons per level,
so 32 comparisons worst possible case to find our object.
Now let's consider a fractal tree with 1 billion objects. So that would have a
29 row fractal tree (if I did my math right). Now I have to make some
assumptions about how you would implement a fractal tree here, but let's assume
that every level tells you it's min and max value to make it easier to tell
which level you need to search in. So worst case you're object is in the 29th
level, so you have to read 29 blocks in and check the min/max levels to figure
out which row to search. Let's be fair and say these are in memory, we're still
doing 29 comparisons right out of the gate to find the right row. Now we have
the right row, but this is a file system and the rows are split up into blocks,
so we not only have to binary search each block, but the blocks themselves to
find the right block with our object. And we can't do this directed either
because we have no way of indexing the individual blocks, so worst case scenario
we're having to read in 268435456 blocks to then do a normal binary search on
_each_ block! Now lets again say just to give fractal trees an added benefit
that each block has it's own min/max setting, we only have to binary search the
one that will have our actual data, but we're still talking about reading in
33554432 times the number of blocks as compared to a btree!!!!!!
And this doesn't even begin to touch the ability to handle multi-threaded
workloads. Imagine having to move all of these blocks around in memory because
your insert created a new row. You'd have to lock each row as you move stuff
around (at the very least), so if you hit a particularly hot row your
multithreaded performance is going to look like you are running on an Atari
2600.
So no I don't think it's time for someone to get started on ftrfs. Thanks,
Josef
next prev parent reply other threads:[~2012-03-28 18:42 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-03-28 14:25 Fractal Tree Indexing over B-Trees? Danny Piccirillo
2012-03-28 15:10 ` C Anthony Risinger
2012-03-28 18:42 ` Josef Bacik [this message]
2012-03-28 18:45 ` Jeff Mahoney
2012-03-28 18:57 ` Josef Bacik
2012-03-28 19:50 ` Zach Brown
2012-03-28 20:13 ` Josef Bacik
2012-03-28 20:29 ` Zach Brown
2012-03-28 20:44 ` Niels de Carpentier
2012-03-28 20:53 ` Josef Bacik
2012-03-28 21:14 ` Niels de Carpentier
2013-07-04 17:48 ` Kẏra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120328184204.GB1941@localhost.localdomain \
--to=josef@redhat.com \
--cc=danny.piccirillo@member.fsf.org \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).