From: Kẏra <kxra@fsf.org>
To: linux-btrfs@vger.kernel.org
Subject: Re: Fractal Tree Indexing over B-Trees?
Date: Thu, 4 Jul 2013 17:48:21 +0000 (UTC) [thread overview]
Message-ID: <loom.20130704T194442-3@post.gmane.org> (raw)
In-Reply-To: f18270bffa7724e63753821b4d0d5fa0.squirrel@webmail.decarpentier.com
Niels de Carpentier <niels <at> decarpentier.com> writes:
> > I'd like to see how they do that. The fact is you are still going to get
> > random
> > seeks since you have to binary search the blocks in an entire row since
> > there is
> > no way you can read a several thousand block row into memory to search it,
> > so
> > once your rows get pretty big you are doing just as much if not more
> > random io
> > as a btree.
>
> Why? The rows are sequential on disk, so you're never doing more random
> seeks than the number of rows as long as you can search faster than the
> disk can provide data. If they indeed can limit the number of searches per
> row to a constant, it shouldn't be an issue at all.
>
> > I don't buy that its better in the insert case either for similar reasons,
> > you
> > are talking about having to rewrite entire rows to maintain the sequential
> > nature of everything, so if you end up adding a giant row you have to go
> > and
> > write the whole thing out again, so you are talking about a lot more IO
> > than
> > with b-trees, albeit sequential. So maybe it's a win since it's
> > sequential but
> > I wonder at what tree size that benefit no longer exists.
>
> The worst case might be bad, but on average it should be good. I don't
> doubt it is always better on rotating disks, probably not on ssd's.
> >
> > Of course all this analysis is based on a power point presentation and
> > there may
> > be some detail I'm missing, but that's mostly my point, claiming that
> > b-trees
> > are now obsolete because we can maybe do inserts faster is just idiotic.
>
> It's a presentation from a company, so lots of marketing. Of course it
> doesn't make btrees obsolete, it's optimized for specific cases. Like they
> say they reduce random seeks at the cost of disk bandwidth and cpu usage.
> It depends on the use case if that is useful or not. Probably not for
> btrfs, since the future will be ssd's, and meta data will generally be
> cached.
>
> Niels
>
I'm reviving a very old thread from here:
http://thread.gmane.org/gmane.comp.file-systems.btrfs/16484
I found a bunch of more recent info from the company doing most of the work
behind fractal tree FS that I'd love to hear your thoughts on.
https://www.youtube.com/watch?v=c-n2LGPpQEw
http://www.tokutek.com/2013/02/mongodb-fractal-tree-indexes-high-compression
https://www.usenix.org/conference/hotstorage12/tokufs-streaming-file-system
Does this address earlier concerns at all?
I just hope their work will be
released as unpatented GPL-compatible free software.
prev parent reply other threads:[~2013-07-04 17:50 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-03-28 14:25 Fractal Tree Indexing over B-Trees? Danny Piccirillo
2012-03-28 15:10 ` C Anthony Risinger
2012-03-28 18:42 ` Josef Bacik
2012-03-28 18:45 ` Jeff Mahoney
2012-03-28 18:57 ` Josef Bacik
2012-03-28 19:50 ` Zach Brown
2012-03-28 20:13 ` Josef Bacik
2012-03-28 20:29 ` Zach Brown
2012-03-28 20:44 ` Niels de Carpentier
2012-03-28 20:53 ` Josef Bacik
2012-03-28 21:14 ` Niels de Carpentier
2013-07-04 17:48 ` Kẏra [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=loom.20130704T194442-3@post.gmane.org \
--to=kxra@fsf.org \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).