linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kẏra <kxra@fsf.org>
To: linux-btrfs@vger.kernel.org
Subject: Re: Fractal Tree Indexing over B-Trees?
Date: Thu, 4 Jul 2013 17:48:21 +0000 (UTC)	[thread overview]
Message-ID: <loom.20130704T194442-3@post.gmane.org> (raw)
In-Reply-To: f18270bffa7724e63753821b4d0d5fa0.squirrel@webmail.decarpentier.com

Niels de Carpentier <niels <at> decarpentier.com> writes:
> > I'd like to see how they do that.  The fact is you are still going to get
> > random
> > seeks since you have to binary search the blocks in an entire row since
> > there is
> > no way you can read a several thousand block row into memory to search it,
> > so
> > once your rows get pretty big you are doing just as much if not more
> > random io
> > as a btree.
> 
> Why? The rows are sequential on disk, so you're never doing more random
> seeks than the number of rows as long as you can search faster than the
> disk can provide data. If they indeed can limit the number of searches per
> row to a constant, it shouldn't be an issue at all.
> 
> > I don't buy that its better in the insert case either for similar reasons,
> > you
> > are talking about having to rewrite entire rows to maintain the sequential
> > nature of everything, so if you end up adding a giant row you have to go
> > and
> > write the whole thing out again, so you are talking about a lot more IO
> > than
> > with b-trees, albeit sequential.  So maybe it's a win since it's
> > sequential but
> > I wonder at what tree size that benefit no longer exists.
> 
> The worst case might be bad, but on average it should be good. I don't
> doubt it is always better on rotating disks, probably not on ssd's.
> >
> > Of course all this analysis is based on a power point presentation and
> > there may
> > be some detail I'm missing, but that's mostly my point, claiming that
> > b-trees
> > are now obsolete because we can maybe do inserts faster is just idiotic.
> 
> It's a presentation from a company, so lots of marketing. Of course it
> doesn't make btrees obsolete, it's optimized for specific cases. Like they
> say they reduce random seeks at the cost of disk bandwidth and cpu usage.
> It depends on the use case if that is useful or not. Probably not for
> btrfs, since the future will be ssd's, and meta data will generally be
> cached.
> 
> Niels
> 

I'm reviving a very old thread from here:
http://thread.gmane.org/gmane.comp.file-systems.btrfs/16484

I found a bunch of more recent info from the company doing most of the work
behind fractal tree FS that I'd love to hear your thoughts on.

https://www.youtube.com/watch?v=c-n2LGPpQEw
http://www.tokutek.com/2013/02/mongodb-fractal-tree-indexes-high-compression
https://www.usenix.org/conference/hotstorage12/tokufs-streaming-file-system

Does this address earlier concerns at all? 

I just hope their work will be
released as unpatented GPL-compatible free software. 


      reply	other threads:[~2013-07-04 17:50 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-28 14:25 Fractal Tree Indexing over B-Trees? Danny Piccirillo
2012-03-28 15:10 ` C Anthony Risinger
2012-03-28 18:42 ` Josef Bacik
2012-03-28 18:45   ` Jeff Mahoney
2012-03-28 18:57   ` Josef Bacik
2012-03-28 19:50     ` Zach Brown
2012-03-28 20:13       ` Josef Bacik
2012-03-28 20:29         ` Zach Brown
2012-03-28 20:44         ` Niels de Carpentier
2012-03-28 20:53           ` Josef Bacik
2012-03-28 21:14             ` Niels de Carpentier
2013-07-04 17:48               ` Kẏra [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=loom.20130704T194442-3@post.gmane.org \
    --to=kxra@fsf.org \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).