From: Kent Overstreet <kent.overstreet@gmail.com>
To: Ming Lin <mlin@kernel.org>
Cc: "linux-bcache@vger.kernel.org" <linux-bcache@vger.kernel.org>
Subject: Re: [ANNOUNCE] bcachefs!
Date: Thu, 6 Aug 2015 16:59:09 -0700
Message-ID: <20150806235909.GA30757@moria.home.lan>
In-Reply-To: <CAF1ivSa3RCaydtMWPGyuVo=+wR+siF_5x1sPUsCL6DiDS+7yhA@mail.gmail.com>

On Thu, Aug 06, 2015 at 04:27:51PM -0700, Ming Lin wrote:
> On Thu, Aug 6, 2015 at 3:58 PM, Kent Overstreet
> <kent.overstreet@gmail.com> wrote:
> > On Tue, Jul 28, 2015 at 11:41:52AM -0700, Ming Lin wrote:
> >> On Fri, Jul 24, 2015 at 1:47 PM, Ming Lin <mlin@kernel.org> wrote:
> >> >
> >> > And I want to learn how the btree node insert/delete/update happens on
> >> > disk. These maybe too detail. I'm going to write a small tool to dump
> >> > the file system. Then I could understand better the on disk btree
> >> > format.
> >>
> >> Here is my simple tool to dump parts of the on-disk format.
> >> http://www.minggr.net/cgit/cgit.cgi/bcache-tools/commit/?id=deb258e2
> >>
> >> It's not in good shape, but simple enough to learn the on-disk format.
> >
> > Hey! Sorry for taking so long to respond, just got my computer set up back in
> > Alaska.
> >
> > If you want to keep going with your tool, this might be a starting point for a
> > debugfs tool - which bcache definitely needs at some point.
>
> Yes, that's my goal.
> I'll improve it once I get more familiar with bcachefs on-disk format.

I imagine the sanest thing to do will be to reuse some of the kernel side code -
at the very least, the bkey packing code. That code is already pretty self
contained, and it's very algorithmic - no point in redoing it, and no real
reason to do it differently.

If it makes things easier, we could probably shuffle code around a bit so that
perhaps bkey.c contains only code that can be easily compiled in userspace.

I'm not sure if there's any other significant code that you'd want to use in
userspace - possibly the mergesort code (i.e.
bch_extent_sort_fix_overlapping()), but that code is going to be harder to lift
out and compile in userspace without changes.

Journal replay is going to be another major issue... the problem is, the btree
isn't up to date until you do journal replay, and the way bcache does journal
replay is with the same index update path that it uses at runtime - which
modifies the btree, i.e. it can't do journal replay without modifying what's on
disk.

We don't want the userspace debugfs tool to be modifying the disk image, so the
method bcache uses is right out.

The method I had in mind was that when you read the journal, you keep that list
of index updates to do around, in memory - then, when you read or are looking at
any given btree node, you iterate over all the keys in the journal replay list
and apply only the ones that apply to the current node. If the insertions don't
fit into the current node (i.e. if we would have to split the node if we were
doing a normal index update) - just grow the node in memory, since we're just
going to be tossing it out when we're done instead of writing out our changes.
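
A rough sketch of that overlay idea, in Python for brevity (a real tool would
more likely be C so it can share code with the kernel). Everything in it is an
assumption made for illustration, not bcache's actual format or API: keys are
plain integers rather than real bkeys, a node is modelled as a key range plus a
dict, a journal entry is a (key, value) pair, and a value of None stands in for
a deletion. The names Node and overlay_journal are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Simplified in-memory copy of one btree node (hypothetical layout)."""
    lo: int         # smallest key this node may hold (inclusive)
    hi: int         # largest key this node may hold (inclusive)
    entries: dict   # key -> value, as read from disk


def overlay_journal(node, journal):
    """Return the node's keys with pending journal updates applied.

    `journal` is a list of (key, value) pairs in replay order; a value of
    None is treated as a deletion. Nothing is written back, so the result
    is allowed to hold more keys than an on-disk node could.
    """
    view = dict(node.entries)      # work on a copy; the disk image is untouched
    for key, value in journal:
        if not (node.lo <= key <= node.hi):
            continue               # this update belongs to some other node
        if value is None:
            view.pop(key, None)    # deletion of a key that may not be present
        else:
            view[key] = value      # later journal entries override earlier ones
    return sorted(view.items())
```

The point of the sketch is only that the node never needs splitting: the
overlaid view just grows in memory and is thrown away after it has been
printed.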