From: Kent Overstreet <kent.overstreet@gmail.com>
To: Ming Lin <mlin@kernel.org>
Cc: "linux-bcache@vger.kernel.org" <linux-bcache@vger.kernel.org>
Subject: Re: [ANNOUNCE] bcachefs!
Date: Thu, 6 Aug 2015 16:11:12 -0700
Message-ID: <20150806231112.GB2459@kmo-pixel>
In-Reply-To: <1438843206.9429.34.camel@hasee>
On Wed, Aug 05, 2015 at 11:40:06PM -0700, Ming Lin wrote:
> On Tue, 2015-07-28 at 11:45 -0700, Ming Lin wrote:
> > On Tue, Jul 28, 2015 at 11:41 AM, Ming Lin <mlin@kernel.org> wrote:
> > > On Fri, Jul 24, 2015 at 1:47 PM, Ming Lin <mlin@kernel.org> wrote:
> > >>
> > >> And I want to learn how btree node insert/delete/update happens on
> > >> disk. This may be too much detail. I'm going to write a small tool to
> > >> dump the file system. Then I could better understand the on-disk btree
> > >> format.
> > >
> > > Here is my simple tool to dump parts of the on-disk format.
> > > http://www.minggr.net/cgit/cgit.cgi/bcache-tools/commit/?id=deb258e2
> >
> > Actually: http://www.minggr.net/cgit/cgit.cgi/bcache-tools/commit/?id=3121eec
> >
> > >
> > > It's not in good shape, but simple enough to learn the on-disk format.
>
> Hi Kent,
>
> I'm trying to understand how the root inode is stored in the inode
> btree.
>
> dd if=/dev/zero of=fs.img bs=10M count=1
> bcacheadm format -C fs.img
> mount -t bcache -o loop fs.img /mnt
> umount /mnt
> hexdump -C fs.img > fs.hex
>
> From my simple tool, I know that the inode btree starts from offset
> 0xec000
The root node of the inode btree? Are you handling trees with multiple nodes
yet?
>
> 000ec000 43 ef f3 df ff ff ff ff 86 c1 47 1e 99 25 51 35 |C.........G..%Q5|
> 000ec010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> 000ec020 00 00 00 00 00 00 00 00 ff ff ff ff ff ff ff ff |................|
> 000ec030 ff ff ff ff ff ff ff ff 01 05 00 00 00 00 00 00 |................|
> 000ec040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> *
> 000ec070 88 b5 38 e2 45 36 eb f6 00 00 00 00 00 00 00 00 |..8.E6..........|
> 000ec080 01 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 |................|
> 000ec090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> *
> 000ed000 31 66 fd 31 ff ff ff ff 88 b5 38 e2 45 36 eb f6 |1f.1......8.E6..|
> 000ed010 02 00 00 00 00 00 00 00 01 00 00 00 03 00 0b 00 |................|
> 000ed020 0b 01 80 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> 000ed030 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 |................|
> 000ed040 ed 41 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |.A..............|
> 000ed050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> *
> 000ed070 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> 000ed080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> *
>
> btree_node (0xec000)
> bset (0xed008) ---> bset->u64s = 0x0b = 11
> bkey_packed (0xed020)
> bkey (0xed020)
> bch_inode (0xed040 to 0xed077) ---> root inode
>
> Is the decode above correct?
I think so. The code that reads a btree node in from disk and interprets
its contents is mainly in bch_btree_node_read_done(), in btree_io.c -
it looks like you found that?
> I found the root inode manually. But how is it actually found by code?
The root inode is the inode with inode number BCACHE_ROOT_INO (4096) -
http://evilpiepirate.org/git/linux-bcache.git/tree/drivers/md/bcache/fs.c?h=bcache-dev&id=5cf7fb11d124839eea2191fd7e8eddecb296d67d#n2285
So to do it correctly, you'll need the bkey packing code in order to unpack the
key (if it was packed) so that you can get the actual inode number of the key.
You'll also need to do something like the mergesort algorithm (or something
equivalent; you don't need to do the actual mergesort if you're just doing a
linear search for one key). That is - if there are multiple bsets, they will
likely contain duplicates, and keys in newer bsets overwrite keys in older bsets.
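As a rough illustration of the linear search described above, here is a minimal Python sketch. It assumes the keys have already been unpacked into (inode number, value) pairs and that the bsets are handed over ordered oldest to newest; the find_inode() helper and the sample data are hypothetical, not part of the bcache code, and the sketch ignores deleted keys.

```python
BCACHE_ROOT_INO = 4096  # inode number of the root inode, as stated above


def find_inode(bsets, inum):
    """Linear search for one key across several bsets.

    bsets: list of bsets ordered oldest to newest (an assumption of this
    sketch); each bset is a list of (inode_number, value) pairs whose keys
    have already been unpacked.
    Returns the value from the newest bset containing the key, or None.
    """
    # Walk the bsets newest first so that a newer duplicate wins.
    for bset in reversed(bsets):
        for key_inum, value in bset:
            if key_inum == inum:
                return value
    return None


# Hypothetical data: the root inode appears in both bsets.
older = [(4096, "root inode, old copy"), (4097, "some file")]
newer = [(4096, "root inode, new copy")]

print(find_inode([older, newer], BCACHE_ROOT_INO))
```

Searching the newest bset first gives the same answer a full merge would give for a single key, without having to merge anything.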
> Could you help to explain what it is from 0xec070 to 0xed007?
> Are they also bsets?
Without knowing your block size and spending a fair amount of time staring at
the hexdump, I don't know what starts there - but quite possibly yes; bsets that
aren't at the start of the btree node are embedded in a struct
btree_node_entry, not a struct btree_node.
To tell if it's a valid bset, you compare bset->seq against the seq in the first
bset - it's a random number generated for each new btree node; if they match
then the bset there goes with that btree node.
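The seq comparison can be sketched in Python as below. The byte strings are copied from the hexdump quoted earlier (offsets 0xec070 and 0xed008); treating those eight bytes as a little-endian 64-bit seq field is an assumption read off the dump, not taken from the struct definitions, and the helper names are hypothetical.

```python
import struct


def read_u64_le(buf, offset=0):
    """Read a little-endian 64-bit integer from buf at offset."""
    return struct.unpack_from("<Q", buf, offset)[0]


def bset_belongs_to_node(first_seq, candidate_seq):
    """A later bset goes with the btree node if its seq matches the first bset's."""
    return first_seq == candidate_seq


# Eight bytes at 0xec070 (assumed: seq of the first bset in the node).
first = bytes.fromhex("88b538e24536ebf6")
# Eight bytes at 0xed008 (assumed: seq of the bset Ming decoded).
candidate = bytes.fromhex("88b538e24536ebf6")
# A region of zeroes, as found in the unused parts of the dump.
empty = bytes(8)

print(bset_belongs_to_node(read_u64_le(first), read_u64_le(candidate)))
print(bset_belongs_to_node(read_u64_le(first), read_u64_le(empty)))
```

With the bytes from this dump the first comparison matches and the second does not, which is consistent with the bset at 0xed008 belonging to the node at 0xec000.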