public inbox for linux-bcache@vger.kernel.org
From: Kent Overstreet <kent.overstreet@gmail.com>
To: Ming Lin <mlin@kernel.org>
Cc: "linux-bcache@vger.kernel.org" <linux-bcache@vger.kernel.org>
Subject: Re: [ANNOUNCE] bcachefs!
Date: Thu, 6 Aug 2015 16:11:12 -0700
Message-ID: <20150806231112.GB2459@kmo-pixel>
In-Reply-To: <1438843206.9429.34.camel@hasee>

On Wed, Aug 05, 2015 at 11:40:06PM -0700, Ming Lin wrote:
> On Tue, 2015-07-28 at 11:45 -0700, Ming Lin wrote:
> > On Tue, Jul 28, 2015 at 11:41 AM, Ming Lin <mlin@kernel.org> wrote:
> > > On Fri, Jul 24, 2015 at 1:47 PM, Ming Lin <mlin@kernel.org> wrote:
> > >>
> > >> And I want to learn how the btree node insert/delete/update happens on
> > >> disk. These maybe too detail. I'm going to write a small tool to dump
> > >> the file system. Then I could understand better the on disk btree
> > >> format.
> > >
> > > Here is my simple tool to dump parts of the on-disk format.
> > > http://www.minggr.net/cgit/cgit.cgi/bcache-tools/commit/?id=deb258e2
> > 
> > Actually: http://www.minggr.net/cgit/cgit.cgi/bcache-tools/commit/?id=3121eec
> > 
> > >
> > > It's not in good shape, but simple enough to learn the on-disk format.
> 
> Hi Kent,
> 
> I'm trying to understand how the root inode is stored in the inode
> btree.
> 
> dd if=/dev/zero of=fs.img bs=10M count=1
> bcacheadm format -C fs.img
> mount -t bcache -o loop fs.img /mnt
> umount /mnt
> hexdump -C fs.img > fs.hex
> 
> From my simple tool, I know that the inode btree starts from offset
> 0xec000

The root node of the inode btree? Are you handling trees with multiple nodes
yet?

> 
> 000ec000  43 ef f3 df ff ff ff ff  86 c1 47 1e 99 25 51 35  |C.........G..%Q5|
> 000ec010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> 000ec020  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
> 000ec030  ff ff ff ff ff ff ff ff  01 05 00 00 00 00 00 00  |................|
> 000ec040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 000ec070  88 b5 38 e2 45 36 eb f6  00 00 00 00 00 00 00 00  |..8.E6..........|
> 000ec080  01 00 00 00 03 00 00 00  00 00 00 00 00 00 00 00  |................|
> 000ec090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 000ed000  31 66 fd 31 ff ff ff ff  88 b5 38 e2 45 36 eb f6  |1f.1......8.E6..|
> 000ed010  02 00 00 00 00 00 00 00  01 00 00 00 03 00 0b 00  |................|
> 000ed020  0b 01 80 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> 000ed030  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
> 000ed040  ed 41 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |.A..............|
> 000ed050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 000ed070  02 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> 000ed080  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 
> btree_node (0xec000)
>     bset (0xed008)  ---> bset->u64s = 0x0b = 11
>         bkey_packed (0xed020)
>             bkey (0xed020)
>             bch_inode (0xed040 to 0xed077)  ---> root inode
> 
> Is the decode above correct?

I think so. The code that deals with reading in a btree node from disk and
interpreting the contents is mainly in bch_btree_node_read_done(), in
btree_io.c - it looks like you found that?
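
As a rough cross-check of that decode, the sizes can be verified with simple
arithmetic. The sketch below is Python rather than kernel code. It assumes only
what the decode above states: the keys in the bset start at 0xed020 and
bset->u64s is 11. It also assumes the first two bytes of bch_inode are a
little-endian mode field, which is a guess from the dump and not something
confirmed here.

```python
import stat
import struct

KEYS_START = 0xED020   # from the decode above: first bkey in the bset
BSET_U64S = 11         # from the decode above: bset->u64s


def bset_keys_end(keys_start, u64s):
    """Return the offset one past the last byte of key data in a bset."""
    return keys_start + u64s * 8


end = bset_keys_end(KEYS_START, BSET_U64S)
print(hex(end))

# The two bytes at 0xed040 in the hexdump. Treating them as a little-endian
# 16-bit mode is an assumption about the bch_inode layout.
mode = struct.unpack("<H", bytes.fromhex("ed41"))[0]
print(oct(mode), stat.S_ISDIR(mode))
```

The computed end, 0xed078, agrees with the inode value ending at 0xed077, and
0o40755 is what the mode of a root directory would plausibly look like.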

> I found the root inode manually. But how is it actually found by code?

The root inode is the inode with inode number BCACHE_ROOT_INO (4096) -
http://evilpiepirate.org/git/linux-bcache.git/tree/drivers/md/bcache/fs.c?h=bcache-dev&id=5cf7fb11d124839eea2191fd7e8eddecb296d67d#n2285

So to do it correctly, you'll need the bkey packing code in order to unpack the
key (if it was packed) so that you can get the actual inode number of the key.
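
For the unpacked case only, the comparison is a plain integer check. The sketch
below (Python, not the kernel code) assumes the key at 0xed020 is 32 bytes long
and stored unpacked, with the inode number as a little-endian 64-bit field in
its last 8 bytes. That offset is inferred from the hexdump, not taken from the
headers, so treat it as hypothetical. A packed key would need the real
unpacking code first.

```python
import struct

BCACHE_ROOT_INO = 4096

# The 32 bytes at 0xed020..0xed03f, copied from the hexdump.
KEY_BYTES = bytes.fromhex(
    "0b01800000000000" "0000000000000000"
    "0000000000000000" "0010000000000000"
)


def key_inode_unpacked(key):
    """Read the inode number, assuming it is the last u64 of an unpacked key."""
    (inode,) = struct.unpack_from("<Q", key, len(key) - 8)
    return inode


print(key_inode_unpacked(KEY_BYTES) == BCACHE_ROOT_INO)
```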

You'll also need to do something like the mergesort algorithm (or something
equivalent; you don't need to do the actual mergesort if you're just doing a
linear search for one key). That is - if there are multiple bsets, they will
likely contain duplicates, and keys in newer bsets overwrite keys in older
bsets.
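
A minimal sketch of that linear search, in Python and under a simplified,
hypothetical data model: each bset is a list of (inode, value) pairs whose keys
have already been unpacked, and the bsets are given oldest first. It only
models "newer overwrites older" and nothing else about the btree node format.

```python
def lookup_newest(bsets, inode):
    """Linear search; bsets are ordered oldest first, so later hits win."""
    found = None
    for bset in bsets:
        for key_inode, value in bset:
            if key_inode == inode:
                # A match in a later (newer) bset replaces an earlier one.
                found = value
    return found


# Made-up example data, not taken from the hexdump.
older = [(4096, "root inode v1"), (4097, "file inode")]
newer = [(4096, "root inode v2")]
print(lookup_newest([older, newer], 4096))
```

Scanning every bset and keeping the last match avoids a real merge, which is
enough when you only want one key.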

> Could you help to explain what it is from 0xec070 to 0xed007?
> Are they also bsets?

Without knowing your block size and spending a fair amount of time staring at
the hexdump, I don't know what starts there - but quite possibly yes; bsets that
aren't at the start of the btree node are embedded in a struct
btree_node_entry, not a struct btree_node.

To tell if it's a valid bset, you compare bset->seq against the seq in the first
bset - it's a random number generated for each new btree node; if they match
then the bset there goes with that btree node.
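
The comparison itself can be sketched in a few lines of Python. The two byte
strings are copied from the hexdump at 0xec070 and 0xed008. Whether those
offsets really hold the seq fields of the first bset and of the later bset is
an assumption here, and the helper names are made up for illustration.

```python
import struct


def read_seq(raw):
    """Decode 8 raw bytes as a little-endian 64-bit sequence number."""
    (seq,) = struct.unpack("<Q", raw)
    return seq


def bset_belongs_to_node(first_seq_raw, candidate_seq_raw):
    """A candidate bset goes with the node only if its seq matches the first."""
    return read_seq(first_seq_raw) == read_seq(candidate_seq_raw)


# Assumed seq of the first bset: the 8 bytes at 0xec070.
FIRST_SEQ = bytes.fromhex("88b538e24536ebf6")
# Assumed seq of the later bset: the 8 bytes at 0xed008.
CANDIDATE_SEQ = bytes.fromhex("88b538e24536ebf6")

print(bset_belongs_to_node(FIRST_SEQ, CANDIDATE_SEQ))
```

If the candidate bytes were all zero, or left over from an older node that used
the same space, the values would differ and the scan should stop there.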
