Re: Crazy idea of cleanup the inode_record btrfsck things with SQL?

From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: Zygo Blaxell <zblaxell@furryterror.org>
Cc: Robert White <rwhite@pobox.com>,
	linux-btrfs <linux-btrfs@vger.kernel.org>,
	David Sterba <dsterba@suse.cz>
Subject: Re: Crazy idea of cleanup the inode_record btrfsck things with SQL?
Date: Thu, 11 Dec 2014 10:05:20 +0800
Message-ID: <5488FBE0.7060309@cn.fujitsu.com>
In-Reply-To: <20141210215729.GC22023@hungrycats.org>


-------- Original Message --------
Subject: Re: Crazy idea of cleanup the inode_record btrfsck things with SQL?
From: Zygo Blaxell <zblaxell@furryterror.org>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>
Date: 2014-12-11 05:57
> On Thu, Dec 04, 2014 at 02:56:55PM +0800, Qu Wenruo wrote:
>> The main memory usage in btrfsck is the extent records, which
>> we can't free until we have read them all in and checked them, so even if we
>> mmap/unmap, it can only help with
>> the extent_buffer (which is already freed if not used, according to refs).
> I'm thinking aloud here, but is it *really* necessary to read everything
> into memory?
Totally agree that we should read only what we need.
But some backrefs and ref counts can only be determined after a full
scan, especially in the leaf/node corruption case.
>    Maybe a multiple-pass algorithm might be possible, e.g. one
> to find free space by eliminating any areas that are occupied by extents,
> then other passes to rebuild the metadata in the free space.  Or, one
> pass to verify the connectivity of references and collect dangling refs,
> then a second pass which fixes only the dangling refs.
I have a similar idea, though not a multi-pass method: instead, a
per-sector scan plus tree search for the other data.
E.g. in the extent tree check, record only the extents of one block
group at a time and check them.
After the check, remove the good extents/block groups and then move on
to the next block group.
For the fs tree, treat all keys with the same objectid (ino) as a group,
read only one group at a time, and remove the records already known to
be healthy. (Records whose info is not fully gathered, and bad records,
will still stay in memory.)
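
A rough sketch of the group-at-a-time idea, in Python rather than the C
of btrfsck, and only under stated assumptions: records arrive sorted by
group (as tree items ordered by key would), and a record's health can be
judged from its own group alone. The second assumption is exactly what
fails for backrefs that need a full scan. The names check_in_groups,
group_key, is_healthy and BLOCK_GROUP_SIZE are made up for illustration
and are not btrfs-progs APIs.

```python
from typing import Callable, Hashable, Iterable, List, Tuple, TypeVar

Record = TypeVar("Record")


def check_in_groups(
    records: Iterable[Record],
    group_key: Callable[[Record], Hashable],
    is_healthy: Callable[[Record], bool],
) -> Tuple[List[Record], int]:
    """Check one group at a time and keep only the unresolved records.

    Returns (records still held at the end, peak records held at once).
    """
    pending: List[Record] = []  # bad records that must stay in memory
    current: List[Record] = []  # records of the group being read
    current_key = None
    peak = 0

    def flush() -> None:
        # Drop the healthy records of the finished group, keep the rest.
        nonlocal current
        pending.extend(r for r in current if not is_healthy(r))
        current = []

    for rec in records:
        key = group_key(rec)
        if current and key != current_key:
            flush()
        current_key = key
        current.append(rec)
        peak = max(peak, len(pending) + len(current))
    flush()
    return pending, peak


# Extent tree flavour: (bytenr, refs_found, refs_expected), grouped by a
# hypothetical and deliberately tiny block group size.
BLOCK_GROUP_SIZE = 1024
extents = [(0, 1, 1), (100, 2, 2), (1024, 1, 2), (1500, 1, 1), (2048, 3, 3)]
bad_extents, peak = check_in_groups(
    extents,
    group_key=lambda e: e[0] // BLOCK_GROUP_SIZE,
    is_healthy=lambda e: e[1] == e[2],
)

# fs tree flavour: (objectid, item_type, looks_ok), grouped by objectid (ino).
fs_items = [
    (256, "INODE_ITEM", True),
    (256, "DIR_ITEM", True),
    (257, "INODE_ITEM", True),
    (257, "EXTENT_DATA", False),
]
bad_items, _ = check_in_groups(
    fs_items,
    group_key=lambda i: i[0],
    is_healthy=lambda i: i[2],
)
```

Peak memory here is bounded by the largest group plus whatever is still
unresolved, so the saving depends entirely on how many records can be
declared healthy without looking outside their group.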

But I don't think this method can really save much memory, though...
>
> Usually sequential reads are significantly faster than swapping--even
> if swapping on solid-state media.  It could be that reading 260GB of
> metadata sequentially two or three times is still faster than thrashing
> through random lookups in 20GB of swap on a 4GB machine.
>
Definitely, but if we want to reduce memory usage, it is almost
unavoidable to do more disk IO, especially random disk IO. So it
becomes a tradeoff, which may make the already slow fsck even
slower....
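
To put the tradeoff in rough numbers, here is a back-of-envelope model.
Only the 260GB metadata size and the "two or three passes" come from the
mail above; the device throughput and IOPS figures are pure assumptions
for a single spinning disk, and the function names are hypothetical.

```python
def sequential_seconds(total_bytes: float, passes: int, bytes_per_sec: float) -> float:
    """Time to stream the whole metadata `passes` times."""
    return passes * total_bytes / bytes_per_sec


def random_seconds(page_faults: float, iops: float) -> float:
    """Time spent if every page fault costs one random I/O."""
    return page_faults / iops


def breakeven_faults(total_bytes: float, passes: int,
                     bytes_per_sec: float, iops: float) -> float:
    """Page faults at which swapping costs as much as the sequential passes."""
    return sequential_seconds(total_bytes, passes, bytes_per_sec) * iops


METADATA_BYTES = 260e9   # from the mail above
PASSES = 3               # "two or three times"
SEQ_BYTES_PER_SEC = 150e6  # ASSUMPTION: sequential read speed
RANDOM_IOPS = 100          # ASSUMPTION: random 4KiB reads per second

seq_time = sequential_seconds(METADATA_BYTES, PASSES, SEQ_BYTES_PER_SEC)
faults = breakeven_faults(METADATA_BYTES, PASSES, SEQ_BYTES_PER_SEC, RANDOM_IOPS)
print(f"sequential passes: {seq_time / 3600:.1f} h")
print(f"break-even page faults: {faults:.0f}")
```

With different device numbers (e.g. an SSD) the break-even point moves a
lot, so this only shows the shape of the tradeoff, not a measurement.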

Thanks,
Qu
