From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>, linux-btrfs@vger.kernel.org
Cc: dsterba@suse.cz
Subject: Re: [PATCH RFC 00/16] Introduce low memory usage btrfsck mode
Date: Tue, 26 Apr 2016 09:38:13 -0400
Message-ID: <571F6F45.9040201@gmail.com>
In-Reply-To: <1461642543-4621-1-git-send-email-quwenruo@cn.fujitsu.com>

On 2016-04-25 23:48, Qu Wenruo wrote:
> The branch can be fetched from my github:
> https://github.com/adam900710/btrfs-progs.git low_mem_fsck_rebasing
>
> The original btrfsck checks the extent tree very efficiently: it records
> every checked extent in an extent record tree, so that each extent is
> iterated at most twice.
>
> However, extent records are all stored in heap memory. Considering how
> large a btrfs file system can be, they can easily eat up all memory and
> cause OOM for TB-sized metadata.
>
> To avoid this heap memory usage, we introduce a low memory usage fsck
> mode.
>
> In this mode, we will use btrfs_search_slot() only and avoid any heap
> memory allocation.
>
> The work flow is:
> 1) Iterate extent tree (backref check)
>     And check whether the referencer of every backref exists.
>
> 2) Iterate other trees (forward ref check)
>     And check whether the backref of every tree block/data exists in
>     extent tree.
>
> So in theory, every extent is iterated twice, just as in the original
> mode. But since we keep no extent records and instead call
> btrfs_search_slot() every time we check, it will cause extra IO.
>
> I assume the extra IO is reasonable and should make btrfsck able to
> handle super large fs.
>
> TODO features:
> 1) Repair
>     Repair should work the same as in the old btrfsck, but we still need
>     to determine the repair principle.
>     The current repair sometimes uses the backref to repair the data
>     extent, and sometimes uses the data extent to fix the backref.
>     We need a consistent principle, or we will screw things up.
>
> 2) Replace current fsck code
>     We assume the low memory mode has fewer lines of code, and may be
>     easier to review and extend.
>
>     If the low memory mode is stable enough, we will consider replacing
>     the current extent and chunk tree check code to remove a lot of lines.
>
> 3) Further code refining
>     Reduce duplicated code
>
> 4) Unify output
>     Make the output of low-memory mode the same as the normal one.
>
> Lu Fengqi (16):
>    btrfs-progs: fsck: Introduce function to check tree block backref in
>      extent tree
>    btrfs-progs: fsck: Introduce function to check data backref in extent
>      tree
>    btrfs-progs: fsck: Introduce function to query tree block level
>    btrfs-progs: fsck: Introduce function to check referencer of a backref
>    btrfs-progs: fsck: Introduce function to check shared block ref
>    btrfs-progs: fsck: Introduce function to check referencer for data
>      backref
>    btrfs-progs: fsck: Introduce function to check shared data backref
>    btrfs-progs: fsck: Introduce function to check an extent
>    btrfs-progs: fsck: Introduce function to check dev extent item
>    btrfs-progs: fsck: Introduce function to check dev used space
>    btrfs-progs: fsck: Introduce function to check block group item
>    btrfs-progs: fsck: Introduce function to check chunk item
>    btrfs-progs: fsck: Introduce hub function for later fsck
>    btrfs-progs: fsck: Introduce function to speed up fs tree check
>    btrfs-progs: fsck: Introduce traversal function for fsck
>    btrfs-progs: fsck: Introduce low memory mode
>
>   Documentation/btrfs-check.asciidoc |    2 +
>   cmds-check.c                       | 1667 +++++++++++++++++++++++++++++++++---
>   ctree.h                            |    2 +
>   extent-tree.c                      |    2 +-
>   4 files changed, 1536 insertions(+), 137 deletions(-)
>
I don't really have a stock of broken FS images to test this with, but
I've checked it against known good ones and it correctly identifies them
as good. I've tested all the profiles except raid5 and raid6, in both
normal and mixed-bg variants, with all combinations of profiles between
data and metadata, and with 2-8 devices for the multi-device levels.
Most of the filesystems involved were on LVM thinp storage with mostly
sparse files.

It also properly repairs the couple of broken filesystems I can make by
hand (mostly stuff with orphaned inodes or bad ref-counts) in the same
way the existing code repairs them, all while using measurably less
memory as advertised, so you can add:

Tested-by: Austin S. Hemmelgarn <ahferroin7@gmail.com>
