From: Josef Bacik <josef@toxicpanda.com>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PoC PATCH 00/11] btrfs: scrub: rework to get rid of the complex bio formshaping
Date: Tue, 13 Dec 2022 17:08:50 -0500 [thread overview]
Message-ID: <Y5j38sRW7mZlAmZk@localhost.localdomain> (raw)
In-Reply-To: <cover.1670314744.git.wqu@suse.com>
On Tue, Dec 06, 2022 at 04:23:27PM +0800, Qu Wenruo wrote:
> This is a proof-of-concept patchset, thus don't merge.
>
> This series is mostly a full rework of low level scrub code.
>
> Although I implemented the full support for all profiles, only raid56
> code is partially cleaned up (which is already over 1K lines removed).
> The estimated full cleanup will be around 1~2K more lines removed
> eventually.
>
> The series is sent out for feedback, as the full patchset can be very
> large (mostly to remove old codes).
>
> [PROBLEMS OF SCRUB]
>
> TL;DR
> The current scrub code is way too complex for future expansion.
>
> Current scrub code has a complex system to manage its in-flight bios.
>
> This behavior is designed to improve scrub performance, but has a lot of
> disadvantage too:
>
> - Too many indirect calls/jumps
>
> To scrub one extent in a simple mirror (like SINGLE), the call chain
> involves the following functions:
>
> /* Before entering scrub_simple_mirror() */
> scrub_ctx()
> |- INIT_WORK(&sbio->work, scrub_bio_end_io_worker);
>
> /* In scrub_simple_mirror() */
> scrub_extent()
> |- scrub_sectors()
> |- scrub_add_sector_to_rd_bio()
> |- sbio->bio->bi_end_io = scrub_bio_end_io;
>
> /* Now it jumps to the endio function */
>
> scrub_bio_end_io()
> |- queue_work()
>
> /* Now it jumps to workqueue, which is setup in scrub_ctx() */
> scrub_bio_end_io_worker()
> |- scrub_block_complete()
> |- scrub_handle_errored_block() /* For corruption case */
> | |- scrub_recheck_block()
> | |- scrub_repair_block_from_good_copy()
> |- scrub_checksum()
> |- scrub_write_block_to_dev_replace()
>
> The whole jumps/delayed calls are really a mess to read.
> This makes me wonder if the original code is really designed for human
> to read.
>
> - Complex bio form-shaping
> We have scrub_bio to manage the in-flight bios.
>
> It has at most 64 bios for one device scrub, and each bio can be as
> large as 128K.
>
> This is definitely a big performance enhancement.
>
> But I'm not convinced that scrub performance should be the first thing
> to consider.
>
> - No usage of btrfs_submit_bio()
> This means we're doing a lot of things which can already be handled by
> btrfs_submit_bio().
>
> This means quite some duplicated code.
>
> [ENHANCEMENT]
>
> This patchset will introduce a new infrasturcture, scrub2_stripe.
>
> The "scrub2" prefix is to avoid naming conflicts.
>
> The overall idea is, we always do scrub in the unit of BTRFS_STRIPE_LEN,
> hence the "scrub2_stripe".
>
> Furthermore, all work will done in a submit-and-wait fashion, reducing
> the delayed calls/jumps to minimal.
>
> The new scrub entrance in scrub_simple_mirror() would looks like this:
>
> scrub_simple_mirror()
> | /* Find a stripe with at least one sector used */
> |- scrub2_find_fill_first_stripe()
> |
> | /* Submit a read for the whole 64KiB */
> |- scrub2_read_and_wait_stripe()
> | |- btrfs_submit_bio()
> | | /* do the verification for all sectors */
> | |- scrub2_verify_one_stripe()
> |
> |- scrub2_reapair_one_stripe()
> | |- scrub2_repair_from_mirror()
> | | /* reuse the existing read path */
> | |- scrub2_read_and_wait_stripe()
> |
> |- scrub2_report_errors()
> |
> | /*
> | * For dev-replace and regular scrub repair, the write range
> | * will be different.
> | * (replace will writeback all good sectors, repair only writes
> | * back repaired sectors).
> | */
> |- scrub2_writeback_sectors()
>
> Everything is done in a submit-and-wait fashion.
>
> Although this will reduce concurrency, the readability will be greatly
> improved.
>
> Furthermore since we're already submitting read in 64KiB size, it's less
> IOPS intense, thus it's already doing optimization for read.
>
> Even if the performance is not that good, it can be enhanced later by
> using scrub2_stripe_group to submit multiple stripes in one go (aka,
> enlarge the block size) to improve performance, without damaging
> readability.
>
> [CURRENT FEATURES]
>
> - Working scrub/replace for all profiles
> Currently only non-raid56 is tested.
>
> [TODO]
>
> There are a lot of todos:
>
> - Completely remove scrub_bio/scrub_block/scrub_sector
> I estimate that would result further 1~2K lines removed.
>
> Unfortunately, thus huge cleanup will take way more patches than
> usual.
>
> - Add proper support for zoned devices
> Currently zoned code is not touched, but existing zoned code relies on
> scrub_bio.
> Thus they won't work at all.
>
> - Proper performance benchmarking
I looked through everything, I don't see any glaring problems. I definitely
would like to see a decent comment at the top of scrub.c detailing the behavior,
similar to delalloc-space.c or space-info.c.
I do not love the scrub2 thing, but if the entire patchset ended with
s/scrub2/scrub/g then I suppose that would be ok. Ditto for exporting functions
that aren't prefix'ed with btrfs_. If you're going to eventually come through
and clean that up then by all means do this, I just would want to see it be
properly cleaned at the end.
I didn't pay too close attention to the code, the missing parts like zoned and
stuff are enough that I don't want to devote too much attention to code that is
likely to change between now and it's final form. Your design makes sense, I
don't care about scrub performance in general, as long as it's not unusably slow
I'd happily trade performance for readability and better maintainability. But I
definitely want the performance changes described, so we know what we're paying
for. Thanks,
Josef
prev parent reply other threads:[~2022-12-13 22:08 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-06 8:23 [PoC PATCH 00/11] btrfs: scrub: rework to get rid of the complex bio formshaping Qu Wenruo
2022-12-06 8:23 ` [PoC PATCH 01/11] btrfs: scrub: introduce the structure for new BTRFS_STRIPE_LEN based interface Qu Wenruo
2022-12-06 8:23 ` [PoC PATCH 02/11] btrfs: scrub: introduce a helper to find and fill the sector info for a scrub2_stripe Qu Wenruo
2022-12-06 8:23 ` [PoC PATCH 03/11] btrfs: scrub: introduce a helper to verify one scrub2_stripe Qu Wenruo
2022-12-06 8:23 ` [PoC PATCH 04/11] btrfs: scrub: add the repair function for scrub2_stripe Qu Wenruo
2022-12-06 8:23 ` [PoC PATCH 05/11] btrfs: scrub: add a writeback helper " Qu Wenruo
2022-12-06 8:45 ` Christoph Hellwig
2022-12-06 8:23 ` [PoC PATCH 06/11] btrfs: scrub: add the error reporting " Qu Wenruo
2022-12-06 18:48 ` kernel test robot
2022-12-06 8:23 ` [PoC PATCH 07/11] btrfs: scrub: add raid56 P/Q scrubbing support Qu Wenruo
2022-12-27 10:45 ` kernel test robot
2022-12-06 8:23 ` [PoC PATCH 08/11] btrfs: scrub: use dedicated super block verification function to scrub one super block Qu Wenruo
2022-12-06 8:23 ` [PoC PATCH 09/11] btrfs: scrub: switch to the new scrub2_stripe based infrastructure Qu Wenruo
2022-12-06 8:23 ` [PoC PATCH 10/11] btrfs: scrub: cleanup the old scrub_parity infrastructure Qu Wenruo
2022-12-06 8:23 ` [PoC PATCH 11/11] btrfs: scrub: cleanup scrub_extent() and its related functions Qu Wenruo
2022-12-06 8:40 ` [PoC PATCH 00/11] btrfs: scrub: rework to get rid of the complex bio formshaping Christoph Hellwig
2022-12-06 9:29 ` Qu Wenruo
2022-12-13 22:08 ` Josef Bacik [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y5j38sRW7mZlAmZk@localhost.localdomain \
--to=josef@toxicpanda.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.