From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Jani Partanen <jiipee@sotapeli.fi>, Qu Wenruo <wqu@suse.com>,
linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 0/5] btrfs: scrub: improve the scrub performance
Date: Wed, 2 Aug 2023 06:06:31 +0800
Message-ID: <48dea2d4-42ba-50bc-d955-9aa4a8082c7e@gmx.com>
In-Reply-To: <2d45a042-0c01-1026-1ced-0d8fdd026891@sotapeli.fi>
On 2023/8/2 04:14, Jani Partanen wrote:
> Hello, I did some testing with 4 x 320GB HDDs. Metadata is raid1c4 and
> data is raid5.
RAID5 has other problems related to scrub performance unfortunately.
>
> Kernel 6.3.12
>
> btrfs scrub start -B /dev/sdb
>
> scrub done for 6691cb53-271b-4abd-b2ab-143c41027924
> Scrub started: Tue Aug 1 04:00:39 2023
> Status: finished
> Duration: 2:37:35
> Total to scrub: 149.58GiB
> Rate: 16.20MiB/s
> Error summary: no errors found
>
>
> Kernel 6.5.0-rc3
>
> btrfs scrub start -B /dev/sdb
>
> scrub done for 6691cb53-271b-4abd-b2ab-143c41027924
> Scrub started: Tue Aug 1 08:41:12 2023
> Status: finished
> Duration: 1:31:03
> Total to scrub: 299.16GiB
> Rate: 56.08MiB/s
> Error summary: no errors found
>
>
> So the speed is much better, but the Total to scrub reporting seems strange.
>
> df -h /dev/sdb
> Filesystem Size Used Avail Use% Mounted on
> /dev/sdb 1,2T 599G 292G 68% /mnt
>
>
> Looks like the old kernel scrubbed about 1/4 of the total data, which
> seems right because I have 4 drives.
>
> The new kernel scrubbed about 1/2 of the total data, which seems wrong.
I checked the kernel part of the progress reporting: for a single-device
scrub on RAID56, a data stripe contributes to the scrubbed bytes, but a
P/Q stripe should not contribute to the value.
Thus 1/4 should be the correct value.
However, there is another factor in btrfs-progs, which determines how the
numbers are reported.
There is a fix for that already merged in v6.3.2, but it seems to have
other problems involved.
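To illustrate the intended kernel-side accounting (a minimal sketch only;
the helper names below are hypothetical and this is not the actual
fs/btrfs/scrub.c code):

    /*
     * Sketch: for a single-device RAID56 scrub, only data stripes are
     * expected to add to the reported progress; P/Q stripes are still
     * verified but not counted.
     */
    if (stripe_is_parity(stripe)) {         /* hypothetical helper */
            verify_parity_stripe(stripe);   /* hypothetical helper */
    } else {
            verify_data_stripe(stripe);     /* hypothetical helper */
            sctx->stat.data_bytes_scrubbed += stripe_len;
    }

With that accounting, a per-device scrub on a 4-disk RAID5 should end up
around 1/4 of the data, matching what the old kernel reported.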
>
> And if I do scrub against mount point:
>
> btrfs scrub start -B /mnt/
> scrub done for 6691cb53-271b-4abd-b2ab-143c41027924
> Scrub started: Tue Aug 1 11:03:56 2023
> Status: finished
> Duration: 10:02:44
> Total to scrub: 1.17TiB
> Rate: 33.89MiB/s
> Error summary: no errors found
>
>
> Then performance goes down the toilet, and now the Total to scrub
> reporting is about 2:1 compared to the used data.
>
> btrfs version
> btrfs-progs v6.3.3
>
> Is it btrfs-progs issue with reporting?
Can you try with the -BdR options?
They show the raw numbers, which is the easiest way to determine whether
it's a bug in btrfs-progs or in the kernel.
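For example (assuming your btrfs-progs v6.3.x here; -d prints per-device
statistics and -R prints the raw counters instead of the summary):

    btrfs scrub start -BdR /dev/sdb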
> What about raid 5 scrub
> performance, why it is so bad?
It's explained in this cover letter:
https://lore.kernel.org/linux-btrfs/cover.1688368617.git.wqu@suse.com/
In short, a RAID56 full-filesystem scrub causes too many duplicated reads,
and the root cause is that per-device scrub is never a good idea for
RAID56.
That's why I'm trying to introduce a new scrub flag for that.
Thanks,
Qu
>
>
> About the disks, they are old WD Blue drives which can do about 100MB/s
> read/write on average.
>
>
> On 28/07/2023 14.14, Qu Wenruo wrote:
>> [REPO]
>> https://github.com/adam900710/linux/tree/scrub_testing
>>
>> [CHANGELOG]
>> v1:
>> - Rebased to latest misc-next
>>
>> - Rework the read IO grouping patch
>> David has found some crashes, mostly related to the scrub performance
>> fixes; meanwhile the original grouping patch had one extra flag,
>> SCRUB_FLAG_READ_SUBMITTED, to avoid double submission.
>>
>> But this flag can be avoided, as we can easily prevent double submission
>> just by properly checking the sctx->nr_stripe variable.
>>
>> This reworked grouping read IO patch should be safer compared to the
>> initial version, with better code structure.
>>
>> Unfortunately, the final performance is worse than the initial version
>> (2.2GiB/s vs 2.5GiB/s), but it should be less racy and thus safer.
>>
>> - Re-order the patches
>> The first 3 patches are the main fixes, and I put the safer patches
>> first, so even if David still finds a crash at a certain patch, the
>> remaining ones can be dropped if needed.
>>
>> There is a huge scrub performance drop introduced by the v6.4 kernel:
>> scrub performance is only around 1/3 of the previous level for large
>> data extents.
>>
>> There are several causes:
>>
>> - Missing blk plug
>> This means read requests won't be merged by the block layer, which can
>> hugely reduce the read performance.
>>
>> - Extra time spent on extent/csum tree search
>> This includes extra path allocation/freeing and tree searches.
>> It is especially obvious for large data extents: previously we did
>> only one csum search per 512K, but now we do one csum search per
>> 64K, an 8x increase in csum tree searches.
>>
>> - Less concurrency
>> Mostly due to the fact that we're doing submit-and-wait, thus a much
>> lower queue depth, which hurts devices like NVMe that benefit a lot
>> from high concurrency.
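(On the "missing blk plug" point above: a minimal sketch of the plug
pattern, in the spirit of the fix; the loop and helper names here are
illustrative, not the actual scrub code.)

    /* blk_start_plug()/blk_finish_plug() come from <linux/blkdev.h>. */
    struct blk_plug plug;
    int i;

    blk_start_plug(&plug);
    /*
     * While the plug is held, the block layer can merge the small
     * adjacent read bios submitted below into larger requests;
     * without it, each bio may be sent to the device on its own.
     */
    for (i = 0; i < nr_stripes; i++)
            submit_stripe_read(sctx, &sctx->stripes[i]); /* illustrative helper */
    blk_finish_plug(&plug);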
>>
>> The first 3 patches greatly improve the scrub read performance, but
>> unfortunately it's still not as fast as the pre-6.4 kernels
>> (2.2GiB/s vs 3.0GiB/s). It is still much better than the 6.4 kernels
>> (2.2GiB/s vs 1.0GiB/s), though.
>>
>> Qu Wenruo (5):
>> btrfs: scrub: avoid unnecessary extent tree search preparing stripes
>> btrfs: scrub: avoid unnecessary csum tree search preparing stripes
>> btrfs: scrub: fix grouping of read IO
>> btrfs: scrub: don't go ordered workqueue for dev-replace
>> btrfs: scrub: move write back of repaired sectors into
>> scrub_stripe_read_repair_worker()
>>
>> fs/btrfs/file-item.c | 33 +++---
>> fs/btrfs/file-item.h | 6 +-
>> fs/btrfs/raid56.c | 4 +-
>> fs/btrfs/scrub.c | 234 ++++++++++++++++++++++++++-----------------
>> 4 files changed, 169 insertions(+), 108 deletions(-)
>>