From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Jani Partanen <jiipee@sotapeli.fi>, Qu Wenruo <wqu@suse.com>,
linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 0/5] btrfs: scrub: improve the scrub performance
Date: Wed, 2 Aug 2023 06:06:31 +0800
Message-ID: <48dea2d4-42ba-50bc-d955-9aa4a8082c7e@gmx.com>
In-Reply-To: <2d45a042-0c01-1026-1ced-0d8fdd026891@sotapeli.fi>
On 2023/8/2 04:14, Jani Partanen wrote:
> Hello, I did some testing with 4 x 320GB HDDs. Metadata is raid1c4 and
> data is raid5.
RAID5 has other problems related to scrub performance unfortunately.
>
> Kernel 6.3.12
>
> btrfs scrub start -B /dev/sdb
>
> scrub done for 6691cb53-271b-4abd-b2ab-143c41027924
> Scrub started: Tue Aug 1 04:00:39 2023
> Status: finished
> Duration: 2:37:35
> Total to scrub: 149.58GiB
> Rate: 16.20MiB/s
> Error summary: no errors found
>
>
> Kernel 6.5.0-rc3
>
> btrfs scrub start -B /dev/sdb
>
> scrub done for 6691cb53-271b-4abd-b2ab-143c41027924
> Scrub started: Tue Aug 1 08:41:12 2023
> Status: finished
> Duration: 1:31:03
> Total to scrub: 299.16GiB
> Rate: 56.08MiB/s
> Error summary: no errors found
>
>
> So the speed is much better, but the Total to scrub reporting seems strange.
>
> df -h /dev/sdb
> Filesystem Size Used Avail Use% Mounted on
> /dev/sdb 1,2T 599G 292G 68% /mnt
>
>
> Looks like the old kernel scrubbed about 1/4 of the total data, which
> seems right because I have 4 drives.
>
> The new kernel scrubbed about 1/2 of the total data, which seems wrong.
I checked the kernel part of the progress reporting: for a single-device
scrub on RAID56, a data stripe contributes to the scrubbed bytes, but a
P/Q stripe should not contribute to the value.
Thus 1/4 should be the correct value.
However, there is another factor in btrfs-progs, which determines how the
numbers are reported.
There is a fix for that already merged in v6.3.2, but it seems to have
other problems involved.
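To illustrate the intended kernel-side accounting (a minimal sketch only;
the helper names below are hypothetical and this is not the actual
fs/btrfs/scrub.c code):

    /*
     * Sketch: for a single-device RAID56 scrub, only data stripes are
     * expected to add to the reported progress; P/Q stripes are still
     * verified but not counted.
     */
    if (stripe_is_parity(stripe)) {         /* hypothetical helper */
            verify_parity_stripe(stripe);   /* hypothetical helper */
    } else {
            verify_data_stripe(stripe);     /* hypothetical helper */
            sctx->stat.data_bytes_scrubbed += stripe_len;
    }

With that accounting, a per-device scrub on a 4-disk RAID5 should end up
around 1/4 of the data, matching what the old kernel reported.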
>
> And if I do scrub against mount point:
>
> btrfs scrub start -B /mnt/
> scrub done for 6691cb53-271b-4abd-b2ab-143c41027924
> Scrub started: Tue Aug 1 11:03:56 2023
> Status: finished
> Duration: 10:02:44
> Total to scrub: 1.17TiB
> Rate: 33.89MiB/s
> Error summary: no errors found
>
>
> Then performance goes down the toilet, and now the Total to scrub
> reporting is about 2:1 compared to the used data.
>
> btrfs version
> btrfs-progs v6.3.3
>
> Is it btrfs-progs issue with reporting?
Can you try with the -BdR options?
They show the raw numbers, which is the easiest way to determine whether
it's a bug in btrfs-progs or in the kernel.
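For example (assuming your btrfs-progs v6.3.x here; -d prints per-device
statistics and -R prints the raw counters instead of the summary):

    btrfs scrub start -BdR /dev/sdb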
> What about raid 5 scrub
> performance, why it is so bad?
It's explained in this cover letter:
https://lore.kernel.org/linux-btrfs/cover.1688368617.git.wqu@suse.com/
In short, a RAID56 full-filesystem scrub causes too many duplicated reads,
and the root cause is that per-device scrub is never a good idea for
RAID56.
That's why I'm trying to introduce a new scrub flag for that.
Thanks,
Qu
>
>
> About the disks, they are old WD Blue drives which can do about 100MB/s
> read/write on average.
>
>
> On 28/07/2023 14.14, Qu Wenruo wrote:
>> [REPO]
>> https://github.com/adam900710/linux/tree/scrub_testing
>>
>> [CHANGELOG]
>> v1:
>> - Rebased to latest misc-next
>>
>> - Rework the read IO grouping patch
>> David has found some crashes, mostly related to the scrub performance
>> fixes; meanwhile the original grouping patch had one extra flag,
>> SCRUB_FLAG_READ_SUBMITTED, to avoid double submission.
>>
>> But this flag can be avoided, as we can easily prevent double submission
>> just by properly checking the sctx->nr_stripe variable.
>>
>> This reworked grouping read IO patch should be safer compared to the
>> initial version, with better code structure.
>>
>> Unfortunately, the final performance is worse than the initial version
>> (2.2GiB/s vs 2.5GiB/s), but it should be less racy and thus safer.
>>
>> - Re-order the patches
>> The first 3 patches are the main fixes, and I put the safer patches
>> first, so even if David still finds a crash at a certain patch, the
>> remaining ones can be dropped if needed.
>>
>> There is a huge scrub performance drop introduced by the v6.4 kernel:
>> scrub performance is only around 1/3 of the previous level for large
>> data extents.
>>
>> There are several causes:
>>
>> - Missing blk plug
>> This means read requests won't be merged by the block layer, which can
>> hugely reduce the read performance.
>>
>> - Extra time spent on extent/csum tree search
>> This includes extra path allocation/freeing and tree searches.
>> It is especially obvious for large data extents: previously we did
>> only one csum search per 512K, but now we do one csum search per
>> 64K, an 8x increase in csum tree searches.
>>
>> - Less concurrency
>> Mostly due to the fact that we're doing submit-and-wait, thus a much
>> lower queue depth, which hurts devices like NVMe that benefit a lot
>> from high concurrency.
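(On the "missing blk plug" point above: a minimal sketch of the plug
pattern, in the spirit of the fix; the loop and helper names here are
illustrative, not the actual scrub code.)

    /* blk_start_plug()/blk_finish_plug() come from <linux/blkdev.h>. */
    struct blk_plug plug;
    int i;

    blk_start_plug(&plug);
    /*
     * While the plug is held, the block layer can merge the small
     * adjacent read bios submitted below into larger requests;
     * without it, each bio may be sent to the device on its own.
     */
    for (i = 0; i < nr_stripes; i++)
            submit_stripe_read(sctx, &sctx->stripes[i]); /* illustrative helper */
    blk_finish_plug(&plug);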
>>
>> The first 3 patches greatly improve the scrub read performance, but
>> unfortunately it's still not as fast as the pre-6.4 kernels
>> (2.2GiB/s vs 3.0GiB/s). It is still much better than the 6.4 kernels
>> (2.2GiB/s vs 1.0GiB/s), though.
>>
>> Qu Wenruo (5):
>> btrfs: scrub: avoid unnecessary extent tree search preparing stripes
>> btrfs: scrub: avoid unnecessary csum tree search preparing stripes
>> btrfs: scrub: fix grouping of read IO
>> btrfs: scrub: don't go ordered workqueue for dev-replace
>> btrfs: scrub: move write back of repaired sectors into
>> scrub_stripe_read_repair_worker()
>>
>> fs/btrfs/file-item.c | 33 +++---
>> fs/btrfs/file-item.h | 6 +-
>> fs/btrfs/raid56.c | 4 +-
>> fs/btrfs/scrub.c | 234 ++++++++++++++++++++++++++-----------------
>> 4 files changed, 169 insertions(+), 108 deletions(-)
>>