From: Wang Yugui <wangyugui@e16-tech.com>
To: Naohiro Aota <naohiro.aota@wdc.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 0/2] btrfs: disable inline checksum for multi-dev striped FS
Date: Mon, 29 Jan 2024 20:56:22 +0800
Message-ID: <20240129205621.1BA8.409509F4@e16-tech.com>
In-Reply-To: <20240124081931.1DDE.409509F4@e16-tech.com>
Hi,
> Hi,
>
> > There was a report of write performance regression on 6.5-rc4 on RAID0
> > (4 devices) btrfs [1]. Then, I reported that BTRFS_FS_CSUM_IMPL_FAST
> > and doing the checksum inline can be bad for performance on RAID0
> > setup [2].
> >
> > [1] https://lore.kernel.org/linux-btrfs/20230731152223.4EFB.409509F4@e16-tech.com/
> > [2] https://lore.kernel.org/linux-btrfs/p3vo3g7pqn664mhmdhlotu5dzcna6vjtcoc2hb2lsgo2fwct7k@xzaxclba5tae/
> >
> > While inlining the fast checksum is good for single (or two) device,
> > but it is not fast enough for multi-device striped writing.
> >
> > So, this series first introduces fs_devices->inline_csum_mode and its
> > sysfs interface to tweak the inline csum behavior (auto/on/off). Then,
> > it disables inline checksum when it find a block group striped writing
> > into multiple devices.
>
> We have struct btrfs_inode | sync_writers in kernel 6.1.y, but dropped in recent
> kernel.
>
> Is btrfs_inode | sync_writers not implemented very well?
I tried the logic below, which is similar to 'btrfs_inode | sync_writers':
- checksum of metadata is always sync
- checksum of data is async only when the depth is at least 1
  (another data checksum is already being submitted),
  to reduce task switches under low load and
  to use more CPU cores under high load.

The performance test results are not good:
  2GiB/s (checksum of data always async) -> 2.1GiB/s under low load.
  4GiB/s (checksum of data always async) -> 2788MiB/s under high load.

But the info may be useful, so I am posting it here.
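
The decision rule described above can be sketched in userspace C as follows. This is only an illustrative model, not the kernel code: `struct csum_state`, `should_async_csum()` and `csum_selfcheck()` are hypothetical names, `is_meta` stands in for `bio.bi_opf & REQ_META`, and the sketch assumes a fast checksum implementation (BTRFS_FS_CSUM_IMPL_FAST set) and ignores the other conditions in should_async_write().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the counter the patch adds to btrfs_fs_info. */
struct csum_state {
	atomic_int depth_checksum_data;	/* data checksum submissions in progress */
};

/*
 * Return true when the checksum should be handed to a worker thread.
 * Assumes a fast checksum implementation; is_meta models REQ_META.
 */
static bool should_async_csum(struct csum_state *st, bool is_meta)
{
	if (is_meta)
		return false;	/* metadata: always checksum inline */
	if (atomic_load(&st->depth_checksum_data) == 0)
		return false;	/* nothing else in progress: stay inline */
	return true;		/* another submission is in progress: offload */
}

/* Walks through the three cases; returns 0 when all behave as described. */
int csum_selfcheck(void)
{
	struct csum_state st;

	atomic_init(&st.depth_checksum_data, 0);
	if (should_async_csum(&st, false))
		return 1;	/* idle data write must be sync */
	atomic_fetch_add(&st.depth_checksum_data, 1);
	if (!should_async_csum(&st, false))
		return 2;	/* busy data write must be async */
	if (should_async_csum(&st, true))
		return 3;	/* metadata must stay sync even when busy */
	atomic_fetch_sub(&st.depth_checksum_data, 1);
	return 0;
}
```

Calling `csum_selfcheck()` exercises an idle data write (sync), a data write while another submission is in progress (async), and a metadata write (sync).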
- checksum of metadata always sync
diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
index 12b12443efaa..8ef968f0957d 100644
--- a/fs/btrfs/bio.c
+++ b/fs/btrfs/bio.c
@@ -598,7 +598,7 @@ static void run_one_async_free(struct btrfs_work *work)
 static bool should_async_write(struct btrfs_bio *bbio)
 {
 	/* Submit synchronously if the checksum implementation is fast. */
-	if (test_bit(BTRFS_FS_CSUM_IMPL_FAST, &bbio->fs_info->flags))
+	if ((bbio->bio.bi_opf & REQ_META) && test_bit(BTRFS_FS_CSUM_IMPL_FAST, &bbio->fs_info->flags))
 		return false;
 	/*
- checksum of data async only when depth is at least 1, to reduce task switches.
diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
index efb894967f55..f90b6e8cf53c 100644
--- a/fs/btrfs/bio.c
+++ b/fs/btrfs/bio.c
@@ -626,6 +626,9 @@ static bool should_async_write(struct btrfs_bio *bbio)
 	if ((bbio->bio.bi_opf & REQ_META) && btrfs_is_zoned(bbio->fs_info))
 		return false;
+	if (!(bbio->bio.bi_opf & REQ_META) && atomic_read(&bbio->fs_info->depth_checksum_data)==0 )
+		return false;
+
 	return true;
 }
@@ -725,11 +728,21 @@ static bool btrfs_submit_chunk(struct btrfs_bio *bbio, int mirror_num)
 		if (inode && !(inode->flags & BTRFS_INODE_NODATASUM) &&
 		    !test_bit(BTRFS_FS_STATE_NO_CSUMS, &fs_info->fs_state) &&
 		    !btrfs_is_data_reloc_root(inode->root)) {
-			if (should_async_write(bbio) &&
-			    btrfs_wq_submit_bio(bbio, bioc, &smap, mirror_num))
-				goto done;
-
+			if (should_async_write(bbio)){
+				if (!(bbio->bio.bi_opf & REQ_META))
+					atomic_inc(&bbio->fs_info->depth_checksum_data);
+				ret = btrfs_wq_submit_bio(bbio, bioc, &smap, mirror_num);
+				if (!(bbio->bio.bi_opf & REQ_META))
+					atomic_dec(&bbio->fs_info->depth_checksum_data);
+				if(ret)
+					goto done;
+			}
+
+			if (!(bbio->bio.bi_opf & REQ_META))
+				atomic_inc(&bbio->fs_info->depth_checksum_data);
 			ret = btrfs_bio_csum(bbio);
+			if (!(bbio->bio.bi_opf & REQ_META))
+				atomic_dec(&bbio->fs_info->depth_checksum_data);
 			if (ret)
 				goto fail_put_bio;
 		} else if (use_append) {
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index d7b127443c9a..3fd89be7610a 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2776,6 +2776,7 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info)
 	fs_info->thread_pool_size = min_t(unsigned long,
 					  num_online_cpus() + 2, 8);
+	atomic_set(&fs_info->depth_checksum_data, 0);
 	INIT_LIST_HEAD(&fs_info->ordered_roots);
 	spin_lock_init(&fs_info->ordered_root_lock);
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index 7443bf014639..123cc8fa9be1 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -596,6 +596,7 @@ struct btrfs_fs_info {
 	struct task_struct *transaction_kthread;
 	struct task_struct *cleaner_kthread;
 	u32 thread_pool_size;
+	atomic_t depth_checksum_data;
 	struct kobject *space_info_kobj;
 	struct kobject *qgroups_kobj;
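
For reference, the way the bio.c hunk brackets each data submission with atomic_inc()/atomic_dec() can be modelled in userspace C roughly as below. This is a sketch under assumptions, not kernel code: `struct depth_state`, `submit_bracketed()`, `struct depth_probe` and `probe_depth()` are hypothetical names, and the callback stands in for either btrfs_wq_submit_bio() or btrfs_bio_csum().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for fs_info->depth_checksum_data. */
struct depth_state {
	atomic_int depth;
};

typedef int (*submit_fn)(void *arg);

/*
 * Raise the depth counter for data (non-metadata) submissions only,
 * run the submission, then drop the counter again on return.
 */
static int submit_bracketed(struct depth_state *st, bool is_meta,
			    submit_fn fn, void *arg)
{
	int ret;

	if (!is_meta)
		atomic_fetch_add(&st->depth, 1);
	ret = fn(arg);
	if (!is_meta)
		atomic_fetch_sub(&st->depth, 1);
	return ret;
}

/* Hypothetical probe: records the depth visible while the call runs. */
struct depth_probe {
	struct depth_state *st;
	int seen;
};

static int probe_depth(void *arg)
{
	struct depth_probe *p = arg;

	p->seen = atomic_load(&p->st->depth);
	return 0;
}
```

In this pattern the counter covers only the duration of the bracketed call. If that call merely queues work and returns, as btrfs_wq_submit_bio() appears to do, the time a worker later spends on the checksum would not be counted; this is an observation about the sketch, not a measured result.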
Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2024/01/25