From: Christoph Hellwig <hch@infradead.org>
To: kernel test robot <oliver.sang@intel.com>
Cc: Christoph Hellwig <hch@lst.de>,
oe-lkp@lists.linux.dev, lkp@intel.com,
Jens Axboe <axboe@kernel.dk>,
Ulf Hansson <ulf.hansson@linaro.org>,
Damien Le Moal <dlemoal@kernel.org>,
Hannes Reinecke <hare@suse.de>,
linux-block@vger.kernel.org, linux-um@lists.infradead.org,
drbd-dev@lists.linbit.com, nbd@other.debian.org,
linuxppc-dev@lists.ozlabs.org, virtualization@lists.linux.dev,
xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org,
dm-devel@lists.linux.dev, linux-raid@vger.kernel.org,
linux-mmc@vger.kernel.org, linux-mtd@lists.infradead.org,
nvdimm@lists.linux.dev, linux-nvme@lists.infradead.org,
linux-scsi@vger.kernel.org, ying.huang@intel.com,
feng.tang@intel.com, fengwei.yin@intel.com
Subject: Re: [axboe-block:for-next] [block] 1122c0c1cc: aim7.jobs-per-min 22.6% improvement
Date: Tue, 25 Jun 2024 01:57:35 -0700 [thread overview]
Message-ID: <ZnqGf49cvy6W-xWf@infradead.org> (raw)
In-Reply-To: <202406250948.e0044f1d-oliver.sang@intel.com>
Hi Oliver,
can you test the patch below? It restores the previous behavior if
the device did not have a volatile write cache. I think at least
for raid0 and raid1 without bitmap the new behavior actually is correct
and better, but it will need fixes for other modes. If the underlying
devices did have a volatile write cache I'm a bit lost what the problem
was and this probably won't fix the issue.
---
From 81c816827197f811e14add7a79220ed9eef6af02 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Tue, 25 Jun 2024 08:48:18 +0200
Subject: md: set md-specific flags for all queue limits
The md driver wants to enforce a number of flags to an all devices, even
when not inheriting them from the underlying devices. To make sure these
flags survive the queue_limits_set calls that md uses to update the
queue limits without deriving them form the previous limits add a new
md_init_stacking_limits helper that calls blk_set_stacking_limits and sets
these flags.
Fixes: 1122c0c1cc71 ("block: move cache control settings out of queue->flags")
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/md/md.c | 13 ++++++++-----
drivers/md/md.h | 1 +
drivers/md/raid0.c | 2 +-
drivers/md/raid1.c | 2 +-
drivers/md/raid10.c | 2 +-
drivers/md/raid5.c | 2 +-
6 files changed, 13 insertions(+), 9 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 69ea54aedd99a1..8368438e58e989 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5853,6 +5853,13 @@ static void mddev_delayed_delete(struct work_struct *ws)
kobject_put(&mddev->kobj);
}
+void md_init_stacking_limits(struct queue_limits *lim)
+{
+ blk_set_stacking_limits(lim);
+ lim->features = BLK_FEAT_WRITE_CACHE | BLK_FEAT_FUA |
+ BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT;
+}
+
struct mddev *md_alloc(dev_t dev, char *name)
{
/*
@@ -5871,10 +5878,6 @@ struct mddev *md_alloc(dev_t dev, char *name)
int shift;
int unit;
int error;
- struct queue_limits lim = {
- .features = BLK_FEAT_WRITE_CACHE | BLK_FEAT_FUA |
- BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT,
- };
/*
* Wait for any previous instance of this device to be completely
@@ -5914,7 +5917,7 @@ struct mddev *md_alloc(dev_t dev, char *name)
*/
mddev->hold_active = UNTIL_STOP;
- disk = blk_alloc_disk(&lim, NUMA_NO_NODE);
+ disk = blk_alloc_disk(NULL, NUMA_NO_NODE);
if (IS_ERR(disk)) {
error = PTR_ERR(disk);
goto out_free_mddev;
diff --git a/drivers/md/md.h b/drivers/md/md.h
index c4d7ebf9587d07..28cb4b0b6c1740 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -893,6 +893,7 @@ extern int strict_strtoul_scaled(const char *cp, unsigned long *res, int scale);
extern int mddev_init(struct mddev *mddev);
extern void mddev_destroy(struct mddev *mddev);
+void md_init_stacking_limits(struct queue_limits *lim);
struct mddev *md_alloc(dev_t dev, char *name);
void mddev_put(struct mddev *mddev);
extern int md_run(struct mddev *mddev);
diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index 62634e2a33bd0f..32d58752477847 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -379,7 +379,7 @@ static int raid0_set_limits(struct mddev *mddev)
struct queue_limits lim;
int err;
- blk_set_stacking_limits(&lim);
+ md_init_stacking_limits(&lim);
lim.max_hw_sectors = mddev->chunk_sectors;
lim.max_write_zeroes_sectors = mddev->chunk_sectors;
lim.io_min = mddev->chunk_sectors << 9;
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 1a0eba65b8a92b..04a0c2ca173245 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -3194,7 +3194,7 @@ static int raid1_set_limits(struct mddev *mddev)
struct queue_limits lim;
int err;
- blk_set_stacking_limits(&lim);
+ md_init_stacking_limits(&lim);
lim.max_write_zeroes_sectors = 0;
err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
if (err) {
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 3334aa803c8380..2a9c4ee982e023 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3974,7 +3974,7 @@ static int raid10_set_queue_limits(struct mddev *mddev)
struct queue_limits lim;
int err;
- blk_set_stacking_limits(&lim);
+ md_init_stacking_limits(&lim);
lim.max_write_zeroes_sectors = 0;
lim.io_min = mddev->chunk_sectors << 9;
lim.io_opt = lim.io_min * raid10_nr_stripes(conf);
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 0192a6323f09ba..10219205160bbf 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7708,7 +7708,7 @@ static int raid5_set_limits(struct mddev *mddev)
*/
stripe = roundup_pow_of_two(data_disks * (mddev->chunk_sectors << 9));
- blk_set_stacking_limits(&lim);
+ md_init_stacking_limits(&lim);
lim.io_min = mddev->chunk_sectors << 9;
lim.io_opt = lim.io_min * (conf->raid_disks - conf->max_degraded);
lim.features |= BLK_FEAT_RAID_PARTIAL_STRIPES_EXPENSIVE;
--
2.43.0
next prev parent reply other threads:[~2024-06-25 8:57 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-25 2:28 [axboe-block:for-next] [block] 1122c0c1cc: aim7.jobs-per-min 22.6% improvement kernel test robot
2024-06-25 8:57 ` Christoph Hellwig [this message]
2024-06-26 2:10 ` Oliver Sang
2024-06-26 3:39 ` Christoph Hellwig
2024-06-27 2:35 ` Oliver Sang
2024-06-27 4:54 ` Christoph Hellwig
2024-07-01 8:22 ` Oliver Sang
2024-07-02 7:32 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZnqGf49cvy6W-xWf@infradead.org \
--to=hch@infradead.org \
--cc=axboe@kernel.dk \
--cc=dlemoal@kernel.org \
--cc=dm-devel@lists.linux.dev \
--cc=drbd-dev@lists.linbit.com \
--cc=feng.tang@intel.com \
--cc=fengwei.yin@intel.com \
--cc=hare@suse.de \
--cc=hch@lst.de \
--cc=linux-bcache@vger.kernel.org \
--cc=linux-block@vger.kernel.org \
--cc=linux-mmc@vger.kernel.org \
--cc=linux-mtd@lists.infradead.org \
--cc=linux-nvme@lists.infradead.org \
--cc=linux-raid@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=linux-um@lists.infradead.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=lkp@intel.com \
--cc=nbd@other.debian.org \
--cc=nvdimm@lists.linux.dev \
--cc=oe-lkp@lists.linux.dev \
--cc=oliver.sang@intel.com \
--cc=ulf.hansson@linaro.org \
--cc=virtualization@lists.linux.dev \
--cc=xen-devel@lists.xenproject.org \
--cc=ying.huang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).