* [PATCH] dm-delay: support zoned devices
@ 2025-03-21 7:18 Christoph Hellwig
2025-03-21 17:52 ` Benjamin Marzinski
2025-03-24 11:09 ` Damien Le Moal
0 siblings, 2 replies; 8+ messages in thread
From: Christoph Hellwig @ 2025-03-21 7:18 UTC (permalink / raw)
To: snitzer, mpatocka; +Cc: dm-devel
Add support for zoned devices by passing report_zones through to the
underlying read device.
This is required to enable xfstests xfs/311 on zoned devices.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/md/dm-delay.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm-delay.c b/drivers/md/dm-delay.c
index 08f6387620c1..d4cf0ac2a7aa 100644
--- a/drivers/md/dm-delay.c
+++ b/drivers/md/dm-delay.c
@@ -369,6 +369,21 @@ static int delay_map(struct dm_target *ti, struct bio *bio)
 	return delay_bio(dc, c, bio);
 }
 
+#ifdef CONFIG_BLK_DEV_ZONED
+static int delay_report_zones(struct dm_target *ti,
+		struct dm_report_zones_args *args, unsigned int nr_zones)
+{
+	struct delay_c *dc = ti->private;
+	struct delay_class *c = &dc->read;
+
+	return dm_report_zones(c->dev->bdev, c->start,
+			c->start + dm_target_offset(ti, args->next_sector),
+			args, nr_zones);
+}
+#else
+#define delay_report_zones NULL
+#endif
+
 #define DMEMIT_DELAY_CLASS(c) \
 	DMEMIT("%s %llu %u", (c)->dev->name, (unsigned long long)(c)->start, (c)->delay)
 
@@ -424,11 +439,12 @@ static int delay_iterate_devices(struct dm_target *ti,
 static struct target_type delay_target = {
 	.name = "delay",
 	.version = {1, 4, 0},
-	.features = DM_TARGET_PASSES_INTEGRITY,
+	.features = DM_TARGET_PASSES_INTEGRITY | DM_TARGET_ZONED_HM,
 	.module = THIS_MODULE,
 	.ctr = delay_ctr,
 	.dtr = delay_dtr,
 	.map = delay_map,
+	.report_zones = delay_report_zones,
 	.presuspend = delay_presuspend,
 	.resume = delay_resume,
 	.status = delay_status,
--
2.45.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] dm-delay: support zoned devices
2025-03-21 7:18 [PATCH] dm-delay: support zoned devices Christoph Hellwig
@ 2025-03-21 17:52 ` Benjamin Marzinski
2025-03-23 6:28 ` Christoph Hellwig
2025-03-26 12:55 ` Damien Le Moal
2025-03-24 11:09 ` Damien Le Moal
1 sibling, 2 replies; 8+ messages in thread
From: Benjamin Marzinski @ 2025-03-21 17:52 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: snitzer, mpatocka, dm-devel, Damien Le Moal
On Fri, Mar 21, 2025 at 08:18:16AM +0100, Christoph Hellwig wrote:
> Add support for zoned devices by passing report_zones through to the
> underlying read device.
>
> This is required to enable xfstests xfs/311 on zoned devices.
On suspend, delay_presuspend() stops delaying and it doesn't guarantee
that new bios coming in will always be submitted after the delayed bios
it is flushing. That can mess things up for zoned devices. I didn't
check if that matters for the specific test. Setting
ti->emulate_zone_append = true;
would enforce write ordering, at the expense of adding a whole other
layer of delays to zoned dm-delay devices. Since this isn't really
useful outside of testing, I think that could be acceptable if necessary
(it would require us to support table reloads of zoned devices with
emulated zone append, since tests often want to change the delay).
However it would probably be better to see if we can just make dm-delay
preserve write ordering during a suspend.
-Ben
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> drivers/md/dm-delay.c | 18 +++++++++++++++++-
> 1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/dm-delay.c b/drivers/md/dm-delay.c
> index 08f6387620c1..d4cf0ac2a7aa 100644
> --- a/drivers/md/dm-delay.c
> +++ b/drivers/md/dm-delay.c
> @@ -369,6 +369,21 @@ static int delay_map(struct dm_target *ti, struct bio *bio)
> return delay_bio(dc, c, bio);
> }
>
> +#ifdef CONFIG_BLK_DEV_ZONED
> +static int delay_report_zones(struct dm_target *ti,
> + struct dm_report_zones_args *args, unsigned int nr_zones)
> +{
> + struct delay_c *dc = ti->private;
> + struct delay_class *c = &dc->read;
> +
> + return dm_report_zones(c->dev->bdev, c->start,
> + c->start + dm_target_offset(ti, args->next_sector),
> + args, nr_zones);
> +}
> +#else
> +#define delay_report_zones NULL
> +#endif
> +
> #define DMEMIT_DELAY_CLASS(c) \
> DMEMIT("%s %llu %u", (c)->dev->name, (unsigned long long)(c)->start, (c)->delay)
>
> @@ -424,11 +439,12 @@ static int delay_iterate_devices(struct dm_target *ti,
> static struct target_type delay_target = {
> .name = "delay",
> .version = {1, 4, 0},
> - .features = DM_TARGET_PASSES_INTEGRITY,
> + .features = DM_TARGET_PASSES_INTEGRITY | DM_TARGET_ZONED_HM,
> .module = THIS_MODULE,
> .ctr = delay_ctr,
> .dtr = delay_dtr,
> .map = delay_map,
> + .report_zones = delay_report_zones,
> .presuspend = delay_presuspend,
> .resume = delay_resume,
> .status = delay_status,
> --
> 2.45.2
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] dm-delay: support zoned devices
2025-03-21 17:52 ` Benjamin Marzinski
@ 2025-03-23 6:28 ` Christoph Hellwig
2025-03-26 12:55 ` Damien Le Moal
1 sibling, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2025-03-23 6:28 UTC (permalink / raw)
To: Benjamin Marzinski
Cc: Christoph Hellwig, snitzer, mpatocka, dm-devel, Damien Le Moal
On Fri, Mar 21, 2025 at 01:52:41PM -0400, Benjamin Marzinski wrote:
> On Fri, Mar 21, 2025 at 08:18:16AM +0100, Christoph Hellwig wrote:
> > Add support for zoned devices by passing report_zones through to the
> > underlying read device.
> >
> > This is required to enable xfstests xfs/311 on zoned devices.
>
> On suspend, delay_presuspend() stops delaying and it doesn't guarantee
> that new bios coming in will always be submitted after the delayed bios
> it is flushing. That can mess things up for zoned devices. I didn't
> check if that matters for the specific test. Setting
>
> ti->emulate_zone_append = true;
>
> would enforce write ordering, at the expense of adding a whole other
> layer of delays to zoned dm-delay devices. Since this isn't really
> useful outside of testing, I think that could be acceptable if necessary
> (it would require us to support table reloads of zoned devices with
> emulated zone append, since tests often want to change the delay).
> However it would probably be better to see if we can just make dm-delay
> preserve write ordering during a suspend.
My use case is all about using zone append, so emulating it would be
a bit counterproductive, but I'll give it a spin. I doubt that this
test is very timing critical.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] dm-delay: support zoned devices
2025-03-21 17:52 ` Benjamin Marzinski
2025-03-23 6:28 ` Christoph Hellwig
@ 2025-03-26 12:55 ` Damien Le Moal
2025-03-26 15:00 ` Benjamin Marzinski
1 sibling, 1 reply; 8+ messages in thread
From: Damien Le Moal @ 2025-03-26 12:55 UTC (permalink / raw)
To: Benjamin Marzinski, Christoph Hellwig; +Cc: snitzer, mpatocka, dm-devel
On 2025/03/21 13:52, Benjamin Marzinski wrote:
> On Fri, Mar 21, 2025 at 08:18:16AM +0100, Christoph Hellwig wrote:
>> Add support for zoned devices by passing report_zones through to the
>> underlying read device.
>>
>> This is required to enable xfstests xfs/311 on zoned devices.
>
> On suspend, delay_presuspend() stops delaying and it doesn't guarantee
> that new bios coming in will always be submitted after the delayed bios
> it is flushing. That can mess things up for zoned devices. I didn't
> check if that matters for the specific test. Setting
>
> ti->emulate_zone_append = true;
>
> would enforce write ordering, at the expense of adding a whole other
> layer of delays to zoned dm-delay devices. Since this isn't really
> useful outside of testing, I think that could be acceptable if necessary
> (it would require us to support table reloads of zoned devices with
> emulated zone append, since tests often want to change the delay).
> However it would probably be better to see if we can just make dm-delay
> preserve write ordering during a suspend.
delay_presuspend() calls flush_delayed_bios() with flush_all == true. So all
BIOs will be flushed in the order they are queued in the delay list, which as
far as I can tell is the order in which the user of dm-delay issued the BIOs. So
for writes, the order is preserved as far as I can tell.
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] dm-delay: support zoned devices
2025-03-26 12:55 ` Damien Le Moal
@ 2025-03-26 15:00 ` Benjamin Marzinski
2025-03-26 15:17 ` Damien Le Moal
0 siblings, 1 reply; 8+ messages in thread
From: Benjamin Marzinski @ 2025-03-26 15:00 UTC (permalink / raw)
To: Damien Le Moal; +Cc: Christoph Hellwig, snitzer, mpatocka, dm-devel
On Wed, Mar 26, 2025 at 08:55:48AM -0400, Damien Le Moal wrote:
> On 2025/03/21 13:52, Benjamin Marzinski wrote:
> > On Fri, Mar 21, 2025 at 08:18:16AM +0100, Christoph Hellwig wrote:
> >> Add support for zoned devices by passing report_zones through to the
> >> underlying read device.
> >>
> >> This is required to enable xfstests xfs/311 on zoned devices.
> >
> > On suspend, delay_presuspend() stops delaying and it doesn't guarantee
> > that new bios coming in will always be submitted after the delayed bios
> > it is flushing. That can mess things up for zoned devices. I didn't
> > check if that matters for the specific test. Setting
> >
> > ti->emulate_zone_append = true;
> >
> > would enforce write ordering, at the expense of adding a whole other
> > layer of delays to zoned dm-delay devices. Since this isn't really
> > useful outside of testing, I think that could be acceptable if necessary
> > (it would require us to support table reloads of zoned devices with
> > emulated zone append, since tests often want to change the delay).
> > However it would probably be better to see if we can just make dm-delay
> > preserve write ordering during a suspend.
>
> delay_presuspend() calls flush_delayed_bios() with flush_all == true. So all
> BIOs will be flushed in the order they are queued in the delay list, which as
> far as I can tell is the order in which the user of dm-delay issued the BIOs. So
> for writes, the order is preserved as far as I can tell.
delay_presuspend() is called before we set the DMF_BLOCK_IO_FOR_SUSPEND
bit, which will stop incoming bio from getting mapped, and also before
lock_fs() is called. This means it's common for new bios to continue to
come into delay_map(), while delay_presuspend() is running. The moment
delay_presuspend() sets dc->may_delay = false, those new bios will stop
getting queued by delay_bio(). They will get remapped immediately to
the underlying device. flush_delayed_bios() doesn't even get called
until after dc->may_delay is set to false, and if there are a lot of
bios on the delayed_bios list, flush_delayed_bios() will schedule. So,
it's actually very common for new incoming bios to get passed to the
underlying device before all the bios on the dc->delayed_bios list do.
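The interleaving described above can be replayed in a small single-threaded
userspace model. This is only a sketch under stated assumptions: toy_delay,
toy_map(), toy_flush_one() and toy_race() are hypothetical names invented for
illustration and are not dm-delay code. The real driver uses a list, a mutex
and a worker rather than a fixed array, and the race there is between
concurrent contexts rather than sequential calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BIOS 16

/* Toy stand-in for struct delay_c: a FIFO of bio IDs plus a may_delay flag. */
struct toy_delay {
	int delayed[MAX_BIOS];   /* models dc->delayed_bios */
	size_t head, tail;
	bool may_delay;          /* models dc->may_delay */
	int submitted[MAX_BIOS]; /* order in which bios reach the lower device */
	size_t nr_submitted;
};

/* Record that a bio was passed to the underlying device. */
static void toy_submit(struct toy_delay *dc, int id)
{
	assert(dc->nr_submitted < MAX_BIOS);
	dc->submitted[dc->nr_submitted++] = id;
}

/* Models delay_map(): queue while delaying, otherwise remap immediately. */
static void toy_map(struct toy_delay *dc, int id)
{
	if (dc->may_delay) {
		assert(dc->tail < MAX_BIOS);
		dc->delayed[dc->tail++] = id;
	} else {
		toy_submit(dc, id);
	}
}

/* Models one step of flush_delayed_bios(): submit the oldest queued bio. */
static bool toy_flush_one(struct toy_delay *dc)
{
	if (dc->head == dc->tail)
		return false;
	toy_submit(dc, dc->delayed[dc->head++]);
	return true;
}

/*
 * Bios 1 and 2 are queued, presuspend clears may_delay, bio 3 arrives
 * before the flush has run, and only then is the queue drained.
 */
static void toy_race(struct toy_delay *dc)
{
	*dc = (struct toy_delay){ .may_delay = true };
	toy_map(dc, 1);
	toy_map(dc, 2);
	dc->may_delay = false;   /* delay_presuspend() has started */
	toy_map(dc, 3);          /* new bio bypasses the queue */
	while (toy_flush_one(dc))
		;
}
```

After toy_race() the submitted[] array holds 3, 1, 2: the bio that arrived
last reaches the lower device first, which is the reordering that matters for
sequential writes to a zone.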
Solving this without grabbing the dc->process_bios_lock mutex for every
bio sent to dm-delay probably involves keeping the incoming bios going
to dc->delayed_bios during suspend, at least until we can guarantee that
it's empty and no bios are being flushed.
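One way to picture that idea is the single-threaded userspace model below.
It is a sketch only: ordered_delay, the draining flag and all ordered_*()
helpers are hypothetical names made up for this illustration, not an actual
or proposed dm-delay change. It also does not address the hard part of a real
implementation, which is closing the window between seeing an empty list and
clearing the flag without taking a lock for every bio.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SLOTS 16

/* Toy stand-in for struct delay_c with an extra, hypothetical drain flag. */
struct ordered_delay {
	int delayed[NR_SLOTS];   /* models dc->delayed_bios */
	size_t head, tail;
	bool may_delay;          /* models dc->may_delay */
	bool draining;           /* hypothetical: presuspend is emptying the list */
	int submitted[NR_SLOTS]; /* order in which bios reach the lower device */
	size_t nr_submitted;
};

/* Record that a bio was passed to the underlying device. */
static void ordered_submit(struct ordered_delay *dc, int id)
{
	assert(dc->nr_submitted < NR_SLOTS);
	dc->submitted[dc->nr_submitted++] = id;
}

/* Keep queueing while a drain is in progress so new bios cannot overtake. */
static void ordered_map(struct ordered_delay *dc, int id)
{
	if (dc->may_delay || dc->draining) {
		assert(dc->tail < NR_SLOTS);
		dc->delayed[dc->tail++] = id;
	} else {
		ordered_submit(dc, id);
	}
}

/* Stop delaying, accept one late bio mid-suspend, then drain in FIFO order. */
static void ordered_presuspend(struct ordered_delay *dc, int late_id)
{
	dc->may_delay = false;
	dc->draining = true;
	ordered_map(dc, late_id);    /* still lands on the queue */
	while (dc->head != dc->tail)
		ordered_submit(dc, dc->delayed[dc->head++]);
	dc->draining = false;        /* list is empty: bypassing is safe again */
}

/* Bios 1 and 2 queued, bio 3 arrives during suspend, bio 4 after the drain. */
static void ordered_demo(struct ordered_delay *dc)
{
	*dc = (struct ordered_delay){ .may_delay = true };
	ordered_map(dc, 1);
	ordered_map(dc, 2);
	ordered_presuspend(dc, 3);
	ordered_map(dc, 4);
}
```

After ordered_demo() the submitted[] array holds 1, 2, 3, 4, so the bio that
arrived during the suspend no longer overtakes the ones already queued.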
-Ben
>
> --
> Damien Le Moal
> Western Digital Research
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] dm-delay: support zoned devices
2025-03-26 15:00 ` Benjamin Marzinski
@ 2025-03-26 15:17 ` Damien Le Moal
2025-03-26 16:57 ` Benjamin Marzinski
0 siblings, 1 reply; 8+ messages in thread
From: Damien Le Moal @ 2025-03-26 15:17 UTC (permalink / raw)
To: Benjamin Marzinski; +Cc: Christoph Hellwig, snitzer, mpatocka, dm-devel
On 2025/03/26 11:00, Benjamin Marzinski wrote:
> On Wed, Mar 26, 2025 at 08:55:48AM -0400, Damien Le Moal wrote:
>> On 2025/03/21 13:52, Benjamin Marzinski wrote:
>>> On Fri, Mar 21, 2025 at 08:18:16AM +0100, Christoph Hellwig wrote:
>>>> Add support for zoned devices by passing report_zones through to the
>>>> underlying read device.
>>>>
>>>> This is required to enable xfstests xfs/311 on zoned devices.
>>>
>>> On suspend, delay_presuspend() stops delaying and it doesn't guarantee
>>> that new bios coming in will always be submitted after the delayed bios
>>> it is flushing. That can mess things up for zoned devices. I didn't
>>> check if that matters for the specific test. Setting
>>>
>>> ti->emulate_zone_append = true;
>>>
>>> would enforce write ordering, at the expense of adding a whole other
>>> layer of delays to zoned dm-delay devices. Since this isn't really
>>> useful outside of testing, I think that could be acceptable if necessary
>>> (it would require us to support table reloads of zoned devices with
>>> emulated zone append, since tests often want to change the delay).
>>> However it would probably be better to see if we can just make dm-delay
>>> preserve write ordering during a suspend.
>>
>> delay_presuspend() calls flush_delayed_bios() with flush_all == true. So all
>> BIOs will be flushed in the order they are queued in the delay list, which as
>> far as I can tell is the order in which the user of dm-delay issued the BIOs. So
>> for writes, the order is preserved as far as I can tell.
>
> delay_presuspend() is called before we set the DMF_BLOCK_IO_FOR_SUSPEND
> bit, which will stop incoming bio from getting mapped, and also before
> lock_fs() is called. This means it's common for new bios to continue to
> come into delay_map(), while delay_presuspend() is running. The moment
> delay_presuspend() sets dc->may_delay = false, those new bios will stop
> getting queued by delay_bio(). They will get remapped immediately to
> the underlying device. flush_delayed_bios() doesn't even get called
> until after dc->may_delay is set to false, and if there are a lot of
> bios on the delayed_bios list, flush_delayed_bios() will schedule. So,
> it's actually very common for new incoming bios to get passed to the
> underlying device before all the bios on the dc->delayed_bios list do.
>
> Solving this without grabbing the dc->process_bios_lock mutex for every
> bio sent to dm-delay probably involves keeping the incoming bios going
> to dc->delayed_bios during suspend, at least until we can guarantee that
> it's empty and no bios are being flushed.
OK. Understood. Thank you for the explanation. And the above sounds like a
rather simple solution, which does not even need to be zone specific.
I also think that this is orthogonal to Christoph's patch and we can fix the
suspend issue on top of Christoph's patch. This is a very niche issue anyway for
the main target use case which is fstests, since fstests does not suspend/resume
the dm-delay device as far as I know.
>
> -Ben
>
>>
>> --
>> Damien Le Moal
>> Western Digital Research
>
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] dm-delay: support zoned devices
2025-03-26 15:17 ` Damien Le Moal
@ 2025-03-26 16:57 ` Benjamin Marzinski
0 siblings, 0 replies; 8+ messages in thread
From: Benjamin Marzinski @ 2025-03-26 16:57 UTC (permalink / raw)
To: Damien Le Moal; +Cc: Christoph Hellwig, snitzer, mpatocka, dm-devel
On Wed, Mar 26, 2025 at 11:17:24AM -0400, Damien Le Moal wrote:
> On 2025/03/26 11:00, Benjamin Marzinski wrote:
> > On Wed, Mar 26, 2025 at 08:55:48AM -0400, Damien Le Moal wrote:
> >> On 2025/03/21 13:52, Benjamin Marzinski wrote:
> >>> On Fri, Mar 21, 2025 at 08:18:16AM +0100, Christoph Hellwig wrote:
> >>>> Add support for zoned devices by passing report_zones through to the
> >>>> underlying read device.
> >>>>
> >>>> This is required to enable xfstests xfs/311 on zoned devices.
> >>>
> >>> On suspend, delay_presuspend() stops delaying and it doesn't guarantee
> >>> that new bios coming in will always be submitted after the delayed bios
> >>> it is flushing. That can mess things up for zoned devices. I didn't
> >>> check if that matters for the specific test. Setting
> >>>
> >>> ti->emulate_zone_append = true;
> >>>
> >>> would enforce write ordering, at the expense of adding a whole other
> >>> layer of delays to zoned dm-delay devices. Since this isn't really
> >>> useful outside of testing, I think that could be acceptable if necessary
> >>> (it would require us to support table reloads of zoned devices with
> >>> emulated zone append, since tests often want to change the delay).
> >>> However it would probably be better to see if we can just make dm-delay
> >>> preserve write ordering during a suspend.
> >>
> >> delay_presuspend() calls flush_delayed_bios() with flush_all == true. So all
> >> BIOs will be flushed in the order they are queued in the delay list, which as
> >> far as I can tell is the order in which the user of dm-delay issued the BIOs. So
> >> for writes, the order is preserved as far as I can tell.
> >
> > delay_presuspend() is called before we set the DMF_BLOCK_IO_FOR_SUSPEND
> > bit, which will stop incoming bio from getting mapped, and also before
> > lock_fs() is called. This means it's common for new bios to continue to
> > come into delay_map(), while delay_presuspend() is running. The moment
> > delay_presuspend() sets dc->may_delay = false, those new bios will stop
> > getting queued by delay_bio(). They will get remapped immediately to
> > the underlying device. flush_delayed_bios() doesn't even get called
> > until after dc->may_delay is set to false, and if there are a lot of
> > bios on the delayed_bios list, flush_delayed_bios() will schedule. So,
> > it's actually very common for new incoming bios to get passed to the
> > underlying device before all the bios on the dc->delayed_bios list do.
> >
> > Solving this without grabbing the dc->process_bios_lock mutex for every
> > bio sent to dm-delay probably involves keeping the incoming bios going
> > to dc->delayed_bios during suspend, at least until we can guarantee that
> > it's empty and no bios are being flushed.
>
> OK. Understood. Thank you for the explanation. And the above sounds like a
> rather simple solution, which does not even need to be zone specific.
>
> I also think that this is orthogonal to Christoph's patch and we can fix the
> suspend issue on top of Christoph's patch. This is a very niche issue anyway for
> the main target use case which is fstests, since fstests does not suspend/resume
> the dm-delay device as far as I know.
I'm not sure that I would call a patch that adds a feature which doesn't
work correctly in some situations orthogonal to making the feature work
correctly in those situations. But I'll grant you that dm-delay is a
testing target, and Christoph's patch makes it useful for more tests, as
long as they don't suspend the device while there is still outstanding
IO. So, sure. I agree that Christoph doesn't need to change his patch,
and we can fix the suspend issue in a separate one. How that all gets
merged is Mikulas's call.
-Ben
>
> >
> > -Ben
> >
> >>
> >> --
> >> Damien Le Moal
> >> Western Digital Research
> >
>
>
> --
> Damien Le Moal
> Western Digital Research
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] dm-delay: support zoned devices
2025-03-21 7:18 [PATCH] dm-delay: support zoned devices Christoph Hellwig
2025-03-21 17:52 ` Benjamin Marzinski
@ 2025-03-24 11:09 ` Damien Le Moal
1 sibling, 0 replies; 8+ messages in thread
From: Damien Le Moal @ 2025-03-24 11:09 UTC (permalink / raw)
To: Christoph Hellwig, snitzer, mpatocka; +Cc: dm-devel
On 2025/03/21 3:18, Christoph Hellwig wrote:
> Add support for zoned devices by passing report_zones through to the
> underlying read device.
>
> This is required to enable xfstests xfs/311 on zoned devices.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Looks good to me.
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 8+ messages in thread