All of lore.kernel.org
 help / color / mirror / Atom feed
From: Benjamin Marzinski <bmarzins@redhat.com>
To: Damien Le Moal <dlemoal@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>,
	snitzer@kernel.org, mpatocka@redhat.com,
	dm-devel@lists.linux.dev
Subject: Re: [PATCH] dm-delay: support zoned devices
Date: Wed, 26 Mar 2025 12:57:52 -0400	[thread overview]
Message-ID: <Z-QyEG6nNo4EJ__z@redhat.com> (raw)
In-Reply-To: <df52d589-784c-4035-9e55-8ce94ad182dc@kernel.org>

On Wed, Mar 26, 2025 at 11:17:24AM -0400, Damien Le Moal wrote:
> On 2025/03/26 11:00, Benjamin Marzinski wrote:
> > On Wed, Mar 26, 2025 at 08:55:48AM -0400, Damien Le Moal wrote:
> >> On 2025/03/21 13:52, Benjamin Marzinski wrote:
> >>> On Fri, Mar 21, 2025 at 08:18:16AM +0100, Christoph Hellwig wrote:
> >>>> Add support for zoned device by passing through report_zoned to the
> >>>> underlying read device.
> >>>>
> >>>> This is required to make enable xfstests xfs/311 on zoned devices.
> >>>
> >>> On suspend, delay_presuspend() stops delaying and it doesn't guarantee
> >>> that new bios coming in will always be submitted after the delayed bios
> >>> it is flushing. That can mess things up for zoned devices. I didn't
> >>> check if that matters for the specific test. Setting
> >>>
> >>> ti->emulate_zone_append = true;
> >>>
> >>> would enforce write ordering, at the expense of adding a whole other
> >>> layer of delays to zoned dm-delay devices. Since this isn't really
> >>> useful outside of testing, I think that could be acceptable if necessary
> >>> (it would require us to support table reloads of zoned devices with
> >>> emulated zone append, since tests often want to change the delay).
> >>> However it would probably be better to see if we can just make dm-delay
> >>> preserve write ordering during a suspend.
> >>
> >> delay_presuspend() calls flush_delayed_bios() with flush_all == true. So all
> >> BIOs will be flushed in the order they are queued in the delay list, which as
> >> far as I can tell is the order in which the user of dm-delay issued the BIOs. So
> >> for writes, the order is preserved as far as I can tell.
> > 
> > delay_presuspend() is called before we set the DMF_BLOCK_IO_FOR_SUSPEND
> > bit, which will stop incoming bio from getting mapped, and also before
> > lock_fs() is called. This means it's common for new bios to continue to
> > come into delay_map(), while delay_presuspend() is running.  The moment
> > delay_presuspend() sets dc->may_delay = false, those new bios will stop
> > getting queued by delay_bio(). They will get remapped immeditately to
> > the underlying device. flush_delayed_bios() doesn't even get called
> > until after dc->may_delay is set to false, and if there are a lot of
> > bios on the delayed_bios list, flush_delayed_bios() will schedule. So,
> > it's actually very common for new incoming bios to get passed to
> > underlying device before all the bios on the dc->delayed_bios list do.
> > 
> > Solving this without grabbing the dc->process_bios_lock mutex for every
> > bio sent to dm-delay probably involves keeping the incoming bios going
> > to dc->delayed_bios during suspend, at least until we can guarantee that
> > it's empty and no bios are being flushed.
> 
> OK. Understood. Thank you for the explanation. And the above sounds like a
> rather simple solution, which does not even needs to be zone specific.
> 
> I also think that this is orthogonal to Christoph patch and we can fix the
> suspend issue on top of Christoph's patch. This is a very niche issue anyway for
> the main target use case which is fstests, since fstests does not suspend/resume
> the dm-delay device as far as I know.

I'm not sure that I would call a patch that adds a feature which doesn't
work correctly in some situations orthogonal to making the feature work
correctly in those situations. But I'll grant you that dm-delay is a
testing target, and Christoph's patch makes in useful for more tests, as
long as they don't suspend the device while there is still outstanding
IO. So, sure. I agree that Christoph doesn't need to change his patch,
and we can fix the suspend issue in a separate one. How that all gets
merged is Mikulas's call.

-Ben

> 
> > 
> > -Ben
> > 
> >>
> >> -- 
> >> Damien Le Moal
> >> Western Digital Research
> > 
> 
> 
> -- 
> Damien Le Moal
> Western Digital Research


  reply	other threads:[~2025-03-26 16:57 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-21  7:18 [PATCH] dm-delay: support zoned devices Christoph Hellwig
2025-03-21 17:52 ` Benjamin Marzinski
2025-03-23  6:28   ` Christoph Hellwig
2025-03-26 12:55   ` Damien Le Moal
2025-03-26 15:00     ` Benjamin Marzinski
2025-03-26 15:17       ` Damien Le Moal
2025-03-26 16:57         ` Benjamin Marzinski [this message]
2025-03-24 11:09 ` Damien Le Moal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z-QyEG6nNo4EJ__z@redhat.com \
    --to=bmarzins@redhat.com \
    --cc=dlemoal@kernel.org \
    --cc=dm-devel@lists.linux.dev \
    --cc=hch@lst.de \
    --cc=mpatocka@redhat.com \
    --cc=snitzer@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.