From: Benjamin Marzinski <bmarzins@redhat.com>
To: Martin Wilck <martin.wilck@suse.com>
Cc: Christophe Varoqui <christophe.varoqui@opensvc.com>,
	device-mapper development <dm-devel@lists.linux.dev>
Subject: Re: [PATCH 09/15] limpathpersist: redesign failed release workaround
Date: Tue, 26 Aug 2025 15:36:16 -0400
Message-ID: <aK4MsHaPBxu_JIxu@redhat.com>
In-Reply-To: <c4abd405c8ba0f28f9e98212ff0cac8c85789bcd.camel@suse.com>

On Tue, Aug 26, 2025 at 10:44:22AM +0200, Martin Wilck wrote:
> On Mon, 2025-08-25 at 20:51 -0400, Benjamin Marzinski wrote:
> > On Sun, Aug 24, 2025 at 05:26:50PM +0200, Martin Wilck wrote:
> > 
> > > 
> > > > +	/*
> > > > +	 * Cannot free the reservation because the path that is holding it
> > > > +	 * is not usable. Workaround this by:
> > > > +	 * 1. Suspending the device
> > > > +	 * 2. Preempting the reservation to move it to a usable path
> > > > +	 *    (this removes the registered keys on all paths except the
> > > > +	 *    preempting one. Since the device is suspended, no IO can
> > > > +	 *    go to these unregistered paths and fail).
> > > > +	 * 3. Releasing the reservation on the path that now holds it.
> > > > +	 * 4. Resuming the device (since it no longer matters that most of
> > > > +	 *    that paths no longer have a registered key)
> > > > +	 * 5. Reregistering keys on all the paths
> > > > +	 */
> > > > +
> > > > +	if (!dm_simplecmd_noflush(DM_DEVICE_SUSPEND, mpp->alias, 0)) {
> > > > +		condlog(0, "%s: release: failed to suspend dm device.",
> > > 
> > > Why do you use dm_simplecmd_noflush() here? Shouldn't queued IO be
> > > flushed from the dm device to avoid it being sent to paths that are
> > > going to be unregistered?
> > > 
> > 
> > I'm pretty certain that DM will still flush all the IO from the target
> > to DM core before suspending, even with dm_simplecmd_noflush() set. In
> > request based multipath, queued IOs are never stored in the target. In
> > bio based multipath, they are, but they will get flushed back up to DM
> > core when suspending and queued there. No IO should happen through the
> > target after the suspend, until the resume. dm_simplecmd_noflush() just
> > keeps multipath from failing any IO that it had queueing, and it's only
> > really necessary when we resize the device, because if we shrink the
> > device, outstanding IO might be outside the new bounds.
> 
> OK, thanks for the clarification. I guess I've never fully understood
> the way queueing works in dm.
> 
> What about queueing in the path devices? We'll be removing registration
> keys, so IO sent by the SCSI layer may end up with RESERVATION CONFLICT
> errors. To my understanding, without the DM_NOFLUSH_FLAG the kernel
> will freeze the queue and flush everything, as if the device was closed
> during shutdown. If DM_NOFLUSH_FLAG is set, this won't happen. What's
> preventing the SCSI layer from sending IO while we're modifying the
> registrations?

In __dm_suspend() we block all new IOs to the dm device here:
https://github.com/torvalds/linux/blob/fab1beda7597fac1cecc01707d55eadb6bbe773c/drivers/md/dm.c#L2955-L2966

Once we know that no new IOs are getting sent to the target, we wait for
all the IOs that were sent to the target to get completed by calling
dm_wait_for_completion() here:

https://github.com/torvalds/linux/blob/fab1beda7597fac1cecc01707d55eadb6bbe773c/drivers/md/dm.c#L2973

Any IOs that are currently being sent inside the multipath target will
get handled either while getting mapped or when ending the path IO by
multipath_clone_and_map(), __multipath_map_bio(), multipath_end_io(), or
multipath_end_io_bio(), which will complete the IOs or send them back to
DM core for queueing there (which also satisfies dm_wait_for_completion).
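
To make the ordering concrete, here is a toy user-space model of that
sequence. It is only a sketch, not the kernel implementation: struct
toy_dm and the toy_*() helpers are hypothetical names invented for this
illustration, and the plain counters stand in for the real request and
bio tracking in DM core.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are kernel symbols. */
struct toy_dm {
	bool block_new_io;  /* set as soon as the suspend starts */
	int in_flight;      /* IOs currently inside the target */
	int deferred;       /* IOs parked in "DM core" until resume */
	int sent_to_paths;  /* IOs that were handed to a path device */
};

/* Submit one IO: park it if the device is suspending, otherwise map it. */
static void toy_submit(struct toy_dm *dm)
{
	if (dm->block_new_io) {
		dm->deferred++;
		return;
	}
	dm->in_flight++;
	dm->sent_to_paths++;
}

/* End one in-flight IO, either completing it or requeueing it to core. */
static void toy_end_io(struct toy_dm *dm, bool requeue)
{
	assert(dm->in_flight > 0);
	dm->in_flight--;
	if (requeue)
		dm->deferred++;
}

/*
 * Suspend: block new IO first, then wait until nothing is left inside
 * the target. Completing or requeueing both end the wait; this model
 * simply completes whatever is still in flight.
 */
static void toy_suspend(struct toy_dm *dm)
{
	dm->block_new_io = true;
	while (dm->in_flight > 0)
		toy_end_io(dm, false);
}

/* Resume: unblock the device and resubmit everything that was parked. */
static void toy_resume(struct toy_dm *dm)
{
	int n = dm->deferred;

	dm->block_new_io = false;
	dm->deferred = 0;
	while (n-- > 0)
		toy_submit(dm);
}
```

In this model, any toy_submit() issued between toy_suspend() and
toy_resume() only increments deferred, so sent_to_paths cannot change
while the device is suspended.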

So by the time the suspend command returns, there won't be any IOs in
flight for the SCSI layer to send to the target, and there can't be
new ones coming in through DM until we resume.
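
For reference, the five-step workaround from the patch comment quoted at
the top can be modeled the same way. Again this is a hedged sketch and
not the libmpathpersist code: struct toy_map, struct toy_path and the
toy_*() helpers are hypothetical, the reservation is reduced to a
"holder" index, and the model assumes that keys can only be reregistered
on usable paths.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NPATHS 3

/* Toy model only: hypothetical types, not libmultipath structures. */
struct toy_path {
	bool usable;      /* path can currently carry commands */
	bool registered;  /* path has a registered key */
};

struct toy_map {
	struct toy_path paths[TOY_NPATHS];
	int holder;       /* index of the path holding the reservation, -1 if none */
	bool suspended;
};

/* Preempt via path p: drop all other registrations, move the reservation to p. */
static bool toy_preempt(struct toy_map *m, int p)
{
	if (!m->paths[p].usable || !m->paths[p].registered)
		return false;
	for (int i = 0; i < TOY_NPATHS; i++)
		if (i != p)
			m->paths[i].registered = false;
	m->holder = p;
	return true;
}

/* Release only works through a usable path that holds the reservation. */
static bool toy_release(struct toy_map *m, int p)
{
	if (!m->paths[p].usable || m->holder != p)
		return false;
	m->holder = -1;
	return true;
}

/* Return the first usable, registered path, or -1 if there is none. */
static int toy_pick_path(const struct toy_map *m)
{
	for (int i = 0; i < TOY_NPATHS; i++)
		if (m->paths[i].usable && m->paths[i].registered)
			return i;
	return -1;
}

/* The five steps from the quoted comment, in order. */
static bool toy_release_workaround(struct toy_map *m)
{
	int p = toy_pick_path(m);
	bool ok;

	if (p < 0)
		return false;
	m->suspended = true;                              /* 1. suspend */
	ok = toy_preempt(m, p) && toy_release(m, p);      /* 2. preempt, 3. release */
	m->suspended = false;                             /* 4. resume */
	for (int i = 0; i < TOY_NPATHS; i++)              /* 5. reregister */
		if (m->paths[i].usable)
			m->paths[i].registered = true;
	return ok;
}
```

With the reservation held by an unusable path 0, a direct
toy_release(&m, 0) fails, while toy_release_workaround(&m) clears the
reservation and leaves every usable path registered again.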

-Ben 
 
> Martin

