public inbox for dm-devel@redhat.com
From: Benjamin Marzinski <bmarzins@redhat.com>
To: Brian Bunker <brian@purestorage.com>
Cc: Martin Wilck <martin.wilck@suse.com>,
	dm-devel@lists.linux.dev,
	Krishna Kant <krishna.kant@purestorage.com>
Subject: Re: [PATCH] Add purge capability to multipath-tools
Date: Thu, 4 Dec 2025 15:03:49 -0500	[thread overview]
Message-ID: <aTHpJVhhDKrJxhBU@redhat.com> (raw)
In-Reply-To: <CAHZQxyJ0Y7Tx4UC4je40s6dQKDE942qo-OwSpnFUeJcocozL-Q@mail.gmail.com>

On Thu, Dec 04, 2025 at 08:55:58AM -0800, Brian Bunker wrote:
> On Wed, Dec 3, 2025 at 11:37 AM Benjamin Marzinski <bmarzins@redhat.com> wrote:
> >
> > On Thu, Nov 27, 2025 at 12:31:12PM +0100, Martin Wilck wrote:
> > > On Mon, 2025-11-24 at 15:11 -0500, Benjamin Marzinski wrote:
> > > > On Fri, Nov 21, 2025 at 02:35:23PM -0800, Brian Bunker wrote:
> > > >
> > > > >
> > > > > I added the purge thread because I didn't want to starve the
> > > > > checker thread when a large number of volumes disconnect. I
> > > > > noticed that when I purged inline with the check, the time spent
> > > > > stopped scaling linearly with the number of devices after a
> > > > > point, and the purging seemed to be significantly starving the
> > > > > checker thread. Doing the purge in another thread seemed to
> > > > > relieve that pressure.
> > > >
> > > > That might be because the check thread has safeguards to try to
> > > > avoid starving the other threads. Since a lot of multipathd's work
> > > > is gated by the vecs lock, there's only so much parallelism that
> > > > can happen with multiple threads. If deleting the devices is taking
> > > > a long time, it might be better for this to get interrupted, so
> > > > that other threads can run.
> > > >
> > > > Since purging the paths is fairly low priority, we could continue
> > > > to run it in its own thread, but instead of running through all the
> > > > devices at once, purgeloop could lock the vecs->lock, handle one
> > > > device, and then unlock and loop. This means it would need to
> > > > search through the list from the start each time it regrabbed the
> > > > lock, since the list could have changed while it wasn't holding it.
> > > > When purgeloop makes it all the way through the list without
> > > > finding any paths that are due for a delete attempt, it can sleep
> > > > or block waiting for more paths.
> > > >
> > > > This would mean that paths would need to remember if they had
> > > > already been handled in a cycle. That could be done by purgeloop
> > > > keeping track of which cycle it was on, and the paths storing the
> > > > cycle number whenever they were checked.
> > >
> > > As I wrote in my other post, I wish that this thread wouldn't hold
> > > the vecs lock at all. multipathd could simply use a pipe to write
> > > the dev_t's of to-be-purged paths to the purge thread (or process
> > > :-) ).
> >
> > This seems like a good idea. Obviously, writing to sysfs while the vecs
> > lock is being held means that multipath will continue to have that scsi
> > device open, so there's no chance of the device name getting reused if
> > the device is removed from the system. But this other thread/process
> > could simply open the scsi device, double check that it's still
> > disconnected, write to sysfs, and then close the device. That would
> > offer the same guarantee, without the need to interact with anything
> > else.
> >
> > > Martin
> >
> Martin,

Actually, this wasn't Martin, it was me.

> If you look at the v3 implementation that I sent up yesterday, you will
> see we are no longer holding the vecs lock during the sysfs delete
> operation. We also made the delete operation non-blocking.
> 
> The two places in the purgeloop where we do hold the lock are for two
> very short operations:
> 
> 1. We scan the vector to find one path that needs purging, update
> pp->purge_cycle, and increment the retry counter pp->purge_retries.
> Then we copy the data needed for the purge into the purge_info
> structure and release the lock. There shouldn't be much contention
> from these operations.
> 
> 2. We re-acquire the lock to make sure the path still exists in the
> pathvec, then clear purge_path and reset purge_retries. This should
> also be very fast.
> 
> We can't take the approach of opening the device, since it is
> disconnected; I think the open will just fail or hang. We hold the udev
> reference to make sure the device object stays alive and isn't freed
> underneath us.

That makes sense. Unfortunately, I checked, and holding a reference on
the udev device does not keep the scsi device from being deleted, or a
new device from being created with the same devnode. 

My comments on your v3 patch talk about using a dup() to copy the
existing pp->fd to deal with this, which should work, regardless of the
path state.

-Ben
 
> Thanks,
> Brian
> 
> -- 
> Brian Bunker
> PURE Storage, Inc.
> brian@purestorage.com



Thread overview: 8+ messages
2025-11-13 18:43 [PATCH] Add purge capability to multipath-tools Brian Bunker
2025-11-21 19:07 ` Benjamin Marzinski
2025-11-21 22:35   ` Brian Bunker
2025-11-24 20:11     ` Benjamin Marzinski
2025-11-27 11:31       ` Martin Wilck
2025-12-03 19:37         ` Benjamin Marzinski
2025-12-04 16:55           ` Brian Bunker
2025-12-04 20:03             ` Benjamin Marzinski [this message]
