All of lore.kernel.org
 help / color / mirror / Atom feed
From: Benjamin Marzinski <bmarzins@redhat.com>
To: Martin Wilck <martin.wilck@suse.com>
Cc: Christophe Varoqui <christophe.varoqui@opensvc.com>,
	dm-devel@lists.linux.dev
Subject: Re: [PATCH v2 03/14] multipathd: sync maps at end of checkerloop
Date: Wed, 15 Jan 2025 11:45:35 -0500	[thread overview]
Message-ID: <Z4fmL_jRQg8C-xFA@redhat.com> (raw)
In-Reply-To: <5ae128660980dcc037e97a607668498408958638.camel@suse.com>

On Tue, Jan 14, 2025 at 10:36:06PM +0100, Martin Wilck wrote:
> On Thu, 2024-12-19 at 18:04 -0500, Benjamin Marzinski wrote:
> > On Wed, Dec 11, 2024 at 11:58:58PM +0100, Martin Wilck wrote:
> > > @@ -3032,11 +3026,13 @@ checkerloop (void *ap)
> > >  							    
> > > start_time.tv_sec);
> > >  			if (checker_state == CHECKER_FINISHED) {
> > >  				vector_foreach_slot(vecs->mpvec,
> > > mpp, i) {
> > > -					if ((update_mpp_prio(mpp)
> > > ||
> > > -					     (mpp->need_reload &&
> > > mpp->synced_count > 0)) &&
> > 
> > When you call reload_and_sync_map(), it will automatically resync the
> > map via setup_multipath() -> refresh_multipath() ->
> > update_multipath_strings().
> > 
> > This means that if for some reason multipathd and the kernel disagree
> > about a map, and reloading it doesn't fix the problem, you will
> > immediately set mpp->need_reload again. With the old mpp-
> > >synced_count
> > check, you only reload maps with need_reload() when a path is
> > checked.
> > Without this check, or a (mpp->checker_count > 0) check to replace
> > it,
> > you will keep reloading these maps every loop, roughly once a second.
> > I
> > would rather not do this.
> > 
> > If you want to make sure to immediately handle a need_reload that
> > wasn't
> > triggered by this call to reload_and_sync_map() which was because of
> > an
> > earlier need_reload, we could make need_reload have three states, to
> > distinguish between a reload we want done immediately, and one we
> > would
> > like to wait on because we just did a reload and it didn't fix the
> > problem. Then we could remember if need_reload was set before calling
> > reload_and_sync_map(), and if it was, and if it is still set after,
> > we
> > could switch it to the delayed version.
> > 
> > Or perhaps I'm just being paranoid here. 
> 
> As you probably know and as I recently verified, reloading the kernel
> from the checker loop will hardly ever fail except with -ENOMEM [1]. We
> can pass non-existing or failed devices to the kernel, it will happily
> accept them.
> 
> update_pathvec_from_dm() never adds any devices to the map, it just
> removes some. If need_reload is set, it means that it has removed
> either a path or an entire pathgroup. When the map is reloaded, it will
> only reference (a subset of) the devices that were already mapped. I
> see no way how this could fail unless either multipathd or the kernel
> are really badly malfunctioning, in which case we don't need to bother
> about reloading too frequently.
> 
> But *if* the reload succeeds, the set of devices in the kernel is
> guaranteed to be identical to the table that we've just used for
> reloading. So only way that another difference between kernel and
> multipath state could occur between the reload and
> update_pathvec_from_dm() running again is that another device has just
> diappeared from the system, in which case a quick reload would be a
> reasonable action. (Well I guess another possibility would be a 3rd
> party maliciously adding wrong path devices to the maps we maintain,
> but that's not something we can do much about).
> 
> If need_reload is indeed set again in this situation, I would indeed
> prefer to double-check this map quickly. As argued above, I strongly
> believe that such a situation will not persist. IMO a detected
> inconsistency between the kernel and multipathd is a very bad thing
> that we should try to fix rather sooner than later. It's at least as
> bad as a failed path, which we'll check every second, too.
> 
> Bottom line: I think re-checking this quickly is actually the right
> thing to do. Would you accept this if I add a warning in the
> "inconsistent" case, so that in the event that we actually run into a
> persistent discrepancy situation, we will notice?
> 

Sure. This is probably just my paranoia here, since I can't actually
come up with a concrete case where there would be a persistent
discrepancy. If it ever happens, it's almost definitely a bug in the
multipath code.

-Ben

> Regards,
> Martin
> 
> [1]
> https://lore.kernel.org/dm-devel/ee6fcbda31fd1f13969653782417fbed748f5bc7.camel@suse.com/
> 
> > -Ben
> > 
> > > -					   
> > > reload_and_sync_map(mpp, vecs) == 2)
> > > +					sync_mpp(vecs, mpp,
> > > ticks);
> > > +					if ((update_mpp_prio(mpp)
> > > || mpp->need_reload) &&
> > > +					   
> > > reload_and_sync_map(mpp, vecs) == 2) {
> > >  						/* multipath
> > > device deleted */
> > >  						i--;
> > > +						continue;
> > > +					}
> > >  				}
> > >  			}
> > >  			lock_cleanup_pop(vecs->lock);
> > > -- 
> > > 2.47.0
> > 


  reply	other threads:[~2025-01-15 16:45 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-11 22:58 [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Martin Wilck
2024-12-11 22:58 ` [PATCH v2 01/14] multipathd: don't reload map in update_mpp_prio() Martin Wilck
2024-12-11 22:58 ` [PATCH v2 02/14] multipathd: remove dm_get_info() call from refresh_multipath() Martin Wilck
2024-12-11 22:58 ` [PATCH v2 03/14] multipathd: sync maps at end of checkerloop Martin Wilck
2024-12-19 21:50   ` Benjamin Marzinski
2025-01-17 19:04     ` Martin Wilck
2024-12-19 23:04   ` Benjamin Marzinski
2025-01-14 21:36     ` Martin Wilck
2025-01-15 16:45       ` Benjamin Marzinski [this message]
2024-12-11 22:58 ` [PATCH v2 04/14] multipathd: quickly re-sync if a map is inconsistent Martin Wilck
2024-12-19 21:57   ` Benjamin Marzinski
2025-01-14 21:37     ` Martin Wilck
2025-01-15 18:48       ` Benjamin Marzinski
2024-12-19 23:05   ` Benjamin Marzinski
2024-12-11 22:59 ` [PATCH v2 05/14] multipathd: move yielding for waiters to start of checkerloop Martin Wilck
2024-12-11 22:59 ` [PATCH v2 06/14] multipathd: add checker_finished() Martin Wilck
2024-12-11 22:59 ` [PATCH v2 07/14] multipathd: move "tick" calls into checker_finished() Martin Wilck
2024-12-11 22:59 ` [PATCH v2 08/14] multipathd: don't call reload_and_sync_map() from deferred_failback_tick() Martin Wilck
2024-12-19 22:04   ` Benjamin Marzinski
2025-01-14 21:40     ` Martin Wilck
2024-12-11 22:59 ` [PATCH v2 09/14] multipathd: move retry_count_tick() into existing mpvec loop Martin Wilck
2024-12-11 22:59 ` [PATCH v2 10/14] multipathd: don't call update_map() from missing_uev_wait_tick() Martin Wilck
2024-12-11 22:59 ` [PATCH v2 11/14] multipathd: don't call udpate_map() from ghost_delay_tick() Martin Wilck
2024-12-11 22:59 ` [PATCH v2 12/14] multipathd: only call reload_and_sync_map() when ghost delay expires Martin Wilck
2024-12-11 22:59 ` [PATCH v2 13/14] multipathd: remove non-existent maps in checkerloop Martin Wilck
2024-12-11 22:59 ` [PATCH v2 14/14] multipathd: remove mpvec_garbage_collector() Martin Wilck
2024-12-19 22:33 ` [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Benjamin Marzinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z4fmL_jRQg8C-xFA@redhat.com \
    --to=bmarzins@redhat.com \
    --cc=christophe.varoqui@opensvc.com \
    --cc=dm-devel@lists.linux.dev \
    --cc=martin.wilck@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.