All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work
@ 2024-12-11 22:58 Martin Wilck
  2024-12-11 22:58 ` [PATCH v2 01/14] multipathd: don't reload map in update_mpp_prio() Martin Wilck
                   ` (14 more replies)
  0 siblings, 15 replies; 27+ messages in thread
From: Martin Wilck @ 2024-12-11 22:58 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: dm-devel, Martin Wilck

This patch set goes on top of Ben's set [1] for github issue 105 [2].

The first patch implements the remark I had on patch 2 on Ben's set, the 2nd
is a minor cleanup.

Patch 3 moves the sync_mpp() call from the beginning to the end of the
checkerloop, as suggested by Ben in [3]. If an inconsitency is found
(mpp->need_reload), we reload the map and schedule another sync for the
next tick (patch 4).

Patch 5 ff. reshuffle the code in checkerloop(). There is now one function,
checker_finished(), that takes all actions that need to be done with the vecs
lock taken after the checkers have finished. checkerloop() enters this
function immediately when the checkers have finished, without dropping and
re-acquiring the vecs lock. The map reload logic is completely handled in this
function.

The various _tick() functions don't loop over mpvec any more; they are now
just called for a single mpp, and they simply return true if a map reload is
required.  The actual reload action differs: if missing_uev_wait_tick()
requests a reload, it needs to be a full update_map() (which calls
adopt_paths()), whereas in the other cases, reload_and_sync_map() is sufficient.
Patch 12 changes the reload action for the ghost delay tick.

Patch 13 removes maps if that are not found in the kernel any more. This
obsoletes the map garbage collector. Unlike the logic in v1, we won't remove
maps on arbitrary error conditions any more.

Changes v1 -> v2:

Removed patch 3 and 4 of v1 and replaced them by an alternative approach.
Instead of allowing map and path removal in the checker loop, the kernel sync
is moved towards the end.

Patch 5 ff. in v1 contained a bug in the new checker_finished() function.
If one "tick" function returned true, the other might not be executed any
more. In v2, all tick functions are executed, and the action to be taken is
selected according to the combined results.  Also, we won't call
reload_and_sync_map() when we've already called update_map().

Patch 13 is new in v2. Patch 8 from v1 has been moved after patch 13.

(In the thread following the review of v1, I mistakenly wrote about an
upcoming "v4" of this patch set. That was wrong, I meant this v2 here).

Reviews & comments welcome.

Regards
Martin

[1] https://lore.kernel.org/dm-devel/20241205035638.1218953-1-bmarzins@redhat.com/
[2] https://github.com/opensvc/multipath-tools/issues/105
[3] https://lore.kernel.org/dm-devel/Z1iUekRg8sai8HLT@redhat.com/


Martin Wilck (14):
  multipathd: don't reload map in update_mpp_prio()
  multipathd: remove dm_get_info() call from refresh_multipath()
  multipathd: sync maps at end of checkerloop
  multipathd: quickly re-sync if a map is inconsistent
  multipathd: move yielding for waiters to start of checkerloop
  multipathd: add checker_finished()
  multipathd: move "tick" calls into checker_finished()
  multipathd: don't call reload_and_sync_map() from
    deferred_failback_tick()
  multipathd: move retry_count_tick() into existing mpvec loop
  multipathd: don't call update_map() from missing_uev_wait_tick()
  multipathd: don't call udpate_map() from ghost_delay_tick()
  multipathd: only call reload_and_sync_map() when ghost delay expires
  multipathd: remove non-existent maps in checkerloop
  multipathd: remove mpvec_garbage_collector()

 libmultipath/structs.h     |   2 +-
 libmultipath/structs_vec.c |   1 -
 multipathd/main.c          | 287 +++++++++++++++----------------------
 3 files changed, 118 insertions(+), 172 deletions(-)

-- 
2.47.0


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 01/14] multipathd: don't reload map in update_mpp_prio()
  2024-12-11 22:58 [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Martin Wilck
@ 2024-12-11 22:58 ` Martin Wilck
  2024-12-11 22:58 ` [PATCH v2 02/14] multipathd: remove dm_get_info() call from refresh_multipath() Martin Wilck
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Martin Wilck @ 2024-12-11 22:58 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: dm-devel, Martin Wilck

Rather, return a bool indicating whether checkerloop() should call
reload_and_sync_map().

Signed-off-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipathd/main.c | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index d7d4a26..fa684e6 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2689,25 +2689,25 @@ update_path_state (struct vectors * vecs, struct path * pp)
 	return chkr_new_path_up ? CHECK_PATH_NEW_UP : CHECK_PATH_CHECKED;
 }
 
-static int
-update_mpp_prio(struct vectors *vecs, struct multipath *mpp)
+/* Return value: true if the map needs to be reloaded */
+static bool update_mpp_prio(struct multipath *mpp)
 {
 	bool need_reload, changed;
 	enum prio_update_type prio_update = mpp->prio_update;
 	mpp->prio_update = PRIO_UPDATE_NONE;
 
 	if (mpp->wait_for_udev || prio_update == PRIO_UPDATE_NONE)
-		return 0;
+		return false;
 	condlog(4, "prio refresh");
 
 	changed = update_prio(mpp, prio_update != PRIO_UPDATE_NORMAL);
 	if (prio_update == PRIO_UPDATE_MARGINAL)
-		return reload_and_sync_map(mpp, vecs);
+		return true;
 	if (changed && mpp->pgpolicyfn == (pgpolicyfn *)group_by_prio &&
 	    mpp->pgfailback == -FAILBACK_IMMEDIATE) {
 		condlog(2, "%s: path priorities changed. reloading",
 			mpp->alias);
-		return reload_and_sync_map(mpp, vecs);
+		return true;
 	}
 	if (need_switch_pathgroup(mpp, &need_reload)) {
 		if (mpp->pgfailback > 0 &&
@@ -2718,12 +2718,12 @@ update_mpp_prio(struct vectors *vecs, struct multipath *mpp)
 			 (prio_update == PRIO_UPDATE_NEW_PATH &&
 			  followover_should_failback(mpp))) {
 			if (need_reload)
-				return reload_and_sync_map(mpp, vecs);
+				return true;
 			else
 				switch_pathgroup(mpp);
 		}
 	}
-	return 0;
+	return false;
 }
 
 static int
@@ -3040,13 +3040,11 @@ checkerloop (void *ap)
 							     start_time.tv_sec);
 			if (checker_state == CHECKER_FINISHED) {
 				vector_foreach_slot(vecs->mpvec, mpp, i) {
-					if (update_mpp_prio(vecs, mpp) == 2 ||
-					    (mpp->need_reload &&
-					     mpp->synced_count > 0 &&
-					     reload_and_sync_map(mpp, vecs) == 2)) {
+					if ((update_mpp_prio(mpp) ||
+					     (mpp->need_reload && mpp->synced_count > 0)) &&
+					    reload_and_sync_map(mpp, vecs) == 2)
 						/* multipath device deleted */
 						i--;
-					}
 				}
 			}
 			lock_cleanup_pop(vecs->lock);
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 02/14] multipathd: remove dm_get_info() call from refresh_multipath()
  2024-12-11 22:58 [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Martin Wilck
  2024-12-11 22:58 ` [PATCH v2 01/14] multipathd: don't reload map in update_mpp_prio() Martin Wilck
@ 2024-12-11 22:58 ` Martin Wilck
  2024-12-11 22:58 ` [PATCH v2 03/14] multipathd: sync maps at end of checkerloop Martin Wilck
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Martin Wilck @ 2024-12-11 22:58 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: dm-devel, Martin Wilck

If the map is inaccessible, update_multipath_strings() will fail.
There is no need for the extra call to dm_get_info().

Signed-off-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipathd/main.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index fa684e6..4a28fbb 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -503,20 +503,12 @@ remove_maps_and_stop_waiters(struct vectors *vecs)
 
 int refresh_multipath(struct vectors *vecs, struct multipath *mpp)
 {
-	if (dm_get_info(mpp->alias, &mpp->dmi) != DMP_OK) {
-		/* Error accessing table */
-		condlog(2, "%s: cannot access table", mpp->alias);
-		goto out;
-	}
-
 	if (update_multipath_strings(mpp, vecs->pathvec) != DMP_OK) {
-		condlog(0, "%s: failed to setup multipath", mpp->alias);
-		goto out;
+		condlog(0, "%s: failed to read map from kernel", mpp->alias);
+		remove_map_and_stop_waiter(mpp, vecs);
+		return 1;
 	}
 	return 0;
-out:
-	remove_map_and_stop_waiter(mpp, vecs);
-	return 1;
 }
 
 int setup_multipath(struct vectors *vecs, struct multipath *mpp)
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 03/14] multipathd: sync maps at end of checkerloop
  2024-12-11 22:58 [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Martin Wilck
  2024-12-11 22:58 ` [PATCH v2 01/14] multipathd: don't reload map in update_mpp_prio() Martin Wilck
  2024-12-11 22:58 ` [PATCH v2 02/14] multipathd: remove dm_get_info() call from refresh_multipath() Martin Wilck
@ 2024-12-11 22:58 ` Martin Wilck
  2024-12-19 21:50   ` Benjamin Marzinski
  2024-12-19 23:04   ` Benjamin Marzinski
  2024-12-11 22:58 ` [PATCH v2 04/14] multipathd: quickly re-sync if a map is inconsistent Martin Wilck
                   ` (11 subsequent siblings)
  14 siblings, 2 replies; 27+ messages in thread
From: Martin Wilck @ 2024-12-11 22:58 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: dm-devel, Martin Wilck

Rather than calling sync_mpp early in the checkerloop and tracking
map synchronization with synced_count, call sync_mpp() in the CHECKER_FINISHED
state, if either at least one path of the map has been checked in the current
iteration, or the sync tick has expired. This avoids potentially deleting
paths from the pathvec through the do_sync_mpp() -> update_multipath_strings()
-> sync_paths -> check_removed_paths() call chain while we're iterating over
the pathvec. Also, the time gap between obtaining path states and syncing
the state with the kernel is smaller this way.

Suggested-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 libmultipath/structs.h     |  2 +-
 libmultipath/structs_vec.c |  1 -
 multipathd/main.c          | 26 +++++++++++---------------
 3 files changed, 12 insertions(+), 17 deletions(-)

diff --git a/libmultipath/structs.h b/libmultipath/structs.h
index 6a30c59..9d22bdd 100644
--- a/libmultipath/structs.h
+++ b/libmultipath/structs.h
@@ -471,7 +471,7 @@ struct multipath {
 	int ghost_delay_tick;
 	int queue_mode;
 	unsigned int sync_tick;
-	int synced_count;
+	int checker_count;
 	enum prio_update_type prio_update;
 	uid_t uid;
 	gid_t gid;
diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
index 7a4e3eb..6aa744d 100644
--- a/libmultipath/structs_vec.c
+++ b/libmultipath/structs_vec.c
@@ -530,7 +530,6 @@ update_multipath_table (struct multipath *mpp, vector pathvec, int flags)
 	conf = get_multipath_config();
 	mpp->sync_tick = conf->max_checkint;
 	put_multipath_config(conf);
-	mpp->synced_count++;
 
 	r = libmp_mapinfo(DM_MAP_BY_NAME | MAPINFO_MPATH_ONLY,
 			  (mapid_t) { .str = mpp->alias },
diff --git a/multipathd/main.c b/multipathd/main.c
index 4a28fbb..e4e6bf7 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2470,7 +2470,7 @@ sync_mpp(struct vectors * vecs, struct multipath *mpp, unsigned int ticks)
 	if (mpp->sync_tick)
 		mpp->sync_tick -= (mpp->sync_tick > ticks) ? ticks :
 				  mpp->sync_tick;
-	if (mpp->sync_tick)
+	if (mpp->sync_tick && !mpp->checker_count)
 		return;
 
 	do_sync_mpp(vecs, mpp);
@@ -2513,12 +2513,6 @@ update_path_state (struct vectors * vecs, struct path * pp)
 		return handle_path_wwid_change(pp, vecs)? CHECK_PATH_REMOVED :
 							  CHECK_PATH_SKIPPED;
 	}
-	if (pp->mpp->synced_count == 0) {
-		do_sync_mpp(vecs, pp->mpp);
-		/* if update_multipath_strings orphaned the path, quit early */
-		if (!pp->mpp)
-			return CHECK_PATH_SKIPPED;
-	}
 	if ((newstate != PATH_UP && newstate != PATH_GHOST &&
 	     newstate != PATH_PENDING) && (pp->state == PATH_DELAYED)) {
 		/* If path state become failed again cancel path delay state */
@@ -2918,9 +2912,11 @@ check_paths(struct vectors *vecs, unsigned int ticks)
 	vector_foreach_slot(vecs->pathvec, pp, i) {
 		if (pp->is_checked != CHECK_PATH_UNCHECKED)
 			continue;
-		if (pp->mpp)
+		if (pp->mpp) {
 			pp->is_checked = check_path(pp, ticks);
-		else
+			if (pp->is_checked == CHECK_PATH_STARTED)
+				pp->mpp->checker_count++;
+		} else
 			pp->is_checked = check_uninitialized_path(pp, ticks);
 		if (pp->is_checked == CHECK_PATH_STARTED &&
 		    checker_need_wait(&pp->checker))
@@ -3014,12 +3010,10 @@ checkerloop (void *ap)
 			pthread_cleanup_push(cleanup_lock, &vecs->lock);
 			lock(&vecs->lock);
 			pthread_testcancel();
-			vector_foreach_slot(vecs->mpvec, mpp, i)
-				mpp->synced_count = 0;
 			if (checker_state == CHECKER_STARTING) {
 				vector_foreach_slot(vecs->mpvec, mpp, i) {
-					sync_mpp(vecs, mpp, ticks);
 					mpp->prio_update = PRIO_UPDATE_NONE;
+					mpp->checker_count = 0;
 				}
 				vector_foreach_slot(vecs->pathvec, pp, i)
 					pp->is_checked = CHECK_PATH_UNCHECKED;
@@ -3032,11 +3026,13 @@ checkerloop (void *ap)
 							     start_time.tv_sec);
 			if (checker_state == CHECKER_FINISHED) {
 				vector_foreach_slot(vecs->mpvec, mpp, i) {
-					if ((update_mpp_prio(mpp) ||
-					     (mpp->need_reload && mpp->synced_count > 0)) &&
-					    reload_and_sync_map(mpp, vecs) == 2)
+					sync_mpp(vecs, mpp, ticks);
+					if ((update_mpp_prio(mpp) || mpp->need_reload) &&
+					    reload_and_sync_map(mpp, vecs) == 2) {
 						/* multipath device deleted */
 						i--;
+						continue;
+					}
 				}
 			}
 			lock_cleanup_pop(vecs->lock);
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 04/14] multipathd: quickly re-sync if a map is inconsistent
  2024-12-11 22:58 [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Martin Wilck
                   ` (2 preceding siblings ...)
  2024-12-11 22:58 ` [PATCH v2 03/14] multipathd: sync maps at end of checkerloop Martin Wilck
@ 2024-12-11 22:58 ` Martin Wilck
  2024-12-19 21:57   ` Benjamin Marzinski
  2024-12-19 23:05   ` Benjamin Marzinski
  2024-12-11 22:59 ` [PATCH v2 05/14] multipathd: move yielding for waiters to start of checkerloop Martin Wilck
                   ` (10 subsequent siblings)
  14 siblings, 2 replies; 27+ messages in thread
From: Martin Wilck @ 2024-12-11 22:58 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: dm-devel, Martin Wilck

After reading the kernel device-mapper table, update_pathvec_from_dm()
sets the mpp->need_reload flag if an inconsistent state was found (often a
path with wrong WWID). We expect reload_and_sync_map() to fix this situation.
However, schedule a quick resync in this case, to be double-check that the
inconsistency has been fixed.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index e4e6bf7..178618c 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -3026,13 +3026,22 @@ checkerloop (void *ap)
 							     start_time.tv_sec);
 			if (checker_state == CHECKER_FINISHED) {
 				vector_foreach_slot(vecs->mpvec, mpp, i) {
+					bool inconsistent;
+
 					sync_mpp(vecs, mpp, ticks);
-					if ((update_mpp_prio(mpp) || mpp->need_reload) &&
+					inconsistent = mpp->need_reload;
+					if ((update_mpp_prio(mpp) || inconsistent) &&
 					    reload_and_sync_map(mpp, vecs) == 2) {
 						/* multipath device deleted */
 						i--;
 						continue;
 					}
+					/*
+					 * If we reloaded due to inconsistent state,
+					 * schedule another sync at the next tick.
+					 */
+					if (inconsistent)
+						mpp->sync_tick = 1;
 				}
 			}
 			lock_cleanup_pop(vecs->lock);
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 05/14] multipathd: move yielding for waiters to start of checkerloop
  2024-12-11 22:58 [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Martin Wilck
                   ` (3 preceding siblings ...)
  2024-12-11 22:58 ` [PATCH v2 04/14] multipathd: quickly re-sync if a map is inconsistent Martin Wilck
@ 2024-12-11 22:59 ` Martin Wilck
  2024-12-11 22:59 ` [PATCH v2 06/14] multipathd: add checker_finished() Martin Wilck
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Martin Wilck @ 2024-12-11 22:59 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: dm-devel, Martin Wilck

This simplifies the lock-taking logic and prepares the following
patch.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipathd/main.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 178618c..be542f6 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -3007,6 +3007,16 @@ checkerloop (void *ap)
 			struct multipath *mpp;
 			int i;
 
+			if (checker_state != CHECKER_STARTING) {
+				struct timespec wait = { .tv_nsec = 10000, };
+				if (checker_state == CHECKER_WAITING_FOR_PATHS) {
+					/* wait 5ms */
+					wait.tv_nsec = 5 * 1000 * 1000;
+					checker_state = CHECKER_UPDATING_PATHS;
+				}
+				nanosleep(&wait, NULL);
+			}
+
 			pthread_cleanup_push(cleanup_lock, &vecs->lock);
 			lock(&vecs->lock);
 			pthread_testcancel();
@@ -3045,16 +3055,6 @@ checkerloop (void *ap)
 				}
 			}
 			lock_cleanup_pop(vecs->lock);
-			if (checker_state != CHECKER_FINISHED) {
-				/* Yield to waiters */
-				struct timespec wait = { .tv_nsec = 10000, };
-				if (checker_state == CHECKER_WAITING_FOR_PATHS) {
-					/* wait 5ms */
-					wait.tv_nsec = 5 * 1000 * 1000;
-					checker_state = CHECKER_UPDATING_PATHS;
-				}
-				nanosleep(&wait, NULL);
-			}
 		}
 
 		pthread_cleanup_push(cleanup_lock, &vecs->lock);
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 06/14] multipathd: add checker_finished()
  2024-12-11 22:58 [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Martin Wilck
                   ` (4 preceding siblings ...)
  2024-12-11 22:59 ` [PATCH v2 05/14] multipathd: move yielding for waiters to start of checkerloop Martin Wilck
@ 2024-12-11 22:59 ` Martin Wilck
  2024-12-11 22:59 ` [PATCH v2 07/14] multipathd: move "tick" calls into checker_finished() Martin Wilck
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Martin Wilck @ 2024-12-11 22:59 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: dm-devel, Martin Wilck

Move the code that handles the checker_state == CHECKER_FINISHED state
into a separate function for better readability. Subsequent patches will
add code to this function.

Replace the "&& reload_and_sync_map()" conditional by a subordinate if
statement, as this expresses the logic of the code better.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipathd/main.c | 47 +++++++++++++++++++++++++++--------------------
 1 file changed, 27 insertions(+), 20 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index be542f6..0823484 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2967,6 +2967,31 @@ update_paths(struct vectors *vecs, int *num_paths_p, time_t start_secs)
 	return CHECKER_FINISHED;
 }
 
+static void checker_finished(struct vectors *vecs, unsigned int ticks)
+{
+	struct multipath *mpp;
+	int i;
+
+	vector_foreach_slot(vecs->mpvec, mpp, i) {
+		bool inconsistent;
+
+		sync_mpp(vecs, mpp, ticks);
+		inconsistent = mpp->need_reload;
+		if (update_mpp_prio(mpp) || inconsistent)
+			if (reload_and_sync_map(mpp, vecs) == 2) {
+				/* multipath device deleted */
+				i--;
+				continue;
+			}
+		/*
+		 * If we reloaded due to inconsistent state,
+		 * schedule another sync at the next tick.
+		 */
+		if (inconsistent)
+			mpp->sync_tick = 1;
+	}
+}
+
 static void *
 checkerloop (void *ap)
 {
@@ -3034,26 +3059,8 @@ checkerloop (void *ap)
 			if (checker_state == CHECKER_UPDATING_PATHS)
 				checker_state = update_paths(vecs, &num_paths,
 							     start_time.tv_sec);
-			if (checker_state == CHECKER_FINISHED) {
-				vector_foreach_slot(vecs->mpvec, mpp, i) {
-					bool inconsistent;
-
-					sync_mpp(vecs, mpp, ticks);
-					inconsistent = mpp->need_reload;
-					if ((update_mpp_prio(mpp) || inconsistent) &&
-					    reload_and_sync_map(mpp, vecs) == 2) {
-						/* multipath device deleted */
-						i--;
-						continue;
-					}
-					/*
-					 * If we reloaded due to inconsistent state,
-					 * schedule another sync at the next tick.
-					 */
-					if (inconsistent)
-						mpp->sync_tick = 1;
-				}
-			}
+			if (checker_state == CHECKER_FINISHED)
+				checker_finished(vecs, ticks);
 			lock_cleanup_pop(vecs->lock);
 		}
 
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 07/14] multipathd: move "tick" calls into checker_finished()
  2024-12-11 22:58 [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Martin Wilck
                   ` (5 preceding siblings ...)
  2024-12-11 22:59 ` [PATCH v2 06/14] multipathd: add checker_finished() Martin Wilck
@ 2024-12-11 22:59 ` Martin Wilck
  2024-12-11 22:59 ` [PATCH v2 08/14] multipathd: don't call reload_and_sync_map() from deferred_failback_tick() Martin Wilck
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Martin Wilck @ 2024-12-11 22:59 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: dm-devel, Martin Wilck

The various "tick" functions will only be called after CHECKER_FINISHED
is reached. So we might as well move them into the respective code block
into checker_finished(). This way don't have to drop and re-take he lock
when all paths have been checked.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipathd/main.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 0823484..b045f8b 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2990,6 +2990,11 @@ static void checker_finished(struct vectors *vecs, unsigned int ticks)
 		if (inconsistent)
 			mpp->sync_tick = 1;
 	}
+	deferred_failback_tick(vecs);
+	retry_count_tick(vecs->mpvec);
+	missing_uev_wait_tick(vecs);
+	ghost_delay_tick(vecs);
+	partial_retrigger_tick(vecs->pathvec);
 }
 
 static void *
@@ -3064,16 +3069,6 @@ checkerloop (void *ap)
 			lock_cleanup_pop(vecs->lock);
 		}
 
-		pthread_cleanup_push(cleanup_lock, &vecs->lock);
-		lock(&vecs->lock);
-		pthread_testcancel();
-		deferred_failback_tick(vecs);
-		retry_count_tick(vecs->mpvec);
-		missing_uev_wait_tick(vecs);
-		ghost_delay_tick(vecs);
-		partial_retrigger_tick(vecs->pathvec);
-		lock_cleanup_pop(vecs->lock);
-
 		if (count)
 			count--;
 		else {
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 08/14] multipathd: don't call reload_and_sync_map() from deferred_failback_tick()
  2024-12-11 22:58 [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Martin Wilck
                   ` (6 preceding siblings ...)
  2024-12-11 22:59 ` [PATCH v2 07/14] multipathd: move "tick" calls into checker_finished() Martin Wilck
@ 2024-12-11 22:59 ` Martin Wilck
  2024-12-19 22:04   ` Benjamin Marzinski
  2024-12-11 22:59 ` [PATCH v2 09/14] multipathd: move retry_count_tick() into existing mpvec loop Martin Wilck
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 27+ messages in thread
From: Martin Wilck @ 2024-12-11 22:59 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: dm-devel, Martin Wilck

Instead, move the call inside the existing loop over vecs->mpvec and call
reload_and_sync_map() from checker_finished(). Note that we can't simply
put the call deferred_failback_tick() in the "if" condition, because we
need to adjust the ticks even if update_mpp_prio() returns true. For
consistency, use separate bool variables for each condition that would
necessitate a reload.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 39 ++++++++++++++-------------------------
 1 file changed, 14 insertions(+), 25 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index b045f8b..b1f0f81 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2076,32 +2076,20 @@ ghost_delay_tick(struct vectors *vecs)
 	}
 }
 
-static void
-deferred_failback_tick (struct vectors *vecs)
+static bool deferred_failback_tick(struct multipath *mpp)
 {
-	struct multipath * mpp;
-	int i;
 	bool need_reload;
 
-	vector_foreach_slot (vecs->mpvec, mpp, i) {
-		/*
-		 * deferred failback getting sooner
-		 */
-		if (mpp->pgfailback > 0 && mpp->failback_tick > 0) {
-			mpp->failback_tick--;
+	if (mpp->pgfailback <= 0 || mpp->failback_tick <= 0)
+		return false;
 
-			if (!mpp->failback_tick &&
-			    need_switch_pathgroup(mpp, &need_reload)) {
-				if (need_reload) {
-					if (reload_and_sync_map(mpp, vecs) == 2) {
-						/* multipath device removed */
-						i--;
-					}
-				} else
-					switch_pathgroup(mpp);
-			}
-		}
-	}
+	mpp->failback_tick--;
+	if (!mpp->failback_tick &&
+	    need_switch_pathgroup(mpp, &need_reload) &&
+	    need_reload)
+		return true;
+	else
+		return false;
 }
 
 static void
@@ -2973,11 +2961,13 @@ static void checker_finished(struct vectors *vecs, unsigned int ticks)
 	int i;
 
 	vector_foreach_slot(vecs->mpvec, mpp, i) {
-		bool inconsistent;
+		bool inconsistent, prio_reload, failback_reload;
 
 		sync_mpp(vecs, mpp, ticks);
 		inconsistent = mpp->need_reload;
-		if (update_mpp_prio(mpp) || inconsistent)
+		prio_reload = update_mpp_prio(mpp);
+		failback_reload = deferred_failback_tick(mpp);
+		if (prio_reload || failback_reload || inconsistent)
 			if (reload_and_sync_map(mpp, vecs) == 2) {
 				/* multipath device deleted */
 				i--;
@@ -2990,7 +2980,6 @@ static void checker_finished(struct vectors *vecs, unsigned int ticks)
 		if (inconsistent)
 			mpp->sync_tick = 1;
 	}
-	deferred_failback_tick(vecs);
 	retry_count_tick(vecs->mpvec);
 	missing_uev_wait_tick(vecs);
 	ghost_delay_tick(vecs);
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 09/14] multipathd: move retry_count_tick() into existing mpvec loop
  2024-12-11 22:58 [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Martin Wilck
                   ` (7 preceding siblings ...)
  2024-12-11 22:59 ` [PATCH v2 08/14] multipathd: don't call reload_and_sync_map() from deferred_failback_tick() Martin Wilck
@ 2024-12-11 22:59 ` Martin Wilck
  2024-12-11 22:59 ` [PATCH v2 10/14] multipathd: don't call update_map() from missing_uev_wait_tick() Martin Wilck
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Martin Wilck @ 2024-12-11 22:59 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: dm-devel, Martin Wilck

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 24 ++++++++++--------------
 1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index b1f0f81..fd9ea6c 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2093,21 +2093,17 @@ static bool deferred_failback_tick(struct multipath *mpp)
 }
 
 static void
-retry_count_tick(vector mpvec)
+retry_count_tick(struct multipath *mpp)
 {
-	struct multipath *mpp;
-	unsigned int i;
+	if (mpp->retry_tick <= 0)
+		return;
 
-	vector_foreach_slot (mpvec, mpp, i) {
-		if (mpp->retry_tick > 0) {
-			mpp->stat_total_queueing_time++;
-			condlog(4, "%s: Retrying.. No active path", mpp->alias);
-			if(--mpp->retry_tick == 0) {
-				mpp->stat_map_failures++;
-				dm_queue_if_no_path(mpp, 0);
-				condlog(2, "%s: Disable queueing", mpp->alias);
-			}
-		}
+	mpp->stat_total_queueing_time++;
+	condlog(4, "%s: Retrying.. No active path", mpp->alias);
+	if(--mpp->retry_tick == 0) {
+		mpp->stat_map_failures++;
+		dm_queue_if_no_path(mpp, 0);
+		condlog(2, "%s: Disable queueing", mpp->alias);
 	}
 }
 
@@ -2979,8 +2975,8 @@ static void checker_finished(struct vectors *vecs, unsigned int ticks)
 		 */
 		if (inconsistent)
 			mpp->sync_tick = 1;
+		retry_count_tick(mpp);
 	}
-	retry_count_tick(vecs->mpvec);
 	missing_uev_wait_tick(vecs);
 	ghost_delay_tick(vecs);
 	partial_retrigger_tick(vecs->pathvec);
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 10/14] multipathd: don't call update_map() from missing_uev_wait_tick()
  2024-12-11 22:58 [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Martin Wilck
                   ` (8 preceding siblings ...)
  2024-12-11 22:59 ` [PATCH v2 09/14] multipathd: move retry_count_tick() into existing mpvec loop Martin Wilck
@ 2024-12-11 22:59 ` Martin Wilck
  2024-12-11 22:59 ` [PATCH v2 11/14] multipathd: don't call udpate_map() from ghost_delay_tick() Martin Wilck
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Martin Wilck @ 2024-12-11 22:59 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: dm-devel, Martin Wilck

Instead, check for missing uevents in the existing mpvec loop.
Note that if the uevent tick expires, we need to call update_map() rather than
reload_and_sync_map(), because the paths have not been added to the multipath
(see wait_for_udev handling ev_add_path()).

The set of actions taken by update_map() is a superset of those in
reload_and_sync_map(), thus we don't need to call the latter if we've already
called the former.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 44 ++++++++++++++++++++++----------------------
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index fd9ea6c..6d3c3a2 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2029,29 +2029,19 @@ followover_should_failback(struct multipath *mpp)
 	return 0;
 }
 
-static void
-missing_uev_wait_tick(struct vectors *vecs)
+/* Returns true if update_map() needs to be called */
+static bool
+missing_uev_wait_tick(struct multipath *mpp, bool *timed_out)
 {
-	struct multipath * mpp;
-	int i;
-	int timed_out = 0;
+	if (mpp->wait_for_udev && --mpp->uev_wait_tick <= 0) {
+		int wait = mpp->wait_for_udev;
 
-	vector_foreach_slot (vecs->mpvec, mpp, i) {
-		if (mpp->wait_for_udev && --mpp->uev_wait_tick <= 0) {
-			timed_out = 1;
-			condlog(0, "%s: timeout waiting on creation uevent. enabling reloads", mpp->alias);
-			if (mpp->wait_for_udev > 1 &&
-			    update_map(mpp, vecs, 0)) {
-				/* update_map removed map */
-				i--;
-				continue;
-			}
-			mpp->wait_for_udev = 0;
-		}
+		mpp->wait_for_udev = 0;
+		*timed_out = true;
+		condlog(0, "%s: timeout waiting on creation uevent. enabling reloads", mpp->alias);
+		return wait > 1;
 	}
-
-	if (timed_out && !need_to_delay_reconfig(vecs))
-		unblock_reconfigure();
+	return false;
 }
 
 static void
@@ -2954,16 +2944,25 @@ update_paths(struct vectors *vecs, int *num_paths_p, time_t start_secs)
 static void checker_finished(struct vectors *vecs, unsigned int ticks)
 {
 	struct multipath *mpp;
+	bool uev_timed_out = false;
 	int i;
 
 	vector_foreach_slot(vecs->mpvec, mpp, i) {
 		bool inconsistent, prio_reload, failback_reload;
+		bool uev_wait_reload;
 
 		sync_mpp(vecs, mpp, ticks);
 		inconsistent = mpp->need_reload;
 		prio_reload = update_mpp_prio(mpp);
 		failback_reload = deferred_failback_tick(mpp);
-		if (prio_reload || failback_reload || inconsistent)
+		uev_wait_reload = missing_uev_wait_tick(mpp, &uev_timed_out);
+		if (uev_wait_reload) {
+			if (update_map(mpp, vecs, 0)) {
+				/* multipath device deleted */
+				i--;
+				continue;
+			}
+		} else if (prio_reload || failback_reload || inconsistent)
 			if (reload_and_sync_map(mpp, vecs) == 2) {
 				/* multipath device deleted */
 				i--;
@@ -2977,7 +2976,8 @@ static void checker_finished(struct vectors *vecs, unsigned int ticks)
 			mpp->sync_tick = 1;
 		retry_count_tick(mpp);
 	}
-	missing_uev_wait_tick(vecs);
+	if (uev_timed_out && !need_to_delay_reconfig(vecs))
+		unblock_reconfigure();
 	ghost_delay_tick(vecs);
 	partial_retrigger_tick(vecs->pathvec);
 }
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 11/14] multipathd: don't call udpate_map() from ghost_delay_tick()
  2024-12-11 22:58 [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Martin Wilck
                   ` (9 preceding siblings ...)
  2024-12-11 22:59 ` [PATCH v2 10/14] multipathd: don't call update_map() from missing_uev_wait_tick() Martin Wilck
@ 2024-12-11 22:59 ` Martin Wilck
  2024-12-11 22:59 ` [PATCH v2 12/14] multipathd: only call reload_and_sync_map() when ghost delay expires Martin Wilck
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Martin Wilck @ 2024-12-11 22:59 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: dm-devel, Martin Wilck

Instead, move the call into the existing mpvec loop and call update_map()
from checker_finished().

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 33 ++++++++++++---------------------
 1 file changed, 12 insertions(+), 21 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 6d3c3a2..71d5363 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2044,26 +2044,17 @@ missing_uev_wait_tick(struct multipath *mpp, bool *timed_out)
 	return false;
 }
 
-static void
-ghost_delay_tick(struct vectors *vecs)
+static bool
+ghost_delay_tick(struct  multipath * mpp)
 {
-	struct multipath * mpp;
-	int i;
-
-	vector_foreach_slot (vecs->mpvec, mpp, i) {
-		if (mpp->ghost_delay_tick <= 0)
-			continue;
-		if (--mpp->ghost_delay_tick <= 0) {
-			condlog(0, "%s: timed out waiting for active path",
-				mpp->alias);
-			mpp->force_udev_reload = 1;
-			if (update_map(mpp, vecs, 0) != 0) {
-				/* update_map removed map */
-				i--;
-				continue;
-			}
-		}
+	if (mpp->ghost_delay_tick <= 0)
+		return false;
+	if (--mpp->ghost_delay_tick <= 0) {
+		condlog(0, "%s: timed out waiting for active path", mpp->alias);
+		mpp->force_udev_reload = 1;
+		return true;
 	}
+	return false;
 }
 
 static bool deferred_failback_tick(struct multipath *mpp)
@@ -2949,14 +2940,15 @@ static void checker_finished(struct vectors *vecs, unsigned int ticks)
 
 	vector_foreach_slot(vecs->mpvec, mpp, i) {
 		bool inconsistent, prio_reload, failback_reload;
-		bool uev_wait_reload;
+		bool uev_wait_reload, ghost_reload;
 
 		sync_mpp(vecs, mpp, ticks);
 		inconsistent = mpp->need_reload;
 		prio_reload = update_mpp_prio(mpp);
 		failback_reload = deferred_failback_tick(mpp);
 		uev_wait_reload = missing_uev_wait_tick(mpp, &uev_timed_out);
-		if (uev_wait_reload) {
+		ghost_reload = ghost_delay_tick(mpp);
+		if (uev_wait_reload || ghost_reload) {
 			if (update_map(mpp, vecs, 0)) {
 				/* multipath device deleted */
 				i--;
@@ -2978,7 +2970,6 @@ static void checker_finished(struct vectors *vecs, unsigned int ticks)
 	}
 	if (uev_timed_out && !need_to_delay_reconfig(vecs))
 		unblock_reconfigure();
-	ghost_delay_tick(vecs);
 	partial_retrigger_tick(vecs->pathvec);
 }
 
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 12/14] multipathd: only call reload_and_sync_map() when ghost delay expires
  2024-12-11 22:58 [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Martin Wilck
                   ` (10 preceding siblings ...)
  2024-12-11 22:59 ` [PATCH v2 11/14] multipathd: don't call udpate_map() from ghost_delay_tick() Martin Wilck
@ 2024-12-11 22:59 ` Martin Wilck
  2024-12-11 22:59 ` [PATCH v2 13/14] multipathd: remove non-existent maps in checkerloop Martin Wilck
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Martin Wilck @ 2024-12-11 22:59 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: dm-devel, Martin Wilck

While we're waiting for active paths to appear when ghost delay is
enabled, we do add new paths to multipathd's internal mpp data structure
as they are discovered (unlike missing uevent case, where we orphan
the paths instead). When the ghost_delay timer expires, it should be
sufficient to call the more light-weight reload_and_sync_map() instead
of update_map().

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 71d5363..8d8b237 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2948,13 +2948,13 @@ static void checker_finished(struct vectors *vecs, unsigned int ticks)
 		failback_reload = deferred_failback_tick(mpp);
 		uev_wait_reload = missing_uev_wait_tick(mpp, &uev_timed_out);
 		ghost_reload = ghost_delay_tick(mpp);
-		if (uev_wait_reload || ghost_reload) {
+		if (uev_wait_reload) {
 			if (update_map(mpp, vecs, 0)) {
 				/* multipath device deleted */
 				i--;
 				continue;
 			}
-		} else if (prio_reload || failback_reload || inconsistent)
+		} else if (prio_reload || failback_reload || ghost_reload || inconsistent)
 			if (reload_and_sync_map(mpp, vecs) == 2) {
 				/* multipath device deleted */
 				i--;
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 13/14] multipathd: remove non-existent maps in checkerloop
  2024-12-11 22:58 [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Martin Wilck
                   ` (11 preceding siblings ...)
  2024-12-11 22:59 ` [PATCH v2 12/14] multipathd: only call reload_and_sync_map() when ghost delay expires Martin Wilck
@ 2024-12-11 22:59 ` Martin Wilck
  2024-12-11 22:59 ` [PATCH v2 14/14] multipathd: remove mpvec_garbage_collector() Martin Wilck
  2024-12-19 22:33 ` [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Benjamin Marzinski
  14 siblings, 0 replies; 27+ messages in thread
From: Martin Wilck @ 2024-12-11 22:59 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: dm-devel, Martin Wilck

If update_multipath_strings() returns DMP_NOT_FOUND for a map that
we are tracking in the checker, the map must have been removed by
an external entity. Remove it.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 8d8b237..138ddd0 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2411,7 +2411,7 @@ get_new_state(struct path *pp)
 	return newstate;
 }
 
-static void
+static int
 do_sync_mpp(struct vectors * vecs, struct multipath *mpp)
 {
 	int i, ret;
@@ -2424,21 +2424,22 @@ do_sync_mpp(struct vectors * vecs, struct multipath *mpp)
 			"couldn't synchronize with kernel state");
 		vector_foreach_slot (mpp->paths, pp, i)
 			pp->dmstate = PSTATE_UNDEF;
-		return;
+		return ret;
 	}
 	set_no_path_retry(mpp);
+	return ret;
 }
 
-static void
+static int
 sync_mpp(struct vectors * vecs, struct multipath *mpp, unsigned int ticks)
 {
 	if (mpp->sync_tick)
 		mpp->sync_tick -= (mpp->sync_tick > ticks) ? ticks :
 				  mpp->sync_tick;
 	if (mpp->sync_tick && !mpp->checker_count)
-		return;
+		return DMP_OK;
 
-	do_sync_mpp(vecs, mpp);
+	return do_sync_mpp(vecs, mpp);
 }
 
 static int
@@ -2942,7 +2943,11 @@ static void checker_finished(struct vectors *vecs, unsigned int ticks)
 		bool inconsistent, prio_reload, failback_reload;
 		bool uev_wait_reload, ghost_reload;
 
-		sync_mpp(vecs, mpp, ticks);
+		if (sync_mpp(vecs, mpp, ticks) == DMP_NOT_FOUND) {
+			remove_map_and_stop_waiter(mpp, vecs);
+			i--;
+			continue;
+		}
 		inconsistent = mpp->need_reload;
 		prio_reload = update_mpp_prio(mpp);
 		failback_reload = deferred_failback_tick(mpp);
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 14/14] multipathd: remove mpvec_garbage_collector()
  2024-12-11 22:58 [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Martin Wilck
                   ` (12 preceding siblings ...)
  2024-12-11 22:59 ` [PATCH v2 13/14] multipathd: remove non-existent maps in checkerloop Martin Wilck
@ 2024-12-11 22:59 ` Martin Wilck
  2024-12-19 22:33 ` [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Benjamin Marzinski
  14 siblings, 0 replies; 27+ messages in thread
From: Martin Wilck @ 2024-12-11 22:59 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: dm-devel, Martin Wilck

This function duplicates functionality that we now have in the
checker_finished() code path. Remove it.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 31 -------------------------------
 1 file changed, 31 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 138ddd0..941ec68 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -1967,24 +1967,6 @@ enable_group(struct path * pp)
 	}
 }
 
-static void
-mpvec_garbage_collector (struct vectors * vecs)
-{
-	struct multipath * mpp;
-	int i;
-
-	if (!vecs->mpvec)
-		return;
-
-	vector_foreach_slot (vecs->mpvec, mpp, i) {
-		if (mpp && mpp->alias && !dm_map_present(mpp->alias)) {
-			condlog(2, "%s: remove dead map", mpp->alias);
-			remove_map_and_stop_waiter(mpp, vecs);
-			i--;
-		}
-	}
-}
-
 /* This is called after a path has started working again. It the multipath
  * device for this path uses the followover failback type, and this is the
  * best pathgroup, and this is the first path in the pathgroup to come back
@@ -2983,7 +2965,6 @@ checkerloop (void *ap)
 {
 	struct vectors *vecs;
 	struct path *pp;
-	int count = 0;
 	struct timespec last_time;
 	struct config *conf;
 	int foreign_tick = 0;
@@ -3050,18 +3031,6 @@ checkerloop (void *ap)
 			lock_cleanup_pop(vecs->lock);
 		}
 
-		if (count)
-			count--;
-		else {
-			pthread_cleanup_push(cleanup_lock, &vecs->lock);
-			lock(&vecs->lock);
-			pthread_testcancel();
-			condlog(4, "map garbage collection");
-			mpvec_garbage_collector(vecs);
-			count = MAPGCINT;
-			lock_cleanup_pop(vecs->lock);
-		}
-
 		get_monotonic_time(&end_time);
 		timespecsub(&end_time, &start_time, &diff_time);
 		if (num_paths) {
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 03/14] multipathd: sync maps at end of checkerloop
  2024-12-11 22:58 ` [PATCH v2 03/14] multipathd: sync maps at end of checkerloop Martin Wilck
@ 2024-12-19 21:50   ` Benjamin Marzinski
  2025-01-17 19:04     ` Martin Wilck
  2024-12-19 23:04   ` Benjamin Marzinski
  1 sibling, 1 reply; 27+ messages in thread
From: Benjamin Marzinski @ 2024-12-19 21:50 UTC (permalink / raw)
  To: Martin Wilck; +Cc: Christophe Varoqui, dm-devel, Martin Wilck

On Wed, Dec 11, 2024 at 11:58:58PM +0100, Martin Wilck wrote:
> Rather than calling sync_mpp early in the checkerloop and tracking
> map synchronization with synced_count, call sync_mpp() in the CHECKER_FINISHED
> state, if either at least one path of the map has been checked in the current
> iteration, or the sync tick has expired. This avoids potentially deleting
> paths from the pathvec through the do_sync_mpp() -> update_multipath_strings()
> -> sync_paths -> check_removed_paths() call chain while we're iterating over
> the pathvec. Also, the time gap between obtaining path states and syncing
> the state with the kernel is smaller this way.
> 

Sorry for the delayed review. I've been busy lately and there was a lot
to look at with moving the syncs from before we check the paths till the
end. Turns out I'm mostly o.k. with this. Syncing right before we
checked the paths still left a small window where things could change in
the kernel state, and we wouldn't see them before we checked the paths.
It's just a larger window now.

The one place where we won't just pick up the change on the next checker
loop is enable_group(). If the kernel disables a pathgroup, multipathd
would re-enable it if a path in it switched states to PATH_UP. I'm not
totally sure how necessary this is. The kenel has code to re-enable the
pathgroups. But it's been there a while, and perhaps it does avoid an
issue.

The kernel doesn't even necessarily send an event if it disables a
pathgroup (since the pathgroup isn't actually disabled, it's just
bypassed, which means all the other pathgroups are checked first).  So
there is a real chance that since the last path check, the pgp->status
could have changed. By not checking before we check the path, we could
not realize that tht pgp is in PGSTATE_DISABLED when a path switches
states to PATH_UP.  It shouldn't be that hard to check in sync_mpp() if
there are any disabled path_groups the include a path where
pp->is_checked == CHECK_PATH_NEW_UP. So we could move the enable_group()
code to checker_finished() as well and fix this.

-Ben

> Suggested-by: Benjamin Marzinski <bmarzins@redhat.com>
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  libmultipath/structs.h     |  2 +-
>  libmultipath/structs_vec.c |  1 -
>  multipathd/main.c          | 26 +++++++++++---------------
>  3 files changed, 12 insertions(+), 17 deletions(-)
> 
> diff --git a/libmultipath/structs.h b/libmultipath/structs.h
> index 6a30c59..9d22bdd 100644
> --- a/libmultipath/structs.h
> +++ b/libmultipath/structs.h
> @@ -471,7 +471,7 @@ struct multipath {
>  	int ghost_delay_tick;
>  	int queue_mode;
>  	unsigned int sync_tick;
> -	int synced_count;
> +	int checker_count;
>  	enum prio_update_type prio_update;
>  	uid_t uid;
>  	gid_t gid;
> diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
> index 7a4e3eb..6aa744d 100644
> --- a/libmultipath/structs_vec.c
> +++ b/libmultipath/structs_vec.c
> @@ -530,7 +530,6 @@ update_multipath_table (struct multipath *mpp, vector pathvec, int flags)
>  	conf = get_multipath_config();
>  	mpp->sync_tick = conf->max_checkint;
>  	put_multipath_config(conf);
> -	mpp->synced_count++;
>  
>  	r = libmp_mapinfo(DM_MAP_BY_NAME | MAPINFO_MPATH_ONLY,
>  			  (mapid_t) { .str = mpp->alias },
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 4a28fbb..e4e6bf7 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -2470,7 +2470,7 @@ sync_mpp(struct vectors * vecs, struct multipath *mpp, unsigned int ticks)
>  	if (mpp->sync_tick)
>  		mpp->sync_tick -= (mpp->sync_tick > ticks) ? ticks :
>  				  mpp->sync_tick;
> -	if (mpp->sync_tick)
> +	if (mpp->sync_tick && !mpp->checker_count)
>  		return;
>  
>  	do_sync_mpp(vecs, mpp);
> @@ -2513,12 +2513,6 @@ update_path_state (struct vectors * vecs, struct path * pp)
>  		return handle_path_wwid_change(pp, vecs)? CHECK_PATH_REMOVED :
>  							  CHECK_PATH_SKIPPED;
>  	}
> -	if (pp->mpp->synced_count == 0) {
> -		do_sync_mpp(vecs, pp->mpp);
> -		/* if update_multipath_strings orphaned the path, quit early */
> -		if (!pp->mpp)
> -			return CHECK_PATH_SKIPPED;
> -	}
>  	if ((newstate != PATH_UP && newstate != PATH_GHOST &&
>  	     newstate != PATH_PENDING) && (pp->state == PATH_DELAYED)) {
>  		/* If path state become failed again cancel path delay state */
> @@ -2918,9 +2912,11 @@ check_paths(struct vectors *vecs, unsigned int ticks)
>  	vector_foreach_slot(vecs->pathvec, pp, i) {
>  		if (pp->is_checked != CHECK_PATH_UNCHECKED)
>  			continue;
> -		if (pp->mpp)
> +		if (pp->mpp) {
>  			pp->is_checked = check_path(pp, ticks);
> -		else
> +			if (pp->is_checked == CHECK_PATH_STARTED)
> +				pp->mpp->checker_count++;
> +		} else
>  			pp->is_checked = check_uninitialized_path(pp, ticks);
>  		if (pp->is_checked == CHECK_PATH_STARTED &&
>  		    checker_need_wait(&pp->checker))
> @@ -3014,12 +3010,10 @@ checkerloop (void *ap)
>  			pthread_cleanup_push(cleanup_lock, &vecs->lock);
>  			lock(&vecs->lock);
>  			pthread_testcancel();
> -			vector_foreach_slot(vecs->mpvec, mpp, i)
> -				mpp->synced_count = 0;
>  			if (checker_state == CHECKER_STARTING) {
>  				vector_foreach_slot(vecs->mpvec, mpp, i) {
> -					sync_mpp(vecs, mpp, ticks);
>  					mpp->prio_update = PRIO_UPDATE_NONE;
> +					mpp->checker_count = 0;
>  				}
>  				vector_foreach_slot(vecs->pathvec, pp, i)
>  					pp->is_checked = CHECK_PATH_UNCHECKED;
> @@ -3032,11 +3026,13 @@ checkerloop (void *ap)
>  							     start_time.tv_sec);
>  			if (checker_state == CHECKER_FINISHED) {
>  				vector_foreach_slot(vecs->mpvec, mpp, i) {
> -					if ((update_mpp_prio(mpp) ||
> -					     (mpp->need_reload && mpp->synced_count > 0)) &&
> -					    reload_and_sync_map(mpp, vecs) == 2)
> +					sync_mpp(vecs, mpp, ticks);
> +					if ((update_mpp_prio(mpp) || mpp->need_reload) &&
> +					    reload_and_sync_map(mpp, vecs) == 2) {
>  						/* multipath device deleted */
>  						i--;
> +						continue;
> +					}
>  				}
>  			}
>  			lock_cleanup_pop(vecs->lock);
> -- 
> 2.47.0


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 04/14] multipathd: quickly re-sync if a map is inconsistent
  2024-12-11 22:58 ` [PATCH v2 04/14] multipathd: quickly re-sync if a map is inconsistent Martin Wilck
@ 2024-12-19 21:57   ` Benjamin Marzinski
  2025-01-14 21:37     ` Martin Wilck
  2024-12-19 23:05   ` Benjamin Marzinski
  1 sibling, 1 reply; 27+ messages in thread
From: Benjamin Marzinski @ 2024-12-19 21:57 UTC (permalink / raw)
  To: Martin Wilck; +Cc: Christophe Varoqui, dm-devel, Martin Wilck

On Wed, Dec 11, 2024 at 11:58:59PM +0100, Martin Wilck wrote:
> After reading the kernel device-mapper table, update_pathvec_from_dm()
> sets the mpp->need_reload flag if an inconsistent state was found (often a
> path with wrong WWID). We expect reload_and_sync_map() to fix this situation.
> However, schedule a quick resync in this case, to be double-check that the
> inconsistency has been fixed.

I'm not too sure about this. My biggest worry with handling
mpp->need_reload in the checkerloop is what happens if for some reason
multipathd and the kernel keep disagreeing on something. You would just
keep reloading the device. That seems unlikely, so I've o.k. with
handling it here, but if that does happen, this would make it much
worse.  Instead of reloading every path check, you would reload every
loop.

If you do detect an inconsistent state, and trigger a reload, and the
state is still inconsistent after that, I would argue that yet another
reload is more likely to remain inconsistent than it is to fix the
problem. So I would rather not speed it up.

If I'm overlooking a case where a second reload would fix a problem,
please let me know.

-Ben

> 
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  multipathd/main.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/multipathd/main.c b/multipathd/main.c
> index e4e6bf7..178618c 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -3026,13 +3026,22 @@ checkerloop (void *ap)
>  							     start_time.tv_sec);
>  			if (checker_state == CHECKER_FINISHED) {
>  				vector_foreach_slot(vecs->mpvec, mpp, i) {
> +					bool inconsistent;
> +
>  					sync_mpp(vecs, mpp, ticks);
> -					if ((update_mpp_prio(mpp) || mpp->need_reload) &&
> +					inconsistent = mpp->need_reload;
> +					if ((update_mpp_prio(mpp) || inconsistent) &&
>  					    reload_and_sync_map(mpp, vecs) == 2) {
>  						/* multipath device deleted */
>  						i--;
>  						continue;
>  					}
> +					/*
> +					 * If we reloaded due to inconsistent state,
> +					 * schedule another sync at the next tick.
> +					 */
> +					if (inconsistent)
> +						mpp->sync_tick = 1;
>  				}
>  			}
>  			lock_cleanup_pop(vecs->lock);
> -- 
> 2.47.0


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 08/14] multipathd: don't call reload_and_sync_map() from deferred_failback_tick()
  2024-12-11 22:59 ` [PATCH v2 08/14] multipathd: don't call reload_and_sync_map() from deferred_failback_tick() Martin Wilck
@ 2024-12-19 22:04   ` Benjamin Marzinski
  2025-01-14 21:40     ` Martin Wilck
  0 siblings, 1 reply; 27+ messages in thread
From: Benjamin Marzinski @ 2024-12-19 22:04 UTC (permalink / raw)
  To: Martin Wilck; +Cc: Christophe Varoqui, dm-devel, Martin Wilck

On Wed, Dec 11, 2024 at 11:59:03PM +0100, Martin Wilck wrote:
> Instead, move the call inside the existing loop over vecs->mpvec and call
> reload_and_sync_map() from checker_finished(). Note that we can't simply
> put the call deferred_failback_tick() in the "if" condition, because we
> need to adjust the ticks even if update_mpp_prio() returns true. For
> consistency, use separate bool variables for each condition that would
> necessitate a reload.
> 
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  multipathd/main.c | 39 ++++++++++++++-------------------------
>  1 file changed, 14 insertions(+), 25 deletions(-)
> 
> diff --git a/multipathd/main.c b/multipathd/main.c
> index b045f8b..b1f0f81 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -2076,32 +2076,20 @@ ghost_delay_tick(struct vectors *vecs)
>  	}
>  }
>  
> -static void
> -deferred_failback_tick (struct vectors *vecs)
> +static bool deferred_failback_tick(struct multipath *mpp)
>  {
> -	struct multipath * mpp;
> -	int i;
>  	bool need_reload;
>  
> -	vector_foreach_slot (vecs->mpvec, mpp, i) {
> -		/*
> -		 * deferred failback getting sooner
> -		 */
> -		if (mpp->pgfailback > 0 && mpp->failback_tick > 0) {
> -			mpp->failback_tick--;
> +	if (mpp->pgfailback <= 0 || mpp->failback_tick <= 0)
> +		return false;
>  
> -			if (!mpp->failback_tick &&
> -			    need_switch_pathgroup(mpp, &need_reload)) {
> -				if (need_reload) {
> -					if (reload_and_sync_map(mpp, vecs) == 2) {
> -						/* multipath device removed */
> -						i--;
> -					}
> -				} else
> -					switch_pathgroup(mpp);
> -			}
> -		}
> -	}
> +	mpp->failback_tick--;
> +	if (!mpp->failback_tick &&
> +	    need_switch_pathgroup(mpp, &need_reload) &&
> +	    need_reload)
> +		return true;
> +	else

If (!mpp->failback_tick && need_switch_pathgroup(mpp, &need_reload) is
true by need_reload isn't, don't we still need to call
switch_pathgroup(mpp);

-Ben

> +		return false;
>  }
>  
>  static void
> @@ -2973,11 +2961,13 @@ static void checker_finished(struct vectors *vecs, unsigned int ticks)
>  	int i;
>  
>  	vector_foreach_slot(vecs->mpvec, mpp, i) {
> -		bool inconsistent;
> +		bool inconsistent, prio_reload, failback_reload;
>  
>  		sync_mpp(vecs, mpp, ticks);
>  		inconsistent = mpp->need_reload;
> -		if (update_mpp_prio(mpp) || inconsistent)
> +		prio_reload = update_mpp_prio(mpp);
> +		failback_reload = deferred_failback_tick(mpp);
> +		if (prio_reload || failback_reload || inconsistent)
>  			if (reload_and_sync_map(mpp, vecs) == 2) {
>  				/* multipath device deleted */
>  				i--;
> @@ -2990,7 +2980,6 @@ static void checker_finished(struct vectors *vecs, unsigned int ticks)
>  		if (inconsistent)
>  			mpp->sync_tick = 1;
>  	}
> -	deferred_failback_tick(vecs);
>  	retry_count_tick(vecs->mpvec);
>  	missing_uev_wait_tick(vecs);
>  	ghost_delay_tick(vecs);
> -- 
> 2.47.0


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work
  2024-12-11 22:58 [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Martin Wilck
                   ` (13 preceding siblings ...)
  2024-12-11 22:59 ` [PATCH v2 14/14] multipathd: remove mpvec_garbage_collector() Martin Wilck
@ 2024-12-19 22:33 ` Benjamin Marzinski
  14 siblings, 0 replies; 27+ messages in thread
From: Benjamin Marzinski @ 2024-12-19 22:33 UTC (permalink / raw)
  To: Martin Wilck; +Cc: Christophe Varoqui, dm-devel, Martin Wilck

On Wed, Dec 11, 2024 at 11:58:55PM +0100, Martin Wilck wrote:

For all the patches except 3, 4, and 8:
Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>


> This patch set goes on top of Ben's set [1] for github issue 105 [2].
> 
> The first patch implements the remark I had on patch 2 on Ben's set, the 2nd
> is a minor cleanup.
> 
> Patch 3 moves the sync_mpp() call from the beginning to the end of the
> checkerloop, as suggested by Ben in [3]. If an inconsitency is found
> (mpp->need_reload), we reload the map and schedule another sync for the
> next tick (patch 4).
> 
> Patch 5 ff. reshuffle the code in checkerloop(). There is now one function,
> checker_finished(), that takes all actions that need to be done with the vecs
> lock taken after the checkers have finished. checkerloop() enters this
> function immediately when the checkers have finished, without dropping and
> re-acquiring the vecs lock. The map reload logic is completely handled in this
> function.
> 
> The various _tick() functions don't loop over mpvec any more; they are now
> just called for a single mpp, and they simply return true if a map reload is
> required.  The actual reload action differs: if missing_uev_wait_tick()
> requests a reload, it needs to be a full update_map() (which calls
> adopt_paths()), whereas in the other cases, reload_and_sync_map() is sufficient.
> Patch 12 changes the reload action for the ghost delay tick.
> 
> Patch 13 removes maps if that are not found in the kernel any more. This
> obsoletes the map garbage collector. Unlike the logic in v1, we won't remove
> maps on arbitrary error conditions any more.
> 
> Changes v1 -> v2:
> 
> Removed patch 3 and 4 of v1 and replaced them by an alternative approach.
> Instead of allowing map and path removal in the checker loop, the kernel sync
> is moved towards the end.
> 
> Patch 5 ff. in v1 contained a bug in the new checker_finished() function.
> If one "tick" function returned true, the other might not be executed any
> more. In v2, all tick functions are executed, and the action to be taken is
> selected according to the combined results.  Also, we won't call
> reload_and_sync_map() when we've already called update_map().
> 
> Patch 13 is new in v2. Patch 8 from v1 has been moved after patch 13.
> 
> (In the thread following the review of v1, I mistakenly wrote about an
> upcoming "v4" of this patch set. That was wrong, I meant this v2 here).
> 
> Reviews & comments welcome.
> 
> Regards
> Martin
> 
> [1] https://lore.kernel.org/dm-devel/20241205035638.1218953-1-bmarzins@redhat.com/
> [2] https://github.com/opensvc/multipath-tools/issues/105
> [3] https://lore.kernel.org/dm-devel/Z1iUekRg8sai8HLT@redhat.com/
> 
> 
> Martin Wilck (14):
>   multipathd: don't reload map in update_mpp_prio()
>   multipathd: remove dm_get_info() call from refresh_multipath()
>   multipathd: sync maps at end of checkerloop
>   multipathd: quickly re-sync if a map is inconsistent
>   multipathd: move yielding for waiters to start of checkerloop
>   multipathd: add checker_finished()
>   multipathd: move "tick" calls into checker_finished()
>   multipathd: don't call reload_and_sync_map() from
>     deferred_failback_tick()
>   multipathd: move retry_count_tick() into existing mpvec loop
>   multipathd: don't call update_map() from missing_uev_wait_tick()
>   multipathd: don't call udpate_map() from ghost_delay_tick()
>   multipathd: only call reload_and_sync_map() when ghost delay expires
>   multipathd: remove non-existent maps in checkerloop
>   multipathd: remove mpvec_garbage_collector()
> 
>  libmultipath/structs.h     |   2 +-
>  libmultipath/structs_vec.c |   1 -
>  multipathd/main.c          | 287 +++++++++++++++----------------------
>  3 files changed, 118 insertions(+), 172 deletions(-)
> 
> -- 
> 2.47.0


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 03/14] multipathd: sync maps at end of checkerloop
  2024-12-11 22:58 ` [PATCH v2 03/14] multipathd: sync maps at end of checkerloop Martin Wilck
  2024-12-19 21:50   ` Benjamin Marzinski
@ 2024-12-19 23:04   ` Benjamin Marzinski
  2025-01-14 21:36     ` Martin Wilck
  1 sibling, 1 reply; 27+ messages in thread
From: Benjamin Marzinski @ 2024-12-19 23:04 UTC (permalink / raw)
  To: Martin Wilck; +Cc: Christophe Varoqui, dm-devel, Martin Wilck

On Wed, Dec 11, 2024 at 11:58:58PM +0100, Martin Wilck wrote:
> Rather than calling sync_mpp early in the checkerloop and tracking
> map synchronization with synced_count, call sync_mpp() in the CHECKER_FINISHED
> state, if either at least one path of the map has been checked in the current
> iteration, or the sync tick has expired. This avoids potentially deleting
> paths from the pathvec through the do_sync_mpp() -> update_multipath_strings()
> -> sync_paths -> check_removed_paths() call chain while we're iterating over
> the pathvec. Also, the time gap between obtaining path states and syncing
> the state with the kernel is smaller this way.
> 
> Suggested-by: Benjamin Marzinski <bmarzins@redhat.com>
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  libmultipath/structs.h     |  2 +-
>  libmultipath/structs_vec.c |  1 -
>  multipathd/main.c          | 26 +++++++++++---------------
>  3 files changed, 12 insertions(+), 17 deletions(-)
> 
> diff --git a/libmultipath/structs.h b/libmultipath/structs.h
> index 6a30c59..9d22bdd 100644
> --- a/libmultipath/structs.h
> +++ b/libmultipath/structs.h
> @@ -471,7 +471,7 @@ struct multipath {
>  	int ghost_delay_tick;
>  	int queue_mode;
>  	unsigned int sync_tick;
> -	int synced_count;
> +	int checker_count;
>  	enum prio_update_type prio_update;
>  	uid_t uid;
>  	gid_t gid;
> diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
> index 7a4e3eb..6aa744d 100644
> --- a/libmultipath/structs_vec.c
> +++ b/libmultipath/structs_vec.c
> @@ -530,7 +530,6 @@ update_multipath_table (struct multipath *mpp, vector pathvec, int flags)
>  	conf = get_multipath_config();
>  	mpp->sync_tick = conf->max_checkint;
>  	put_multipath_config(conf);
> -	mpp->synced_count++;
>  
>  	r = libmp_mapinfo(DM_MAP_BY_NAME | MAPINFO_MPATH_ONLY,
>  			  (mapid_t) { .str = mpp->alias },
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 4a28fbb..e4e6bf7 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -2470,7 +2470,7 @@ sync_mpp(struct vectors * vecs, struct multipath *mpp, unsigned int ticks)
>  	if (mpp->sync_tick)
>  		mpp->sync_tick -= (mpp->sync_tick > ticks) ? ticks :
>  				  mpp->sync_tick;
> -	if (mpp->sync_tick)
> +	if (mpp->sync_tick && !mpp->checker_count)
>  		return;
>  
>  	do_sync_mpp(vecs, mpp);
> @@ -2513,12 +2513,6 @@ update_path_state (struct vectors * vecs, struct path * pp)
>  		return handle_path_wwid_change(pp, vecs)? CHECK_PATH_REMOVED :
>  							  CHECK_PATH_SKIPPED;
>  	}
> -	if (pp->mpp->synced_count == 0) {
> -		do_sync_mpp(vecs, pp->mpp);
> -		/* if update_multipath_strings orphaned the path, quit early */
> -		if (!pp->mpp)
> -			return CHECK_PATH_SKIPPED;
> -	}
>  	if ((newstate != PATH_UP && newstate != PATH_GHOST &&
>  	     newstate != PATH_PENDING) && (pp->state == PATH_DELAYED)) {
>  		/* If path state become failed again cancel path delay state */
> @@ -2918,9 +2912,11 @@ check_paths(struct vectors *vecs, unsigned int ticks)
>  	vector_foreach_slot(vecs->pathvec, pp, i) {
>  		if (pp->is_checked != CHECK_PATH_UNCHECKED)
>  			continue;
> -		if (pp->mpp)
> +		if (pp->mpp) {
>  			pp->is_checked = check_path(pp, ticks);
> -		else
> +			if (pp->is_checked == CHECK_PATH_STARTED)
> +				pp->mpp->checker_count++;
> +		} else
>  			pp->is_checked = check_uninitialized_path(pp, ticks);
>  		if (pp->is_checked == CHECK_PATH_STARTED &&
>  		    checker_need_wait(&pp->checker))
> @@ -3014,12 +3010,10 @@ checkerloop (void *ap)
>  			pthread_cleanup_push(cleanup_lock, &vecs->lock);
>  			lock(&vecs->lock);
>  			pthread_testcancel();
> -			vector_foreach_slot(vecs->mpvec, mpp, i)
> -				mpp->synced_count = 0;
>  			if (checker_state == CHECKER_STARTING) {
>  				vector_foreach_slot(vecs->mpvec, mpp, i) {
> -					sync_mpp(vecs, mpp, ticks);
>  					mpp->prio_update = PRIO_UPDATE_NONE;
> +					mpp->checker_count = 0;
>  				}
>  				vector_foreach_slot(vecs->pathvec, pp, i)
>  					pp->is_checked = CHECK_PATH_UNCHECKED;
> @@ -3032,11 +3026,13 @@ checkerloop (void *ap)
>  							     start_time.tv_sec);
>  			if (checker_state == CHECKER_FINISHED) {
>  				vector_foreach_slot(vecs->mpvec, mpp, i) {
> -					if ((update_mpp_prio(mpp) ||
> -					     (mpp->need_reload && mpp->synced_count > 0)) &&

When you call reload_and_sync_map(), it will automatically resync the
map via setup_multipath() -> refresh_multipath() ->
update_multipath_strings().

This means that if for some reason multipathd and the kernel disagree
about a map, and reloading it doesn't fix the problem, you will
immediately set mpp->need_reload again. With the old mpp->synced_count
check, you only reload maps with need_reload() when a path is checked.
Without this check, or a (mpp->checker_count > 0) check to replace it,
you will keep reloading these maps every loop, roughly once a second. I
would rather not do this.

If you want to make sure to immediately handle a need_reload that wasn't
triggered by this call to reload_and_sync_map() which was because of an
earlier need_reload, we could make need_reload have three states, to
distinguish between a reload we want done immediately, and one we would
like to wait on because we just did a reload and it didn't fix the
problem. Then we could remember if need_reload was set before calling
reload_and_sync_map(), and if it was, and if it is still set after, we
could switch it to the delayed version.

Or perhaps I'm just being paranoid here. 

-Ben

> -					    reload_and_sync_map(mpp, vecs) == 2)
> +					sync_mpp(vecs, mpp, ticks);
> +					if ((update_mpp_prio(mpp) || mpp->need_reload) &&
> +					    reload_and_sync_map(mpp, vecs) == 2) {
>  						/* multipath device deleted */
>  						i--;
> +						continue;
> +					}
>  				}
>  			}
>  			lock_cleanup_pop(vecs->lock);
> -- 
> 2.47.0


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 04/14] multipathd: quickly re-sync if a map is inconsistent
  2024-12-11 22:58 ` [PATCH v2 04/14] multipathd: quickly re-sync if a map is inconsistent Martin Wilck
  2024-12-19 21:57   ` Benjamin Marzinski
@ 2024-12-19 23:05   ` Benjamin Marzinski
  1 sibling, 0 replies; 27+ messages in thread
From: Benjamin Marzinski @ 2024-12-19 23:05 UTC (permalink / raw)
  To: Martin Wilck; +Cc: Christophe Varoqui, dm-devel, Martin Wilck

On Wed, Dec 11, 2024 at 11:58:59PM +0100, Martin Wilck wrote:
> After reading the kernel device-mapper table, update_pathvec_from_dm()
> sets the mpp->need_reload flag if an inconsistent state was found (often a
> path with wrong WWID). We expect reload_and_sync_map() to fix this situation.
> However, schedule a quick resync in this case, to be double-check that the
> inconsistency has been fixed.

Like I mentioned in by second comment to patch 03, this sync already
happens as part of reload_and_sync_map().

> 
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  multipathd/main.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/multipathd/main.c b/multipathd/main.c
> index e4e6bf7..178618c 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -3026,13 +3026,22 @@ checkerloop (void *ap)
>  							     start_time.tv_sec);
>  			if (checker_state == CHECKER_FINISHED) {
>  				vector_foreach_slot(vecs->mpvec, mpp, i) {
> +					bool inconsistent;
> +
>  					sync_mpp(vecs, mpp, ticks);
> -					if ((update_mpp_prio(mpp) || mpp->need_reload) &&
> +					inconsistent = mpp->need_reload;
> +					if ((update_mpp_prio(mpp) || inconsistent) &&
>  					    reload_and_sync_map(mpp, vecs) == 2) {
>  						/* multipath device deleted */
>  						i--;
>  						continue;
>  					}
> +					/*
> +					 * If we reloaded due to inconsistent state,
> +					 * schedule another sync at the next tick.
> +					 */
> +					if (inconsistent)
> +						mpp->sync_tick = 1;
>  				}
>  			}
>  			lock_cleanup_pop(vecs->lock);
> -- 
> 2.47.0


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 03/14] multipathd: sync maps at end of checkerloop
  2024-12-19 23:04   ` Benjamin Marzinski
@ 2025-01-14 21:36     ` Martin Wilck
  2025-01-15 16:45       ` Benjamin Marzinski
  0 siblings, 1 reply; 27+ messages in thread
From: Martin Wilck @ 2025-01-14 21:36 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: Christophe Varoqui, dm-devel

On Thu, 2024-12-19 at 18:04 -0500, Benjamin Marzinski wrote:
> On Wed, Dec 11, 2024 at 11:58:58PM +0100, Martin Wilck wrote:
> > Rather than calling sync_mpp early in the checkerloop and tracking
> > map synchronization with synced_count, call sync_mpp() in the
> > CHECKER_FINISHED
> > state, if either at least one path of the map has been checked in
> > the current
> > iteration, or the sync tick has expired. This avoids potentially
> > deleting
> > paths from the pathvec through the do_sync_mpp() ->
> > update_multipath_strings()
> > -> sync_paths -> check_removed_paths() call chain while we're
> > iterating over
> > the pathvec. Also, the time gap between obtaining path states and
> > syncing
> > the state with the kernel is smaller this way.
> > 
> > Suggested-by: Benjamin Marzinski <bmarzins@redhat.com>
> > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > ---
> >  libmultipath/structs.h     |  2 +-
> >  libmultipath/structs_vec.c |  1 -
> >  multipathd/main.c          | 26 +++++++++++---------------
> >  3 files changed, 12 insertions(+), 17 deletions(-)
> > 
> > diff --git a/libmultipath/structs.h b/libmultipath/structs.h
> > index 6a30c59..9d22bdd 100644
> > --- a/libmultipath/structs.h
> > +++ b/libmultipath/structs.h
> > @@ -471,7 +471,7 @@ struct multipath {
> >  	int ghost_delay_tick;
> >  	int queue_mode;
> >  	unsigned int sync_tick;
> > -	int synced_count;
> > +	int checker_count;
> >  	enum prio_update_type prio_update;
> >  	uid_t uid;
> >  	gid_t gid;
> > diff --git a/libmultipath/structs_vec.c
> > b/libmultipath/structs_vec.c
> > index 7a4e3eb..6aa744d 100644
> > --- a/libmultipath/structs_vec.c
> > +++ b/libmultipath/structs_vec.c
> > @@ -530,7 +530,6 @@ update_multipath_table (struct multipath *mpp,
> > vector pathvec, int flags)
> >  	conf = get_multipath_config();
> >  	mpp->sync_tick = conf->max_checkint;
> >  	put_multipath_config(conf);
> > -	mpp->synced_count++;
> >  
> >  	r = libmp_mapinfo(DM_MAP_BY_NAME | MAPINFO_MPATH_ONLY,
> >  			  (mapid_t) { .str = mpp->alias },
> > diff --git a/multipathd/main.c b/multipathd/main.c
> > index 4a28fbb..e4e6bf7 100644
> > --- a/multipathd/main.c
> > +++ b/multipathd/main.c
> > @@ -2470,7 +2470,7 @@ sync_mpp(struct vectors * vecs, struct
> > multipath *mpp, unsigned int ticks)
> >  	if (mpp->sync_tick)
> >  		mpp->sync_tick -= (mpp->sync_tick > ticks) ? ticks
> > :
> >  				  mpp->sync_tick;
> > -	if (mpp->sync_tick)
> > +	if (mpp->sync_tick && !mpp->checker_count)
> >  		return;
> >  
> >  	do_sync_mpp(vecs, mpp);
> > @@ -2513,12 +2513,6 @@ update_path_state (struct vectors * vecs,
> > struct path * pp)
> >  		return handle_path_wwid_change(pp, vecs)?
> > CHECK_PATH_REMOVED :
> >  							 
> > CHECK_PATH_SKIPPED;
> >  	}
> > -	if (pp->mpp->synced_count == 0) {
> > -		do_sync_mpp(vecs, pp->mpp);
> > -		/* if update_multipath_strings orphaned the path,
> > quit early */
> > -		if (!pp->mpp)
> > -			return CHECK_PATH_SKIPPED;
> > -	}
> >  	if ((newstate != PATH_UP && newstate != PATH_GHOST &&
> >  	     newstate != PATH_PENDING) && (pp->state ==
> > PATH_DELAYED)) {
> >  		/* If path state become failed again cancel path
> > delay state */
> > @@ -2918,9 +2912,11 @@ check_paths(struct vectors *vecs, unsigned
> > int ticks)
> >  	vector_foreach_slot(vecs->pathvec, pp, i) {
> >  		if (pp->is_checked != CHECK_PATH_UNCHECKED)
> >  			continue;
> > -		if (pp->mpp)
> > +		if (pp->mpp) {
> >  			pp->is_checked = check_path(pp, ticks);
> > -		else
> > +			if (pp->is_checked == CHECK_PATH_STARTED)
> > +				pp->mpp->checker_count++;
> > +		} else
> >  			pp->is_checked =
> > check_uninitialized_path(pp, ticks);
> >  		if (pp->is_checked == CHECK_PATH_STARTED &&
> >  		    checker_need_wait(&pp->checker))
> > @@ -3014,12 +3010,10 @@ checkerloop (void *ap)
> >  			pthread_cleanup_push(cleanup_lock, &vecs-
> > >lock);
> >  			lock(&vecs->lock);
> >  			pthread_testcancel();
> > -			vector_foreach_slot(vecs->mpvec, mpp, i)
> > -				mpp->synced_count = 0;
> >  			if (checker_state == CHECKER_STARTING) {
> >  				vector_foreach_slot(vecs->mpvec,
> > mpp, i) {
> > -					sync_mpp(vecs, mpp,
> > ticks);
> >  					mpp->prio_update =
> > PRIO_UPDATE_NONE;
> > +					mpp->checker_count = 0;
> >  				}
> >  				vector_foreach_slot(vecs->pathvec,
> > pp, i)
> >  					pp->is_checked =
> > CHECK_PATH_UNCHECKED;
> > @@ -3032,11 +3026,13 @@ checkerloop (void *ap)
> >  							    
> > start_time.tv_sec);
> >  			if (checker_state == CHECKER_FINISHED) {
> >  				vector_foreach_slot(vecs->mpvec,
> > mpp, i) {
> > -					if ((update_mpp_prio(mpp)
> > ||
> > -					     (mpp->need_reload &&
> > mpp->synced_count > 0)) &&
> 
> When you call reload_and_sync_map(), it will automatically resync the
> map via setup_multipath() -> refresh_multipath() ->
> update_multipath_strings().
> 
> This means that if for some reason multipathd and the kernel disagree
> about a map, and reloading it doesn't fix the problem, you will
> immediately set mpp->need_reload again. With the old mpp-
> >synced_count
> check, you only reload maps with need_reload() when a path is
> checked.
> Without this check, or a (mpp->checker_count > 0) check to replace
> it,
> you will keep reloading these maps every loop, roughly once a second.
> I
> would rather not do this.
> 
> If you want to make sure to immediately handle a need_reload that
> wasn't
> triggered by this call to reload_and_sync_map() which was because of
> an
> earlier need_reload, we could make need_reload have three states, to
> distinguish between a reload we want done immediately, and one we
> would
> like to wait on because we just did a reload and it didn't fix the
> problem. Then we could remember if need_reload was set before calling
> reload_and_sync_map(), and if it was, and if it is still set after,
> we
> could switch it to the delayed version.
> 
> Or perhaps I'm just being paranoid here. 

As you probably know and as I recently verified, reloading the kernel
from the checker loop will hardly ever fail except with -ENOMEM [1]. We
can pass non-existing or failed devices to the kernel, it will happily
accept them.

update_pathvec_from_dm() never adds any devices to the map, it just
removes some. If need_reload is set, it means that it has removed
either a path or an entire pathgroup. When the map is reloaded, it will
only reference (a subset of) the devices that were already mapped. I
see no way how this could fail unless either multipathd or the kernel
are really badly malfunctioning, in which case we don't need to bother
about reloading too frequently.

But *if* the reload succeeds, the set of devices in the kernel is
guaranteed to be identical to the table that we've just used for
reloading. So only way that another difference between kernel and
multipath state could occur between the reload and
update_pathvec_from_dm() running again is that another device has just
diappeared from the system, in which case a quick reload would be a
reasonable action. (Well I guess another possibility would be a 3rd
party maliciously adding wrong path devices to the maps we maintain,
but that's not something we can do much about).

If need_reload is indeed set again in this situation, I would indeed
prefer to double-check this map quickly. As argued above, I strongly
believe that such a situation will not persist. IMO a detected
inconsistency between the kernel and multipathd is a very bad thing
that we should try to fix rather sooner than later. It's at least as
bad as a failed path, which we'll check every second, too.

Bottom line: I think re-checking this quickly is actually the right
thing to do. Would you accept this if I add a warning in the
"inconsistent" case, so that in the event that we actually run into a
persistent discrepancy situation, we will notice?

Regards,
Martin

[1]
https://lore.kernel.org/dm-devel/ee6fcbda31fd1f13969653782417fbed748f5bc7.camel@suse.com/

> -Ben
> 
> > -					   
> > reload_and_sync_map(mpp, vecs) == 2)
> > +					sync_mpp(vecs, mpp,
> > ticks);
> > +					if ((update_mpp_prio(mpp)
> > || mpp->need_reload) &&
> > +					   
> > reload_and_sync_map(mpp, vecs) == 2) {
> >  						/* multipath
> > device deleted */
> >  						i--;
> > +						continue;
> > +					}
> >  				}
> >  			}
> >  			lock_cleanup_pop(vecs->lock);
> > -- 
> > 2.47.0
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 04/14] multipathd: quickly re-sync if a map is inconsistent
  2024-12-19 21:57   ` Benjamin Marzinski
@ 2025-01-14 21:37     ` Martin Wilck
  2025-01-15 18:48       ` Benjamin Marzinski
  0 siblings, 1 reply; 27+ messages in thread
From: Martin Wilck @ 2025-01-14 21:37 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: Christophe Varoqui, dm-devel

On Thu, 2024-12-19 at 16:57 -0500, Benjamin Marzinski wrote:
> On Wed, Dec 11, 2024 at 11:58:59PM +0100, Martin Wilck wrote:
> > After reading the kernel device-mapper table,
> > update_pathvec_from_dm()
> > sets the mpp->need_reload flag if an inconsistent state was found
> > (often a
> > path with wrong WWID). We expect reload_and_sync_map() to fix this
> > situation.
> > However, schedule a quick resync in this case, to be double-check
> > that the
> > inconsistency has been fixed.
> 
> I'm not too sure about this. My biggest worry with handling
> mpp->need_reload in the checkerloop is what happens if for some
> reason
> multipathd and the kernel keep disagreeing on something. You would
> just
> keep reloading the device. That seems unlikely, so I've o.k. with
> handling it here, but if that does happen, this would make it much
> worse.  Instead of reloading every path check, you would reload every
> loop.
> 
> If you do detect an inconsistent state, and trigger a reload, and the
> state is still inconsistent after that, I would argue that yet
> another
> reload is more likely to remain inconsistent than it is to fix the
> problem. So I would rather not speed it up.
> 

Please see my reply to 03/14.

Martin


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 08/14] multipathd: don't call reload_and_sync_map() from deferred_failback_tick()
  2024-12-19 22:04   ` Benjamin Marzinski
@ 2025-01-14 21:40     ` Martin Wilck
  0 siblings, 0 replies; 27+ messages in thread
From: Martin Wilck @ 2025-01-14 21:40 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: Christophe Varoqui, dm-devel

On Thu, 2024-12-19 at 17:04 -0500, Benjamin Marzinski wrote:
> On Wed, Dec 11, 2024 at 11:59:03PM +0100, Martin Wilck wrote:
> > Instead, move the call inside the existing loop over vecs->mpvec
> > and call
> > reload_and_sync_map() from checker_finished(). Note that we can't
> > simply
> > put the call deferred_failback_tick() in the "if" condition,
> > because we
> > need to adjust the ticks even if update_mpp_prio() returns true.
> > For
> > consistency, use separate bool variables for each condition that
> > would
> > necessitate a reload.
> > 
> > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > ---
> >  multipathd/main.c | 39 ++++++++++++++-------------------------
> >  1 file changed, 14 insertions(+), 25 deletions(-)
> > 
> > diff --git a/multipathd/main.c b/multipathd/main.c
> > index b045f8b..b1f0f81 100644
> > --- a/multipathd/main.c
> > +++ b/multipathd/main.c
> > @@ -2076,32 +2076,20 @@ ghost_delay_tick(struct vectors *vecs)
> >  	}
> >  }
> >  
> > -static void
> > -deferred_failback_tick (struct vectors *vecs)
> > +static bool deferred_failback_tick(struct multipath *mpp)
> >  {
> > -	struct multipath * mpp;
> > -	int i;
> >  	bool need_reload;
> >  
> > -	vector_foreach_slot (vecs->mpvec, mpp, i) {
> > -		/*
> > -		 * deferred failback getting sooner
> > -		 */
> > -		if (mpp->pgfailback > 0 && mpp->failback_tick > 0)
> > {
> > -			mpp->failback_tick--;
> > +	if (mpp->pgfailback <= 0 || mpp->failback_tick <= 0)
> > +		return false;
> >  
> > -			if (!mpp->failback_tick &&
> > -			    need_switch_pathgroup(mpp,
> > &need_reload)) {
> > -				if (need_reload) {
> > -					if
> > (reload_and_sync_map(mpp, vecs) == 2) {
> > -						/* multipath
> > device removed */
> > -						i--;
> > -					}
> > -				} else
> > -					switch_pathgroup(mpp);
> > -			}
> > -		}
> > -	}
> > +	mpp->failback_tick--;
> > +	if (!mpp->failback_tick &&
> > +	    need_switch_pathgroup(mpp, &need_reload) &&
> > +	    need_reload)
> > +		return true;
> > +	else
> 
> If (!mpp->failback_tick && need_switch_pathgroup(mpp, &need_reload)
> is
> true by need_reload isn't, don't we still need to call
> switch_pathgroup(mpp);

Ups. Good call, thanks!

Martin


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 03/14] multipathd: sync maps at end of checkerloop
  2025-01-14 21:36     ` Martin Wilck
@ 2025-01-15 16:45       ` Benjamin Marzinski
  0 siblings, 0 replies; 27+ messages in thread
From: Benjamin Marzinski @ 2025-01-15 16:45 UTC (permalink / raw)
  To: Martin Wilck; +Cc: Christophe Varoqui, dm-devel

On Tue, Jan 14, 2025 at 10:36:06PM +0100, Martin Wilck wrote:
> On Thu, 2024-12-19 at 18:04 -0500, Benjamin Marzinski wrote:
> > On Wed, Dec 11, 2024 at 11:58:58PM +0100, Martin Wilck wrote:
> > > @@ -3032,11 +3026,13 @@ checkerloop (void *ap)
> > >  							    
> > > start_time.tv_sec);
> > >  			if (checker_state == CHECKER_FINISHED) {
> > >  				vector_foreach_slot(vecs->mpvec,
> > > mpp, i) {
> > > -					if ((update_mpp_prio(mpp)
> > > ||
> > > -					     (mpp->need_reload &&
> > > mpp->synced_count > 0)) &&
> > 
> > When you call reload_and_sync_map(), it will automatically resync the
> > map via setup_multipath() -> refresh_multipath() ->
> > update_multipath_strings().
> > 
> > This means that if for some reason multipathd and the kernel disagree
> > about a map, and reloading it doesn't fix the problem, you will
> > immediately set mpp->need_reload again. With the old mpp-
> > >synced_count
> > check, you only reload maps with need_reload() when a path is
> > checked.
> > Without this check, or a (mpp->checker_count > 0) check to replace
> > it,
> > you will keep reloading these maps every loop, roughly once a second.
> > I
> > would rather not do this.
> > 
> > If you want to make sure to immediately handle a need_reload that
> > wasn't
> > triggered by this call to reload_and_sync_map() which was because of
> > an
> > earlier need_reload, we could make need_reload have three states, to
> > distinguish between a reload we want done immediately, and one we
> > would
> > like to wait on because we just did a reload and it didn't fix the
> > problem. Then we could remember if need_reload was set before calling
> > reload_and_sync_map(), and if it was, and if it is still set after,
> > we
> > could switch it to the delayed version.
> > 
> > Or perhaps I'm just being paranoid here. 
> 
> As you probably know and as I recently verified, reloading the kernel
> from the checker loop will hardly ever fail except with -ENOMEM [1]. We
> can pass non-existing or failed devices to the kernel, it will happily
> accept them.
> 
> update_pathvec_from_dm() never adds any devices to the map, it just
> removes some. If need_reload is set, it means that it has removed
> either a path or an entire pathgroup. When the map is reloaded, it will
> only reference (a subset of) the devices that were already mapped. I
> see no way how this could fail unless either multipathd or the kernel
> are really badly malfunctioning, in which case we don't need to bother
> about reloading too frequently.
> 
> But *if* the reload succeeds, the set of devices in the kernel is
> guaranteed to be identical to the table that we've just used for
> reloading. So only way that another difference between kernel and
> multipath state could occur between the reload and
> update_pathvec_from_dm() running again is that another device has just
> diappeared from the system, in which case a quick reload would be a
> reasonable action. (Well I guess another possibility would be a 3rd
> party maliciously adding wrong path devices to the maps we maintain,
> but that's not something we can do much about).
> 
> If need_reload is indeed set again in this situation, I would indeed
> prefer to double-check this map quickly. As argued above, I strongly
> believe that such a situation will not persist. IMO a detected
> inconsistency between the kernel and multipathd is a very bad thing
> that we should try to fix rather sooner than later. It's at least as
> bad as a failed path, which we'll check every second, too.
> 
> Bottom line: I think re-checking this quickly is actually the right
> thing to do. Would you accept this if I add a warning in the
> "inconsistent" case, so that in the event that we actually run into a
> persistent discrepancy situation, we will notice?
> 

Sure. This is probably just my paranoia here, since I can't actually
come up with a concrete case where there would be a persistent
discrepancy. If it ever happens, it's almost definitely a bug in the
multipath code.

-Ben

> Regards,
> Martin
> 
> [1]
> https://lore.kernel.org/dm-devel/ee6fcbda31fd1f13969653782417fbed748f5bc7.camel@suse.com/
> 
> > -Ben
> > 
> > > -					   
> > > reload_and_sync_map(mpp, vecs) == 2)
> > > +					sync_mpp(vecs, mpp,
> > > ticks);
> > > +					if ((update_mpp_prio(mpp)
> > > || mpp->need_reload) &&
> > > +					   
> > > reload_and_sync_map(mpp, vecs) == 2) {
> > >  						/* multipath
> > > device deleted */
> > >  						i--;
> > > +						continue;
> > > +					}
> > >  				}
> > >  			}
> > >  			lock_cleanup_pop(vecs->lock);
> > > -- 
> > > 2.47.0
> > 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 04/14] multipathd: quickly re-sync if a map is inconsistent
  2025-01-14 21:37     ` Martin Wilck
@ 2025-01-15 18:48       ` Benjamin Marzinski
  0 siblings, 0 replies; 27+ messages in thread
From: Benjamin Marzinski @ 2025-01-15 18:48 UTC (permalink / raw)
  To: Martin Wilck; +Cc: Christophe Varoqui, dm-devel

On Tue, Jan 14, 2025 at 10:37:24PM +0100, Martin Wilck wrote:
> On Thu, 2024-12-19 at 16:57 -0500, Benjamin Marzinski wrote:
> > On Wed, Dec 11, 2024 at 11:58:59PM +0100, Martin Wilck wrote:
> > > After reading the kernel device-mapper table,
> > > update_pathvec_from_dm()
> > > sets the mpp->need_reload flag if an inconsistent state was found
> > > (often a
> > > path with wrong WWID). We expect reload_and_sync_map() to fix this
> > > situation.
> > > However, schedule a quick resync in this case, to be double-check
> > > that the
> > > inconsistency has been fixed.
> > 
> > I'm not too sure about this. My biggest worry with handling
> > mpp->need_reload in the checkerloop is what happens if for some
> > reason
> > multipathd and the kernel keep disagreeing on something. You would
> > just
> > keep reloading the device. That seems unlikely, so I've o.k. with
> > handling it here, but if that does happen, this would make it much
> > worse.  Instead of reloading every path check, you would reload every
> > loop.
> > 
> > If you do detect an inconsistent state, and trigger a reload, and the
> > state is still inconsistent after that, I would argue that yet
> > another
> > reload is more likely to remain inconsistent than it is to fix the
> > problem. So I would rather not speed it up.
> > 
> 
> Please see my reply to 03/14.

Fine. Since I can see situations where a cascade of device changes would
make an inconsistency appear immediately after a reload and I can't
actually come up with a case (excluding bugs and ENOMEM) where nothing
changed, and we reloaded to fix and inconsistency, but it still isn't
fixed, we should probably handle the case that we know can actually
happen. Warning on inconsistent states should be good enough, and I
think we already do that in update_pathvec_from_dm().

-Ben

> 
> Martin


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 03/14] multipathd: sync maps at end of checkerloop
  2024-12-19 21:50   ` Benjamin Marzinski
@ 2025-01-17 19:04     ` Martin Wilck
  0 siblings, 0 replies; 27+ messages in thread
From: Martin Wilck @ 2025-01-17 19:04 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: Christophe Varoqui, dm-devel

On Thu, 2024-12-19 at 16:50 -0500, Benjamin Marzinski wrote:
> On Wed, Dec 11, 2024 at 11:58:58PM +0100, Martin Wilck wrote:
> > Rather than calling sync_mpp early in the checkerloop and tracking
> > map synchronization with synced_count, call sync_mpp() in the
> > CHECKER_FINISHED
> > state, if either at least one path of the map has been checked in
> > the current
> > iteration, or the sync tick has expired. This avoids potentially
> > deleting
> > paths from the pathvec through the do_sync_mpp() ->
> > update_multipath_strings()
> > -> sync_paths -> check_removed_paths() call chain while we're
> > iterating over
> > the pathvec. Also, the time gap between obtaining path states and
> > syncing
> > the state with the kernel is smaller this way.
> > 
> 
> Sorry for the delayed review. I've been busy lately and there was a
> lot
> to look at with moving the syncs from before we check the paths till
> the
> end. Turns out I'm mostly o.k. with this. Syncing right before we
> checked the paths still left a small window where things could change
> in
> the kernel state, and we wouldn't see them before we checked the
> paths.
> It's just a larger window now.
> 
> The one place where we won't just pick up the change on the next
> checker
> loop is enable_group(). If the kernel disables a pathgroup,
> multipathd
> would re-enable it if a path in it switched states to PATH_UP. I'm
> not
> totally sure how necessary this is. The kenel has code to re-enable
> the
> pathgroups. But it's been there a while, and perhaps it does avoid an
> issue.
> 
> The kernel doesn't even necessarily send an event if it disables a
> pathgroup (since the pathgroup isn't actually disabled, it's just
> bypassed, which means all the other pathgroups are checked first). 
> So
> there is a real chance that since the last path check, the pgp-
> >status
> could have changed. By not checking before we check the path, we
> could
> not realize that tht pgp is in PGSTATE_DISABLED when a path switches
> states to PATH_UP.  It shouldn't be that hard to check in sync_mpp()
> if
> there are any disabled path_groups the include a path where
> pp->is_checked == CHECK_PATH_NEW_UP. So we could move the
> enable_group()
> code to checker_finished() as well and fix this.

Right. I'll add a patch on top of the series. Thanks for spotting this.

Martin


^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2025-01-17 19:04 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-11 22:58 [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Martin Wilck
2024-12-11 22:58 ` [PATCH v2 01/14] multipathd: don't reload map in update_mpp_prio() Martin Wilck
2024-12-11 22:58 ` [PATCH v2 02/14] multipathd: remove dm_get_info() call from refresh_multipath() Martin Wilck
2024-12-11 22:58 ` [PATCH v2 03/14] multipathd: sync maps at end of checkerloop Martin Wilck
2024-12-19 21:50   ` Benjamin Marzinski
2025-01-17 19:04     ` Martin Wilck
2024-12-19 23:04   ` Benjamin Marzinski
2025-01-14 21:36     ` Martin Wilck
2025-01-15 16:45       ` Benjamin Marzinski
2024-12-11 22:58 ` [PATCH v2 04/14] multipathd: quickly re-sync if a map is inconsistent Martin Wilck
2024-12-19 21:57   ` Benjamin Marzinski
2025-01-14 21:37     ` Martin Wilck
2025-01-15 18:48       ` Benjamin Marzinski
2024-12-19 23:05   ` Benjamin Marzinski
2024-12-11 22:59 ` [PATCH v2 05/14] multipathd: move yielding for waiters to start of checkerloop Martin Wilck
2024-12-11 22:59 ` [PATCH v2 06/14] multipathd: add checker_finished() Martin Wilck
2024-12-11 22:59 ` [PATCH v2 07/14] multipathd: move "tick" calls into checker_finished() Martin Wilck
2024-12-11 22:59 ` [PATCH v2 08/14] multipathd: don't call reload_and_sync_map() from deferred_failback_tick() Martin Wilck
2024-12-19 22:04   ` Benjamin Marzinski
2025-01-14 21:40     ` Martin Wilck
2024-12-11 22:59 ` [PATCH v2 09/14] multipathd: move retry_count_tick() into existing mpvec loop Martin Wilck
2024-12-11 22:59 ` [PATCH v2 10/14] multipathd: don't call update_map() from missing_uev_wait_tick() Martin Wilck
2024-12-11 22:59 ` [PATCH v2 11/14] multipathd: don't call udpate_map() from ghost_delay_tick() Martin Wilck
2024-12-11 22:59 ` [PATCH v2 12/14] multipathd: only call reload_and_sync_map() when ghost delay expires Martin Wilck
2024-12-11 22:59 ` [PATCH v2 13/14] multipathd: remove non-existent maps in checkerloop Martin Wilck
2024-12-11 22:59 ` [PATCH v2 14/14] multipathd: remove mpvec_garbage_collector() Martin Wilck
2024-12-19 22:33 ` [PATCH v2 00/14] multipathd: More map reload handling, and checkerloop work Benjamin Marzinski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.