From: Benjamin Marzinski <bmarzins@redhat.com>
To: Martin Wilck <martin.wilck@suse.com>
Cc: Christophe Varoqui <christophe.varoqui@opensvc.com>,
dm-devel@lists.linux.dev, Martin Wilck <mwilck@suse.com>
Subject: Re: [PATCH v2 03/14] multipathd: sync maps at end of checkerloop
Date: Thu, 19 Dec 2024 16:50:23 -0500
Message-ID: <Z2SVH8f7z72SV6n5@redhat.com>
In-Reply-To: <20241211225909.298770-4-mwilck@suse.com>
On Wed, Dec 11, 2024 at 11:58:58PM +0100, Martin Wilck wrote:
> Rather than calling sync_mpp early in the checkerloop and tracking
> map synchronization with synced_count, call sync_mpp() in the CHECKER_FINISHED
> state, if either at least one path of the map has been checked in the current
> iteration, or the sync tick has expired. This avoids potentially deleting
> paths from the pathvec through the do_sync_mpp() -> update_multipath_strings()
> -> sync_paths -> check_removed_paths() call chain while we're iterating over
> the pathvec. Also, the time gap between obtaining path states and syncing
> the state with the kernel is smaller this way.
>
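For reference, the sync decision described in the quoted commit message can be sketched roughly as below. This is only a minimal sketch: `struct map_sketch` and `map_needs_sync()` are made-up names standing in for the two `struct multipath` fields and the early-return test that the patch touches in sync_mpp(), and nothing here calls into libmultipath.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two struct multipath fields used here. */
struct map_sketch {
	unsigned int sync_tick;  /* ticks left until a forced sync */
	int checker_count;       /* paths of this map checked this iteration */
};

/*
 * Age the sync tick, then decide whether the map should be synced.
 * Assumed to mirror the patched test in sync_mpp(): skip the sync only
 * if the tick is still running AND no path of the map was checked.
 */
static bool map_needs_sync(struct map_sketch *m, unsigned int ticks)
{
	if (m->sync_tick)
		m->sync_tick -= (m->sync_tick > ticks) ? ticks : m->sync_tick;
	return !(m->sync_tick && !m->checker_count);
}
```

With this shape, a map whose paths were all skipped in the current iteration is still synced once its tick runs down to zero.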
Sorry for the delayed review. I've been busy lately and there was a lot
to look at with moving the syncs from before we check the paths till the
end. Turns out I'm mostly o.k. with this. Syncing right before we
checked the paths still left a small window where things could change in
the kernel state, and we wouldn't see them before we checked the paths.
It's just a larger window now.

The one place where we won't just pick up the change on the next checker
loop is enable_group(). If the kernel disables a pathgroup, multipathd
would re-enable it if a path in it switched states to PATH_UP. I'm not
totally sure how necessary this is. The kernel has code to re-enable the
pathgroups. But it's been there a while, and perhaps it does avoid an
issue.

The kernel doesn't even necessarily send an event if it disables a
pathgroup (since the pathgroup isn't actually disabled, it's just
bypassed, which means all the other pathgroups are checked first). So
there is a real chance that since the last path check, the pgp->status
could have changed. By not syncing before we check the path, we could
fail to realize that the pgp is in PGSTATE_DISABLED when a path switches
states to PATH_UP. It shouldn't be that hard to check in sync_mpp() if
there are any disabled pathgroups that include a path where
pp->is_checked == CHECK_PATH_NEW_UP. So we could move the enable_group()
code to checker_finished() as well and fix this.
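
Something along these lines is what I have in mind for the check. Treat it as a rough sketch, not actual multipathd code: the `sk_` types, constants and helpers are simplified stand-ins I made up for illustration (plain arrays instead of libmultipath vectors, placeholder enum values), and the real version would call enable_group() where this one only counts the matches.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in constants; the numeric values are placeholders, not the real ones. */
enum { SK_PGSTATE_ENABLED, SK_PGSTATE_DISABLED };
enum { SK_CHECK_PATH_UNCHECKED, SK_CHECK_PATH_NEW_UP };

/* Hypothetical, trimmed-down versions of struct path and struct pathgroup. */
struct sk_path {
	int is_checked;
};

struct sk_pathgroup {
	int status;
	struct sk_path **paths;
	size_t npaths;
};

/* True if the group is disabled but holds a path that just came up. */
static bool sk_pg_needs_enable(const struct sk_pathgroup *pgp)
{
	if (pgp->status != SK_PGSTATE_DISABLED)
		return false;
	for (size_t i = 0; i < pgp->npaths; i++)
		if (pgp->paths[i]->is_checked == SK_CHECK_PATH_NEW_UP)
			return true;
	return false;
}

/* Count the groups a caller would re-enable after syncing kernel state. */
static size_t sk_count_groups_to_enable(const struct sk_pathgroup *pgs,
					size_t n)
{
	size_t hits = 0;

	for (size_t i = 0; i < n; i++)
		if (sk_pg_needs_enable(&pgs[i]))
			hits++;
	return hits;
}
```

The point of running this after the sync is that pgp->status then reflects the current kernel state, so a group that was bypassed since the last check is seen as disabled at the moment we know which paths came up.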
-Ben
> Suggested-by: Benjamin Marzinski <bmarzins@redhat.com>
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
> libmultipath/structs.h | 2 +-
> libmultipath/structs_vec.c | 1 -
> multipathd/main.c | 26 +++++++++++---------------
> 3 files changed, 12 insertions(+), 17 deletions(-)
>
> diff --git a/libmultipath/structs.h b/libmultipath/structs.h
> index 6a30c59..9d22bdd 100644
> --- a/libmultipath/structs.h
> +++ b/libmultipath/structs.h
> @@ -471,7 +471,7 @@ struct multipath {
> int ghost_delay_tick;
> int queue_mode;
> unsigned int sync_tick;
> - int synced_count;
> + int checker_count;
> enum prio_update_type prio_update;
> uid_t uid;
> gid_t gid;
> diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
> index 7a4e3eb..6aa744d 100644
> --- a/libmultipath/structs_vec.c
> +++ b/libmultipath/structs_vec.c
> @@ -530,7 +530,6 @@ update_multipath_table (struct multipath *mpp, vector pathvec, int flags)
> conf = get_multipath_config();
> mpp->sync_tick = conf->max_checkint;
> put_multipath_config(conf);
> - mpp->synced_count++;
>
> r = libmp_mapinfo(DM_MAP_BY_NAME | MAPINFO_MPATH_ONLY,
> (mapid_t) { .str = mpp->alias },
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 4a28fbb..e4e6bf7 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -2470,7 +2470,7 @@ sync_mpp(struct vectors * vecs, struct multipath *mpp, unsigned int ticks)
> if (mpp->sync_tick)
> mpp->sync_tick -= (mpp->sync_tick > ticks) ? ticks :
> mpp->sync_tick;
> - if (mpp->sync_tick)
> + if (mpp->sync_tick && !mpp->checker_count)
> return;
>
> do_sync_mpp(vecs, mpp);
> @@ -2513,12 +2513,6 @@ update_path_state (struct vectors * vecs, struct path * pp)
> return handle_path_wwid_change(pp, vecs)? CHECK_PATH_REMOVED :
> CHECK_PATH_SKIPPED;
> }
> - if (pp->mpp->synced_count == 0) {
> - do_sync_mpp(vecs, pp->mpp);
> - /* if update_multipath_strings orphaned the path, quit early */
> - if (!pp->mpp)
> - return CHECK_PATH_SKIPPED;
> - }
> if ((newstate != PATH_UP && newstate != PATH_GHOST &&
> newstate != PATH_PENDING) && (pp->state == PATH_DELAYED)) {
> /* If path state become failed again cancel path delay state */
> @@ -2918,9 +2912,11 @@ check_paths(struct vectors *vecs, unsigned int ticks)
> vector_foreach_slot(vecs->pathvec, pp, i) {
> if (pp->is_checked != CHECK_PATH_UNCHECKED)
> continue;
> - if (pp->mpp)
> + if (pp->mpp) {
> pp->is_checked = check_path(pp, ticks);
> - else
> + if (pp->is_checked == CHECK_PATH_STARTED)
> + pp->mpp->checker_count++;
> + } else
> pp->is_checked = check_uninitialized_path(pp, ticks);
> if (pp->is_checked == CHECK_PATH_STARTED &&
> checker_need_wait(&pp->checker))
> @@ -3014,12 +3010,10 @@ checkerloop (void *ap)
> pthread_cleanup_push(cleanup_lock, &vecs->lock);
> lock(&vecs->lock);
> pthread_testcancel();
> - vector_foreach_slot(vecs->mpvec, mpp, i)
> - mpp->synced_count = 0;
> if (checker_state == CHECKER_STARTING) {
> vector_foreach_slot(vecs->mpvec, mpp, i) {
> - sync_mpp(vecs, mpp, ticks);
> mpp->prio_update = PRIO_UPDATE_NONE;
> + mpp->checker_count = 0;
> }
> vector_foreach_slot(vecs->pathvec, pp, i)
> pp->is_checked = CHECK_PATH_UNCHECKED;
> @@ -3032,11 +3026,13 @@ checkerloop (void *ap)
> start_time.tv_sec);
> if (checker_state == CHECKER_FINISHED) {
> vector_foreach_slot(vecs->mpvec, mpp, i) {
> - if ((update_mpp_prio(mpp) ||
> - (mpp->need_reload && mpp->synced_count > 0)) &&
> - reload_and_sync_map(mpp, vecs) == 2)
> + sync_mpp(vecs, mpp, ticks);
> + if ((update_mpp_prio(mpp) || mpp->need_reload) &&
> + reload_and_sync_map(mpp, vecs) == 2) {
> /* multipath device deleted */
> i--;
> + continue;
> + }
> }
> }
> lock_cleanup_pop(vecs->lock);
> --
> 2.47.0