From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9615119E98C for ; Thu, 19 Dec 2024 23:04:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734649477; cv=none; b=dFu1enkOQwkc/xEH/I1tZw27sInRaAbQEepFGDoidzJX3sSrc4MN0EIVr0CIBP+O2t1WgSeOE+2W7k6OL6Q0iVQROQW4LtPP4CNpLm0cIHHnZ/TggN8V4ER87hcbxABCEEeg2/UXfpMCQzdT0UkXByv7bvPiBMhXnJ5AMgidBPE= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734649477; c=relaxed/simple; bh=/ufWHCZZ0yW7Xmzo22l7yVs7UiQtEqPVUnWEUw0K22k=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: In-Reply-To:Content-Type:Content-Disposition; b=Gsc3dM/lLss7ye0BfG6oW7HQK3rSh96b6fhbDdeof5uBhI7eIZF50xrl+IVtj20RVoise40VTpy4hVArs7SPPISlBzsM0VM1+vvLhMDWxyv8oFat4xhZ4x/0NqWWf45AdszGZ7sK8Tbg/xZGc6UBxeo2hbkm4idI1rQ3zNZ7KVo= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=dCGQMfm0; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="dCGQMfm0" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1734649474; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=PeNHVf623IzUbuDF2yPzCCsKLKtZE8qvOX/oLmYv62U=; b=dCGQMfm07vouMR3N+TBrvemxeeuhr6ZnmUomKMLSaV1TNVtejys6UEpKb5yB3Fe41HrdUc 8CkY4Ld30FMwm/mjjG5d3B4f+oifRd+BgemGNuvi087tC6yNL395/b1Tu+iZPyhdsIiK1E QtH2lG5LAT0WmDwI3ShavJ/kK7Xb28A= Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-262-vemUiRy1PCifFp0GMyuHLQ-1; Thu, 19 Dec 2024 18:04:31 -0500 X-MC-Unique: vemUiRy1PCifFp0GMyuHLQ-1 X-Mimecast-MFC-AGG-ID: vemUiRy1PCifFp0GMyuHLQ Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id AB2CE19560AA; Thu, 19 Dec 2024 23:04:28 +0000 (UTC) Received: from bmarzins-01.fast.eng.rdu2.dc.redhat.com (bmarzins-01.fast.eng.rdu2.dc.redhat.com [10.6.23.12]) by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 1E98619560AD; Thu, 19 Dec 2024 23:04:27 +0000 (UTC) Received: from bmarzins-01.fast.eng.rdu2.dc.redhat.com (localhost [127.0.0.1]) by bmarzins-01.fast.eng.rdu2.dc.redhat.com (8.17.2/8.17.1) with ESMTPS id 4BJN4QJc1754892 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Thu, 19 Dec 2024 18:04:26 -0500 Received: (from bmarzins@localhost) by bmarzins-01.fast.eng.rdu2.dc.redhat.com (8.17.2/8.17.2/Submit) id 4BJN4Qax1754891; Thu, 19 Dec 2024 18:04:26 -0500 Date: Thu, 19 Dec 2024 18:04:26 -0500 From: Benjamin Marzinski To: Martin Wilck Cc: Christophe Varoqui , dm-devel@lists.linux.dev, Martin Wilck Subject: Re: [PATCH v2 03/14] multipathd: sync maps at end of checkerloop Message-ID: References: <20241211225909.298770-1-mwilck@suse.com> <20241211225909.298770-4-mwilck@suse.com> Precedence: bulk X-Mailing-List: dm-devel@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 In-Reply-To: <20241211225909.298770-4-mwilck@suse.com> X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: vhMRq77FfAsdp04NB-Ib98itdpDO1Cewr90YLme2twc_1734649469 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Wed, Dec 11, 2024 at 11:58:58PM +0100, Martin Wilck wrote: > Rather than calling sync_mpp early in the checkerloop and tracking > map synchronization with synced_count, call sync_mpp() in the CHECKER_FINISHED > state, if either at least one path of the map has been checked in the current > iteration, or the sync tick has expired. This avoids potentially deleting > paths from the pathvec through the do_sync_mpp() -> update_multipath_strings() > -> sync_paths -> check_removed_paths() call chain while we're iterating over > the pathvec. Also, the time gap between obtaining path states and syncing > the state with the kernel is smaller this way. > > Suggested-by: Benjamin Marzinski > Signed-off-by: Martin Wilck > --- > libmultipath/structs.h | 2 +- > libmultipath/structs_vec.c | 1 - > multipathd/main.c | 26 +++++++++++--------------- > 3 files changed, 12 insertions(+), 17 deletions(-) > > diff --git a/libmultipath/structs.h b/libmultipath/structs.h > index 6a30c59..9d22bdd 100644 > --- a/libmultipath/structs.h > +++ b/libmultipath/structs.h > @@ -471,7 +471,7 @@ struct multipath { > int ghost_delay_tick; > int queue_mode; > unsigned int sync_tick; > - int synced_count; > + int checker_count; > enum prio_update_type prio_update; > uid_t uid; > gid_t gid; > diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c > index 7a4e3eb..6aa744d 100644 > --- a/libmultipath/structs_vec.c > +++ b/libmultipath/structs_vec.c > @@ -530,7 +530,6 @@ update_multipath_table (struct multipath *mpp, vector pathvec, int flags) > conf = get_multipath_config(); > mpp->sync_tick = conf->max_checkint; > put_multipath_config(conf); > - mpp->synced_count++; > > r = libmp_mapinfo(DM_MAP_BY_NAME | MAPINFO_MPATH_ONLY, > (mapid_t) { .str = mpp->alias }, > diff --git a/multipathd/main.c b/multipathd/main.c > index 4a28fbb..e4e6bf7 100644 > --- a/multipathd/main.c > +++ b/multipathd/main.c > @@ -2470,7 +2470,7 @@ sync_mpp(struct vectors * vecs, struct multipath *mpp, unsigned int ticks) > if (mpp->sync_tick) > mpp->sync_tick -= (mpp->sync_tick > ticks) ? ticks : > mpp->sync_tick; > - if (mpp->sync_tick) > + if (mpp->sync_tick && !mpp->checker_count) > return; > > do_sync_mpp(vecs, mpp); > @@ -2513,12 +2513,6 @@ update_path_state (struct vectors * vecs, struct path * pp) > return handle_path_wwid_change(pp, vecs)? CHECK_PATH_REMOVED : > CHECK_PATH_SKIPPED; > } > - if (pp->mpp->synced_count == 0) { > - do_sync_mpp(vecs, pp->mpp); > - /* if update_multipath_strings orphaned the path, quit early */ > - if (!pp->mpp) > - return CHECK_PATH_SKIPPED; > - } > if ((newstate != PATH_UP && newstate != PATH_GHOST && > newstate != PATH_PENDING) && (pp->state == PATH_DELAYED)) { > /* If path state become failed again cancel path delay state */ > @@ -2918,9 +2912,11 @@ check_paths(struct vectors *vecs, unsigned int ticks) > vector_foreach_slot(vecs->pathvec, pp, i) { > if (pp->is_checked != CHECK_PATH_UNCHECKED) > continue; > - if (pp->mpp) > + if (pp->mpp) { > pp->is_checked = check_path(pp, ticks); > - else > + if (pp->is_checked == CHECK_PATH_STARTED) > + pp->mpp->checker_count++; > + } else > pp->is_checked = check_uninitialized_path(pp, ticks); > if (pp->is_checked == CHECK_PATH_STARTED && > checker_need_wait(&pp->checker)) > @@ -3014,12 +3010,10 @@ checkerloop (void *ap) > pthread_cleanup_push(cleanup_lock, &vecs->lock); > lock(&vecs->lock); > pthread_testcancel(); > - vector_foreach_slot(vecs->mpvec, mpp, i) > - mpp->synced_count = 0; > if (checker_state == CHECKER_STARTING) { > vector_foreach_slot(vecs->mpvec, mpp, i) { > - sync_mpp(vecs, mpp, ticks); > mpp->prio_update = PRIO_UPDATE_NONE; > + mpp->checker_count = 0; > } > vector_foreach_slot(vecs->pathvec, pp, i) > pp->is_checked = CHECK_PATH_UNCHECKED; > @@ -3032,11 +3026,13 @@ checkerloop (void *ap) > start_time.tv_sec); > if (checker_state == CHECKER_FINISHED) { > vector_foreach_slot(vecs->mpvec, mpp, i) { > - if ((update_mpp_prio(mpp) || > - (mpp->need_reload && mpp->synced_count > 0)) && When you call reload_and_sync_map(), it will automatically resync the map via setup_multipath() -> refresh_multipath() -> update_multipath_strings(). This means that if for some reason multipathd and the kernel disagree about a map, and reloading it doesn't fix the problem, you will immediately set mpp->need_reload again. With the old mpp->synced_count check, you only reload maps with need_reload() when a path is checked. Without this check, or a (mpp->checker_count > 0) check to replace it, you will keep reloading these maps every loop, roughly once a second. I would rather not do this. If you want to make sure to immediately handle a need_reload that wasn't triggered by this call to reload_and_sync_map() which was because of an earlier need_reload, we could make need_reload have three states, to distinguish between a reload we want done immediately, and one we would like to wait on because we just did a reload and it didn't fix the problem. Then we could remember if need_reload was set before calling reload_and_sync_map(), and if it was, and if it is still set after, we could switch it to the delayed version. Or perhaps I'm just being paranoid here. -Ben > - reload_and_sync_map(mpp, vecs) == 2) > + sync_mpp(vecs, mpp, ticks); > + if ((update_mpp_prio(mpp) || mpp->need_reload) && > + reload_and_sync_map(mpp, vecs) == 2) { > /* multipath device deleted */ > i--; > + continue; > + } > } > } > lock_cleanup_pop(vecs->lock); > -- > 2.47.0