linux-nfs.vger.kernel.org archive mirror
* [PATCH 0/4 v3] nfsd: fix deadlock in move_to_close_lru()
@ 2024-03-04 22:45 NeilBrown
  2024-03-04 22:45 ` [PATCH 1/4] nfsd: move nfsd4_cstate_assign_replay() earlier in open handling NeilBrown
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: NeilBrown @ 2024-03-04 22:45 UTC (permalink / raw)
  To: Chuck Lever, Jeff Layton
  Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

This is very similar to v2 with:
 - first two patches swapped
 - while(1) loop changed to "goto retry"
 - add a comment to patch 3
 - rebase on nfsd-next
 - rb from Jeff added.

NeilBrown

 [PATCH 1/4] nfsd: move nfsd4_cstate_assign_replay() earlier in open
 [PATCH 2/4] nfsd: perform all find_openstateowner_str calls in the
 [PATCH 3/4] nfsd: replace rp_mutex to avoid deadlock in
 [PATCH 4/4] nfsd: drop st_mutex_mutex before calling


* [PATCH 1/4] nfsd: move nfsd4_cstate_assign_replay() earlier in open handling.
  2024-03-04 22:45 [PATCH 0/4 v3] nfsd: fix deadlock in move_to_close_lru() NeilBrown
@ 2024-03-04 22:45 ` NeilBrown
  2024-03-04 22:45 ` [PATCH 2/4] nfsd: perform all find_openstateowner_str calls in the one place NeilBrown
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: NeilBrown @ 2024-03-04 22:45 UTC (permalink / raw)
  To: Chuck Lever, Jeff Layton
  Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

Rather than taking rp_mutex (via nfsd4_cstate_assign_replay()) in
nfsd4_cleanup_open_state() - which seems counter-intuitive - take it
and assign the replay owner as early as possible, in
nfsd4_process_open1().

This prepares for a future change in which nfsd4_cstate_assign_replay()
might fail.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/nfsd/nfs4state.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index ee9aa4843443..b9b3a2601675 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5333,15 +5333,15 @@ nfsd4_process_open1(struct nfsd4_compound_state *cstate,
 	strhashval = ownerstr_hashval(&open->op_owner);
 	oo = find_openstateowner_str(strhashval, open, clp);
 	open->op_openowner = oo;
-	if (!oo) {
+	if (!oo)
 		goto new_owner;
-	}
 	if (!(oo->oo_flags & NFS4_OO_CONFIRMED)) {
 		/* Replace unconfirmed owners without checking for replay. */
 		release_openowner(oo);
 		open->op_openowner = NULL;
 		goto new_owner;
 	}
+	nfsd4_cstate_assign_replay(cstate, &oo->oo_owner);
 	status = nfsd4_check_seqid(cstate, &oo->oo_owner, open->op_seqid);
 	if (status)
 		return status;
@@ -5351,6 +5351,7 @@ nfsd4_process_open1(struct nfsd4_compound_state *cstate,
 	if (oo == NULL)
 		return nfserr_jukebox;
 	open->op_openowner = oo;
+	nfsd4_cstate_assign_replay(cstate, &oo->oo_owner);
 alloc_stateid:
 	open->op_stp = nfs4_alloc_open_stateid(clp);
 	if (!open->op_stp)
@@ -6122,12 +6123,8 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
 void nfsd4_cleanup_open_state(struct nfsd4_compound_state *cstate,
 			      struct nfsd4_open *open)
 {
-	if (open->op_openowner) {
-		struct nfs4_stateowner *so = &open->op_openowner->oo_owner;
-
-		nfsd4_cstate_assign_replay(cstate, so);
-		nfs4_put_stateowner(so);
-	}
+	if (open->op_openowner)
+		nfs4_put_stateowner(&open->op_openowner->oo_owner);
 	if (open->op_file)
 		kmem_cache_free(file_slab, open->op_file);
 	if (open->op_stp)
-- 
2.43.0



* [PATCH 2/4] nfsd: perform all find_openstateowner_str calls in the one place.
  2024-03-04 22:45 [PATCH 0/4 v3] nfsd: fix deadlock in move_to_close_lru() NeilBrown
  2024-03-04 22:45 ` [PATCH 1/4] nfsd: move nfsd4_cstate_assign_replay() earlier in open handling NeilBrown
@ 2024-03-04 22:45 ` NeilBrown
  2024-03-09 18:56   ` Chuck Lever
  2024-03-04 22:45 ` [PATCH 3/4] nfsd: replace rp_mutex to avoid deadlock in move_to_close_lru() NeilBrown
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 8+ messages in thread
From: NeilBrown @ 2024-03-04 22:45 UTC (permalink / raw)
  To: Chuck Lever, Jeff Layton
  Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

Currently find_openstateowner_str() lookups are done both in
nfsd4_process_open1() and alloc_init_open_stateowner() - the latter
possibly being a surprise based on its name.

It would be easier to follow, and more consistent with common patterns,
if the lookup were all in the one place.

So replace alloc_init_open_stateowner() with
find_or_alloc_open_stateowner() and use the latter in
nfsd4_process_open1() without any calls to find_openstateowner_str().

This means all finds are find_openstateowner_str_locked() and
find_openstateowner_str() is no longer needed.  So discard
find_openstateowner_str() and rename find_openstateowner_str_locked() to
find_openstateowner_str().
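
In outline, the new helper has the shape below.  This is a condensed
sketch, not the literal patch: the replacement of unconfirmed owners and
the initialisation of a freshly allocated openowner are elided here; see
the hunk further down for the full version.

static struct nfs4_openowner *
find_or_alloc_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
                              struct nfsd4_compound_state *cstate)
{
        struct nfs4_client *clp = cstate->clp;
        struct nfs4_openowner *oo, *new = NULL;

        while (1) {
                spin_lock(&clp->cl_lock);
                oo = find_openstateowner_str(strhashval, open, clp);
                if (oo) {
                        /* An existing owner was found - drop any speculative
                         * allocation made on a previous pass.
                         */
                        spin_unlock(&clp->cl_lock);
                        if (new)
                                nfs4_free_stateowner(&new->oo_owner);
                        return oo;
                }
                if (new) {
                        /* Still no owner, and ours is ready - hash it. */
                        hash_openowner(new, clp, strhashval);
                        spin_unlock(&clp->cl_lock);
                        return new;
                }
                spin_unlock(&clp->cl_lock);

                /* Allocate outside the spinlock, then loop to look up again
                 * in case an owner appeared while we could have slept.
                 */
                new = alloc_stateowner(openowner_slab, &open->op_owner, clp);
                if (!new)
                        return NULL;
        }
}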

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/nfsd/nfs4state.c | 93 +++++++++++++++++++--------------------------
 1 file changed, 40 insertions(+), 53 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index b9b3a2601675..0c1058e8cc4b 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -541,7 +541,7 @@ same_owner_str(struct nfs4_stateowner *sop, struct xdr_netobj *owner)
 }
 
 static struct nfs4_openowner *
-find_openstateowner_str_locked(unsigned int hashval, struct nfsd4_open *open,
+find_openstateowner_str(unsigned int hashval, struct nfsd4_open *open,
 			struct nfs4_client *clp)
 {
 	struct nfs4_stateowner *so;
@@ -558,18 +558,6 @@ find_openstateowner_str_locked(unsigned int hashval, struct nfsd4_open *open,
 	return NULL;
 }
 
-static struct nfs4_openowner *
-find_openstateowner_str(unsigned int hashval, struct nfsd4_open *open,
-			struct nfs4_client *clp)
-{
-	struct nfs4_openowner *oo;
-
-	spin_lock(&clp->cl_lock);
-	oo = find_openstateowner_str_locked(hashval, open, clp);
-	spin_unlock(&clp->cl_lock);
-	return oo;
-}
-
 static inline u32
 opaque_hashval(const void *ptr, int nbytes)
 {
@@ -4855,34 +4843,46 @@ nfsd4_find_and_lock_existing_open(struct nfs4_file *fp, struct nfsd4_open *open)
 }
 
 static struct nfs4_openowner *
-alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
-			   struct nfsd4_compound_state *cstate)
+find_or_alloc_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
+			      struct nfsd4_compound_state *cstate)
 {
 	struct nfs4_client *clp = cstate->clp;
-	struct nfs4_openowner *oo, *ret;
+	struct nfs4_openowner *oo, *new = NULL;
 
-	oo = alloc_stateowner(openowner_slab, &open->op_owner, clp);
-	if (!oo)
-		return NULL;
-	oo->oo_owner.so_ops = &openowner_ops;
-	oo->oo_owner.so_is_open_owner = 1;
-	oo->oo_owner.so_seqid = open->op_seqid;
-	oo->oo_flags = 0;
-	if (nfsd4_has_session(cstate))
-		oo->oo_flags |= NFS4_OO_CONFIRMED;
-	oo->oo_time = 0;
-	oo->oo_last_closed_stid = NULL;
-	INIT_LIST_HEAD(&oo->oo_close_lru);
-	spin_lock(&clp->cl_lock);
-	ret = find_openstateowner_str_locked(strhashval, open, clp);
-	if (ret == NULL) {
-		hash_openowner(oo, clp, strhashval);
-		ret = oo;
-	} else
-		nfs4_free_stateowner(&oo->oo_owner);
+	while (1) {
+		spin_lock(&clp->cl_lock);
+		oo = find_openstateowner_str(strhashval, open, clp);
+		if (oo && !(oo->oo_flags & NFS4_OO_CONFIRMED)) {
+			/* Replace unconfirmed owners without checking for replay. */
+			release_openowner(oo);
+			oo = NULL;
+		}
+		if (oo) {
+			spin_unlock(&clp->cl_lock);
+			if (new)
+				nfs4_free_stateowner(&new->oo_owner);
+			return oo;
+		}
+		if (new) {
+			hash_openowner(new, clp, strhashval);
+			spin_unlock(&clp->cl_lock);
+			return new;
+		}
+		spin_unlock(&clp->cl_lock);
 
-	spin_unlock(&clp->cl_lock);
-	return ret;
+		new = alloc_stateowner(openowner_slab, &open->op_owner, clp);
+		if (!new)
+			return NULL;
+		new->oo_owner.so_ops = &openowner_ops;
+		new->oo_owner.so_is_open_owner = 1;
+		new->oo_owner.so_seqid = open->op_seqid;
+		new->oo_flags = 0;
+		if (nfsd4_has_session(cstate))
+			new->oo_flags |= NFS4_OO_CONFIRMED;
+		new->oo_time = 0;
+		new->oo_last_closed_stid = NULL;
+		INIT_LIST_HEAD(&new->oo_close_lru);
+	}
 }
 
 static struct nfs4_ol_stateid *
@@ -5331,28 +5331,15 @@ nfsd4_process_open1(struct nfsd4_compound_state *cstate,
 	clp = cstate->clp;
 
 	strhashval = ownerstr_hashval(&open->op_owner);
-	oo = find_openstateowner_str(strhashval, open, clp);
+	oo = find_or_alloc_open_stateowner(strhashval, open, cstate);
 	open->op_openowner = oo;
 	if (!oo)
-		goto new_owner;
-	if (!(oo->oo_flags & NFS4_OO_CONFIRMED)) {
-		/* Replace unconfirmed owners without checking for replay. */
-		release_openowner(oo);
-		open->op_openowner = NULL;
-		goto new_owner;
-	}
+		return nfserr_jukebox;
 	nfsd4_cstate_assign_replay(cstate, &oo->oo_owner);
 	status = nfsd4_check_seqid(cstate, &oo->oo_owner, open->op_seqid);
 	if (status)
 		return status;
-	goto alloc_stateid;
-new_owner:
-	oo = alloc_init_open_stateowner(strhashval, open, cstate);
-	if (oo == NULL)
-		return nfserr_jukebox;
-	open->op_openowner = oo;
-	nfsd4_cstate_assign_replay(cstate, &oo->oo_owner);
-alloc_stateid:
+
 	open->op_stp = nfs4_alloc_open_stateid(clp);
 	if (!open->op_stp)
 		return nfserr_jukebox;
-- 
2.43.0



* [PATCH 3/4] nfsd: replace rp_mutex to avoid deadlock in move_to_close_lru()
  2024-03-04 22:45 [PATCH 0/4 v3] nfsd: fix deadlock in move_to_close_lru() NeilBrown
  2024-03-04 22:45 ` [PATCH 1/4] nfsd: move nfsd4_cstate_assign_replay() earlier in open handling NeilBrown
  2024-03-04 22:45 ` [PATCH 2/4] nfsd: perform all find_openstateowner_str calls in the one place NeilBrown
@ 2024-03-04 22:45 ` NeilBrown
  2024-03-04 22:45 ` [PATCH 4/4] nfsd: drop st_mutex_mutex before calling move_to_close_lru() NeilBrown
  2024-03-05 15:18 ` [PATCH 0/4 v3] nfsd: fix deadlock in move_to_close_lru() Jeff Layton
  4 siblings, 0 replies; 8+ messages in thread
From: NeilBrown @ 2024-03-04 22:45 UTC (permalink / raw)
  To: Chuck Lever, Jeff Layton
  Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

move_to_close_lru() waits for sc_count to drop to 2 while holding
rp_mutex.  This can deadlock if another thread holds a reference and is
waiting for rp_mutex.

By the time we get to move_to_close_lru() the openowner is unhashed and
cannot be found any more.  So code waiting for the mutex can safely
retry the lookup if move_to_close_lru() has started.

So change rp_mutex to an atomic_t with three states:

 RP_UNLOCKED - state is still hashed, not locked for reply
 RP_LOCKED   - state is still hashed, locked for reply
 RP_UNHASHED - state is not hashed, no code can get a lock.

Use wait_var_event() to wait for either a lock, or for the owner to be
unhashed.  In the latter case, retry the lookup.
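
In outline, the locking protocol looks like this.  The helper names
below (replay_lock, replay_unlock, replay_unhash) are purely
illustrative - the real code open-codes this logic in
nfsd4_cstate_assign_replay(), nfsd4_cstate_clear_replay() and
move_to_close_lru(), using the enum and atomic_t added by this patch.

/* Sketch only: condensed from the hunks below, invented helper names. */
static int replay_lock(struct nfs4_replay *rp)
{
        /* Wait until we either take the lock (UNLOCKED -> LOCKED) or
         * see that the owner has been unhashed.
         */
        wait_var_event(&rp->rp_locked,
                       atomic_cmpxchg(&rp->rp_locked,
                                      RP_UNLOCKED, RP_LOCKED) != RP_LOCKED);
        if (atomic_read(&rp->rp_locked) == RP_UNHASHED)
                return -EAGAIN;         /* owner is gone - retry the lookup */
        return 0;
}

static void replay_unlock(struct nfs4_replay *rp)
{
        atomic_set(&rp->rp_locked, RP_UNLOCKED);
        wake_up_var(&rp->rp_locked);
}

static void replay_unhash(struct nfs4_replay *rp)
{
        /* Abort the "mutex": any waiter sees RP_UNHASHED and retries. */
        atomic_set(&rp->rp_locked, RP_UNHASHED);
        wake_up_var(&rp->rp_locked);
}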

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/nfsd/nfs4state.c | 43 ++++++++++++++++++++++++++++++++++++-------
 fs/nfsd/state.h     |  2 +-
 2 files changed, 37 insertions(+), 8 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 0c1058e8cc4b..52838db0c46e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4639,21 +4639,37 @@ nfsd4_init_leases_net(struct nfsd_net *nn)
 	atomic_set(&nn->nfsd_courtesy_clients, 0);
 }
 
+enum rp_lock {
+	RP_UNLOCKED,
+	RP_LOCKED,
+	RP_UNHASHED,
+};
+
 static void init_nfs4_replay(struct nfs4_replay *rp)
 {
 	rp->rp_status = nfserr_serverfault;
 	rp->rp_buflen = 0;
 	rp->rp_buf = rp->rp_ibuf;
-	mutex_init(&rp->rp_mutex);
+	atomic_set(&rp->rp_locked, RP_UNLOCKED);
 }
 
-static void nfsd4_cstate_assign_replay(struct nfsd4_compound_state *cstate,
-		struct nfs4_stateowner *so)
+static int nfsd4_cstate_assign_replay(struct nfsd4_compound_state *cstate,
+				      struct nfs4_stateowner *so)
 {
 	if (!nfsd4_has_session(cstate)) {
-		mutex_lock(&so->so_replay.rp_mutex);
+		/*
+		 * rp_locked is an open-coded mutex which is unlocked in
+		 * nfsd4_cstate_clear_replay() or aborted in
+		 * move_to_close_lru().
+		 */
+		wait_var_event(&so->so_replay.rp_locked,
+			       atomic_cmpxchg(&so->so_replay.rp_locked,
+					      RP_UNLOCKED, RP_LOCKED) != RP_LOCKED);
+		if (atomic_read(&so->so_replay.rp_locked) == RP_UNHASHED)
+			return -EAGAIN;
 		cstate->replay_owner = nfs4_get_stateowner(so);
 	}
+	return 0;
 }
 
 void nfsd4_cstate_clear_replay(struct nfsd4_compound_state *cstate)
@@ -4662,7 +4678,8 @@ void nfsd4_cstate_clear_replay(struct nfsd4_compound_state *cstate)
 
 	if (so != NULL) {
 		cstate->replay_owner = NULL;
-		mutex_unlock(&so->so_replay.rp_mutex);
+		atomic_set(&so->so_replay.rp_locked, RP_UNLOCKED);
+		wake_up_var(&so->so_replay.rp_locked);
 		nfs4_put_stateowner(so);
 	}
 }
@@ -4958,7 +4975,11 @@ move_to_close_lru(struct nfs4_ol_stateid *s, struct net *net)
 	 * Wait for the refcount to drop to 2. Since it has been unhashed,
 	 * there should be no danger of the refcount going back up again at
 	 * this point.
+	 * Some threads with a reference might be waiting for rp_locked,
+	 * so tell them to stop waiting.
 	 */
+	atomic_set(&oo->oo_owner.so_replay.rp_locked, RP_UNHASHED);
+	wake_up_var(&oo->oo_owner.so_replay.rp_locked);
 	wait_event(close_wq, refcount_read(&s->st_stid.sc_count) == 2);
 
 	release_all_access(s);
@@ -5331,11 +5352,15 @@ nfsd4_process_open1(struct nfsd4_compound_state *cstate,
 	clp = cstate->clp;
 
 	strhashval = ownerstr_hashval(&open->op_owner);
+retry:
 	oo = find_or_alloc_open_stateowner(strhashval, open, cstate);
 	open->op_openowner = oo;
 	if (!oo)
 		return nfserr_jukebox;
-	nfsd4_cstate_assign_replay(cstate, &oo->oo_owner);
+	if (nfsd4_cstate_assign_replay(cstate, &oo->oo_owner) == -EAGAIN) {
+		nfs4_put_stateowner(&oo->oo_owner);
+		goto retry;
+	}
 	status = nfsd4_check_seqid(cstate, &oo->oo_owner, open->op_seqid);
 	if (status)
 		return status;
@@ -7175,12 +7200,16 @@ nfs4_preprocess_seqid_op(struct nfsd4_compound_state *cstate, u32 seqid,
 	trace_nfsd_preprocess(seqid, stateid);
 
 	*stpp = NULL;
+retry:
 	status = nfsd4_lookup_stateid(cstate, stateid,
 				      typemask, statusmask, &s, nn);
 	if (status)
 		return status;
 	stp = openlockstateid(s);
-	nfsd4_cstate_assign_replay(cstate, stp->st_stateowner);
+	if (nfsd4_cstate_assign_replay(cstate, stp->st_stateowner) == -EAGAIN) {
+		nfs4_put_stateowner(stp->st_stateowner);
+		goto retry;
+	}
 
 	status = nfs4_seqid_op_checks(cstate, stateid, seqid, stp);
 	if (!status)
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 01c6f3445646..0400441c87c1 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -486,7 +486,7 @@ struct nfs4_replay {
 	unsigned int		rp_buflen;
 	char			*rp_buf;
 	struct knfsd_fh		rp_openfh;
-	struct mutex		rp_mutex;
+	atomic_t		rp_locked;
 	char			rp_ibuf[NFSD4_REPLAY_ISIZE];
 };
 
-- 
2.43.0



* [PATCH 4/4] nfsd: drop st_mutex_mutex before calling move_to_close_lru()
  2024-03-04 22:45 [PATCH 0/4 v3] nfsd: fix deadlock in move_to_close_lru() NeilBrown
                   ` (2 preceding siblings ...)
  2024-03-04 22:45 ` [PATCH 3/4] nfsd: replace rp_mutex to avoid deadlock in move_to_close_lru() NeilBrown
@ 2024-03-04 22:45 ` NeilBrown
  2024-03-05 15:18 ` [PATCH 0/4 v3] nfsd: fix deadlock in move_to_close_lru() Jeff Layton
  4 siblings, 0 replies; 8+ messages in thread
From: NeilBrown @ 2024-03-04 22:45 UTC (permalink / raw)
  To: Chuck Lever, Jeff Layton
  Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

move_to_close_lru() is currently called with ->st_mutex held.
This can lead to a deadlock as move_to_close_lru() waits for sc_count to
drop to 2, and some threads holding a reference might be waiting for the
mutex.  These references will never be dropped so sc_count will never
reach 2.

There can be no harm in dropping ->st_mutex before
move_to_close_lru() because the only place that takes the mutex is
nfsd4_lock_ol_stateid(), and it quickly aborts if sc_type is
NFS4_CLOSED_STID, which it will be before move_to_close_lru() is called.

See also
 https://lore.kernel.org/lkml/4dd1fe21e11344e5969bb112e954affb@jd.com/T/
where this problem was raised but not successfully resolved.
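
For clarity, the tail of nfsd4_close() after this change looks roughly
as below (condensed from the hunks that follow):

        nfs4_inc_and_copy_stateid(&close->cl_stateid, &stp->st_stid);

        /* Decide, while st_mutex is still held, whether the stateid must
         * be moved to the close LRU ...
         */
        need_move_to_close_list = nfsd4_close_open_stateid(stp);
        mutex_unlock(&stp->st_mutex);

        /* ... but only wait for the other sc_count references to be
         * dropped after st_mutex has been released, so their holders can
         * make progress.
         */
        if (need_move_to_close_list)
                move_to_close_lru(stp, net);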

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfs4state.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 52838db0c46e..13cb91d704fd 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -7351,7 +7351,7 @@ nfsd4_open_downgrade(struct svc_rqst *rqstp,
 	return status;
 }
 
-static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
+static bool nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
 {
 	struct nfs4_client *clp = s->st_stid.sc_client;
 	bool unhashed;
@@ -7368,11 +7368,11 @@ static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
 		list_for_each_entry(stp, &reaplist, st_locks)
 			nfs4_free_cpntf_statelist(clp->net, &stp->st_stid);
 		free_ol_stateid_reaplist(&reaplist);
+		return false;
 	} else {
 		spin_unlock(&clp->cl_lock);
 		free_ol_stateid_reaplist(&reaplist);
-		if (unhashed)
-			move_to_close_lru(s, clp->net);
+		return unhashed;
 	}
 }
 
@@ -7388,6 +7388,7 @@ nfsd4_close(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	struct nfs4_ol_stateid *stp;
 	struct net *net = SVC_NET(rqstp);
 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+	bool need_move_to_close_list;
 
 	dprintk("NFSD: nfsd4_close on file %pd\n",
 			cstate->current_fh.fh_dentry);
@@ -7412,8 +7413,10 @@ nfsd4_close(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	 */
 	nfs4_inc_and_copy_stateid(&close->cl_stateid, &stp->st_stid);
 
-	nfsd4_close_open_stateid(stp);
+	need_move_to_close_list = nfsd4_close_open_stateid(stp);
 	mutex_unlock(&stp->st_mutex);
+	if (need_move_to_close_list)
+		move_to_close_lru(stp, net);
 
 	/* v4.1+ suggests that we send a special stateid in here, since the
 	 * clients should just ignore this anyway. Since this is not useful
-- 
2.43.0



* Re: [PATCH 0/4 v3] nfsd: fix deadlock in move_to_close_lru()
  2024-03-04 22:45 [PATCH 0/4 v3] nfsd: fix deadlock in move_to_close_lru() NeilBrown
                   ` (3 preceding siblings ...)
  2024-03-04 22:45 ` [PATCH 4/4] nfsd: drop st_mutex_mutex before calling move_to_close_lru() NeilBrown
@ 2024-03-05 15:18 ` Jeff Layton
  4 siblings, 0 replies; 8+ messages in thread
From: Jeff Layton @ 2024-03-05 15:18 UTC (permalink / raw)
  To: NeilBrown, Chuck Lever; +Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Tue, 2024-03-05 at 09:45 +1100, NeilBrown wrote:
> This is very similar to v2 with:
>  - first two patches swapped
>  - while(1) loop changed to "goto retry"
>  - add a comment to patch 3
>  - rebase on nfsd-next
>  - rb from Jeff added.
> 
> NeilBrown
> 
>  [PATCH 1/4] nfsd: move nfsd4_cstate_assign_replay() earlier in open
>  [PATCH 2/4] nfsd: perform all find_openstateowner_str calls in the
>  [PATCH 3/4] nfsd: replace rp_mutex to avoid deadlock in
>  [PATCH 4/4] nfsd: drop st_mutex_mutex before calling

You can add this to the set. Nice work!

Reviewed-by: Jeff Layton <jlayton@kernel.org>


* Re: [PATCH 2/4] nfsd: perform all find_openstateowner_str calls in the one place.
  2024-03-04 22:45 ` [PATCH 2/4] nfsd: perform all find_openstateowner_str calls in the one place NeilBrown
@ 2024-03-09 18:56   ` Chuck Lever
  2024-03-10 22:13     ` NeilBrown
  0 siblings, 1 reply; 8+ messages in thread
From: Chuck Lever @ 2024-03-09 18:56 UTC (permalink / raw)
  To: NeilBrown; +Cc: Jeff Layton, linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Tue, Mar 05, 2024 at 09:45:22AM +1100, NeilBrown wrote:
> Currently find_openstateowner_str() lookups are done both in
> nfsd4_process_open1() and alloc_init_open_stateowner() - the latter
> possibly being a surprise based on its name.
> 
> It would be easier to follow, and more consistent with common patterns,
> if the lookup were all in the one place.
> 
> So replace alloc_init_open_stateowner() with
> find_or_alloc_open_stateowner() and use the latter in
> nfsd4_process_open1() without any calls to find_openstateowner_str().
> 
> This means all finds are find_openstateowner_str_locked() and
> find_openstateowner_str() is no longer needed.  So discard
> find_openstateowner_str() and rename find_openstateowner_str_locked() to
> find_openstateowner_str().
> 
> Signed-off-by: NeilBrown <neilb@suse.de>

While running the NFSv4.0 pynfs tests with KASAN and lock debugging
enabled, I get a number of CPU soft lockup warnings on the test
server and then it wedges hard. I bisected to this patch.

Because we are just a day shy of the v6.9 merge window, I'm going to
drop these four patches for v6.9. We can try them again for v6.10.


-- 
Chuck Lever


* Re: [PATCH 2/4] nfsd: perform all find_openstateowner_str calls in the one place.
  2024-03-09 18:56   ` Chuck Lever
@ 2024-03-10 22:13     ` NeilBrown
  0 siblings, 0 replies; 8+ messages in thread
From: NeilBrown @ 2024-03-10 22:13 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Jeff Layton, linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Sun, 10 Mar 2024, Chuck Lever wrote:
> While running the NFSv4.0 pynfs tests with KASAN and lock debugging
> enabled, I get a number of CPU soft lockup warnings on the test
> server and then it wedges hard. I bisected to this patch.

Almost certainly this is the problem Dan Carpenter found and reported.

> 
> Because we are just a day shy of the v6.9 merge window, I'm going to
> drop these four patches for v6.9. We can try them again for v6.10.

I'll post a revised version of this patch - it shouldn't affect the
subsequent patches (though I can resend them too if/when you want).

Up to you when it lands, of course.

NeilBrown


