Linux NFS development
* [PATCH 0/7] lockd: fix races that can result in stuck filelocks
@ 2023-03-03 12:15 Jeff Layton
  2023-03-03 12:15 ` [PATCH 1/7] lockd: purge resources held on behalf of nlm clients when shutting down Jeff Layton
                   ` (7 more replies)
  0 siblings, 8 replies; 15+ messages in thread
From: Jeff Layton @ 2023-03-03 12:15 UTC (permalink / raw)
  To: trond.myklebust, anna, chuck.lever; +Cc: linux-nfs, yoyang

I sent the first patch in this series the other day, but didn't get any
responses. Since then I've had time to follow up on the client-side part
of this problem, which eventually also pointed out yet another bug on
the server side. There are also a couple of cleanup patches in here too,
and a patch to add some tracepoints that I found useful while diagnosing
this.

With this set on both client and server, I'm now able to run Yongcheng's
test for an hour straight with no stuck locks.

Jeff Layton (7):
  lockd: purge resources held on behalf of nlm clients when shutting
    down
  lockd: remove 2 unused helper functions
  lockd: move struct nlm_wait to lockd.h
  lockd: fix races in client GRANTED_MSG wait logic
  lockd: server should unlock lock if client rejects the grant
  nfs: move nfs_fhandle_hash to common include file
  lockd: add some client-side tracepoints

 fs/lockd/Makefile           |  6 ++-
 fs/lockd/clntlock.c         | 58 +++++++++++---------------
 fs/lockd/clntproc.c         | 42 ++++++++++++++-----
 fs/lockd/host.c             |  1 +
 fs/lockd/svclock.c          | 21 ++++++++--
 fs/lockd/trace.c            |  3 ++
 fs/lockd/trace.h            | 83 +++++++++++++++++++++++++++++++++++++
 fs/nfs/internal.h           | 15 -------
 include/linux/lockd/lockd.h | 29 ++++++-------
 include/linux/nfs.h         | 20 +++++++++
 10 files changed, 200 insertions(+), 78 deletions(-)
 create mode 100644 fs/lockd/trace.c
 create mode 100644 fs/lockd/trace.h

-- 
2.39.2


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/7] lockd: purge resources held on behalf of nlm clients when shutting down
  2023-03-03 12:15 [PATCH 0/7] lockd: fix races that can result in stuck filelocks Jeff Layton
@ 2023-03-03 12:15 ` Jeff Layton
  2023-03-03 12:15 ` [PATCH 2/7] lockd: remove 2 unused helper functions Jeff Layton
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Jeff Layton @ 2023-03-03 12:15 UTC (permalink / raw)
  To: trond.myklebust, anna, chuck.lever; +Cc: linux-nfs, yoyang

It's easily possible for the server to have an outstanding lock when we
go to shut down. When that happens, we often get a warning like this in
the kernel log:

    lockd: couldn't shutdown host module for net f0000000!

This is because the shutdown procedures skip removing any hosts that
still have outstanding resources (locks). Eventually, things seem to get
cleaned up anyway, but the log message is unsettling, and server
shutdown doesn't seem to be working the way it was intended.

Ensure that we tear down any resources held on behalf of a client when
tearing down its host for server shutdown.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=2063818
Reported-by: Yongcheng Yang <yoyang@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/lockd/host.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/lockd/host.c b/fs/lockd/host.c
index cdc8e12cdac4..127a728fcbc8 100644
--- a/fs/lockd/host.c
+++ b/fs/lockd/host.c
@@ -629,6 +629,7 @@ nlm_shutdown_hosts_net(struct net *net)
 			rpc_shutdown_client(host->h_rpcclnt);
 			host->h_rpcclnt = NULL;
 		}
+		nlmsvc_free_host_resources(host);
 	}
 
 	/* Then, perform a garbage collection pass */
-- 
2.39.2



* [PATCH 2/7] lockd: remove 2 unused helper functions
  2023-03-03 12:15 [PATCH 0/7] lockd: fix races that can result in stuck filelocks Jeff Layton
  2023-03-03 12:15 ` [PATCH 1/7] lockd: purge resources held on behalf of nlm clients when shutting down Jeff Layton
@ 2023-03-03 12:15 ` Jeff Layton
  2023-03-03 12:15 ` [PATCH 3/7] lockd: move struct nlm_wait to lockd.h Jeff Layton
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Jeff Layton @ 2023-03-03 12:15 UTC (permalink / raw)
  To: trond.myklebust, anna, chuck.lever; +Cc: linux-nfs, yoyang

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/lockd/lockd.h | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
index 0168ac9fdda8..26c2aed31a0c 100644
--- a/include/linux/lockd/lockd.h
+++ b/include/linux/lockd/lockd.h
@@ -99,21 +99,11 @@ struct nsm_handle {
 /*
  * Rigorous type checking on sockaddr type conversions
  */
-static inline struct sockaddr_in *nlm_addr_in(const struct nlm_host *host)
-{
-	return (struct sockaddr_in *)&host->h_addr;
-}
-
 static inline struct sockaddr *nlm_addr(const struct nlm_host *host)
 {
 	return (struct sockaddr *)&host->h_addr;
 }
 
-static inline struct sockaddr_in *nlm_srcaddr_in(const struct nlm_host *host)
-{
-	return (struct sockaddr_in *)&host->h_srcaddr;
-}
-
 static inline struct sockaddr *nlm_srcaddr(const struct nlm_host *host)
 {
 	return (struct sockaddr *)&host->h_srcaddr;
-- 
2.39.2



* [PATCH 3/7] lockd: move struct nlm_wait to lockd.h
  2023-03-03 12:15 [PATCH 0/7] lockd: fix races that can result in stuck filelocks Jeff Layton
  2023-03-03 12:15 ` [PATCH 1/7] lockd: purge resources held on behalf of nlm clients when shutting down Jeff Layton
  2023-03-03 12:15 ` [PATCH 2/7] lockd: remove 2 unused helper functions Jeff Layton
@ 2023-03-03 12:15 ` Jeff Layton
  2023-03-03 12:16 ` [PATCH 4/7] lockd: fix races in client GRANTED_MSG wait logic Jeff Layton
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Jeff Layton @ 2023-03-03 12:15 UTC (permalink / raw)
  To: trond.myklebust, anna, chuck.lever; +Cc: linux-nfs, yoyang

...and drop the unused b_reclaim field.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/lockd/clntlock.c         | 12 ------------
 include/linux/lockd/lockd.h | 11 ++++++++++-
 2 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/fs/lockd/clntlock.c b/fs/lockd/clntlock.c
index 82b19a30e0f0..464cb15c1a06 100644
--- a/fs/lockd/clntlock.c
+++ b/fs/lockd/clntlock.c
@@ -29,18 +29,6 @@ static int			reclaimer(void *ptr);
  * client perspective.
  */
 
-/*
- * This is the representation of a blocked client lock.
- */
-struct nlm_wait {
-	struct list_head	b_list;		/* linked list */
-	wait_queue_head_t	b_wait;		/* where to wait on */
-	struct nlm_host *	b_host;
-	struct file_lock *	b_lock;		/* local file lock */
-	unsigned short		b_reclaim;	/* got to reclaim lock */
-	__be32			b_status;	/* grant callback status */
-};
-
 static LIST_HEAD(nlm_blocked);
 static DEFINE_SPINLOCK(nlm_blocked_lock);
 
diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
index 26c2aed31a0c..0eec760fcc05 100644
--- a/include/linux/lockd/lockd.h
+++ b/include/linux/lockd/lockd.h
@@ -121,7 +121,16 @@ struct nlm_lockowner {
 	uint32_t pid;
 };
 
-struct nlm_wait;
+/*
+ * This is the representation of a blocked client lock.
+ */
+struct nlm_wait {
+	struct list_head	b_list;		/* linked list */
+	wait_queue_head_t	b_wait;		/* where to wait on */
+	struct nlm_host *	b_host;
+	struct file_lock *	b_lock;		/* local file lock */
+	__be32			b_status;	/* grant callback status */
+};
 
 /*
  * Memory chunk for NLM client RPC request.
-- 
2.39.2



* [PATCH 4/7] lockd: fix races in client GRANTED_MSG wait logic
  2023-03-03 12:15 [PATCH 0/7] lockd: fix races that can result in stuck filelocks Jeff Layton
                   ` (2 preceding siblings ...)
  2023-03-03 12:15 ` [PATCH 3/7] lockd: move struct nlm_wait to lockd.h Jeff Layton
@ 2023-03-03 12:16 ` Jeff Layton
  2023-03-03 12:16 ` [PATCH 5/7] lockd: server should unlock lock if client rejects the grant Jeff Layton
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Jeff Layton @ 2023-03-03 12:16 UTC (permalink / raw)
  To: trond.myklebust, anna, chuck.lever; +Cc: linux-nfs, yoyang

After the wait for a grant is done (for whatever reason), nlmclnt_block
updates the status of the nlm_rqst with the status of the block. At the
point it does this, however, the block is still queued and its status
could change at any time.

This is particularly a problem when the waiting task is signaled during
the wait. We can end up giving up on the lock just before the GRANTED_MSG
callback comes in, yet still accept the grant even though the lock request
gets back an error, leaving a dangling lock on the server.

Since the nlm_wait never lives beyond the end of nlmclnt_lock, put it on
the stack and add functions to allow us to enqueue and dequeue the
block. Enqueue it just before the lock/wait loop, and dequeue it
just after we exit the loop instead of waiting until the end of
the function. Also, scrape the status at the time that we dequeue it to
ensure that it's final.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=2063818
Reported-by: Yongcheng Yang <yoyang@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/lockd/clntlock.c         | 42 ++++++++++++++++++-------------------
 fs/lockd/clntproc.c         | 28 ++++++++++++++++---------
 include/linux/lockd/lockd.h |  8 ++++---
 3 files changed, 44 insertions(+), 34 deletions(-)

diff --git a/fs/lockd/clntlock.c b/fs/lockd/clntlock.c
index 464cb15c1a06..c374ee072db3 100644
--- a/fs/lockd/clntlock.c
+++ b/fs/lockd/clntlock.c
@@ -82,41 +82,42 @@ void nlmclnt_done(struct nlm_host *host)
 }
 EXPORT_SYMBOL_GPL(nlmclnt_done);
 
+void nlmclnt_prepare_block(struct nlm_wait *block, struct nlm_host *host, struct file_lock *fl)
+{
+	block->b_host = host;
+	block->b_lock = fl;
+	init_waitqueue_head(&block->b_wait);
+	block->b_status = nlm_lck_blocked;
+}
+
 /*
  * Queue up a lock for blocking so that the GRANTED request can see it
  */
-struct nlm_wait *nlmclnt_prepare_block(struct nlm_host *host, struct file_lock *fl)
+void nlmclnt_queue_block(struct nlm_wait *block)
 {
-	struct nlm_wait *block;
-
-	block = kmalloc(sizeof(*block), GFP_KERNEL);
-	if (block != NULL) {
-		block->b_host = host;
-		block->b_lock = fl;
-		init_waitqueue_head(&block->b_wait);
-		block->b_status = nlm_lck_blocked;
-
-		spin_lock(&nlm_blocked_lock);
-		list_add(&block->b_list, &nlm_blocked);
-		spin_unlock(&nlm_blocked_lock);
-	}
-	return block;
+	spin_lock(&nlm_blocked_lock);
+	list_add(&block->b_list, &nlm_blocked);
+	spin_unlock(&nlm_blocked_lock);
 }
 
-void nlmclnt_finish_block(struct nlm_wait *block)
+/*
+ * Dequeue the block and return its final status
+ */
+__be32 nlmclnt_dequeue_block(struct nlm_wait *block)
 {
-	if (block == NULL)
-		return;
+	__be32 status;
+
 	spin_lock(&nlm_blocked_lock);
 	list_del(&block->b_list);
+	status = block->b_status;
 	spin_unlock(&nlm_blocked_lock);
-	kfree(block);
+	return status;
 }
 
 /*
  * Block on a lock
  */
-int nlmclnt_block(struct nlm_wait *block, struct nlm_rqst *req, long timeout)
+int nlmclnt_wait(struct nlm_wait *block, struct nlm_rqst *req, long timeout)
 {
 	long ret;
 
@@ -142,7 +143,6 @@ int nlmclnt_block(struct nlm_wait *block, struct nlm_rqst *req, long timeout)
 	/* Reset the lock status after a server reboot so we resend */
 	if (block->b_status == nlm_lck_denied_grace_period)
 		block->b_status = nlm_lck_blocked;
-	req->a_res.status = block->b_status;
 	return 0;
 }
 
diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
index 16b4de868cd2..a14c9110719c 100644
--- a/fs/lockd/clntproc.c
+++ b/fs/lockd/clntproc.c
@@ -516,9 +516,10 @@ nlmclnt_lock(struct nlm_rqst *req, struct file_lock *fl)
 	const struct cred *cred = nfs_file_cred(fl->fl_file);
 	struct nlm_host	*host = req->a_host;
 	struct nlm_res	*resp = &req->a_res;
-	struct nlm_wait *block = NULL;
+	struct nlm_wait block;
 	unsigned char fl_flags = fl->fl_flags;
 	unsigned char fl_type;
+	__be32 b_status;
 	int status = -ENOLCK;
 
 	if (nsm_monitor(host) < 0)
@@ -531,31 +532,41 @@ nlmclnt_lock(struct nlm_rqst *req, struct file_lock *fl)
 	if (status < 0)
 		goto out;
 
-	block = nlmclnt_prepare_block(host, fl);
+	nlmclnt_prepare_block(&block, host, fl);
 again:
 	/*
 	 * Initialise resp->status to a valid non-zero value,
 	 * since 0 == nlm_lck_granted
 	 */
 	resp->status = nlm_lck_blocked;
-	for(;;) {
+
+	/*
+	 * A GRANTED callback can come at any time -- even before the reply
+	 * to the LOCK request arrives, so we queue the wait before
+	 * requesting the lock.
+	 */
+	nlmclnt_queue_block(&block);
+	for (;;) {
 		/* Reboot protection */
 		fl->fl_u.nfs_fl.state = host->h_state;
 		status = nlmclnt_call(cred, req, NLMPROC_LOCK);
 		if (status < 0)
 			break;
 		/* Did a reclaimer thread notify us of a server reboot? */
-		if (resp->status ==  nlm_lck_denied_grace_period)
+		if (resp->status == nlm_lck_denied_grace_period)
 			continue;
 		if (resp->status != nlm_lck_blocked)
 			break;
 		/* Wait on an NLM blocking lock */
-		status = nlmclnt_block(block, req, NLMCLNT_POLL_TIMEOUT);
+		status = nlmclnt_wait(&block, req, NLMCLNT_POLL_TIMEOUT);
 		if (status < 0)
 			break;
-		if (resp->status != nlm_lck_blocked)
+		if (block.b_status != nlm_lck_blocked)
 			break;
 	}
+	b_status = nlmclnt_dequeue_block(&block);
+	if (resp->status == nlm_lck_blocked)
+		resp->status = b_status;
 
 	/* if we were interrupted while blocking, then cancel the lock request
 	 * and exit
@@ -564,7 +575,7 @@ nlmclnt_lock(struct nlm_rqst *req, struct file_lock *fl)
 		if (!req->a_args.block)
 			goto out_unlock;
 		if (nlmclnt_cancel(host, req->a_args.block, fl) == 0)
-			goto out_unblock;
+			goto out;
 	}
 
 	if (resp->status == nlm_granted) {
@@ -593,8 +604,6 @@ nlmclnt_lock(struct nlm_rqst *req, struct file_lock *fl)
 		status = -ENOLCK;
 	else
 		status = nlm_stat_to_errno(resp->status);
-out_unblock:
-	nlmclnt_finish_block(block);
 out:
 	nlmclnt_release_call(req);
 	return status;
@@ -602,7 +611,6 @@ nlmclnt_lock(struct nlm_rqst *req, struct file_lock *fl)
 	/* Fatal error: ensure that we remove the lock altogether */
 	dprintk("lockd: lock attempt ended in fatal error.\n"
 		"       Attempting to unlock.\n");
-	nlmclnt_finish_block(block);
 	fl_type = fl->fl_type;
 	fl->fl_type = F_UNLCK;
 	down_read(&host->h_rwsem);
diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
index 0eec760fcc05..7452fb88ecd4 100644
--- a/include/linux/lockd/lockd.h
+++ b/include/linux/lockd/lockd.h
@@ -211,9 +211,11 @@ struct nlm_rqst * nlm_alloc_call(struct nlm_host *host);
 int		  nlm_async_call(struct nlm_rqst *, u32, const struct rpc_call_ops *);
 int		  nlm_async_reply(struct nlm_rqst *, u32, const struct rpc_call_ops *);
 void		  nlmclnt_release_call(struct nlm_rqst *);
-struct nlm_wait * nlmclnt_prepare_block(struct nlm_host *host, struct file_lock *fl);
-void		  nlmclnt_finish_block(struct nlm_wait *block);
-int		  nlmclnt_block(struct nlm_wait *block, struct nlm_rqst *req, long timeout);
+void		  nlmclnt_prepare_block(struct nlm_wait *block, struct nlm_host *host,
+					struct file_lock *fl);
+void		  nlmclnt_queue_block(struct nlm_wait *block);
+__be32		  nlmclnt_dequeue_block(struct nlm_wait *block);
+int		  nlmclnt_wait(struct nlm_wait *block, struct nlm_rqst *req, long timeout);
 __be32		  nlmclnt_grant(const struct sockaddr *addr,
 				const struct nlm_lock *lock);
 void		  nlmclnt_recovery(struct nlm_host *);
-- 
2.39.2



* [PATCH 5/7] lockd: server should unlock lock if client rejects the grant
  2023-03-03 12:15 [PATCH 0/7] lockd: fix races that can result in stuck filelocks Jeff Layton
                   ` (3 preceding siblings ...)
  2023-03-03 12:16 ` [PATCH 4/7] lockd: fix races in client GRANTED_MSG wait logic Jeff Layton
@ 2023-03-03 12:16 ` Jeff Layton
  2023-03-03 12:16 ` [PATCH 6/7] nfs: move nfs_fhandle_hash to common include file Jeff Layton
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Jeff Layton @ 2023-03-03 12:16 UTC (permalink / raw)
  To: trond.myklebust, anna, chuck.lever; +Cc: linux-nfs, yoyang

Currently lockd just dequeues the block and ignores it if the client
sends a GRANT_RES with a status of nlm_lck_denied. That status is an
indicator that the client has rejected the lock, so the right thing to
do is to unlock the lock we were trying to grant.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=2063818
Reported-by: Yongcheng Yang <yoyang@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/lockd/svclock.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 4e30f3c50970..c43ccdf28ed9 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -954,19 +954,32 @@ void
 nlmsvc_grant_reply(struct nlm_cookie *cookie, __be32 status)
 {
 	struct nlm_block	*block;
+	struct file_lock	*fl;
+	int			error;
 
 	dprintk("grant_reply: looking for cookie %x, s=%d \n",
 		*(unsigned int *)(cookie->data), status);
 	if (!(block = nlmsvc_find_block(cookie)))
 		return;
 
-	if (status == nlm_lck_denied_grace_period) {
+	switch (status) {
+	case nlm_lck_denied_grace_period:
 		/* Try again in a couple of seconds */
 		nlmsvc_insert_block(block, 10 * HZ);
-	} else {
+		break;
+	case nlm_lck_denied:
+		/* Client doesn't want it, just unlock it */
+		nlmsvc_unlink_block(block);
+		fl = &block->b_call->a_args.lock.fl;
+		fl->fl_type = F_UNLCK;
+		error = vfs_lock_file(fl->fl_file, F_SETLK, fl, NULL);
+		if (error)
+			pr_warn("lockd: unable to unlock lock rejected by client!\n");
+		break;
+	default:
 		/*
-		 * Lock is now held by client, or has been rejected.
-		 * In both cases, the block should be removed.
+		 * Either it was accepted or the status makes no sense;
+		 * just unlink it either way.
 		 */
 		nlmsvc_unlink_block(block);
 	}
-- 
2.39.2



* [PATCH 6/7] nfs: move nfs_fhandle_hash to common include file
  2023-03-03 12:15 [PATCH 0/7] lockd: fix races that can result in stuck filelocks Jeff Layton
                   ` (4 preceding siblings ...)
  2023-03-03 12:16 ` [PATCH 5/7] lockd: server should unlock lock if client rejects the grant Jeff Layton
@ 2023-03-03 12:16 ` Jeff Layton
  2023-03-03 12:16 ` [PATCH 7/7] lockd: add some client-side tracepoints Jeff Layton
  2023-03-03 14:41 ` [PATCH 0/7] lockd: fix races that can result in stuck filelocks Chuck Lever III
  7 siblings, 0 replies; 15+ messages in thread
From: Jeff Layton @ 2023-03-03 12:16 UTC (permalink / raw)
  To: trond.myklebust, anna, chuck.lever; +Cc: linux-nfs, yoyang

lockd needs to be able to hash filehandles for tracepoints. Move
nfs_fhandle_hash to a common nfs include file.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfs/internal.h   | 15 ---------------
 include/linux/nfs.h | 20 ++++++++++++++++++++
 2 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index 2a65fe2a63ab..10fb5e7573eb 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -846,27 +846,12 @@ u64 nfs_timespec_to_change_attr(const struct timespec64 *ts)
 }
 
 #ifdef CONFIG_CRC32
-/**
- * nfs_fhandle_hash - calculate the crc32 hash for the filehandle
- * @fh - pointer to filehandle
- *
- * returns a crc32 hash for the filehandle that is compatible with
- * the one displayed by "wireshark".
- */
-static inline u32 nfs_fhandle_hash(const struct nfs_fh *fh)
-{
-	return ~crc32_le(0xFFFFFFFF, &fh->data[0], fh->size);
-}
 static inline u32 nfs_stateid_hash(const nfs4_stateid *stateid)
 {
 	return ~crc32_le(0xFFFFFFFF, &stateid->other[0],
 				NFS4_STATEID_OTHER_SIZE);
 }
 #else
-static inline u32 nfs_fhandle_hash(const struct nfs_fh *fh)
-{
-	return 0;
-}
 static inline u32 nfs_stateid_hash(nfs4_stateid *stateid)
 {
 	return 0;
diff --git a/include/linux/nfs.h b/include/linux/nfs.h
index b06375e88e58..ceb70a926b95 100644
--- a/include/linux/nfs.h
+++ b/include/linux/nfs.h
@@ -10,6 +10,7 @@
 
 #include <linux/sunrpc/msg_prot.h>
 #include <linux/string.h>
+#include <linux/crc32.h>
 #include <uapi/linux/nfs.h>
 
 /*
@@ -44,4 +45,23 @@ enum nfs3_stable_how {
 	/* used by direct.c to mark verf as invalid */
 	NFS_INVALID_STABLE_HOW = -1
 };
+
+#ifdef CONFIG_CRC32
+/**
+ * nfs_fhandle_hash - calculate the crc32 hash for the filehandle
+ * @fh - pointer to filehandle
+ *
+ * returns a crc32 hash for the filehandle that is compatible with
+ * the one displayed by "wireshark".
+ */
+static inline u32 nfs_fhandle_hash(const struct nfs_fh *fh)
+{
+	return ~crc32_le(0xFFFFFFFF, &fh->data[0], fh->size);
+}
+#else /* CONFIG_CRC32 */
+static inline u32 nfs_fhandle_hash(const struct nfs_fh *fh)
+{
+	return 0;
+}
+#endif /* CONFIG_CRC32 */
 #endif /* _LINUX_NFS_H */
-- 
2.39.2



* [PATCH 7/7] lockd: add some client-side tracepoints
  2023-03-03 12:15 [PATCH 0/7] lockd: fix races that can result in stuck filelocks Jeff Layton
                   ` (5 preceding siblings ...)
  2023-03-03 12:16 ` [PATCH 6/7] nfs: move nfs_fhandle_hash to common include file Jeff Layton
@ 2023-03-03 12:16 ` Jeff Layton
  2023-03-03 14:41 ` [PATCH 0/7] lockd: fix races that can result in stuck filelocks Chuck Lever III
  7 siblings, 0 replies; 15+ messages in thread
From: Jeff Layton @ 2023-03-03 12:16 UTC (permalink / raw)
  To: trond.myklebust, anna, chuck.lever; +Cc: linux-nfs, yoyang

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/lockd/Makefile   |  6 ++--
 fs/lockd/clntlock.c |  4 +++
 fs/lockd/clntproc.c | 14 ++++++++
 fs/lockd/trace.c    |  3 ++
 fs/lockd/trace.h    | 83 +++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 108 insertions(+), 2 deletions(-)
 create mode 100644 fs/lockd/trace.c
 create mode 100644 fs/lockd/trace.h

diff --git a/fs/lockd/Makefile b/fs/lockd/Makefile
index 6d5e83ed4476..ac9f9d84510e 100644
--- a/fs/lockd/Makefile
+++ b/fs/lockd/Makefile
@@ -3,10 +3,12 @@
 # Makefile for the linux lock manager stuff
 #
 
+ccflags-y += -I$(src)			# needed for trace events
+
 obj-$(CONFIG_LOCKD) += lockd.o
 
-lockd-objs-y := clntlock.o clntproc.o clntxdr.o host.o svc.o svclock.o \
-	        svcshare.o svcproc.o svcsubs.o mon.o xdr.o
+lockd-objs-y += clntlock.o clntproc.o clntxdr.o host.o svc.o svclock.o \
+	        svcshare.o svcproc.o svcsubs.o mon.o trace.o xdr.o
 lockd-objs-$(CONFIG_LOCKD_V4) += clnt4xdr.o xdr4.o svc4proc.o
 lockd-objs-$(CONFIG_PROC_FS) += procfs.o
 lockd-objs		      := $(lockd-objs-y)
diff --git a/fs/lockd/clntlock.c b/fs/lockd/clntlock.c
index c374ee072db3..e3972aa3045a 100644
--- a/fs/lockd/clntlock.c
+++ b/fs/lockd/clntlock.c
@@ -14,9 +14,12 @@
 #include <linux/nfs_fs.h>
 #include <linux/sunrpc/addr.h>
 #include <linux/sunrpc/svc.h>
+#include <linux/sunrpc/svc_xprt.h>
 #include <linux/lockd/lockd.h>
 #include <linux/kthread.h>
 
+#include "trace.h"
+
 #define NLMDBG_FACILITY		NLMDBG_CLIENT
 
 /*
@@ -186,6 +189,7 @@ __be32 nlmclnt_grant(const struct sockaddr *addr, const struct nlm_lock *lock)
 		res = nlm_granted;
 	}
 	spin_unlock(&nlm_blocked_lock);
+	trace_nlmclnt_grant(lock, addr, svc_addr_len(addr), res);
 	return res;
 }
 
diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
index a14c9110719c..fba6c7fa7474 100644
--- a/fs/lockd/clntproc.c
+++ b/fs/lockd/clntproc.c
@@ -20,6 +20,8 @@
 #include <linux/sunrpc/svc.h>
 #include <linux/lockd/lockd.h>
 
+#include "trace.h"
+
 #define NLMDBG_FACILITY		NLMDBG_CLIENT
 #define NLMCLNT_GRACE_WAIT	(5*HZ)
 #define NLMCLNT_POLL_TIMEOUT	(30*HZ)
@@ -451,6 +453,9 @@ nlmclnt_test(struct nlm_rqst *req, struct file_lock *fl)
 			status = nlm_stat_to_errno(req->a_res.status);
 	}
 out:
+	trace_nlmclnt_test(&req->a_args.lock,
+			   (const struct sockaddr *)&req->a_host->h_addr,
+			   req->a_host->h_addrlen, req->a_res.status);
 	nlmclnt_release_call(req);
 	return status;
 }
@@ -605,10 +610,16 @@ nlmclnt_lock(struct nlm_rqst *req, struct file_lock *fl)
 	else
 		status = nlm_stat_to_errno(resp->status);
 out:
+	trace_nlmclnt_lock(&req->a_args.lock,
+			   (const struct sockaddr *)&req->a_host->h_addr,
+			   req->a_host->h_addrlen, req->a_res.status);
 	nlmclnt_release_call(req);
 	return status;
 out_unlock:
 	/* Fatal error: ensure that we remove the lock altogether */
+	trace_nlmclnt_lock(&req->a_args.lock,
+			   (const struct sockaddr *)&req->a_host->h_addr,
+			   req->a_host->h_addrlen, req->a_res.status);
 	dprintk("lockd: lock attempt ended in fatal error.\n"
 		"       Attempting to unlock.\n");
 	fl_type = fl->fl_type;
@@ -704,6 +715,9 @@ nlmclnt_unlock(struct nlm_rqst *req, struct file_lock *fl)
 	/* What to do now? I'm out of my depth... */
 	status = -ENOLCK;
 out:
+	trace_nlmclnt_unlock(&req->a_args.lock,
+			     (const struct sockaddr *)&req->a_host->h_addr,
+			     req->a_host->h_addrlen, req->a_res.status);
 	nlmclnt_release_call(req);
 	return status;
 }
diff --git a/fs/lockd/trace.c b/fs/lockd/trace.c
new file mode 100644
index 000000000000..d9a6ff6e673c
--- /dev/null
+++ b/fs/lockd/trace.c
@@ -0,0 +1,3 @@
+// SPDX-License-Identifier: GPL-2.0
+#define CREATE_TRACE_POINTS
+#include "trace.h"
diff --git a/fs/lockd/trace.h b/fs/lockd/trace.h
new file mode 100644
index 000000000000..b79d154d5e66
--- /dev/null
+++ b/fs/lockd/trace.h
@@ -0,0 +1,83 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM lockd
+
+#if !defined(_TRACE_LOCKD_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_LOCKD_H
+
+#include <linux/tracepoint.h>
+#include <linux/crc32.h>
+#include <linux/nfs.h>
+#include <linux/lockd/lockd.h>
+
+#define show_nlm_status(val)							\
+	__print_symbolic(val,							\
+		{ NLM_LCK_GRANTED,		"LCK_GRANTED" },		\
+		{ NLM_LCK_DENIED,		"LCK_DENIED" },			\
+		{ NLM_LCK_DENIED_NOLOCKS,	"LCK_DENIED_NOLOCKS" },		\
+		{ NLM_LCK_BLOCKED,		"LCK_BLOCKED" },		\
+		{ NLM_LCK_DENIED_GRACE_PERIOD,	"LCK_DENIED_GRACE_PERIOD" },	\
+		{ NLM_DEADLCK,			"DEADLCK" },			\
+		{ NLM_ROFS,			"ROFS" },			\
+		{ NLM_STALE_FH,			"STALE_FH" },			\
+		{ NLM_FBIG,			"FBIG" },			\
+		{ NLM_FAILED,			"FAILED" })
+
+DECLARE_EVENT_CLASS(nlmclnt_lock_event,
+		TP_PROTO(
+			const struct nlm_lock *lock,
+			const struct sockaddr *addr,
+			unsigned int addrlen,
+			__be32	status
+		),
+
+		TP_ARGS(lock, addr, addrlen, status),
+
+		TP_STRUCT__entry(
+			__field(u32, oh)
+			__field(u32, svid)
+			__field(u32, fh)
+			__field(u32, status)
+			__field(u64, start)
+			__field(u64, len)
+			__sockaddr(addr, addrlen)
+		),
+
+		TP_fast_assign(
+			__entry->oh = ~crc32_le(0xffffffff, lock->oh.data, lock->oh.len);
+			__entry->svid = lock->svid;
+			__entry->fh = nfs_fhandle_hash(&lock->fh);
+			__entry->start = lock->lock_start;
+			__entry->len = lock->lock_len;
+			__entry->status = be32_to_cpu(status);
+			__assign_sockaddr(addr, addr, addrlen);
+		),
+
+		TP_printk(
+			"addr=%pISpc oh=0x%08x svid=0x%08x fh=0x%08x start=%llu len=%llu status=%s",
+			__get_sockaddr(addr), __entry->oh, __entry->svid,
+			__entry->fh, __entry->start, __entry->len,
+			show_nlm_status(__entry->status)
+		)
+);
+
+#define DEFINE_NLMCLNT_EVENT(name)				\
+	DEFINE_EVENT(nlmclnt_lock_event, name,			\
+			TP_PROTO( 				\
+				const struct nlm_lock *lock,	\
+				const struct sockaddr *addr,	\
+				unsigned int addrlen,		\
+				__be32	status			\
+			), 					\
+			TP_ARGS(lock, addr, addrlen, status))
+
+DEFINE_NLMCLNT_EVENT(nlmclnt_test);
+DEFINE_NLMCLNT_EVENT(nlmclnt_lock);
+DEFINE_NLMCLNT_EVENT(nlmclnt_unlock);
+DEFINE_NLMCLNT_EVENT(nlmclnt_grant);
+#endif /* _TRACE_LOCKD_H */
+
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE trace
+#include <trace/define_trace.h>
-- 
2.39.2



* Re: [PATCH 0/7] lockd: fix races that can result in stuck filelocks
  2023-03-03 12:15 [PATCH 0/7] lockd: fix races that can result in stuck filelocks Jeff Layton
                   ` (6 preceding siblings ...)
  2023-03-03 12:16 ` [PATCH 7/7] lockd: add some client-side tracepoints Jeff Layton
@ 2023-03-03 14:41 ` Chuck Lever III
  2023-03-03 18:11   ` Chuck Lever III
  2023-03-12 15:33   ` Amir Goldstein
  7 siblings, 2 replies; 15+ messages in thread
From: Chuck Lever III @ 2023-03-03 14:41 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	yoyang@redhat.com



> On Mar 3, 2023, at 7:15 AM, Jeff Layton <jlayton@kernel.org> wrote:
> 
> I sent the first patch in this series the other day, but didn't get any
> responses.

We'll have to work out who will take which patches in this set.
Once fully reviewed, I can take the set if the client maintainers
send Acks for 2-4 and 6-7.

nfsd-next for v6.4 is not yet open. I can work on setting that up
today.


> Since then I've had time to follow up on the client-side part
> of this problem, which eventually also pointed out yet another bug on
> the server side. There are also a couple of cleanup patches in here too,
> and a patch to add some tracepoints that I found useful while diagnosing
> this.
> 
> With this set on both client and server, I'm now able to run Yongcheng's
> test for an hour straight with no stuck locks.
> 
> Jeff Layton (7):
>  lockd: purge resources held on behalf of nlm clients when shutting
>    down
>  lockd: remove 2 unused helper functions
>  lockd: move struct nlm_wait to lockd.h
>  lockd: fix races in client GRANTED_MSG wait logic
>  lockd: server should unlock lock if client rejects the grant
>  nfs: move nfs_fhandle_hash to common include file
>  lockd: add some client-side tracepoints
> 
> fs/lockd/Makefile           |  6 ++-
> fs/lockd/clntlock.c         | 58 +++++++++++---------------
> fs/lockd/clntproc.c         | 42 ++++++++++++++-----
> fs/lockd/host.c             |  1 +
> fs/lockd/svclock.c          | 21 ++++++++--
> fs/lockd/trace.c            |  3 ++
> fs/lockd/trace.h            | 83 +++++++++++++++++++++++++++++++++++++
> fs/nfs/internal.h           | 15 -------
> include/linux/lockd/lockd.h | 29 ++++++-------
> include/linux/nfs.h         | 20 +++++++++
> 10 files changed, 200 insertions(+), 78 deletions(-)
> create mode 100644 fs/lockd/trace.c
> create mode 100644 fs/lockd/trace.h
> 
> -- 
> 2.39.2
> 

--
Chuck Lever




* Re: [PATCH 0/7] lockd: fix races that can result in stuck filelocks
  2023-03-03 14:41 ` [PATCH 0/7] lockd: fix races that can result in stuck filelocks Chuck Lever III
@ 2023-03-03 18:11   ` Chuck Lever III
  2023-03-12 15:33   ` Amir Goldstein
  1 sibling, 0 replies; 15+ messages in thread
From: Chuck Lever III @ 2023-03-03 18:11 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	yoyang@redhat.com



> On Mar 3, 2023, at 9:41 AM, Chuck Lever III <chuck.lever@oracle.com> wrote:
> 
> 
> 
>> On Mar 3, 2023, at 7:15 AM, Jeff Layton <jlayton@kernel.org> wrote:
>> 
>> I sent the first patch in this series the other day, but didn't get any
>> responses.
> 
> We'll have to work out who will take which patches in this set.
> Once fully reviewed, I can take the set if the client maintainers
> send Acks for 2-4 and 6-7.
> 
> nfsd-next for v6.4 is not yet open. I can work on setting that up
> today.
> 
> 
>> Since then I've had time to follow up on the client-side part
>> of this problem, which eventually also pointed out yet another bug on
>> the server side. There are also a couple of cleanup patches in here too,
>> and a patch to add some tracepoints that I found useful while diagnosing
>> this.
>> 
>> With this set on both client and server, I'm now able to run Yongcheng's
>> test for an hour straight with no stuck locks.
>> 
>> Jeff Layton (7):
>> lockd: purge resources held on behalf of nlm clients when shutting
>>   down
>> lockd: remove 2 unused helper functions
>> lockd: move struct nlm_wait to lockd.h
>> lockd: fix races in client GRANTED_MSG wait logic
>> lockd: server should unlock lock if client rejects the grant
>> nfs: move nfs_fhandle_hash to common include file
>> lockd: add some client-side tracepoints
>> 
>> fs/lockd/Makefile           |  6 ++-
>> fs/lockd/clntlock.c         | 58 +++++++++++---------------
>> fs/lockd/clntproc.c         | 42 ++++++++++++++-----
>> fs/lockd/host.c             |  1 +
>> fs/lockd/svclock.c          | 21 ++++++++--
>> fs/lockd/trace.c            |  3 ++
>> fs/lockd/trace.h            | 83 +++++++++++++++++++++++++++++++++++++
>> fs/nfs/internal.h           | 15 -------
>> include/linux/lockd/lockd.h | 29 ++++++-------
>> include/linux/nfs.h         | 20 +++++++++
>> 10 files changed, 200 insertions(+), 78 deletions(-)
>> create mode 100644 fs/lockd/trace.c
>> create mode 100644 fs/lockd/trace.h
>> 
>> -- 
>> 2.39.2

I've opened nfsd-next for v6.4 and applied these. I can drop any
that the client maintainers wish to take through their tree or
would prefer to reject.

Noted that several of these had checkpatch.pl warnings or errors.
I fixed up the issues before applying them.


--
Chuck Lever




* Re: [PATCH 0/7] lockd: fix races that can result in stuck filelocks
  2023-03-03 14:41 ` [PATCH 0/7] lockd: fix races that can result in stuck filelocks Chuck Lever III
  2023-03-03 18:11   ` Chuck Lever III
@ 2023-03-12 15:33   ` Amir Goldstein
  2023-03-12 16:44     ` Chuck Lever III
  2023-03-13 10:45     ` Jeff Layton
  1 sibling, 2 replies; 15+ messages in thread
From: Amir Goldstein @ 2023-03-12 15:33 UTC (permalink / raw)
  To: Chuck Lever III, Jeff Layton
  Cc: Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	yoyang@redhat.com

On Fri, Mar 3, 2023 at 4:54 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
>
>
>
> > On Mar 3, 2023, at 7:15 AM, Jeff Layton <jlayton@kernel.org> wrote:
> >
> > I sent the first patch in this series the other day, but didn't get any
> > responses.
>
> We'll have to work out who will take which patches in this set.
> Once fully reviewed, I can take the set if the client maintainers
> send Acks for 2-4 and 6-7.
>
> nfsd-next for v6.4 is not yet open. I can work on setting that up
> today.
>
>
> > Since then I've had time to follow up on the client-side part
> > of this problem, which eventually also pointed out yet another bug on
> > the server side. There are also a couple of cleanup patches in here too,
> > and a patch to add some tracepoints that I found useful while diagnosing
> > this.
> >
> > With this set on both client and server, I'm now able to run Yongcheng's
> > test for an hour straight with no stuck locks.

My nfstest_lock test occasionally gets into an endless wait loop for the lock in
one of the optests.

AFAIK, this started happening after I upgraded my client machine to v5.15.88.
Does this seem related to the client bug fixes in this patch set?

If so, is this bug a regression? And why are the fixes aimed at v6.4?

Thanks,
Amir.


* Re: [PATCH 0/7] lockd: fix races that can result in stuck filelocks
  2023-03-12 15:33   ` Amir Goldstein
@ 2023-03-12 16:44     ` Chuck Lever III
  2023-03-13 10:45     ` Jeff Layton
  1 sibling, 0 replies; 15+ messages in thread
From: Chuck Lever III @ 2023-03-12 16:44 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Jeff Layton, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, yoyang@redhat.com



> On Mar 12, 2023, at 11:33 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> 
> On Fri, Mar 3, 2023 at 4:54 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
>> 
>> 
>> 
>>> On Mar 3, 2023, at 7:15 AM, Jeff Layton <jlayton@kernel.org> wrote:
>>> 
>>> I sent the first patch in this series the other day, but didn't get any
>>> responses.
>> 
>> We'll have to work out who will take which patches in this set.
>> Once fully reviewed, I can take the set if the client maintainers
>> send Acks for 2-4 and 6-7.
>> 
>> nfsd-next for v6.4 is not yet open. I can work on setting that up
>> today.
>> 
>> 
>>> Since then I've had time to follow up on the client-side part
>>> of this problem, which eventually also pointed out yet another bug on
>>> the server side. There are also a couple of cleanup patches in here too,
>>> and a patch to add some tracepoints that I found useful while diagnosing
>>> this.
>>> 
>>> With this set on both client and server, I'm now able to run Yongcheng's
>>> test for an hour straight with no stuck locks.
> 
> My nfstest_lock test occasionally gets into an endless wait loop for the lock in
> one of the optests.
> 
> AFAIK, this started happening after I upgraded my client machine to v5.15.88.
> Does this seem related to the client bug fixes in this patch set?

I will let Jeff tackle that question. He did not add a Fixes:
tag, so it's difficult to say off-hand.


> If so, is this bug a regression?

If your test misbehavior is related to these fixes, then probably
yes. But this is the first I've heard of a longer-term problem.


> And why are the fixes aimed at v6.4?

Because these are test failures, not failures of non-artificial
workloads. Also because we haven't heard reports, potential or
otherwise, of a regression until now.

Since they are test failures only, there doesn't seem to be an
urgency to get them into 6.3-rc. I prefer to let these sit in
-next for a bit, therefore. As we are well aware, patches that
go into -rc are rather aggressively pulled into stable, and I
would like to have some confidence that these fixes do not
introduce further problems.

You are welcome to petition for faster integration. It would
help if you had a positive test result to share with us.

--
Chuck Lever




* Re: [PATCH 0/7] lockd: fix races that can result in stuck filelocks
  2023-03-12 15:33   ` Amir Goldstein
  2023-03-12 16:44     ` Chuck Lever III
@ 2023-03-13 10:45     ` Jeff Layton
  2023-03-13 15:14       ` Amir Goldstein
  1 sibling, 1 reply; 15+ messages in thread
From: Jeff Layton @ 2023-03-13 10:45 UTC (permalink / raw)
  To: Amir Goldstein, Chuck Lever III
  Cc: Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	yoyang@redhat.com

On Sun, 2023-03-12 at 17:33 +0200, Amir Goldstein wrote:
> On Fri, Mar 3, 2023 at 4:54 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
> > 
> > 
> > 
> > > On Mar 3, 2023, at 7:15 AM, Jeff Layton <jlayton@kernel.org> wrote:
> > > 
> > > I sent the first patch in this series the other day, but didn't get any
> > > responses.
> > 
> > We'll have to work out who will take which patches in this set.
> > Once fully reviewed, I can take the set if the client maintainers
> > send Acks for 2-4 and 6-7.
> > 
> > nfsd-next for v6.4 is not yet open. I can work on setting that up
> > today.
> > 
> > 
> > > Since then I've had time to follow up on the client-side part
> > > of this problem, which eventually also pointed out yet another bug on
> > > the server side. There are also a couple of cleanup patches in here too,
> > > and a patch to add some tracepoints that I found useful while diagnosing
> > > this.
> > > 
> > > With this set on both client and server, I'm now able to run Yongcheng's
> > > test for an hour straight with no stuck locks.
> 
> My nfstest_lock test occasionally gets into an endless wait loop for the lock in
> one of the optests.
> 
> AFAIK, this started happening after I upgraded my client machine to v5.15.88.
> Does this seem related to the client bug fixes in this patch set?
> 
> If so, is this bug a regression? And why are the fixes aimed at v6.4?
> 

Most of this (lockd) code hasn't changed in well over a decade, so if
this is a regression then it's a very old one. I suppose it's possible
that this regressed after the BKL was removed from this code, but that
was a long time ago now and I'm not sure I can identify a commit that
this fixes.

I'm fine with this going in sooner than v6.4, but given that this has
been broken so long, I didn't see the need to rush.

Cheers,
-- 
Jeff Layton <jlayton@kernel.org>


* Re: [PATCH 0/7] lockd: fix races that can result in stuck filelocks
  2023-03-13 10:45     ` Jeff Layton
@ 2023-03-13 15:14       ` Amir Goldstein
  2023-03-13 19:19         ` Jeff Layton
  0 siblings, 1 reply; 15+ messages in thread
From: Amir Goldstein @ 2023-03-13 15:14 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Chuck Lever III, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, yoyang@redhat.com

[-- Attachment #1: Type: text/plain, Size: 4259 bytes --]

On Mon, Mar 13, 2023 at 12:45 PM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Sun, 2023-03-12 at 17:33 +0200, Amir Goldstein wrote:
> > On Fri, Mar 3, 2023 at 4:54 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
> > >
> > >
> > >
> > > > On Mar 3, 2023, at 7:15 AM, Jeff Layton <jlayton@kernel.org> wrote:
> > > >
> > > > I sent the first patch in this series the other day, but didn't get any
> > > > responses.
> > >
> > > We'll have to work out who will take which patches in this set.
> > > Once fully reviewed, I can take the set if the client maintainers
> > > send Acks for 2-4 and 6-7.
> > >
> > > nfsd-next for v6.4 is not yet open. I can work on setting that up
> > > today.
> > >
> > >
> > > > Since then I've had time to follow up on the client-side part
> > > > of this problem, which eventually also pointed out yet another bug on
> > > > the server side. There are also a couple of cleanup patches in here too,
> > > > and a patch to add some tracepoints that I found useful while diagnosing
> > > > this.
> > > >
> > > > With this set on both client and server, I'm now able to run Yongcheng's
> > > > test for an hour straight with no stuck locks.
> >
> > My nfstest_lock test occasionally gets into an endless wait loop for the lock in
> > one of the optests.

I forgot to mention that the regression is only with nfsversion=3!
Is anyone else running nfstest_lock with nfsversion=3?

> >
> > AFAIK, this started happening after I upgraded my client machine to v5.15.88.
> > Does this seem related to the client bug fixes in this patch set?
> >
> > If so, is this bug a regression? And why are the fixes aimed at v6.4?
> >
>
> Most of this (lockd) code hasn't changed in well over a decade, so if
> this is a regression then it's a very old one. I suppose it's possible
> that this regressed after the BKL was removed from this code, but that
> was a long time ago now and I'm not sure I can identify a commit that
> this fixes.
>
> I'm fine with this going in sooner than v6.4, but given that this has
> been broken so long, I didn't see the need to rush.
>

I don't know what the relation is between the optest regression I am
experiencing and the client and server bugs mentioned in this patch set.
I just re-tested optest01 with several combinations of client-server kernels.
I rebooted both client and server before each test.
The results are a bit odd:

client        server     optest01 result
------------------------------------------------------
5.10.109      5.10.109   optest01 completes successfully after <30s
5.15.88       5.15.88    optest01 never completes (see attached log)
5.15.88       5.10.109   optest01 never completes
5.15.88+ [*]  5.15.88    optest01 never completes
5.15.88+      5.10.109   optest01 never completes
5.15.88+      5.15.88+   optest01 completes successfully after ~300s [**]

Unless I missed something with the tests, it looks like
1.a. There was a regression in the client from 5.10.109..5.15.88
1.b. The regression manifests with both 5.10 and 5.15 servers
2.a. The patches improve the situation (from infinite to 30s per wait)...
2.b. ...but only when applied to both client and server and...
2.c. The situation is still a lot worse than 5.10 client with 5.10 server

Attached also the NFS[D] Kconfig which is identical for the tested
5.10 and 5.15 kernels.

Do you need me to provide any traces or any other info?

Thanks,
Amir.

[*] 5.15.88+ stands for 5.15.88 + the patches in this set, which all
apply cleanly
[**] The test takes 300s because every single 30s wait takes the entire 30s:

    DBG1: 15:21:47.118095 - Unlock file (F_UNLCK, F_SETLK) off=0 len=0
range(0, 18446744073709551615)
    DBG3: 15:21:47.119832 - Wait up to 30 secs to check if blocked
lock has been granted @253.87
    DBG3: 15:21:48.121296 - Check if blocked lock has been granted @254.87
...
    DBG3: 15:22:14.158314 - Check if blocked lock has been granted @280.90
    DBG3: 15:22:15.017594 - Getting results from blocked lock @281.76
    DBG1: 15:22:15.017832 - Unlock file (F_UNLCK, F_SETLK) off=0 len=0
range(0, 18446744073709551615) on second process @281.76
    PASS: Locking byte range (72 passed, 0 failed)

[-- Attachment #2: optest01.nfsver3.linux-5.15.88.log.gz --]
[-- Type: application/gzip, Size: 32973 bytes --]

[-- Attachment #3: 5.15.88.NFS.config --]
[-- Type: application/octet-stream, Size: 671 bytes --]

CONFIG_NFS_FS=y
CONFIG_NFS_V2=y
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=y
# CONFIG_NFS_SWAP is not set
CONFIG_NFS_V4_1=y
# CONFIG_NFS_V4_2 is not set
CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org"
# CONFIG_NFS_V4_1_MIGRATION is not set
CONFIG_ROOT_NFS=y
# CONFIG_NFS_USE_LEGACY_DNS is not set
CONFIG_NFS_USE_KERNEL_DNS=y
CONFIG_NFS_DEBUG=y
CONFIG_NFS_DISABLE_UDP_SUPPORT=y
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
# CONFIG_NFSD_V3_ACL is not set
CONFIG_NFSD_V4=y
# CONFIG_NFSD_BLOCKLAYOUT is not set
# CONFIG_NFSD_SCSILAYOUT is not set
# CONFIG_NFSD_FLEXFILELAYOUT is not set
# CONFIG_NFSD_V4_SECURITY_LABEL is not set
CONFIG_NFS_ACL_SUPPORT=y
CONFIG_NFS_COMMON=y


* Re: [PATCH 0/7] lockd: fix races that can result in stuck filelocks
  2023-03-13 15:14       ` Amir Goldstein
@ 2023-03-13 19:19         ` Jeff Layton
  0 siblings, 0 replies; 15+ messages in thread
From: Jeff Layton @ 2023-03-13 19:19 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Chuck Lever III, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, yoyang@redhat.com

On Mon, 2023-03-13 at 17:14 +0200, Amir Goldstein wrote:
> On Mon, Mar 13, 2023 at 12:45 PM Jeff Layton <jlayton@kernel.org> wrote:
> > 
> > On Sun, 2023-03-12 at 17:33 +0200, Amir Goldstein wrote:
> > > On Fri, Mar 3, 2023 at 4:54 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
> > > > 
> > > > 
> > > > 
> > > > > On Mar 3, 2023, at 7:15 AM, Jeff Layton <jlayton@kernel.org> wrote:
> > > > > 
> > > > > I sent the first patch in this series the other day, but didn't get any
> > > > > responses.
> > > > 
> > > > We'll have to work out who will take which patches in this set.
> > > > Once fully reviewed, I can take the set if the client maintainers
> > > > send Acks for 2-4 and 6-7.
> > > > 
> > > > nfsd-next for v6.4 is not yet open. I can work on setting that up
> > > > today.
> > > > 
> > > > 
> > > > > Since then I've had time to follow up on the client-side part
> > > > > of this problem, which eventually also pointed out yet another bug on
> > > > > the server side. There are also a couple of cleanup patches in here too,
> > > > > and a patch to add some tracepoints that I found useful while diagnosing
> > > > > this.
> > > > > 
> > > > > With this set on both client and server, I'm now able to run Yongcheng's
> > > > > test for an hour straight with no stuck locks.
> > > 
> > > My nfstest_lock test occasionally gets into an endless wait loop for the lock in
> > > one of the optests.
> 
> I forgot to mention that the regression is only with nfsversion=3!
> Is anyone else running nfstest_lock with nfsversion=3?
> 
> > > 
> > > AFAIK, this started happening after I upgraded my client machine to v5.15.88.
> > > Does this seem related to the client bug fixes in this patch set?
> > > 
> > > If so, is this bug a regression? And why are the fixes aimed at v6.4?
> > > 
> > 
> > Most of this (lockd) code hasn't changed in well over a decade, so if
> > this is a regression then it's a very old one. I suppose it's possible
> > that this regressed after the BKL was removed from this code, but that
> > was a long time ago now and I'm not sure I can identify a commit that
> > this fixes.
> > 
> > I'm fine with this going in sooner than v6.4, but given that this has
> > been broken so long, I didn't see the need to rush.
> > 
> 
> I don't know what the relation is between the optest regression I am
> experiencing and the client and server bugs mentioned in this patch set.
> I just re-tested optest01 with several combinations of client-server kernels.
> I rebooted both client and server before each test.
> The results are a bit odd:
> 
> client        server     optest01 result
> ------------------------------------------------------
> 5.10.109      5.10.109   optest01 completes successfully after <30s
> 5.15.88       5.15.88    optest01 never completes (see attached log)
> 5.15.88       5.10.109   optest01 never completes
> 5.15.88+ [*]  5.15.88    optest01 never completes
> 5.15.88+      5.10.109   optest01 never completes
> 5.15.88+      5.15.88+   optest01 completes successfully after ~300s [**]
> 
> Unless I missed something with the tests, it looks like
> 1.a. There was a regression in the client from 5.10.109..5.15.88
> 1.b. The regression manifests with both 5.10 and 5.15 servers
> 2.a. The patches improve the situation (from infinite to 30s per wait)...
> 2.b. ...but only when applied to both client and server and...
> 2.c. The situation is still a lot worse than 5.10 client with 5.10 server
> 
> Attached also the NFS[D] Kconfig which is identical for the tested
> 5.10 and 5.15 kernels.
> 
> Do you need me to provide any traces or any other info?
> 
> Thanks,
> Amir.
> 
> [*] 5.15.88+ stands for 5.15.88 + the patches in this set, which all
> apply cleanly
> [**] The test takes 300s because every single 30s wait takes the entire 30s:
> 
>     DBG1: 15:21:47.118095 - Unlock file (F_UNLCK, F_SETLK) off=0 len=0
> range(0, 18446744073709551615)
>     DBG3: 15:21:47.119832 - Wait up to 30 secs to check if blocked
> lock has been granted @253.87
>     DBG3: 15:21:48.121296 - Check if blocked lock has been granted @254.87
> ...
>     DBG3: 15:22:14.158314 - Check if blocked lock has been granted @280.90
>     DBG3: 15:22:15.017594 - Getting results from blocked lock @281.76
>     DBG1: 15:22:15.017832 - Unlock file (F_UNLCK, F_SETLK) off=0 len=0
> range(0, 18446744073709551615) on second process @281.76
>     PASS: Locking byte range (72 passed, 0 failed)

This sounds like a different problem than what this patchset fixes. This
patchset is really all about signal handling during the wait for a lock.
That sounds more like the wait is just not completing?

I just kicked off this test in nfstests with vers=3 and I think I see
the same 30s stalls. Coincidentally:

    #define NLMCLNT_POLL_TIMEOUT    (30*HZ)                            

So it does look like something may be going wrong with the lock granting
mechanism. I'll need to do a bit of investigation to figure out what's
going on.

-- 
Jeff Layton <jlayton@kernel.org>


end of thread, other threads:[~2023-03-13 19:20 UTC | newest]

Thread overview: 15+ messages
2023-03-03 12:15 [PATCH 0/7] lockd: fix races that can result in stuck filelocks Jeff Layton
2023-03-03 12:15 ` [PATCH 1/7] lockd: purge resources held on behalf of nlm clients when shutting down Jeff Layton
2023-03-03 12:15 ` [PATCH 2/7] lockd: remove 2 unused helper functions Jeff Layton
2023-03-03 12:15 ` [PATCH 3/7] lockd: move struct nlm_wait to lockd.h Jeff Layton
2023-03-03 12:16 ` [PATCH 4/7] lockd: fix races in client GRANTED_MSG wait logic Jeff Layton
2023-03-03 12:16 ` [PATCH 5/7] lockd: server should unlock lock if client rejects the grant Jeff Layton
2023-03-03 12:16 ` [PATCH 6/7] nfs: move nfs_fhandle_hash to common include file Jeff Layton
2023-03-03 12:16 ` [PATCH 7/7] lockd: add some client-side tracepoints Jeff Layton
2023-03-03 14:41 ` [PATCH 0/7] lockd: fix races that can result in stuck filelocks Chuck Lever III
2023-03-03 18:11   ` Chuck Lever III
2023-03-12 15:33   ` Amir Goldstein
2023-03-12 16:44     ` Chuck Lever III
2023-03-13 10:45     ` Jeff Layton
2023-03-13 15:14       ` Amir Goldstein
2023-03-13 19:19         ` Jeff Layton
