[RFCv2 0/7] fs: nfs: async lock request changes

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [RFCv2 0/7] fs: nfs: async lock request changes
@ 2023-08-14 21:11 Alexander Aring
  2023-08-14 21:11 ` [RFCv2 1/7] lockd: fix race in async lock request handling Alexander Aring
                   ` (6 more replies)
  0 siblings, 7 replies; 22+ messages in thread
From: Alexander Aring @ 2023-08-14 21:11 UTC (permalink / raw)
  To: linux-nfs
  Cc: cluster-devel, ocfs2-devel, linux-fsdevel, teigland, rpeterso,
	agruenba, trond.myklebust, anna, chuck.lever, jlayton

Hi,

this patch series contains changes to lockd asynchronous lock request
handling over vfs_lock_file(). The only one upstream user so far I see
is DLM. The current DLM handling of asynchronous lock request is broken.
Those patches makes the asynchronous lock request handling better, there
are still issues with fencing or lock cancellation that this patch
series does not tackle.

The cifs filesystem is another user of FILE_LOCK_DEFERRED but it does
not do anything on checking lm_grant() as the asynchronous lock API is
required to check if it's set.

- Alex

changes since v2:
 - add dlm fixes/changes to show how the new API should work. It's used
   by gfs2/ocfs2.
 - handle non-blocking request synchronously. For this reason we needed
   to introduce a new per nlm_block request to fix a race between
   request and lm_grant() callback handling, see b_cb_mutex
 - update documentation for new FL_SLEEP handling when lm_grant() is
   set and new requirement that lm_grant() need to be called from a
   sleepable contex.
 - clear new CALLBACK PENDING flag when nlm_block is removed from
   nlm_blocked
 - introduce helper export_op_support_safe_async_lock() to check if
   the filesystem supports asynchronous lock request calls
 - fix block variable never NULL when iterating over nlm_blocked list

Alexander Aring (7):
  lockd: fix race in async lock request handling
  lockd: FILE_LOCK_DEFERRED only on FL_SLEEP
  lockd: introduce safe async lock op
  locks: update lock callback documentation
  dlm: use fl_owner from lockd
  dlm: use FL_SLEEP to check if blocking request
  dlm: implement EXPORT_OP_SAFE_ASYNC_LOCK

 fs/dlm/plock.c              |  54 +++++++-------
 fs/gfs2/export.c            |   1 +
 fs/lockd/svclock.c          | 140 ++++++++++++++++++++----------------
 fs/locks.c                  |  28 ++++----
 fs/nfsd/nfs4state.c         |  13 +++-
 fs/ocfs2/export.c           |   1 +
 include/linux/exportfs.h    |   8 +++
 include/linux/lockd/lockd.h |   2 +
 8 files changed, 140 insertions(+), 107 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFCv2 1/7] lockd: fix race in async lock request handling
  2023-08-14 21:11 [RFCv2 0/7] fs: nfs: async lock request changes Alexander Aring
@ 2023-08-14 21:11 ` Alexander Aring
  2023-08-15 17:49   ` Jeff Layton
  2023-08-14 21:11 ` [RFCv2 2/7] lockd: FILE_LOCK_DEFERRED only on FL_SLEEP Alexander Aring
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Alexander Aring @ 2023-08-14 21:11 UTC (permalink / raw)
  To: linux-nfs
  Cc: cluster-devel, ocfs2-devel, linux-fsdevel, teigland, rpeterso,
	agruenba, trond.myklebust, anna, chuck.lever, jlayton

This patch fixes a race in async lock request handling between adding
the relevant struct nlm_block to nlm_blocked list after the request was
sent by vfs_lock_file() and nlmsvc_grant_deferred() does a lookup of the
nlm_block in the nlm_blocked list. It could be that the async request is
completed before the nlm_block was added to the list. This would end
in a -ENOENT and a kernel log message of "lockd: grant for unknown
block".

To solve this issue we add the nlm_block before the vfs_lock_file() call
to be sure it has been added when a possible nlmsvc_grant_deferred() is
called. If the vfs_lock_file() results in an case when it wouldn't be
added to nlm_blocked list, the nlm_block struct will be removed from
this list again.

The introducing of the new B_PENDING_CALLBACK nlm_block flag will handle
async lock requests on a pending lock requests as a retry on the caller
level to hit the -EAGAIN case.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/lockd/svclock.c          | 100 ++++++++++++++++++++++++++----------
 include/linux/lockd/lockd.h |   2 +
 2 files changed, 74 insertions(+), 28 deletions(-)

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index c43ccdf28ed9..7d63524bdb81 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -133,6 +133,7 @@ nlmsvc_remove_block(struct nlm_block *block)
 {
 	if (!list_empty(&block->b_list)) {
 		spin_lock(&nlm_blocked_lock);
+		block->b_flags &= ~B_PENDING_CALLBACK;
 		list_del_init(&block->b_list);
 		spin_unlock(&nlm_blocked_lock);
 		nlmsvc_release_block(block);
@@ -232,6 +233,7 @@ nlmsvc_create_block(struct svc_rqst *rqstp, struct nlm_host *host,
 	kref_init(&block->b_count);
 	INIT_LIST_HEAD(&block->b_list);
 	INIT_LIST_HEAD(&block->b_flist);
+	mutex_init(&block->b_cb_mutex);
 
 	if (!nlmsvc_setgrantargs(call, lock))
 		goto failed_free;
@@ -289,6 +291,8 @@ static void nlmsvc_free_block(struct kref *kref)
 
 	dprintk("lockd: freeing block %p...\n", block);
 
+	WARN_ON_ONCE(block->b_flags & B_PENDING_CALLBACK);
+
 	/* Remove block from file's list of blocks */
 	list_del_init(&block->b_flist);
 	mutex_unlock(&file->f_mutex);
@@ -532,6 +536,13 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
 		goto out;
 	}
 
+	mutex_lock(&block->b_cb_mutex);
+	if (block->b_flags & B_PENDING_CALLBACK)
+		goto pending_request;
+
+	/* Append to list of blocked */
+	nlmsvc_insert_block(block, NLM_NEVER);
+
 	if (!wait)
 		lock->fl.fl_flags &= ~FL_SLEEP;
 	mode = lock_to_openmode(&lock->fl);
@@ -541,9 +552,13 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
 	dprintk("lockd: vfs_lock_file returned %d\n", error);
 	switch (error) {
 		case 0:
+			nlmsvc_remove_block(block);
 			ret = nlm_granted;
-			goto out;
+			goto out_cb_mutex;
 		case -EAGAIN:
+			if (!wait)
+				nlmsvc_remove_block(block);
+pending_request:
 			/*
 			 * If this is a blocking request for an
 			 * already pending lock request then we need
@@ -552,26 +567,29 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
 			if (wait)
 				break;
 			ret = async_block ? nlm_lck_blocked : nlm_lck_denied;
-			goto out;
+			goto out_cb_mutex;
 		case FILE_LOCK_DEFERRED:
+			block->b_flags |= B_PENDING_CALLBACK;
+
 			if (wait)
 				break;
 			/* Filesystem lock operation is in progress
 			   Add it to the queue waiting for callback */
 			ret = nlmsvc_defer_lock_rqst(rqstp, block);
-			goto out;
+			goto out_cb_mutex;
 		case -EDEADLK:
+			nlmsvc_remove_block(block);
 			ret = nlm_deadlock;
-			goto out;
+			goto out_cb_mutex;
 		default:			/* includes ENOLCK */
+			nlmsvc_remove_block(block);
 			ret = nlm_lck_denied_nolocks;
-			goto out;
+			goto out_cb_mutex;
 	}
 
 	ret = nlm_lck_blocked;
-
-	/* Append to list of blocked */
-	nlmsvc_insert_block(block, NLM_NEVER);
+out_cb_mutex:
+	mutex_unlock(&block->b_cb_mutex);
 out:
 	mutex_unlock(&file->f_mutex);
 	nlmsvc_release_block(block);
@@ -728,34 +746,60 @@ nlmsvc_update_deferred_block(struct nlm_block *block, int result)
 		block->b_flags |= B_TIMED_OUT;
 }
 
+static int __nlmsvc_grant_deferred(struct nlm_block *block,
+				   struct file_lock *fl,
+				   int result)
+{
+	int rc = 0;
+
+	dprintk("lockd: nlmsvc_notify_blocked block %p flags %d\n",
+					block, block->b_flags);
+	if (block->b_flags & B_QUEUED) {
+		if (block->b_flags & B_TIMED_OUT) {
+			rc = -ENOLCK;
+			goto out;
+		}
+		nlmsvc_update_deferred_block(block, result);
+	} else if (result == 0)
+		block->b_granted = 1;
+
+	nlmsvc_insert_block_locked(block, 0);
+	svc_wake_up(block->b_daemon);
+out:
+	return rc;
+}
+
 static int nlmsvc_grant_deferred(struct file_lock *fl, int result)
 {
-	struct nlm_block *block;
-	int rc = -ENOENT;
+	struct nlm_block *iter, *block = NULL;
+	int rc;
 
 	spin_lock(&nlm_blocked_lock);
-	list_for_each_entry(block, &nlm_blocked, b_list) {
-		if (nlm_compare_locks(&block->b_call->a_args.lock.fl, fl)) {
-			dprintk("lockd: nlmsvc_notify_blocked block %p flags %d\n",
-							block, block->b_flags);
-			if (block->b_flags & B_QUEUED) {
-				if (block->b_flags & B_TIMED_OUT) {
-					rc = -ENOLCK;
-					break;
-				}
-				nlmsvc_update_deferred_block(block, result);
-			} else if (result == 0)
-				block->b_granted = 1;
-
-			nlmsvc_insert_block_locked(block, 0);
-			svc_wake_up(block->b_daemon);
-			rc = 0;
+	list_for_each_entry(iter, &nlm_blocked, b_list) {
+		if (nlm_compare_locks(&iter->b_call->a_args.lock.fl, fl)) {
+			kref_get(&iter->b_count);
+			block = iter;
 			break;
 		}
 	}
 	spin_unlock(&nlm_blocked_lock);
-	if (rc == -ENOENT)
-		printk(KERN_WARNING "lockd: grant for unknown block\n");
+
+	if (!block) {
+		pr_warn("lockd: grant for unknown pending block\n");
+		return -ENOENT;
+	}
+
+	/* don't interfere with nlmsvc_lock() */
+	mutex_lock(&block->b_cb_mutex);
+	block->b_flags &= ~B_PENDING_CALLBACK;
+
+	spin_lock(&nlm_blocked_lock);
+	WARN_ON_ONCE(list_empty(&block->b_list));
+	rc = __nlmsvc_grant_deferred(block, fl, result);
+	spin_unlock(&nlm_blocked_lock);
+	mutex_unlock(&block->b_cb_mutex);
+
+	nlmsvc_release_block(block);
 	return rc;
 }
 
diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
index f42594a9efe0..91f55458f5fc 100644
--- a/include/linux/lockd/lockd.h
+++ b/include/linux/lockd/lockd.h
@@ -185,10 +185,12 @@ struct nlm_block {
 	struct nlm_file *	b_file;		/* file in question */
 	struct cache_req *	b_cache_req;	/* deferred request handling */
 	struct cache_deferred_req * b_deferred_req;
+	struct mutex		b_cb_mutex;	/* callback mutex */
 	unsigned int		b_flags;	/* block flags */
 #define B_QUEUED		1	/* lock queued */
 #define B_GOT_CALLBACK		2	/* got lock or conflicting lock */
 #define B_TIMED_OUT		4	/* filesystem too slow to respond */
+#define B_PENDING_CALLBACK	8	/* pending callback for lock request */
 };
 
 /*
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFCv2 2/7] lockd: FILE_LOCK_DEFERRED only on FL_SLEEP
  2023-08-14 21:11 [RFCv2 0/7] fs: nfs: async lock request changes Alexander Aring
  2023-08-14 21:11 ` [RFCv2 1/7] lockd: fix race in async lock request handling Alexander Aring
@ 2023-08-14 21:11 ` Alexander Aring
  2023-08-16 11:37   ` Jeff Layton
  2023-08-14 21:11 ` [RFCv2 3/7] lockd: introduce safe async lock op Alexander Aring
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Alexander Aring @ 2023-08-14 21:11 UTC (permalink / raw)
  To: linux-nfs
  Cc: cluster-devel, ocfs2-devel, linux-fsdevel, teigland, rpeterso,
	agruenba, trond.myklebust, anna, chuck.lever, jlayton

This patch removes to handle non-blocking lock requests as asynchronous
lock request returning FILE_LOCK_DEFERRED. When fl_lmops and lm_grant()
is set and a non-blocking lock request returns FILE_LOCK_DEFERRED will
end in an WARNING to signal the user the misusage of the API.

The reason why we moving to make non-blocking lock request as
synchronized call is that we already doing this behaviour for unlock or
cancellation as well. Those are POSIX lock operations which are handled
in an synchronized way and waiting for an answer. For non-blocking lock
requests the answer will probably arrive in the same time as unlock or
cancellation operations as those are trylock operations only.

In case of a blocking lock request we need to have it asynchronously
because the time when the lock request getting granted is unknown.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/lockd/svclock.c | 39 +++++++--------------------------------
 1 file changed, 7 insertions(+), 32 deletions(-)

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 7d63524bdb81..1e74a578d7de 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -440,31 +440,6 @@ static void nlmsvc_freegrantargs(struct nlm_rqst *call)
 	locks_release_private(&call->a_args.lock.fl);
 }
 
-/*
- * Deferred lock request handling for non-blocking lock
- */
-static __be32
-nlmsvc_defer_lock_rqst(struct svc_rqst *rqstp, struct nlm_block *block)
-{
-	__be32 status = nlm_lck_denied_nolocks;
-
-	block->b_flags |= B_QUEUED;
-
-	nlmsvc_insert_block(block, NLM_TIMEOUT);
-
-	block->b_cache_req = &rqstp->rq_chandle;
-	if (rqstp->rq_chandle.defer) {
-		block->b_deferred_req =
-			rqstp->rq_chandle.defer(block->b_cache_req);
-		if (block->b_deferred_req != NULL)
-			status = nlm_drop_reply;
-	}
-	dprintk("lockd: nlmsvc_defer_lock_rqst block %p flags %d status %d\n",
-		block, block->b_flags, ntohl(status));
-
-	return status;
-}
-
 /*
  * Attempt to establish a lock, and if it can't be granted, block it
  * if required.
@@ -569,14 +544,14 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
 			ret = async_block ? nlm_lck_blocked : nlm_lck_denied;
 			goto out_cb_mutex;
 		case FILE_LOCK_DEFERRED:
-			block->b_flags |= B_PENDING_CALLBACK;
+			/* lock requests without waiters are handled in
+			 * a non async way. Let assert this to inform
+			 * the user about a API violation.
+			 */
+			WARN_ON_ONCE(!wait);
 
-			if (wait)
-				break;
-			/* Filesystem lock operation is in progress
-			   Add it to the queue waiting for callback */
-			ret = nlmsvc_defer_lock_rqst(rqstp, block);
-			goto out_cb_mutex;
+			block->b_flags |= B_PENDING_CALLBACK;
+			break;
 		case -EDEADLK:
 			nlmsvc_remove_block(block);
 			ret = nlm_deadlock;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFCv2 3/7] lockd: introduce safe async lock op
  2023-08-14 21:11 [RFCv2 0/7] fs: nfs: async lock request changes Alexander Aring
  2023-08-14 21:11 ` [RFCv2 1/7] lockd: fix race in async lock request handling Alexander Aring
  2023-08-14 21:11 ` [RFCv2 2/7] lockd: FILE_LOCK_DEFERRED only on FL_SLEEP Alexander Aring
@ 2023-08-14 21:11 ` Alexander Aring
  2023-08-16 11:43   ` Jeff Layton
  2023-08-14 21:11 ` [RFCv2 4/7] locks: update lock callback documentation Alexander Aring
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Alexander Aring @ 2023-08-14 21:11 UTC (permalink / raw)
  To: linux-nfs
  Cc: cluster-devel, ocfs2-devel, linux-fsdevel, teigland, rpeterso,
	agruenba, trond.myklebust, anna, chuck.lever, jlayton

This patch reverts mostly commit 40595cdc93ed ("nfs: block notification
on fs with its own ->lock") and introduces an EXPORT_OP_SAFE_ASYNC_LOCK
export flag to signal that the "own ->lock" implementation supports
async lock requests. The only main user is DLM that is used by GFS2 and
OCFS2 filesystem. Those implement their own lock() implementation and
return FILE_LOCK_DEFERRED as return value. Since commit 40595cdc93ed
("nfs: block notification on fs with its own ->lock") the DLM
implementation were never updated. This patch should prepare for DLM
to set the EXPORT_OP_SAFE_ASYNC_LOCK export flag and update the DLM
plock implementation regarding to it.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/lockd/svclock.c       |  5 ++---
 fs/nfsd/nfs4state.c      | 13 ++++++++++---
 include/linux/exportfs.h |  8 ++++++++
 3 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 1e74a578d7de..b8bd2b841ee1 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -449,9 +449,7 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
 	    struct nlm_host *host, struct nlm_lock *lock, int wait,
 	    struct nlm_cookie *cookie, int reclaim)
 {
-#if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
 	struct inode		*inode = nlmsvc_file_inode(file);
-#endif
 	struct nlm_block	*block = NULL;
 	int			error;
 	int			mode;
@@ -465,7 +463,8 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
 				(long long)lock->fl.fl_end,
 				wait);
 
-	if (nlmsvc_file_file(file)->f_op->lock) {
+	if (!export_op_support_safe_async_lock(inode->i_sb->s_export_op,
+					       nlmsvc_file_file(file)->f_op)) {
 		async_block = wait;
 		wait = 0;
 	}
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 3aefbad4cc09..14ca06424ff1 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -7430,6 +7430,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	struct nfsd4_blocked_lock *nbl = NULL;
 	struct file_lock *file_lock = NULL;
 	struct file_lock *conflock = NULL;
+	struct super_block *sb;
 	__be32 status = 0;
 	int lkflg;
 	int err;
@@ -7451,6 +7452,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		dprintk("NFSD: nfsd4_lock: permission denied!\n");
 		return status;
 	}
+	sb = cstate->current_fh.fh_dentry->d_sb;
 
 	if (lock->lk_is_new) {
 		if (nfsd4_has_session(cstate))
@@ -7502,7 +7504,9 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	fp = lock_stp->st_stid.sc_file;
 	switch (lock->lk_type) {
 		case NFS4_READW_LT:
-			if (nfsd4_has_session(cstate))
+			if (nfsd4_has_session(cstate) ||
+			    export_op_support_safe_async_lock(sb->s_export_op,
+							      nf->nf_file->f_op))
 				fl_flags |= FL_SLEEP;
 			fallthrough;
 		case NFS4_READ_LT:
@@ -7514,7 +7518,9 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 			fl_type = F_RDLCK;
 			break;
 		case NFS4_WRITEW_LT:
-			if (nfsd4_has_session(cstate))
+			if (nfsd4_has_session(cstate) ||
+			    export_op_support_safe_async_lock(sb->s_export_op,
+							      nf->nf_file->f_op))
 				fl_flags |= FL_SLEEP;
 			fallthrough;
 		case NFS4_WRITE_LT:
@@ -7542,7 +7548,8 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	 * for file locks), so don't attempt blocking lock notifications
 	 * on those filesystems:
 	 */
-	if (nf->nf_file->f_op->lock)
+	if (!export_op_support_safe_async_lock(sb->s_export_op,
+					       nf->nf_file->f_op))
 		fl_flags &= ~FL_SLEEP;
 
 	nbl = find_or_allocate_block(lock_sop, &fp->fi_fhandle, nn);
diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
index 11fbd0ee1370..10358a93cdc1 100644
--- a/include/linux/exportfs.h
+++ b/include/linux/exportfs.h
@@ -3,6 +3,7 @@
 #define LINUX_EXPORTFS_H 1
 
 #include <linux/types.h>
+#include <linux/fs.h>
 
 struct dentry;
 struct iattr;
@@ -224,9 +225,16 @@ struct export_operations {
 						  atomic attribute updates
 						*/
 #define EXPORT_OP_FLUSH_ON_CLOSE	(0x20) /* fs flushes file data on close */
+#define EXPORT_OP_SAFE_ASYNC_LOCK	(0x40) /* fs can do async lock request */
 	unsigned long	flags;
 };
 
+static inline bool export_op_support_safe_async_lock(const struct export_operations *export_ops,
+						     const struct file_operations *f_op)
+{
+	return (export_ops->flags & EXPORT_OP_SAFE_ASYNC_LOCK) || !f_op->lock;
+}
+
 extern int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid,
 				    int *max_len, struct inode *parent,
 				    int flags);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFCv2 4/7] locks: update lock callback documentation
  2023-08-14 21:11 [RFCv2 0/7] fs: nfs: async lock request changes Alexander Aring
                   ` (2 preceding siblings ...)
  2023-08-14 21:11 ` [RFCv2 3/7] lockd: introduce safe async lock op Alexander Aring
@ 2023-08-14 21:11 ` Alexander Aring
  2023-08-16 12:01   ` Jeff Layton
  2023-08-14 21:11 ` [RFCv2 5/7] dlm: use fl_owner from lockd Alexander Aring
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Alexander Aring @ 2023-08-14 21:11 UTC (permalink / raw)
  To: linux-nfs
  Cc: cluster-devel, ocfs2-devel, linux-fsdevel, teigland, rpeterso,
	agruenba, trond.myklebust, anna, chuck.lever, jlayton

This patch updates the existing documentation regarding recent changes
to vfs_lock_file() and lm_grant() is set. In case of lm_grant() is set
we only handle FILE_LOCK_DEFERRED in case of FL_SLEEP in fl_flags is not
set. This is the case of an blocking lock request. Non-blocking lock
requests, when FL_SLEEP is not set, are handled in a synchronized way.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/locks.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index df8b26a42524..a8e51f462b43 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2255,21 +2255,21 @@ int fcntl_getlk(struct file *filp, unsigned int cmd, struct flock *flock)
  * To avoid blocking kernel daemons, such as lockd, that need to acquire POSIX
  * locks, the ->lock() interface may return asynchronously, before the lock has
  * been granted or denied by the underlying filesystem, if (and only if)
- * lm_grant is set. Callers expecting ->lock() to return asynchronously
- * will only use F_SETLK, not F_SETLKW; they will set FL_SLEEP if (and only if)
- * the request is for a blocking lock. When ->lock() does return asynchronously,
- * it must return FILE_LOCK_DEFERRED, and call ->lm_grant() when the lock
- * request completes.
- * If the request is for non-blocking lock the file system should return
- * FILE_LOCK_DEFERRED then try to get the lock and call the callback routine
- * with the result. If the request timed out the callback routine will return a
+ * lm_grant and FL_SLEEP in fl_flags is set. Callers expecting ->lock() to return
+ * asynchronously will only use F_SETLK, not F_SETLKW; When ->lock() does return
+ * asynchronously, it must return FILE_LOCK_DEFERRED, and call ->lm_grant() when
+ * the lock request completes. The lm_grant() callback must be called in a
+ * sleepable context.
+ *
+ * If the request timed out the ->lm_grant() callback routine will return a
  * nonzero return code and the file system should release the lock. The file
- * system is also responsible to keep a corresponding posix lock when it
- * grants a lock so the VFS can find out which locks are locally held and do
- * the correct lock cleanup when required.
- * The underlying filesystem must not drop the kernel lock or call
- * ->lm_grant() before returning to the caller with a FILE_LOCK_DEFERRED
- * return code.
+ * system is also responsible to keep a corresponding posix lock when it grants
+ * a lock so the VFS can find out which locks are locally held and do the correct
+ * lock cleanup when required.
+ *
+ * If the request is for non-blocking lock (when F_SETLK and FL_SLEEP in fl_flags is not set)
+ * the file system should return -EAGAIN if failed to acquire or zero if acquiring was
+ * successfully without calling the ->lm_grant() callback routine.
  */
 int vfs_lock_file(struct file *filp, unsigned int cmd, struct file_lock *fl, struct file_lock *conf)
 {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFCv2 5/7] dlm: use fl_owner from lockd
  2023-08-14 21:11 [RFCv2 0/7] fs: nfs: async lock request changes Alexander Aring
                   ` (3 preceding siblings ...)
  2023-08-14 21:11 ` [RFCv2 4/7] locks: update lock callback documentation Alexander Aring
@ 2023-08-14 21:11 ` Alexander Aring
  2023-08-16 12:02   ` Jeff Layton
  2023-08-14 21:11 ` [RFCv2 6/7] dlm: use FL_SLEEP to check if blocking request Alexander Aring
  2023-08-14 21:11 ` [RFCv2 7/7] dlm: implement EXPORT_OP_SAFE_ASYNC_LOCK Alexander Aring
  6 siblings, 1 reply; 22+ messages in thread
From: Alexander Aring @ 2023-08-14 21:11 UTC (permalink / raw)
  To: linux-nfs
  Cc: cluster-devel, ocfs2-devel, linux-fsdevel, teigland, rpeterso,
	agruenba, trond.myklebust, anna, chuck.lever, jlayton

This patch is changing the fl_owner value in case of an nfs lock request
to not be the pid of lockd. Instead this patch changes it to be the
owner value that nfs is giving us.

Currently there exists proved problems with this behaviour. One nfsd
server was created to export a gfs2 filesystem mount. Two nfs clients
doing a nfs mount of this export. Those two clients should conflict each
other operating on the same nfs file.

A small test program was written:

int main(int argc, const char *argv[])
{
	struct flock fl = {
		.l_type = F_WRLCK,
		.l_whence = SEEK_SET,
		.l_start = 1L,
		.l_len = 1L,
	};
	int fd;

	fd = open("filename", O_RDWR | O_CREAT, 0700);
	printf("try to lock...\n");
	fcntl(fd, F_SETLKW, &fl);
	printf("locked!\n");
	getc(stdin);

	return 0;
}

Running on both clients at the same time and don't interrupting by
pressing any key. It will show that both clients are able to acquire the
lock which shouldn't be the case. The issue is here that the fl_owner
value is the same and the lock context of both clients should be
separated.

This patch lets lockd define how to deal with lock contexts and chose
hopefully the right fl_owner value. A test after this patch was made and
the locks conflicts each other which should be the case.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/plock.c | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index 00e1d802a81c..0094fa4004cc 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -145,6 +145,7 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 	op->info.number		= number;
 	op->info.start		= fl->fl_start;
 	op->info.end		= fl->fl_end;
+	op->info.owner = (__u64)(long)fl->fl_owner;
 	/* async handling */
 	if (fl->fl_lmops && fl->fl_lmops->lm_grant) {
 		op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
@@ -154,9 +155,6 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 			goto out;
 		}
 
-		/* fl_owner is lockd which doesn't distinguish
-		   processes on the nfs client */
-		op->info.owner	= (__u64) fl->fl_pid;
 		op_data->callback = fl->fl_lmops->lm_grant;
 		locks_init_lock(&op_data->flc);
 		locks_copy_lock(&op_data->flc, fl);
@@ -168,8 +166,6 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 		send_op(op);
 		rv = FILE_LOCK_DEFERRED;
 		goto out;
-	} else {
-		op->info.owner	= (__u64)(long) fl->fl_owner;
 	}
 
 	send_op(op);
@@ -326,10 +322,7 @@ int dlm_posix_unlock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 	op->info.number		= number;
 	op->info.start		= fl->fl_start;
 	op->info.end		= fl->fl_end;
-	if (fl->fl_lmops && fl->fl_lmops->lm_grant)
-		op->info.owner	= (__u64) fl->fl_pid;
-	else
-		op->info.owner	= (__u64)(long) fl->fl_owner;
+	op->info.owner = (__u64)(long)fl->fl_owner;
 
 	if (fl->fl_flags & FL_CLOSE) {
 		op->info.flags |= DLM_PLOCK_FL_CLOSE;
@@ -389,7 +382,7 @@ int dlm_posix_cancel(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 	info.number = number;
 	info.start = fl->fl_start;
 	info.end = fl->fl_end;
-	info.owner = (__u64)fl->fl_pid;
+	info.owner = (__u64)(long)fl->fl_owner;
 
 	rv = do_lock_cancel(&info);
 	switch (rv) {
@@ -450,10 +443,7 @@ int dlm_posix_get(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 	op->info.number		= number;
 	op->info.start		= fl->fl_start;
 	op->info.end		= fl->fl_end;
-	if (fl->fl_lmops && fl->fl_lmops->lm_grant)
-		op->info.owner	= (__u64) fl->fl_pid;
-	else
-		op->info.owner	= (__u64)(long) fl->fl_owner;
+	op->info.owner = (__u64)(long)fl->fl_owner;
 
 	send_op(op);
 	wait_event(recv_wq, (op->done != 0));
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFCv2 6/7] dlm: use FL_SLEEP to check if blocking request
  2023-08-14 21:11 [RFCv2 0/7] fs: nfs: async lock request changes Alexander Aring
                   ` (4 preceding siblings ...)
  2023-08-14 21:11 ` [RFCv2 5/7] dlm: use fl_owner from lockd Alexander Aring
@ 2023-08-14 21:11 ` Alexander Aring
  2023-08-16 13:07   ` Jeff Layton
  2023-08-14 21:11 ` [RFCv2 7/7] dlm: implement EXPORT_OP_SAFE_ASYNC_LOCK Alexander Aring
  6 siblings, 1 reply; 22+ messages in thread
From: Alexander Aring @ 2023-08-14 21:11 UTC (permalink / raw)
  To: linux-nfs
  Cc: cluster-devel, ocfs2-devel, linux-fsdevel, teigland, rpeterso,
	agruenba, trond.myklebust, anna, chuck.lever, jlayton

This patch uses the FL_SLEEP flag in struct file_lock to check if it's a
blocking request in case if the request coming from nfs lockd process
indicated by lm_grant() is set.

IF FL_SLEEP is set a asynchronous blocking request is being made and
it's waiting for lm_grant() callback being called to signal the lock was
granted. If it's not set a synchronous non-blocking request is being made.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/plock.c | 38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index 0094fa4004cc..524771002a2f 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -140,7 +140,6 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 	op->info.optype		= DLM_PLOCK_OP_LOCK;
 	op->info.pid		= fl->fl_pid;
 	op->info.ex		= (fl->fl_type == F_WRLCK);
-	op->info.wait		= IS_SETLKW(cmd);
 	op->info.fsid		= ls->ls_global_id;
 	op->info.number		= number;
 	op->info.start		= fl->fl_start;
@@ -148,24 +147,31 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 	op->info.owner = (__u64)(long)fl->fl_owner;
 	/* async handling */
 	if (fl->fl_lmops && fl->fl_lmops->lm_grant) {
-		op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
-		if (!op_data) {
-			dlm_release_plock_op(op);
-			rv = -ENOMEM;
-			goto out;
-		}
+		if (fl->fl_flags & FL_SLEEP) {
+			op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
+			if (!op_data) {
+				dlm_release_plock_op(op);
+				rv = -ENOMEM;
+				goto out;
+			}
 
-		op_data->callback = fl->fl_lmops->lm_grant;
-		locks_init_lock(&op_data->flc);
-		locks_copy_lock(&op_data->flc, fl);
-		op_data->fl		= fl;
-		op_data->file	= file;
+			op->info.wait = 1;
+			op_data->callback = fl->fl_lmops->lm_grant;
+			locks_init_lock(&op_data->flc);
+			locks_copy_lock(&op_data->flc, fl);
+			op_data->fl		= fl;
+			op_data->file	= file;
 
-		op->data = op_data;
+			op->data = op_data;
 
-		send_op(op);
-		rv = FILE_LOCK_DEFERRED;
-		goto out;
+			send_op(op);
+			rv = FILE_LOCK_DEFERRED;
+			goto out;
+		} else {
+			op->info.wait = 0;
+		}
+	} else {
+		op->info.wait = IS_SETLKW(cmd);
 	}
 
 	send_op(op);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFCv2 7/7] dlm: implement EXPORT_OP_SAFE_ASYNC_LOCK
  2023-08-14 21:11 [RFCv2 0/7] fs: nfs: async lock request changes Alexander Aring
                   ` (5 preceding siblings ...)
  2023-08-14 21:11 ` [RFCv2 6/7] dlm: use FL_SLEEP to check if blocking request Alexander Aring
@ 2023-08-14 21:11 ` Alexander Aring
  6 siblings, 0 replies; 22+ messages in thread
From: Alexander Aring @ 2023-08-14 21:11 UTC (permalink / raw)
  To: linux-nfs
  Cc: cluster-devel, ocfs2-devel, linux-fsdevel, teigland, rpeterso,
	agruenba, trond.myklebust, anna, chuck.lever, jlayton

This patch is activating the EXPORT_OP_SAFE_ASYNC_LOCK export flag to
signal lockd that both filesystems are able to handle async lock
requests. The cluster filesystems gfs2 and ocfs2 will redirect their
lock requests to DLMs plock implementation that can handle async lock
requests.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/gfs2/export.c  | 1 +
 fs/ocfs2/export.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/fs/gfs2/export.c b/fs/gfs2/export.c
index cf40895233f5..36bc43b9d141 100644
--- a/fs/gfs2/export.c
+++ b/fs/gfs2/export.c
@@ -192,5 +192,6 @@ const struct export_operations gfs2_export_ops = {
 	.fh_to_parent = gfs2_fh_to_parent,
 	.get_name = gfs2_get_name,
 	.get_parent = gfs2_get_parent,
+	.flags = EXPORT_OP_SAFE_ASYNC_LOCK,
 };
 
diff --git a/fs/ocfs2/export.c b/fs/ocfs2/export.c
index eaa8c80ace3c..8a1169e01dd9 100644
--- a/fs/ocfs2/export.c
+++ b/fs/ocfs2/export.c
@@ -280,4 +280,5 @@ const struct export_operations ocfs2_export_ops = {
 	.fh_to_dentry	= ocfs2_fh_to_dentry,
 	.fh_to_parent	= ocfs2_fh_to_parent,
 	.get_parent	= ocfs2_get_parent,
+	.flags		= EXPORT_OP_SAFE_ASYNC_LOCK,
 };
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFCv2 1/7] lockd: fix race in async lock request handling
  2023-08-14 21:11 ` [RFCv2 1/7] lockd: fix race in async lock request handling Alexander Aring
@ 2023-08-15 17:49   ` Jeff Layton
  2023-08-15 18:21     ` Jeff Layton
  2023-08-17 18:36     ` Alexander Aring
  0 siblings, 2 replies; 22+ messages in thread
From: Jeff Layton @ 2023-08-15 17:49 UTC (permalink / raw)
  To: Alexander Aring, linux-nfs
  Cc: cluster-devel, ocfs2-devel, linux-fsdevel, teigland, rpeterso,
	agruenba, trond.myklebust, anna, chuck.lever

On Mon, 2023-08-14 at 17:11 -0400, Alexander Aring wrote:
> This patch fixes a race in async lock request handling between adding
> the relevant struct nlm_block to nlm_blocked list after the request was
> sent by vfs_lock_file() and nlmsvc_grant_deferred() does a lookup of the
> nlm_block in the nlm_blocked list. It could be that the async request is
> completed before the nlm_block was added to the list. This would end
> in a -ENOENT and a kernel log message of "lockd: grant for unknown
> block".
> 
> To solve this issue we add the nlm_block before the vfs_lock_file() call
> to be sure it has been added when a possible nlmsvc_grant_deferred() is
> called. If the vfs_lock_file() results in an case when it wouldn't be
> added to nlm_blocked list, the nlm_block struct will be removed from
> this list again.
> 
> The introducing of the new B_PENDING_CALLBACK nlm_block flag will handle
> async lock requests on a pending lock requests as a retry on the caller
> level to hit the -EAGAIN case.
> 
> Signed-off-by: Alexander Aring <aahringo@redhat.com>
> ---
>  fs/lockd/svclock.c          | 100 ++++++++++++++++++++++++++----------
>  include/linux/lockd/lockd.h |   2 +
>  2 files changed, 74 insertions(+), 28 deletions(-)
> 
> diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
> index c43ccdf28ed9..7d63524bdb81 100644
> --- a/fs/lockd/svclock.c
> +++ b/fs/lockd/svclock.c
> @@ -133,6 +133,7 @@ nlmsvc_remove_block(struct nlm_block *block)
>  {
>  	if (!list_empty(&block->b_list)) {
>  		spin_lock(&nlm_blocked_lock);
> +		block->b_flags &= ~B_PENDING_CALLBACK;
> 		list_del_init(&block->b_list);
>  		spin_unlock(&nlm_blocked_lock);
>  		nlmsvc_release_block(block);
> @@ -232,6 +233,7 @@ nlmsvc_create_block(struct svc_rqst *rqstp, struct nlm_host *host,
>  	kref_init(&block->b_count);
>  	INIT_LIST_HEAD(&block->b_list);
>  	INIT_LIST_HEAD(&block->b_flist);
> +	mutex_init(&block->b_cb_mutex);
>  
>  	if (!nlmsvc_setgrantargs(call, lock))
>  		goto failed_free;
> @@ -289,6 +291,8 @@ static void nlmsvc_free_block(struct kref *kref)
>  
>  	dprintk("lockd: freeing block %p...\n", block);
>  
> +	WARN_ON_ONCE(block->b_flags & B_PENDING_CALLBACK);
> +
>  	/* Remove block from file's list of blocks */
>  	list_del_init(&block->b_flist);
>  	mutex_unlock(&file->f_mutex);
> @@ -532,6 +536,13 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
>  		goto out;
>  	}
>  
> +	mutex_lock(&block->b_cb_mutex);
> +	if (block->b_flags & B_PENDING_CALLBACK)
> +		goto pending_request;
> +
> +	/* Append to list of blocked */
> +	nlmsvc_insert_block(block, NLM_NEVER);
> +
>  	if (!wait)
>  		lock->fl.fl_flags &= ~FL_SLEEP;
>  	mode = lock_to_openmode(&lock->fl);
> @@ -541,9 +552,13 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
>  	dprintk("lockd: vfs_lock_file returned %d\n", error);
>  	switch (error) {
>  		case 0:
> +			nlmsvc_remove_block(block);
>  			ret = nlm_granted;
> -			goto out;
> +			goto out_cb_mutex;
>  		case -EAGAIN:
> +			if (!wait)
> +				nlmsvc_remove_block(block);
> +pending_request:
>  			/*
>  			 * If this is a blocking request for an
>  			 * already pending lock request then we need
> @@ -552,26 +567,29 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
>  			if (wait)
>  				break;
>  			ret = async_block ? nlm_lck_blocked : nlm_lck_denied;
> -			goto out;
> +			goto out_cb_mutex;
>  		case FILE_LOCK_DEFERRED:
> +			block->b_flags |= B_PENDING_CALLBACK;
> +
>  			if (wait)
>  				break;
>  			/* Filesystem lock operation is in progress
>  			   Add it to the queue waiting for callback */
>  			ret = nlmsvc_defer_lock_rqst(rqstp, block);
> -			goto out;
> +			goto out_cb_mutex;
>  		case -EDEADLK:
> +			nlmsvc_remove_block(block);
>  			ret = nlm_deadlock;
> -			goto out;
> +			goto out_cb_mutex;
>  		default:			/* includes ENOLCK */
> +			nlmsvc_remove_block(block);
>  			ret = nlm_lck_denied_nolocks;
> -			goto out;
> +			goto out_cb_mutex;
>  	}
>  
>  	ret = nlm_lck_blocked;
> -
> -	/* Append to list of blocked */
> -	nlmsvc_insert_block(block, NLM_NEVER);
> +out_cb_mutex:
> +	mutex_unlock(&block->b_cb_mutex);
>  out:
>  	mutex_unlock(&file->f_mutex);
>  	nlmsvc_release_block(block);
> @@ -728,34 +746,60 @@ nlmsvc_update_deferred_block(struct nlm_block *block, int result)
>  		block->b_flags |= B_TIMED_OUT;
>  }
>  
> +static int __nlmsvc_grant_deferred(struct nlm_block *block,
> +				   struct file_lock *fl,
> +				   int result)
> +{
> +	int rc = 0;
> +
> +	dprintk("lockd: nlmsvc_notify_blocked block %p flags %d\n",
> +					block, block->b_flags);
> +	if (block->b_flags & B_QUEUED) {
> +		if (block->b_flags & B_TIMED_OUT) {
> +			rc = -ENOLCK;
> +			goto out;
> +		}
> +		nlmsvc_update_deferred_block(block, result);
> +	} else if (result == 0)
> +		block->b_granted = 1;
> +
> +	nlmsvc_insert_block_locked(block, 0);
> +	svc_wake_up(block->b_daemon);
> +out:
> +	return rc;
> +}
> +
>  static int nlmsvc_grant_deferred(struct file_lock *fl, int result)
>  {
> -	struct nlm_block *block;
> -	int rc = -ENOENT;
> +	struct nlm_block *iter, *block = NULL;
> +	int rc;
>  
>  	spin_lock(&nlm_blocked_lock);
> -	list_for_each_entry(block, &nlm_blocked, b_list) {
> -		if (nlm_compare_locks(&block->b_call->a_args.lock.fl, fl)) {
> -			dprintk("lockd: nlmsvc_notify_blocked block %p flags %d\n",
> -							block, block->b_flags);
> -			if (block->b_flags & B_QUEUED) {
> -				if (block->b_flags & B_TIMED_OUT) {
> -					rc = -ENOLCK;
> -					break;
> -				}
> -				nlmsvc_update_deferred_block(block, result);
> -			} else if (result == 0)
> -				block->b_granted = 1;
> -
> -			nlmsvc_insert_block_locked(block, 0);
> -			svc_wake_up(block->b_daemon);
> -			rc = 0;
> +	list_for_each_entry(iter, &nlm_blocked, b_list) {
> +		if (nlm_compare_locks(&iter->b_call->a_args.lock.fl, fl)) {
> +			kref_get(&iter->b_count);
> +			block = iter;
>  			break;
>  		}
>  	}
>  	spin_unlock(&nlm_blocked_lock);
> -	if (rc == -ENOENT)
> -		printk(KERN_WARNING "lockd: grant for unknown block\n");
> +
> +	if (!block) {
> +		pr_warn("lockd: grant for unknown pending block\n");
> +		return -ENOENT;
> +	}
> +
> +	/* don't interfere with nlmsvc_lock() */
> +	mutex_lock(&block->b_cb_mutex);
> +	block->b_flags &= ~B_PENDING_CALLBACK;
> +
> +	spin_lock(&nlm_blocked_lock);
> +	WARN_ON_ONCE(list_empty(&block->b_list));
> +	rc = __nlmsvc_grant_deferred(block, fl, result);
> +	spin_unlock(&nlm_blocked_lock);
> +	mutex_unlock(&block->b_cb_mutex);
> +
> +	nlmsvc_release_block(block);
>  	return rc;
>  }
>  
> diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
> index f42594a9efe0..91f55458f5fc 100644
> --- a/include/linux/lockd/lockd.h
> +++ b/include/linux/lockd/lockd.h
> @@ -185,10 +185,12 @@ struct nlm_block {
>  	struct nlm_file *	b_file;		/* file in question */
>  	struct cache_req *	b_cache_req;	/* deferred request handling */
>  	struct cache_deferred_req * b_deferred_req
> +	struct mutex		b_cb_mutex;	/* callback mutex */

There is no mention at all of this new mutex in the changelog or
comments. It's not at all clear to me what this is intended to protect.
In general, with lockd being a single-threaded service, we want to avoid
sleeping locks. This will need some clear justification.

At a glance, it looks like you're trying to use this to hold
B_PENDING_CALLBACK steady while a lock request is being handled. That
suggests that you're using this mutex to serialize access to a section
of code and not one or more specific data structures. We usually like to
avoid that sort of thing, since locks that protect arbitrary sections of
code become difficult to work with over time.

I'm going to go out on a limb here though and suggest that there is
probably a way to solve this problem that doesn't involve adding new
locks.

>  	unsigned int		b_flags;	/* block flags */
>  #define B_QUEUED		1	/* lock queued */
>  #define B_GOT_CALLBACK		2	/* got lock or conflicting lock */
>  #define B_TIMED_OUT		4	/* filesystem too slow to respond */
> +#define B_PENDING_CALLBACK	8	/* pending callback for lock request */
>  };
>  
>  /*

Do we need this new flag at all? It seems redundant. If we have a block
on the list, then it is sort of by definition "pending callback". If
it's not on the list anymore, then it's not. No?

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFCv2 1/7] lockd: fix race in async lock request handling
  2023-08-15 17:49   ` Jeff Layton
@ 2023-08-15 18:21     ` Jeff Layton
  2023-08-17 18:39       ` Alexander Aring
  2023-08-17 18:36     ` Alexander Aring
  1 sibling, 1 reply; 22+ messages in thread
From: Jeff Layton @ 2023-08-15 18:21 UTC (permalink / raw)
  To: Alexander Aring, linux-nfs
  Cc: cluster-devel, ocfs2-devel, linux-fsdevel, teigland, rpeterso,
	agruenba, trond.myklebust, anna, chuck.lever

On Tue, 2023-08-15 at 13:49 -0400, Jeff Layton wrote:
> On Mon, 2023-08-14 at 17:11 -0400, Alexander Aring wrote:
> > This patch fixes a race in async lock request handling between adding
> > the relevant struct nlm_block to nlm_blocked list after the request was
> > sent by vfs_lock_file() and nlmsvc_grant_deferred() does a lookup of the
> > nlm_block in the nlm_blocked list. It could be that the async request is
> > completed before the nlm_block was added to the list. This would end
> > in a -ENOENT and a kernel log message of "lockd: grant for unknown
> > block".
> > 
> > To solve this issue we add the nlm_block before the vfs_lock_file() call
> > to be sure it has been added when a possible nlmsvc_grant_deferred() is
> > called. If the vfs_lock_file() results in an case when it wouldn't be
> > added to nlm_blocked list, the nlm_block struct will be removed from
> > this list again.
> > 
> > The introducing of the new B_PENDING_CALLBACK nlm_block flag will handle
> > async lock requests on a pending lock requests as a retry on the caller
> > level to hit the -EAGAIN case.
> > 
> > Signed-off-by: Alexander Aring <aahringo@redhat.com>
> > ---
> >  fs/lockd/svclock.c          | 100 ++++++++++++++++++++++++++----------
> >  include/linux/lockd/lockd.h |   2 +
> >  2 files changed, 74 insertions(+), 28 deletions(-)
> > 
> > 

[...]

> > diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
> > index f42594a9efe0..91f55458f5fc 100644
> > --- a/include/linux/lockd/lockd.h
> > +++ b/include/linux/lockd/lockd.h
> > @@ -185,10 +185,12 @@ struct nlm_block {
> >  	struct nlm_file *	b_file;		/* file in question */
> >  	struct cache_req *	b_cache_req;	/* deferred request handling */
> >  	struct cache_deferred_req * b_deferred_req
> > +	struct mutex		b_cb_mutex;	/* callback mutex */
> 
> There is no mention at all of this new mutex in the changelog or
> comments. It's not at all clear to me what this is intended to protect.
> In general, with lockd being a single-threaded service, we want to avoid
> sleeping locks. This will need some clear justification.
> 
> At a glance, it looks like you're trying to use this to hold
> B_PENDING_CALLBACK steady while a lock request is being handled. That
> suggests that you're using this mutex to serialize access to a section
> of code and not one or more specific data structures. We usually like to
> avoid that sort of thing, since locks that protect arbitrary sections of
> code become difficult to work with over time.
> 
> I'm going to go out on a limb here though and suggest that there is
> probably a way to solve this problem that doesn't involve adding new
> locks.
> 
> >  	unsigned int		b_flags;	/* block flags */
> >  #define B_QUEUED		1	/* lock queued */
> >  #define B_GOT_CALLBACK		2	/* got lock or conflicting lock */
> >  #define B_TIMED_OUT		4	/* filesystem too slow to respond */
> > +#define B_PENDING_CALLBACK	8	/* pending callback for lock request */
> >  };
> >  
> >  /*
> 
> Do we need this new flag at all? It seems redundant. If we have a block
> on the list, then it is sort of by definition "pending callback". If
> it's not on the list anymore, then it's not. No?
> 

Do we need anything more than a patch along these lines? Note that this
is untested, so RFC:

---------------------8<-----------------------

[RFC PATCH] lockd: alternate fix for race between deferred lock and grant

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/lockd/svclock.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index c43ccdf28ed9..e9a84363c26e 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -446,6 +446,8 @@ nlmsvc_defer_lock_rqst(struct svc_rqst *rqstp, struct nlm_block *block)
 
 	block->b_flags |= B_QUEUED;
 
+	/* FIXME: remove and reinsert w/o dropping spinlock */
+	nlmsvc_remove_block(block);
 	nlmsvc_insert_block(block, NLM_TIMEOUT);
 
 	block->b_cache_req = &rqstp->rq_chandle;
@@ -535,6 +537,9 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
 	if (!wait)
 		lock->fl.fl_flags &= ~FL_SLEEP;
 	mode = lock_to_openmode(&lock->fl);
+
+	/* Append to list of blocked */
+	nlmsvc_insert_block(block, NLM_NEVER);
 	error = vfs_lock_file(file->f_file[mode], F_SETLK, &lock->fl, NULL);
 	lock->fl.fl_flags &= ~FL_SLEEP;
 
@@ -542,6 +547,7 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
 	switch (error) {
 		case 0:
 			ret = nlm_granted;
+			nlmsvc_remove_block(block);
 			goto out;
 		case -EAGAIN:
 			/*
@@ -552,6 +558,7 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
 			if (wait)
 				break;
 			ret = async_block ? nlm_lck_blocked : nlm_lck_denied;
+			nlmsvc_remove_block(block);
 			goto out;
 		case FILE_LOCK_DEFERRED:
 			if (wait)
@@ -570,8 +577,6 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
 
 	ret = nlm_lck_blocked;
 
-	/* Append to list of blocked */
-	nlmsvc_insert_block(block, NLM_NEVER);
 out:
 	mutex_unlock(&file->f_mutex);
 	nlmsvc_release_block(block);
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFCv2 2/7] lockd: FILE_LOCK_DEFERRED only on FL_SLEEP
  2023-08-14 21:11 ` [RFCv2 2/7] lockd: FILE_LOCK_DEFERRED only on FL_SLEEP Alexander Aring
@ 2023-08-16 11:37   ` Jeff Layton
  2023-08-17  1:40     ` Alexander Aring
  0 siblings, 1 reply; 22+ messages in thread
From: Jeff Layton @ 2023-08-16 11:37 UTC (permalink / raw)
  To: Alexander Aring, linux-nfs
  Cc: cluster-devel, ocfs2-devel, linux-fsdevel, teigland, rpeterso,
	agruenba, trond.myklebust, anna, chuck.lever

On Mon, 2023-08-14 at 17:11 -0400, Alexander Aring wrote:
> This patch removes to handle non-blocking lock requests as asynchronous
> lock request returning FILE_LOCK_DEFERRED. When fl_lmops and lm_grant()
> is set and a non-blocking lock request returns FILE_LOCK_DEFERRED will
> end in an WARNING to signal the user the misusage of the API.
> 

Probably need to rephrase the word salad in the first sentence of the
commit log. I had to go over it a few times to understand what was going
on here.

In any case, I'm guessing that the idea here is that GFS2/DLM shouldn't
ever return FILE_LOCK_DEFERRED if this is a non-wait request (i.e.
someone called F_SETLK instead of F_SETLKW)?

That may be ok, but again, lockd goes to great lengths to avoid blocking
and I think it's generally a good idea. If an upcall to DLM can take a
long time, it might be a good idea to continue to allow a !wait request
to return FILE_LOCK_DEFERRED.

I guess this really depends on the current behavior today though. Does
DLM ever return FILE_LOCK_DEFERRED on a non-blocking lock request?


> The reason why we moving to make non-blocking lock request as
> synchronized call is that we already doing this behaviour for unlock or
> cancellation as well. Those are POSIX lock operations which are handled
> in an synchronized way and waiting for an answer. For non-blocking lock
> requests the answer will probably arrive in the same time as unlock or
> cancellation operations as those are trylock operations only.
> 
> In case of a blocking lock request we need to have it asynchronously
> because the time when the lock request getting granted is unknown.
> 
> Signed-off-by: Alexander Aring <aahringo@redhat.com>
> ---
>  fs/lockd/svclock.c | 39 +++++++--------------------------------
>  1 file changed, 7 insertions(+), 32 deletions(-)
> 
> diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
> index 7d63524bdb81..1e74a578d7de 100644
> --- a/fs/lockd/svclock.c
> +++ b/fs/lockd/svclock.c
> @@ -440,31 +440,6 @@ static void nlmsvc_freegrantargs(struct nlm_rqst *call)
>  	locks_release_private(&call->a_args.lock.fl);
>  }
>  
> -/*
> - * Deferred lock request handling for non-blocking lock
> - */
> -static __be32
> -nlmsvc_defer_lock_rqst(struct svc_rqst *rqstp, struct nlm_block *block)
> -{
> -	__be32 status = nlm_lck_denied_nolocks;
> -
> -	block->b_flags |= B_QUEUED;
> -
> -	nlmsvc_insert_block(block, NLM_TIMEOUT);
> -
> -	block->b_cache_req = &rqstp->rq_chandle;
> -	if (rqstp->rq_chandle.defer) {
> -		block->b_deferred_req =
> -			rqstp->rq_chandle.defer(block->b_cache_req);
> -		if (block->b_deferred_req != NULL)
> -			status = nlm_drop_reply;
> -	}
> -	dprintk("lockd: nlmsvc_defer_lock_rqst block %p flags %d status %d\n",
> -		block, block->b_flags, ntohl(status));
> -
> -	return status;
> -}
> -
>  /*
>   * Attempt to establish a lock, and if it can't be granted, block it
>   * if required.
> @@ -569,14 +544,14 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
>  			ret = async_block ? nlm_lck_blocked : nlm_lck_denied;
>  			goto out_cb_mutex;
>  		case FILE_LOCK_DEFERRED:
> -			block->b_flags |= B_PENDING_CALLBACK;
> +			/* lock requests without waiters are handled in
> +			 * a non async way. Let assert this to inform
> +			 * the user about a API violation.
> +			 */
> +			WARN_ON_ONCE(!wait);
>  
> -			if (wait)
> -				break;
> -			/* Filesystem lock operation is in progress
> -			   Add it to the queue waiting for callback */
> -			ret = nlmsvc_defer_lock_rqst(rqstp, block);
> -			goto out_cb_mutex;
> +			block->b_flags |= B_PENDING_CALLBACK;
> +			break;
>  		case -EDEADLK:
>  			nlmsvc_remove_block(block);
>  			ret = nlm_deadlock;

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFCv2 3/7] lockd: introduce safe async lock op
  2023-08-14 21:11 ` [RFCv2 3/7] lockd: introduce safe async lock op Alexander Aring
@ 2023-08-16 11:43   ` Jeff Layton
  0 siblings, 0 replies; 22+ messages in thread
From: Jeff Layton @ 2023-08-16 11:43 UTC (permalink / raw)
  To: Alexander Aring, linux-nfs
  Cc: cluster-devel, ocfs2-devel, linux-fsdevel, teigland, rpeterso,
	agruenba, trond.myklebust, anna, chuck.lever

On Mon, 2023-08-14 at 17:11 -0400, Alexander Aring wrote:
> This patch reverts mostly commit 40595cdc93ed ("nfs: block notification
> on fs with its own ->lock") and introduces an EXPORT_OP_SAFE_ASYNC_LOCK
> export flag to signal that the "own ->lock" implementation supports
> async lock requests. The only main user is DLM that is used by GFS2 and
> OCFS2 filesystem. Those implement their own lock() implementation and
> return FILE_LOCK_DEFERRED as return value. Since commit 40595cdc93ed
> ("nfs: block notification on fs with its own ->lock") the DLM
> implementation were never updated. This patch should prepare for DLM
> to set the EXPORT_OP_SAFE_ASYNC_LOCK export flag and update the DLM
> plock implementation regarding to it.
> 
> Signed-off-by: Alexander Aring <aahringo@redhat.com>
> ---
>  fs/lockd/svclock.c       |  5 ++---
>  fs/nfsd/nfs4state.c      | 13 ++++++++++---
>  include/linux/exportfs.h |  8 ++++++++
>  3 files changed, 20 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
> index 1e74a578d7de..b8bd2b841ee1 100644
> --- a/fs/lockd/svclock.c
> +++ b/fs/lockd/svclock.c
> @@ -449,9 +449,7 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
>  	    struct nlm_host *host, struct nlm_lock *lock, int wait,
>  	    struct nlm_cookie *cookie, int reclaim)
>  {
> -#if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
>  	struct inode		*inode = nlmsvc_file_inode(file);
> -#endif
>  	struct nlm_block	*block = NULL;
>  	int			error;
>  	int			mode;
> @@ -465,7 +463,8 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
>  				(long long)lock->fl.fl_end,
>  				wait);
>  
> -	if (nlmsvc_file_file(file)->f_op->lock) {
> +	if (!export_op_support_safe_async_lock(inode->i_sb->s_export_op,
> +					       nlmsvc_file_file(file)->f_op)) {
>  		async_block = wait;
>  		wait = 0;
>  	}
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 3aefbad4cc09..14ca06424ff1 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -7430,6 +7430,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  	struct nfsd4_blocked_lock *nbl = NULL;
>  	struct file_lock *file_lock = NULL;
>  	struct file_lock *conflock = NULL;
> +	struct super_block *sb;
>  	__be32 status = 0;
>  	int lkflg;
>  	int err;
> @@ -7451,6 +7452,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  		dprintk("NFSD: nfsd4_lock: permission denied!\n");
>  		return status;
>  	}
> +	sb = cstate->current_fh.fh_dentry->d_sb;
>  
>  	if (lock->lk_is_new) {
>  		if (nfsd4_has_session(cstate))
> @@ -7502,7 +7504,9 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  	fp = lock_stp->st_stid.sc_file;
>  	switch (lock->lk_type) {
>  		case NFS4_READW_LT:
> -			if (nfsd4_has_session(cstate))
> +			if (nfsd4_has_session(cstate) ||
> +			    export_op_support_safe_async_lock(sb->s_export_op,
> +							      nf->nf_file->f_op))
>  				fl_flags |= FL_SLEEP;
>  			fallthrough;
>  		case NFS4_READ_LT:
> @@ -7514,7 +7518,9 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  			fl_type = F_RDLCK;
>  			break;
>  		case NFS4_WRITEW_LT:
> -			if (nfsd4_has_session(cstate))
> +			if (nfsd4_has_session(cstate) ||
> +			    export_op_support_safe_async_lock(sb->s_export_op,
> +							      nf->nf_file->f_op))
>  				fl_flags |= FL_SLEEP;
>  			fallthrough;
>  		case NFS4_WRITE_LT:
> @@ -7542,7 +7548,8 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  	 * for file locks), so don't attempt blocking lock notifications
>  	 * on those filesystems:
>  	 */
> -	if (nf->nf_file->f_op->lock)
> +	if (!export_op_support_safe_async_lock(sb->s_export_op,
> +					       nf->nf_file->f_op))
>  		fl_flags &= ~FL_SLEEP;
>  
>  	nbl = find_or_allocate_block(lock_sop, &fp->fi_fhandle, nn);
> diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
> index 11fbd0ee1370..10358a93cdc1 100644
> --- a/include/linux/exportfs.h
> +++ b/include/linux/exportfs.h
> @@ -3,6 +3,7 @@
>  #define LINUX_EXPORTFS_H 1
>  
>  #include <linux/types.h>
> +#include <linux/fs.h>
>  
>  struct dentry;
>  struct iattr;
> @@ -224,9 +225,16 @@ struct export_operations {
>  						  atomic attribute updates
>  						*/
>  #define EXPORT_OP_FLUSH_ON_CLOSE	(0x20) /* fs flushes file data on close */
> +#define EXPORT_OP_SAFE_ASYNC_LOCK	(0x40) /* fs can do async lock request */
>  	unsigned long	flags;
>  };
>  
> +static inline bool export_op_support_safe_async_lock(const struct export_operations *export_ops,
> +						     const struct file_operations *f_op)
> +{
> +	return (export_ops->flags & EXPORT_OP_SAFE_ASYNC_LOCK) || !f_op->lock;
> +}
> +
>  extern int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid,
>  				    int *max_len, struct inode *parent,
>  				    int flags);

This seems like a reasonable approach, in principle.

Acked-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFCv2 4/7] locks: update lock callback documentation
  2023-08-14 21:11 ` [RFCv2 4/7] locks: update lock callback documentation Alexander Aring
@ 2023-08-16 12:01   ` Jeff Layton
  2023-08-17  1:23     ` Alexander Aring
  0 siblings, 1 reply; 22+ messages in thread
From: Jeff Layton @ 2023-08-16 12:01 UTC (permalink / raw)
  To: Alexander Aring, linux-nfs
  Cc: cluster-devel, ocfs2-devel, linux-fsdevel, teigland, rpeterso,
	agruenba, trond.myklebust, anna, chuck.lever

On Mon, 2023-08-14 at 17:11 -0400, Alexander Aring wrote:
> This patch updates the existing documentation regarding recent changes
> to vfs_lock_file() and lm_grant() is set. In case of lm_grant() is set
> we only handle FILE_LOCK_DEFERRED in case of FL_SLEEP in fl_flags is not
> set. This is the case of an blocking lock request. Non-blocking lock
> requests, when FL_SLEEP is not set, are handled in a synchronized way.
> 
> Signed-off-by: Alexander Aring <aahringo@redhat.com>
> ---
>  fs/locks.c | 28 ++++++++++++++--------------
>  1 file changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index df8b26a42524..a8e51f462b43 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -2255,21 +2255,21 @@ int fcntl_getlk(struct file *filp, unsigned int cmd, struct flock *flock)
>   * To avoid blocking kernel daemons, such as lockd, that need to acquire POSIX
>   * locks, the ->lock() interface may return asynchronously, before the lock has
>   * been granted or denied by the underlying filesystem, if (and only if)
> - * lm_grant is set. Callers expecting ->lock() to return asynchronously
> - * will only use F_SETLK, not F_SETLKW; they will set FL_SLEEP if (and only if)
> - * the request is for a blocking lock. When ->lock() does return asynchronously,
> - * it must return FILE_LOCK_DEFERRED, and call ->lm_grant() when the lock
> - * request completes.
> - * If the request is for non-blocking lock the file system should return
> - * FILE_LOCK_DEFERRED then try to get the lock and call the callback routine
> - * with the result. If the request timed out the callback routine will return a
> + * lm_grant and FL_SLEEP in fl_flags is set. Callers expecting ->lock() to return
> + * asynchronously will only use F_SETLK, not F_SETLKW; When ->lock() does return

Isn't the above backward? Shouldn't it say "Callers expecting ->lock()
to return asynchronously will only use F_SETLKW, not F_SETLK" ?

> + * asynchronously, it must return FILE_LOCK_DEFERRED, and call ->lm_grant() when
> + * the lock request completes. The lm_grant() callback must be called in a
> + * sleepable context.
> + *
> + * If the request timed out the ->lm_grant() callback routine will return a
>   * nonzero return code and the file system should release the lock. The file
> - * system is also responsible to keep a corresponding posix lock when it
> - * grants a lock so the VFS can find out which locks are locally held and do
> - * the correct lock cleanup when required.
> - * The underlying filesystem must not drop the kernel lock or call
> - * ->lm_grant() before returning to the caller with a FILE_LOCK_DEFERRED
> - * return code.
> + * system is also responsible to keep a corresponding posix lock when it grants
> + * a lock so the VFS can find out which locks are locally held and do the correct
> + * lock cleanup when required.
> + *
> + * If the request is for non-blocking lock (when F_SETLK and FL_SLEEP in fl_flags is not set)
> + * the file system should return -EAGAIN if failed to acquire or zero if acquiring was
> + * successfully without calling the ->lm_grant() callback routine.
>   */
>  int vfs_lock_file(struct file *filp, unsigned int cmd, struct file_lock *fl, struct file_lock *conf)
>  {

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFCv2 5/7] dlm: use fl_owner from lockd
  2023-08-14 21:11 ` [RFCv2 5/7] dlm: use fl_owner from lockd Alexander Aring
@ 2023-08-16 12:02   ` Jeff Layton
  0 siblings, 0 replies; 22+ messages in thread
From: Jeff Layton @ 2023-08-16 12:02 UTC (permalink / raw)
  To: Alexander Aring, linux-nfs
  Cc: cluster-devel, ocfs2-devel, linux-fsdevel, teigland, rpeterso,
	agruenba, trond.myklebust, anna, chuck.lever

On Mon, 2023-08-14 at 17:11 -0400, Alexander Aring wrote:
> This patch is changing the fl_owner value in case of an nfs lock request
> to not be the pid of lockd. Instead this patch changes it to be the
> owner value that nfs is giving us.
> 
> Currently there exists proved problems with this behaviour. One nfsd
> server was created to export a gfs2 filesystem mount. Two nfs clients
> doing a nfs mount of this export. Those two clients should conflict each
> other operating on the same nfs file.
> 
> A small test program was written:
> 
> int main(int argc, const char *argv[])
> {
> 	struct flock fl = {
> 		.l_type = F_WRLCK,
> 		.l_whence = SEEK_SET,
> 		.l_start = 1L,
> 		.l_len = 1L,
> 	};
> 	int fd;
> 
> 	fd = open("filename", O_RDWR | O_CREAT, 0700);
> 	printf("try to lock...\n");
> 	fcntl(fd, F_SETLKW, &fl);
> 	printf("locked!\n");
> 	getc(stdin);
> 
> 	return 0;
> }
> 
> Running on both clients at the same time and don't interrupting by
> pressing any key. It will show that both clients are able to acquire the
> lock which shouldn't be the case. The issue is here that the fl_owner
> value is the same and the lock context of both clients should be
> separated.
> 
> This patch lets lockd define how to deal with lock contexts and chose
> hopefully the right fl_owner value. A test after this patch was made and
> the locks conflicts each other which should be the case.
> 
> Signed-off-by: Alexander Aring <aahringo@redhat.com>
> ---
>  fs/dlm/plock.c | 18 ++++--------------
>  1 file changed, 4 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
> index 00e1d802a81c..0094fa4004cc 100644
> --- a/fs/dlm/plock.c
> +++ b/fs/dlm/plock.c
> @@ -145,6 +145,7 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
>  	op->info.number		= number;
>  	op->info.start		= fl->fl_start;
>  	op->info.end		= fl->fl_end;
> +	op->info.owner = (__u64)(long)fl->fl_owner;
>  	/* async handling */
>  	if (fl->fl_lmops && fl->fl_lmops->lm_grant) {
>  		op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
> @@ -154,9 +155,6 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
>  			goto out;
>  		}
>  
> -		/* fl_owner is lockd which doesn't distinguish
> -		   processes on the nfs client */
> -		op->info.owner	= (__u64) fl->fl_pid;
>  		op_data->callback = fl->fl_lmops->lm_grant;
>  		locks_init_lock(&op_data->flc);
>  		locks_copy_lock(&op_data->flc, fl);
> @@ -168,8 +166,6 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
>  		send_op(op);
>  		rv = FILE_LOCK_DEFERRED;
>  		goto out;
> -	} else {
> -		op->info.owner	= (__u64)(long) fl->fl_owner;
>  	}
>  
>  	send_op(op);
> @@ -326,10 +322,7 @@ int dlm_posix_unlock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
>  	op->info.number		= number;
>  	op->info.start		= fl->fl_start;
>  	op->info.end		= fl->fl_end;
> -	if (fl->fl_lmops && fl->fl_lmops->lm_grant)
> -		op->info.owner	= (__u64) fl->fl_pid;
> -	else
> -		op->info.owner	= (__u64)(long) fl->fl_owner;
> +	op->info.owner = (__u64)(long)fl->fl_owner;
>  
>  	if (fl->fl_flags & FL_CLOSE) {
>  		op->info.flags |= DLM_PLOCK_FL_CLOSE;
> @@ -389,7 +382,7 @@ int dlm_posix_cancel(dlm_lockspace_t *lockspace, u64 number, struct file *file,
>  	info.number = number;
>  	info.start = fl->fl_start;
>  	info.end = fl->fl_end;
> -	info.owner = (__u64)fl->fl_pid;
> +	info.owner = (__u64)(long)fl->fl_owner;
>  
>  	rv = do_lock_cancel(&info);
>  	switch (rv) {
> @@ -450,10 +443,7 @@ int dlm_posix_get(dlm_lockspace_t *lockspace, u64 number, struct file *file,
>  	op->info.number		= number;
>  	op->info.start		= fl->fl_start;
>  	op->info.end		= fl->fl_end;
> -	if (fl->fl_lmops && fl->fl_lmops->lm_grant)
> -		op->info.owner	= (__u64) fl->fl_pid;
> -	else
> -		op->info.owner	= (__u64)(long) fl->fl_owner;
> +	op->info.owner = (__u64)(long)fl->fl_owner;
>  
>  	send_op(op);
>  	wait_event(recv_wq, (op->done != 0));

This is the way.

Acked-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFCv2 6/7] dlm: use FL_SLEEP to check if blocking request
  2023-08-14 21:11 ` [RFCv2 6/7] dlm: use FL_SLEEP to check if blocking request Alexander Aring
@ 2023-08-16 13:07   ` Jeff Layton
  2023-08-17  1:19     ` Alexander Aring
  0 siblings, 1 reply; 22+ messages in thread
From: Jeff Layton @ 2023-08-16 13:07 UTC (permalink / raw)
  To: Alexander Aring, linux-nfs
  Cc: cluster-devel, ocfs2-devel, linux-fsdevel, teigland, rpeterso,
	agruenba, trond.myklebust, anna, chuck.lever

On Mon, 2023-08-14 at 17:11 -0400, Alexander Aring wrote:
> This patch uses the FL_SLEEP flag in struct file_lock to check if it's a
> blocking request in case if the request coming from nfs lockd process
> indicated by lm_grant() is set.
> 
> IF FL_SLEEP is set a asynchronous blocking request is being made and
> it's waiting for lm_grant() callback being called to signal the lock was
> granted. If it's not set a synchronous non-blocking request is being made.
> 
> Signed-off-by: Alexander Aring <aahringo@redhat.com>
> ---
>  fs/dlm/plock.c | 38 ++++++++++++++++++++++----------------
>  1 file changed, 22 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
> index 0094fa4004cc..524771002a2f 100644
> --- a/fs/dlm/plock.c
> +++ b/fs/dlm/plock.c
> @@ -140,7 +140,6 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
>  	op->info.optype		= DLM_PLOCK_OP_LOCK;
>  	op->info.pid		= fl->fl_pid;
>  	op->info.ex		= (fl->fl_type == F_WRLCK);
> -	op->info.wait		= IS_SETLKW(cmd);
>  	op->info.fsid		= ls->ls_global_id;
>  	op->info.number		= number;
>  	op->info.start		= fl->fl_start;
> @@ -148,24 +147,31 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
>  	op->info.owner = (__u64)(long)fl->fl_owner;
>  	/* async handling */
>  	if (fl->fl_lmops && fl->fl_lmops->lm_grant) {
> -		op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
> -		if (!op_data) {
> -			dlm_release_plock_op(op);
> -			rv = -ENOMEM;
> -			goto out;
> -		}
> +		if (fl->fl_flags & FL_SLEEP) {
> +			op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
> +			if (!op_data) {
> +				dlm_release_plock_op(op);
> +				rv = -ENOMEM;
> +				goto out;
> +			}
>  
> -		op_data->callback = fl->fl_lmops->lm_grant;
> -		locks_init_lock(&op_data->flc);
> -		locks_copy_lock(&op_data->flc, fl);
> -		op_data->fl		= fl;
> -		op_data->file	= file;
> +			op->info.wait = 1;
> +			op_data->callback = fl->fl_lmops->lm_grant;
> +			locks_init_lock(&op_data->flc);
> +			locks_copy_lock(&op_data->flc, fl);
> +			op_data->fl		= fl;
> +			op_data->file	= file;
>  
> -		op->data = op_data;
> +			op->data = op_data;
>  
> -		send_op(op);
> -		rv = FILE_LOCK_DEFERRED;
> -		goto out;
> +			send_op(op);
> +			rv = FILE_LOCK_DEFERRED;
> +			goto out;

A question...we're returning FILE_LOCK_DEFERRED after the DLM request is
sent. If it ends up being blocked, what happens? Does it do a lm_grant
downcall with -EAGAIN or something as the result?


> +		} else {
> +			op->info.wait = 0;
> +		}
> +	} else {
> +		op->info.wait = IS_SETLKW(cmd);
>  	}
>  
>  	send_op(op);

Looks reasonable overall.

Now that I look, we have quite a number of places in the kernel that
seem to check for F_SETLKW, when what they really want is to check
FL_SLEEP.
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFCv2 6/7] dlm: use FL_SLEEP to check if blocking request
  2023-08-16 13:07   ` Jeff Layton
@ 2023-08-17  1:19     ` Alexander Aring
  2023-08-17 11:27       ` Jeff Layton
  0 siblings, 1 reply; 22+ messages in thread
From: Alexander Aring @ 2023-08-17  1:19 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-nfs, cluster-devel, ocfs2-devel, linux-fsdevel, teigland,
	rpeterso, agruenba, trond.myklebust, anna, chuck.lever

Hi,

On Wed, Aug 16, 2023 at 9:07 AM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Mon, 2023-08-14 at 17:11 -0400, Alexander Aring wrote:
> > This patch uses the FL_SLEEP flag in struct file_lock to check if it's a
> > blocking request in case if the request coming from nfs lockd process
> > indicated by lm_grant() is set.
> >
> > IF FL_SLEEP is set a asynchronous blocking request is being made and
> > it's waiting for lm_grant() callback being called to signal the lock was
> > granted. If it's not set a synchronous non-blocking request is being made.
> >
> > Signed-off-by: Alexander Aring <aahringo@redhat.com>
> > ---
> >  fs/dlm/plock.c | 38 ++++++++++++++++++++++----------------
> >  1 file changed, 22 insertions(+), 16 deletions(-)
> >
> > diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
> > index 0094fa4004cc..524771002a2f 100644
> > --- a/fs/dlm/plock.c
> > +++ b/fs/dlm/plock.c
> > @@ -140,7 +140,6 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
> >       op->info.optype         = DLM_PLOCK_OP_LOCK;
> >       op->info.pid            = fl->fl_pid;
> >       op->info.ex             = (fl->fl_type == F_WRLCK);
> > -     op->info.wait           = IS_SETLKW(cmd);
> >       op->info.fsid           = ls->ls_global_id;
> >       op->info.number         = number;
> >       op->info.start          = fl->fl_start;
> > @@ -148,24 +147,31 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
> >       op->info.owner = (__u64)(long)fl->fl_owner;
> >       /* async handling */
> >       if (fl->fl_lmops && fl->fl_lmops->lm_grant) {
> > -             op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
> > -             if (!op_data) {
> > -                     dlm_release_plock_op(op);
> > -                     rv = -ENOMEM;
> > -                     goto out;
> > -             }
> > +             if (fl->fl_flags & FL_SLEEP) {
> > +                     op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
> > +                     if (!op_data) {
> > +                             dlm_release_plock_op(op);
> > +                             rv = -ENOMEM;
> > +                             goto out;
> > +                     }
> >
> > -             op_data->callback = fl->fl_lmops->lm_grant;
> > -             locks_init_lock(&op_data->flc);
> > -             locks_copy_lock(&op_data->flc, fl);
> > -             op_data->fl             = fl;
> > -             op_data->file   = file;
> > +                     op->info.wait = 1;
> > +                     op_data->callback = fl->fl_lmops->lm_grant;
> > +                     locks_init_lock(&op_data->flc);
> > +                     locks_copy_lock(&op_data->flc, fl);
> > +                     op_data->fl             = fl;
> > +                     op_data->file   = file;
> >
> > -             op->data = op_data;
> > +                     op->data = op_data;
> >
> > -             send_op(op);
> > -             rv = FILE_LOCK_DEFERRED;
> > -             goto out;
> > +                     send_op(op);
> > +                     rv = FILE_LOCK_DEFERRED;
> > +                     goto out;
>
> A question...we're returning FILE_LOCK_DEFERRED after the DLM request is
> sent. If it ends up being blocked, what happens? Does it do a lm_grant
> downcall with -EAGAIN or something as the result?
>

no, when info->wait is set then it is a blocked lock request, which
means lm_grant() will be called when the lock request is granted.

>
> > +             } else {
> > +                     op->info.wait = 0;
> > +             }
> > +     } else {
> > +             op->info.wait = IS_SETLKW(cmd);
> >       }
> >
> >       send_op(op);
>
> Looks reasonable overall.
>
> Now that I look, we have quite a number of places in the kernel that
> seem to check for F_SETLKW, when what they really want is to check
> FL_SLEEP.

Yes, so far I understand FL_SLEEP is F_SETLKW when you get only
F_SETLK in case of fl->fl_lmops && fl->fl_lmops->lm_grant is true. It
is confusing but this is how it works... if it's not set we will get
F_SETLKW and this should imply FL_SLEEP is set.

- Alex


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFCv2 4/7] locks: update lock callback documentation
  2023-08-16 12:01   ` Jeff Layton
@ 2023-08-17  1:23     ` Alexander Aring
  0 siblings, 0 replies; 22+ messages in thread
From: Alexander Aring @ 2023-08-17  1:23 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-nfs, cluster-devel, ocfs2-devel, linux-fsdevel, teigland,
	rpeterso, agruenba, trond.myklebust, anna, chuck.lever

Hi,

On Wed, Aug 16, 2023 at 8:01 AM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Mon, 2023-08-14 at 17:11 -0400, Alexander Aring wrote:
> > This patch updates the existing documentation regarding recent changes
> > to vfs_lock_file() and lm_grant() is set. In case of lm_grant() is set
> > we only handle FILE_LOCK_DEFERRED in case of FL_SLEEP in fl_flags is not
> > set. This is the case of an blocking lock request. Non-blocking lock
> > requests, when FL_SLEEP is not set, are handled in a synchronized way.
> >
> > Signed-off-by: Alexander Aring <aahringo@redhat.com>
> > ---
> >  fs/locks.c | 28 ++++++++++++++--------------
> >  1 file changed, 14 insertions(+), 14 deletions(-)
> >
> > diff --git a/fs/locks.c b/fs/locks.c
> > index df8b26a42524..a8e51f462b43 100644
> > --- a/fs/locks.c
> > +++ b/fs/locks.c
> > @@ -2255,21 +2255,21 @@ int fcntl_getlk(struct file *filp, unsigned int cmd, struct flock *flock)
> >   * To avoid blocking kernel daemons, such as lockd, that need to acquire POSIX
> >   * locks, the ->lock() interface may return asynchronously, before the lock has
> >   * been granted or denied by the underlying filesystem, if (and only if)
> > - * lm_grant is set. Callers expecting ->lock() to return asynchronously
> > - * will only use F_SETLK, not F_SETLKW; they will set FL_SLEEP if (and only if)
> > - * the request is for a blocking lock. When ->lock() does return asynchronously,
> > - * it must return FILE_LOCK_DEFERRED, and call ->lm_grant() when the lock
> > - * request completes.
> > - * If the request is for non-blocking lock the file system should return
> > - * FILE_LOCK_DEFERRED then try to get the lock and call the callback routine
> > - * with the result. If the request timed out the callback routine will return a
> > + * lm_grant and FL_SLEEP in fl_flags is set. Callers expecting ->lock() to return
> > + * asynchronously will only use F_SETLK, not F_SETLKW; When ->lock() does return
>
> Isn't the above backward? Shouldn't it say "Callers expecting ->lock()
> to return asynchronously will only use F_SETLKW, not F_SETLK" ?
>

So far I know lockd will always use F_SETLK only, if it's a blocking
or non-blocking request you need to evaluate FL_SLEEP. But if
lm_grant() is not set we are using a check on cmd if it's F_SETLK or
F_SETLKW to check if it's non-blocking or blocking.

If lm_grant() is set and checking on F_SETLKW should never be the
case, because it will never be true (speaking from lockd point of
view).

- Alex


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFCv2 2/7] lockd: FILE_LOCK_DEFERRED only on FL_SLEEP
  2023-08-16 11:37   ` Jeff Layton
@ 2023-08-17  1:40     ` Alexander Aring
  0 siblings, 0 replies; 22+ messages in thread
From: Alexander Aring @ 2023-08-17  1:40 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-nfs, cluster-devel, ocfs2-devel, linux-fsdevel, teigland,
	rpeterso, agruenba, trond.myklebust, anna, chuck.lever

Hi,

On Wed, Aug 16, 2023 at 7:37 AM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Mon, 2023-08-14 at 17:11 -0400, Alexander Aring wrote:
> > This patch removes to handle non-blocking lock requests as asynchronous
> > lock request returning FILE_LOCK_DEFERRED. When fl_lmops and lm_grant()
> > is set and a non-blocking lock request returns FILE_LOCK_DEFERRED will
> > end in an WARNING to signal the user the misusage of the API.
> >
>
> Probably need to rephrase the word salad in the first sentence of the
> commit log. I had to go over it a few times to understand what was going
> on here.
>

ok. I will go over it again.

> In any case, I'm guessing that the idea here is that GFS2/DLM shouldn't
> ever return FILE_LOCK_DEFERRED if this is a non-wait request (i.e.
> someone called F_SETLK instead of F_SETLKW)?
>

Yes, non-wait requests (meaning trylock) does not return
FILE_LOCK_DEFERRED. I added in some patch a WARN_ON() if this would be
the case.

However I mentioned in other patches there is this mismatch between
F_SETLK from lockd and figure out if it's wait and non-wait if
FL_SLEEP is set, but somehow if it's not coming from lockd (lm_grant
is not set) it's going over the cmd if it's F_SETLKW. So far I
understand DLM should never make this decision over the F_SETLK vs
F_SETLKW it should always check on FL_SLEEP. I can change it to use
FL_SLEEP only.

> That may be ok, but again, lockd goes to great lengths to avoid blocking
> and I think it's generally a good idea. If an upcall to DLM can take a
> long time, it might be a good idea to continue to allow a !wait request
> to return FILE_LOCK_DEFERRED.
>

In the case of DLM there is no difference between upcall/downcall if
lockd does other operations like unlock/cancellation requests. We
don't do the optimization there, why are we doing it for !wait
requests... but okay I can change it.

> I guess this really depends on the current behavior today though. Does
> DLM ever return FILE_LOCK_DEFERRED on a non-blocking lock request?
>

I change it so that it doesn't do it, but I can change it !wait
requests will return FILE_LOCK_DEFERRED and be handled asynchronously.

- Alex

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFCv2 6/7] dlm: use FL_SLEEP to check if blocking request
  2023-08-17  1:19     ` Alexander Aring
@ 2023-08-17 11:27       ` Jeff Layton
  2023-08-17 13:02         ` Alexander Aring
  0 siblings, 1 reply; 22+ messages in thread
From: Jeff Layton @ 2023-08-17 11:27 UTC (permalink / raw)
  To: Alexander Aring
  Cc: linux-nfs, cluster-devel, ocfs2-devel, linux-fsdevel, teigland,
	rpeterso, agruenba, trond.myklebust, anna, chuck.lever

On Wed, 2023-08-16 at 21:19 -0400, Alexander Aring wrote:
> Hi,
> 
> On Wed, Aug 16, 2023 at 9:07 AM Jeff Layton <jlayton@kernel.org> wrote:
> > 
> > On Mon, 2023-08-14 at 17:11 -0400, Alexander Aring wrote:
> > > This patch uses the FL_SLEEP flag in struct file_lock to check if it's a
> > > blocking request in case if the request coming from nfs lockd process
> > > indicated by lm_grant() is set.
> > > 
> > > IF FL_SLEEP is set a asynchronous blocking request is being made and
> > > it's waiting for lm_grant() callback being called to signal the lock was
> > > granted. If it's not set a synchronous non-blocking request is being made.
> > > 
> > > Signed-off-by: Alexander Aring <aahringo@redhat.com>
> > > ---
> > >  fs/dlm/plock.c | 38 ++++++++++++++++++++++----------------
> > >  1 file changed, 22 insertions(+), 16 deletions(-)
> > > 
> > > diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
> > > index 0094fa4004cc..524771002a2f 100644
> > > --- a/fs/dlm/plock.c
> > > +++ b/fs/dlm/plock.c
> > > @@ -140,7 +140,6 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
> > >       op->info.optype         = DLM_PLOCK_OP_LOCK;
> > >       op->info.pid            = fl->fl_pid;
> > >       op->info.ex             = (fl->fl_type == F_WRLCK);
> > > -     op->info.wait           = IS_SETLKW(cmd);
> > >       op->info.fsid           = ls->ls_global_id;
> > >       op->info.number         = number;
> > >       op->info.start          = fl->fl_start;
> > > @@ -148,24 +147,31 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
> > >       op->info.owner = (__u64)(long)fl->fl_owner;
> > >       /* async handling */
> > >       if (fl->fl_lmops && fl->fl_lmops->lm_grant) {
> > > -             op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
> > > -             if (!op_data) {
> > > -                     dlm_release_plock_op(op);
> > > -                     rv = -ENOMEM;
> > > -                     goto out;
> > > -             }
> > > +             if (fl->fl_flags & FL_SLEEP) {
> > > +                     op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
> > > +                     if (!op_data) {
> > > +                             dlm_release_plock_op(op);
> > > +                             rv = -ENOMEM;
> > > +                             goto out;
> > > +                     }
> > > 
> > > -             op_data->callback = fl->fl_lmops->lm_grant;
> > > -             locks_init_lock(&op_data->flc);
> > > -             locks_copy_lock(&op_data->flc, fl);
> > > -             op_data->fl             = fl;
> > > -             op_data->file   = file;
> > > +                     op->info.wait = 1;
> > > +                     op_data->callback = fl->fl_lmops->lm_grant;
> > > +                     locks_init_lock(&op_data->flc);
> > > +                     locks_copy_lock(&op_data->flc, fl);
> > > +                     op_data->fl             = fl;
> > > +                     op_data->file   = file;
> > > 
> > > -             op->data = op_data;
> > > +                     op->data = op_data;
> > > 
> > > -             send_op(op);
> > > -             rv = FILE_LOCK_DEFERRED;
> > > -             goto out;
> > > +                     send_op(op);
> > > +                     rv = FILE_LOCK_DEFERRED;
> > > +                     goto out;
> > 
> > A question...we're returning FILE_LOCK_DEFERRED after the DLM request is
> > sent. If it ends up being blocked, what happens? Does it do a lm_grant
> > downcall with -EAGAIN or something as the result?
> > 
> 
> no, when info->wait is set then it is a blocked lock request, which
> means lm_grant() will be called when the lock request is granted.
> 

Ok, that's probably problematic with the current code too. lockd will
time out the block after 7s, so if the lock isn't granted in that time
it'll give up on it.

> > 
> > > +             } else {
> > > +                     op->info.wait = 0;
> > > +             }
> > > +     } else {
> > > +             op->info.wait = IS_SETLKW(cmd);
> > >       }
> > > 
> > >       send_op(op);
> > 
> > Looks reasonable overall.
> > 
> > Now that I look, we have quite a number of places in the kernel that
> > seem to check for F_SETLKW, when what they really want is to check
> > FL_SLEEP.
> 
> Yes, so far I understand FL_SLEEP is F_SETLKW when you get only
> F_SETLK in case of fl->fl_lmops && fl->fl_lmops->lm_grant is true. It
> is confusing but this is how it works... if it's not set we will get
> F_SETLKW and this should imply FL_SLEEP is set.
> 
> 

Yeah. Might be good to consider how to make this more consistent across
the kernel.
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFCv2 6/7] dlm: use FL_SLEEP to check if blocking request
  2023-08-17 11:27       ` Jeff Layton
@ 2023-08-17 13:02         ` Alexander Aring
  0 siblings, 0 replies; 22+ messages in thread
From: Alexander Aring @ 2023-08-17 13:02 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-nfs, cluster-devel, ocfs2-devel, linux-fsdevel, teigland,
	rpeterso, agruenba, trond.myklebust, anna, chuck.lever

Hi,

On Thu, Aug 17, 2023 at 7:27 AM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Wed, 2023-08-16 at 21:19 -0400, Alexander Aring wrote:
> > Hi,
> >
> > On Wed, Aug 16, 2023 at 9:07 AM Jeff Layton <jlayton@kernel.org> wrote:
> > >
> > > On Mon, 2023-08-14 at 17:11 -0400, Alexander Aring wrote:
> > > > This patch uses the FL_SLEEP flag in struct file_lock to check if it's a
> > > > blocking request in case if the request coming from nfs lockd process
> > > > indicated by lm_grant() is set.
> > > >
> > > > IF FL_SLEEP is set a asynchronous blocking request is being made and
> > > > it's waiting for lm_grant() callback being called to signal the lock was
> > > > granted. If it's not set a synchronous non-blocking request is being made.
> > > >
> > > > Signed-off-by: Alexander Aring <aahringo@redhat.com>
> > > > ---
> > > >  fs/dlm/plock.c | 38 ++++++++++++++++++++++----------------
> > > >  1 file changed, 22 insertions(+), 16 deletions(-)
> > > >
> > > > diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
> > > > index 0094fa4004cc..524771002a2f 100644
> > > > --- a/fs/dlm/plock.c
> > > > +++ b/fs/dlm/plock.c
> > > > @@ -140,7 +140,6 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
> > > >       op->info.optype         = DLM_PLOCK_OP_LOCK;
> > > >       op->info.pid            = fl->fl_pid;
> > > >       op->info.ex             = (fl->fl_type == F_WRLCK);
> > > > -     op->info.wait           = IS_SETLKW(cmd);
> > > >       op->info.fsid           = ls->ls_global_id;
> > > >       op->info.number         = number;
> > > >       op->info.start          = fl->fl_start;
> > > > @@ -148,24 +147,31 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
> > > >       op->info.owner = (__u64)(long)fl->fl_owner;
> > > >       /* async handling */
> > > >       if (fl->fl_lmops && fl->fl_lmops->lm_grant) {
> > > > -             op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
> > > > -             if (!op_data) {
> > > > -                     dlm_release_plock_op(op);
> > > > -                     rv = -ENOMEM;
> > > > -                     goto out;
> > > > -             }
> > > > +             if (fl->fl_flags & FL_SLEEP) {
> > > > +                     op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
> > > > +                     if (!op_data) {
> > > > +                             dlm_release_plock_op(op);
> > > > +                             rv = -ENOMEM;
> > > > +                             goto out;
> > > > +                     }
> > > >
> > > > -             op_data->callback = fl->fl_lmops->lm_grant;
> > > > -             locks_init_lock(&op_data->flc);
> > > > -             locks_copy_lock(&op_data->flc, fl);
> > > > -             op_data->fl             = fl;
> > > > -             op_data->file   = file;
> > > > +                     op->info.wait = 1;
> > > > +                     op_data->callback = fl->fl_lmops->lm_grant;
> > > > +                     locks_init_lock(&op_data->flc);
> > > > +                     locks_copy_lock(&op_data->flc, fl);
> > > > +                     op_data->fl             = fl;
> > > > +                     op_data->file   = file;
> > > >
> > > > -             op->data = op_data;
> > > > +                     op->data = op_data;
> > > >
> > > > -             send_op(op);
> > > > -             rv = FILE_LOCK_DEFERRED;
> > > > -             goto out;
> > > > +                     send_op(op);
> > > > +                     rv = FILE_LOCK_DEFERRED;
> > > > +                     goto out;
> > >
> > > A question...we're returning FILE_LOCK_DEFERRED after the DLM request is
> > > sent. If it ends up being blocked, what happens? Does it do a lm_grant
> > > downcall with -EAGAIN or something as the result?
> > >
> >
> > no, when info->wait is set then it is a blocked lock request, which
> > means lm_grant() will be called when the lock request is granted.
> >
>
> Ok, that's probably problematic with the current code too. lockd will
> time out the block after 7s, so if the lock isn't granted in that time
> it'll give up on it.
>

that's why blocked lock requests (meaning F_SETLKW, FL_SLEEP is set)
need to use NLM_NEVER? It will never give up on it.

I thought NLM_TIMEOUT is just the polling frequency for doing a
blocked request by doing non-blocking requests. Means trylocks in a
loop with a polling frequency.

> > >
> > > > +             } else {
> > > > +                     op->info.wait = 0;
> > > > +             }
> > > > +     } else {
> > > > +             op->info.wait = IS_SETLKW(cmd);
> > > >       }
> > > >
> > > >       send_op(op);
> > >
> > > Looks reasonable overall.
> > >
> > > Now that I look, we have quite a number of places in the kernel that
> > > seem to check for F_SETLKW, when what they really want is to check
> > > FL_SLEEP.
> >
> > Yes, so far I understand FL_SLEEP is F_SETLKW when you get only
> > F_SETLK in case of fl->fl_lmops && fl->fl_lmops->lm_grant is true. It
> > is confusing but this is how it works... if it's not set we will get
> > F_SETLKW and this should imply FL_SLEEP is set.
> >
> >
>
> Yeah. Might be good to consider how to make this more consistent across
> the kernel.

ok. I will do some grep and try to find those places.

- Alex


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFCv2 1/7] lockd: fix race in async lock request handling
  2023-08-15 17:49   ` Jeff Layton
  2023-08-15 18:21     ` Jeff Layton
@ 2023-08-17 18:36     ` Alexander Aring
  1 sibling, 0 replies; 22+ messages in thread
From: Alexander Aring @ 2023-08-17 18:36 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-nfs, cluster-devel, ocfs2-devel, linux-fsdevel, teigland,
	rpeterso, agruenba, trond.myklebust, anna, chuck.lever

Hi,

On Tue, Aug 15, 2023 at 1:49 PM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Mon, 2023-08-14 at 17:11 -0400, Alexander Aring wrote:
> > This patch fixes a race in async lock request handling between adding
> > the relevant struct nlm_block to nlm_blocked list after the request was
> > sent by vfs_lock_file() and nlmsvc_grant_deferred() does a lookup of the
> > nlm_block in the nlm_blocked list. It could be that the async request is
> > completed before the nlm_block was added to the list. This would end
> > in a -ENOENT and a kernel log message of "lockd: grant for unknown
> > block".
> >
> > To solve this issue we add the nlm_block before the vfs_lock_file() call
> > to be sure it has been added when a possible nlmsvc_grant_deferred() is
> > called. If the vfs_lock_file() results in an case when it wouldn't be
> > added to nlm_blocked list, the nlm_block struct will be removed from
> > this list again.
> >
> > The introducing of the new B_PENDING_CALLBACK nlm_block flag will handle
> > async lock requests on a pending lock requests as a retry on the caller
> > level to hit the -EAGAIN case.
> >
> > Signed-off-by: Alexander Aring <aahringo@redhat.com>
> > ---
> >  fs/lockd/svclock.c          | 100 ++++++++++++++++++++++++++----------
> >  include/linux/lockd/lockd.h |   2 +
> >  2 files changed, 74 insertions(+), 28 deletions(-)
> >
> > diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
> > index c43ccdf28ed9..7d63524bdb81 100644
> > --- a/fs/lockd/svclock.c
> > +++ b/fs/lockd/svclock.c
> > @@ -133,6 +133,7 @@ nlmsvc_remove_block(struct nlm_block *block)
> >  {
> >       if (!list_empty(&block->b_list)) {
> >               spin_lock(&nlm_blocked_lock);
> > +             block->b_flags &= ~B_PENDING_CALLBACK;
> >               list_del_init(&block->b_list);
> >               spin_unlock(&nlm_blocked_lock);
> >               nlmsvc_release_block(block);
> > @@ -232,6 +233,7 @@ nlmsvc_create_block(struct svc_rqst *rqstp, struct nlm_host *host,
> >       kref_init(&block->b_count);
> >       INIT_LIST_HEAD(&block->b_list);
> >       INIT_LIST_HEAD(&block->b_flist);
> > +     mutex_init(&block->b_cb_mutex);
> >
> >       if (!nlmsvc_setgrantargs(call, lock))
> >               goto failed_free;
> > @@ -289,6 +291,8 @@ static void nlmsvc_free_block(struct kref *kref)
> >
> >       dprintk("lockd: freeing block %p...\n", block);
> >
> > +     WARN_ON_ONCE(block->b_flags & B_PENDING_CALLBACK);
> > +
> >       /* Remove block from file's list of blocks */
> >       list_del_init(&block->b_flist);
> >       mutex_unlock(&file->f_mutex);
> > @@ -532,6 +536,13 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
> >               goto out;
> >       }
> >
> > +     mutex_lock(&block->b_cb_mutex);
> > +     if (block->b_flags & B_PENDING_CALLBACK)
> > +             goto pending_request;
> > +
> > +     /* Append to list of blocked */
> > +     nlmsvc_insert_block(block, NLM_NEVER);
> > +
> >       if (!wait)
> >               lock->fl.fl_flags &= ~FL_SLEEP;
> >       mode = lock_to_openmode(&lock->fl);
> > @@ -541,9 +552,13 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
> >       dprintk("lockd: vfs_lock_file returned %d\n", error);
> >       switch (error) {
> >               case 0:
> > +                     nlmsvc_remove_block(block);
> >                       ret = nlm_granted;
> > -                     goto out;
> > +                     goto out_cb_mutex;
> >               case -EAGAIN:
> > +                     if (!wait)
> > +                             nlmsvc_remove_block(block);
> > +pending_request:
> >                       /*
> >                        * If this is a blocking request for an
> >                        * already pending lock request then we need
> > @@ -552,26 +567,29 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
> >                       if (wait)
> >                               break;
> >                       ret = async_block ? nlm_lck_blocked : nlm_lck_denied;
> > -                     goto out;
> > +                     goto out_cb_mutex;
> >               case FILE_LOCK_DEFERRED:
> > +                     block->b_flags |= B_PENDING_CALLBACK;
> > +
> >                       if (wait)
> >                               break;
> >                       /* Filesystem lock operation is in progress
> >                          Add it to the queue waiting for callback */
> >                       ret = nlmsvc_defer_lock_rqst(rqstp, block);
> > -                     goto out;
> > +                     goto out_cb_mutex;
> >               case -EDEADLK:
> > +                     nlmsvc_remove_block(block);
> >                       ret = nlm_deadlock;
> > -                     goto out;
> > +                     goto out_cb_mutex;
> >               default:                        /* includes ENOLCK */
> > +                     nlmsvc_remove_block(block);
> >                       ret = nlm_lck_denied_nolocks;
> > -                     goto out;
> > +                     goto out_cb_mutex;
> >       }
> >
> >       ret = nlm_lck_blocked;
> > -
> > -     /* Append to list of blocked */
> > -     nlmsvc_insert_block(block, NLM_NEVER);
> > +out_cb_mutex:
> > +     mutex_unlock(&block->b_cb_mutex);
> >  out:
> >       mutex_unlock(&file->f_mutex);
> >       nlmsvc_release_block(block);
> > @@ -728,34 +746,60 @@ nlmsvc_update_deferred_block(struct nlm_block *block, int result)
> >               block->b_flags |= B_TIMED_OUT;
> >  }
> >
> > +static int __nlmsvc_grant_deferred(struct nlm_block *block,
> > +                                struct file_lock *fl,
> > +                                int result)
> > +{
> > +     int rc = 0;
> > +
> > +     dprintk("lockd: nlmsvc_notify_blocked block %p flags %d\n",
> > +                                     block, block->b_flags);
> > +     if (block->b_flags & B_QUEUED) {
> > +             if (block->b_flags & B_TIMED_OUT) {
> > +                     rc = -ENOLCK;
> > +                     goto out;
> > +             }
> > +             nlmsvc_update_deferred_block(block, result);
> > +     } else if (result == 0)
> > +             block->b_granted = 1;
> > +
> > +     nlmsvc_insert_block_locked(block, 0);
> > +     svc_wake_up(block->b_daemon);
> > +out:
> > +     return rc;
> > +}
> > +
> >  static int nlmsvc_grant_deferred(struct file_lock *fl, int result)
> >  {
> > -     struct nlm_block *block;
> > -     int rc = -ENOENT;
> > +     struct nlm_block *iter, *block = NULL;
> > +     int rc;
> >
> >       spin_lock(&nlm_blocked_lock);
> > -     list_for_each_entry(block, &nlm_blocked, b_list) {
> > -             if (nlm_compare_locks(&block->b_call->a_args.lock.fl, fl)) {
> > -                     dprintk("lockd: nlmsvc_notify_blocked block %p flags %d\n",
> > -                                                     block, block->b_flags);
> > -                     if (block->b_flags & B_QUEUED) {
> > -                             if (block->b_flags & B_TIMED_OUT) {
> > -                                     rc = -ENOLCK;
> > -                                     break;
> > -                             }
> > -                             nlmsvc_update_deferred_block(block, result);
> > -                     } else if (result == 0)
> > -                             block->b_granted = 1;
> > -
> > -                     nlmsvc_insert_block_locked(block, 0);
> > -                     svc_wake_up(block->b_daemon);
> > -                     rc = 0;
> > +     list_for_each_entry(iter, &nlm_blocked, b_list) {
> > +             if (nlm_compare_locks(&iter->b_call->a_args.lock.fl, fl)) {
> > +                     kref_get(&iter->b_count);
> > +                     block = iter;
> >                       break;
> >               }
> >       }
> >       spin_unlock(&nlm_blocked_lock);
> > -     if (rc == -ENOENT)
> > -             printk(KERN_WARNING "lockd: grant for unknown block\n");
> > +
> > +     if (!block) {
> > +             pr_warn("lockd: grant for unknown pending block\n");
> > +             return -ENOENT;
> > +     }
> > +
> > +     /* don't interfere with nlmsvc_lock() */
> > +     mutex_lock(&block->b_cb_mutex);
> > +     block->b_flags &= ~B_PENDING_CALLBACK;
> > +
> > +     spin_lock(&nlm_blocked_lock);
> > +     WARN_ON_ONCE(list_empty(&block->b_list));
> > +     rc = __nlmsvc_grant_deferred(block, fl, result);
> > +     spin_unlock(&nlm_blocked_lock);
> > +     mutex_unlock(&block->b_cb_mutex);
> > +
> > +     nlmsvc_release_block(block);
> >       return rc;
> >  }
> >
> > diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
> > index f42594a9efe0..91f55458f5fc 100644
> > --- a/include/linux/lockd/lockd.h
> > +++ b/include/linux/lockd/lockd.h
> > @@ -185,10 +185,12 @@ struct nlm_block {
> >       struct nlm_file *       b_file;         /* file in question */
> >       struct cache_req *      b_cache_req;    /* deferred request handling */
> >       struct cache_deferred_req * b_deferred_req
> > +     struct mutex            b_cb_mutex;     /* callback mutex */
>
> There is no mention at all of this new mutex in the changelog or
> comments. It's not at all clear to me what this is intended to protect.
> In general, with lockd being a single-threaded service, we want to avoid
> sleeping locks. This will need some clear justification.
>

The idea was to protect the PENDING flag and to wait until
nlmsvc_lock() is "mostly" done and then allowing lm_grant() callback
to proceed. Before the f_mutex was doing that, but since I made !wait
request (non-blocking requests) being handled synchronously there was
a deadlock because the callback can be run in an unknown order. It was
fixed by having a new mutex on a per nlm_block basis.

However I will remove those changes and try to find another way.


> At a glance, it looks like you're trying to use this to hold
> B_PENDING_CALLBACK steady while a lock request is being handled. That
> suggests that you're using this mutex to serialize access to a section
> of code and not one or more specific data structures. We usually like to
> avoid that sort of thing, since locks that protect arbitrary sections of
> code become difficult to work with over time.
>
> I'm going to go out on a limb here though and suggest that there is
> probably a way to solve this problem that doesn't involve adding new
> locks.
>

ok.

> >       unsigned int            b_flags;        /* block flags */
> >  #define B_QUEUED             1       /* lock queued */
> >  #define B_GOT_CALLBACK               2       /* got lock or conflicting lock */
> >  #define B_TIMED_OUT          4       /* filesystem too slow to respond */
> > +#define B_PENDING_CALLBACK   8       /* pending callback for lock request */
> >  };
> >
> >  /*
>
> Do we need this new flag at all? It seems redundant. If we have a block
> on the list, then it is sort of by definition "pending callback". If
> it's not on the list anymore, then it's not. No?
>

I will try to find another way. A pending callback is more a check on
if it's a wait request (blocking request) and is it on the list or
not.

This flag was also there to avoid multiple wait requests while a wait
request is pending. I ran into this case and the DLM implementation
was never able to handle it. I can return -EAGAIN but DLM lock() needs
to keep track of those things, I prefer to handle this situation in
the upper layer.

I am not sure why lockd do multiple wait requests for the same
nlm_block when it's pending and waiting for lm_grant() callback. I
will split this handling out of this patch as it is something what
doesn't belong to the race fix between request and lm_grant()
callback.

- Alex


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFCv2 1/7] lockd: fix race in async lock request handling
  2023-08-15 18:21     ` Jeff Layton
@ 2023-08-17 18:39       ` Alexander Aring
  0 siblings, 0 replies; 22+ messages in thread
From: Alexander Aring @ 2023-08-17 18:39 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-nfs, cluster-devel, ocfs2-devel, linux-fsdevel, teigland,
	rpeterso, agruenba, trond.myklebust, anna, chuck.lever

Hi,

On Tue, Aug 15, 2023 at 2:21 PM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Tue, 2023-08-15 at 13:49 -0400, Jeff Layton wrote:
> > On Mon, 2023-08-14 at 17:11 -0400, Alexander Aring wrote:
> > > This patch fixes a race in async lock request handling between adding
> > > the relevant struct nlm_block to nlm_blocked list after the request was
> > > sent by vfs_lock_file() and nlmsvc_grant_deferred() does a lookup of the
> > > nlm_block in the nlm_blocked list. It could be that the async request is
> > > completed before the nlm_block was added to the list. This would end
> > > in a -ENOENT and a kernel log message of "lockd: grant for unknown
> > > block".
> > >
> > > To solve this issue we add the nlm_block before the vfs_lock_file() call
> > > to be sure it has been added when a possible nlmsvc_grant_deferred() is
> > > called. If the vfs_lock_file() results in an case when it wouldn't be
> > > added to nlm_blocked list, the nlm_block struct will be removed from
> > > this list again.
> > >
> > > The introducing of the new B_PENDING_CALLBACK nlm_block flag will handle
> > > async lock requests on a pending lock requests as a retry on the caller
> > > level to hit the -EAGAIN case.
> > >
> > > Signed-off-by: Alexander Aring <aahringo@redhat.com>
> > > ---
> > >  fs/lockd/svclock.c          | 100 ++++++++++++++++++++++++++----------
> > >  include/linux/lockd/lockd.h |   2 +
> > >  2 files changed, 74 insertions(+), 28 deletions(-)
> > >
> > >
>
> [...]
>
> > > diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
> > > index f42594a9efe0..91f55458f5fc 100644
> > > --- a/include/linux/lockd/lockd.h
> > > +++ b/include/linux/lockd/lockd.h
> > > @@ -185,10 +185,12 @@ struct nlm_block {
> > >     struct nlm_file *       b_file;         /* file in question */
> > >     struct cache_req *      b_cache_req;    /* deferred request handling */
> > >     struct cache_deferred_req * b_deferred_req
> > > +   struct mutex            b_cb_mutex;     /* callback mutex */
> >
> > There is no mention at all of this new mutex in the changelog or
> > comments. It's not at all clear to me what this is intended to protect.
> > In general, with lockd being a single-threaded service, we want to avoid
> > sleeping locks. This will need some clear justification.
> >
> > At a glance, it looks like you're trying to use this to hold
> > B_PENDING_CALLBACK steady while a lock request is being handled. That
> > suggests that you're using this mutex to serialize access to a section
> > of code and not one or more specific data structures. We usually like to
> > avoid that sort of thing, since locks that protect arbitrary sections of
> > code become difficult to work with over time.
> >
> > I'm going to go out on a limb here though and suggest that there is
> > probably a way to solve this problem that doesn't involve adding new
> > locks.
> >
> > >     unsigned int            b_flags;        /* block flags */
> > >  #define B_QUEUED           1       /* lock queued */
> > >  #define B_GOT_CALLBACK             2       /* got lock or conflicting lock */
> > >  #define B_TIMED_OUT                4       /* filesystem too slow to respond */
> > > +#define B_PENDING_CALLBACK 8       /* pending callback for lock request */
> > >  };
> > >
> > >  /*
> >
> > Do we need this new flag at all? It seems redundant. If we have a block
> > on the list, then it is sort of by definition "pending callback". If
> > it's not on the list anymore, then it's not. No?
> >
>
> Do we need anything more than a patch along these lines? Note that this
> is untested, so RFC:
>
> ---------------------8<-----------------------
>
> [RFC PATCH] lockd: alternate fix for race between deferred lock and grant
>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/lockd/svclock.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
> index c43ccdf28ed9..e9a84363c26e 100644
> --- a/fs/lockd/svclock.c
> +++ b/fs/lockd/svclock.c
> @@ -446,6 +446,8 @@ nlmsvc_defer_lock_rqst(struct svc_rqst *rqstp, struct nlm_block *block)
>
>         block->b_flags |= B_QUEUED;
>
> +       /* FIXME: remove and reinsert w/o dropping spinlock */
> +       nlmsvc_remove_block(block);
>         nlmsvc_insert_block(block, NLM_TIMEOUT);
>

a insert should just be okay, because there is an atomic switch if
it's already part of nlm_blocked and it will just update the timeout
of nlm_block in the list and it's order (because nlm_blocked is kind
of sorted according their timeouts in nlm_blocked).

>         block->b_cache_req = &rqstp->rq_chandle;
> @@ -535,6 +537,9 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
>         if (!wait)
>                 lock->fl.fl_flags &= ~FL_SLEEP;
>         mode = lock_to_openmode(&lock->fl);
> +
> +       /* Append to list of blocked */
> +       nlmsvc_insert_block(block, NLM_NEVER);
>         error = vfs_lock_file(file->f_file[mode], F_SETLK, &lock->fl, NULL);
>         lock->fl.fl_flags &= ~FL_SLEEP;
>
> @@ -542,6 +547,7 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
>         switch (error) {
>                 case 0:
>                         ret = nlm_granted;
> +                       nlmsvc_remove_block(block);
>                         goto out;
>                 case -EAGAIN:
>                         /*
> @@ -552,6 +558,7 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
>                         if (wait)
>                                 break;
>                         ret = async_block ? nlm_lck_blocked : nlm_lck_denied;
> +                       nlmsvc_remove_block(block);
>                         goto out;
>                 case FILE_LOCK_DEFERRED:
>                         if (wait)
> @@ -570,8 +577,6 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
>
>         ret = nlm_lck_blocked;
>
> -       /* Append to list of blocked */
> -       nlmsvc_insert_block(block, NLM_NEVER);

ok. I will try to start with that.

- Alex


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2023-08-17 18:40 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-14 21:11 [RFCv2 0/7] fs: nfs: async lock request changes Alexander Aring
2023-08-14 21:11 ` [RFCv2 1/7] lockd: fix race in async lock request handling Alexander Aring
2023-08-15 17:49   ` Jeff Layton
2023-08-15 18:21     ` Jeff Layton
2023-08-17 18:39       ` Alexander Aring
2023-08-17 18:36     ` Alexander Aring
2023-08-14 21:11 ` [RFCv2 2/7] lockd: FILE_LOCK_DEFERRED only on FL_SLEEP Alexander Aring
2023-08-16 11:37   ` Jeff Layton
2023-08-17  1:40     ` Alexander Aring
2023-08-14 21:11 ` [RFCv2 3/7] lockd: introduce safe async lock op Alexander Aring
2023-08-16 11:43   ` Jeff Layton
2023-08-14 21:11 ` [RFCv2 4/7] locks: update lock callback documentation Alexander Aring
2023-08-16 12:01   ` Jeff Layton
2023-08-17  1:23     ` Alexander Aring
2023-08-14 21:11 ` [RFCv2 5/7] dlm: use fl_owner from lockd Alexander Aring
2023-08-16 12:02   ` Jeff Layton
2023-08-14 21:11 ` [RFCv2 6/7] dlm: use FL_SLEEP to check if blocking request Alexander Aring
2023-08-16 13:07   ` Jeff Layton
2023-08-17  1:19     ` Alexander Aring
2023-08-17 11:27       ` Jeff Layton
2023-08-17 13:02         ` Alexander Aring
2023-08-14 21:11 ` [RFCv2 7/7] dlm: implement EXPORT_OP_SAFE_ASYNC_LOCK Alexander Aring

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).