* [Cluster-devel] [PATCHv4 v6.5-rc2 0/3] fs: dlm: lock cancellation feature
@ 2023-07-20 12:22 Alexander Aring
From: Alexander Aring @ 2023-07-20 12:22 UTC
  To: cluster-devel.redhat.com

Hi,

this patch series implements a lock cancellation feature, used either
when a POSIX F_SETLKW request gets interrupted by a signal (the most
common use case) or when lockd sends an F_CANCELLK request.

As a note: the current nfs handling seems to be broken. This series
assumes that the current behaviour works; future patches will fix the
nfs handling, which seems to be broken anyway.

- Alex

changes since v4:

- zero the memory of the info structure in dlm_posix_cancel()

changes since v3:

- drop patch "fs: dlm: ignore DLM_PLOCK_FL_CLOSE flag results"; we
  assume that plock ops with the DLM_PLOCK_FL_CLOSE flag set will never
  fail.
- let DLM_PLOCK_OP_CANCEL always send a reply back. This is useful when
  the op fails, e.g. an older dlm_controld will return -EINVAL, and it
  lets us implement F_CANCELLK without holding a reference to the
  plock_op instance.
- remove DLM_PLOCK_OP_FLAG_SENT, as it was only an optimization for a
  rare case; the fact that DLM_PLOCK_OP_CANCEL now sends a reply back
  synchronizes this.
- remove DLM_PLOCK_OP_FLAG_INTERRUPTED, as it's no longer necessary;
  since we wait for the reply to DLM_PLOCK_OP_CANCEL, this special case
  no longer needs handling.
- add "fs: dlm: remove twice newline", which I noticed while doing nfs
  lockd experiments.

Alexander Aring (3):
  fs: dlm: remove twice newline
  fs: dlm: allow to F_SETLKW getting interrupted
  fs: dlm: fix F_CANCELLK to cancel pending request

 fs/dlm/plock.c                 | 164 ++++++++++++++++++++++++++-------
 fs/gfs2/file.c                 |   9 +-
 fs/ocfs2/stack_user.c          |  13 +--
 include/linux/dlm_plock.h      |   2 +
 include/uapi/linux/dlm_plock.h |   1 +
 5 files changed, 138 insertions(+), 51 deletions(-)

-- 
2.31.1



* [Cluster-devel] [PATCHv4 v6.5-rc2 1/3] fs: dlm: remove twice newline
@ 2023-07-20 12:22 ` Alexander Aring
From: Alexander Aring @ 2023-07-20 12:22 UTC
  To: cluster-devel.redhat.com

This patch removes a newline which log_print() already adds, and also
joins a string that was split across lines, which caused a checkpatch
warning.
---
 fs/dlm/plock.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index 70a4752ed913..a34f605d8505 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -240,8 +240,8 @@ static int dlm_plock_callback(struct plock_op *op)
 	rv = notify(fl, 0);
 	if (rv) {
 		/* XXX: We need to cancel the fs lock here: */
-		log_print("dlm_plock_callback: lock granted after lock request "
-			  "failed; dangling lock!\n");
+		log_print("%s: lock granted after lock request failed; dangling lock!",
+			  __func__);
 		goto out;
 	}
 
-- 
2.31.1



* [Cluster-devel] [PATCHv4 v6.5-rc2 2/3] fs: dlm: allow to F_SETLKW getting interrupted
@ 2023-07-20 12:22 ` Alexander Aring
From: Alexander Aring @ 2023-07-20 12:22 UTC
  To: cluster-devel.redhat.com

This patch implements the dlm plock F_SETLKW interruption feature. If a
blocking posix lock request gets interrupted in user space by a signal,
a cancellation request for the not yet granted lock request is sent to
the user space lock manager. The user space lock manager answers with
either zero or a negative errno code. An errno of -ENOENT signals that
there is currently no blocking lock request waiting to be granted; in
that case it was probably too late to request a cancellation and the
pending lock got granted. In any error case we will wait until the lock
is granted, as the cancellation failed. This also means that if an
older user space lock manager returns -EINVAL we will wait, as
cancellation is not supported, which should be fine. If a user requires
this feature, the dlm user space components should be updated to
support lock request cancellation.
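
For illustration only (not part of the commit), a minimal user-space
sketch of the behaviour this enables; the mount point and file name are
made up:

#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static void on_alarm(int sig)
{
	/* empty handler, only needed to interrupt fcntl() */
}

int main(void)
{
	struct sigaction sa = { .sa_handler = on_alarm };
	struct flock fl = {
		.l_type = F_WRLCK,
		.l_whence = SEEK_SET,
		.l_start = 0,
		.l_len = 0,	/* whole file */
	};
	/* hypothetical file on a dlm-backed gfs2 or ocfs2 mount */
	int fd = open("/mnt/gfs2/lockfile", O_RDWR);

	if (fd < 0)
		return 1;

	/* no SA_RESTART, so the interrupted wait is not retried */
	sigaction(SIGALRM, &sa, NULL);
	alarm(5);

	/* blocks while another node holds the lock; with this patch a
	 * signal cancels the pending request and fcntl() fails with
	 * EINTR instead of waiting until the lock is granted
	 */
	if (fcntl(fd, F_SETLKW, &fl) < 0 && errno == EINTR)
		printf("pending lock request was cancelled\n");

	close(fd);
	return 0;
}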
---
 fs/dlm/plock.c                 | 56 ++++++++++++++++++++++------------
 include/uapi/linux/dlm_plock.h |  1 +
 2 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index a34f605d8505..a8ffa0760913 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -74,30 +74,26 @@ static void send_op(struct plock_op *op)
 	wake_up(&send_wq);
 }
 
-/* If a process was killed while waiting for the only plock on a file,
-   locks_remove_posix will not see any lock on the file so it won't
-   send an unlock-close to us to pass on to userspace to clean up the
-   abandoned waiter.  So, we have to insert the unlock-close when the
-   lock call is interrupted. */
-
-static void do_unlock_close(const struct dlm_plock_info *info)
+static int do_lock_cancel(const struct dlm_plock_info *orig_info)
 {
 	struct plock_op *op;
+	int rv;
 
 	op = kzalloc(sizeof(*op), GFP_NOFS);
 	if (!op)
-		return;
+		return -ENOMEM;
+
+	op->info = *orig_info;
+	op->info.optype = DLM_PLOCK_OP_CANCEL;
+	op->info.wait = 0;
 
-	op->info.optype		= DLM_PLOCK_OP_UNLOCK;
-	op->info.pid		= info->pid;
-	op->info.fsid		= info->fsid;
-	op->info.number		= info->number;
-	op->info.start		= 0;
-	op->info.end		= OFFSET_MAX;
-	op->info.owner		= info->owner;
-
-	op->info.flags |= DLM_PLOCK_FL_CLOSE;
 	send_op(op);
+	wait_event(recv_wq, (op->done != 0));
+
+	rv = op->info.rv;
+
+	dlm_release_plock_op(op);
+	return rv;
 }
 
 int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
@@ -156,7 +152,7 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 	send_op(op);
 
 	if (op->info.wait) {
-		rv = wait_event_killable(recv_wq, (op->done != 0));
+		rv = wait_event_interruptible(recv_wq, (op->done != 0));
 		if (rv == -ERESTARTSYS) {
 			spin_lock(&ops_lock);
 			/* recheck under ops_lock if we got a done != 0,
@@ -166,17 +162,37 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 				spin_unlock(&ops_lock);
 				goto do_lock_wait;
 			}
-			list_del(&op->list);
 			spin_unlock(&ops_lock);
 
+			rv = do_lock_cancel(&op->info);
+			switch (rv) {
+			case 0:
+				/* waiter was deleted in user space, the answer will never
+				 * come; remove the original request. The original request must
+				 * be on recv_list because the answer of do_lock_cancel()
+				 * synchronized it.
+				 */
+				spin_lock(&ops_lock);
+				list_del(&op->list);
+				spin_unlock(&ops_lock);
+				rv = -EINTR;
+				break;
+			case -ENOENT:
+				/* cancellation wasn't successful but op should be done */
+				fallthrough;
+			default:
+				/* internal error doing cancel we need to wait */
+				goto wait;
+			}
+
 			log_debug(ls, "%s: wait interrupted %x %llx pid %d",
 				  __func__, ls->ls_global_id,
 				  (unsigned long long)number, op->info.pid);
-			do_unlock_close(&op->info);
 			dlm_release_plock_op(op);
 			goto out;
 		}
 	} else {
+wait:
 		wait_event(recv_wq, (op->done != 0));
 	}
 
diff --git a/include/uapi/linux/dlm_plock.h b/include/uapi/linux/dlm_plock.h
index 63b6c1fd9169..eb66afcac40e 100644
--- a/include/uapi/linux/dlm_plock.h
+++ b/include/uapi/linux/dlm_plock.h
@@ -22,6 +22,7 @@ enum {
 	DLM_PLOCK_OP_LOCK = 1,
 	DLM_PLOCK_OP_UNLOCK,
 	DLM_PLOCK_OP_GET,
+	DLM_PLOCK_OP_CANCEL,
 };
 
 #define DLM_PLOCK_FL_CLOSE 1
-- 
2.31.1



* [Cluster-devel] [PATCHv4 v6.5-rc2 3/3] fs: dlm: fix F_CANCELLK to cancel pending request
@ 2023-07-20 12:22 ` Alexander Aring
From: Alexander Aring @ 2023-07-20 12:22 UTC
  To: cluster-devel.redhat.com

This patch fixes the current handling of F_CANCELLK by not just doing
an unlock, as we need to try to cancel the lock first. An unlock makes
sense for a non-blocking lock request, but for a blocking lock request
we need to cancel the request while it has not been granted yet. This
patch fixes this behaviour by first trying to cancel the lock request
and, if that fails, unlocking the lock, which has apparently been
granted.

Note: the nfs locking handling is currently disabled by commit
40595cdc93ed ("nfs: block notification on fs with its own ->lock").
However, DLM was never updated regarding this change. Future patches
will try to fix lockd lock requests for DLM; this patch assumes the
upstream DLM lockd handling is correct.
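
For context, a hedged sketch (not part of the commit) of how a kernel
caller such as lockd reaches this path and what the return values mean;
the surrounding code is illustrative only:

/* lockd cancels an async lock request with the same struct file_lock
 * it used for the original request; vfs_cancel_lock() dispatches to
 * ->lock(F_CANCELLK, ...), which gfs2 and ocfs2 forward to
 * dlm_posix_cancel() with this patch
 */
int status = vfs_cancel_lock(file, fl);

switch (status) {
case -EINTR:
	/* the pending request was cancelled before being granted */
	break;
case 0:
	/* too late, the lock was already granted and has now been
	 * unlocked via dlm_posix_unlock()
	 */
	break;
default:
	/* e.g. -ENOLCK or another internal error */
	break;
}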
---
 fs/dlm/plock.c            | 104 +++++++++++++++++++++++++++++++++-----
 fs/gfs2/file.c            |   9 ++--
 fs/ocfs2/stack_user.c     |  13 ++---
 include/linux/dlm_plock.h |   2 +
 4 files changed, 99 insertions(+), 29 deletions(-)

diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index a8ffa0760913..943d9f8e5564 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -42,6 +42,27 @@ static inline void set_version(struct dlm_plock_info *info)
 	info->version[2] = DLM_PLOCK_VERSION_PATCH;
 }
 
+static struct plock_op *plock_lookup_waiter(const struct dlm_plock_info *info)
+{
+	struct plock_op *op = NULL, *iter;
+
+	list_for_each_entry(iter, &recv_list, list) {
+		if (iter->info.fsid == info->fsid &&
+		    iter->info.number == info->number &&
+		    iter->info.owner == info->owner &&
+		    iter->info.pid == info->pid &&
+		    iter->info.start == info->start &&
+		    iter->info.end == info->end &&
+		    iter->info.ex == info->ex &&
+		    iter->info.wait) {
+			op = iter;
+			break;
+		}
+	}
+
+	return op;
+}
+
 static int check_version(struct dlm_plock_info *info)
 {
 	if ((DLM_PLOCK_VERSION_MAJOR != info->version[0]) ||
@@ -334,6 +355,75 @@ int dlm_posix_unlock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 }
 EXPORT_SYMBOL_GPL(dlm_posix_unlock);
 
+/*
+ * NOTE: This implementation can only handle async lock requests as nfs
+ * does them. It cannot handle cancellation of a pending lock request
+ * sitting in wait_event(), but for now nfs is the only local kernel
+ * user.
+ */
+int dlm_posix_cancel(dlm_lockspace_t *lockspace, u64 number, struct file *file,
+		     struct file_lock *fl)
+{
+	struct dlm_plock_info info;
+	struct plock_op *op;
+	struct dlm_ls *ls;
+	int rv;
+
+	/* this only works for async requests for now and nfs is the only
+	 * kernel user right now.
+	 */
+	if (WARN_ON_ONCE(!fl->fl_lmops || !fl->fl_lmops->lm_grant))
+		return -EOPNOTSUPP;
+
+	ls = dlm_find_lockspace_local(lockspace);
+	if (!ls)
+		return -EINVAL;
+
+	memset(&info, 0, sizeof(info));
+	info.pid = fl->fl_pid;
+	info.ex = (fl->fl_type == F_WRLCK);
+	info.fsid = ls->ls_global_id;
+	dlm_put_lockspace(ls);
+	info.number = number;
+	info.start = fl->fl_start;
+	info.end = fl->fl_end;
+	info.owner = (__u64)fl->fl_pid;
+
+	rv = do_lock_cancel(&info);
+	switch (rv) {
+	case 0:
+		spin_lock(&ops_lock);
+		/* lock request to cancel must be on recv_list because
+		 * do_lock_cancel() synchronizes it.
+		 */
+		op = plock_lookup_waiter(&info);
+		if (WARN_ON_ONCE(!op)) {
+			spin_unlock(&ops_lock);
+			rv = -ENOLCK;
+			break;
+		}
+
+		list_del(&op->list);
+		spin_unlock(&ops_lock);
+		WARN_ON(op->info.optype != DLM_PLOCK_OP_LOCK);
+		op->data->callback(op->data->fl, -EINTR);
+		dlm_release_plock_op(op);
+		rv = -EINTR;
+		break;
+	case -ENOENT:
+		/* if cancel wasn't successful we probably were too late
+		 * or it was a non-blocking lock request, so just unlock it.
+		 */
+		rv = dlm_posix_unlock(lockspace, number, file, fl);
+		break;
+	default:
+		break;
+	}
+
+	return rv;
+}
+EXPORT_SYMBOL_GPL(dlm_posix_cancel);
+
 int dlm_posix_get(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 		  struct file_lock *fl)
 {
@@ -457,19 +547,7 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
 	 */
 	spin_lock(&ops_lock);
 	if (info.wait) {
-		list_for_each_entry(iter, &recv_list, list) {
-			if (iter->info.fsid == info.fsid &&
-			    iter->info.number == info.number &&
-			    iter->info.owner == info.owner &&
-			    iter->info.pid == info.pid &&
-			    iter->info.start == info.start &&
-			    iter->info.end == info.end &&
-			    iter->info.ex == info.ex &&
-			    iter->info.wait) {
-				op = iter;
-				break;
-			}
-		}
+		op = plock_lookup_waiter(&info);
 	} else {
 		list_for_each_entry(iter, &recv_list, list) {
 			if (!iter->info.wait) {
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index 1bf3c4453516..386eceb2f574 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -1436,17 +1436,14 @@ static int gfs2_lock(struct file *file, int cmd, struct file_lock *fl)
 
 	if (!(fl->fl_flags & FL_POSIX))
 		return -ENOLCK;
-	if (cmd == F_CANCELLK) {
-		/* Hack: */
-		cmd = F_SETLK;
-		fl->fl_type = F_UNLCK;
-	}
 	if (unlikely(gfs2_withdrawn(sdp))) {
 		if (fl->fl_type == F_UNLCK)
 			locks_lock_file_wait(file, fl);
 		return -EIO;
 	}
-	if (IS_GETLK(cmd))
+	if (cmd == F_CANCELLK)
+		return dlm_posix_cancel(ls->ls_dlm, ip->i_no_addr, file, fl);
+	else if (IS_GETLK(cmd))
 		return dlm_posix_get(ls->ls_dlm, ip->i_no_addr, file, fl);
 	else if (fl->fl_type == F_UNLCK)
 		return dlm_posix_unlock(ls->ls_dlm, ip->i_no_addr, file, fl);
diff --git a/fs/ocfs2/stack_user.c b/fs/ocfs2/stack_user.c
index 05d4414d0c33..9b76ee66aeb2 100644
--- a/fs/ocfs2/stack_user.c
+++ b/fs/ocfs2/stack_user.c
@@ -738,18 +738,11 @@ static int user_plock(struct ocfs2_cluster_connection *conn,
 	 *
 	 * Internally, fs/dlm will pass these to a misc device, which
 	 * a userspace daemon will read and write to.
-	 *
-	 * For now, cancel requests (which happen internally only),
-	 * are turned into unlocks. Most of this function taken from
-	 * gfs2_lock.
 	 */
 
-	if (cmd == F_CANCELLK) {
-		cmd = F_SETLK;
-		fl->fl_type = F_UNLCK;
-	}
-
-	if (IS_GETLK(cmd))
+	if (cmd == F_CANCELLK)
+		return dlm_posix_cancel(conn->cc_lockspace, ino, file, fl);
+	else if (IS_GETLK(cmd))
 		return dlm_posix_get(conn->cc_lockspace, ino, file, fl);
 	else if (fl->fl_type == F_UNLCK)
 		return dlm_posix_unlock(conn->cc_lockspace, ino, file, fl);
diff --git a/include/linux/dlm_plock.h b/include/linux/dlm_plock.h
index e6d76e8715a6..15fc856d198c 100644
--- a/include/linux/dlm_plock.h
+++ b/include/linux/dlm_plock.h
@@ -11,6 +11,8 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 		int cmd, struct file_lock *fl);
 int dlm_posix_unlock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 		struct file_lock *fl);
+int dlm_posix_cancel(dlm_lockspace_t *lockspace, u64 number, struct file *file,
+		     struct file_lock *fl);
 int dlm_posix_get(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 		struct file_lock *fl);
 #endif
-- 
2.31.1



* [Cluster-devel] [PATCHv4 v6.5-rc2 3/3] fs: dlm: fix F_CANCELLK to cancel pending request
@ 2023-07-21 16:25 ` Andreas Gruenbacher
From: Andreas Gruenbacher @ 2023-07-21 16:25 UTC
  To: cluster-devel.redhat.com

On Thu, Jul 20, 2023 at 2:22 PM Alexander Aring <aahringo@redhat.com> wrote:
> This patch fixes the current handling of F_CANCELLK by not just doing
> an unlock, as we need to try to cancel the lock first. An unlock makes
> sense for a non-blocking lock request, but for a blocking lock request
> we need to cancel the request while it has not been granted yet. This
> patch fixes this behaviour by first trying to cancel the lock request
> and, if that fails, unlocking the lock, which has apparently been
> granted.

I don't like how this is implemented, but the problem already starts
in commit 57e2c2f2d94c ("fs: dlm: fix mismatch of plock results from
userspace"). That commit relies on how dlm_controld is implemented
internally; it requires dlm_controld to keep all replies to
non-blocking requests in perfect order. The correctness of that
approach is more difficult to argue for than it should be, the code
might break in non-obvious ways, and it limits the kinds of things
dlm_controld will be able to do in the future. And this change relies
on the same flawed design.

Instead, when noticing that we don't have a way to uniquely identify
requests, such a way should just have been introduced. The CANCEL
request would then carry the same unique identifier as the original
locking request, dlm_controld would reply either to the original
locking request or to the cancel request, and the kernel would have no
problem matching the reply to the request.
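
(A sketch of what such a lookup could look like; the "cookie" field is
hypothetical and does not exist in struct dlm_plock_info today:)

/* hypothetical: the kernel assigns each locking request a unique
 * cookie which dlm_controld echoes back in every reply, including
 * replies to CANCEL; matching then depends neither on reply ordering
 * nor on comparing lock fields
 */
static struct plock_op *plock_lookup_cookie(__u64 cookie)
{
	struct plock_op *iter;

	list_for_each_entry(iter, &recv_list, list) {
		if (iter->info.cookie == cookie)
			return iter;
	}

	return NULL;
}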

But without unique request identifiers, we now end up with those
redundant -ENOENT replies to CANCEL requests and with those
essentially duplicate entries for the same request on recv_list. Not
great.

Andreas

> [...]



* [Cluster-devel] [PATCHv4 v6.5-rc2 3/3] fs: dlm: fix F_CANCELLK to cancel pending request
@ 2023-07-21 18:55 ` Alexander Aring
From: Alexander Aring @ 2023-07-21 18:55 UTC
  To: cluster-devel.redhat.com

Hi,

On Fri, Jul 21, 2023 at 12:25 PM Andreas Gruenbacher
<agruenba@redhat.com> wrote:
>
> On Thu, Jul 20, 2023 at 2:22 PM Alexander Aring <aahringo@redhat.com> wrote:
> > [...]
>
> I don't like how this is implemented, but the problem already starts
> in commit 57e2c2f2d94c ("fs: dlm: fix mismatch of plock results from
> userspace"). That commit relies on how dlm_controld is implemented
> internally; it requires dlm_controld to keep all replies to
> non-blocking requests in perfect order. The correctness of that
> approach is more difficult to argue for than it should be, the code
> might break in non-obvious ways, and it limits the kinds of things
> dlm_controld will be able to do in the future. And this change relies
> on the same flawed design.
>

I agree it is not the best design and I mentioned it in my previous
patch series versions [0]:

TODO we should move to a instance reference from request and reply and
not go over lock states, but it seems going over lock states and get
the instance does not make any problems (at least there were no issues
found yet) but it's much cleaner to not think about all possible
special cases and states to instance has no 1:1 mapping anymore.

> Instead, when noticing that we don't have a way to uniquely identify
> requests, such a way should just have been introduced. The CANCEL
> request would then carry the same unique identifier as the original
> locking request, dlm_controld would reply either to the original
> locking request or to the cancel request, and the kernel would have no
> problem matching the reply to the request.
>
> But without unique request identifiers, we now end up with those
> redundant -ENOENT replies to CANCEL requests and with those
> essentially duplicate entries for the same request on recv_list. Not
> great.
>

There is no reference to lock request instances between kernel and
user space yet. Matching is done by checking whether it's the same lock
request. The whole mechanism works like this, which is one reason why
we recently had problems with it. Lock requests are "the same" in the
sense that they are being granted at the same time. So far we did not
experience any problems with that... but as mentioned in [0] you need
to think about all potential cases.

NOTE: even F_CANCELLK does not give you a unique reference to the
original locking request; it works by matching the fields of struct
file_lock... The same problem already exists there. The struct
file_lock pointer could be used, but struct file_lock does not
represent a lock request; such a thing does not exist in the VFS posix
lock API.

- Alex

[0] https://listman.redhat.com/archives/cluster-devel/2023-July/024477.html



* [Cluster-devel] [PATCHv4 v6.5-rc2 3/3] fs: dlm: fix F_CANCELLK to cancel pending request
@ 2023-07-24 14:40 ` Andreas Gruenbacher
From: Andreas Gruenbacher @ 2023-07-24 14:40 UTC
  To: cluster-devel.redhat.com

On Fri, Jul 21, 2023 at 8:55 PM Alexander Aring <aahringo@redhat.com> wrote:
> Hi,
>
> On Fri, Jul 21, 2023 at 12:25 PM Andreas Gruenbacher
> <agruenba@redhat.com> wrote:
> >
> > On Thu, Jul 20, 2023 at 2:22 PM Alexander Aring <aahringo@redhat.com> wrote:
> > > [...]
> >
> > I don't like how this is implemented, but the problem already starts
> > in commit 57e2c2f2d94c ("fs: dlm: fix mismatch of plock results from
> > userspace"). That commit relies on how dlm_controld is implemented
> > internally; it requires dlm_controld to keep all replies to
> > non-blocking requests in perfect order. The correctness of that
> > approach is more difficult to argue for than it should be, the code
> > might break in non-obvious ways, and it limits the kinds of things
> > dlm_controld will be able to do in the future. And this change relies
> > on the same flawed design.
> >
>
> I agree it is not the best design and I mentioned it in my previous
> patch series versions [0]:
>
> TODO we should move to a instance reference from request and reply and
> not go over lock states, but it seems going over lock states and get
> the instance does not make any problems (at least there were no issues
> found yet) but it's much cleaner to not think about all possible
> special cases and states to instance has no 1:1 mapping anymore.

I guess by /go over lock states/, you mean what dev_write() does --
compare the request and response fields to match requests with
responses as well as possible, based on the information that's in
struct dlm_plock_info today.

> > Instead, when noticing that we don't have a way to uniquely identify
> > requests, such a way should just have been introduced. The CANCEL
> > request would then carry the same unique identifier as the original
> > locking request, dlm_controld would reply either to the original
> > locking request or to the cancel request, and the kernel would have no
> > problem matching the reply to the request.
> >
> > But without unique request identifiers, we now end up with those
> > redundant -ENOENT replies to CANCEL requests and with those
> > essentially duplicate entries for the same request on recv_list. Not
> > great.
> >
>
> There is no reference to lock request instances between kernel and
> user space yet. Matching is done by checking whether it's the same lock
> request. The whole mechanism works like this, which is one reason why
> we recently had problems with it. Lock requests are "the same" in the
> sense that they are being granted at the same time. So far we did not
> experience any problems with that... but as mentioned in [0] you need
> to think about all potential cases.

I have no idea what you're talking about.

Let me point out one thing though: when dlm_controld replies to
multiple pending locking requests "in one go", without having to wait
on any unlocks to happen, this doesn't mean that those requests are
all "the same" at all. The requests may be for the same "actor" on the
kernel side, but they don't have to be. As such, it does make sense to
treat each of those requests independently.

> NOTE: even F_CANCELLK does not give you a unique reference to the
> original locking request; it works by matching the fields of struct
> file_lock... The same problem already exists there. The struct
> file_lock pointer could be used, but struct file_lock does not
> represent a lock request; such a thing does not exist in the VFS posix
> lock API.

It seems to me that struct file_lock objects persist across the entire
locking request, both for the synchronous locking requests of gfs2 and
for the asynchronous locking requests of lockd. In other words, when
lockd cancels a locking request, it seems to use the same struct
file_lock object it used for making the original request (see where
lockd calls vfs_cancel_lock()). This means that the address of that
object would be a suitable locking request identifier. And while we
don't have a huge amount of space left in struct dlm_plock_info, we do
have two 32-bit fields in the version[] array that we could use for
communicating that identifier. It would of course be better if we
didn't need that much space, but I don't see a realistic alternative
if we want to cleanly fix this whole mess.
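
(A purely illustrative sketch of that encoding; whether version[1] and
version[2] can be repurposed this way would still have to be negotiated
with dlm_controld:)

/* illustrative only: stash the struct file_lock address, which stays
 * stable for the lifetime of the request, into the two spare 32-bit
 * fields so a later CANCEL can name the exact request it refers to
 */
static void plock_set_request_id(struct dlm_plock_info *info,
				 const struct file_lock *fl)
{
	u64 id = (u64)(unsigned long)fl;

	info->version[1] = (u32)(id >> 32);
	info->version[2] = (u32)id;
}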

Thanks,
Andreas

> - Alex
>
> [0] https://listman.redhat.com/archives/cluster-devel/2023-July/024477.html
>



* [Cluster-devel] [PATCHv4 v6.5-rc2 3/3] fs: dlm: fix F_CANCELLK to cancel pending request
@ 2023-07-24 19:03 ` Alexander Aring
From: Alexander Aring @ 2023-07-24 19:03 UTC
  To: cluster-devel.redhat.com

Hi,

On Mon, Jul 24, 2023 at 10:40 AM Andreas Gruenbacher
<agruenba@redhat.com> wrote:
>
> On Fri, Jul 21, 2023 at 8:55 PM Alexander Aring <aahringo@redhat.com> wrote:
> > Hi,
> >
> > On Fri, Jul 21, 2023 at 12:25 PM Andreas Gruenbacher
> > <agruenba@redhat.com> wrote:
> > >
> > > On Thu, Jul 20, 2023 at 2:22 PM Alexander Aring <aahringo@redhat.com> wrote:
> > > > [...]
> > >
> > > I don't like how this is implemented, but the problem already starts
> > > in commit 57e2c2f2d94c ("fs: dlm: fix mismatch of plock results from
> > > userspace"). That commit relies on how dlm_controld is implemented
> > > internally; it requires dlm_controld to keep all replies to
> > > non-blocking requests in perfect order. The correctness of that
> > > approach is more difficult to argue for than it should be, the code
> > > might break in non-obvious ways, and it limits the kinds of things
> > > dlm_controld will be able to do in the future. And this change relies
> > > on the same flawed design.
> > >
> >
> > I agree it is not the best design and I mentioned it in my previous
> > patch series versions [0]:
> >
> > TODO we should move to a instance reference from request and reply and
> > not go over lock states, but it seems going over lock states and get
> > the instance does not make any problems (at least there were no issues
> > found yet) but it's much cleaner to not think about all possible
> > special cases and states to instance has no 1:1 mapping anymore.
>
> I guess by /go over lock states/, you mean what dev_write() does --
> compare the request and response fields to match requests with
> responses as well as possible, based on the information that's in
> struct dlm_plock_info today.
>

Yes, that's what we do when wait is true.

> > > Instead, when noticing that we don't have a way to uniquely identify
> > > requests, such a way should just have been introduced. The CANCEL
> > > request would then carry the same unique identifier as the original
> > > locking request, dlm_controld would reply either to the original
> > > locking request or to the cancel request, and the kernel would have no
> > > problem matching the reply to the request.
> > >
> > > But without unique request identifiers, we now end up with those
> > > redundant -ENOENT replies to CANCEL requests and with those
> > > essentially duplicate entries for the same request on recv_list. Not
> > > great.
> > >
> >
> > There is no reference to lock request instances between kernel and
> > user space yet. Matching is done by checking whether it's the same lock
> > request. The whole mechanism works like this, which is one reason why
> > we recently had problems with it. Lock requests are "the same" in the
> > sense that they are being granted at the same time. So far we did not
> > experience any problems with that... but as mentioned in [0] you need
> > to think about all potential cases.
>
> I have no idea what you're talking about.
>

I mean that currently there are no issues I am aware of, and there
exists no reference between a lock request in the kernel and a lock
request in user space. This all goes over the lock states, which are
identical in the sense that "they are getting granted at the same
time".

> Let me point out one thing though: when dlm_controld replies to
> multiple pending locking requests "in one go", without having to wait
> on any unlocks to happen, this doesn't mean that those requests are
> all "the same" at all. The requests may be for the same "actor" on the
> kernel side, but they don't have to be. As such, it does make sense to
> treat each of those requests independently.
>

Now, I have no idea what you are talking about.

> > NOTE: even F_CANCELLK does not give you a unique reference to the
> > original locking request; it works by matching the fields of struct
> > file_lock... The same problem already exists there. The struct
> > file_lock pointer could be used, but struct file_lock does not
> > represent a lock request; such a thing does not exist in the VFS posix
> > lock API.
>
> It seems to me that struct file_lock objects persist across the entire
> locking request, both for the synchronous locking requests of gfs2 and
> for the asynchronous locking requests of lockd. In other words, when
> lockd cancels a locking request, it seems to use the same struct
> file_lock object it used for making the original request (see where
> lockd calls vfs_cancel_lock()). This means that the address of that
> object would be a suitable locking request identifier. And while we
> don't have a huge amount of space left in struct dlm_plock_info, we do
> have two 32-bit fields in the version[] array that we could use for
> communicating that identifier. It would of course be better if we
> didn't need that much space, but I don't see a realistic alternative
> if we want to cleanly fix this whole mess.

Please tell me which API describes that you can use the struct
file_lock pointer to look up the pending lock. This is implementation
specific; the next person hacking on lockd will not know that.
I will try to come up with something. Because 32 bits are limited we
could run into collisions, but it should be enough for normal usage.

But I will not use the file_lock * as a lookup key; it will work over
fields, which ends in the same "mess" behaviour. There could be
follow-up patches to fix the VFS layer to use unique "lock request"
identifiers, but as far as I know VFS does not have any storage to
keep track of pending lock requests.

- Alex



* [Cluster-devel] [PATCHv4 v6.5-rc2 3/3] fs: dlm: fix F_CANCELLK to cancel pending request
@ 2023-07-24 20:42 ` Alexander Aring
From: Alexander Aring @ 2023-07-24 20:42 UTC
  To: cluster-devel.redhat.com

Hi,

On Mon, Jul 24, 2023 at 3:03 PM Alexander Aring <aahringo@redhat.com> wrote:
>
> Hi,
>
> [...]
>
> Please tell me which API describes that you can use the struct
> file_lock pointer to look up the pending lock. This is implementation
> specific; the next person hacking on lockd will not know that.
> I will try to come up with something. Because 32 bits are limited we
> could run into collisions, but it should be enough for normal usage.
>

A draft can be found under [0].

> But I will not use the file_lock * as a lookup key; it will work over
> fields, which ends in the same "mess" behaviour. There could be
> follow-up patches to fix the VFS layer to use unique "lock request"
> identifiers, but as far as I know VFS does not have any storage to
> keep track of pending lock requests.
>

This problem actually requires 2 ids: one to match the plock cancel
operation request itself, and another one for the plock request which
should be cancelled. I've only implemented the plock cancel operation
request match. The other part matches how I see F_CANCELLK is supposed
to work: as a "match on struct file_lock fields". I am open to changing
this behaviour but it requires more work on VFS.
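
(A rough sketch of that two-id split, hypothetical and not taken from
the draft in [0]; neither the "id" nor the "target_id" field exists
today:)

static atomic_t plock_op_id;	/* hypothetical id source */

static void plock_assign_ids(struct plock_op *op,
			     const struct plock_op *target)
{
	/* id matching this op's own request/reply pair */
	op->info.id = atomic_inc_return(&plock_op_id);
	/* for DLM_PLOCK_OP_CANCEL: which pending request to cancel;
	 * 32 bits can wrap, hence the possible collisions mentioned
	 * above
	 */
	if (target)
		op->info.target_id = target->info.id;
}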

- Alex

[0] https://gitlab.com/netcoder/linux/-/commit/aa67d0ec656a0d060fe3d0c0d86264dea1b2cc7d


