* [Cluster-devel] [PATCH v6.5-rc1 1/2] fs: dlm: introduce DLM_PLOCK_FL_NO_REPLY flag
From: Alexander Aring @ 2023-07-13 14:40 UTC
To: cluster-devel.redhat.com
This patch introduces a new flag, DLM_PLOCK_FL_NO_REPLY, to mark dlm
plock operations that should not get a reply back. Currently this is
loosely handled via DLM_PLOCK_FL_CLOSE, but that flag carries an
additional meaning: it removes all waiters for specific nodeid/owner
values by doing an unlock operation. If an error occurs in the dlm
user space software, e.g. dlm_controld, we get a reply with an error
back. Such a reply cannot be matched because there is no op to match
in recv_list. We now filter on DLM_PLOCK_FL_NO_REPLY when an error
comes back as a reply. Newer dlm_controld versions will never send a
result back when DLM_PLOCK_FL_NO_REPLY is set; this filter is a
workaround to handle older dlm_controld versions.
Fixes: 901025d2f319 ("dlm: make plock operation killable")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/plock.c | 23 +++++++++++++++++++----
include/uapi/linux/dlm_plock.h | 1 +
2 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index 70a4752ed913..7fe9f4b922d3 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -96,7 +96,7 @@ static void do_unlock_close(const struct dlm_plock_info *info)
op->info.end = OFFSET_MAX;
op->info.owner = info->owner;
- op->info.flags |= DLM_PLOCK_FL_CLOSE;
+ op->info.flags |= (DLM_PLOCK_FL_CLOSE | DLM_PLOCK_FL_NO_REPLY);
send_op(op);
}
@@ -293,7 +293,7 @@ int dlm_posix_unlock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
op->info.owner = (__u64)(long) fl->fl_owner;
if (fl->fl_flags & FL_CLOSE) {
- op->info.flags |= DLM_PLOCK_FL_CLOSE;
+ op->info.flags |= (DLM_PLOCK_FL_CLOSE | DLM_PLOCK_FL_NO_REPLY);
send_op(op);
rv = 0;
goto out;
@@ -392,7 +392,7 @@ static ssize_t dev_read(struct file *file, char __user *u, size_t count,
spin_lock(&ops_lock);
if (!list_empty(&send_list)) {
op = list_first_entry(&send_list, struct plock_op, list);
- if (op->info.flags & DLM_PLOCK_FL_CLOSE)
+ if (op->info.flags & DLM_PLOCK_FL_NO_REPLY)
list_del(&op->list);
else
list_move_tail(&op->list, &recv_list);
@@ -407,7 +407,7 @@ static ssize_t dev_read(struct file *file, char __user *u, size_t count,
that were generated by the vfs cleaning up for a close
(the process did not make an unlock call). */
- if (op->info.flags & DLM_PLOCK_FL_CLOSE)
+ if (op->info.flags & DLM_PLOCK_FL_NO_REPLY)
dlm_release_plock_op(op);
if (copy_to_user(u, &info, sizeof(info)))
@@ -433,6 +433,21 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
if (check_version(&info))
return -EINVAL;
+ /* Some old dlm user space software will send replies back
+ * even if DLM_PLOCK_FL_NO_REPLY is set (because the flag is
+ * new), e.g. if an error occurs. We can't match them in
+ * recv_list because they were never part of it. We filter
+ * them here; new dlm user space software will filter them
+ * in user space.
+ *
+ * In the future this handling can be removed.
+ */
+ if (info.flags & DLM_PLOCK_FL_NO_REPLY) {
+ pr_info("Received unexpected reply from op %d, "
+ "please update DLM user space software!\n",
+ info.optype);
+ return count;
+ }
+
/*
* The results for waiting ops (SETLKW) can be returned in any
* order, so match all fields to find the op. The results for
diff --git a/include/uapi/linux/dlm_plock.h b/include/uapi/linux/dlm_plock.h
index 63b6c1fd9169..8dfa272c929a 100644
--- a/include/uapi/linux/dlm_plock.h
+++ b/include/uapi/linux/dlm_plock.h
@@ -25,6 +25,7 @@ enum {
};
#define DLM_PLOCK_FL_CLOSE 1
+#define DLM_PLOCK_FL_NO_REPLY 2
struct dlm_plock_info {
__u32 version[3];
--
2.31.1
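For illustration, a minimal sketch of the user space behaviour the commit
message describes: an op is read from the dlm plock misc device and a reply
is written back only when the kernel expects one. This is not dlm_controld
source; the process_op() helper and the loop structure are assumptions.

#include <unistd.h>
#include <linux/dlm_plock.h>

/* Placeholder for the real plock handling; on failure it would set
 * info->rv to a negative errno. */
static void process_op(struct dlm_plock_info *info)
{
}

static void handle_one_op(int fd)
{
	struct dlm_plock_info info;

	if (read(fd, &info, sizeof(info)) != sizeof(info))
		return;

	process_op(&info);

	/* Updated behaviour: the kernel asked for no reply, so none is
	 * sent, even on error. Old user space fell through and replied
	 * anyway, which is what the kernel-side filter works around. */
	if (info.flags & DLM_PLOCK_FL_NO_REPLY)
		return;

	(void)write(fd, &info, sizeof(info));
}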
* [Cluster-devel] [PATCH v6.5-rc1 2/2] fs: dlm: allow to F_SETLKW getting interrupted
From: Alexander Aring @ 2023-07-13 14:40 UTC
To: cluster-devel.redhat.com
This patch implements interruption of pending dlm plock F_SETLKW
operations. If a pending plock operation has not been sent to user
space yet, it can simply be dropped from the send_list. If it has
already been sent, we need to ask the dlm user space tool to try to
remove the waiter. If that succeeds, a reply with optype
DLM_PLOCK_OP_CANCEL instead of DLM_PLOCK_OP_LOCK comes back (the
DLM_PLOCK_FL_NO_REPLY flag is cleared in user space in that case) to
signal that the cancellation was successful. If a result with optype
DLM_PLOCK_OP_LOCK comes back, the cancellation was not successful.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/plock.c | 91 ++++++++++++++++++++++++----------
include/uapi/linux/dlm_plock.h | 1 +
2 files changed, 66 insertions(+), 26 deletions(-)
diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index 7fe9f4b922d3..5faa428fff1a 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -29,6 +29,9 @@ struct plock_async_data {
struct plock_op {
struct list_head list;
+#define DLM_PLOCK_OP_FLAG_SENT 0
+#define DLM_PLOCK_OP_FLAG_INTERRUPTED 1
+ unsigned long flags;
int done;
struct dlm_plock_info info;
/* if set indicates async handling */
@@ -74,30 +77,25 @@ static void send_op(struct plock_op *op)
wake_up(&send_wq);
}
-/* If a process was killed while waiting for the only plock on a file,
- locks_remove_posix will not see any lock on the file so it won't
- send an unlock-close to us to pass on to userspace to clean up the
- abandoned waiter. So, we have to insert the unlock-close when the
- lock call is interrupted. */
-
-static void do_unlock_close(const struct dlm_plock_info *info)
+static int do_lock_cancel(const struct plock_op *orig_op)
{
struct plock_op *op;
op = kzalloc(sizeof(*op), GFP_NOFS);
if (!op)
- return;
+ return -ENOMEM;
+
+ op->info = orig_op->info;
+ op->info.optype = DLM_PLOCK_OP_CANCEL;
+ op->info.flags = DLM_PLOCK_FL_NO_REPLY;
- op->info.optype = DLM_PLOCK_OP_UNLOCK;
- op->info.pid = info->pid;
- op->info.fsid = info->fsid;
- op->info.number = info->number;
- op->info.start = 0;
- op->info.end = OFFSET_MAX;
- op->info.owner = info->owner;
-
- op->info.flags |= (DLM_PLOCK_FL_CLOSE | DLM_PLOCK_FL_NO_REPLY);
send_op(op);
+ wait_event(recv_wq, (orig_op->done != 0));
+
+ if (orig_op->info.optype == DLM_PLOCK_OP_CANCEL)
+ return 0;
+
+ return 1;
}
int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
@@ -156,7 +154,7 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
send_op(op);
if (op->info.wait) {
- rv = wait_event_killable(recv_wq, (op->done != 0));
+ rv = wait_event_interruptible(recv_wq, (op->done != 0));
if (rv == -ERESTARTSYS) {
spin_lock(&ops_lock);
/* recheck under ops_lock if we got a done != 0,
@@ -166,17 +164,37 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
spin_unlock(&ops_lock);
goto do_lock_wait;
}
- list_del(&op->list);
- spin_unlock(&ops_lock);
+
+ if (!test_bit(DLM_PLOCK_OP_FLAG_SENT, &op->flags)) {
+ /* part of send_list, user never saw the op */
+ list_del(&op->list);
+ spin_unlock(&ops_lock);
+ rv = -EINTR;
+ } else {
+ set_bit(DLM_PLOCK_OP_FLAG_INTERRUPTED, &op->flags);
+ spin_unlock(&ops_lock);
+ rv = do_lock_cancel(op);
+ switch (rv) {
+ case 0:
+ rv = -EINTR;
+ break;
+ case 1:
+ /* cancellation wasn't successful but op is done */
+ goto do_lock_wait;
+ default:
+ /* internal error doing cancel, we need to wait */
+ goto wait;
+ }
+ }
log_debug(ls, "%s: wait interrupted %x %llx pid %d",
__func__, ls->ls_global_id,
(unsigned long long)number, op->info.pid);
- do_unlock_close(&op->info);
dlm_release_plock_op(op);
goto out;
}
} else {
+wait:
wait_event(recv_wq, (op->done != 0));
}
@@ -392,10 +410,12 @@ static ssize_t dev_read(struct file *file, char __user *u, size_t count,
spin_lock(&ops_lock);
if (!list_empty(&send_list)) {
op = list_first_entry(&send_list, struct plock_op, list);
- if (op->info.flags & DLM_PLOCK_FL_NO_REPLY)
+ if (op->info.flags & DLM_PLOCK_FL_NO_REPLY) {
list_del(&op->list);
- else
+ } else {
list_move_tail(&op->list, &recv_list);
+ set_bit(DLM_PLOCK_OP_FLAG_SENT, &op->flags);
+ }
memcpy(&info, &op->info, sizeof(info));
}
spin_unlock(&ops_lock);
@@ -457,6 +477,27 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
spin_lock(&ops_lock);
if (info.wait) {
list_for_each_entry(iter, &recv_list, list) {
+ /* A very specific posix lock case allows two
+ * lock requests with the same meaning when issued
+ * from different threads. Such requests make no sense
+ * from the application's point of view, but they are
+ * possible. We need to check the cancellation state
+ * because we must know which instance was interrupted
+ * if two or more of the "same" lock requests are in
+ * waiting state and got interrupted.
+ *
+ * TODO we should move to an instance reference carried
+ * in request and reply instead of going over lock
+ * states. Going over lock states to find the instance
+ * has not caused any problems so far (at least no
+ * issues were found yet), but it would be much cleaner
+ * not to have to reason about all the special cases
+ * and states where the mapping to an instance is no
+ * longer 1:1.
+ */
+ if (info.optype == DLM_PLOCK_OP_CANCEL &&
+ !test_bit(DLM_PLOCK_OP_FLAG_INTERRUPTED, &iter->flags))
+ continue;
+
if (iter->info.fsid == info.fsid &&
iter->info.number == info.number &&
iter->info.owner == info.owner &&
@@ -480,9 +521,7 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
if (op) {
/* Sanity check that op and info match. */
- if (info.wait)
- WARN_ON(op->info.optype != DLM_PLOCK_OP_LOCK);
- else
+ if (!info.wait)
WARN_ON(op->info.fsid != info.fsid ||
op->info.number != info.number ||
op->info.owner != info.owner ||
diff --git a/include/uapi/linux/dlm_plock.h b/include/uapi/linux/dlm_plock.h
index 8dfa272c929a..9c4c083c824a 100644
--- a/include/uapi/linux/dlm_plock.h
+++ b/include/uapi/linux/dlm_plock.h
@@ -22,6 +22,7 @@ enum {
DLM_PLOCK_OP_LOCK = 1,
DLM_PLOCK_OP_UNLOCK,
DLM_PLOCK_OP_GET,
+ DLM_PLOCK_OP_CANCEL,
};
#define DLM_PLOCK_FL_CLOSE 1
--
2.31.1
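To make the cancel handshake above concrete, a hedged sketch of the user
space side follows; find_and_remove_waiter() is a hypothetical helper, not
a dlm_controld function, and error handling is omitted.

#include <stdbool.h>
#include <unistd.h>
#include <linux/dlm_plock.h>

/* Hypothetical: returns true if a matching waiter existed and was
 * removed before its lock could be granted. */
static bool find_and_remove_waiter(const struct dlm_plock_info *info);

static void handle_cancel(int fd, struct dlm_plock_info *info)
{
	if (find_and_remove_waiter(info)) {
		/* Waiter removed: confirm by replying with the CANCEL
		 * optype. DLM_PLOCK_FL_NO_REPLY is cleared so the
		 * kernel can match this reply in recv_list. */
		info->optype = DLM_PLOCK_OP_CANCEL;
		info->flags &= ~DLM_PLOCK_FL_NO_REPLY;
		info->rv = 0;
		(void)write(fd, info, sizeof(*info));
		return;
	}
	/* Otherwise nothing is sent for the cancel op itself; the
	 * pending lock completes on its own and replies with optype
	 * DLM_PLOCK_OP_LOCK, which tells the kernel the cancellation
	 * failed. */
}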
* [Cluster-devel] [PATCH v6.5-rc1 1/2] fs: dlm: introduce DLM_PLOCK_FL_NO_REPLY flag
From: Greg KH @ 2023-07-13 14:49 UTC
To: cluster-devel.redhat.com
On Thu, Jul 13, 2023 at 10:40:28AM -0400, Alexander Aring wrote:
> This patch introduces a new flag, DLM_PLOCK_FL_NO_REPLY, to mark dlm
> plock operations that should not get a reply back. Currently this is
> loosely handled via DLM_PLOCK_FL_CLOSE, but that flag carries an
> additional meaning: it removes all waiters for specific nodeid/owner
> values by doing an unlock operation. If an error occurs in the dlm
> user space software, e.g. dlm_controld, we get a reply with an error
> back. Such a reply cannot be matched because there is no op to match
> in recv_list. We now filter on DLM_PLOCK_FL_NO_REPLY when an error
> comes back as a reply. Newer dlm_controld versions will never send a
> result back when DLM_PLOCK_FL_NO_REPLY is set; this filter is a
> workaround to handle older dlm_controld versions.
>
> Fixes: 901025d2f319 ("dlm: make plock operation killable")
> Cc: stable@vger.kernel.org
> Signed-off-by: Alexander Aring <aahringo@redhat.com>
Why is adding a new uapi a stable patch?
> ---
> fs/dlm/plock.c | 23 +++++++++++++++++++----
> include/uapi/linux/dlm_plock.h | 1 +
> 2 files changed, 20 insertions(+), 4 deletions(-)
>
> diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
> index 70a4752ed913..7fe9f4b922d3 100644
> --- a/fs/dlm/plock.c
> +++ b/fs/dlm/plock.c
> @@ -96,7 +96,7 @@ static void do_unlock_close(const struct dlm_plock_info *info)
> op->info.end = OFFSET_MAX;
> op->info.owner = info->owner;
>
> - op->info.flags |= DLM_PLOCK_FL_CLOSE;
> + op->info.flags |= (DLM_PLOCK_FL_CLOSE | DLM_PLOCK_FL_NO_REPLY);
> send_op(op);
> }
>
> @@ -293,7 +293,7 @@ int dlm_posix_unlock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
> op->info.owner = (__u64)(long) fl->fl_owner;
>
> if (fl->fl_flags & FL_CLOSE) {
> - op->info.flags |= DLM_PLOCK_FL_CLOSE;
> + op->info.flags |= (DLM_PLOCK_FL_CLOSE | DLM_PLOCK_FL_NO_REPLY);
> send_op(op);
> rv = 0;
> goto out;
> @@ -392,7 +392,7 @@ static ssize_t dev_read(struct file *file, char __user *u, size_t count,
> spin_lock(&ops_lock);
> if (!list_empty(&send_list)) {
> op = list_first_entry(&send_list, struct plock_op, list);
> - if (op->info.flags & DLM_PLOCK_FL_CLOSE)
> + if (op->info.flags & DLM_PLOCK_FL_NO_REPLY)
> list_del(&op->list);
> else
> list_move_tail(&op->list, &recv_list);
> @@ -407,7 +407,7 @@ static ssize_t dev_read(struct file *file, char __user *u, size_t count,
> that were generated by the vfs cleaning up for a close
> (the process did not make an unlock call). */
>
> - if (op->info.flags & DLM_PLOCK_FL_CLOSE)
> + if (op->info.flags & DLM_PLOCK_FL_NO_REPLY)
> dlm_release_plock_op(op);
>
> if (copy_to_user(u, &info, sizeof(info)))
> @@ -433,6 +433,21 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
> if (check_version(&info))
> return -EINVAL;
>
> + /* Some old dlm user space software will send replies back
> + * even if DLM_PLOCK_FL_NO_REPLY is set (because the flag is
> + * new), e.g. if an error occurs. We can't match them in
> + * recv_list because they were never part of it. We filter
> + * them here; new dlm user space software will filter them
> + * in user space.
> + *
> + * In the future this handling can be removed.
> + */
> + if (info.flags & DLM_PLOCK_FL_NO_REPLY) {
> + pr_info("Received unexpected reply from op %d, "
> + "please update DLM user space software!\n",
> + info.optype);
Never allow userspace to spam the kernel log. And this is not going to
work; you need to handle the error and report this to userspace at most
once.
Also, don't wrap your strings; checkpatch should have told you this.
thanks,
greg k-h
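As a sketch of the once-only reporting Greg asks for (not an actual
follow-up patch), the check in dev_write() could use the kernel's
pr_info_once() helper and keep the format string on one line:

	if (info.flags & DLM_PLOCK_FL_NO_REPLY) {
		/* Report a misbehaving (old) user space at most once
		 * instead of on every unexpected reply. */
		pr_info_once("received unexpected reply for op %d, please update DLM user space software\n",
			     info.optype);
		return count;
	}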
* [Cluster-devel] [PATCH v6.5-rc1 1/2] fs: dlm: introduce DLM_PLOCK_FL_NO_REPLY flag
From: Alexander Aring @ 2023-07-13 14:57 UTC
To: cluster-devel.redhat.com
Hi,
On Thu, Jul 13, 2023 at 10:49 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Thu, Jul 13, 2023 at 10:40:28AM -0400, Alexander Aring wrote:
> > This patch introduces a new flag, DLM_PLOCK_FL_NO_REPLY, to mark dlm
> > plock operations that should not get a reply back. Currently this is
> > loosely handled via DLM_PLOCK_FL_CLOSE, but that flag carries an
> > additional meaning: it removes all waiters for specific nodeid/owner
> > values by doing an unlock operation. If an error occurs in the dlm
> > user space software, e.g. dlm_controld, we get a reply with an error
> > back. Such a reply cannot be matched because there is no op to match
> > in recv_list. We now filter on DLM_PLOCK_FL_NO_REPLY when an error
> > comes back as a reply. Newer dlm_controld versions will never send a
> > result back when DLM_PLOCK_FL_NO_REPLY is set; this filter is a
> > workaround to handle older dlm_controld versions.
> >
> > Fixes: 901025d2f319 ("dlm: make plock operation killable")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Alexander Aring <aahringo@redhat.com>
>
> Why is adding a new uapi a stable patch?
>
because user space just copies the flags back to the kernel. I
thought it would work. :)
> > ---
> > fs/dlm/plock.c | 23 +++++++++++++++++++----
> > include/uapi/linux/dlm_plock.h | 1 +
> > 2 files changed, 20 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
> > index 70a4752ed913..7fe9f4b922d3 100644
> > --- a/fs/dlm/plock.c
> > +++ b/fs/dlm/plock.c
> > @@ -96,7 +96,7 @@ static void do_unlock_close(const struct dlm_plock_info *info)
> > op->info.end = OFFSET_MAX;
> > op->info.owner = info->owner;
> >
> > - op->info.flags |= DLM_PLOCK_FL_CLOSE;
> > + op->info.flags |= (DLM_PLOCK_FL_CLOSE | DLM_PLOCK_FL_NO_REPLY);
> > send_op(op);
> > }
> >
> > @@ -293,7 +293,7 @@ int dlm_posix_unlock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
> > op->info.owner = (__u64)(long) fl->fl_owner;
> >
> > if (fl->fl_flags & FL_CLOSE) {
> > - op->info.flags |= DLM_PLOCK_FL_CLOSE;
> > + op->info.flags |= (DLM_PLOCK_FL_CLOSE | DLM_PLOCK_FL_NO_REPLY);
> > send_op(op);
> > rv = 0;
> > goto out;
> > @@ -392,7 +392,7 @@ static ssize_t dev_read(struct file *file, char __user *u, size_t count,
> > spin_lock(&ops_lock);
> > if (!list_empty(&send_list)) {
> > op = list_first_entry(&send_list, struct plock_op, list);
> > - if (op->info.flags & DLM_PLOCK_FL_CLOSE)
> > + if (op->info.flags & DLM_PLOCK_FL_NO_REPLY)
> > list_del(&op->list);
> > else
> > list_move_tail(&op->list, &recv_list);
> > @@ -407,7 +407,7 @@ static ssize_t dev_read(struct file *file, char __user *u, size_t count,
> > that were generated by the vfs cleaning up for a close
> > (the process did not make an unlock call). */
> >
> > - if (op->info.flags & DLM_PLOCK_FL_CLOSE)
> > + if (op->info.flags & DLM_PLOCK_FL_NO_REPLY)
> > dlm_release_plock_op(op);
> >
> > if (copy_to_user(u, &info, sizeof(info)))
> > @@ -433,6 +433,21 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
> > if (check_version(&info))
> > return -EINVAL;
> >
> > + /* Some old dlm user space software will send replies back
> > + * even if DLM_PLOCK_FL_NO_REPLY is set (because the flag is
> > + * new), e.g. if an error occurs. We can't match them in
> > + * recv_list because they were never part of it. We filter
> > + * them here; new dlm user space software will filter them
> > + * in user space.
> > + *
> > + * In the future this handling can be removed.
> > + */
> > + if (info.flags & DLM_PLOCK_FL_NO_REPLY) {
> > + pr_info("Received unexpected reply from op %d, "
> > + "please update DLM user space software!\n",
> > + info.optype);
>
> Never allow userspace to spam the kernel log. And this is not going to
> work; you need to handle the error and report this to userspace at most
> once.
>
I will skip handling this issue for older kernels because it is
probably fine that user space never gets an invalid value to handle.
> Also, don't wrap your strings; checkpatch should have told you this.
>
That is correct, and I was ignoring it because the implementation has
another wrapped string somewhere else. It is a warning, not an error.
Will send a v2 that does not wrap the string and drops the Fixes and
Cc: stable tags.
- Alex
* [Cluster-devel] [PATCH v6.5-rc1 1/2] fs: dlm: introduce DLM_PLOCK_FL_NO_REPLY flag
From: Alexander Aring @ 2023-07-13 15:08 UTC
To: cluster-devel.redhat.com
Hi,
On Thu, Jul 13, 2023 at 10:57 AM Alexander Aring <aahringo@redhat.com> wrote:
>
> Hi,
>
> On Thu, Jul 13, 2023 at 10:49 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Thu, Jul 13, 2023 at 10:40:28AM -0400, Alexander Aring wrote:
> > > This patch introduces a new flag, DLM_PLOCK_FL_NO_REPLY, to mark dlm
> > > plock operations that should not get a reply back. Currently this is
> > > loosely handled via DLM_PLOCK_FL_CLOSE, but that flag carries an
> > > additional meaning: it removes all waiters for specific nodeid/owner
> > > values by doing an unlock operation. If an error occurs in the dlm
> > > user space software, e.g. dlm_controld, we get a reply with an error
> > > back. Such a reply cannot be matched because there is no op to match
> > > in recv_list. We now filter on DLM_PLOCK_FL_NO_REPLY when an error
> > > comes back as a reply. Newer dlm_controld versions will never send a
> > > result back when DLM_PLOCK_FL_NO_REPLY is set; this filter is a
> > > workaround to handle older dlm_controld versions.
> > >
> > > Fixes: 901025d2f319 ("dlm: make plock operation killable")
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Alexander Aring <aahringo@redhat.com>
> >
> > Why is adding a new uapi a stable patch?
> >
>
> because user space just copies the flags back to the kernel. I
> thought it would work. :)
>
* Speaking of dlm_controld here: we don't know of any other
implementation that uses this UAPI. If there is another user space
application using it that behaves differently, then this issue is
unfixable, as we don't know what behaviour we would get there.
- Alex
* [Cluster-devel] [PATCH v6.5-rc1 1/2] fs: dlm: introduce DLM_PLOCK_FL_NO_REPLY flag
From: Andreas Gruenbacher @ 2023-07-14 13:54 UTC
To: cluster-devel.redhat.com
On Thu, Jul 13, 2023 at 4:40 PM Alexander Aring <aahringo@redhat.com> wrote:
> This patch introduces a new flag, DLM_PLOCK_FL_NO_REPLY, to mark dlm
> plock operations that should not get a reply back. Currently this is
> loosely handled via DLM_PLOCK_FL_CLOSE, but that flag carries an
> additional meaning: it removes all waiters for specific nodeid/owner
> values by doing an unlock operation. If an error occurs in the dlm
> user space software, e.g. dlm_controld, we get a reply with an error
> back. Such a reply cannot be matched because there is no op to match
> in recv_list. We now filter on DLM_PLOCK_FL_NO_REPLY when an error
> comes back as a reply. Newer dlm_controld versions will never send a
> result back when DLM_PLOCK_FL_NO_REPLY is set; this filter is a
> workaround to handle older dlm_controld versions.
I don't think this makes sense. If dlm_controld understands a
particular request, it already knows whether or not that request
should receive a reply. On the other hand, if dlm_controld doesn't
understand a particular request, it should communicate that fact back
to the kernel so that the kernel will know. The kernel knows which
kinds of requests should and shouldn't receive replies, so when it is
sent a reply it doesn't expect, it knows that dlm_controld didn't
understand the request and is either outdated or plain broken. The
kernel doesn't need to pipe a flag through dlm_controld for figuring
that out.
Thanks,
Andreas
> Fixes: 901025d2f319 ("dlm: make plock operation killable")
> Cc: stable@vger.kernel.org
> Signed-off-by: Alexander Aring <aahringo@redhat.com>
> ---
> fs/dlm/plock.c | 23 +++++++++++++++++++----
> include/uapi/linux/dlm_plock.h | 1 +
> 2 files changed, 20 insertions(+), 4 deletions(-)
>
> diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
> index 70a4752ed913..7fe9f4b922d3 100644
> --- a/fs/dlm/plock.c
> +++ b/fs/dlm/plock.c
> @@ -96,7 +96,7 @@ static void do_unlock_close(const struct dlm_plock_info *info)
> op->info.end = OFFSET_MAX;
> op->info.owner = info->owner;
>
> - op->info.flags |= DLM_PLOCK_FL_CLOSE;
> + op->info.flags |= (DLM_PLOCK_FL_CLOSE | DLM_PLOCK_FL_NO_REPLY);
> send_op(op);
> }
>
> @@ -293,7 +293,7 @@ int dlm_posix_unlock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
> op->info.owner = (__u64)(long) fl->fl_owner;
>
> if (fl->fl_flags & FL_CLOSE) {
> - op->info.flags |= DLM_PLOCK_FL_CLOSE;
> + op->info.flags |= (DLM_PLOCK_FL_CLOSE | DLM_PLOCK_FL_NO_REPLY);
> send_op(op);
> rv = 0;
> goto out;
> @@ -392,7 +392,7 @@ static ssize_t dev_read(struct file *file, char __user *u, size_t count,
> spin_lock(&ops_lock);
> if (!list_empty(&send_list)) {
> op = list_first_entry(&send_list, struct plock_op, list);
> - if (op->info.flags & DLM_PLOCK_FL_CLOSE)
> + if (op->info.flags & DLM_PLOCK_FL_NO_REPLY)
> list_del(&op->list);
> else
> list_move_tail(&op->list, &recv_list);
> @@ -407,7 +407,7 @@ static ssize_t dev_read(struct file *file, char __user *u, size_t count,
> that were generated by the vfs cleaning up for a close
> (the process did not make an unlock call). */
>
> - if (op->info.flags & DLM_PLOCK_FL_CLOSE)
> + if (op->info.flags & DLM_PLOCK_FL_NO_REPLY)
> dlm_release_plock_op(op);
>
> if (copy_to_user(u, &info, sizeof(info)))
> @@ -433,6 +433,21 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
> if (check_version(&info))
> return -EINVAL;
>
> + /* Some old dlm user space software will send replies back
> + * even if DLM_PLOCK_FL_NO_REPLY is set (because the flag is
> + * new), e.g. if an error occurs. We can't match them in
> + * recv_list because they were never part of it. We filter
> + * them here; new dlm user space software will filter them
> + * in user space.
> + *
> + * In the future this handling can be removed.
> + */
> + if (info.flags & DLM_PLOCK_FL_NO_REPLY) {
> + pr_info("Received unexpected reply from op %d, "
> + "please update DLM user space software!\n",
> + info.optype);
> + return count;
> + }
> +
> /*
> * The results for waiting ops (SETLKW) can be returned in any
> * order, so match all fields to find the op. The results for
> diff --git a/include/uapi/linux/dlm_plock.h b/include/uapi/linux/dlm_plock.h
> index 63b6c1fd9169..8dfa272c929a 100644
> --- a/include/uapi/linux/dlm_plock.h
> +++ b/include/uapi/linux/dlm_plock.h
> @@ -25,6 +25,7 @@ enum {
> };
>
> #define DLM_PLOCK_FL_CLOSE 1
> +#define DLM_PLOCK_FL_NO_REPLY 2
>
> struct dlm_plock_info {
> __u32 version[3];
> --
> 2.31.1
>
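A hedged sketch of the alternative Andreas describes, assuming the existing
dev_write() matching: if no pending op in recv_list matches a reply, the
kernel already knows it never expected one, without any flag round-trip.

	/* Sketch only: after the recv_list lookup in dev_write(). */
	if (!op) {
		pr_debug("dlm: unsolicited plock reply for op %d, outdated or broken user space?\n",
			 info.optype);
		return count;
	}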
* [Cluster-devel] [PATCH v6.5-rc1 1/2] fs: dlm: introduce DLM_PLOCK_FL_NO_REPLY flag
From: Alexander Aring @ 2023-07-14 23:21 UTC
To: cluster-devel.redhat.com
Hi,
On Fri, Jul 14, 2023 at 9:54 AM Andreas Gruenbacher <agruenba@redhat.com> wrote:
>
> On Thu, Jul 13, 2023 at 4:40 PM Alexander Aring <aahringo@redhat.com> wrote:
> > This patch introduces a new flag, DLM_PLOCK_FL_NO_REPLY, to mark dlm
> > plock operations that should not get a reply back. Currently this is
> > loosely handled via DLM_PLOCK_FL_CLOSE, but that flag carries an
> > additional meaning: it removes all waiters for specific nodeid/owner
> > values by doing an unlock operation. If an error occurs in the dlm
> > user space software, e.g. dlm_controld, we get a reply with an error
> > back. Such a reply cannot be matched because there is no op to match
> > in recv_list. We now filter on DLM_PLOCK_FL_NO_REPLY when an error
> > comes back as a reply. Newer dlm_controld versions will never send a
> > result back when DLM_PLOCK_FL_NO_REPLY is set; this filter is a
> > workaround to handle older dlm_controld versions.
>
> I don't think this makes sense. If dlm_controld understands a
> particular request, it already knows whether or not that request
> should receive a reply. On the other hand, if dlm_controld doesn't
> understand a particular request, it should communicate that fact back
> to the kernel so that the kernel will know. The kernel knows which
> kinds of requests should and shouldn't receive replies, so when it is
> sent a reply it doesn't expect, it knows that dlm_controld didn't
> understand the request and is either outdated or plain broken. The
> kernel doesn't need to pipe a flag through dlm_controld for figuring
> that out.
>
It's already part of the UAPI that a flag signals that a reply is not
expected [0]. If this flag is set and the current user space software
did not understand the request (or hits any possible future user space
error handling), it will send a reply back. That reply cannot be
matched, and with the current matching logic it may even match the
wrong op.
I would say it is broken, and probably we don't care, because we never
assume that an error will happen; we just need to be careful not to
add any other possible error reply in the future.
The bigger problem is introducing new optypes [1] (not flags) on the
kernel side that do not expect a reply. Older user space software will
not understand the optype and will send a reply; newer software will
not send a reply (because it understands the optype). Therefore I
think we should never introduce a new optype that does not send a
reply back. The new DLM_PLOCK_OP_CANCEL was trying to do exactly that;
that's why I tried to fix this behaviour, which ends up being yet
another ugly workaround of passing flags around.
- Alex
[0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/dlm/plock.c#n395
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/dlm_plock.h?h=v6.5-rc1#n21
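The matching hazard Alexander refers to can be seen in the dev_write()
loop quoted in the patches (condensed here; the pid check for waiting ops
is elided):

	list_for_each_entry(iter, &recv_list, list) {
		if (iter->info.fsid == info.fsid &&
		    iter->info.number == info.number &&
		    iter->info.owner == info.owner) {
			/* An unsolicited reply carrying the same
			 * fsid/number/owner as some pending op will
			 * "match" it and complete it with the wrong
			 * result. */
			op = iter;
			break;
		}
	}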