cluster-devel.redhat.com archive mirror
From: Alexander Aring <aahringo@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCH RESEND v5.19-rc3 06/20] fs: dlm: change posix lock sigint handling
Date: Wed, 22 Jun 2022 14:45:09 -0400
Message-ID: <20220622184523.1886869-7-aahringo@redhat.com>
In-Reply-To: <20220622184523.1886869-1-aahringo@redhat.com>

This patch changes how an interrupted plock operation is handled while
it waits for a user space reply (most likely from dlm_controld). This is
not the wait in which the posix lock sits in the blocking state, which is
done by locks_lock_file_wait(). However, the lock operation should always
be interruptible, no matter which wait is currently blocking the
process.

If the wait for a user space reply is interrupted, the current
behaviour is to remove the already transmitted operation request from
the list that is used to look up the op when a reply comes back. As a
side effect we see some:

dev_write no op...

in the kernel log because the lookup failed. This is easily reproducible
by running:

stress-ng --fcntl 100

and pressing Ctrl-C afterwards.

Instead of removing the op from the lookup list, we leave it there until
the operation is completed. Once it has completed we check whether the
wait was interrupted; if so, we don't handle the request any further and
clean up the original lock request. If "dev_write no op" messages still
show up, they signal that an op was removed while it hadn't been
completed yet. This situation should never happen.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
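Note: the handshake described above can be modelled in plain user space
C. This is only a rough sketch to illustrate the idea, not the kernel
code: struct model_op, model_send(), model_interrupt(), model_reply()
and the no_op_warnings counter are hypothetical names, the list is a
simple singly linked list, and the ops_lock locking is left out because
the model is single threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct plock_op (hypothetical, not the kernel type). */
struct model_op {
	struct model_op *next;		/* lookup list linkage */
	unsigned long long number;	/* key used to match a reply */
	bool done;			/* reply was handed to the waiter */
	bool sigint;			/* waiter gave up before the reply */
	bool released;			/* stands in for dlm_release_plock_op() */
};

static struct model_op *send_list;	/* ops waiting for a user space reply */
static int no_op_warnings;		/* counts "dev_write no op" events */

/* Queue an op for user space; in the kernel this happens under ops_lock. */
static void model_send(struct model_op *op)
{
	assert(!op->done);
	op->next = send_list;
	send_list = op;
}

/*
 * Waiter side, called when the wait was interrupted.
 * Returns false if the reply already arrived, in which case the
 * interruption is ignored. Otherwise the op stays on the list and is
 * only marked, so a later reply still finds it.
 */
static bool model_interrupt(struct model_op *op)
{
	if (op->done)
		return false;
	op->sigint = true;
	return true;
}

/* Reply side, modelled on the lookup done in dev_write(). */
static void model_reply(unsigned long long number)
{
	struct model_op **pp;

	for (pp = &send_list; *pp; pp = &(*pp)->next) {
		struct model_op *op = *pp;

		if (op->number != number)
			continue;

		*pp = op->next;		/* unlink from the lookup list */
		op->next = NULL;
		if (op->sigint)
			op->released = true;	/* cleanup, nobody is waiting */
		else
			op->done = true;	/* wake the waiter */
		return;
	}
	no_op_warnings++;	/* no matching op: the case that should not happen */
}
```

In this model, interrupting an op and then delivering its reply ends
with released set and no_op_warnings still at zero. The counter only
goes up when a reply arrives for a number that was never queued.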
 fs/dlm/plock.c | 29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index cf7bba461bfd..737f185aad8d 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -29,6 +29,8 @@ struct plock_async_data {
 struct plock_op {
 	struct list_head list;
 	int done;
+	/* if lock op got interrupted while waiting dlm_controld reply */
+	bool sigint;
 	struct dlm_plock_info info;
 	/* if set indicates async handling */
 	struct plock_async_data *data;
@@ -157,16 +159,24 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 	rv = wait_event_interruptible(recv_wq, (op->done != 0));
 	if (rv == -ERESTARTSYS) {
 		spin_lock(&ops_lock);
-		list_del(&op->list);
+		/* recheck under ops_lock if we got a done != 0,
+		 * if so this interrupt case should be ignored
+		 */
+		if (op->done != 0) {
+			spin_unlock(&ops_lock);
+			goto do_lock_wait;
+		}
+
+		op->sigint = true;
 		spin_unlock(&ops_lock);
 		log_debug(ls, "%s: wait interrupted %x %llx pid %d",
 			  __func__, ls->ls_global_id,
 			  (unsigned long long)number, op->info.pid);
-		dlm_release_plock_op(op);
-		do_unlock_close(&op->info);
 		goto out;
 	}
 
+do_lock_wait:
+
 	WARN_ON(!list_empty(&op->list));
 
 	rv = op->info.rv;
@@ -421,6 +431,19 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
 		if (iter->info.fsid == info.fsid &&
 		    iter->info.number == info.number &&
 		    iter->info.owner == info.owner) {
+			if (iter->sigint) {
+				list_del(&iter->list);
+				spin_unlock(&ops_lock);
+
+				pr_debug("%s: sigint cleanup %x %llx pid %d",
+					  __func__, iter->info.fsid,
+					  (unsigned long long)iter->info.number,
+					  iter->info.pid);
+				do_unlock_close(&iter->info);
+				memcpy(&iter->info, &info, sizeof(info));
+				dlm_release_plock_op(iter);
+				return count;
+			}
 			list_del_init(&iter->list);
 			memcpy(&iter->info, &info, sizeof(info));
 			if (iter->data)
-- 
2.31.1

