From: James Simmons <jsimmons@infradead.org>
To: Andreas Dilger <adilger@whamcloud.com>,
	Oleg Drokin <green@whamcloud.com>, NeilBrown <neilb@suse.de>
Cc: Lustre Development List <lustre-devel@lists.lustre.org>
Subject: [lustre-devel] [PATCH 12/27] lustre: obdclass: fix rpc slot leakage
Date: Mon, 17 Apr 2023 09:47:08 -0400
Message-ID: <1681739243-29375-13-git-send-email-jsimmons@infradead.org>
In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org>

From: Alex Zhuravlev <bzzz@whamcloud.com>

obd_get_mod_rpc_slot() can race with obd_put_mod_rpc_slot():
finishing wait_woken() resets WQ_FLAG_WOKEN (which is set
when the corresponding thread gets a slot, incrementing
cl_mod_rpcs_in_flight). then another thread executing
__wake_up_locked_key() may find that wq_entry again and call
claim_mod_rpc_function() one more time, again incrementing
cl_mod_rpcs_in_flight. thus it's incremented twice for a
single obd_get_mod_rpc_slot().

flags &= ~WQ_FLAG_WOKEN
list_add()
wait_woken()
  schedule              claim_mod_rpc_function()
                                cl_mod_rpcs_in_flight++
                                wake_up()

  flags &= ~WQ_FLAG_WOKEN

                        #3: obd_put_mod_rpc_slot()
                        claim_mod_rpc_function()
                                cl_mod_rpcs_in_flight++
                                wake_up()
list_del()

the patch introduces a replacement for WQ_FLAG_WOKEN which is never
reset once set.

Fixes: 6d398c0843 ("lustre: obdclass: improve precision of wakeups for mod_rpcs")
WC-bug-id: https://jira.whamcloud.com/browse/LU-16633
Lustre-commit: 91a3726f313df33e09 ("LU-16633 obdclass: fix rpc slot leakage")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50261
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/mdc/mdc_request.c |  3 +++
 fs/lustre/obdclass/genops.c | 11 +++++++----
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/lustre/mdc/mdc_request.c b/fs/lustre/mdc/mdc_request.c
index 58ea982..15e58e8 100644
--- a/fs/lustre/mdc/mdc_request.c
+++ b/fs/lustre/mdc/mdc_request.c
@@ -2964,6 +2964,9 @@ static int mdc_precleanup(struct obd_device *obd)
 
 static int mdc_cleanup(struct obd_device *obd)
 {
+	struct client_obd *cli = &obd->u.cli;
+
+	LASSERT(cli->cl_mod_rpcs_in_flight == 0);
 	return osc_cleanup_common(obd);
 }
 
diff --git a/fs/lustre/obdclass/genops.c b/fs/lustre/obdclass/genops.c
index b6bde00..43772aa 100644
--- a/fs/lustre/obdclass/genops.c
+++ b/fs/lustre/obdclass/genops.c
@@ -1487,6 +1487,7 @@ int obd_mod_rpc_stats_seq_show(struct client_obd *cli, struct seq_file *seq)
 struct mod_waiter {
 	struct client_obd *cli;
 	bool close_req;
+	bool woken;
 	wait_queue_entry_t wqe;
 };
 static int claim_mod_rpc_function(wait_queue_entry_t *wq_entry,
@@ -1499,10 +1500,9 @@ static int claim_mod_rpc_function(wait_queue_entry_t *wq_entry,
 	int ret;
 
 	/* As woken_wake_function() doesn't remove us from the wait_queue,
-	 * we could get called twice for the same thread - take care.
+	 * we use own flag to ensure we're called just once.
 	 */
-	if (wq_entry->flags & WQ_FLAG_WOKEN)
-		/* Already woke this thread, don't try again */
+	if (w->woken)
 		return 0;
 
 	/* A slot is available if
@@ -1516,6 +1516,7 @@ static int claim_mod_rpc_function(wait_queue_entry_t *wq_entry,
 		if (w->close_req)
 			cli->cl_close_rpcs_in_flight++;
 		ret = woken_wake_function(wq_entry, mode, flags, key);
+		w->woken = true;
 	} else if (cli->cl_close_rpcs_in_flight)
 		/* No other waiter could be woken */
 		ret = -1;
@@ -1543,6 +1544,7 @@ u16 obd_get_mod_rpc_slot(struct client_obd *cli, u32 opc)
 	struct mod_waiter wait = {
 		.cli = cli,
 		.close_req = (opc == MDS_CLOSE),
+		.woken = false,
 	};
 	u16 i, max;
 
@@ -1556,7 +1558,8 @@ u16 obd_get_mod_rpc_slot(struct client_obd *cli, u32 opc)
 	 * and there will be no need to wait.
 	 */
 	wake_up_locked(&cli->cl_mod_rpcs_waitq);
-	if (!(wait.wqe.flags & WQ_FLAG_WOKEN)) {
+	/* XXX: handle spurious wakeups (from unknown yet source */
+	while (wait.woken == false) {
 		spin_unlock_irq(&cli->cl_mod_rpcs_waitq.lock);
 		wait_woken(&wait.wqe, TASK_UNINTERRUPTIBLE,
 			   MAX_SCHEDULE_TIMEOUT);
-- 
1.8.3.1
