From: James Simmons <jsimmons@infradead.org>
To: Andreas Dilger <adilger@whamcloud.com>,
Oleg Drokin <green@whamcloud.com>, NeilBrown <neilb@suse.de>
Cc: Lustre Development List <lustre-devel@lists.lustre.org>
Subject: [lustre-devel] [PATCH 12/27] lustre: obdclass: fix rpc slot leakage
Date: Mon, 17 Apr 2023 09:47:08 -0400
Message-ID: <1681739243-29375-13-git-send-email-jsimmons@infradead.org>
In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org>
From: Alex Zhuravlev <bzzz@whamcloud.com>
obd_get_mod_rpc_slot() can race with obd_put_mod_rpc_slot():
when wait_woken() finishes, it resets WQ_FLAG_WOKEN (which is set
when the corresponding thread gets a slot, incrementing
cl_mod_rpcs_in_flight). Another thread executing
__wake_up_locked_key() may then find that wq_entry again, since it
stays on the wait queue until list_del(), and call
claim_mod_rpc_function() a second time, incrementing
cl_mod_rpcs_in_flight again. Thus the counter is incremented twice
for a single obd_get_mod_rpc_slot().
flags &= ~WQ_FLAG_WOKEN
list_add()
wait_woken()
  schedule               claim_mod_rpc_function()
                         cl_mod_rpcs_in_flight++
                         wake_up()
  flags &= ~WQ_FLAG_WOKEN
                                                  #3: obd_put_mod_rpc_slot()
                                                  claim_mod_rpc_function()
                                                  cl_mod_rpcs_in_flight++
                                                  wake_up()
list_del()
The patch introduces a replacement for WQ_FLAG_WOKEN, the per-waiter
mod_waiter.woken flag, which is never reset once set.
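
The interleaving described above can be replayed in a small user-space model. This is only a sketch under stated assumptions, not the kernel code: model_waiter, model_client, model_claim(), model_replay() and MODEL_FLAG_WOKEN are hypothetical names invented for illustration, and the wait queue is reduced to a single entry with a "queued" marker.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_FLAG_WOKEN 0x1u	/* hypothetical stand-in for WQ_FLAG_WOKEN */

struct model_waiter {
	unsigned int flags;	/* cleared by the waiter when its wait finishes */
	bool woken;		/* sticky flag: set once, never reset */
	bool queued;		/* entry is still on the wait queue */
};

struct model_client {
	int rpcs_in_flight;	/* stand-in for cl_mod_rpcs_in_flight */
	int max_rpcs;
};

/* Wake callback: claim a slot for the waiter unless it already holds one. */
int model_claim(struct model_client *cli, struct model_waiter *w,
		bool use_sticky)
{
	bool claimed = use_sticky ? w->woken
				  : (w->flags & MODEL_FLAG_WOKEN) != 0;

	if (claimed)
		return 0;
	if (cli->rpcs_in_flight >= cli->max_rpcs)
		return -1;
	cli->rpcs_in_flight++;
	w->flags |= MODEL_FLAG_WOKEN;
	w->woken = true;
	return 1;
}

/* Replay the interleaving; return the slots charged to the single waiter. */
int model_replay(bool use_sticky)
{
	struct model_client cli = { .rpcs_in_flight = 0, .max_rpcs = 2 };
	struct model_waiter w = { .flags = 0, .woken = false, .queued = true };

	/* first releasing thread walks the queue and claims a slot for w */
	model_claim(&cli, &w, use_sticky);
	/* waiter leaves its wait: the WOKEN bit is cleared, entry still queued */
	w.flags &= ~MODEL_FLAG_WOKEN;
	/* second releasing thread walks the queue before the waiter dequeues */
	if (w.queued)
		model_claim(&cli, &w, use_sticky);
	/* waiter finally removes its entry */
	w.queued = false;

	assert(cli.rpcs_in_flight <= cli.max_rpcs);
	return cli.rpcs_in_flight;
}
```

In this model, model_replay(false) charges two slots to the one waiter because the resettable bit is gone by the time the second walk happens, while model_replay(true) charges one because the sticky flag still says the slot was claimed.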
Fixes: 6d398c0843 ("lustre: obdclass: improve precision of wakeups for mod_rpcs")
WC-bug-id: https://jira.whamcloud.com/browse/LU-16633
Lustre-commit: 91a3726f313df33e09 ("LU-16633 obdclass: fix rpc slot leakage")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50261
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/mdc/mdc_request.c |  3 +++
 fs/lustre/obdclass/genops.c | 11 +++++++----
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/lustre/mdc/mdc_request.c b/fs/lustre/mdc/mdc_request.c
index 58ea982..15e58e8 100644
--- a/fs/lustre/mdc/mdc_request.c
+++ b/fs/lustre/mdc/mdc_request.c
@@ -2964,6 +2964,9 @@ static int mdc_precleanup(struct obd_device *obd)
 
 static int mdc_cleanup(struct obd_device *obd)
 {
+	struct client_obd *cli = &obd->u.cli;
+
+	LASSERT(cli->cl_mod_rpcs_in_flight == 0);
 	return osc_cleanup_common(obd);
 }
 
diff --git a/fs/lustre/obdclass/genops.c b/fs/lustre/obdclass/genops.c
index b6bde00..43772aa 100644
--- a/fs/lustre/obdclass/genops.c
+++ b/fs/lustre/obdclass/genops.c
@@ -1487,6 +1487,7 @@ int obd_mod_rpc_stats_seq_show(struct client_obd *cli, struct seq_file *seq)
 struct mod_waiter {
 	struct client_obd *cli;
 	bool close_req;
+	bool woken;
 	wait_queue_entry_t wqe;
 };
 static int claim_mod_rpc_function(wait_queue_entry_t *wq_entry,
@@ -1499,10 +1500,9 @@ static int claim_mod_rpc_function(wait_queue_entry_t *wq_entry,
 	int ret;
 
 	/* As woken_wake_function() doesn't remove us from the wait_queue,
-	 * we could get called twice for the same thread - take care.
+	 * we use our own flag to ensure we're called just once.
 	 */
-	if (wq_entry->flags & WQ_FLAG_WOKEN)
-		/* Already woke this thread, don't try again */
+	if (w->woken)
 		return 0;
 
 	/* A slot is available if
@@ -1516,6 +1516,7 @@ static int claim_mod_rpc_function(wait_queue_entry_t *wq_entry,
 		if (w->close_req)
 			cli->cl_close_rpcs_in_flight++;
 		ret = woken_wake_function(wq_entry, mode, flags, key);
+		w->woken = true;
 	} else if (cli->cl_close_rpcs_in_flight)
 		/* No other waiter could be woken */
 		ret = -1;
@@ -1543,6 +1544,7 @@ u16 obd_get_mod_rpc_slot(struct client_obd *cli, u32 opc)
 	struct mod_waiter wait = {
 		.cli = cli,
 		.close_req = (opc == MDS_CLOSE),
+		.woken = false,
 	};
 	u16 i, max;
 
@@ -1556,7 +1558,8 @@ u16 obd_get_mod_rpc_slot(struct client_obd *cli, u32 opc)
 	 * and there will be no need to wait.
 	 */
 	wake_up_locked(&cli->cl_mod_rpcs_waitq);
-	if (!(wait.wqe.flags & WQ_FLAG_WOKEN)) {
+	/* XXX: handle spurious wakeups (from a yet unknown source) */
+	while (wait.woken == false) {
 		spin_unlock_irq(&cli->cl_mod_rpcs_waitq.lock);
 		wait_woken(&wait.wqe, TASK_UNINTERRUPTIBLE,
 			   MAX_SCHEDULE_TIMEOUT);
--
1.8.3.1
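
The last hunk also turns the single "if" around wait_woken() into a "while" on the new flag. The effect of that change can be sketched in user space with scripted wakeups. This is a hedged model, not the kernel wait machinery: model_wait_for_slot() and its parameters are hypothetical, and each array element stands for one return from the sleep, true only when a waker had actually granted the slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Consume scripted wakeups until the waiter proceeds.
 * granted[i] is true if the i-th wakeup came from a waker that set the
 * waiter's flag, false if it was spurious. Returns whether the waiter
 * holds a slot when it proceeds; *sleeps counts how often it slept.
 */
bool model_wait_for_slot(const bool *granted, size_t n, bool recheck,
			 size_t *sleeps)
{
	bool woken = false;
	size_t i = 0;

	*sleeps = 0;
	if (recheck) {
		/* patched shape: while (!woken) sleep again */
		while (!woken && i < n) {
			woken = granted[i++];
			(*sleeps)++;
		}
	} else if (i < n) {
		/* old shape: if (!woken) sleep exactly once */
		woken = granted[i++];
		(*sleeps)++;
	}
	return woken;
}
```

With the script { false, false, true }, the one-shot variant proceeds after a single spurious wakeup without a slot, whereas the re-checking variant sleeps three times and proceeds only once the slot has been granted.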