From: James Simmons <jsimmons@infradead.org>
To: Andreas Dilger <adilger@whamcloud.com>,
Oleg Drokin <green@whamcloud.com>, NeilBrown <neilb@suse.de>
Cc: Lustre Development List <lustre-devel@lists.lustre.org>
Subject: [lustre-devel] [PATCH 15/15] lustre: mgc: configurable wait-to-reprocess time
Date: Wed, 7 Jul 2021 15:11:16 -0400
Message-ID: <1625685076-1964-16-git-send-email-jsimmons@infradead.org>
In-Reply-To: <1625685076-1964-1-git-send-email-jsimmons@infradead.org>
From: Alex Zhuravlev <bzzz@whamcloud.com>
Make the MGC wait-to-reprocess time configurable so that it can be
set shorter, at least for testing purposes. To change the minimal
wait time, use the MGC module option 'mgc_requeue_timeout_min'
(in seconds). Additionally, a random value of up to
mgc_requeue_timeout_min seconds is added to avoid a flood of config
re-read requests from clients. If mgc_requeue_timeout_min is set to 0,
the random part will be up to 1 second.
ost-pools: before: 5840s, after: 3474s
sanity-flr: before: 1575s, after: 1381s
sanity-quota: before: 10679s, after: 9703s
WC-bug-id: https://jira.whamcloud.com/browse/LU-14516
Lustre-commit: 04b2da6180d3c8eda ("LU-14516 mgc: configurable wait-to-reprocess time")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/42020
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
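The wait computed by the patched mgc_requeue_thread() can be sketched in
userspace as follows. This is only an illustration under stated
assumptions: HZ is fixed at 1000, rand_below() is a hypothetical stand-in
for the kernel's prandom_u32_max(), and requeue_wait_jiffies() is a name
invented for this note, not a function in the patch.

```c
#include <assert.h>
#include <stdlib.h>

#define HZ 1000			/* assumed tick rate for this sketch */
#define REQUEUE_TIMEOUT_MAX 120	/* upper bound enforced by the patch's setter */

/* Hypothetical stand-in for prandom_u32_max(): value in [0, ceil). */
static unsigned int rand_below(unsigned int ceil)
{
	return ceil ? (unsigned int)rand() % ceil : 0;
}

/*
 * Fixed part plus random part, both in jiffies. A setting of 0 still
 * gets a random part of up to one second.
 */
static unsigned int requeue_wait_jiffies(unsigned int timeout_min)
{
	unsigned int span = timeout_min == 0 ? 1 : timeout_min;

	/* the setter rejects larger values, so the sum cannot overflow */
	assert(timeout_min <= REQUEUE_TIMEOUT_MAX);

	return timeout_min * HZ + rand_below(span * HZ);
}
```

Under these assumptions, the default of 5 gives a wait in [5000, 10000)
jiffies, and a setting of 0 gives a wait in [0, 1000) jiffies.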
fs/lustre/mgc/mgc_internal.h | 8 ++++++++
fs/lustre/mgc/mgc_request.c | 44 +++++++++++++++++++++++++++++++++-----------
2 files changed, 41 insertions(+), 11 deletions(-)
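The validation done by the new parameter setter (parse an unsigned
integer, reject anything above 120) can be approximated in userspace as
below. This is a rough sketch, not the kernel code: strtoul() is used in
place of kstrtouint(), the two do not agree on every edge case, and
requeue_timeout_min_set() and REQUEUE_TIMEOUT_MAX are names made up for
this note.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

#define REQUEUE_TIMEOUT_MAX 120	/* same upper bound as in the patch */

static unsigned int requeue_timeout_min = 5;	/* patch default, in seconds */

/* Hypothetical userspace analogue of mgc_param_requeue_timeout_min_set(). */
static int requeue_timeout_min_set(const char *val)
{
	unsigned long num;
	char *end;

	assert(val != NULL);

	/* strtoul() would accept "-1" and leading spaces; refuse them here */
	if (!isdigit((unsigned char)val[0]))
		return -EINVAL;

	errno = 0;
	num = strtoul(val, &end, 0);	/* base 0: decimal, octal or hex */
	if (errno == ERANGE)
		return -ERANGE;

	/* allow one trailing newline, as written by "echo 10 > param" */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	if (num > REQUEUE_TIMEOUT_MAX)
		return -EINVAL;

	requeue_timeout_min = (unsigned int)num;
	return 0;
}
```

In this sketch an out-of-range or malformed value leaves the previous
setting untouched, which matches the patch: the global is only assigned
after every check has passed.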
diff --git a/fs/lustre/mgc/mgc_internal.h b/fs/lustre/mgc/mgc_internal.h
index a2a09d4..91f5fa1 100644
--- a/fs/lustre/mgc/mgc_internal.h
+++ b/fs/lustre/mgc/mgc_internal.h
@@ -43,6 +43,14 @@
int mgc_process_log(struct obd_device *mgc, struct config_llog_data *cld);
+/* this timeout represents how many seconds MGC should wait before
+ * requeue config and recover lock to the MGS. We need to randomize this
+ * in order to not flood the MGS.
+ */
+#define MGC_TIMEOUT_MIN_SECONDS 5
+
+extern unsigned int mgc_requeue_timeout_min;
+
static inline bool cld_is_sptlrpc(struct config_llog_data *cld)
{
return cld->cld_type == MGS_CFG_T_SPTLRPC;
diff --git a/fs/lustre/mgc/mgc_request.c b/fs/lustre/mgc/mgc_request.c
index 1dfc74b..50044aa2 100644
--- a/fs/lustre/mgc/mgc_request.c
+++ b/fs/lustre/mgc/mgc_request.c
@@ -530,13 +530,6 @@ static void do_requeue(struct config_llog_data *cld)
up_read(&cld->cld_mgcexp->exp_obd->u.cli.cl_sem);
}
-/* this timeout represents how many seconds MGC should wait before
- * requeue config and recover lock to the MGS. We need to randomize this
- * in order to not flood the MGS.
- */
-#define MGC_TIMEOUT_MIN_SECONDS 5
-#define MGC_TIMEOUT_RAND_CENTISEC 500
-
static int mgc_requeue_thread(void *data)
{
bool first = true;
@@ -548,7 +541,6 @@ static int mgc_requeue_thread(void *data)
rq_state |= RQ_RUNNING;
while (!(rq_state & RQ_STOP)) {
struct config_llog_data *cld, *cld_prev;
- int rand = prandom_u32_max(MGC_TIMEOUT_RAND_CENTISEC);
int to;
/* Any new or requeued lostlocks will change the state */
@@ -565,11 +557,11 @@ static int mgc_requeue_thread(void *data)
* random so everyone doesn't try to reconnect at once.
*/
/* rand is centi-seconds, "to" is in centi-HZ */
- to = MGC_TIMEOUT_MIN_SECONDS * HZ * 100;
- to += rand * HZ;
+ to = mgc_requeue_timeout_min == 0 ? 1 : mgc_requeue_timeout_min;
+ to = mgc_requeue_timeout_min * HZ + prandom_u32_max(to * HZ);
wait_event_idle_timeout(rq_waitq,
rq_state & (RQ_STOP | RQ_PRECLEANUP),
- to/100);
+ to);
/*
* iterate & processing through the list. for each cld, process
@@ -1835,6 +1827,36 @@ static int mgc_process_config(struct obd_device *obd, u32 len, void *buf)
.process_config = mgc_process_config,
};
+static int mgc_param_requeue_timeout_min_set(const char *val,
+ const struct kernel_param *kp)
+{
+ int rc;
+ unsigned int num;
+
+ rc = kstrtouint(val, 0, &num);
+ if (rc < 0)
+ return rc;
+ if (num > 120)
+ return -EINVAL;
+
+ mgc_requeue_timeout_min = num;
+
+ return 0;
+}
+
+static struct kernel_param_ops param_ops_requeue_timeout_min = {
+ .set = mgc_param_requeue_timeout_min_set,
+ .get = param_get_uint,
+};
+
+#define param_check_requeue_timeout_min(name, p) \
+ __param_check(name, p, unsigned int)
+
+unsigned int mgc_requeue_timeout_min = MGC_TIMEOUT_MIN_SECONDS;
+module_param_call(mgc_requeue_timeout_min, mgc_param_requeue_timeout_min_set,
+ param_get_uint, &param_ops_requeue_timeout_min, 0644);
+MODULE_PARM_DESC(mgc_requeue_timeout_min, "Minimal requeue time to refresh logs");
+
static int __init mgc_init(void)
{
int rc;
--
1.8.3.1