From: James Simmons <jsimmons@infradead.org>
To: lustre-devel@lists.lustre.org
Subject: [lustre-devel] [PATCH 21/42] lnet: Loosen restrictions on LNet Health params
Date: Mon, 5 Oct 2020 20:06:00 -0400 [thread overview]
Message-ID: <1601942781-24950-22-git-send-email-jsimmons@infradead.org> (raw)
In-Reply-To: <1601942781-24950-1-git-send-email-jsimmons@infradead.org>
From: Chris Horn <chris.horn@hpe.com>
The functions that set various LNet Health related parameters require
that the parameters be set in a specific order depending on whether
health is enabled or disabled. This is not user-friendly.
- Don't overwrite lnet_transaction_timeout when health is being
enabled or disabled.
- Don't overwrite lnet_retry_count when health is being enabled
(still set it to zero when health is disabled).
- Allow lnet_retry_count to be set to 0 when health is disabled
- Correct off-by-one error in transaction_to_set() to ensure
lnet_transaction_timeout is greater than lnet_retry_count
HPE-bug-id: LUS-8995
WC-bug-id: https://jira.whamcloud.com/browse/LU-13735
Lustre-commit: b8ff97611717d ("LU-13735 lnet: Loosen restrictions on LNet Health params")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/39228
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
net/lnet/lnet/api-ni.c | 35 ++++++++++-------------------------
1 file changed, 10 insertions(+), 25 deletions(-)
diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c
index af2089e..f678ae2 100644
--- a/net/lnet/lnet/api-ni.c
+++ b/net/lnet/lnet/api-ni.c
@@ -140,10 +140,8 @@ static int recovery_interval_set(const char *val,
MODULE_PARM_DESC(lnet_drop_asym_route,
"Set to 1 to drop asymmetrical route messages.");
-#define LNET_TRANSACTION_TIMEOUT_NO_HEALTH_DEFAULT 50
-#define LNET_TRANSACTION_TIMEOUT_HEALTH_DEFAULT 50
-
-unsigned int lnet_transaction_timeout = LNET_TRANSACTION_TIMEOUT_HEALTH_DEFAULT;
+#define LNET_TRANSACTION_TIMEOUT_DEFAULT 50
+unsigned int lnet_transaction_timeout = LNET_TRANSACTION_TIMEOUT_DEFAULT;
static int transaction_to_set(const char *val, const struct kernel_param *kp);
static struct kernel_param_ops param_ops_transaction_timeout = {
.set = transaction_to_set,
@@ -156,8 +154,8 @@ static int recovery_interval_set(const char *val,
MODULE_PARM_DESC(lnet_transaction_timeout,
"Maximum number of seconds to wait for a peer response.");
-#define LNET_RETRY_COUNT_HEALTH_DEFAULT 2
-unsigned int lnet_retry_count = LNET_RETRY_COUNT_HEALTH_DEFAULT;
+#define LNET_RETRY_COUNT_DEFAULT 2
+unsigned int lnet_retry_count = LNET_RETRY_COUNT_DEFAULT;
static int retry_count_set(const char *val, const struct kernel_param *kp);
static struct kernel_param_ops param_ops_retry_count = {
.set = retry_count_set,
@@ -184,8 +182,8 @@ static int response_tracking_set(const char *val,
MODULE_PARM_DESC(lnet_response_tracking,
"(0|1|2|3) LNet Internal Only|GET Reply only|PUT ACK only|Full Tracking (default)");
-#define LNET_LND_TIMEOUT_DEFAULT ((LNET_TRANSACTION_TIMEOUT_HEALTH_DEFAULT - 1) / \
- (LNET_RETRY_COUNT_HEALTH_DEFAULT + 1))
+#define LNET_LND_TIMEOUT_DEFAULT ((LNET_TRANSACTION_TIMEOUT_DEFAULT - 1) / \
+ (LNET_RETRY_COUNT_DEFAULT + 1))
unsigned int lnet_lnd_timeout = LNET_LND_TIMEOUT_DEFAULT;
static void lnet_set_lnd_timeout(void)
{
@@ -235,20 +233,7 @@ static int lnet_discover(struct lnet_process_id id, u32 force,
return -EINVAL;
}
- /* if we're turning on health then use the health timeout
- * defaults.
- */
- if (*sensitivity == 0 && value != 0) {
- lnet_transaction_timeout =
- LNET_TRANSACTION_TIMEOUT_HEALTH_DEFAULT;
- lnet_retry_count = LNET_RETRY_COUNT_HEALTH_DEFAULT;
- lnet_set_lnd_timeout();
- /* if we're turning off health then use the no health timeout
- * default.
- */
- } else if (*sensitivity != 0 && value == 0) {
- lnet_transaction_timeout =
- LNET_TRANSACTION_TIMEOUT_NO_HEALTH_DEFAULT;
+ if (*sensitivity != 0 && value == 0 && lnet_retry_count != 0) {
lnet_retry_count = 0;
lnet_set_lnd_timeout();
}
@@ -396,7 +381,7 @@ static int lnet_discover(struct lnet_process_id id, u32 force,
*/
mutex_lock(&the_lnet.ln_api_mutex);
- if (value < lnet_retry_count || value == 0) {
+ if (value <= lnet_retry_count || value == 0) {
mutex_unlock(&the_lnet.ln_api_mutex);
CERROR("Invalid value for lnet_transaction_timeout (%lu). Has to be greater than lnet_retry_count (%u)\n",
value, lnet_retry_count);
@@ -437,9 +422,9 @@ static int lnet_discover(struct lnet_process_id id, u32 force,
*/
mutex_lock(&the_lnet.ln_api_mutex);
- if (lnet_health_sensitivity == 0) {
+ if (lnet_health_sensitivity == 0 && value > 0) {
mutex_unlock(&the_lnet.ln_api_mutex);
- CERROR("Can not set retry_count when health feature is turned off\n");
+ CERROR("Can not set lnet_retry_count when health feature is turned off\n");
return -EINVAL;
}
--
1.8.3.1
next prev parent reply other threads:[~2020-10-06 0:06 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-10-06 0:05 [lustre-devel] [PATCH 00/42] lustre: OpenSFS backport for Oct 4 2020 James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 01/42] lustre: ptlrpc: don't require CONFIG_CRYPTO_CRC32 James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 02/42] lustre: dom: lock cancel to drop pages James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 03/42] lustre: sec: use memchr_inv() to check if page is zero James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 04/42] lustre: mdc: fix lovea for replay James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 05/42] lustre: llite: add test to check client deadlock selinux James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 06/42] lnet: use init_wait(), not init_waitqueue_entry() James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 07/42] lustre: lov: make various lov_object.c function static James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 08/42] lustre: llite: return -ENODATA if no default layout James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 09/42] lnet: libcfs: don't save journal_info in dumplog thread James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 10/42] lustre: ldlm: lru code cleanup James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 11/42] lustre: ldlm: cancel LRU improvement James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 12/42] lnet: Do not set preferred NI for MR peer James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 13/42] lustre: ptlrpc: prefer crc32_le() over CryptoAPI James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 14/42] lnet: call event handlers without res_lock James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 15/42] lnet: Conditionally attach rspt in LNetPut & LNetGet James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 16/42] lustre: llite: reuse same cl_dio_aio for one IO James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 17/42] lustre: llite: move iov iter forward by ourself James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 18/42] lustre: llite: report client stats sumsq James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 19/42] lnet: Support checking for MD leaks James Simmons
2020-10-06 0:05 ` [lustre-devel] [PATCH 20/42] lnet: don't read debugfs lnet stats when shutting down James Simmons
2020-10-06 0:06 ` James Simmons [this message]
2020-10-06 0:06 ` [lustre-devel] [PATCH 22/42] lnet: Fix reference leak in lnet_select_pathway James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 23/42] lustre: llite: prune invalid dentries James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 24/42] lnet: Do not overwrite destination when routing James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 25/42] lustre: lov: don't use inline for operations functions James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 26/42] lustre: osc: don't allow negative grants James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 27/42] lustre: mgc: Use IR for client->MDS/OST connections James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 28/42] lustre: ldlm: don't use a locks without l_ast_data James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 29/42] lustre: lov: discard unused lov_dump_lmm* functions James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 30/42] lustre: lov: guard against class_exp2obd() returning NULL James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 31/42] lustre: clio: don't call aio_complete() in lustre upon errors James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 32/42] lustre: llite: it_lock_bits should be bit-wise tested James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 33/42] lustre: ldlm: control lru_size for extent lock James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 34/42] lustre: ldlm: pool fixes James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 35/42] lustre: ldlm: pool recalc forceful call James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 36/42] lustre: don't take spinlock to read a 'long' James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 37/42] lustre: osc: Do ELC on locks with no OSC object James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 38/42] lnet: deadlock on LNet shutdown James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 39/42] lustre: update version to 2.13.56 James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 40/42] lustre: llite: increase readahead default values James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 41/42] lustre: obdclass: don't initialize obj for zero FID James Simmons
2020-10-06 0:06 ` [lustre-devel] [PATCH 42/42] lustre: obdclass: fixes and improvements for jobid James Simmons
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1601942781-24950-22-git-send-email-jsimmons@infradead.org \
--to=jsimmons@infradead.org \
--cc=lustre-devel@lists.lustre.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).