From: Tariq Toukan <tariqt@nvidia.com>
To: Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Jiri Pirko <jiri@resnulli.us>, Jiri Pirko <jiri@nvidia.com>
Cc: Saeed Mahameed <saeed@kernel.org>, Gal Pressman <gal@nvidia.com>,
"Leon Romanovsky" <leon@kernel.org>,
Shahar Shitrit <shshitrit@nvidia.com>,
"Donald Hunter" <donald.hunter@gmail.com>,
Jonathan Corbet <corbet@lwn.net>,
"Brett Creeley" <brett.creeley@amd.com>,
Michael Chan <michael.chan@broadcom.com>,
Pavan Chebbi <pavan.chebbi@broadcom.com>,
Cai Huoqing <cai.huoqing@linux.dev>,
Tony Nguyen <anthony.l.nguyen@intel.com>,
"Przemek Kitszel" <przemyslaw.kitszel@intel.com>,
Sunil Goutham <sgoutham@marvell.com>,
Linu Cherian <lcherian@marvell.com>,
Geetha sowjanya <gakula@marvell.com>,
Jerin Jacob <jerinj@marvell.com>, hariprasad <hkelam@marvell.com>,
"Subbaraya Sundeep" <sbhatta@marvell.com>,
Saeed Mahameed <saeedm@nvidia.com>,
"Tariq Toukan" <tariqt@nvidia.com>,
Mark Bloch <mbloch@nvidia.com>, Ido Schimmel <idosch@nvidia.com>,
Petr Machata <petrm@nvidia.com>,
Manish Chopra <manishc@marvell.com>, <netdev@vger.kernel.org>,
<linux-kernel@vger.kernel.org>, <linux-doc@vger.kernel.org>,
<intel-wired-lan@lists.osuosl.org>, <linux-rdma@vger.kernel.org>
Subject: [PATCH net-next 4/5] devlink: Make health reporter grace period delay configurable
Date: Thu, 17 Jul 2025 19:07:21 +0300 [thread overview]
Message-ID: <1752768442-264413-5-git-send-email-tariqt@nvidia.com> (raw)
In-Reply-To: <1752768442-264413-1-git-send-email-tariqt@nvidia.com>
From: Shahar Shitrit <shshitrit@nvidia.com>
Enable configuration of the grace period delay — a time window
starting from the first error recovery, during which the reporter
allows recovery attempts for each reported error.
This feature is helpful when a single underlying issue causes
multiple errors, as it delays the start of the grace period
to allow sufficient time for recovering all related errors.
For example, if multiple TX queues time out simultaneously,
a sufficient grace period delay could allow all affected TX
queues to be recovered within that window. Without this delay,
only the first TX queue that reports a timeout will undergo
recovery, while the remaining TX queues will be blocked once
the grace period begins.
Configuration example:
$ devlink health set pci/0000:00:09.0 reporter tx grace_period_delay 500
Configuration example with ynl:
./tools/net/ynl/pyynl/cli.py \
--spec Documentation/netlink/specs/devlink.yaml \
--do health-reporter-set --json '{
"bus-name": "auxiliary",
"dev-name": "mlx5_core.eth.0",
"port-index": 65535,
"health-reporter-name": "tx",
"health-reporter-graceful-period-delay": 500
}'
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
Documentation/netlink/specs/devlink.yaml | 7 +++++
.../networking/devlink/devlink-health.rst | 2 +-
include/uapi/linux/devlink.h | 2 ++
net/devlink/health.c | 30 +++++++++++++++++--
net/devlink/netlink_gen.c | 5 ++--
5 files changed, 40 insertions(+), 6 deletions(-)
diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml
index 1c4bb0cbe5f0..c6cc0ce18685 100644
--- a/Documentation/netlink/specs/devlink.yaml
+++ b/Documentation/netlink/specs/devlink.yaml
@@ -848,6 +848,10 @@ attribute-sets:
-
name: region-direct
type: flag
+ -
+ name: health-reporter-graceful-period-delay
+ type: u64
+
-
name: rate-tc-bws
type: nest
@@ -1228,6 +1232,8 @@ attribute-sets:
name: health-reporter-dump-ts-ns
-
name: health-reporter-auto-dump
+ -
+ name: health-reporter-graceful-period-delay
-
name: dl-attr-stats
@@ -1965,6 +1971,7 @@ operations:
- health-reporter-graceful-period
- health-reporter-auto-recover
- health-reporter-auto-dump
+ - health-reporter-graceful-period-delay
-
name: health-reporter-recover
diff --git a/Documentation/networking/devlink/devlink-health.rst b/Documentation/networking/devlink/devlink-health.rst
index e0b8cfed610a..07602f678282 100644
--- a/Documentation/networking/devlink/devlink-health.rst
+++ b/Documentation/networking/devlink/devlink-health.rst
@@ -50,7 +50,7 @@ Once an error is reported, devlink health will perform the following actions:
* Auto recovery attempt is being done. Depends on:
- Auto-recovery configuration
- - Grace period vs. time passed since last recover
+ - Grace period (and grace period delay) vs. time passed since last recover
Devlink formatted message
=========================
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index e72bcc239afd..42a11b7e4a70 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -634,6 +634,8 @@ enum devlink_attr {
DEVLINK_ATTR_REGION_DIRECT, /* flag */
+ DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD_DELAY, /* u64 */
+
DEVLINK_ATTR_RATE_TC_BWS, /* nested */
DEVLINK_ATTR_RATE_TC_INDEX, /* u8 */
DEVLINK_ATTR_RATE_TC_BW, /* u32 */
diff --git a/net/devlink/health.c b/net/devlink/health.c
index a0269975f592..5699779fce77 100644
--- a/net/devlink/health.c
+++ b/net/devlink/health.c
@@ -113,7 +113,9 @@ __devlink_health_reporter_create(struct devlink *devlink,
{
struct devlink_health_reporter *reporter;
- if (WARN_ON(ops->default_graceful_period && !ops->recover))
+ if (WARN_ON(ops->default_graceful_period_delay &&
+ !ops->default_graceful_period) ||
+ WARN_ON(ops->default_graceful_period && !ops->recover))
return ERR_PTR(-EINVAL);
reporter = kzalloc(sizeof(*reporter), GFP_KERNEL);
@@ -293,6 +295,11 @@ devlink_nl_health_reporter_fill(struct sk_buff *msg,
devlink_nl_put_u64(msg, DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD,
reporter->graceful_period))
goto reporter_nest_cancel;
+ if (reporter->ops->recover &&
+ devlink_nl_put_u64(msg,
+ DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD_DELAY,
+ reporter->graceful_period_delay))
+ goto reporter_nest_cancel;
if (reporter->ops->recover &&
nla_put_u8(msg, DEVLINK_ATTR_HEALTH_REPORTER_AUTO_RECOVER,
reporter->auto_recover))
@@ -458,16 +465,33 @@ int devlink_nl_health_reporter_set_doit(struct sk_buff *skb,
if (!reporter->ops->recover &&
(info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD] ||
- info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_RECOVER]))
+ info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_RECOVER] ||
+ info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD_DELAY]))
return -EOPNOTSUPP;
if (!reporter->ops->dump &&
info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_DUMP])
return -EOPNOTSUPP;
- if (info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD])
+ if (info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD]) {
reporter->graceful_period =
nla_get_u64(info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD]);
+ if (!reporter->graceful_period)
+ reporter->graceful_period_delay = 0;
+ }
+
+ if (info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD_DELAY]) {
+ u64 configured_delay =
+ nla_get_u64(info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD_DELAY]);
+
+ if (!reporter->graceful_period && configured_delay) {
+ NL_SET_ERR_MSG_MOD(info->extack,
+ "Cannot set grace period delay without a grace period.");
+ return -EINVAL;
+ }
+
+ reporter->graceful_period_delay = configured_delay;
+ }
if (info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_RECOVER])
reporter->auto_recover =
diff --git a/net/devlink/netlink_gen.c b/net/devlink/netlink_gen.c
index c50436433c18..b0f38253d163 100644
--- a/net/devlink/netlink_gen.c
+++ b/net/devlink/netlink_gen.c
@@ -389,7 +389,7 @@ static const struct nla_policy devlink_health_reporter_get_dump_nl_policy[DEVLIN
};
/* DEVLINK_CMD_HEALTH_REPORTER_SET - do */
-static const struct nla_policy devlink_health_reporter_set_nl_policy[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_DUMP + 1] = {
+static const struct nla_policy devlink_health_reporter_set_nl_policy[DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD_DELAY + 1] = {
[DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, },
[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, },
[DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, },
@@ -397,6 +397,7 @@ static const struct nla_policy devlink_health_reporter_set_nl_policy[DEVLINK_ATT
[DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD] = { .type = NLA_U64, },
[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_RECOVER] = { .type = NLA_U8, },
[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_DUMP] = { .type = NLA_U8, },
+ [DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD_DELAY] = { .type = NLA_U64, },
};
/* DEVLINK_CMD_HEALTH_REPORTER_RECOVER - do */
@@ -1032,7 +1033,7 @@ const struct genl_split_ops devlink_nl_ops[74] = {
.doit = devlink_nl_health_reporter_set_doit,
.post_doit = devlink_nl_post_doit,
.policy = devlink_health_reporter_set_nl_policy,
- .maxattr = DEVLINK_ATTR_HEALTH_REPORTER_AUTO_DUMP,
+ .maxattr = DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD_DELAY,
.flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
},
{
--
2.31.1
next prev parent reply other threads:[~2025-07-17 16:08 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-17 16:07 [PATCH net-next 0/5] Expose grace period delay for devlink health reporter Tariq Toukan
2025-07-17 16:07 ` [PATCH net-next 1/5] devlink: Move graceful period parameter to reporter ops Tariq Toukan
2025-07-17 16:07 ` [PATCH net-next 2/5] devlink: Move health reporter recovery abort logic to a separate function Tariq Toukan
2025-07-17 16:07 ` [PATCH net-next 3/5] devlink: Introduce grace period delay for health reporter Tariq Toukan
2025-07-17 16:07 ` Tariq Toukan [this message]
2025-07-19 0:48 ` [PATCH net-next 4/5] devlink: Make health reporter grace period delay configurable Jakub Kicinski
2025-07-20 10:11 ` Tariq Toukan
2025-07-19 0:51 ` Jakub Kicinski
2025-07-20 10:47 ` Tariq Toukan
2025-07-17 16:07 ` [PATCH net-next 5/5] net/mlx5e: Set default grace period delay for TX and RX reporters Tariq Toukan
2025-07-19 0:47 ` [PATCH net-next 0/5] Expose grace period delay for devlink health reporter Jakub Kicinski
2025-07-24 10:46 ` Tariq Toukan
2025-07-25 0:10 ` Jakub Kicinski
2025-07-27 11:00 ` Tariq Toukan
2025-07-28 15:17 ` Jakub Kicinski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1752768442-264413-5-git-send-email-tariqt@nvidia.com \
--to=tariqt@nvidia.com \
--cc=andrew+netdev@lunn.ch \
--cc=anthony.l.nguyen@intel.com \
--cc=brett.creeley@amd.com \
--cc=cai.huoqing@linux.dev \
--cc=corbet@lwn.net \
--cc=davem@davemloft.net \
--cc=donald.hunter@gmail.com \
--cc=edumazet@google.com \
--cc=gakula@marvell.com \
--cc=gal@nvidia.com \
--cc=hkelam@marvell.com \
--cc=idosch@nvidia.com \
--cc=intel-wired-lan@lists.osuosl.org \
--cc=jerinj@marvell.com \
--cc=jiri@nvidia.com \
--cc=jiri@resnulli.us \
--cc=kuba@kernel.org \
--cc=lcherian@marvell.com \
--cc=leon@kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=manishc@marvell.com \
--cc=mbloch@nvidia.com \
--cc=michael.chan@broadcom.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=pavan.chebbi@broadcom.com \
--cc=petrm@nvidia.com \
--cc=przemyslaw.kitszel@intel.com \
--cc=saeed@kernel.org \
--cc=saeedm@nvidia.com \
--cc=sbhatta@marvell.com \
--cc=sgoutham@marvell.com \
--cc=shshitrit@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).