From: Saeed Mahameed <saeed@kernel.org>
To: "David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Eric Dumazet <edumazet@google.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>,
netdev@vger.kernel.org, Tariq Toukan <tariqt@nvidia.com>,
Gal Pressman <gal@nvidia.com>,
Leon Romanovsky <leonro@nvidia.com>,
mbloch@nvidia.com, Parav Pandit <parav@nvidia.com>,
Adithya Jayachandran <ajayachandra@nvidia.com>
Subject: [PATCH net-next 1/3] devlink: Introduce devlink eswitch state
Date: Wed, 15 Oct 2025 18:36:16 -0700
Message-ID: <20251016013618.2030940-2-saeed@kernel.org>
In-Reply-To: <20251016013618.2030940-1-saeed@kernel.org>
From: Parav Pandit <parav@nvidia.com>
Introduce a new eswitch state (active/inactive) and allow the user to
set it dynamically.

A user can start the eswitch in switchdev mode in either the active or
the inactive state:

Active: Traffic is enabled on this eswitch FDB.
Inactive: Traffic is ignored/dropped on this eswitch FDB.

Either of the following commands starts the eswitch in the active state:

1. devlink dev eswitch set pci/0000:08:00.1 mode switchdev
   (default is active, backward compatible)
2. devlink dev eswitch set pci/0000:08:00.1 mode switchdev state active

To bring up the eswitch in the inactive state:

devlink dev eswitch set pci/0000:08:00.1 mode switchdev state inactive
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Adithya Jayachandran <ajayachandra@nvidia.com>
---
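Note (not part of the commit message): the set-path semantics described
above can be modelled in plain userspace C. The block below is only a
sketch under stated assumptions: `mock_ops`, `mock_state_apply()`,
`mock_state_set()` and `mock_state_usage()` are hypothetical names that
do not exist in the kernel, and the netlink attribute is reduced to a
`has_attr`/`attr_val` pair. It is meant to show that a request without
the state attribute still drives a supporting driver to the active
state, while a request carrying the attribute fails with -EOPNOTSUPP
when the driver has no setter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the uAPI enum added by this patch. */
enum devlink_eswitch_state {
	DEVLINK_ESWITCH_STATE_INACTIVE,
	DEVLINK_ESWITCH_STATE_ACTIVE,
};

/* Hypothetical stand-in for struct devlink_ops; only the setter matters. */
struct mock_ops {
	int (*eswitch_state_set)(void *priv, enum devlink_eswitch_state state);
};

/* Model of the state handling; has_attr and attr_val stand in for
 * info->attrs[DEVLINK_ATTR_ESWITCH_STATE] and its u8 payload.
 */
static int mock_state_apply(const struct mock_ops *ops, void *priv,
			    bool has_attr, unsigned char attr_val)
{
	enum devlink_eswitch_state state = DEVLINK_ESWITCH_STATE_ACTIVE;

	if (has_attr) {
		if (!ops->eswitch_state_set)
			return -EOPNOTSUPP;
		state = (enum devlink_eswitch_state)attr_val;
	}
	/* Applied even without the attribute, so the default is enforced. */
	if (ops->eswitch_state_set)
		return ops->eswitch_state_set(priv, state);
	return 0;
}

/* Hypothetical driver setter: records the requested state in *priv. */
static int mock_state_set(void *priv, enum devlink_eswitch_state state)
{
	*(enum devlink_eswitch_state *)priv = state;
	return 0;
}

/* Usage: walks through the attribute/callback combinations. */
void mock_state_usage(void)
{
	enum devlink_eswitch_state hw = DEVLINK_ESWITCH_STATE_INACTIVE;
	const struct mock_ops with = { .eswitch_state_set = mock_state_set };
	const struct mock_ops without = { .eswitch_state_set = NULL };

	/* No attribute: a supporting driver is set to active. */
	assert(mock_state_apply(&with, &hw, false, 0) == 0);
	assert(hw == DEVLINK_ESWITCH_STATE_ACTIVE);

	/* Explicit inactive request is passed through. */
	assert(mock_state_apply(&with, &hw, true,
				DEVLINK_ESWITCH_STATE_INACTIVE) == 0);
	assert(hw == DEVLINK_ESWITCH_STATE_INACTIVE);

	/* Attribute present but no setter: rejected. */
	assert(mock_state_apply(&without, &hw, true,
				DEVLINK_ESWITCH_STATE_ACTIVE) == -EOPNOTSUPP);

	/* No attribute and no setter: nothing to do. */
	assert(mock_state_apply(&without, &hw, false, 0) == 0);
}
```

One consequence visible in this model: any eswitch-set request that
omits the state attribute (for example, one that only changes
inline-mode) moves a supporting driver back to the active state.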
Documentation/netlink/specs/devlink.yaml | 13 ++++++++
.../devlink/devlink-eswitch-attr.rst | 15 ++++++++++
include/net/devlink.h | 5 ++++
include/uapi/linux/devlink.h | 7 +++++
net/devlink/dev.c | 30 +++++++++++++++++++
net/devlink/netlink_gen.c | 5 ++--
6 files changed, 73 insertions(+), 2 deletions(-)
diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml
index 3db59c965869..4242a3431320 100644
--- a/Documentation/netlink/specs/devlink.yaml
+++ b/Documentation/netlink/specs/devlink.yaml
@@ -119,6 +119,14 @@ definitions:
name: none
-
name: basic
+ -
+ type: enum
+ name: eswitch-state
+ entries:
+ -
+ name: inactive
+ -
+ name: active
-
type: enum
name: dpipe-header-id
@@ -857,6 +865,10 @@ attribute-sets:
name: health-reporter-burst-period
type: u64
doc: Time (in msec) for recoveries before starting the grace period.
+ -
+ name: eswitch-state
+ type: u8
+ enum: eswitch-state
-
name: dl-dev-stats
subset-of: devlink
@@ -1609,6 +1621,7 @@ operations:
- eswitch-mode
- eswitch-inline-mode
- eswitch-encap-mode
+ - eswitch-state
-
name: eswitch-set
diff --git a/Documentation/networking/devlink/devlink-eswitch-attr.rst b/Documentation/networking/devlink/devlink-eswitch-attr.rst
index 08bb39ab1528..13ad1ed300ee 100644
--- a/Documentation/networking/devlink/devlink-eswitch-attr.rst
+++ b/Documentation/networking/devlink/devlink-eswitch-attr.rst
@@ -57,6 +57,18 @@ The following is a list of E-Switch attributes.
* ``none`` Disable encapsulation support.
* ``basic`` Enable encapsulation support.
+ * - ``state``
+ - enum
+ - The state of the E-Switch.
+ When bringing up the E-Switch, the user may want to block traffic
+ towards the FDB until the FDB is fully programmed; the inactive state
+ provides that ability.
+ The state can be one of the following:
+
+ * ``active`` Traffic is enabled on this eswitch FDB (default state).
+ * ``inactive`` Traffic is disabled on this eswitch FDB; no traffic
+ will be forwarded to/from this eswitch FDB.
+
Example Usage
=============
@@ -74,3 +86,6 @@ Example Usage
# enable encap-mode with legacy mode
$ devlink dev eswitch set pci/0000:08:00.0 mode legacy inline-mode none encap-mode basic
+
+ # enable switchdev mode in inactive state
+ $ devlink dev eswitch set pci/0000:08:00.0 mode switchdev state inactive
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 8d4362f010e4..aca56a905ab8 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -1369,6 +1369,11 @@ struct devlink_ops {
int (*eswitch_encap_mode_set)(struct devlink *devlink,
enum devlink_eswitch_encap_mode encap_mode,
struct netlink_ext_ack *extack);
+ int (*eswitch_state_get)(struct devlink *devlink,
+ enum devlink_eswitch_state *state);
+ int (*eswitch_state_set)(struct devlink *devlink,
+ enum devlink_eswitch_state state,
+ struct netlink_ext_ack *extack);
int (*info_get)(struct devlink *devlink, struct devlink_info_req *req,
struct netlink_ext_ack *extack);
/**
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index bcad11a787a5..a01443810658 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -195,6 +195,11 @@ enum devlink_eswitch_encap_mode {
DEVLINK_ESWITCH_ENCAP_MODE_BASIC,
};
+enum devlink_eswitch_state {
+ DEVLINK_ESWITCH_STATE_INACTIVE,
+ DEVLINK_ESWITCH_STATE_ACTIVE,
+};
+
enum devlink_port_flavour {
DEVLINK_PORT_FLAVOUR_PHYSICAL, /* Any kind of a port physically
* facing the user.
@@ -638,6 +643,8 @@ enum devlink_attr {
DEVLINK_ATTR_HEALTH_REPORTER_BURST_PERIOD, /* u64 */
+ DEVLINK_ATTR_ESWITCH_STATE, /* u8 */
+
/* Add new attributes above here, update the spec in
* Documentation/netlink/specs/devlink.yaml and re-generate
* net/devlink/netlink_gen.c.
diff --git a/net/devlink/dev.c b/net/devlink/dev.c
index 02602704bdea..1eea3e2c1ade 100644
--- a/net/devlink/dev.c
+++ b/net/devlink/dev.c
@@ -672,6 +672,17 @@ static int devlink_nl_eswitch_fill(struct sk_buff *msg, struct devlink *devlink,
goto nla_put_failure;
}
+ if (ops->eswitch_state_get) {
+ enum devlink_eswitch_state state;
+
+ err = ops->eswitch_state_get(devlink, &state);
+ if (err)
+ goto nla_put_failure;
+ err = nla_put_u8(msg, DEVLINK_ATTR_ESWITCH_STATE, state);
+ if (err)
+ goto nla_put_failure;
+ }
+
genlmsg_end(msg, hdr);
return 0;
@@ -706,6 +717,7 @@ int devlink_nl_eswitch_set_doit(struct sk_buff *skb, struct genl_info *info)
struct devlink *devlink = info->user_ptr[0];
const struct devlink_ops *ops = devlink->ops;
enum devlink_eswitch_encap_mode encap_mode;
+ enum devlink_eswitch_state state;
u8 inline_mode;
int err = 0;
u16 mode;
@@ -722,6 +734,24 @@ int devlink_nl_eswitch_set_doit(struct sk_buff *skb, struct genl_info *info)
return err;
}
+ state = DEVLINK_ESWITCH_STATE_ACTIVE;
+ if (info->attrs[DEVLINK_ATTR_ESWITCH_STATE]) {
+ if (!ops->eswitch_state_set)
+ return -EOPNOTSUPP;
+ state = nla_get_u8(info->attrs[DEVLINK_ATTR_ESWITCH_STATE]);
+ }
+ /* If the user did not supply the state attribute, default to the
+ * active state and still apply it, so that drivers that support
+ * eswitch state always end up in a well-defined state.
+ * Keep this after mode-set as state handling can be dependent on
+ * the eswitch mode.
+ */
+ if (ops->eswitch_state_set) {
+ err = ops->eswitch_state_set(devlink, state, info->extack);
+ if (err)
+ return err;
+ }
+
if (info->attrs[DEVLINK_ATTR_ESWITCH_INLINE_MODE]) {
if (!ops->eswitch_inline_mode_set)
return -EOPNOTSUPP;
diff --git a/net/devlink/netlink_gen.c b/net/devlink/netlink_gen.c
index 9fd00977d59e..e0910fb2214d 100644
--- a/net/devlink/netlink_gen.c
+++ b/net/devlink/netlink_gen.c
@@ -226,12 +226,13 @@ static const struct nla_policy devlink_eswitch_get_nl_policy[DEVLINK_ATTR_DEV_NA
};
/* DEVLINK_CMD_ESWITCH_SET - do */
-static const struct nla_policy devlink_eswitch_set_nl_policy[DEVLINK_ATTR_ESWITCH_ENCAP_MODE + 1] = {
+static const struct nla_policy devlink_eswitch_set_nl_policy[DEVLINK_ATTR_ESWITCH_STATE + 1] = {
[DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, },
[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, },
[DEVLINK_ATTR_ESWITCH_MODE] = NLA_POLICY_MAX(NLA_U16, 1),
[DEVLINK_ATTR_ESWITCH_INLINE_MODE] = NLA_POLICY_MAX(NLA_U8, 3),
[DEVLINK_ATTR_ESWITCH_ENCAP_MODE] = NLA_POLICY_MAX(NLA_U8, 1),
+ [DEVLINK_ATTR_ESWITCH_STATE] = NLA_POLICY_MAX(NLA_U8, 1),
};
/* DEVLINK_CMD_DPIPE_TABLE_GET - do */
@@ -822,7 +823,7 @@ const struct genl_split_ops devlink_nl_ops[74] = {
.doit = devlink_nl_eswitch_set_doit,
.post_doit = devlink_nl_post_doit,
.policy = devlink_eswitch_set_nl_policy,
- .maxattr = DEVLINK_ATTR_ESWITCH_ENCAP_MODE,
+ .maxattr = DEVLINK_ATTR_ESWITCH_STATE,
.flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
},
{
--
2.51.0
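
For driver authors, the two new callbacks might be wired up roughly as
follows. This is a userspace sketch, not code from this series:
`struct demo_esw`, `demo_eswitch_state_set()`, `demo_eswitch_state_get()`
and `demo_eswitch_usage()` are hypothetical, the callbacks take the
private context directly instead of a `struct devlink *`, and the rule
that the inactive state is only accepted in switchdev mode is an
assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Mirrors the uAPI enum added by this patch. */
enum devlink_eswitch_state {
	DEVLINK_ESWITCH_STATE_INACTIVE,
	DEVLINK_ESWITCH_STATE_ACTIVE,
};

/* Hypothetical driver-private eswitch context. */
struct demo_esw {
	bool switchdev;				/* true once in switchdev mode */
	enum devlink_eswitch_state state;	/* last state applied */
	bool fdb_traffic_enabled;		/* stand-in for a hardware gate */
};

/* Sketch of an eswitch_state_set implementation. */
static int demo_eswitch_state_set(struct demo_esw *esw,
				  enum devlink_eswitch_state state)
{
	if (state != DEVLINK_ESWITCH_STATE_INACTIVE &&
	    state != DEVLINK_ESWITCH_STATE_ACTIVE)
		return -EINVAL;

	/* Assumption: inactive is only meaningful in switchdev mode. */
	if (!esw->switchdev && state == DEVLINK_ESWITCH_STATE_INACTIVE)
		return -EOPNOTSUPP;

	esw->fdb_traffic_enabled = (state == DEVLINK_ESWITCH_STATE_ACTIVE);
	esw->state = state;
	return 0;
}

/* Sketch of an eswitch_state_get implementation. */
static int demo_eswitch_state_get(const struct demo_esw *esw,
				  enum devlink_eswitch_state *state)
{
	*state = esw->state;
	return 0;
}

/* Usage: bring the eswitch up inactive, program it, then activate. */
void demo_eswitch_usage(void)
{
	struct demo_esw esw = {
		.switchdev = true,
		.state = DEVLINK_ESWITCH_STATE_ACTIVE,
		.fdb_traffic_enabled = true,
	};
	enum devlink_eswitch_state cur;

	assert(demo_eswitch_state_set(&esw, DEVLINK_ESWITCH_STATE_INACTIVE) == 0);
	assert(!esw.fdb_traffic_enabled);

	/* ... FDB rules would be programmed here ... */

	assert(demo_eswitch_state_set(&esw, DEVLINK_ESWITCH_STATE_ACTIVE) == 0);
	assert(demo_eswitch_state_get(&esw, &cur) == 0);
	assert(cur == DEVLINK_ESWITCH_STATE_ACTIVE);
}
```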