From: Saeed Mahameed <saeed@kernel.org>
To: "David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Eric Dumazet <edumazet@google.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>,
	netdev@vger.kernel.org, Tariq Toukan <tariqt@nvidia.com>,
	Gal Pressman <gal@nvidia.com>,
	Leon Romanovsky <leonro@nvidia.com>,
	mbloch@nvidia.com, Parav Pandit <parav@nvidia.com>,
	Adithya Jayachandran <ajayachandra@nvidia.com>
Subject: [PATCH net-next 1/3] devlink: Introduce devlink eswitch state
Date: Wed, 15 Oct 2025 18:36:16 -0700
Message-ID: <20251016013618.2030940-2-saeed@kernel.org>
In-Reply-To: <20251016013618.2030940-1-saeed@kernel.org>

From: Parav Pandit <parav@nvidia.com>

Introduce a new eswitch state (active/inactive) and
allow the user to set it dynamically.

A user can start the eswitch in switchdev mode in either active or
inactive state.

Active: Traffic is enabled on this eswitch FDB.
Inactive: Traffic is ignored/dropped on this eswitch FDB.

Examples of starting the eswitch in the active state:
  1. devlink dev eswitch set pci/0000:08:00.1 mode switchdev
     (default is active, backward compatible)

  2. devlink dev eswitch set pci/0000:08:00.1 mode switchdev state active

To bring up the eswitch in the 'inactive' state:

   devlink dev eswitch set pci/0000:08:00.1 mode switchdev state inactive

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Adithya Jayachandran <ajayachandra@nvidia.com>
---
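As an illustration of the CLI keywords used in the commit message, here is a
minimal userspace sketch that maps "active"/"inactive" to the numeric values
of enum devlink_eswitch_state as defined in this patch (inactive = 0,
active = 1). The type and helper names (eswitch_state_kw,
eswitch_state_kw_parse, eswitch_state_kw_name) are hypothetical and are not
taken from iproute2 or from the kernel.

```c
#include <stddef.h>
#include <string.h>

/* Values mirror enum devlink_eswitch_state as added by this patch. */
enum eswitch_state_kw {
	ESWITCH_STATE_KW_INACTIVE = 0,
	ESWITCH_STATE_KW_ACTIVE = 1,
};

/* Hypothetical helper: parse a CLI keyword.
 * Returns 0 on success, -1 if the keyword is not recognized.
 */
static int eswitch_state_kw_parse(const char *arg,
				  enum eswitch_state_kw *state)
{
	if (strcmp(arg, "active") == 0) {
		*state = ESWITCH_STATE_KW_ACTIVE;
		return 0;
	}
	if (strcmp(arg, "inactive") == 0) {
		*state = ESWITCH_STATE_KW_INACTIVE;
		return 0;
	}
	return -1;
}

/* Hypothetical helper: keyword for a state value, NULL if out of range. */
static const char *eswitch_state_kw_name(int state)
{
	switch (state) {
	case ESWITCH_STATE_KW_INACTIVE:
		return "inactive";
	case ESWITCH_STATE_KW_ACTIVE:
		return "active";
	default:
		return NULL;
	}
}
```

Since the set policy in this patch caps the attribute at 1, any other value
would be rejected before reaching a driver; the NULL return above models that
out-of-range case.
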
 Documentation/netlink/specs/devlink.yaml      | 13 ++++++++
 .../devlink/devlink-eswitch-attr.rst          | 15 ++++++++++
 include/net/devlink.h                         |  5 ++++
 include/uapi/linux/devlink.h                  |  7 +++++
 net/devlink/dev.c                             | 30 +++++++++++++++++++
 net/devlink/netlink_gen.c                     |  5 ++--
 6 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml
index 3db59c965869..4242a3431320 100644
--- a/Documentation/netlink/specs/devlink.yaml
+++ b/Documentation/netlink/specs/devlink.yaml
@@ -119,6 +119,14 @@ definitions:
         name: none
       -
         name: basic
+  -
+    type: enum
+    name: eswitch-state
+    entries:
+      -
+        name: inactive
+      -
+        name: active
   -
     type: enum
     name: dpipe-header-id
@@ -857,6 +865,10 @@ attribute-sets:
         name: health-reporter-burst-period
         type: u64
         doc: Time (in msec) for recoveries before starting the grace period.
+      -
+        name: eswitch-state
+        type: u8
+        enum: eswitch-state
   -
     name: dl-dev-stats
     subset-of: devlink
@@ -1609,6 +1621,7 @@ operations:
             - eswitch-mode
             - eswitch-inline-mode
             - eswitch-encap-mode
+            - eswitch-state
 
     -
       name: eswitch-set
diff --git a/Documentation/networking/devlink/devlink-eswitch-attr.rst b/Documentation/networking/devlink/devlink-eswitch-attr.rst
index 08bb39ab1528..13ad1ed300ee 100644
--- a/Documentation/networking/devlink/devlink-eswitch-attr.rst
+++ b/Documentation/networking/devlink/devlink-eswitch-attr.rst
@@ -57,6 +57,18 @@ The following is a list of E-Switch attributes.
        * ``none`` Disable encapsulation support.
        * ``basic`` Enable encapsulation support.
 
+   * - ``state``
+     - enum
+     - The state of the E-Switch.
+       When bringing up the E-Switch, the user may want to block traffic
+       towards the FDB until the FDB is fully programmed; the ``inactive``
+       state provides this.
+       The state can be one of the following:
+
+       * ``active`` Traffic is enabled on this eswitch FDB - default mode
+       * ``inactive`` Traffic is disabled on this eswitch FDB - no traffic
+         will be forwarded to/from this eswitch FDB
+
 Example Usage
 =============
 
@@ -74,3 +86,6 @@ Example Usage
 
     # enable encap-mode with legacy mode
     $ devlink dev eswitch set pci/0000:08:00.0 mode legacy inline-mode none encap-mode basic
+
+    # enable switchdev mode in inactive state
+    $ devlink dev eswitch set pci/0000:08:00.0 mode switchdev state inactive
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 8d4362f010e4..aca56a905ab8 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -1369,6 +1369,11 @@ struct devlink_ops {
 	int (*eswitch_encap_mode_set)(struct devlink *devlink,
 				      enum devlink_eswitch_encap_mode encap_mode,
 				      struct netlink_ext_ack *extack);
+	int (*eswitch_state_get)(struct devlink *devlink,
+				 enum devlink_eswitch_state *state);
+	int (*eswitch_state_set)(struct devlink *devlink,
+				 enum devlink_eswitch_state state,
+				 struct netlink_ext_ack *extack);
 	int (*info_get)(struct devlink *devlink, struct devlink_info_req *req,
 			struct netlink_ext_ack *extack);
 	/**
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index bcad11a787a5..a01443810658 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -195,6 +195,11 @@ enum devlink_eswitch_encap_mode {
 	DEVLINK_ESWITCH_ENCAP_MODE_BASIC,
 };
 
+enum devlink_eswitch_state {
+	DEVLINK_ESWITCH_STATE_INACTIVE,
+	DEVLINK_ESWITCH_STATE_ACTIVE,
+};
+
 enum devlink_port_flavour {
 	DEVLINK_PORT_FLAVOUR_PHYSICAL, /* Any kind of a port physically
 					* facing the user.
@@ -638,6 +643,8 @@ enum devlink_attr {
 
 	DEVLINK_ATTR_HEALTH_REPORTER_BURST_PERIOD,	/* u64 */
 
+	DEVLINK_ATTR_ESWITCH_STATE,	/* u8 */
+
 	/* Add new attributes above here, update the spec in
 	 * Documentation/netlink/specs/devlink.yaml and re-generate
 	 * net/devlink/netlink_gen.c.
diff --git a/net/devlink/dev.c b/net/devlink/dev.c
index 02602704bdea..1eea3e2c1ade 100644
--- a/net/devlink/dev.c
+++ b/net/devlink/dev.c
@@ -672,6 +672,17 @@ static int devlink_nl_eswitch_fill(struct sk_buff *msg, struct devlink *devlink,
 			goto nla_put_failure;
 	}
 
+	if (ops->eswitch_state_get) {
+		enum devlink_eswitch_state state;
+
+		err = ops->eswitch_state_get(devlink, &state);
+		if (err)
+			goto nla_put_failure;
+		err = nla_put_u8(msg, DEVLINK_ATTR_ESWITCH_STATE, state);
+		if (err)
+			goto nla_put_failure;
+	}
+
 	genlmsg_end(msg, hdr);
 	return 0;
 
@@ -706,6 +717,7 @@ int devlink_nl_eswitch_set_doit(struct sk_buff *skb, struct genl_info *info)
 	struct devlink *devlink = info->user_ptr[0];
 	const struct devlink_ops *ops = devlink->ops;
 	enum devlink_eswitch_encap_mode encap_mode;
+	enum devlink_eswitch_state state;
 	u8 inline_mode;
 	int err = 0;
 	u16 mode;
@@ -722,6 +734,24 @@ int devlink_nl_eswitch_set_doit(struct sk_buff *skb, struct genl_info *info)
 			return err;
 	}
 
+	state = DEVLINK_ESWITCH_STATE_ACTIVE;
+	if (info->attrs[DEVLINK_ATTR_ESWITCH_STATE]) {
+		if (!ops->eswitch_state_set)
+			return -EOPNOTSUPP;
+		state = nla_get_u8(info->attrs[DEVLINK_ATTR_ESWITCH_STATE]);
+	}
+	/* If user did not supply the state attribute, the default is
+	 * active state. If the state was not explicitly set, set the default
+	 * state for drivers that support eswitch state.
+	 * Keep this after mode-set as state handling can be dependent on
+	 * the eswitch mode.
+	 */
+	if (ops->eswitch_state_set) {
+		err = ops->eswitch_state_set(devlink, state, info->extack);
+		if (err)
+			return err;
+	}
+
 	if (info->attrs[DEVLINK_ATTR_ESWITCH_INLINE_MODE]) {
 		if (!ops->eswitch_inline_mode_set)
 			return -EOPNOTSUPP;
diff --git a/net/devlink/netlink_gen.c b/net/devlink/netlink_gen.c
index 9fd00977d59e..e0910fb2214d 100644
--- a/net/devlink/netlink_gen.c
+++ b/net/devlink/netlink_gen.c
@@ -226,12 +226,13 @@ static const struct nla_policy devlink_eswitch_get_nl_policy[DEVLINK_ATTR_DEV_NA
 };
 
 /* DEVLINK_CMD_ESWITCH_SET - do */
-static const struct nla_policy devlink_eswitch_set_nl_policy[DEVLINK_ATTR_ESWITCH_ENCAP_MODE + 1] = {
+static const struct nla_policy devlink_eswitch_set_nl_policy[DEVLINK_ATTR_ESWITCH_STATE + 1] = {
 	[DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, },
 	[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, },
 	[DEVLINK_ATTR_ESWITCH_MODE] = NLA_POLICY_MAX(NLA_U16, 1),
 	[DEVLINK_ATTR_ESWITCH_INLINE_MODE] = NLA_POLICY_MAX(NLA_U8, 3),
 	[DEVLINK_ATTR_ESWITCH_ENCAP_MODE] = NLA_POLICY_MAX(NLA_U8, 1),
+	[DEVLINK_ATTR_ESWITCH_STATE] = NLA_POLICY_MAX(NLA_U8, 1),
 };
 
 /* DEVLINK_CMD_DPIPE_TABLE_GET - do */
@@ -822,7 +823,7 @@ const struct genl_split_ops devlink_nl_ops[74] = {
 		.doit		= devlink_nl_eswitch_set_doit,
 		.post_doit	= devlink_nl_post_doit,
 		.policy		= devlink_eswitch_set_nl_policy,
-		.maxattr	= DEVLINK_ATTR_ESWITCH_ENCAP_MODE,
+		.maxattr	= DEVLINK_ATTR_ESWITCH_STATE,
 		.flags		= GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
 	},
 	{
-- 
2.51.0
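
The state handling added to devlink_nl_eswitch_set_doit() can be modelled in
plain userspace C to make the defaulting rule easier to follow. This is only a
sketch of the control flow in the hunk above, not kernel code: struct
model_ops, model_eswitch_state_apply() and the recording callback are
hypothetical stand-ins for struct devlink_ops, the netlink attribute lookup
and a driver's eswitch_state_set implementation.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Values mirror enum devlink_eswitch_state from this patch. */
enum model_state {
	MODEL_STATE_INACTIVE = 0,
	MODEL_STATE_ACTIVE = 1,
};

/* Hypothetical stand-in for the eswitch_state_set member of devlink_ops. */
struct model_ops {
	int (*state_set)(enum model_state state);
};

/* has_attr models "info->attrs[DEVLINK_ATTR_ESWITCH_STATE] is present";
 * attr_val models the value read with nla_get_u8().
 */
static int model_eswitch_state_apply(const struct model_ops *ops,
				     bool has_attr, enum model_state attr_val)
{
	enum model_state state = MODEL_STATE_ACTIVE; /* default when absent */

	if (has_attr) {
		if (!ops->state_set)
			return -EOPNOTSUPP; /* explicit request, no driver support */
		state = attr_val;
	}
	/* Drivers with the callback get it invoked on every set request. */
	if (ops->state_set)
		return ops->state_set(state);
	return 0;
}

/* Hypothetical driver callback that records what it was asked to do. */
static enum model_state model_last_state;
static int model_set_calls;

static int model_state_set(enum model_state state)
{
	model_last_state = state;
	model_set_calls++;
	return 0;
}
```

One consequence visible in this model: for a driver that implements the
callback, a set request that omits the state attribute still results in a call
with the active state.
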


Thread overview: 11+ messages
2025-10-16  1:36 [PATCH net-next 0/3] devlink eswitch active/inactive state Saeed Mahameed
2025-10-16  1:36 ` Saeed Mahameed [this message]
2025-10-16  9:16   ` [PATCH net-next 1/3] devlink: Introduce devlink eswitch state Jiri Pirko
2025-10-16 17:34     ` Saeed Mahameed
2025-10-17  8:06       ` Jiri Pirko
2025-10-16  1:36 ` [PATCH net-next 2/3] net/mlx5: MPFS, add support for dynamic enable/disable Saeed Mahameed
2025-10-16 19:28   ` kernel test robot
2025-10-16 21:35   ` kernel test robot
2025-10-18  7:42   ` kernel test robot
2025-10-16  1:36 ` [PATCH net-next 3/3] net/mlx5: E-Switch, support eswitch state Saeed Mahameed
2025-10-16 14:54   ` Jakub Kicinski
