From: Saeed Mahameed <saeed@kernel.org>
To: "David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Eric Dumazet <edumazet@google.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>,
	netdev@vger.kernel.org, Tariq Toukan <tariqt@nvidia.com>,
	Gal Pressman <gal@nvidia.com>,
	Leon Romanovsky <leonro@nvidia.com>,
	mbloch@nvidia.com, Parav Pandit <parav@nvidia.com>,
	Adithya Jayachandran <ajayachandra@nvidia.com>
Subject: [PATCH net-next 1/3] devlink: Introduce devlink eswitch state
Date: Wed, 15 Oct 2025 18:36:16 -0700
Message-ID: <20251016013618.2030940-2-saeed@kernel.org>
In-Reply-To: <20251016013618.2030940-1-saeed@kernel.org>

From: Parav Pandit <parav@nvidia.com>

Introduce a new eswitch state (active/inactive) and allow the user to
set it dynamically.

A user can start the eswitch in switchdev mode in either active or
inactive state.

Active: Traffic is enabled on this eswitch FDB.
Inactive: Traffic is ignored/dropped on this eswitch FDB.

The eswitch can be started in active state in either of the following ways:
  1. devlink dev eswitch set pci/0000:08:00.1 mode switchdev
     (default is active, backward compatible)

  2. devlink dev eswitch set pci/0000:08:00.1 mode switchdev state active

To bring up the eswitch in inactive state:

   devlink dev eswitch set pci/0000:08:00.1 mode switchdev state inactive

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Adithya Jayachandran <ajayachandra@nvidia.com>
---
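Note for reviewers: the state handling added to
devlink_nl_eswitch_set_doit() can be modeled in userspace as below. This
is only a sketch under stated assumptions, not the kernel code: mock_ops,
mock_esw, mock_state_set() and mock_eswitch_set_state() are hypothetical
names, the callback signature is simplified (no extack), and a NULL
state_attr stands in for a missing DEVLINK_ATTR_ESWITCH_STATE attribute.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Mirrors the uapi enum added by this patch. */
enum devlink_eswitch_state {
	DEVLINK_ESWITCH_STATE_INACTIVE,
	DEVLINK_ESWITCH_STATE_ACTIVE,
};

/* Hypothetical, trimmed-down stand-in for struct devlink_ops. */
struct mock_ops {
	int (*eswitch_state_set)(void *priv, enum devlink_eswitch_state state);
};

/* Hypothetical driver private data: remembers the last state set. */
struct mock_esw {
	enum devlink_eswitch_state state;
	int set_calls;
};

/* Hypothetical driver callback: record the state and count the call. */
static int mock_state_set(void *priv, enum devlink_eswitch_state state)
{
	struct mock_esw *esw = priv;

	esw->state = state;
	esw->set_calls++;
	return 0;
}

/*
 * Models the state handling in devlink_nl_eswitch_set_doit().
 * state_attr is NULL when the user did not supply the attribute.
 */
static int mock_eswitch_set_state(const struct mock_ops *ops, void *priv,
				  const unsigned char *state_attr)
{
	enum devlink_eswitch_state state = DEVLINK_ESWITCH_STATE_ACTIVE;

	assert(ops != NULL);

	if (state_attr) {
		/* An explicit request needs driver support. */
		if (!ops->eswitch_state_set)
			return -EOPNOTSUPP;
		state = (enum devlink_eswitch_state)*state_attr;
	}
	/* Drivers with the callback always get a state, active by default. */
	if (ops->eswitch_state_set)
		return ops->eswitch_state_set(priv, state);
	return 0;
}
```

In this model, a request without the attribute on a driver that lacks the
callback succeeds as a no-op, which is what keeps existing users of
"mode switchdev" working, while an explicit state request on such a
driver fails with -EOPNOTSUPP.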
 Documentation/netlink/specs/devlink.yaml      | 13 ++++++++
 .../devlink/devlink-eswitch-attr.rst          | 15 ++++++++++
 include/net/devlink.h                         |  5 ++++
 include/uapi/linux/devlink.h                  |  7 +++++
 net/devlink/dev.c                             | 30 +++++++++++++++++++
 net/devlink/netlink_gen.c                     |  5 ++--
 6 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml
index 3db59c965869..4242a3431320 100644
--- a/Documentation/netlink/specs/devlink.yaml
+++ b/Documentation/netlink/specs/devlink.yaml
@@ -119,6 +119,14 @@ definitions:
         name: none
       -
         name: basic
+  -
+    type: enum
+    name: eswitch-state
+    entries:
+      -
+        name: inactive
+      -
+        name: active
   -
     type: enum
     name: dpipe-header-id
@@ -857,6 +865,10 @@ attribute-sets:
         name: health-reporter-burst-period
         type: u64
         doc: Time (in msec) for recoveries before starting the grace period.
+      -
+        name: eswitch-state
+        type: u8
+        enum: eswitch-state
   -
     name: dl-dev-stats
     subset-of: devlink
@@ -1609,6 +1621,7 @@ operations:
             - eswitch-mode
             - eswitch-inline-mode
             - eswitch-encap-mode
+            - eswitch-state
 
     -
       name: eswitch-set
diff --git a/Documentation/networking/devlink/devlink-eswitch-attr.rst b/Documentation/networking/devlink/devlink-eswitch-attr.rst
index 08bb39ab1528..13ad1ed300ee 100644
--- a/Documentation/networking/devlink/devlink-eswitch-attr.rst
+++ b/Documentation/networking/devlink/devlink-eswitch-attr.rst
@@ -57,6 +57,18 @@ The following is a list of E-Switch attributes.
        * ``none`` Disable encapsulation support.
        * ``basic`` Enable encapsulation support.
 
+   * - ``state``
+     - enum
+     - The state of the E-Switch.
+       When bringing up the E-Switch, the user may want the ability to
+       block traffic towards the FDB until the FDB is fully
+       programmed.
+       The state can be one of the following:
+
+       * ``active`` Traffic is enabled on this eswitch FDB - default state
+       * ``inactive`` Traffic is disabled on this eswitch FDB - no traffic
+         will be forwarded to/from this eswitch FDB
+
 Example Usage
 =============
 
@@ -74,3 +86,6 @@ Example Usage
 
     # enable encap-mode with legacy mode
     $ devlink dev eswitch set pci/0000:08:00.0 mode legacy inline-mode none encap-mode basic
+
+    # enable switchdev mode in inactive state
+    $ devlink dev eswitch set pci/0000:08:00.0 mode switchdev state inactive
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 8d4362f010e4..aca56a905ab8 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -1369,6 +1369,11 @@ struct devlink_ops {
 	int (*eswitch_encap_mode_set)(struct devlink *devlink,
 				      enum devlink_eswitch_encap_mode encap_mode,
 				      struct netlink_ext_ack *extack);
+	int (*eswitch_state_get)(struct devlink *devlink,
+				 enum devlink_eswitch_state *state);
+	int (*eswitch_state_set)(struct devlink *devlink,
+				 enum devlink_eswitch_state state,
+				 struct netlink_ext_ack *extack);
 	int (*info_get)(struct devlink *devlink, struct devlink_info_req *req,
 			struct netlink_ext_ack *extack);
 	/**
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index bcad11a787a5..a01443810658 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -195,6 +195,11 @@ enum devlink_eswitch_encap_mode {
 	DEVLINK_ESWITCH_ENCAP_MODE_BASIC,
 };
 
+enum devlink_eswitch_state {
+	DEVLINK_ESWITCH_STATE_INACTIVE,
+	DEVLINK_ESWITCH_STATE_ACTIVE,
+};
+
 enum devlink_port_flavour {
 	DEVLINK_PORT_FLAVOUR_PHYSICAL, /* Any kind of a port physically
 					* facing the user.
@@ -638,6 +643,8 @@ enum devlink_attr {
 
 	DEVLINK_ATTR_HEALTH_REPORTER_BURST_PERIOD,	/* u64 */
 
+	DEVLINK_ATTR_ESWITCH_STATE,	/* u8 */
+
 	/* Add new attributes above here, update the spec in
 	 * Documentation/netlink/specs/devlink.yaml and re-generate
 	 * net/devlink/netlink_gen.c.
diff --git a/net/devlink/dev.c b/net/devlink/dev.c
index 02602704bdea..1eea3e2c1ade 100644
--- a/net/devlink/dev.c
+++ b/net/devlink/dev.c
@@ -672,6 +672,17 @@ static int devlink_nl_eswitch_fill(struct sk_buff *msg, struct devlink *devlink,
 			goto nla_put_failure;
 	}
 
+	if (ops->eswitch_state_get) {
+		enum devlink_eswitch_state state;
+
+		err = ops->eswitch_state_get(devlink, &state);
+		if (err)
+			goto nla_put_failure;
+		err = nla_put_u8(msg, DEVLINK_ATTR_ESWITCH_STATE, state);
+		if (err)
+			goto nla_put_failure;
+	}
+
 	genlmsg_end(msg, hdr);
 	return 0;
 
@@ -706,6 +717,7 @@ int devlink_nl_eswitch_set_doit(struct sk_buff *skb, struct genl_info *info)
 	struct devlink *devlink = info->user_ptr[0];
 	const struct devlink_ops *ops = devlink->ops;
 	enum devlink_eswitch_encap_mode encap_mode;
+	enum devlink_eswitch_state state;
 	u8 inline_mode;
 	int err = 0;
 	u16 mode;
@@ -722,6 +734,24 @@ int devlink_nl_eswitch_set_doit(struct sk_buff *skb, struct genl_info *info)
 			return err;
 	}
 
+	state = DEVLINK_ESWITCH_STATE_ACTIVE;
+	if (info->attrs[DEVLINK_ATTR_ESWITCH_STATE]) {
+		if (!ops->eswitch_state_set)
+			return -EOPNOTSUPP;
+		state = nla_get_u8(info->attrs[DEVLINK_ATTR_ESWITCH_STATE]);
+	}
+	/* If the user did not supply the state attribute, default to the
+	 * active state, so that drivers which support eswitch state are
+	 * always set to a well-defined state.
+	 * Keep this after mode-set as state handling can be dependent on
+	 * the eswitch mode.
+	 */
+	if (ops->eswitch_state_set) {
+		err = ops->eswitch_state_set(devlink, state, info->extack);
+		if (err)
+			return err;
+	}
+
 	if (info->attrs[DEVLINK_ATTR_ESWITCH_INLINE_MODE]) {
 		if (!ops->eswitch_inline_mode_set)
 			return -EOPNOTSUPP;
diff --git a/net/devlink/netlink_gen.c b/net/devlink/netlink_gen.c
index 9fd00977d59e..e0910fb2214d 100644
--- a/net/devlink/netlink_gen.c
+++ b/net/devlink/netlink_gen.c
@@ -226,12 +226,13 @@ static const struct nla_policy devlink_eswitch_get_nl_policy[DEVLINK_ATTR_DEV_NA
 };
 
 /* DEVLINK_CMD_ESWITCH_SET - do */
-static const struct nla_policy devlink_eswitch_set_nl_policy[DEVLINK_ATTR_ESWITCH_ENCAP_MODE + 1] = {
+static const struct nla_policy devlink_eswitch_set_nl_policy[DEVLINK_ATTR_ESWITCH_STATE + 1] = {
 	[DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, },
 	[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, },
 	[DEVLINK_ATTR_ESWITCH_MODE] = NLA_POLICY_MAX(NLA_U16, 1),
 	[DEVLINK_ATTR_ESWITCH_INLINE_MODE] = NLA_POLICY_MAX(NLA_U8, 3),
 	[DEVLINK_ATTR_ESWITCH_ENCAP_MODE] = NLA_POLICY_MAX(NLA_U8, 1),
+	[DEVLINK_ATTR_ESWITCH_STATE] = NLA_POLICY_MAX(NLA_U8, 1),
 };
 
 /* DEVLINK_CMD_DPIPE_TABLE_GET - do */
@@ -822,7 +823,7 @@ const struct genl_split_ops devlink_nl_ops[74] = {
 		.doit		= devlink_nl_eswitch_set_doit,
 		.post_doit	= devlink_nl_post_doit,
 		.policy		= devlink_eswitch_set_nl_policy,
-		.maxattr	= DEVLINK_ATTR_ESWITCH_ENCAP_MODE,
+		.maxattr	= DEVLINK_ATTR_ESWITCH_STATE,
 		.flags		= GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
 	},
 	{
-- 
2.51.0

