From: Mark Bloch <mbloch@nvidia.com>
To: Jiri Pirko <jiri@resnulli.us>, Eric Dumazet <edumazet@google.com>,
"Jakub Kicinski" <kuba@kernel.org>,
Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>
Cc: Saeed Mahameed <saeedm@nvidia.com>,
Leon Romanovsky <leon@kernel.org>,
Tariq Toukan <tariqt@nvidia.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
Jonathan Corbet <corbet@lwn.net>,
Shuah Khan <skhan@linuxfoundation.org>, <netdev@vger.kernel.org>,
<linux-rdma@vger.kernel.org>, <linux-doc@vger.kernel.org>,
Mark Bloch <mbloch@nvidia.com>
Subject: [PATCH net-next V4 0/6] devlink: Add boot-time eswitch mode defaults
Date: Mon, 29 Jun 2026 21:20:55 +0300
Message-ID: <20260629182102.245150-1-mbloch@nvidia.com>
This series adds a devlink_eswitch_mode= kernel command line parameter
for setting a default devlink eswitch mode during boot.
Following the discussion with Jakub[1] and the feedback on the RFC
postings, this version keeps the scope limited to a boot-time devlink
eswitch mode default only.
The option selects either all devlink handles or an explicit
comma-separated handle list:
    devlink_eswitch_mode=*=switchdev
    devlink_eswitch_mode=pci/0000:08:00.0,pci/0000:09:00.1=switchdev_inactive
The supported modes are legacy, switchdev and switchdev_inactive. The
selected mode is applied through the existing eswitch_mode_set() devlink
operation, the same operation used by the devlink eswitch mode command.
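To make the <selector>=<mode> syntax concrete, here is a minimal user-space
sketch of how such a value could be split. It is not the parser from patch 3:
the struct, the function names and the fixed limits (MAX_HANDLES,
MAX_HANDLE_LEN) are assumptions made for illustration, and the input is assumed
to be the text that follows "devlink_eswitch_mode=".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The three modes named in this cover letter. */
enum esw_mode { ESW_LEGACY, ESW_SWITCHDEV, ESW_SWITCHDEV_INACTIVE };

/* Illustrative limits, not taken from the series. */
#define MAX_HANDLES	8
#define MAX_HANDLE_LEN	64

/* Hypothetical parse result; not the structure used by the series. */
struct esw_default {
	bool match_all;		/* selector was "*" */
	size_t nr_handles;	/* number of explicit handles */
	char handles[MAX_HANDLES][MAX_HANDLE_LEN];
	enum esw_mode mode;
};

static int parse_mode(const char *s, enum esw_mode *mode)
{
	if (!strcmp(s, "legacy"))
		*mode = ESW_LEGACY;
	else if (!strcmp(s, "switchdev"))
		*mode = ESW_SWITCHDEV;
	else if (!strcmp(s, "switchdev_inactive"))
		*mode = ESW_SWITCHDEV_INACTIVE;
	else
		return -1;
	return 0;
}

/* Parse "<selector>=<mode>". Returns 0 on success, -1 on malformed input. */
int parse_esw_default(const char *arg, struct esw_default *out)
{
	const char *eq = strrchr(arg, '=');	/* last '=' separates the mode */
	const char *p, *comma;
	size_t len;

	if (!eq || eq == arg)
		return -1;
	memset(out, 0, sizeof(*out));
	if (parse_mode(eq + 1, &out->mode))
		return -1;

	/* "*" selects every devlink handle. */
	if (eq - arg == 1 && arg[0] == '*') {
		out->match_all = true;
		return 0;
	}

	/* Otherwise split the selector on commas; empty entries are rejected. */
	for (p = arg; ; p = comma + 1) {
		comma = memchr(p, ',', (size_t)(eq - p));
		if (!comma)
			comma = eq;
		len = (size_t)(comma - p);
		if (len == 0 || len >= MAX_HANDLE_LEN ||
		    out->nr_handles == MAX_HANDLES)
			return -1;
		memcpy(out->handles[out->nr_handles], p, len);
		out->handles[out->nr_handles][len] = '\0';
		out->nr_handles++;
		if (comma == eq)
			break;
	}
	return 0;
}
```

Passing "*=switchdev" sets match_all, while the second example above yields two
handles and ESW_SWITCHDEV_INACTIVE. Splitting on the last '=' is a choice made
for this sketch and relies on handles not containing '='.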
Registration may happen before a driver is ready to change eswitch mode,
so devlink core queues an asynchronous apply request from devl_register().
The worker takes the devlink instance lock before calling into the driver.
After a successful reload that performed DRIVER_REINIT, devlink core
already holds the devlink instance lock and the driver has completed
reload_up(), so the default is applied directly from the reload path.
Drivers that know exactly when the device is ready can call
devl_apply_default_esw_mode() directly. mlx5 uses this after initial
probe, when the device is initialized and the devlink lock is already
held.
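The locking contract behind these apply paths can be modelled with a small
single-threaded toy. This is only a sketch of the rule described above (the
worker takes the instance lock itself, the other callers already hold it).
Every type and function name below is hypothetical, the lock is a plain flag
rather than a real mutex, and how the real series coordinates the queued work
with a direct driver call is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

enum esw_mode { ESW_LEGACY, ESW_SWITCHDEV, ESW_SWITCHDEV_INACTIVE };

/* Toy stand-in for a devlink instance; all fields are hypothetical. */
struct toy_devlink {
	bool locked;		/* models the devlink instance lock */
	bool has_default;	/* a boot-time default matched this handle */
	enum esw_mode boot_default;
	enum esw_mode mode;	/* current eswitch mode */
	bool work_pending;	/* an async apply request is queued */
	int nr_applied;		/* how many times the default was applied */
};

/* Common apply step: the caller must already hold the instance lock. */
void toy_apply_locked(struct toy_devlink *dl)
{
	assert(dl->locked);
	if (!dl->has_default)
		return;
	dl->mode = dl->boot_default;	/* stands in for eswitch_mode_set() */
	dl->nr_applied++;
}

/* Registration only queues the request; the driver may not be ready yet. */
void toy_register(struct toy_devlink *dl)
{
	dl->work_pending = true;
}

/* The worker runs later and takes the lock before calling the driver. */
void toy_worker(struct toy_devlink *dl)
{
	if (!dl->work_pending)
		return;
	dl->work_pending = false;
	dl->locked = true;
	toy_apply_locked(dl);
	dl->locked = false;
}
```

In this model the reload path and a driver's direct call both correspond to
calling toy_apply_locked() with locked already set, whereas registration goes
through toy_register() followed later by toy_worker().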
Patch 1 clears the mlx5 FW reset-in-progress bit before reload.
Patch 2 factors the common eswitch mode set validation into a helper.
Patch 3 adds the devlink_eswitch_mode= parser and documentation.
Patch 4 applies parsed defaults from devlink core.
Patch 5 adds devl_apply_default_esw_mode() for drivers.
Patch 6 wires mlx5 to apply the default after initial probe.
Changelog:
v3 -> v4:
- Rework the registration-time apply to use per-devlink delayed work instead
  of calling eswitch_mode_set() directly from devl_register().
- Apply the default directly after successful DRIVER_REINIT devlink reload,
where the devlink lock is already held and reload_up() has completed.
- Add devl_apply_default_esw_mode() for drivers that know their exact ready
point.
- Drop the driver registration-ordering preparation patches that are no
longer needed with the async registration apply path.
v2 -> v3:
- Change the devlink_eswitch_mode= API syntax to use <selector>=<mode>
instead of [<selector>]:<mode>, following a comment from Randy Dunlap.
v1 -> v2:
- Move default eswitch mode application into devlink core. The default is
now applied during devlink registration and after a successful devlink
reload that performed DRIVER_REINIT.
- Remove the exported devl_apply_default_esw_mode() driver API and the mlx5
driver-side call to it.
- Skip devlink health recovery notifications while the devlink instance is
not registered, so drivers can move registration later without early
health work hitting registration assertions.
- Move mlx5 devlink registration after device initialization, including the
lightweight init path, so the core can apply the default through the
normal registration flow.
- Move the matching netdevsim and mlx5 unregister paths before object
teardown, so unregister notifications come from devl_unregister() and the
later object teardown paths run while the devlink instance is no longer
registered.
- Add registration-ordering preparation patches for netdevsim and octeontx2
AF/PF, so their eswitch state is ready before registration-time defaults
may call eswitch_mode_set().
[1] lore.kernel.org/r/20260502184153.4fd8d06f@kernel.org/
RFC v1: lore.kernel.org/r/20260506123739.1959770-1-mbloch@nvidia.com/
RFC v2: lore.kernel.org/r/20260510185424.2041415-1-mbloch@nvidia.com/
v1: lore.kernel.org/r/20260521072434.362624-1-tariqt@nvidia.com/
v2: lore.kernel.org/all/20260603193259.3412464-1-mbloch@nvidia.com/
v3: lore.kernel.org/all/20260605181030.3486619-1-mbloch@nvidia.com/
Mark Bloch (6):
net/mlx5: Clear FW reset-in-progress bit before reload
devlink: Factor out eswitch mode setting
devlink: Parse eswitch mode boot defaults
devlink: Apply eswitch mode boot defaults
devlink: Add API to apply eswitch mode boot default
net/mlx5: Apply devlink eswitch mode boot default on probe
.../admin-guide/kernel-parameters.txt | 25 ++
.../networking/devlink/devlink-defaults.rst | 78 ++++
Documentation/networking/devlink/index.rst | 1 +
.../ethernet/mellanox/mlx5/core/fw_reset.c | 28 +-
.../net/ethernet/mellanox/mlx5/core/main.c | 13 +
include/net/devlink.h | 1 +
net/devlink/core.c | 393 ++++++++++++++++++
net/devlink/dev.c | 33 +-
net/devlink/devl_internal.h | 8 +
9 files changed, 562 insertions(+), 18 deletions(-)
create mode 100644 Documentation/networking/devlink/devlink-defaults.rst
base-commit: 805185b7c7a1069e407b6f7b3bc98e44d415f484
--
2.43.0