* [RFC V2 net-next 1/2] devlink: Add eswitch mode boot defaults
2026-05-10 18:54 [RFC V2 net-next 0/2] devlink: Add boot-time defaults Mark Bloch
@ 2026-05-10 18:54 ` Mark Bloch
2026-05-10 18:54 ` [RFC V2 net-next 2/2] net/mlx5: Apply devlink boot defaults during init Mark Bloch
1 sibling, 0 replies; 3+ messages in thread
From: Mark Bloch @ 2026-05-10 18:54 UTC
To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller, Jiri Pirko
Cc: Jonathan Corbet, Shuah Khan, Simon Horman, Saeed Mahameed,
Leon Romanovsky, Tariq Toukan, Mark Bloch, Andrew Morton,
Borislav Petkov (AMD), Randy Dunlap, Petr Mladek,
Peter Zijlstra (Intel), Christian Brauner, Thomas Gleixner,
Dapeng Mi, Kees Cook, Marco Elver, Li RongQing, Eric Biggers,
Paul E. McKenney, linux-doc, linux-kernel, netdev, linux-rdma

Add devlink= kernel command line support for configuring devlink
eswitch mode during device initialization.

The supported syntax selects either all devlink handles or one explicit
comma-separated handle list:

  devlink=[*]:esw:mode:<mode>
  devlink=[<handle>[,<handle>...]]:esw:mode:<mode>

where <mode> is one of legacy, switchdev or switchdev_inactive. All
selected handles receive the same mode. Comma-separated default groups are
not supported.

The default is applied through the existing eswitch_mode_set() devlink
operation, matching the userspace devlink eswitch set command.

Document the devlink= syntax and duplicate handle handling.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
---
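Note for reviewers: the grammar described above can be exercised outside
the kernel. The following is a minimal userspace sketch, not the code in
this patch. The names check_devlink_value(), handle_is_valid(),
parse_mode() and enum esw_mode are hypothetical, the 256-byte buffer is an
arbitrary assumption, and duplicate-handle detection is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's eswitch mode enum. */
enum esw_mode { ESW_LEGACY, ESW_SWITCHDEV, ESW_SWITCHDEV_INACTIVE };

/* Map a mode string to the enum; returns 0 on success, -1 if unknown. */
static int parse_mode(const char *s, enum esw_mode *mode)
{
	if (!strcmp(s, "legacy"))
		*mode = ESW_LEGACY;
	else if (!strcmp(s, "switchdev"))
		*mode = ESW_SWITCHDEV;
	else if (!strcmp(s, "switchdev_inactive"))
		*mode = ESW_SWITCHDEV_INACTIVE;
	else
		return -1;
	return 0;
}

/* Validate one <bus-name>/<dev-name> handle against the documented rules. */
static bool handle_is_valid(const char *h)
{
	const char *slash = strchr(h, '/');

	if (strpbrk(h, "[]* \t"))
		return false;
	/* Exactly one '/', with non-empty text on both sides. */
	if (!slash || slash == h || !slash[1] || strchr(slash + 1, '/'))
		return false;
	/* <bus-name> must not contain ':'; <dev-name> may. */
	if (memchr(h, ':', (size_t)(slash - h)))
		return false;
	return true;
}

/*
 * Check a complete devlink= value such as
 * "[pci/0000:08:00.0]:esw:mode:legacy" on a private copy.
 * Returns 0 and fills *mode if the value is well formed, -1 otherwise.
 */
static int check_devlink_value(const char *value, enum esw_mode *mode)
{
	char buf[256];
	char *end, *tok, *next;

	if (value[0] != '[' || strlen(value) >= sizeof(buf))
		return -1;
	strcpy(buf, value);

	end = strchr(buf + 1, ']');
	if (!end || end == buf + 1 || strncmp(end + 1, ":esw:mode:", 10))
		return -1;
	*end = '\0';
	/* The mode string starts after "]" plus the 10-byte ":esw:mode:". */
	if (parse_mode(end + 11, mode))
		return -1;

	if (!strcmp(buf + 1, "*"))
		return 0;

	/* Split on ',' by hand so that empty handles are rejected. */
	for (tok = buf + 1; tok; tok = next) {
		next = strchr(tok, ',');
		if (next)
			*next++ = '\0';
		if (!handle_is_valid(tok))
			return -1;
	}
	return 0;
}
```

With this sketch, a second default group after the mode is rejected because
the text following ":esw:mode:" is no longer a known mode name.
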
.../admin-guide/kernel-parameters.txt | 24 ++
.../networking/devlink/devlink-defaults.rst | 93 ++++++
Documentation/networking/devlink/index.rst | 1 +
include/net/devlink.h | 1 +
net/devlink/core.c | 269 ++++++++++++++++++
5 files changed, 388 insertions(+)
create mode 100644 Documentation/networking/devlink/devlink-defaults.rst
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 7834ee927310..46435bdfe039 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1278,6 +1278,30 @@ Kernel parameters
dell_smm_hwmon.fan_max=
[HW] Maximum configurable fan speed.
+ devlink= [NET]
+ Format:
+ [<selector>]:esw:mode:<mode>
+
+ <selector>:
+ * | <handle>[,<handle>...]
+
+ <handle>:
+ <bus-name>/<dev-name>
+
+ Configure default devlink settings for matching
+ devlink instances during device initialization.
+
+ Currently supported settings:
+ esw:mode:{ legacy | switchdev | switchdev_inactive }
+
+ Examples:
+ devlink=[*]:esw:mode:switchdev
+ devlink=[pci/0000:08:00.0]:esw:mode:switchdev
+ devlink=[pci/0000:08:00.0,pci/0000:09:00.1]:esw:mode:legacy
+
+ See Documentation/networking/devlink/devlink-defaults.rst
+ for the full syntax.
+
dfltcc= [HW,S390]
Format: { on | off | def_only | inf_only | always }
on: s390 zlib hardware support for compression on
diff --git a/Documentation/networking/devlink/devlink-defaults.rst b/Documentation/networking/devlink/devlink-defaults.rst
new file mode 100644
index 000000000000..4db3025c540f
--- /dev/null
+++ b/Documentation/networking/devlink/devlink-defaults.rst
@@ -0,0 +1,93 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+================
+Devlink Defaults
+================
+
+Devlink defaults allow selected devlink settings to be provided on the
+kernel command line and applied to matching devlink instances during device
+initialization.
+
+The devlink device is selected by its devlink handle. For PCI devices this is
+the same handle shown by ``devlink dev show``, for example
+``pci/0000:08:00.0``.
+
+Kernel command line syntax
+==========================
+
+Defaults are specified with the ``devlink=`` kernel command line parameter.
+
+The general syntax is::
+
+ devlink=[<selector>]:esw:mode:<mode>
+
+``<selector>`` is either ``*`` or one or more devlink handles::
+
+ * | <bus-name>/<dev-name>[,<bus-name>/<dev-name>...]
+
+``*`` applies the default to every devlink instance. All handles in the same
+``[]`` list receive the same eswitch mode setting.
+
+``<mode>`` is one of ``legacy``, ``switchdev`` or ``switchdev_inactive``.
+
+Syntax rules
+------------
+
+The following syntax rules apply:
+
+* Specify the default in one ``devlink=`` parameter. Repeated ``devlink=``
+ parameters are not accumulated.
+* The ``devlink=`` value is limited by the kernel command line size.
+* Whitespace is not allowed within the parameter value.
+* ``<selector>`` must be either ``*`` or a handle list. ``*`` cannot be
+ combined with explicit handles.
+* ``<bus-name>`` and ``<dev-name>`` must not be empty.
+* ``<bus-name>`` must not contain ``:``.
+* ``<dev-name>`` may contain ``:``. This allows PCI names such as
+ ``0000:08:00.0``.
+* Handles must not contain whitespace, ``[``, ``]``, ``*`` or more than one
+ ``/``.
+* A comma inside ``[]`` separates handles.
+* Comma-separated default groups are not supported.
+* Duplicate handles are rejected and the devlink default is ignored.
+
+Supported defaults
+==================
+
+The supported command is ``esw``:
+
+.. list-table::
+ :widths: 10 25 35
+ :header-rows: 1
+
+ * - Command
+ - Options
+ - Values
+ * - ``esw``
+ - ``mode:<mode>``
+ - ``legacy``, ``switchdev``, ``switchdev_inactive``
+
+The ``esw:mode`` default corresponds to the userspace command::
+
+ devlink dev eswitch set <handle> mode <value>
+
+
+Examples
+========
+
+Set all devlink instances to switchdev mode::
+
+ devlink=[*]:esw:mode:switchdev
+
+Set one PCI devlink instance to switchdev mode::
+
+ devlink=[pci/0000:08:00.0]:esw:mode:switchdev
+
+Set two PCI devlink instances to legacy mode::
+
+ devlink=[pci/0000:08:00.0,pci/0000:09:00.1]:esw:mode:legacy
+
+The following is invalid because comma-separated default groups are not
+supported::
+
+ devlink=[pci/0000:08:00.0]:esw:mode:switchdev,[pci/0000:09:00.0]:esw:mode:switchdev_inactive
diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst
index f7ba7dcf477d..0d27a7008b14 100644
--- a/Documentation/networking/devlink/index.rst
+++ b/Documentation/networking/devlink/index.rst
@@ -56,6 +56,7 @@ general.
:maxdepth: 1
devlink-dpipe
+ devlink-defaults
devlink-eswitch-attr
devlink-flash
devlink-health
diff --git a/include/net/devlink.h b/include/net/devlink.h
index bcd31de1f890..058654d6800f 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -1622,6 +1622,7 @@ int devl_trylock(struct devlink *devlink);
void devl_unlock(struct devlink *devlink);
void devl_assert_locked(struct devlink *devlink);
bool devl_lock_is_held(struct devlink *devlink);
+int devl_apply_defaults(struct devlink *devlink);
DEFINE_GUARD(devl, struct devlink *, devl_lock(_T), devl_unlock(_T));
struct ib_device;
diff --git a/net/devlink/core.c b/net/devlink/core.c
index eeb6a71f5f56..2cfd50f9393b 100644
--- a/net/devlink/core.c
+++ b/net/devlink/core.c
@@ -4,6 +4,10 @@
* Copyright (c) 2016 Jiri Pirko <jiri@mellanox.com>
*/
+#include <linux/init.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/string.h>
#include <net/genetlink.h>
#define CREATE_TRACE_POINTS
#include <trace/events/devlink.h>
@@ -16,6 +20,247 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(devlink_trap_report);
DEFINE_XARRAY_FLAGS(devlinks, XA_FLAGS_ALLOC);
+static char *devlink_default;
+static bool devlink_default_match_all;
+static enum devlink_eswitch_mode devlink_default_eswitch_mode;
+static LIST_HEAD(devlink_default_nodes);
+
+struct devlink_default_node {
+ struct list_head list;
+ char *bus_name;
+ char *dev_name;
+};
+
+static int __init
+devlink_default_esw_mode_to_value(const char *str,
+ enum devlink_eswitch_mode *mode)
+{
+ if (!strcmp(str, "legacy")) {
+ *mode = DEVLINK_ESWITCH_MODE_LEGACY;
+ return 0;
+ }
+ if (!strcmp(str, "switchdev")) {
+ *mode = DEVLINK_ESWITCH_MODE_SWITCHDEV;
+ return 0;
+ }
+ if (!strcmp(str, "switchdev_inactive")) {
+ *mode = DEVLINK_ESWITCH_MODE_SWITCHDEV_INACTIVE;
+ return 0;
+ }
+
+ return -EINVAL;
+}
+
+static int __init
+devlink_default_cmd_parse(char *str)
+{
+ char *cmd;
+ char *attr;
+ char *mode;
+
+ cmd = strsep(&str, ":");
+ attr = strsep(&str, ":");
+ mode = strsep(&str, ":");
+ if (!cmd || strcmp(cmd, "esw") || !attr || strcmp(attr, "mode") ||
+ !mode || !*mode || str)
+ return -EINVAL;
+
+ return devlink_default_esw_mode_to_value(mode,
+ &devlink_default_eswitch_mode);
+}
+
+static int devlink_default_eswitch_apply(struct devlink *devlink)
+{
+ const struct devlink_ops *ops = devlink->ops;
+
+ if (!ops->eswitch_mode_set)
+ return -EOPNOTSUPP;
+
+ return ops->eswitch_mode_set(devlink, devlink_default_eswitch_mode,
+ NULL);
+}
+
+static int __init
+devlink_default_handle_parse(char *handle, char **bus_name, char **dev_name)
+{
+ char *slash;
+ char *p;
+
+ if (!handle || !*handle)
+ return -EINVAL;
+
+ for (p = handle; *p; p++) {
+ if (*p == '[' || *p == ']' || *p == '*')
+ return -EINVAL;
+ }
+
+ slash = strchr(handle, '/');
+ if (!slash || slash == handle || !slash[1])
+ return -EINVAL;
+ if (strchr(slash + 1, '/'))
+ return -EINVAL;
+
+ *slash = '\0';
+ if (strchr(handle, ':'))
+ return -EINVAL;
+
+ *bus_name = handle;
+ *dev_name = slash + 1;
+ return 0;
+}
+
+static struct devlink_default_node *
+devlink_default_node_find(const char *bus_name, const char *dev_name)
+{
+ struct devlink_default_node *node;
+
+ list_for_each_entry(node, &devlink_default_nodes, list) {
+ if (!strcmp(node->bus_name, bus_name) &&
+ !strcmp(node->dev_name, dev_name))
+ return node;
+ }
+
+ return NULL;
+}
+
+static int __init
+devlink_default_node_add(const char *bus_name, const char *dev_name)
+{
+ struct devlink_default_node *node;
+
+ if (devlink_default_node_find(bus_name, dev_name))
+ return -EEXIST;
+
+ node = kzalloc_obj(*node);
+ if (!node)
+ return -ENOMEM;
+
+ INIT_LIST_HEAD(&node->list);
+ node->bus_name = kstrdup(bus_name, GFP_KERNEL);
+ node->dev_name = kstrdup(dev_name, GFP_KERNEL);
+ if (!node->bus_name || !node->dev_name) {
+ kfree(node->bus_name);
+ kfree(node->dev_name);
+ kfree(node);
+ return -ENOMEM;
+ }
+
+ list_add_tail(&node->list, &devlink_default_nodes);
+ return 0;
+}
+
+static int __init devlink_default_handles_parse(char *handles)
+{
+ char *handle;
+ int err;
+
+ if (!strcmp(handles, "*")) {
+ devlink_default_match_all = true;
+ return 0;
+ }
+
+ while ((handle = strsep(&handles, ",")) != NULL) {
+ char *bus_name;
+ char *dev_name;
+
+ err = devlink_default_handle_parse(handle, &bus_name,
+ &dev_name);
+ if (err)
+ return err;
+
+ err = devlink_default_node_add(bus_name, dev_name);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+static void __init devlink_default_node_free(struct devlink_default_node *node)
+{
+ kfree(node->bus_name);
+ kfree(node->dev_name);
+ kfree(node);
+}
+
+static void __init devlink_default_nodes_clear(void)
+{
+ struct devlink_default_node *node;
+ struct devlink_default_node *node_tmp;
+
+ list_for_each_entry_safe(node, node_tmp, &devlink_default_nodes, list) {
+ list_del(&node->list);
+ devlink_default_node_free(node);
+ }
+
+ devlink_default_match_all = false;
+}
+
+static int __init devlink_default_parse(char *str)
+{
+ char *handles_end;
+ char *handles;
+ char *cmd;
+ int err;
+
+ if (!str || *str != '[')
+ return -EINVAL;
+
+ handles = str + 1;
+ handles_end = strchr(handles, ']');
+ if (!handles_end || handles_end[1] != ':' || !handles_end[2])
+ return -EINVAL;
+
+ *handles_end = '\0';
+ cmd = handles_end + 2;
+ if (!*handles)
+ return -EINVAL;
+
+ err = devlink_default_cmd_parse(cmd);
+ if (err)
+ return err;
+
+ err = devlink_default_handles_parse(handles);
+ if (err)
+ devlink_default_nodes_clear();
+
+ return err;
+}
+
+/**
+ * devl_apply_defaults - Apply defaults matching the devlink instance
+ * @devlink: devlink
+ *
+ * The caller must hold the devlink instance lock.
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+int devl_apply_defaults(struct devlink *devlink)
+{
+ const char *bus_name = devlink_bus_name(devlink);
+ const char *dev_name = devlink_dev_name(devlink);
+ struct devlink_default_node *node;
+
+ devl_assert_locked(devlink);
+
+ if (devlink_default_match_all)
+ return devlink_default_eswitch_apply(devlink);
+
+ node = devlink_default_node_find(bus_name, dev_name);
+ if (node)
+ return devlink_default_eswitch_apply(devlink);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(devl_apply_defaults);
+
+static int __init devlink_default_setup(char *str)
+{
+ devlink_default = str;
+ return 1;
+}
+__setup("devlink=", devlink_default_setup);
+
static struct devlink *devlinks_xa_get(unsigned long index)
{
struct devlink *devlink;
@@ -578,6 +823,27 @@ static int __init devlink_init(void)
{
int err;
+ if (devlink_default) {
+ char *def;
+
+ def = kstrdup(devlink_default, GFP_KERNEL);
+ if (!def) {
+ err = -ENOMEM;
+ goto out;
+ }
+ err = devlink_default_parse(def);
+ kfree(def);
+ if (err == -EEXIST) {
+ devlink_default = NULL;
+ pr_warn("devlink: duplicate handles ignored\n");
+ } else if (err == -EINVAL) {
+ devlink_default = NULL;
+ pr_warn("devlink: invalid command line parameter ignored\n");
+ } else if (err) {
+ goto out;
+ }
+ }
+
err = register_pernet_subsys(&devlink_pernet_ops);
if (err)
goto out;
@@ -593,7 +859,10 @@ static int __init devlink_init(void)
out_unreg_pernet_subsys:
unregister_pernet_subsys(&devlink_pernet_ops);
out:
+ if (err)
+ devlink_default_nodes_clear();
WARN_ON(err);
+
return err;
}
--
2.34.1
* [RFC V2 net-next 2/2] net/mlx5: Apply devlink boot defaults during init
From: Mark Bloch @ 2026-05-10 18:54 UTC
To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller, Jiri Pirko
Cc: Jonathan Corbet, Shuah Khan, Simon Horman, Saeed Mahameed,
Leon Romanovsky, Tariq Toukan, Mark Bloch, Andrew Morton,
Borislav Petkov (AMD), Randy Dunlap, Petr Mladek,
Peter Zijlstra (Intel), Christian Brauner, Thomas Gleixner,
Dapeng Mi, Kees Cook, Marco Elver, Li RongQing, Eric Biggers,
Paul E. McKenney, linux-doc, linux-kernel, netdev, linux-rdma

Apply devlink boot defaults for mlx5 devices after successful device
initialization while holding the devlink instance lock.

At this point the devlink instance is registered and the mlx5 devlink
operations are available, so eswitch mode defaults can be applied to
the matching PCI devlink handle.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
---
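Note for reviewers: the call site added below ignores the return value of
devl_apply_defaults(), so a failed default does not fail device init. The
following userspace model illustrates that control flow only. All model_*
names and types are hypothetical stand-ins, not kernel definitions, and the
model matches a single handle instead of a list.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel types. */
enum model_mode { MODEL_ESW_LEGACY, MODEL_ESW_SWITCHDEV };

struct model_devlink;

struct model_ops {
	int (*eswitch_mode_set)(struct model_devlink *dl, enum model_mode mode);
};

struct model_devlink {
	const struct model_ops *ops;
	const char *bus_name;
	const char *dev_name;
	enum model_mode mode;
};

/* Parsed boot default: match everything or one handle (simplified). */
struct model_default {
	bool match_all;
	const char *bus_name;
	const char *dev_name;
	enum model_mode mode;
};

/* Driver callback used by the model; simply records the new mode. */
static int model_set_mode(struct model_devlink *dl, enum model_mode mode)
{
	dl->mode = mode;
	return 0;
}

/* Apply the default if the instance matches; 0 when nothing matches. */
static int model_apply_defaults(struct model_devlink *dl,
				const struct model_default *def)
{
	bool match = def->match_all ||
		     (!strcmp(def->bus_name, dl->bus_name) &&
		      !strcmp(def->dev_name, dl->dev_name));

	if (!match)
		return 0;
	if (!dl->ops->eswitch_mode_set)
		return -EOPNOTSUPP;
	return dl->ops->eswitch_mode_set(dl, def->mode);
}

/*
 * Model of the init path: only init_err decides the return value.
 * The result of applying the default is reported through *default_err
 * so a caller could log it, but it never fails initialization.
 */
static int model_init_one(struct model_devlink *dl,
			  const struct model_default *def,
			  int init_err, int *default_err)
{
	*default_err = 0;
	if (init_err)
		return init_err;
	*default_err = model_apply_defaults(dl, def);
	return 0;
}
```

Whether a driver should log a failed default is a design choice this
sketch leaves open.
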
drivers/net/ethernet/mellanox/mlx5/core/main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 296c5223cf61..deea7150084f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1470,6 +1470,8 @@ int mlx5_init_one(struct mlx5_core_dev *dev)
err = mlx5_init_one_devl_locked(dev);
if (err)
devl_unregister(devlink);
+ else
+ devl_apply_defaults(devlink);
unlock:
devl_unlock(devlink);
return err;
--
2.34.1