From: Mark Bloch <mbloch@nvidia.com>
To: Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Jiri Pirko <jiri@resnulli.us>
Cc: Jonathan Corbet <corbet@lwn.net>,
Shuah Khan <skhan@linuxfoundation.org>,
Simon Horman <horms@kernel.org>,
Saeed Mahameed <saeedm@nvidia.com>,
"Leon Romanovsky" <leon@kernel.org>,
Tariq Toukan <tariqt@nvidia.com>, Mark Bloch <mbloch@nvidia.com>,
Andrew Morton <akpm@linux-foundation.org>,
"Borislav Petkov (AMD)" <bp@alien8.de>,
Randy Dunlap <rdunlap@infradead.org>,
"Petr Mladek" <pmladek@suse.com>,
"Peter Zijlstra (Intel)" <peterz@infradead.org>,
Christian Brauner <brauner@kernel.org>,
Thomas Gleixner <tglx@kernel.org>,
Dapeng Mi <dapeng1.mi@linux.intel.com>,
Kees Cook <kees@kernel.org>, "Marco Elver" <elver@google.com>,
Li RongQing <lirongqing@baidu.com>,
Eric Biggers <ebiggers@kernel.org>,
"Paul E. McKenney" <paulmck@kernel.org>,
<linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
<netdev@vger.kernel.org>, <linux-rdma@vger.kernel.org>
Subject: [RFC V2 net-next 1/2] devlink: Add eswitch mode boot defaults
Date: Sun, 10 May 2026 21:54:23 +0300
Message-ID: <20260510185424.2041415-2-mbloch@nvidia.com>
In-Reply-To: <20260510185424.2041415-1-mbloch@nvidia.com>

Add devlink= kernel command line support for configuring devlink
eswitch mode during device initialization.

The supported syntax selects either all devlink handles or one explicit
comma-separated handle list:

  devlink=[*]:esw:mode:<mode>
  devlink=[<handle>[,<handle>...]]:esw:mode:<mode>

where <mode> is one of legacy, switchdev or switchdev_inactive. All
selected handles receive the same mode. Comma-separated default groups are
not supported.

The default is applied through the existing eswitch_mode_set() devlink
operation, matching the userspace devlink eswitch set command.

Document the devlink= syntax and duplicate handle handling.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
---
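For readers who want to sanity-check a devlink= value before putting it on
a boot command line, the top-level syntax described above can be sketched in
userspace C roughly as follows. This is only an illustration under stated
assumptions: devlink_default_check() and mode_is_valid() are hypothetical
names that do not exist in the kernel, and the sketch only splits the value
into selector and mode; it does not validate the individual handles.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 1 if mode is one of the three documented names. */
static int mode_is_valid(const char *mode)
{
	return !strcmp(mode, "legacy") || !strcmp(mode, "switchdev") ||
	       !strcmp(mode, "switchdev_inactive");
}

/*
 * Hypothetical userspace checker, not the kernel parser.
 * Splits "[<selector>]:esw:mode:<mode>" and copies the selector and the
 * mode into the caller's buffers. Returns 0 on success, -1 on any syntax
 * error or if a buffer is too small.
 */
static int devlink_default_check(const char *value, char *selector,
				 size_t selector_len, char *mode,
				 size_t mode_len)
{
	static const char prefix[] = ":esw:mode:";
	const char *end;
	const char *m;
	size_t len;

	/* The value must start with the opening bracket of the selector. */
	if (!value || value[0] != '[')
		return -1;

	end = strchr(value + 1, ']');
	if (!end)
		return -1;

	/* An empty selector "[]" is rejected. */
	len = (size_t)(end - (value + 1));
	if (len == 0 || len >= selector_len)
		return -1;

	/* Only the esw:mode setting is accepted after the selector. */
	if (strncmp(end + 1, prefix, sizeof(prefix) - 1))
		return -1;

	/* Everything after the prefix must be exactly one known mode. */
	m = end + 1 + sizeof(prefix) - 1;
	if (!mode_is_valid(m) || strlen(m) >= mode_len)
		return -1;

	memcpy(selector, value + 1, len);
	selector[len] = '\0';
	strcpy(mode, m);
	return 0;
}
```

Calling devlink_default_check("[*]:esw:mode:switchdev", ...) yields the
selector "*" and the mode "switchdev". A value that chains a second group
after a comma is rejected because the text after ":esw:mode:" is then no
longer a bare mode name.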
.../admin-guide/kernel-parameters.txt | 24 ++
.../networking/devlink/devlink-defaults.rst | 93 ++++++
Documentation/networking/devlink/index.rst | 1 +
include/net/devlink.h | 1 +
net/devlink/core.c | 269 ++++++++++++++++++
5 files changed, 388 insertions(+)
create mode 100644 Documentation/networking/devlink/devlink-defaults.rst
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 7834ee927310..46435bdfe039 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1278,6 +1278,30 @@ Kernel parameters
dell_smm_hwmon.fan_max=
[HW] Maximum configurable fan speed.
+ devlink= [NET]
+ Format:
+ [<selector>]:esw:mode:<mode>
+
+ <selector>:
+ * | <handle>[,<handle>...]
+
+ <handle>:
+ <bus-name>/<dev-name>
+
+ Configure default devlink settings for matching
+ devlink instances during device initialization.
+
+ Currently supported settings:
+ esw:mode:{ legacy | switchdev | switchdev_inactive }
+
+ Examples:
+ devlink=[*]:esw:mode:switchdev
+ devlink=[pci/0000:08:00.0]:esw:mode:switchdev
+ devlink=[pci/0000:08:00.0,pci/0000:09:00.1]:esw:mode:legacy
+
+ See Documentation/networking/devlink/devlink-defaults.rst
+ for the full syntax.
+
dfltcc= [HW,S390]
Format: { on | off | def_only | inf_only | always }
on: s390 zlib hardware support for compression on
diff --git a/Documentation/networking/devlink/devlink-defaults.rst b/Documentation/networking/devlink/devlink-defaults.rst
new file mode 100644
index 000000000000..4db3025c540f
--- /dev/null
+++ b/Documentation/networking/devlink/devlink-defaults.rst
@@ -0,0 +1,93 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+================
+Devlink Defaults
+================
+
+Devlink defaults allow selected devlink settings to be provided on the
+kernel command line and applied to matching devlink instances during device
+initialization.
+
+The devlink device is selected by its devlink handle. For PCI devices this is
+the same handle shown by ``devlink dev show``, for example
+``pci/0000:08:00.0``.
+
+Kernel command line syntax
+==========================
+
+Defaults are specified with the ``devlink=`` kernel command line parameter.
+
+The general syntax is::
+
+  devlink=[<selector>]:esw:mode:<mode>
+
+``<selector>`` is either ``*`` or one or more devlink handles::
+
+  * | <bus-name>/<dev-name>[,<bus-name>/<dev-name>...]
+
+``*`` applies the default to every devlink instance. All handles in the same
+``[]`` list receive the same eswitch mode setting.
+
+``<mode>`` is one of ``legacy``, ``switchdev`` or ``switchdev_inactive``.
+
+Syntax rules
+------------
+
+The following syntax rules apply:
+
+* Specify the default in one ``devlink=`` parameter. Repeated ``devlink=``
+  parameters are not accumulated; only the last one is used.
+* The ``devlink=`` value is limited by the kernel command line size.
+* Whitespace is not allowed within the parameter value.
+* ``<selector>`` must be either ``*`` or a handle list. ``*`` cannot be
+  combined with explicit handles.
+* ``<bus-name>`` and ``<dev-name>`` must not be empty.
+* ``<bus-name>`` must not contain ``:``.
+* ``<dev-name>`` may contain ``:``. This allows PCI names such as
+  ``0000:08:00.0``.
+* Handles must not contain whitespace, ``[``, ``]``, ``*`` or more than one
+  ``/``.
+* A comma inside ``[]`` separates handles.
+* Comma-separated default groups are not supported.
+* Duplicate handles are rejected and the devlink default is ignored.
+
+Supported defaults
+==================
+
+The supported command is ``esw``:
+
+.. list-table::
+   :widths: 10 25 35
+   :header-rows: 1
+
+   * - Command
+     - Options
+     - Values
+   * - ``esw``
+     - ``mode:<mode>``
+     - ``legacy``, ``switchdev``, ``switchdev_inactive``
+
+The ``esw:mode`` default corresponds to the userspace command::
+
+  devlink dev eswitch set <handle> mode <value>
+
+
+Examples
+========
+
+Set all devlink instances to switchdev mode::
+
+  devlink=[*]:esw:mode:switchdev
+
+Set one PCI devlink instance to switchdev mode::
+
+  devlink=[pci/0000:08:00.0]:esw:mode:switchdev
+
+Set two PCI devlink instances to legacy mode::
+
+  devlink=[pci/0000:08:00.0,pci/0000:09:00.1]:esw:mode:legacy
+
+The following is invalid because comma-separated default groups are not
+supported::
+
+  devlink=[pci/0000:08:00.0]:esw:mode:switchdev,[pci/0000:09:00.0]:esw:mode:switchdev_inactive
diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst
index f7ba7dcf477d..0d27a7008b14 100644
--- a/Documentation/networking/devlink/index.rst
+++ b/Documentation/networking/devlink/index.rst
@@ -56,6 +56,7 @@ general.
:maxdepth: 1
devlink-dpipe
+ devlink-defaults
devlink-eswitch-attr
devlink-flash
devlink-health
diff --git a/include/net/devlink.h b/include/net/devlink.h
index bcd31de1f890..058654d6800f 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -1622,6 +1622,7 @@ int devl_trylock(struct devlink *devlink);
void devl_unlock(struct devlink *devlink);
void devl_assert_locked(struct devlink *devlink);
bool devl_lock_is_held(struct devlink *devlink);
+int devl_apply_defaults(struct devlink *devlink);
DEFINE_GUARD(devl, struct devlink *, devl_lock(_T), devl_unlock(_T));
struct ib_device;
diff --git a/net/devlink/core.c b/net/devlink/core.c
index eeb6a71f5f56..2cfd50f9393b 100644
--- a/net/devlink/core.c
+++ b/net/devlink/core.c
@@ -4,6 +4,10 @@
* Copyright (c) 2016 Jiri Pirko <jiri@mellanox.com>
*/
+#include <linux/init.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/string.h>
#include <net/genetlink.h>
#define CREATE_TRACE_POINTS
#include <trace/events/devlink.h>
@@ -16,6 +20,247 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(devlink_trap_report);
DEFINE_XARRAY_FLAGS(devlinks, XA_FLAGS_ALLOC);
+static char *devlink_default;
+static bool devlink_default_match_all;
+static enum devlink_eswitch_mode devlink_default_eswitch_mode;
+static LIST_HEAD(devlink_default_nodes);
+
+struct devlink_default_node {
+	struct list_head list;
+	char *bus_name;
+	char *dev_name;
+};
+
+static int __init
+devlink_default_esw_mode_to_value(const char *str,
+				  enum devlink_eswitch_mode *mode)
+{
+	if (!strcmp(str, "legacy")) {
+		*mode = DEVLINK_ESWITCH_MODE_LEGACY;
+		return 0;
+	}
+	if (!strcmp(str, "switchdev")) {
+		*mode = DEVLINK_ESWITCH_MODE_SWITCHDEV;
+		return 0;
+	}
+	if (!strcmp(str, "switchdev_inactive")) {
+		*mode = DEVLINK_ESWITCH_MODE_SWITCHDEV_INACTIVE;
+		return 0;
+	}
+
+	return -EINVAL;
+}
+
+static int __init
+devlink_default_cmd_parse(char *str)
+{
+	char *cmd;
+	char *attr;
+	char *mode;
+
+	cmd = strsep(&str, ":");
+	attr = strsep(&str, ":");
+	mode = strsep(&str, ":");
+	if (!cmd || strcmp(cmd, "esw") || !attr || strcmp(attr, "mode") ||
+	    !mode || !*mode || str)
+		return -EINVAL;
+
+	return devlink_default_esw_mode_to_value(mode,
+						 &devlink_default_eswitch_mode);
+}
+
+static int devlink_default_eswitch_apply(struct devlink *devlink)
+{
+	const struct devlink_ops *ops = devlink->ops;
+
+	if (!ops->eswitch_mode_set)
+		return -EOPNOTSUPP;
+
+	return ops->eswitch_mode_set(devlink, devlink_default_eswitch_mode,
+				     NULL);
+}
+
+static int __init
+devlink_default_handle_parse(char *handle, char **bus_name, char **dev_name)
+{
+	char *slash;
+	char *p;
+
+	if (!handle || !*handle)
+		return -EINVAL;
+
+	for (p = handle; *p; p++) {
+		if (*p == '[' || *p == ']' || *p == '*')
+			return -EINVAL;
+	}
+
+	slash = strchr(handle, '/');
+	if (!slash || slash == handle || !slash[1])
+		return -EINVAL;
+	if (strchr(slash + 1, '/'))
+		return -EINVAL;
+
+	*slash = '\0';
+	if (strchr(handle, ':'))
+		return -EINVAL;
+
+	*bus_name = handle;
+	*dev_name = slash + 1;
+	return 0;
+}
+
+static struct devlink_default_node *
+devlink_default_node_find(const char *bus_name, const char *dev_name)
+{
+	struct devlink_default_node *node;
+
+	list_for_each_entry(node, &devlink_default_nodes, list) {
+		if (!strcmp(node->bus_name, bus_name) &&
+		    !strcmp(node->dev_name, dev_name))
+			return node;
+	}
+
+	return NULL;
+}
+
+static int __init
+devlink_default_node_add(const char *bus_name, const char *dev_name)
+{
+	struct devlink_default_node *node;
+
+	if (devlink_default_node_find(bus_name, dev_name))
+		return -EEXIST;
+
+	node = kzalloc_obj(*node);
+	if (!node)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&node->list);
+	node->bus_name = kstrdup(bus_name, GFP_KERNEL);
+	node->dev_name = kstrdup(dev_name, GFP_KERNEL);
+	if (!node->bus_name || !node->dev_name) {
+		kfree(node->bus_name);
+		kfree(node->dev_name);
+		kfree(node);
+		return -ENOMEM;
+	}
+
+	list_add_tail(&node->list, &devlink_default_nodes);
+	return 0;
+}
+
+static int __init devlink_default_handles_parse(char *handles)
+{
+	char *handle;
+	int err;
+
+	if (!strcmp(handles, "*")) {
+		devlink_default_match_all = true;
+		return 0;
+	}
+
+	while ((handle = strsep(&handles, ",")) != NULL) {
+		char *bus_name;
+		char *dev_name;
+
+		err = devlink_default_handle_parse(handle, &bus_name,
+						   &dev_name);
+		if (err)
+			return err;
+
+		err = devlink_default_node_add(bus_name, dev_name);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static void __init devlink_default_node_free(struct devlink_default_node *node)
+{
+	kfree(node->bus_name);
+	kfree(node->dev_name);
+	kfree(node);
+}
+
+static void __init devlink_default_nodes_clear(void)
+{
+	struct devlink_default_node *node;
+	struct devlink_default_node *node_tmp;
+
+	list_for_each_entry_safe(node, node_tmp, &devlink_default_nodes, list) {
+		list_del(&node->list);
+		devlink_default_node_free(node);
+	}
+
+	devlink_default_match_all = false;
+}
+
+static int __init devlink_default_parse(char *str)
+{
+	char *handles_end;
+	char *handles;
+	char *cmd;
+	int err;
+
+	if (!str || *str != '[')
+		return -EINVAL;
+
+	handles = str + 1;
+	handles_end = strchr(handles, ']');
+	if (!handles_end || handles_end[1] != ':' || !handles_end[2])
+		return -EINVAL;
+
+	*handles_end = '\0';
+	cmd = handles_end + 2;
+	if (!*handles)
+		return -EINVAL;
+
+	err = devlink_default_cmd_parse(cmd);
+	if (err)
+		return err;
+
+	err = devlink_default_handles_parse(handles);
+	if (err)
+		devlink_default_nodes_clear();
+
+	return err;
+}
+
+/**
+ * devl_apply_defaults - Apply command line defaults to a devlink instance
+ * @devlink: devlink instance to configure
+ *
+ * The caller must hold the devlink instance lock.
+ *
+ * Return: 0 on success or if no default matches, negative error code otherwise.
+ */
+int devl_apply_defaults(struct devlink *devlink)
+{
+	const char *bus_name = devlink_bus_name(devlink);
+	const char *dev_name = devlink_dev_name(devlink);
+	struct devlink_default_node *node;
+
+	devl_assert_locked(devlink);
+
+	if (devlink_default_match_all)
+		return devlink_default_eswitch_apply(devlink);
+
+	node = devlink_default_node_find(bus_name, dev_name);
+	if (node)
+		return devlink_default_eswitch_apply(devlink);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(devl_apply_defaults);
+
+static int __init devlink_default_setup(char *str)
+{
+	devlink_default = str;
+	return 1;
+}
+__setup("devlink=", devlink_default_setup);
+
static struct devlink *devlinks_xa_get(unsigned long index)
{
struct devlink *devlink;
@@ -578,6 +823,27 @@ static int __init devlink_init(void)
 {
 	int err;

+	if (devlink_default) {
+		char *def;
+
+		def = kstrdup(devlink_default, GFP_KERNEL);
+		if (!def) {
+			err = -ENOMEM;
+			goto out;
+		}
+		err = devlink_default_parse(def);
+		kfree(def);
+		if (err == -EEXIST) {
+			devlink_default = NULL;
+			pr_warn("devlink: duplicate handles ignored\n");
+		} else if (err == -EINVAL) {
+			devlink_default = NULL;
+			pr_warn("devlink: invalid command line parameter ignored\n");
+		} else if (err) {
+			goto out;
+		}
+	}
+
 	err = register_pernet_subsys(&devlink_pernet_ops);
 	if (err)
 		goto out;
@@ -593,7 +859,10 @@ static int __init devlink_init(void)
 out_unreg_pernet_subsys:
 	unregister_pernet_subsys(&devlink_pernet_ops);
 out:
+	if (err)
+		devlink_default_nodes_clear();
 	WARN_ON(err);
+
 	return err;
 }
--
2.34.1
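
The per-handle rules listed in devlink-defaults.rst can also be checked in
userspace with a small helper. The sketch below is an assumption-laden
illustration, not code from this patch: devlink_handle_is_valid() is a
hypothetical name, and the explicit whitespace test follows the documented
rule rather than devlink_default_handle_parse(), which does not test for
whitespace itself.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical userspace check of one "<bus-name>/<dev-name>" handle.
 * Returns 1 if the handle satisfies the documented rules, 0 otherwise.
 */
static int devlink_handle_is_valid(const char *handle)
{
	const char *slash;
	const char *p;

	if (!handle || !*handle)
		return 0;

	/* Brackets, '*' and whitespace are never allowed in a handle. */
	for (p = handle; *p; p++) {
		if (*p == '[' || *p == ']' || *p == '*' ||
		    isspace((unsigned char)*p))
			return 0;
	}

	slash = strchr(handle, '/');
	if (!slash || slash == handle || !slash[1])
		return 0;	/* missing '/', empty bus or empty device name */
	if (strchr(slash + 1, '/'))
		return 0;	/* more than one '/' */
	if (memchr(handle, ':', (size_t)(slash - handle)))
		return 0;	/* ':' is only allowed in the device name */

	return 1;
}
```

With this helper, "pci/0000:08:00.0" is accepted, while "pci/", "/0000:08:00.0",
"pci/a/b" and "pc:i/0000:08:00.0" are all rejected.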