From: Jiri Pirko <jiri@resnulli.us>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Mark Bloch <mbloch@nvidia.com>,
Eric Dumazet <edumazet@google.com>,
Paolo Abeni <pabeni@redhat.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Jonathan Corbet <corbet@lwn.net>,
Shuah Khan <skhan@linuxfoundation.org>,
Simon Horman <horms@kernel.org>,
Saeed Mahameed <saeedm@nvidia.com>,
Leon Romanovsky <leon@kernel.org>,
Tariq Toukan <tariqt@nvidia.com>,
Andrew Morton <akpm@linux-foundation.org>,
"Borislav Petkov (AMD)" <bp@alien8.de>,
Randy Dunlap <rdunlap@infradead.org>,
Dave Hansen <dave.hansen@linux.intel.com>,
Christian Brauner <brauner@kernel.org>,
Petr Mladek <pmladek@suse.com>,
"Peter Zijlstra (Intel)" <peterz@infradead.org>,
Thomas Gleixner <tglx@kernel.org>,
Pawan Gupta <pawan.kumar.gupta@linux.intel.com>,
Dapeng Mi <dapeng1.mi@linux.intel.com>,
Kees Cook <kees@kernel.org>, Marco Elver <elver@google.com>,
Eric Biggers <ebiggers@kernel.org>,
Li RongQing <lirongqing@baidu.com>,
"Paul E. McKenney" <paulmck@kernel.org>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
netdev@vger.kernel.org, linux-rdma@vger.kernel.org
Subject: Re: [RFC net-next 0/4] devlink: Add boot-time defaults
Date: Sat, 9 May 2026 09:01:23 +0200 [thread overview]
Message-ID: <af7Y4AYv-XDCbK_8@FV6GYCPJ69> (raw)
In-Reply-To: <20260508175213.1952097f@kernel.org>
Sat, May 09, 2026 at 02:52:13AM +0200, kuba@kernel.org wrote:
>On Fri, 8 May 2026 20:07:44 +0200 Jiri Pirko wrote:
>> >I don't think switchdev by default should mean CX4+ in general. If we get
>> >there, I would expect it to be limited to the DPU/BlueField/ECPF case, where
>> >the host PF probe path can depend on the ECPF reaching switchdev. Changing the
>> >default for regular host NIC deployments feels like a much larger compatibility
>> >change.
>>
>> We can't travel throught time, but if from CX5 onwards the default would
>> be switchdev, nobody would feel broken in terms of compatibility. That
>> is my point. Having "legacy" as default is simply wrong for never NIC
>> generations. That is why it is called "legacy" and it should have been
>> rotten through and out since CX4 times.
>
>legacy vs switchdev only describes the eswitch configuration.
>As a non-SR-IOV user I really don't want to see the extra representors
>hanging around my systems, confusing all daemons. IIRC mlx5 had some
>limitations around the uplink representor. Maybe that's the disconnect.
>But for a real, fully featured switchdev eswitches having the
>PHY and PF representors on boot, always, will not make sense.
As "a non-SR-IOV user", what extra representors you talk about? When you
have pfs only, you don't have anything extra. Just 1 netdev per-pf, one
devlink port per-pf. What's extra about it? When you don't have VFs/SFs.
Everyhing is the same:
c-220-136-220-218:~$ sudo devlink dev eswitch show pci/0000:08:00.0
pci/0000:08:00.0: mode switchdev inline-mode none encap-mode basic
c-220-136-220-218:~$ sudo devlink dev eswitch show pci/0000:08:00.1
pci/0000:08:00.1: mode legacy inline-mode none encap-mode basic
c-220-136-220-218:~$ devlink dev
pci/0000:08:00.0: index 0
nested_devlink:
auxiliary/mlx5_core.eth.0
devlink_index/1: index 1
nested_devlink:
pci/0000:08:00.0
pci/0000:08:00.1
auxiliary/mlx5_core.eth.0: index 2
pci/0000:08:00.1: index 3
nested_devlink:
auxiliary/mlx5_core.eth.1
auxiliary/mlx5_core.eth.1: index 4
c-220-136-220-218:~$ devlink port
auxiliary/mlx5_core.eth.0/65535: type eth netdev eth2 flavour physical port 0 splittable false
auxiliary/mlx5_core.eth.1/131071: type eth netdev eth3 flavour physical port 1 splittable false
c-220-136-220-218:~$ ip link
...
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether b8:e9:24:f2:b7:6c brd ff:ff:ff:ff:ff:ff
altname enp8s0f0np0
5: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether b8:e9:24:f2:b7:6d brd ff:ff:ff:ff:ff:ff
altname enp8s0f1np1
>
>IOW it's not a question of the generation of the card but of
>the deployment type / use case.
I don't think so, not in the case of mlx5. The difference is only when
you work with sr-iov, you either use legacy way (ip vf) or the new one.
Same usecase.
>
>> >For the ASIC/NV bit: maybe technically possible, but it feels like the wrong
>> >layer. This is boot/deployment policy, not a persistent hardware property, and
>> >storing it in NV memory would make the state persist across kernels/hosts in a
>> >surprising way.
>>
>> Well, as any other nv config, it persists across kernels/hosts. Think
>> about it as "unbreak-my-not-legacy-device" bit.
>
>For most devices the switchdev mode does not change anything
>substantial about the device. It's purely a kernel / driver config.
>It changes what objects and default rules kernel / driver installs.
>So I don't get why it would make sense to flash into the device
>nvmem a Linux SW stack specific config.
I look at it from the perspective that from some CX generation,
switchdev mode should be default. So that is a device-based decision.
I believe as such it can optionally be permanenty configured (nv config)
on older device. Why not?
[...]
next prev parent reply other threads:[~2026-05-09 7:01 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-06 12:37 [RFC net-next 0/4] devlink: Add boot-time defaults Mark Bloch
2026-05-06 12:37 ` [RFC net-next 1/4] devlink: Add infrastructure for " Mark Bloch
2026-05-06 12:37 ` [RFC net-next 2/4] devlink: Add eswitch mode boot default Mark Bloch
2026-05-06 12:37 ` [RFC net-next 3/4] devlink: Add runtime parameter boot defaults Mark Bloch
2026-05-06 12:37 ` [RFC net-next 4/4] net/mlx5: Apply devlink boot defaults during init Mark Bloch
2026-05-06 15:22 ` [RFC net-next 0/4] devlink: Add boot-time defaults Jiri Pirko
2026-05-06 17:35 ` Mark Bloch
2026-05-07 11:03 ` Jiri Pirko
2026-05-08 17:59 ` Mark Bloch
2026-05-08 18:07 ` Jiri Pirko
2026-05-09 0:52 ` Jakub Kicinski
2026-05-09 7:01 ` Jiri Pirko [this message]
2026-05-10 12:31 ` Mark Bloch
2026-05-11 8:07 ` Jiri Pirko
2026-05-11 18:21 ` Parav Pandit
2026-05-10 16:37 ` Jakub Kicinski
2026-05-11 8:42 ` Jiri Pirko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=af7Y4AYv-XDCbK_8@FV6GYCPJ69 \
--to=jiri@resnulli.us \
--cc=akpm@linux-foundation.org \
--cc=andrew+netdev@lunn.ch \
--cc=bp@alien8.de \
--cc=brauner@kernel.org \
--cc=corbet@lwn.net \
--cc=dapeng1.mi@linux.intel.com \
--cc=dave.hansen@linux.intel.com \
--cc=davem@davemloft.net \
--cc=ebiggers@kernel.org \
--cc=edumazet@google.com \
--cc=elver@google.com \
--cc=horms@kernel.org \
--cc=kees@kernel.org \
--cc=kuba@kernel.org \
--cc=leon@kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=lirongqing@baidu.com \
--cc=mbloch@nvidia.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=paulmck@kernel.org \
--cc=pawan.kumar.gupta@linux.intel.com \
--cc=peterz@infradead.org \
--cc=pmladek@suse.com \
--cc=rdunlap@infradead.org \
--cc=saeedm@nvidia.com \
--cc=skhan@linuxfoundation.org \
--cc=tariqt@nvidia.com \
--cc=tglx@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox