public inbox for netdev@vger.kernel.org
From: Mark Bloch <mbloch@nvidia.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Tariq Toukan <tariqt@nvidia.com>,
	Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Leon Romanovsky <leon@kernel.org>, Jason Gunthorpe <jgg@ziepe.ca>,
	Saeed Mahameed <saeedm@nvidia.com>, Shay Drory <shayd@nvidia.com>,
	Or Har-Toov <ohartoov@nvidia.com>,
	Edward Srouji <edwards@nvidia.com>,
	Maher Sanalla <msanalla@nvidia.com>,
	Simon Horman <horms@kernel.org>,
	Gerd Bayer <gbayer@linux.ibm.com>,
	Moshe Shemesh <moshe@nvidia.com>, Kees Cook <kees@kernel.org>,
	Patrisious Haddad <phaddad@nvidia.com>,
	Parav Pandit <parav@nvidia.com>,
	Carolina Jubran <cjubran@nvidia.com>,
	Cosmin Ratiu <cratiu@nvidia.com>,
	linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, Gal Pressman <gal@nvidia.com>,
	Dragos Tatulea <dtatulea@nvidia.com>
Subject: Re: [PATCH net-next V2 7/7] net/mlx5: Add profile to auto-enable switchdev mode at device init
Date: Tue, 5 May 2026 05:00:15 +0300
Message-ID: <9f73036e-32a8-4060-a347-cae05269b85f@nvidia.com>
In-Reply-To: <20260504182122.08efb41e@kernel.org>



On 05/05/2026 4:21, Jakub Kicinski wrote:
> On Sun, 3 May 2026 10:51:06 +0300 Mark Bloch wrote:
>> On 03/05/2026 4:41, Jakub Kicinski wrote:
>>> On Sat, 2 May 2026 23:08:43 +0300 Mark Bloch wrote:  
>>>> Before I respin for the unrelated MR_CACHE cleanup, I’d like to confirm
>>>> whether the opt-in profile approach is acceptable at all. Regardless
>>>> of this last patch, the first 6 patches fix real representor/LAG locking
>>>> issues and are needed independently, so I’d like to keep those moving toward
>>>> acceptance as soon as possible.  
>>>
>>> For probe-time config module param is probably our only option.
>>> I'd obviously prefer to have a devlink-level knob for this, instead 
>>> of a mlx5 specific one. Can we come up with some format that'd apply
>>> more broadly? devlink=[$bdf:]flag1 ? so devlink=[$bdf:]switchdev-mode ?
>>
>> I’m not convinced this is really a generic devlink knob problem.
> 
> I'm surprised you say that. Anyone using switchdev mode could benefit.
> Having the device probe in one mode and then switch adds to boot time. Whether it's
> a DPU or not is quite secondary.
> 
> Unless there's another deeper reason which makes the DPU incapable of
> running in the non-switchdev mode. But not sure that squares with the
> code you posted AFAICT.

No, there is no deeper DPU limitation. The device can probe in
non-switchdev mode. This is only about the desired default for those
deployments and about avoiding the extra boot-time cost of probing in
one mode and then switching to another.

What I meant is that I am wary of putting too much policy into the kernel
command line. A generic devlink-level knob for the switchdev probe mode
sounds reasonable to me if we keep the scope narrow. More complex policy,
such as changing multiple defaults, still seems better handled by userspace.

Would adding only switchdev/switchdev_inactive for now be acceptable?
I will try to keep the code generic enough so it can be extended later if
we want.

Let's continue with v3 as posted and please give me a few days to put
together an RFC for the devlink part.

Mark


> 
>> A device should probe in its selected/default configuration. For DPU
>> deployments switchdev is the expected operating mode. mlx5 just made the
>> wrong default choice historically, and this profile is a way to move away
>> from that without forcing it on everyone at once. I expect/hope to move
>> quickly from this flag to simply making switchdev the driver default for
>> all DPU configs.
>>
>> A generic cmdline format also gets complicated quickly: vendor-specific
>> flags, ordering/dependencies between flags, hotplug timing, and whether a
>> BDF rule should apply when a device is passed into a VM after boot.
>> Userspace scripts are probably better for that kind of policy because
>> they can carry real site specific logic.
>>
>> I’ll drop this last patch from the series for now so the representor/LAG
>> locking fixes can move independently and we can continue the default
>> switchdev discussion separately. I can always submit that as a standalone
>> patch later in the cycle if needed.
> 
> SG


Thread overview: 22+ messages
2026-05-01  4:16 [PATCH net-next V2 0/7] net/mlx5: Improve representor lifecycle and allow switchdev by default Tariq Toukan
2026-05-01  4:16 ` [PATCH net-next V2 1/7] net/mlx5: Lag: refactor representor reload handling Tariq Toukan
2026-05-01  4:16 ` [PATCH net-next V2 2/7] net/mlx5: E-Switch, add representor lifecycle lock Tariq Toukan
2026-05-01  4:16 ` [PATCH net-next V2 3/7] net/mlx5: Lag, avoid LAG and representor lock cycles Tariq Toukan
2026-05-02 20:04   ` Mark Bloch
2026-05-01  4:16 ` [PATCH net-next V2 4/7] net/mlx5: E-Switch, serialize representor lifecycle Tariq Toukan
2026-05-02 20:05   ` Mark Bloch
2026-05-03  1:42   ` Jakub Kicinski
2026-05-03  8:18     ` Mark Bloch
2026-05-01  4:16 ` [PATCH net-next V2 5/7] net/mlx5: E-Switch, unwind only newly loaded representor types Tariq Toukan
2026-05-02 20:06   ` Mark Bloch
2026-05-01  4:16 ` [PATCH net-next V2 6/7] net/mlx5: E-switch, load reps via work queue after registration Tariq Toukan
2026-05-02 20:07   ` Mark Bloch
2026-05-03  1:42   ` Jakub Kicinski
2026-05-03  8:01     ` Mark Bloch
2026-05-01  4:16 ` [PATCH net-next V2 7/7] net/mlx5: Add profile to auto-enable switchdev mode at device init Tariq Toukan
2026-05-02 20:08   ` Mark Bloch
2026-05-03  1:41     ` Jakub Kicinski
2026-05-03  7:51       ` Mark Bloch
2026-05-05  1:21         ` Jakub Kicinski
2026-05-05  2:00           ` Mark Bloch [this message]
2026-05-05  2:19             ` Jakub Kicinski
