From: Jiri Pirko <jiri@resnulli.us>
To: Mark Bloch <mbloch@nvidia.com>
Cc: Tariq Toukan <tariqt@nvidia.com>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>,
Paolo Abeni <pabeni@redhat.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Jonathan Corbet <corbet@lwn.net>,
Shuah Khan <skhan@linuxfoundation.org>,
Simon Horman <horms@kernel.org>,
Saeed Mahameed <saeedm@nvidia.com>,
Leon Romanovsky <leon@kernel.org>,
"Borislav Petkov (AMD)" <bp@alien8.de>,
Andrew Morton <akpm@linux-foundation.org>,
Randy Dunlap <rdunlap@infradead.org>,
Thomas Gleixner <tglx@kernel.org>,
Petr Mladek <pmladek@suse.com>,
"Peter Zijlstra (Intel)" <peterz@infradead.org>,
Tejun Heo <tj@kernel.org>, Vlastimil Babka <vbabka@kernel.org>,
Feng Tang <feng.tang@linux.alibaba.com>,
Christian Brauner <brauner@kernel.org>,
Dave Hansen <dave.hansen@linux.intel.com>,
Dapeng Mi <dapeng1.mi@linux.intel.com>,
Kees Cook <kees@kernel.org>, Marco Elver <elver@google.com>,
Li RongQing <lirongqing@baidu.com>,
Eric Biggers <ebiggers@kernel.org>,
"Paul E. McKenney" <paulmck@kernel.org>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
Gal Pressman <gal@nvidia.com>,
Dragos Tatulea <dtatulea@nvidia.com>,
Jiri Pirko <jiri@nvidia.com>, Shay Drori <shayd@nvidia.com>,
Moshe Shemesh <moshe@nvidia.com>
Subject: Re: [PATCH net-next 3/3] net/mlx5: Apply devlink default eswitch mode during init
Date: Tue, 26 May 2026 16:07:11 +0200 [thread overview]
Message-ID: <ahWm4NXph9gdazV_@FV6GYCPJ69> (raw)
In-Reply-To: <8c8df8da-62a9-49e8-84eb-572d54cfeb1f@nvidia.com>
Tue, May 26, 2026 at 11:44:46AM +0200, mbloch@nvidia.com wrote:
>
>
>On 26/05/2026 10:44, Jiri Pirko wrote:
>> Thu, May 21, 2026 at 09:24:34AM +0200, tariqt@nvidia.com wrote:
>>> From: Mark Bloch <mbloch@nvidia.com>
>>>
>>> Apply devlink default eswitch mode for mlx5 devices after successful
>>> device initialization while holding the devlink instance lock.
>>>
>>> At this point the devlink instance is registered and the mlx5 devlink
>>> operations are available, so the default eswitch mode can be applied to
>>> the matching PCI devlink handle.
>>>
>>> Signed-off-by: Mark Bloch <mbloch@nvidia.com>
>>> Reviewed-by: Shay Drori <shayd@nvidia.com>
>>> Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
>>> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
>>> ---
>>> drivers/net/ethernet/mellanox/mlx5/core/main.c | 17 +++++++++++++++++
>>> 1 file changed, 17 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
>>> index 0c6e4efe38c8..4528097f3d84 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
>>> @@ -1391,6 +1391,21 @@ static void mlx5_unload(struct mlx5_core_dev *dev)
>>> mlx5_free_bfreg(dev, &dev->priv.bfreg);
>>> }
>>>
>>> +static void mlx5_devl_apply_default_esw_mode(struct mlx5_core_dev *dev)
>>> +{
>>> + struct devlink *devlink = priv_to_devlink(dev);
>>> + int err;
>>> +
>>> + if (!MLX5_ESWITCH_MANAGER(dev))
>>> + return;
>>> +
>>> + devl_assert_locked(devlink);
>>> + err = devl_apply_default_esw_mode(devlink);
>>> + if (err)
>>> + mlx5_core_warn(dev, "Couldn't apply default eswitch mode, err %d\n",
>>> + err);
>>> +}
>>> +
>>> int mlx5_init_one_devl_locked(struct mlx5_core_dev *dev)
>>> {
>>> bool light_probe = mlx5_dev_is_lightweight(dev);
>>> @@ -1437,6 +1452,7 @@ int mlx5_init_one_devl_locked(struct mlx5_core_dev *dev)
>>> mlx5_core_err(dev, "mlx5_hwmon_dev_register failed with error code %d\n", err);
>>>
>>> mutex_unlock(&dev->intf_state_mutex);
>>> + mlx5_devl_apply_default_esw_mode(dev);
>>
>> I wonder how we can make this work for all. I mean, other driver would
>> silently ignore this command like arg, right? Any idea how to make all
>> drivers follow the arg from very beginning?
>>
>
>I have a follow-up series that adds the call to all drivers which support
>setting eswitch mode. When going over the other drivers, what I found is
>that the right point to apply the default is driver specific, drivers
>I have patch for:
>
>46e16c6d9836 net: Apply devlink esw mode defaults
>ab4f54102ba9 bnxt_en: Apply devlink default eswitch mode during init
>b48cce1607bb liquidio: Apply devlink default eswitch mode during init
>4ea54b0fe04a ice: Apply devlink default eswitch mode during init
>b7faddaa1c90 octeontx2-af: Apply devlink default eswitch mode during init
>74b0c22c47b9 octeontx2-pf: Apply devlink default eswitch mode during init
>5000e4c3d768 nfp: Apply devlink default eswitch mode during init
>97a218e95e41 netdevsim: Apply devlink default eswitch mode during init
>
>I don't think doing this generically from devlink is realistic. devlink
>doesn't really know when a given driver is ready to change eswitch mode.
>Some drivers need SR-IOV state, representor setup, or other init pieces to
>be ready first, and the locking is not identical across drivers either.
Low hanging fruit would be just to call ops->eswitch_mode_set at the end
of register. Multiple reasons:
1) end of devl_register is exactly the point userspace is free to issue
the eswitch mode set. Driver should be ready to handle it.
2) all drivers would transparently get this functionality, without
actually knowing this kernel command line arg ever existed, without
odd wiring call of related exported function. I prefer that stongly.
3) you should add a there warning for the case this arg is passed yet
the driver does not implement eswitch_mode_set. User should
get a feedback like this, not silent ignore.
The only loose end is see it the void return of devl_register().
Multiple ways to handle the possibly failed eswitch_mode_set(). I would
probably just go for pr_warn, seems to be the most correct.
Make sense?
>
>Also, since this knob is only about eswitch mode, I don't think we need to
>touch every devlink driver. Drivers that don't implement eswitch_mode_set()
>would just ignore it anyway. The follow-up only wires the default into
>drivers that actually support changing eswitch mode.
>
>Mark
>
>>
>>> return 0;
>>>
>>> err_register:
>>> @@ -1538,6 +1554,7 @@ int mlx5_load_one_devl_locked(struct mlx5_core_dev *dev, bool recovery)
>>> goto err_attach;
>>>
>>> mutex_unlock(&dev->intf_state_mutex);
>>> + mlx5_devl_apply_default_esw_mode(dev);
>>> return 0;
>>>
>>> err_attach:
>>> --
>>> 2.44.0
>>>
>
next prev parent reply other threads:[~2026-05-26 14:07 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-21 7:24 [PATCH net-next 0/3] devlink: Add boot-time eswitch mode defaults Tariq Toukan
2026-05-21 7:24 ` [PATCH net-next 1/3] net/mlx5: Clear FW reset-in-progress bit before reload Tariq Toukan
2026-05-21 7:24 ` [PATCH net-next 2/3] devlink: Add eswitch mode boot defaults Tariq Toukan
2026-05-21 7:24 ` [PATCH net-next 3/3] net/mlx5: Apply devlink default eswitch mode during init Tariq Toukan
2026-05-21 13:16 ` Mark Bloch
2026-05-21 13:41 ` Thomas Weißschuh
2026-05-21 21:02 ` Mark Bloch
2026-05-26 7:44 ` Jiri Pirko
2026-05-26 9:44 ` Mark Bloch
2026-05-26 14:07 ` Jiri Pirko [this message]
2026-05-26 15:03 ` Mark Bloch
2026-05-26 16:23 ` Jiri Pirko
2026-05-26 17:13 ` Mark Bloch
2026-05-25 19:42 ` [PATCH net-next 0/3] devlink: Add boot-time eswitch mode defaults Jakub Kicinski
2026-05-26 7:41 ` Jiri Pirko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ahWm4NXph9gdazV_@FV6GYCPJ69 \
--to=jiri@resnulli.us \
--cc=akpm@linux-foundation.org \
--cc=andrew+netdev@lunn.ch \
--cc=bp@alien8.de \
--cc=brauner@kernel.org \
--cc=corbet@lwn.net \
--cc=dapeng1.mi@linux.intel.com \
--cc=dave.hansen@linux.intel.com \
--cc=davem@davemloft.net \
--cc=dtatulea@nvidia.com \
--cc=ebiggers@kernel.org \
--cc=edumazet@google.com \
--cc=elver@google.com \
--cc=feng.tang@linux.alibaba.com \
--cc=gal@nvidia.com \
--cc=horms@kernel.org \
--cc=jiri@nvidia.com \
--cc=kees@kernel.org \
--cc=kuba@kernel.org \
--cc=leon@kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=lirongqing@baidu.com \
--cc=mbloch@nvidia.com \
--cc=moshe@nvidia.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=paulmck@kernel.org \
--cc=peterz@infradead.org \
--cc=pmladek@suse.com \
--cc=rdunlap@infradead.org \
--cc=saeedm@nvidia.com \
--cc=shayd@nvidia.com \
--cc=skhan@linuxfoundation.org \
--cc=tariqt@nvidia.com \
--cc=tglx@kernel.org \
--cc=tj@kernel.org \
--cc=vbabka@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox