netdev.vger.kernel.org archive mirror
* [?bug] Can't get switchdev mode work on ConnectX-4 Card
From: Qiyu Yan @ 2025-04-21  9:07 UTC
  To: Saeed Mahameed, Tariq Toukan; +Cc: netdev, Ivan Vecera, Jiri Pirko

Hi,

I have a ConnectX-4 Lx EN MCX4121A-ACAT card:

$ lspci -s c1:00.0
c1:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
$ devlink dev info pci/0000:c1:00.0
pci/0000:c1:00.0:
   driver mlx5_core
   versions:
       fixed:
         fw.psid MT_2420110034
       running:
         fw.version 14.32.1900
         fw 14.32.1900
       stored:
         fw.version 14.32.1900
         fw 14.32.1900

I wanted to put the card into switchdev mode, so I tried the
following:

# enable switchdev mode
$ sudo devlink dev eswitch set pci/0000:c1:00.0 mode switchdev
$ sudo devlink dev eswitch show pci/0000:c1:00.0
pci/0000:c1:00.0: mode switchdev inline-mode link encap-mode basic

# create 2 VFs
$ echo 2 | sudo tee /sys/class/net/mlx-p0/device/sriov_numvfs

# Try to add interfaces to a bridge
$ sudo ip link add vmbr type bridge
$ sudo ip link set mlx-p0 master vmbr
Error: mlx5_core: Error checking for existing bridge with same ifindex.
$ sudo ip link set enp193s0f0r0 master vmbr
Error: mlx5_core: Error checking for existing bridge with same ifindex.

When the failure happens, messages like these appear in kmsg:

mlx5_core 0000:c1:00.0 mlx-p0: entered allmulticast mode
mlx5_core 0000:c1:00.0 mlx-p0: left allmulticast mode
mlx5_core 0000:c1:00.0 mlx-p0: failed (err=-22) to set attribute (id=6)

Is this a bug in the current driver, or is something wrong in the
steps above?

Some additional information:

(Fedora stock kernel 6.14.2-300.fc42.x86_64)
$ uname -a
Linux epyc-server 6.14.2-300.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Apr 10 21:50:55 UTC 2025 x86_64 GNU/Linux
$ rpm -q iproute
iproute-6.12.0-3.fc42.x86_64

Best,
Qiyu



* Re: [?bug] Can't get switchdev mode work on ConnectX-4 Card
From: Mark Bloch @ 2025-04-22  6:42 UTC
  To: Qiyu Yan, Saeed Mahameed, Tariq Toukan; +Cc: netdev, Ivan Vecera, Jiri Pirko



On 21/04/2025 12:07, Qiyu Yan wrote:
> Hi,
> 
> I have a ConnectX-4 Lx EN MCX4121A-ACAT card:
> 
> $ lspci -s c1:00.0
> c1:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
> $ devlink dev info pci/0000:c1:00.0
> pci/0000:c1:00.0:
>   driver mlx5_core
>   versions:
>       fixed:
>         fw.psid MT_2420110034
>       running:
>         fw.version 14.32.1900
>         fw 14.32.1900
>       stored:
>         fw.version 14.32.1900
>         fw 14.32.1900
> 
> I wanted to put the card into switchdev mode, so I tried the following:
> 
> # enable switchdev mode
> $ sudo devlink dev eswitch set pci/0000:c1:00.0 mode switchdev
> $ sudo devlink dev eswitch show pci/0000:c1:00.0
> pci/0000:c1:00.0: mode switchdev inline-mode link encap-mode basic
> 
> # create 2 VFs
> $ echo 2 | sudo tee /sys/class/net/mlx-p0/device/sriov_numvfs
> 
> # Try to add interfaces to a bridge
> $ sudo ip link add vmbr type bridge
> $ sudo ip link set mlx-p0 master vmbr
> Error: mlx5_core: Error checking for existing bridge with same ifindex.
> $ sudo ip link set enp193s0f0r0 master vmbr
> Error: mlx5_core: Error checking for existing bridge with same ifindex.

It’s likely that the issue stems from cx4-lx not supporting metadata
matching, which in turn prevents the driver from enabling bridge
offloads.

Could you please confirm this by checking the output of the following
command?
# devlink dev param show pci/0000:c1:00.0 name esw_port_metadata

A better approach might be to check for metadata matching support
ahead of time and avoid registering for bridge offloads if it's not
supported. This way the driver won't offload the bridge, but
it also won't prevent users from adding the reps to a bridge.
Could you try the diff below and let me know if it resolves
the issue for you?

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
index 0f5d7ea8956f..25a5845e5618 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
@@ -523,6 +523,11 @@ void mlx5e_rep_bridge_init(struct mlx5e_priv *priv)
                mdev->priv.eswitch;
        int err;

+       if (!mlx5_esw_bridge_supported(esw)) {
+               esw_debug(mdev, "Bridge offloads aren't supported\n");
+               return;
+       }
+
        rtnl_lock();
        br_offloads = mlx5_esw_bridge_init(esw);
        rtnl_unlock();
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index 76e35c827da0..37781f9ca884 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -368,9 +368,6 @@ mlx5_esw_bridge_ingress_table_init(struct mlx5_esw_bridge_offloads *br_offloads)
        struct mlx5_eswitch *esw = br_offloads->esw;
        int err;

-       if (!mlx5_eswitch_vport_match_metadata_enabled(esw))
-               return -EOPNOTSUPP;
-
        ingress_ft = mlx5_esw_bridge_table_create(MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE,
                                                  MLX5_ESW_BRIDGE_LEVEL_INGRESS_TABLE,
                                                  esw);
@@ -1917,6 +1914,14 @@ static void mlx5_esw_bridge_flush(struct mlx5_esw_bridge_offloads *br_offloads)
                  "Cleaning up bridge offloads while still having bridges attached\n");
 }

+bool mlx5_esw_bridge_supported(struct mlx5_eswitch *esw)
+{
+       if (!mlx5_eswitch_vport_match_metadata_enabled(esw))
+               return false;
+
+       return true;
+}
+
 struct mlx5_esw_bridge_offloads *mlx5_esw_bridge_init(struct mlx5_eswitch *esw)
 {
        struct mlx5_esw_bridge_offloads *br_offloads;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
index d6f539161993..f920c1c47f47 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
@@ -93,5 +93,6 @@ int mlx5_esw_bridge_port_mdb_add(struct net_device *dev, u16 vport_num, u16 esw_
 void mlx5_esw_bridge_port_mdb_del(struct net_device *dev, u16 vport_num, u16 esw_owner_vhca_id,
                                  const unsigned char *addr, u16 vid,
                                  struct mlx5_esw_bridge_offloads *br_offloads);
+bool mlx5_esw_bridge_supported(struct mlx5_eswitch *esw);

 #endif /* __MLX5_ESW_BRIDGE_H__ */




* Re: [?bug] Can't get switchdev mode work on ConnectX-4 Card
From: Qiyu Yan @ 2025-04-22  7:15 UTC
  To: Mark Bloch, Saeed Mahameed, Tariq Toukan; +Cc: netdev, Ivan Vecera, Jiri Pirko

Thank you for your reply!

On 2025/4/22 14:42, Mark Bloch wrote:
> It’s likely that the issue stems from cx4-lx not supporting metadata
> matching, which in turn prevents the driver from enabling bridge
> offloads.
>
> Could you please confirm this by checking the output of the following
> command?
> # devlink dev param show pci/0000:c1:00.0 name esw_port_metadata

$ sudo devlink dev param show pci/0000:c1:00.0 name esw_port_metadata
pci/0000:c1:00.0:
   name esw_port_metadata type driver-specific
     values:
       cmode runtime value false

I guess "value false" here means it is not supported.
> A better approach might be to check for metadata matching support
> ahead of time and avoid registering for bridge offloads if it's not
> supported.
Just wondering: what is the penalty of not having this offload?

The reason I am trying to enable switchdev is that I want to tag
multiple VLANs for a single VF. I see there is something called VGT+ in
the OFED driver documentation, but the same function doesn't seem to
exist in the mainline driver, so I considered using switchdev. But if
the performance penalty of switchdev is high, I might switch to the
OFED driver instead.

> This way the driver won't offload the bridge, but
> it also won't prevent users from adding the reps to a bridge.
> Could you try the diff below and let me know if it resolves
> the issue for you?
I will try, but it will take me some time to do so.
Best,
Qiyu



* Re: [?bug] Can't get switchdev mode work on ConnectX-4 Card
From: Mark Bloch @ 2025-04-22  8:14 UTC
  To: Qiyu Yan, Saeed Mahameed, Tariq Toukan; +Cc: netdev, Ivan Vecera, Jiri Pirko



On 22/04/2025 10:15, Qiyu Yan wrote:
> Thank you for your reply!
> 
> On 2025/4/22 14:42, Mark Bloch wrote:
>> It’s likely that the issue stems from cx4-lx not supporting metadata
>> matching, which in turn prevents the driver from enabling bridge
>> offloads.
>>
>> Could you please confirm this by checking the output of the following
>> command?
>> # devlink dev param show pci/0000:c1:00.0 name esw_port_metadata
> 
> $ sudo devlink dev param show pci/0000:c1:00.0 name esw_port_metadata
> pci/0000:c1:00.0:
>   name esw_port_metadata type driver-specific
>     values:
>       cmode runtime value false
> 
> I guess "value false" here means it is not supported.

Yes
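
The check confirmed above can be scripted. What follows is only a
sketch: the function name esw_port_metadata_enabled is hypothetical, and
it assumes the plain-text "devlink dev param show" output format quoted
in this thread, which may differ between iproute2 versions.

```shell
# Hypothetical helper (not part of devlink or mlx5): reads the output of
#   devlink dev param show <dev> name esw_port_metadata
# on stdin and succeeds only if the runtime value is "true".
esw_port_metadata_enabled() {
    awk '
        # Remember the value from the "cmode runtime value <bool>" line.
        $1 == "cmode" && $2 == "runtime" && $3 == "value" { v = $4 }
        # Exit 0 only when that value was "true"; missing output counts as false.
        END { if (v == "true") exit 0; exit 1 }
    '
}
```

A possible use, with the PCI address from this thread:

  sudo devlink dev param show pci/0000:c1:00.0 name esw_port_metadata \
      | esw_port_metadata_enabled || echo "metadata matching is off"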

>> A better approach might be to check for metadata matching support
>> ahead of time and avoid registering for bridge offloads if it's not
>> supported.
> Just wondering: what is the penalty of not having this offload?
> 
> The reason I am trying to enable switchdev is that I want to tag multiple VLANs for a single VF. I see there is something called VGT+ in the OFED driver documentation, but the same function doesn't seem to exist in the mainline driver, so I considered using switchdev. But if the performance penalty of switchdev is high, I might switch to the OFED driver instead.
> 

OFED is unrelated to upstream.

So do you want to do QinQ (I'm not sure cx4-lx supports QinQ),
or just use different VLANs depending on the traffic?
I don't think cx4-lx supports VLAN push offloads.

You can still use the software VLAN push action with regular tc rules,
something like this:

tc filter add dev pf0vf0_rep protocol ip parent ffff: flower skip_hw dst_mac 50:6b:4b:b4:ac:0a src_mac 8a:ee:f9:37:bb:ef action vlan push id 100 action mirred egress redirect dev uplink0_rep

Mark




* Re: [?bug] Can't get switchdev mode work on ConnectX-4 Card
From: Qiyu Yan @ 2025-04-22  8:45 UTC
  To: Mark Bloch, Saeed Mahameed, Tariq Toukan; +Cc: netdev, Ivan Vecera, Jiri Pirko

On 2025/4/22 16:14, Mark Bloch wrote:
> So do you want to do QinQ (I'm not sure cx4-lx supports QinQ),
> or just use different VLANs depending on the traffic?
> I don't think cx4-lx supports VLAN push offloads.
I just want to grant access to different VLANs through a single VF. The
ip command
  $ ip link set <interface> vf X vlan Y
filters and tags only a single VLAN. Is there a recommended way to pass
multiple VLANs through to a VM, so that I can create VLAN interfaces
inside it?

Maybe this patch series is meant for that:
https://patchwork.ozlabs.org/project/netdev/cover/20170827110618.20599-1-saeedm@mellanox.com/
but it has not been merged yet...

Or do I have to put the VF in promiscuous mode, or pass through multiple VFs?

Qiyu



* Re: [?bug] Can't get switchdev mode work on ConnectX-4 Card
From: Mark Bloch @ 2025-04-22  8:50 UTC
  To: Qiyu Yan, Saeed Mahameed, Tariq Toukan; +Cc: netdev, Ivan Vecera, Jiri Pirko



On 22/04/2025 11:45, Qiyu Yan wrote:
> On 2025/4/22 16:14, Mark Bloch wrote:
>> So do you want to do QinQ (I'm not sure cx4-lx supports QinQ),
>> or just use different VLANs depending on the traffic?
>> I don't think cx4-lx supports VLAN push offloads.
> I just want to grant access to different VLANs through a single VF. The ip command
>  $ ip link set <interface> vf X vlan Y
> filters and tags only a single VLAN. Is there a recommended way to pass multiple VLANs through to a VM, so that I can create VLAN interfaces inside it?

Usually you don't want the VF to be aware of which VLANs are used: on egress you push the VLAN in the FDB,
and on ingress you pop the VLAN in the FDB. The VF gets the traffic without any VLAN tags.

With the tc command I wrote, you can match on MAC (or IP, TCP, UDP, etc.) and push/pop VLANs as you like,
with as many different VLANs as you want.
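
To make both directions concrete, here is a sketch that only prints the
tc commands for one VF and one VLAN, so they can be reviewed before
being run. The representor names are the ones from the example earlier
in this thread. The function name print_vlan_rules, the ingress qdisc
setup, and the pop rule on the uplink are assumptions and have not been
tested on cx4-lx. skip_hw keeps the rules in software, since VLAN push
offload is probably not available on this card.

```shell
# Hypothetical helper: prints (does not run) software-only tc rules that
# push a VLAN tag on traffic from the VF and pop it on traffic to the VF.
#   $1 = VF representor, $2 = uplink representor, $3 = VLAN id, $4 = VF MAC
print_vlan_rules() {
    vf_rep=$1
    uplink_rep=$2
    vid=$3
    vf_mac=$4

    # flower filters on "parent ffff:" need an ingress qdisc on each device.
    echo "tc qdisc add dev $vf_rep ingress"
    echo "tc qdisc add dev $uplink_rep ingress"

    # VF -> wire: match the VF's source MAC, push the tag, send to the uplink.
    echo "tc filter add dev $vf_rep protocol ip parent ffff: flower skip_hw src_mac $vf_mac action vlan push id $vid action mirred egress redirect dev $uplink_rep"

    # Wire -> VF: match the tag and the VF's MAC, pop the tag, send to the VF.
    echo "tc filter add dev $uplink_rep protocol 802.1q parent ffff: flower skip_hw vlan_id $vid dst_mac $vf_mac action vlan pop action mirred egress redirect dev $vf_rep"
}
```

For example, assuming 8a:ee:f9:37:bb:ef is the VF's MAC address:

  print_vlan_rules pf0vf0_rep uplink0_rep 100 8a:ee:f9:37:bb:ef

Calling it again with another VLAN id and another match gives one more
push/pop pair. Like the original example, the push rule matches only
IPv4 (protocol ip); ARP and IPv6 would need rules of their own.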

> 
> Maybe this patch series is meant for that: https://patchwork.ozlabs.org/project/netdev/cover/20170827110618.20599-1-saeedm@mellanox.com/ but it has not been merged yet...

This patch is from 2017; it's not relevant.

mark


