* [PATCH net 0/7] mlx5 misc fixes 2024-11-07
@ 2024-11-07 18:35 Tariq Toukan
2024-11-07 18:35 ` [PATCH net 1/7] net/mlx5: E-switch, unload IB representors when unloading ETH representors Tariq Toukan
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: Tariq Toukan @ 2024-11-07 18:35 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Andrew Lunn
Cc: netdev, Saeed Mahameed, Gal Pressman, Leon Romanovsky,
Tariq Toukan
Hi,
This patchset provides misc bug fixes from the team to the mlx5 core and
Eth drivers.
Series generated against:
commit 5f897f30f596 ("Merge branch 'fix-the-arc-emac-driver'")
Thanks,
Tariq.
Carolina Jubran (1):
net/mlx5e: Disable loopback self-test on multi-PF netdev
Chiara Meiohas (1):
net/mlx5: E-switch, unload IB representors when unloading ETH
representors
Dragos Tatulea (1):
net/mlx5e: kTLS, Fix incorrect page refcounting
Mark Bloch (1):
net/mlx5: fs, lock FTE when checking if active
Moshe Shemesh (1):
net/mlx5e: CT: Fix null-ptr-deref in add rule err flow
Parav Pandit (1):
net/mlx5: Fix msix vectors to respect platform limit
William Tu (1):
net/mlx5e: clear xdp features on non-uplink representors
.../ethernet/mellanox/mlx5/core/en/tc_ct.c | 2 +-
.../mellanox/mlx5/core/en_accel/ktls_tx.c | 8 ++---
.../net/ethernet/mellanox/mlx5/core/en_main.c | 3 +-
.../ethernet/mellanox/mlx5/core/en_selftest.c | 4 +++
.../mellanox/mlx5/core/eswitch_offloads.c | 5 ++-
.../net/ethernet/mellanox/mlx5/core/fs_core.c | 15 +++++++--
.../net/ethernet/mellanox/mlx5/core/pci_irq.c | 32 ++++++++++++++++---
7 files changed, 54 insertions(+), 15 deletions(-)
--
2.44.0
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH net 1/7] net/mlx5: E-switch, unload IB representors when unloading ETH representors
2024-11-07 18:35 [PATCH net 0/7] mlx5 misc fixes 2024-11-07 Tariq Toukan
@ 2024-11-07 18:35 ` Tariq Toukan
2024-11-07 18:35 ` [PATCH net 2/7] net/mlx5: Fix msix vectors to respect platform limit Tariq Toukan
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Tariq Toukan @ 2024-11-07 18:35 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Andrew Lunn
Cc: netdev, Saeed Mahameed, Gal Pressman, Leon Romanovsky,
Chiara Meiohas, Mark Bloch, Tariq Toukan
From: Chiara Meiohas <cmeiohas@nvidia.com>
IB representors depend on ETH representors, so the IB representors
should not exist without the ETH ones. When unloading the ETH
representors, the corresponding IB representors should be also
unloaded.
The commit 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
introduced the use of the ib_device_set_netdev API in IB
repsresentors. ib_device_set_netdev() increments the refcount of
the representor's netdev when loading an IB representor and
decrements it when unloading.
Without the unloading of the IB representor, the refcount of the
representor's netdev remains greater than 0, preventing it from
being unregistered.
The patch uncovered an underlying bug where the eth representor is
unloaded, without unloading the IB representor.
This issue happened when using multiport E-switch and rebooting,
causing the shutdown to hang when unloading the ETH representor
because the refcount of the representor's netdevice was greater than 0.
Call trace:
unregister_netdevice: waiting for eth3 to become free. Usage count = 2
ref_tracker: eth%d@00000000661d60f7 has 1/1 users at
ib_device_set_netdev+0x160/0x2d0 [ib_core]
mlx5_ib_vport_rep_load+0x104/0x3f0 [mlx5_ib]
mlx5_eswitch_reload_ib_reps+0xfc/0x110 [mlx5_core]
mlx5_mpesw_work+0x236/0x330 [mlx5_core]
process_one_work+0x169/0x320
worker_thread+0x288/0x3a0
kthread+0xb8/0xe0
ret_from_fork+0x2d/0x50
ret_from_fork_asm+0x11/0x20
Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index f24f91d213f2..8cf61ae8b89d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -2527,8 +2527,11 @@ static void __esw_offloads_unload_rep(struct mlx5_eswitch *esw,
struct mlx5_eswitch_rep *rep, u8 rep_type)
{
if (atomic_cmpxchg(&rep->rep_data[rep_type].state,
- REP_LOADED, REP_REGISTERED) == REP_LOADED)
+ REP_LOADED, REP_REGISTERED) == REP_LOADED) {
+ if (rep_type == REP_ETH)
+ __esw_offloads_unload_rep(esw, rep, REP_IB);
esw->offloads.rep_ops[rep_type]->unload(rep);
+ }
}
static void __unload_reps_all_vport(struct mlx5_eswitch *esw, u8 rep_type)
--
2.44.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net 2/7] net/mlx5: Fix msix vectors to respect platform limit
2024-11-07 18:35 [PATCH net 0/7] mlx5 misc fixes 2024-11-07 Tariq Toukan
2024-11-07 18:35 ` [PATCH net 1/7] net/mlx5: E-switch, unload IB representors when unloading ETH representors Tariq Toukan
@ 2024-11-07 18:35 ` Tariq Toukan
2024-11-07 18:35 ` [PATCH net 3/7] net/mlx5: fs, lock FTE when checking if active Tariq Toukan
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Tariq Toukan @ 2024-11-07 18:35 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Andrew Lunn
Cc: netdev, Saeed Mahameed, Gal Pressman, Leon Romanovsky,
Parav Pandit, Amir Tzin, Tariq Toukan
From: Parav Pandit <parav@nvidia.com>
The number of PCI vectors allocated by the platform (which may be fewer
than requested) is currently not honored when creating the SF pool;
only the PCI MSI-X capability is considered.
As a result, when a platform allocates fewer vectors
(in non-dynamic mode) than requested, the PF and SF pools end up
with an invalid vector range.
This causes incorrect SF vector accounting, which leads to the
following call trace when an invalid IRQ vector is allocated.
This issue is resolved by ensuring that the platform's vector
limit is respected for both the SF and PF pools.
Workqueue: mlx5_vhca_event0 mlx5_sf_dev_add_active_work [mlx5_core]
RIP: 0010:pci_irq_vector+0x23/0x80
RSP: 0018:ffffabd5cebd7248 EFLAGS: 00010246
RAX: ffff980880e7f308 RBX: ffff9808932fb880 RCX: 0000000000000001
RDX: 00000000000001ff RSI: 0000000000000200 RDI: ffff980880e7f308
RBP: 0000000000000200 R08: 0000000000000010 R09: ffff97a9116f0860
R10: 0000000000000002 R11: 0000000000000228 R12: ffff980897cd0160
R13: 0000000000000000 R14: ffff97a920fec0c0 R15: ffffabd5cebd72d0
FS: 0000000000000000(0000) GS:ffff97c7ff9c0000(0000) knlGS:0000000000000000
? rescuer_thread+0x350/0x350
kthread+0x11b/0x140
? __kthread_bind_mask+0x60/0x60
ret_from_fork+0x22/0x30
mlx5_core 0000:a1:00.0: mlx5_irq_alloc:321:(pid 6781): Failed to request irq. err = -22
mlx5_core 0000:a1:00.0: mlx5_irq_alloc:321:(pid 6781): Failed to request irq. err = -22
mlx5_core.sf mlx5_core.sf.6: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0 enhanced)
mlx5_core.sf mlx5_core.sf.7: firmware version: 32.43.356
mlx5_core.sf mlx5_core.sf.6 enpa1s0f0s4: renamed from eth0
mlx5_core.sf mlx5_core.sf.7: Rate limit: 127 rates are supported, range: 0Mbps to 195312Mbps
mlx5_core 0000:a1:00.0: mlx5_irq_alloc:321:(pid 6781): Failed to request irq. err = -22
mlx5_core 0000:a1:00.0: mlx5_irq_alloc:321:(pid 6781): Failed to request irq. err = -22
mlx5_core 0000:a1:00.0: mlx5_irq_alloc:321:(pid 6781): Failed to request irq. err = -22
Fixes: 3354822cde5a ("net/mlx5: Use dynamic msix vectors allocation")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Amir Tzin <amirtz@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
.../net/ethernet/mellanox/mlx5/core/pci_irq.c | 32 ++++++++++++++++---
1 file changed, 27 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
index 81a9232a03e1..7db9cab9bedf 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
@@ -593,9 +593,11 @@ static void irq_pool_free(struct mlx5_irq_pool *pool)
kvfree(pool);
}
-static int irq_pools_init(struct mlx5_core_dev *dev, int sf_vec, int pcif_vec)
+static int irq_pools_init(struct mlx5_core_dev *dev, int sf_vec, int pcif_vec,
+ bool dynamic_vec)
{
struct mlx5_irq_table *table = dev->priv.irq_table;
+ int sf_vec_available = sf_vec;
int num_sf_ctrl;
int err;
@@ -616,6 +618,13 @@ static int irq_pools_init(struct mlx5_core_dev *dev, int sf_vec, int pcif_vec)
num_sf_ctrl = DIV_ROUND_UP(mlx5_sf_max_functions(dev),
MLX5_SFS_PER_CTRL_IRQ);
num_sf_ctrl = min_t(int, MLX5_IRQ_CTRL_SF_MAX, num_sf_ctrl);
+ if (!dynamic_vec && (num_sf_ctrl + 1) > sf_vec_available) {
+ mlx5_core_dbg(dev,
+ "Not enough IRQs for SFs control and completion pool, required=%d avail=%d\n",
+ num_sf_ctrl + 1, sf_vec_available);
+ return 0;
+ }
+
table->sf_ctrl_pool = irq_pool_alloc(dev, pcif_vec, num_sf_ctrl,
"mlx5_sf_ctrl",
MLX5_EQ_SHARE_IRQ_MIN_CTRL,
@@ -624,9 +633,11 @@ static int irq_pools_init(struct mlx5_core_dev *dev, int sf_vec, int pcif_vec)
err = PTR_ERR(table->sf_ctrl_pool);
goto err_pf;
}
- /* init sf_comp_pool */
+ sf_vec_available -= num_sf_ctrl;
+
+ /* init sf_comp_pool, remaining vectors are for the SF completions */
table->sf_comp_pool = irq_pool_alloc(dev, pcif_vec + num_sf_ctrl,
- sf_vec - num_sf_ctrl, "mlx5_sf_comp",
+ sf_vec_available, "mlx5_sf_comp",
MLX5_EQ_SHARE_IRQ_MIN_COMP,
MLX5_EQ_SHARE_IRQ_MAX_COMP);
if (IS_ERR(table->sf_comp_pool)) {
@@ -715,6 +726,7 @@ int mlx5_irq_table_get_num_comp(struct mlx5_irq_table *table)
int mlx5_irq_table_create(struct mlx5_core_dev *dev)
{
int num_eqs = mlx5_max_eq_cap_get(dev);
+ bool dynamic_vec;
int total_vec;
int pcif_vec;
int req_vec;
@@ -724,21 +736,31 @@ int mlx5_irq_table_create(struct mlx5_core_dev *dev)
if (mlx5_core_is_sf(dev))
return 0;
+ /* PCI PF vectors usage is limited by online cpus, device EQs and
+ * PCI MSI-X capability.
+ */
pcif_vec = MLX5_CAP_GEN(dev, num_ports) * num_online_cpus() + 1;
pcif_vec = min_t(int, pcif_vec, num_eqs);
+ pcif_vec = min_t(int, pcif_vec, pci_msix_vec_count(dev->pdev));
total_vec = pcif_vec;
if (mlx5_sf_max_functions(dev))
total_vec += MLX5_MAX_MSIX_PER_SF * mlx5_sf_max_functions(dev);
total_vec = min_t(int, total_vec, pci_msix_vec_count(dev->pdev));
- pcif_vec = min_t(int, pcif_vec, pci_msix_vec_count(dev->pdev));
req_vec = pci_msix_can_alloc_dyn(dev->pdev) ? 1 : total_vec;
n = pci_alloc_irq_vectors(dev->pdev, 1, req_vec, PCI_IRQ_MSIX);
if (n < 0)
return n;
- err = irq_pools_init(dev, total_vec - pcif_vec, pcif_vec);
+ /* Further limit vectors of the pools based on platform for non dynamic case */
+ dynamic_vec = pci_msix_can_alloc_dyn(dev->pdev);
+ if (!dynamic_vec) {
+ pcif_vec = min_t(int, n, pcif_vec);
+ total_vec = min_t(int, n, total_vec);
+ }
+
+ err = irq_pools_init(dev, total_vec - pcif_vec, pcif_vec, dynamic_vec);
if (err)
pci_free_irq_vectors(dev->pdev);
--
2.44.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net 3/7] net/mlx5: fs, lock FTE when checking if active
2024-11-07 18:35 [PATCH net 0/7] mlx5 misc fixes 2024-11-07 Tariq Toukan
2024-11-07 18:35 ` [PATCH net 1/7] net/mlx5: E-switch, unload IB representors when unloading ETH representors Tariq Toukan
2024-11-07 18:35 ` [PATCH net 2/7] net/mlx5: Fix msix vectors to respect platform limit Tariq Toukan
@ 2024-11-07 18:35 ` Tariq Toukan
2024-11-07 18:35 ` [PATCH net 4/7] net/mlx5e: kTLS, Fix incorrect page refcounting Tariq Toukan
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Tariq Toukan @ 2024-11-07 18:35 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Andrew Lunn
Cc: netdev, Saeed Mahameed, Gal Pressman, Leon Romanovsky, Mark Bloch,
Maor Gottlieb, Tariq Toukan
From: Mark Bloch <mbloch@nvidia.com>
The referenced commits introduced a two-step process for deleting FTEs:
- Lock the FTE, delete it from hardware, set the hardware deletion function
to NULL and unlock the FTE.
- Lock the parent flow group, delete the software copy of the FTE, and
remove it from the xarray.
However, this approach encounters a race condition if a rule with the same
match value is added simultaneously. In this scenario, fs_core may set the
hardware deletion function to NULL prematurely, causing a panic during
subsequent rule deletions.
To prevent this, ensure the active flag of the FTE is checked under a lock,
which will prevent the fs_core layer from attaching a new steering rule to
an FTE that is in the process of deletion.
[ 438.967589] MOSHE: 2496 mlx5_del_flow_rules del_hw_func
[ 438.968205] ------------[ cut here ]------------
[ 438.968654] refcount_t: decrement hit 0; leaking memory.
[ 438.969249] WARNING: CPU: 0 PID: 8957 at lib/refcount.c:31 refcount_warn_saturate+0xfb/0x110
[ 438.970054] Modules linked in: act_mirred cls_flower act_gact sch_ingress openvswitch nsh mlx5_vdpa vringh vhost_iotlb vdpa mlx5_ib mlx5_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core zram zsmalloc fuse [last unloaded: cls_flower]
[ 438.973288] CPU: 0 UID: 0 PID: 8957 Comm: tc Not tainted 6.12.0-rc1+ #8
[ 438.973888] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 438.974874] RIP: 0010:refcount_warn_saturate+0xfb/0x110
[ 438.975363] Code: 40 66 3b 82 c6 05 16 e9 4d 01 01 e8 1f 7c a0 ff 0f 0b c3 cc cc cc cc 48 c7 c7 10 66 3b 82 c6 05 fd e8 4d 01 01 e8 05 7c a0 ff <0f> 0b c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 90
[ 438.976947] RSP: 0018:ffff888124a53610 EFLAGS: 00010286
[ 438.977446] RAX: 0000000000000000 RBX: ffff888119d56de0 RCX: 0000000000000000
[ 438.978090] RDX: ffff88852c828700 RSI: ffff88852c81b3c0 RDI: ffff88852c81b3c0
[ 438.978721] RBP: ffff888120fa0e88 R08: 0000000000000000 R09: ffff888124a534b0
[ 438.979353] R10: 0000000000000001 R11: 0000000000000001 R12: ffff888119d56de0
[ 438.979979] R13: ffff888120fa0ec0 R14: ffff888120fa0ee8 R15: ffff888119d56de0
[ 438.980607] FS: 00007fe6dcc0f800(0000) GS:ffff88852c800000(0000) knlGS:0000000000000000
[ 438.983984] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 438.984544] CR2: 00000000004275e0 CR3: 0000000186982001 CR4: 0000000000372eb0
[ 438.985205] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 438.985842] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 438.986507] Call Trace:
[ 438.986799] <TASK>
[ 438.987070] ? __warn+0x7d/0x110
[ 438.987426] ? refcount_warn_saturate+0xfb/0x110
[ 438.987877] ? report_bug+0x17d/0x190
[ 438.988261] ? prb_read_valid+0x17/0x20
[ 438.988659] ? handle_bug+0x53/0x90
[ 438.989054] ? exc_invalid_op+0x14/0x70
[ 438.989458] ? asm_exc_invalid_op+0x16/0x20
[ 438.989883] ? refcount_warn_saturate+0xfb/0x110
[ 438.990348] mlx5_del_flow_rules+0x2f7/0x340 [mlx5_core]
[ 438.990932] __mlx5_eswitch_del_rule+0x49/0x170 [mlx5_core]
[ 438.991519] ? mlx5_lag_is_sriov+0x3c/0x50 [mlx5_core]
[ 438.992054] ? xas_load+0x9/0xb0
[ 438.992407] mlx5e_tc_rule_unoffload+0x45/0xe0 [mlx5_core]
[ 438.993037] mlx5e_tc_del_fdb_flow+0x2a6/0x2e0 [mlx5_core]
[ 438.993623] mlx5e_flow_put+0x29/0x60 [mlx5_core]
[ 438.994161] mlx5e_delete_flower+0x261/0x390 [mlx5_core]
[ 438.994728] tc_setup_cb_destroy+0xb9/0x190
[ 438.995150] fl_hw_destroy_filter+0x94/0xc0 [cls_flower]
[ 438.995650] fl_change+0x11a4/0x13c0 [cls_flower]
[ 438.996105] tc_new_tfilter+0x347/0xbc0
[ 438.996503] ? ___slab_alloc+0x70/0x8c0
[ 438.996929] rtnetlink_rcv_msg+0xf9/0x3e0
[ 438.997339] ? __netlink_sendskb+0x4c/0x70
[ 438.997751] ? netlink_unicast+0x286/0x2d0
[ 438.998171] ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[ 438.998625] netlink_rcv_skb+0x54/0x100
[ 438.999020] netlink_unicast+0x203/0x2d0
[ 438.999421] netlink_sendmsg+0x1e4/0x420
[ 438.999820] __sock_sendmsg+0xa1/0xb0
[ 439.000203] ____sys_sendmsg+0x207/0x2a0
[ 439.000600] ? copy_msghdr_from_user+0x6d/0xa0
[ 439.001072] ___sys_sendmsg+0x80/0xc0
[ 439.001459] ? ___sys_recvmsg+0x8b/0xc0
[ 439.001848] ? generic_update_time+0x4d/0x60
[ 439.002282] __sys_sendmsg+0x51/0x90
[ 439.002658] do_syscall_64+0x50/0x110
[ 439.003040] entry_SYSCALL_64_after_hwframe+0x76/0x7e
Fixes: 718ce4d601db ("net/mlx5: Consolidate update FTE for all removal changes")
Fixes: cefc23554fc2 ("net/mlx5: Fix FTE cleanup")
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 8505d5e241e1..6e4f8aaf8d2f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -2105,13 +2105,22 @@ lookup_fte_locked(struct mlx5_flow_group *g,
fte_tmp = NULL;
goto out;
}
+
+ nested_down_write_ref_node(&fte_tmp->node, FS_LOCK_CHILD);
+
if (!fte_tmp->node.active) {
+ up_write_ref_node(&fte_tmp->node, false);
+
+ if (take_write)
+ up_write_ref_node(&g->node, false);
+ else
+ up_read_ref_node(&g->node);
+
tree_put_node(&fte_tmp->node, false);
- fte_tmp = NULL;
- goto out;
+
+ return NULL;
}
- nested_down_write_ref_node(&fte_tmp->node, FS_LOCK_CHILD);
out:
if (take_write)
up_write_ref_node(&g->node, false);
--
2.44.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net 4/7] net/mlx5e: kTLS, Fix incorrect page refcounting
2024-11-07 18:35 [PATCH net 0/7] mlx5 misc fixes 2024-11-07 Tariq Toukan
` (2 preceding siblings ...)
2024-11-07 18:35 ` [PATCH net 3/7] net/mlx5: fs, lock FTE when checking if active Tariq Toukan
@ 2024-11-07 18:35 ` Tariq Toukan
2024-11-07 18:35 ` [PATCH net 5/7] net/mlx5e: clear xdp features on non-uplink representors Tariq Toukan
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Tariq Toukan @ 2024-11-07 18:35 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Andrew Lunn
Cc: netdev, Saeed Mahameed, Gal Pressman, Leon Romanovsky,
Dragos Tatulea, Tariq Toukan
From: Dragos Tatulea <dtatulea@nvidia.com>
The kTLS tx handling code is using a mix of get_page() and
page_ref_inc() APIs to increment the page reference. But on the release
path (mlx5e_ktls_tx_handle_resync_dump_comp()), only put_page() is used.
This is an issue when using pages from large folios: the get_page()
references are stored on the folio page while the page_ref_inc()
references are stored directly in the given page. On release the folio
page will be dereferenced too many times.
This was found while doing kTLS testing with sendfile() + ZC when the
served file was read from NFS on a kernel with NFS large folios support
(commit 49b29a573da8 ("nfs: add support for large folios")).
Fixes: 84d1bb2b139e ("net/mlx5e: kTLS, Limit DUMP wqe size")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
.../net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
index d61be26a4df1..3db31cc10719 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
@@ -660,7 +660,7 @@ tx_sync_info_get(struct mlx5e_ktls_offload_context_tx *priv_tx,
while (remaining > 0) {
skb_frag_t *frag = &record->frags[i];
- get_page(skb_frag_page(frag));
+ page_ref_inc(skb_frag_page(frag));
remaining -= skb_frag_size(frag);
info->frags[i++] = *frag;
}
@@ -763,7 +763,7 @@ void mlx5e_ktls_tx_handle_resync_dump_comp(struct mlx5e_txqsq *sq,
stats = sq->stats;
mlx5e_tx_dma_unmap(sq->pdev, dma);
- put_page(wi->resync_dump_frag_page);
+ page_ref_dec(wi->resync_dump_frag_page);
stats->tls_dump_packets++;
stats->tls_dump_bytes += wi->num_bytes;
}
@@ -816,12 +816,12 @@ mlx5e_ktls_tx_handle_ooo(struct mlx5e_ktls_offload_context_tx *priv_tx,
err_out:
for (; i < info.nr_frags; i++)
- /* The put_page() here undoes the page ref obtained in tx_sync_info_get().
+ /* The page_ref_dec() here undoes the page ref obtained in tx_sync_info_get().
* Page refs obtained for the DUMP WQEs above (by page_ref_add) will be
* released only upon their completions (or in mlx5e_free_txqsq_descs,
* if channel closes).
*/
- put_page(skb_frag_page(&info.frags[i]));
+ page_ref_dec(skb_frag_page(&info.frags[i]));
return MLX5E_KTLS_SYNC_FAIL;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net 5/7] net/mlx5e: clear xdp features on non-uplink representors
2024-11-07 18:35 [PATCH net 0/7] mlx5 misc fixes 2024-11-07 Tariq Toukan
` (3 preceding siblings ...)
2024-11-07 18:35 ` [PATCH net 4/7] net/mlx5e: kTLS, Fix incorrect page refcounting Tariq Toukan
@ 2024-11-07 18:35 ` Tariq Toukan
2024-11-07 18:35 ` [PATCH net 6/7] net/mlx5e: CT: Fix null-ptr-deref in add rule err flow Tariq Toukan
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Tariq Toukan @ 2024-11-07 18:35 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Andrew Lunn
Cc: netdev, Saeed Mahameed, Gal Pressman, Leon Romanovsky, William Tu,
Tariq Toukan
From: William Tu <witu@nvidia.com>
Non-uplink representor port does not support XDP. The patch clears
the xdp feature by checking the net_device_ops.ndo_bpf is set or not.
Verify using the netlink tool:
$ tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml --dump dev-get
Representor netdev before the patch:
{'ifindex': 8,
'xdp-features': {'basic',
'ndo-xmit',
'ndo-xmit-sg',
'redirect',
'rx-sg',
'xsk-zerocopy'},
'xdp-rx-metadata-features': set(),
'xdp-zc-max-segs': 1,
'xsk-features': set()},
With the patch:
{'ifindex': 8,
'xdp-features': set(),
'xdp-rx-metadata-features': set(),
'xsk-features': set()},
Fixes: 4d5ab0ad964d ("net/mlx5e: take into account device reconfiguration for xdp_features flag")
Signed-off-by: William Tu <witu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index e601324a690a..13a3fa8dc0cb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4267,7 +4267,8 @@ void mlx5e_set_xdp_feature(struct net_device *netdev)
struct mlx5e_params *params = &priv->channels.params;
xdp_features_t val;
- if (params->packet_merge.type != MLX5E_PACKET_MERGE_NONE) {
+ if (!netdev->netdev_ops->ndo_bpf ||
+ params->packet_merge.type != MLX5E_PACKET_MERGE_NONE) {
xdp_clear_features_flag(netdev);
return;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net 6/7] net/mlx5e: CT: Fix null-ptr-deref in add rule err flow
2024-11-07 18:35 [PATCH net 0/7] mlx5 misc fixes 2024-11-07 Tariq Toukan
` (4 preceding siblings ...)
2024-11-07 18:35 ` [PATCH net 5/7] net/mlx5e: clear xdp features on non-uplink representors Tariq Toukan
@ 2024-11-07 18:35 ` Tariq Toukan
2024-11-07 18:35 ` [PATCH net 7/7] net/mlx5e: Disable loopback self-test on multi-PF netdev Tariq Toukan
2024-11-12 3:30 ` [PATCH net 0/7] mlx5 misc fixes 2024-11-07 patchwork-bot+netdevbpf
7 siblings, 0 replies; 9+ messages in thread
From: Tariq Toukan @ 2024-11-07 18:35 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Andrew Lunn
Cc: netdev, Saeed Mahameed, Gal Pressman, Leon Romanovsky,
Moshe Shemesh, Cosmin Ratiu, Yevgeny Kliteynik, Tariq Toukan
From: Moshe Shemesh <moshe@nvidia.com>
In error flow of mlx5_tc_ct_entry_add_rule(), in case ct_rule_add()
callback returns error, zone_rule->attr is used uninitiated. Fix it to
use attr which has the needed pointer value.
Kernel log:
BUG: kernel NULL pointer dereference, address: 0000000000000110
RIP: 0010:mlx5_tc_ct_entry_add_rule+0x2b1/0x2f0 [mlx5_core]
…
Call Trace:
<TASK>
? __die+0x20/0x70
? page_fault_oops+0x150/0x3e0
? exc_page_fault+0x74/0x140
? asm_exc_page_fault+0x22/0x30
? mlx5_tc_ct_entry_add_rule+0x2b1/0x2f0 [mlx5_core]
? mlx5_tc_ct_entry_add_rule+0x1d5/0x2f0 [mlx5_core]
mlx5_tc_ct_block_flow_offload+0xc6a/0xf90 [mlx5_core]
? nf_flow_offload_tuple+0xd8/0x190 [nf_flow_table]
nf_flow_offload_tuple+0xd8/0x190 [nf_flow_table]
flow_offload_work_handler+0x142/0x320 [nf_flow_table]
? finish_task_switch.isra.0+0x15b/0x2b0
process_one_work+0x16c/0x320
worker_thread+0x28c/0x3a0
? __pfx_worker_thread+0x10/0x10
kthread+0xb8/0xf0
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2d/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
Fixes: 7fac5c2eced3 ("net/mlx5: CT: Avoid reusing modify header context for natted entries")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
index dcfccaaa8d91..92d5cfec3dc0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
@@ -866,7 +866,7 @@ mlx5_tc_ct_entry_add_rule(struct mlx5_tc_ct_priv *ct_priv,
return 0;
err_rule:
- mlx5_tc_ct_entry_destroy_mod_hdr(ct_priv, zone_rule->attr, zone_rule->mh);
+ mlx5_tc_ct_entry_destroy_mod_hdr(ct_priv, attr, zone_rule->mh);
mlx5_put_label_mapping(ct_priv, attr->ct_attr.ct_labels_id);
err_mod_hdr:
kfree(attr);
--
2.44.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net 7/7] net/mlx5e: Disable loopback self-test on multi-PF netdev
2024-11-07 18:35 [PATCH net 0/7] mlx5 misc fixes 2024-11-07 Tariq Toukan
` (5 preceding siblings ...)
2024-11-07 18:35 ` [PATCH net 6/7] net/mlx5e: CT: Fix null-ptr-deref in add rule err flow Tariq Toukan
@ 2024-11-07 18:35 ` Tariq Toukan
2024-11-12 3:30 ` [PATCH net 0/7] mlx5 misc fixes 2024-11-07 patchwork-bot+netdevbpf
7 siblings, 0 replies; 9+ messages in thread
From: Tariq Toukan @ 2024-11-07 18:35 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Andrew Lunn
Cc: netdev, Saeed Mahameed, Gal Pressman, Leon Romanovsky,
Carolina Jubran, Tariq Toukan
From: Carolina Jubran <cjubran@nvidia.com>
In Multi-PF (Socket Direct) configurations, when a loopback packet is
sent through one of the secondary devices, it will always be received
on the primary device. This causes the loopback layer to fail in
identifying the loopback packet as the devices are different.
To avoid false test failures, disable the loopback self-test in
Multi-PF configurations.
Fixes: ed29705e4ed1 ("net/mlx5: Enable SD feature")
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
index 5bf8318cc48b..1d60465cc2ca 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
@@ -36,6 +36,7 @@
#include "en.h"
#include "en/port.h"
#include "eswitch.h"
+#include "lib/mlx5.h"
static int mlx5e_test_health_info(struct mlx5e_priv *priv)
{
@@ -247,6 +248,9 @@ static int mlx5e_cond_loopback(struct mlx5e_priv *priv)
if (is_mdev_switchdev_mode(priv->mdev))
return -EOPNOTSUPP;
+ if (mlx5_get_sd(priv->mdev))
+ return -EOPNOTSUPP;
+
return 0;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH net 0/7] mlx5 misc fixes 2024-11-07
2024-11-07 18:35 [PATCH net 0/7] mlx5 misc fixes 2024-11-07 Tariq Toukan
` (6 preceding siblings ...)
2024-11-07 18:35 ` [PATCH net 7/7] net/mlx5e: Disable loopback self-test on multi-PF netdev Tariq Toukan
@ 2024-11-12 3:30 ` patchwork-bot+netdevbpf
7 siblings, 0 replies; 9+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-11-12 3:30 UTC (permalink / raw)
To: Tariq Toukan
Cc: davem, kuba, pabeni, edumazet, andrew+netdev, netdev, saeedm, gal,
leonro
Hello:
This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Thu, 7 Nov 2024 20:35:20 +0200 you wrote:
> Hi,
>
> This patchset provides misc bug fixes from the team to the mlx5 core and
> Eth drivers.
>
> Series generated against:
> commit 5f897f30f596 ("Merge branch 'fix-the-arc-emac-driver'")
>
> [...]
Here is the summary with links:
- [net,1/7] net/mlx5: E-switch, unload IB representors when unloading ETH representors
https://git.kernel.org/netdev/net/c/1220965d6191
- [net,2/7] net/mlx5: Fix msix vectors to respect platform limit
https://git.kernel.org/netdev/net/c/d0989c9d2b3a
- [net,3/7] net/mlx5: fs, lock FTE when checking if active
https://git.kernel.org/netdev/net/c/9ca314419930
- [net,4/7] net/mlx5e: kTLS, Fix incorrect page refcounting
https://git.kernel.org/netdev/net/c/dd6e972cc589
- [net,5/7] net/mlx5e: clear xdp features on non-uplink representors
https://git.kernel.org/netdev/net/c/c079389878de
- [net,6/7] net/mlx5e: CT: Fix null-ptr-deref in add rule err flow
https://git.kernel.org/netdev/net/c/e99c6873229f
- [net,7/7] net/mlx5e: Disable loopback self-test on multi-PF netdev
https://git.kernel.org/netdev/net/c/d1ac33934a66
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-11-12 3:30 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-07 18:35 [PATCH net 0/7] mlx5 misc fixes 2024-11-07 Tariq Toukan
2024-11-07 18:35 ` [PATCH net 1/7] net/mlx5: E-switch, unload IB representors when unloading ETH representors Tariq Toukan
2024-11-07 18:35 ` [PATCH net 2/7] net/mlx5: Fix msix vectors to respect platform limit Tariq Toukan
2024-11-07 18:35 ` [PATCH net 3/7] net/mlx5: fs, lock FTE when checking if active Tariq Toukan
2024-11-07 18:35 ` [PATCH net 4/7] net/mlx5e: kTLS, Fix incorrect page refcounting Tariq Toukan
2024-11-07 18:35 ` [PATCH net 5/7] net/mlx5e: clear xdp features on non-uplink representors Tariq Toukan
2024-11-07 18:35 ` [PATCH net 6/7] net/mlx5e: CT: Fix null-ptr-deref in add rule err flow Tariq Toukan
2024-11-07 18:35 ` [PATCH net 7/7] net/mlx5e: Disable loopback self-test on multi-PF netdev Tariq Toukan
2024-11-12 3:30 ` [PATCH net 0/7] mlx5 misc fixes 2024-11-07 patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).