From: Saeed Mahameed <saeed@kernel.org>
To: "David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Eric Dumazet <edumazet@google.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>,
netdev@vger.kernel.org, Tariq Toukan <tariqt@nvidia.com>,
Shay Drory <shayd@nvidia.com>,
Frederick Lawler <fred@cloudflare.com>
Subject: [net 13/15] net/mlx5: Free irqs only on shutdown callback
Date: Mon, 22 May 2023 22:42:40 -0700 [thread overview]
Message-ID: <20230523054242.21596-14-saeed@kernel.org> (raw)
In-Reply-To: <20230523054242.21596-1-saeed@kernel.org>
From: Shay Drory <shayd@nvidia.com>
Whenever a shutdown is invoked, free irqs only and keep mlx5_irq
synthetic wrapper intact in order to avoid use-after-free on
system shutdown.
for example:
==================================================================
BUG: KASAN: use-after-free in _find_first_bit+0x66/0x80
Read of size 8 at addr ffff88823fc0d318 by task kworker/u192:0/13608
CPU: 25 PID: 13608 Comm: kworker/u192:0 Tainted: G B W O 6.1.21-cloudflare-kasan-2023.3.21 #1
Hardware name: GIGABYTE R162-R2-GEN0/MZ12-HD2-CD, BIOS R14 05/03/2021
Workqueue: mlx5e mlx5e_tx_timeout_work [mlx5_core]
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x48
print_report+0x170/0x473
? _find_first_bit+0x66/0x80
kasan_report+0xad/0x130
? _find_first_bit+0x66/0x80
_find_first_bit+0x66/0x80
mlx5e_open_channels+0x3c5/0x3a10 [mlx5_core]
? console_unlock+0x2fa/0x430
? _raw_spin_lock_irqsave+0x8d/0xf0
? _raw_spin_unlock_irqrestore+0x42/0x80
? preempt_count_add+0x7d/0x150
? __wake_up_klogd.part.0+0x7d/0xc0
? vprintk_emit+0xfe/0x2c0
? mlx5e_trigger_napi_sched+0x40/0x40 [mlx5_core]
? dev_attr_show.cold+0x35/0x35
? devlink_health_do_dump.part.0+0x174/0x340
? devlink_health_report+0x504/0x810
? mlx5e_reporter_tx_timeout+0x29d/0x3a0 [mlx5_core]
? mlx5e_tx_timeout_work+0x17c/0x230 [mlx5_core]
? process_one_work+0x680/0x1050
mlx5e_safe_switch_params+0x156/0x220 [mlx5_core]
? mlx5e_switch_priv_channels+0x310/0x310 [mlx5_core]
? mlx5_eq_poll_irq_disabled+0xb6/0x100 [mlx5_core]
mlx5e_tx_reporter_timeout_recover+0x123/0x240 [mlx5_core]
? __mutex_unlock_slowpath.constprop.0+0x2b0/0x2b0
devlink_health_reporter_recover+0xa6/0x1f0
devlink_health_report+0x2f7/0x810
? vsnprintf+0x854/0x15e0
mlx5e_reporter_tx_timeout+0x29d/0x3a0 [mlx5_core]
? mlx5e_reporter_tx_err_cqe+0x1a0/0x1a0 [mlx5_core]
? mlx5e_tx_reporter_timeout_dump+0x50/0x50 [mlx5_core]
? mlx5e_tx_reporter_dump_sq+0x260/0x260 [mlx5_core]
? newidle_balance+0x9b7/0xe30
? psi_group_change+0x6a7/0xb80
? mutex_lock+0x96/0xf0
? __mutex_lock_slowpath+0x10/0x10
mlx5e_tx_timeout_work+0x17c/0x230 [mlx5_core]
process_one_work+0x680/0x1050
worker_thread+0x5a0/0xeb0
? process_one_work+0x1050/0x1050
kthread+0x2a2/0x340
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x22/0x30
</TASK>
Freed by task 1:
kasan_save_stack+0x23/0x50
kasan_set_track+0x21/0x30
kasan_save_free_info+0x2a/0x40
____kasan_slab_free+0x169/0x1d0
slab_free_freelist_hook+0xd2/0x190
__kmem_cache_free+0x1a1/0x2f0
irq_pool_free+0x138/0x200 [mlx5_core]
mlx5_irq_table_destroy+0xf6/0x170 [mlx5_core]
mlx5_core_eq_free_irqs+0x74/0xf0 [mlx5_core]
shutdown+0x194/0x1aa [mlx5_core]
pci_device_shutdown+0x75/0x120
device_shutdown+0x35c/0x620
kernel_restart+0x60/0xa0
__do_sys_reboot+0x1cb/0x2c0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x4b/0xb5
The buggy address belongs to the object at ffff88823fc0d300
which belongs to the cache kmalloc-192 of size 192
The buggy address is located 24 bytes inside of
192-byte region [ffff88823fc0d300, ffff88823fc0d3c0)
The buggy address belongs to the physical page:
page:0000000010139587 refcount:1 mapcount:0 mapping:0000000000000000
index:0x0 pfn:0x23fc0c
head:0000000010139587 order:1 compound_mapcount:0 compound_pincount:0
flags: 0x2ffff800010200(slab|head|node=0|zone=2|lastcpupid=0x1ffff)
raw: 002ffff800010200 0000000000000000 dead000000000122 ffff88810004ca00
raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88823fc0d200: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88823fc0d280: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>ffff88823fc0d300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88823fc0d380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
ffff88823fc0d400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
general protection fault, probably for non-canonical address
0xdffffc005c40d7ac: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: probably user-memory-access in range [0x00000002e206bd60-0x00000002e206bd67]
CPU: 25 PID: 13608 Comm: kworker/u192:0 Tainted: G B W O 6.1.21-cloudflare-kasan-2023.3.21 #1
Hardware name: GIGABYTE R162-R2-GEN0/MZ12-HD2-CD, BIOS R14 05/03/2021
Workqueue: mlx5e mlx5e_tx_timeout_work [mlx5_core]
RIP: 0010:__alloc_pages+0x141/0x5c0
Call Trace:
<TASK>
? sysvec_apic_timer_interrupt+0xa0/0xc0
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? __alloc_pages_slowpath.constprop.0+0x1ec0/0x1ec0
? _raw_spin_unlock_irqrestore+0x3d/0x80
__kmalloc_large_node+0x80/0x120
? kvmalloc_node+0x4e/0x170
__kmalloc_node+0xd4/0x150
kvmalloc_node+0x4e/0x170
mlx5e_open_channels+0x631/0x3a10 [mlx5_core]
? console_unlock+0x2fa/0x430
? _raw_spin_lock_irqsave+0x8d/0xf0
? _raw_spin_unlock_irqrestore+0x42/0x80
? preempt_count_add+0x7d/0x150
? __wake_up_klogd.part.0+0x7d/0xc0
? vprintk_emit+0xfe/0x2c0
? mlx5e_trigger_napi_sched+0x40/0x40 [mlx5_core]
? dev_attr_show.cold+0x35/0x35
? devlink_health_do_dump.part.0+0x174/0x340
? devlink_health_report+0x504/0x810
? mlx5e_reporter_tx_timeout+0x29d/0x3a0 [mlx5_core]
? mlx5e_tx_timeout_work+0x17c/0x230 [mlx5_core]
? process_one_work+0x680/0x1050
mlx5e_safe_switch_params+0x156/0x220 [mlx5_core]
? mlx5e_switch_priv_channels+0x310/0x310 [mlx5_core]
? mlx5_eq_poll_irq_disabled+0xb6/0x100 [mlx5_core]
mlx5e_tx_reporter_timeout_recover+0x123/0x240 [mlx5_core]
? __mutex_unlock_slowpath.constprop.0+0x2b0/0x2b0
devlink_health_reporter_recover+0xa6/0x1f0
devlink_health_report+0x2f7/0x810
? vsnprintf+0x854/0x15e0
mlx5e_reporter_tx_timeout+0x29d/0x3a0 [mlx5_core]
? mlx5e_reporter_tx_err_cqe+0x1a0/0x1a0 [mlx5_core]
? mlx5e_tx_reporter_timeout_dump+0x50/0x50 [mlx5_core]
? mlx5e_tx_reporter_dump_sq+0x260/0x260 [mlx5_core]
? newidle_balance+0x9b7/0xe30
? psi_group_change+0x6a7/0xb80
? mutex_lock+0x96/0xf0
? __mutex_lock_slowpath+0x10/0x10
mlx5e_tx_timeout_work+0x17c/0x230 [mlx5_core]
process_one_work+0x680/0x1050
worker_thread+0x5a0/0xeb0
? process_one_work+0x1050/0x1050
kthread+0x2a2/0x340
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x22/0x30
</TASK>
---[ end trace 0000000000000000 ]---
RIP: 0010:__alloc_pages+0x141/0x5c0
Code: e0 39 a3 96 89 e9 b8 22 01 32 01 83 e1 0f 48 89 fa 01 c9 48 c1 ea
03 d3 f8 83 e0 03 89 44 24 6c 48 b8 00 00 00 00 00 fc ff df <80> 3c 02
00 0f 85 fc 03 00 00 89 e8 4a 8b 14 f5 e0 39 a3 96 4c 89
RSP: 0018:ffff888251f0f438 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 1ffff1104a3e1e8b RCX: 0000000000000000
RDX: 000000005c40d7ac RSI: 0000000000000003 RDI: 00000002e206bd60
RBP: 0000000000052dc0 R08: ffff8882b0044218 R09: ffff8882b0045e8a
R10: fffffbfff300fefc R11: ffff888167af4000 R12: 0000000000000003
R13: 0000000000000000 R14: 00000000696c7070 R15: ffff8882373f4380
FS: 0000000000000000(0000) GS:ffff88bf2be80000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005641d031eee8 CR3: 0000002e7ca14000 CR4: 0000000000350ee0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x11000000 from 0xffffffff81000000 (relocation range:
0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---]
Reported-by: Frederick Lawler <fred@cloudflare.com>
Link: https://lore.kernel.org/netdev/be5b9271-7507-19c5-ded1-fa78f1980e69@cloudflare.com
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/eq.c | 2 +-
.../ethernet/mellanox/mlx5/core/mlx5_irq.h | 1 +
.../net/ethernet/mellanox/mlx5/core/pci_irq.c | 29 +++++++++++++++++++
3 files changed, 31 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 1c35d721a31d..fe698c79616c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -1104,7 +1104,7 @@ void mlx5_core_eq_free_irqs(struct mlx5_core_dev *dev)
struct mlx5_eq_table *table = dev->priv.eq_table;
mutex_lock(&table->lock); /* sync with create/destroy_async_eq */
- mlx5_irq_table_destroy(dev);
+ mlx5_irq_table_free_irqs(dev);
mutex_unlock(&table->lock);
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_irq.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_irq.h
index efd0c299c5c7..aa403a5ea34e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_irq.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_irq.h
@@ -15,6 +15,7 @@ int mlx5_irq_table_init(struct mlx5_core_dev *dev);
void mlx5_irq_table_cleanup(struct mlx5_core_dev *dev);
int mlx5_irq_table_create(struct mlx5_core_dev *dev);
void mlx5_irq_table_destroy(struct mlx5_core_dev *dev);
+void mlx5_irq_table_free_irqs(struct mlx5_core_dev *dev);
int mlx5_irq_table_get_num_comp(struct mlx5_irq_table *table);
int mlx5_irq_table_get_sfs_vec(struct mlx5_irq_table *table);
struct mlx5_irq_table *mlx5_irq_table_get(struct mlx5_core_dev *dev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
index 2245d3b2f393..ac1304c2d205 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
@@ -691,6 +691,24 @@ static void irq_pools_destroy(struct mlx5_irq_table *table)
irq_pool_free(table->pcif_pool);
}
+static void mlx5_irq_pool_free_irqs(struct mlx5_irq_pool *pool)
+{
+ struct mlx5_irq *irq;
+ unsigned long index;
+
+ xa_for_each(&pool->irqs, index, irq)
+ free_irq(irq->map.virq, &irq->nh);
+}
+
+static void mlx5_irq_pools_free_irqs(struct mlx5_irq_table *table)
+{
+ if (table->sf_ctrl_pool) {
+ mlx5_irq_pool_free_irqs(table->sf_comp_pool);
+ mlx5_irq_pool_free_irqs(table->sf_ctrl_pool);
+ }
+ mlx5_irq_pool_free_irqs(table->pcif_pool);
+}
+
/* irq_table API */
int mlx5_irq_table_init(struct mlx5_core_dev *dev)
@@ -774,6 +792,17 @@ void mlx5_irq_table_destroy(struct mlx5_core_dev *dev)
pci_free_irq_vectors(dev->pdev);
}
+void mlx5_irq_table_free_irqs(struct mlx5_core_dev *dev)
+{
+ struct mlx5_irq_table *table = dev->priv.irq_table;
+
+ if (mlx5_core_is_sf(dev))
+ return;
+
+ mlx5_irq_pools_free_irqs(table);
+ pci_free_irq_vectors(dev->pdev);
+}
+
int mlx5_irq_table_get_sfs_vec(struct mlx5_irq_table *table)
{
if (table->sf_comp_pool)
--
2.40.1
next prev parent reply other threads:[~2023-05-23 5:43 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-23 5:42 [pull request][net 00/15] mlx5 fixes 2023-05-22 Saeed Mahameed
2023-05-23 5:42 ` [net 01/15] net/mlx5: Collect command failures data only for known commands Saeed Mahameed
2023-05-24 7:50 ` patchwork-bot+netdevbpf
2023-05-23 5:42 ` [net 02/15] net/mlx5: Handle pairing of E-switch via uplink un/load APIs Saeed Mahameed
2023-05-23 5:42 ` [net 03/15] net/mlx5: DR, Fix crc32 calculation to work on big-endian (BE) CPUs Saeed Mahameed
2023-05-23 5:42 ` [net 04/15] net/mlx5: DR, Check force-loopback RC QP capability independently from RoCE Saeed Mahameed
2023-05-23 5:42 ` [net 05/15] net/mlx5e: Use correct encap attribute during invalidation Saeed Mahameed
2023-05-23 5:42 ` [net 06/15] net/mlx5: Fix error message when failing to allocate device memory Saeed Mahameed
2023-05-23 5:42 ` [net 07/15] net/mlx5e: Fix deadlock in tc route query code Saeed Mahameed
2023-05-23 5:42 ` [net 08/15] net/mlx5e: Fix SQ wake logic in ptp napi_poll context Saeed Mahameed
2023-05-23 5:42 ` [net 09/15] net/mlx5e: TC, Fix using eswitch mapping in nic mode Saeed Mahameed
2023-05-23 5:42 ` [net 10/15] net/mlx5: E-switch, Devcom, sync devcom events and devcom comp register Saeed Mahameed
2023-05-23 5:42 ` [net 11/15] net/mlx5: Devcom, fix error flow in mlx5_devcom_register_device Saeed Mahameed
2023-05-23 5:42 ` [net 12/15] net/mlx5: Devcom, serialize devcom registration Saeed Mahameed
2023-05-23 5:42 ` Saeed Mahameed [this message]
2023-05-23 5:42 ` [net 14/15] net/mlx5: Fix irq affinity management Saeed Mahameed
2023-05-23 5:42 ` [net 15/15] net/mlx5: Fix indexing of mlx5_irq Saeed Mahameed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230523054242.21596-14-saeed@kernel.org \
--to=saeed@kernel.org \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=fred@cloudflare.com \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=saeedm@nvidia.com \
--cc=shayd@nvidia.com \
--cc=tariqt@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.