public inbox for linux-rdma@vger.kernel.org
* Re: [PATCH] net/mlx5: Ensure af_desc.mask is properly initialized
       [not found] <168556238265.1445.7577814343475230160.stgit@manet.1015granger.net>
@ 2023-05-31 22:35 ` Saeed Mahameed
From: Saeed Mahameed @ 2023-05-31 22:35 UTC
  To: Chuck Lever; +Cc: elic, Thomas Gleixner, Chuck Lever, linux-rdma

On 31 May 15:48, Chuck Lever wrote:
>From: Chuck Lever <chuck.lever@oracle.com>
>
>[    9.837087] mlx5_core 0000:02:00.0: firmware version: 16.35.2000
>[    9.843126] mlx5_core 0000:02:00.0: 126.016 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x16 link)
>[   10.311515] mlx5_core 0000:02:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
>[   10.321948] mlx5_core 0000:02:00.0: E-Switch: Total vports 2, per vport: max uc(128) max mc(2048)
>[   10.344324] mlx5_core 0000:02:00.0: mlx5_pcie_event:301:(pid 88): PCIe slot advertised sufficient power (27W).
>[   10.354339] BUG: unable to handle page fault for address: ffffffff8ff0ade0
>[   10.361206] #PF: supervisor read access in kernel mode
>[   10.366335] #PF: error_code(0x0000) - not-present page
>[   10.371467] PGD 81ec39067 P4D 81ec39067 PUD 81ec3a063 PMD 114b07063 PTE 800ffff7e10f5062
>[   10.379544] Oops: 0000 [#1] PREEMPT SMP PTI
>[   10.383721] CPU: 0 PID: 117 Comm: kworker/0:6 Not tainted 6.3.0-13028-g7222f123c983 #1
>[   10.391625] Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.0b 06/12/2017
>[   10.398750] Workqueue: events work_for_cpu_fn
>[   10.403108] RIP: 0010:__bitmap_or+0x10/0x26
>[   10.407286] Code: 85 c0 0f 95 c0 c3 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 89 c9 31 c0 48 83 c1 3f 48 c1 e9 06 39 c>
>[   10.426024] RSP: 0000:ffffb45a0078f7b0 EFLAGS: 00010097
>[   10.431240] RAX: 0000000000000000 RBX: ffffffff8ff0adc0 RCX: 0000000000000004
>[   10.438365] RDX: ffff9156801967d0 RSI: ffffffff8ff0ade0 RDI: ffff9156801967b0
>[   10.445489] RBP: ffffb45a0078f7e8 R08: 0000000000000030 R09: 0000000000000000
>[   10.452613] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000000ec
>[   10.459737] R13: ffffffff8ff0ade0 R14: 0000000000000001 R15: 0000000000000020
>[   10.466862] FS:  0000000000000000(0000) GS:ffff9165bfc00000(0000) knlGS:0000000000000000
>[   10.474936] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>[   10.480674] CR2: ffffffff8ff0ade0 CR3: 00000001011ae003 CR4: 00000000003706f0
>[   10.487800] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>[   10.494922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>[   10.502046] Call Trace:
>[   10.504493]  <TASK>
>[   10.506589]  ? matrix_alloc_area.constprop.0+0x43/0x9a
>[   10.511729]  ? prepare_namespace+0x84/0x174
>[   10.515914]  irq_matrix_reserve_managed+0x56/0x10c
>[   10.520699]  x86_vector_alloc_irqs+0x1d2/0x31e
>[   10.525146]  irq_domain_alloc_irqs_hierarchy+0x39/0x3f
>[   10.530284]  irq_domain_alloc_irqs_parent+0x1a/0x2a
>[   10.535155]  intel_irq_remapping_alloc+0x59/0x5e9
>[   10.539859]  ? kmem_cache_debug_flags+0x11/0x26
>[   10.544383]  ? __radix_tree_lookup+0x39/0xb9
>[   10.548649]  irq_domain_alloc_irqs_hierarchy+0x39/0x3f
>[   10.553779]  irq_domain_alloc_irqs_parent+0x1a/0x2a
>[   10.558650]  msi_domain_alloc+0x8c/0x120
>[   10.567697]  irq_domain_alloc_irqs_locked+0x11d/0x286
>[   10.572741]  __irq_domain_alloc_irqs+0x72/0x93
>[   10.577179]  __msi_domain_alloc_irqs+0x193/0x3f1
>[   10.581789]  ? __xa_alloc+0xcf/0xe2
>[   10.585273]  msi_domain_alloc_irq_at+0xa8/0xfe
>[   10.589711]  pci_msix_alloc_irq_at+0x47/0x5c
>
>The crash is due to matrix_alloc_area() attempting to access per-CPU
>memory for CPUs that are not present on the system. The CPU mask
>passed into reserve_managed_vector() via its @irqd parameter is
>corrupted because it contains uninitialized stack data.
>
>Fixes: bbac70c74183 ("net/mlx5: Use newer affinity descriptor")
>Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
>Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Applied to net-mlx5. Chuck, for faster review, please CC netdev next
time for mlx5 patches.
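
For readers following the quoted description: the failure mode (a CPU
mask on the stack that is only partially written, so stale bits name
CPUs that are not present) can be sketched in userspace roughly as
below. This is an illustration only, not the applied patch. Every
sketch_* name, NR_CPUS_SKETCH, and the set-one-CPU request pattern are
assumptions made for the example; stale stack contents are simulated
with memset() because reading truly uninitialized memory is undefined
behavior in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical compile-time CPU limit, standing in for the kernel's. */
#define NR_CPUS_SKETCH 256
#define BITS_PER_WORD  (8 * sizeof(unsigned long))
#define MASK_WORDS     ((NR_CPUS_SKETCH + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical stand-ins for a CPU mask and an affinity descriptor. */
struct sketch_cpumask {
	unsigned long bits[MASK_WORDS];
};

struct sketch_affinity_desc {
	struct sketch_cpumask mask;
	unsigned int is_managed;
};

static void sketch_mask_clear(struct sketch_cpumask *m)
{
	memset(m->bits, 0, sizeof(m->bits));
}

static void sketch_mask_set_cpu(struct sketch_cpumask *m, unsigned int cpu)
{
	assert(cpu < NR_CPUS_SKETCH);
	m->bits[cpu / BITS_PER_WORD] |= 1UL << (cpu % BITS_PER_WORD);
}

/* Return true if the mask names any CPU numbered nr_present or higher. */
static bool sketch_mask_has_absent_cpu(const struct sketch_cpumask *m,
				       unsigned int nr_present)
{
	for (unsigned int cpu = nr_present; cpu < NR_CPUS_SKETCH; cpu++)
		if (m->bits[cpu / BITS_PER_WORD] &
		    (1UL << (cpu % BITS_PER_WORD)))
			return true;
	return false;
}

/* Simulate leftover stack data deterministically. */
static void sketch_fill_with_garbage(struct sketch_affinity_desc *d)
{
	memset(d, 0xff, sizeof(*d));
}

/* Buggy pattern: sets the wanted CPU but never clears the other bits. */
static void sketch_request_buggy(struct sketch_affinity_desc *d,
				 unsigned int cpu)
{
	d->is_managed = 1;
	sketch_mask_set_cpu(&d->mask, cpu);
}

/* Fixed pattern: clear the whole mask before setting the wanted CPU. */
static void sketch_request_fixed(struct sketch_affinity_desc *d,
				 unsigned int cpu)
{
	sketch_mask_clear(&d->mask);
	d->is_managed = 1;
	sketch_mask_set_cpu(&d->mask, cpu);
}
```

With a descriptor pre-filled by sketch_fill_with_garbage(), the buggy
variant leaves bits set for CPUs beyond a hypothetical 32 present CPUs,
while the fixed variant leaves only the requested CPU set. A consumer
that walks the mask and dereferences per-CPU data for each set bit
would then touch memory for absent CPUs, which is the kind of access
the oops above reports.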
