* [PATCH v2 net] qed: Don't collect too many protection override GRC elements
@ 2025-09-10 6:29 Jamie Bainbridge
2025-09-14 21:30 ` patchwork-bot+netdevbpf
0 siblings, 1 reply; 2+ messages in thread
From: Jamie Bainbridge @ 2025-09-10 6:29 UTC (permalink / raw)
To: Nimal Prabudoss, Jakub Kicinski, Manish Chopra, Andrew Lunn,
David S. Miller, Eric Dumazet, Paolo Abeni, Manish Rangankar,
Michal Kalderon, Ariel Elior
Cc: Jamie Bainbridge, netdev, linux-kernel
In the protection override dump path, the firmware can return far too
many GRC elements, resulting in attempting to write past the end of the
previously-kmalloc'ed dump buffer.
This will result in a kernel panic with reason:
BUG: unable to handle kernel paging request at ADDRESS
where "ADDRESS" is just past the end of the protection override dump
buffer. The start address of the buffer is:
p_hwfn->cdev->dbg_features[DBG_FEATURE_PROTECTION_OVERRIDE].dump_buf
and the size of the buffer is buf_size in the same data structure.
The panic can be arrived at from either the qede Ethernet driver path:
[exception RIP: qed_grc_dump_addr_range+0x108]
qed_protection_override_dump at ffffffffc02662ed [qed]
qed_dbg_protection_override_dump at ffffffffc0267792 [qed]
qed_dbg_feature at ffffffffc026aa8f [qed]
qed_dbg_all_data at ffffffffc026b211 [qed]
qed_fw_fatal_reporter_dump at ffffffffc027298a [qed]
devlink_health_do_dump at ffffffff82497f61
devlink_health_report at ffffffff8249cf29
qed_report_fatal_error at ffffffffc0272baf [qed]
qede_sp_task at ffffffffc045ed32 [qede]
process_one_work at ffffffff81d19783
or the qedf storage driver path:
[exception RIP: qed_grc_dump_addr_range+0x108]
qed_protection_override_dump at ffffffffc068b2ed [qed]
qed_dbg_protection_override_dump at ffffffffc068c792 [qed]
qed_dbg_feature at ffffffffc068fa8f [qed]
qed_dbg_all_data at ffffffffc0690211 [qed]
qed_fw_fatal_reporter_dump at ffffffffc069798a [qed]
devlink_health_do_dump at ffffffff8aa95e51
devlink_health_report at ffffffff8aa9ae19
qed_report_fatal_error at ffffffffc0697baf [qed]
qed_hw_err_notify at ffffffffc06d32d7 [qed]
qed_spq_post at ffffffffc06b1011 [qed]
qed_fcoe_destroy_conn at ffffffffc06b2e91 [qed]
qedf_cleanup_fcport at ffffffffc05e7597 [qedf]
qedf_rport_event_handler at ffffffffc05e7bf7 [qedf]
fc_rport_work at ffffffffc02da715 [libfc]
process_one_work at ffffffff8a319663
Resolve this by clamping the firmware's return value to the maximum
number of legal elements the firmware should return.
Fixes: d52c89f120de8 ("qed*: Utilize FW 8.37.2.0")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
---
v2: Move the check to the correct place after discussion with Jakub
Kicinski and more debugging.
Add more detail about error symptoms to the commit description
also suggested by Jakub.
---
drivers/net/ethernet/qlogic/qed/qed_debug.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_debug.c b/drivers/net/ethernet/qlogic/qed/qed_debug.c
index 9c3d3dd2f84753100d3c639505677bd53e3ca543..1f0cea3cae92f542e0213af87e885406599980a9 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_debug.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_debug.c
@@ -4462,10 +4462,11 @@ static enum dbg_status qed_protection_override_dump(struct qed_hwfn *p_hwfn,
goto out;
}
- /* Add override window info to buffer */
+ /* Add override window info to buffer, preventing buffer overflow */
override_window_dwords =
- qed_rd(p_hwfn, p_ptt, GRC_REG_NUMBER_VALID_OVERRIDE_WINDOW) *
- PROTECTION_OVERRIDE_ELEMENT_DWORDS;
+ min(qed_rd(p_hwfn, p_ptt, GRC_REG_NUMBER_VALID_OVERRIDE_WINDOW) *
+ PROTECTION_OVERRIDE_ELEMENT_DWORDS,
+ PROTECTION_OVERRIDE_DEPTH_DWORDS);
if (override_window_dwords) {
addr = BYTES_TO_DWORDS(GRC_REG_PROTECTION_OVERRIDE_WINDOW);
offset += qed_grc_dump_addr_range(p_hwfn,
--
2.47.3
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [PATCH v2 net] qed: Don't collect too many protection override GRC elements
2025-09-10 6:29 [PATCH v2 net] qed: Don't collect too many protection override GRC elements Jamie Bainbridge
@ 2025-09-14 21:30 ` patchwork-bot+netdevbpf
0 siblings, 0 replies; 2+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-09-14 21:30 UTC (permalink / raw)
To: Jamie Bainbridge
Cc: nprabudoss, kuba, manishc, andrew+netdev, davem, edumazet, pabeni,
manish.rangankar, Michal.Kalderon, Ariel.Elior, netdev,
linux-kernel
Hello:
This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Wed, 10 Sep 2025 16:29:16 +1000 you wrote:
> In the protection override dump path, the firmware can return far too
> many GRC elements, resulting in attempting to write past the end of the
> previously-kmalloc'ed dump buffer.
>
> This will result in a kernel panic with reason:
>
> BUG: unable to handle kernel paging request at ADDRESS
>
> [...]
Here is the summary with links:
- [v2,net] qed: Don't collect too many protection override GRC elements
https://git.kernel.org/netdev/net/c/56c0a2a9ddc2
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2025-09-14 21:30 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-10 6:29 [PATCH v2 net] qed: Don't collect too many protection override GRC elements Jamie Bainbridge
2025-09-14 21:30 ` patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox