* [PATCH v7 0/2] bnx2x: Fix error recovering in switch configuration
@ 2024-02-01 17:48 Thinh Tran
2024-02-01 17:48 ` [PATCH v7 1/2] net/bnx2x: Prevent access to a freed page in page_pool Thinh Tran
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Thinh Tran @ 2024-02-01 17:48 UTC (permalink / raw)
To: kuba
Cc: netdev, aelior, davem, manishc, pabeni, skalluru, simon.horman,
edumazet, VENKATA.SAI.DUGGI, drc, abdhalee, thinhtr
Please refer to the initial cover letter
https://lore.kernel.org/all/20230818161443.708785-1-thinhtr@linux.vnet.ibm.com
In series Version 6, the patch
[v6,1/4] bnx2x: new flag for tracking HW resource
was successfully made it to the mainline kernel
https://github.com/torvalds/linux/commit/bf23ffc8a9a777dfdeb04232e0946b803adbb6a9
but the rest of the patches did not.
The following patch has been excluded from this series:
net/bnx2x: prevent excessive debug information during a TX timeout
based on concerns raised by some developers that it might omit valuable
debugging details as in some other scenarios may cause the TX timeout.
Hereby resubmitting the two patches below:
Thinh Tran (2):
net/bnx2x: Prevent access to a freed page in page_pool
net/bnx2x: refactor common code to bnx2x_stop_nic()
.../net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 28 +++++++++++--------
.../net/ethernet/broadcom/bnx2x/bnx2x_cmn.h | 7 +++--
.../net/ethernet/broadcom/bnx2x/bnx2x_main.c | 25 +++--------------
.../net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c | 12 ++------
4 files changed, 27 insertions(+), 45 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 14+ messages in thread* [PATCH v7 1/2] net/bnx2x: Prevent access to a freed page in page_pool 2024-02-01 17:48 [PATCH v7 0/2] bnx2x: Fix error recovering in switch configuration Thinh Tran @ 2024-02-01 17:48 ` Thinh Tran 2024-02-08 2:37 ` Jakub Kicinski 2024-02-01 17:48 ` [PATCH v7 2/2] net/bnx2x: refactor common code to bnx2x_stop_nic() Thinh Tran 2024-02-08 18:55 ` [PATCH v8 0/2] bnx2x: Fix error recovering in switch configuration Thinh Tran 2 siblings, 1 reply; 14+ messages in thread From: Thinh Tran @ 2024-02-01 17:48 UTC (permalink / raw) To: kuba Cc: netdev, aelior, davem, manishc, pabeni, skalluru, simon.horman, edumazet, VENKATA.SAI.DUGGI, drc, abdhalee, thinhtr Verify page pool allocations before freeing. Signed-off-by: Thinh Tran <thinhtr@linux.vnet.ibm.com> --- drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h index d8b1824c334d..0bc1367fd649 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h @@ -1002,9 +1002,6 @@ static inline void bnx2x_set_fw_mac_addr(__le16 *fw_hi, __le16 *fw_mid, static inline void bnx2x_free_rx_mem_pool(struct bnx2x *bp, struct bnx2x_alloc_pool *pool) { - if (!pool->page) - return; - put_page(pool->page); pool->page = NULL; @@ -1015,6 +1012,9 @@ static inline void bnx2x_free_rx_sge_range(struct bnx2x *bp, { int i; + if (!fp->page_pool.page) + return; + if (fp->mode == TPA_MODE_DISABLED) return; -- 2.25.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v7 1/2] net/bnx2x: Prevent access to a freed page in page_pool 2024-02-01 17:48 ` [PATCH v7 1/2] net/bnx2x: Prevent access to a freed page in page_pool Thinh Tran @ 2024-02-08 2:37 ` Jakub Kicinski 0 siblings, 0 replies; 14+ messages in thread From: Jakub Kicinski @ 2024-02-08 2:37 UTC (permalink / raw) To: Thinh Tran Cc: netdev, aelior, davem, manishc, pabeni, skalluru, simon.horman, edumazet, VENKATA.SAI.DUGGI, drc, abdhalee On Thu, 1 Feb 2024 11:48:21 -0600 Thinh Tran wrote: > Verify page pool allocations before freeing. Could you add an example stack trace or explain how the crash can occur? Could you add a Fixes tag? ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v7 2/2] net/bnx2x: refactor common code to bnx2x_stop_nic() 2024-02-01 17:48 [PATCH v7 0/2] bnx2x: Fix error recovering in switch configuration Thinh Tran 2024-02-01 17:48 ` [PATCH v7 1/2] net/bnx2x: Prevent access to a freed page in page_pool Thinh Tran @ 2024-02-01 17:48 ` Thinh Tran 2024-02-08 18:55 ` [PATCH v8 0/2] bnx2x: Fix error recovering in switch configuration Thinh Tran 2 siblings, 0 replies; 14+ messages in thread From: Thinh Tran @ 2024-02-01 17:48 UTC (permalink / raw) To: kuba Cc: netdev, aelior, davem, manishc, pabeni, skalluru, simon.horman, edumazet, VENKATA.SAI.DUGGI, drc, abdhalee, thinhtr Refactor common code which disables and releases HW interrupts, deletes NAPI objects, into a new bnx2x_stop_nic() function. Signed-off-by: Thinh Tran <thinhtr@linux.vnet.ibm.com> --- .../net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 28 +++++++++++-------- .../net/ethernet/broadcom/bnx2x/bnx2x_cmn.h | 1 + .../net/ethernet/broadcom/bnx2x/bnx2x_main.c | 25 +++-------------- .../net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c | 12 ++------ 4 files changed, 24 insertions(+), 42 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c index e9c1e1bb5580..e34aff5fb782 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c @@ -3097,17 +3097,8 @@ int bnx2x_nic_unload(struct bnx2x *bp, int unload_mode, bool keep_link) if (!CHIP_IS_E1x(bp)) bnx2x_pf_disable(bp); - if (!bp->nic_stopped) { - /* Disable HW interrupts, NAPI */ - bnx2x_netif_stop(bp, 1); - /* Delete all NAPI objects */ - bnx2x_del_all_napi(bp); - if (CNIC_LOADED(bp)) - bnx2x_del_all_napi_cnic(bp); - /* Release IRQs */ - bnx2x_free_irq(bp); - bp->nic_stopped = true; - } + /* Disable HW interrupts, delete NAPIs, Release IRQs */ + bnx2x_stop_nic(bp, 1); /* Report UNLOAD_DONE to MCP */ bnx2x_send_unload_done(bp, false); @@ -5139,3 +5130,18 @@ void bnx2x_schedule_sp_rtnl(struct bnx2x *bp, enum sp_rtnl_flag flag, flag); schedule_delayed_work(&bp->sp_rtnl_task, 0); } + +void bnx2x_stop_nic(struct bnx2x *bp, int disable_hw) +{ + if (!bp->nic_stopped) { + /* Disable HW interrupts, NAPI */ + bnx2x_netif_stop(bp, disable_hw); + /* Delete all NAPI objects */ + bnx2x_del_all_napi(bp); + if (CNIC_LOADED(bp)) + bnx2x_del_all_napi_cnic(bp); + /* Release IRQs */ + bnx2x_free_irq(bp); + bp->nic_stopped = true; + } +} diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h index 0bc1367fd649..a35a02299b33 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h @@ -553,6 +553,7 @@ void bnx2x_free_skbs(struct bnx2x *bp); void bnx2x_netif_stop(struct bnx2x *bp, int disable_hw); void bnx2x_netif_start(struct bnx2x *bp); int bnx2x_load_cnic(struct bnx2x *bp); +void bnx2x_stop_nic(struct bnx2x *bp, int disable_hw); /** * bnx2x_enable_msix - set msix configuration. diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c index 0d8e61c63c7c..ff75c883cffe 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c @@ -9474,18 +9474,8 @@ void bnx2x_chip_cleanup(struct bnx2x *bp, int unload_mode, bool keep_link) } } - if (!bp->nic_stopped) { - /* Disable HW interrupts, NAPI */ - bnx2x_netif_stop(bp, 1); - /* Delete all NAPI objects */ - bnx2x_del_all_napi(bp); - if (CNIC_LOADED(bp)) - bnx2x_del_all_napi_cnic(bp); - - /* Release IRQs */ - bnx2x_free_irq(bp); - bp->nic_stopped = true; - } + /* Disable HW interrupts, delete NAPIs, Release IRQs */ + bnx2x_stop_nic(bp, 1); /* Reset the chip, unless PCI function is offline. If we reach this * point following a PCI error handling, it means device is really @@ -14241,16 +14231,9 @@ static pci_ers_result_t bnx2x_io_slot_reset(struct pci_dev *pdev) } bnx2x_drain_tx_queues(bp); bnx2x_send_unload_req(bp, UNLOAD_RECOVERY); - if (!bp->nic_stopped) { - bnx2x_netif_stop(bp, 1); - bnx2x_del_all_napi(bp); - if (CNIC_LOADED(bp)) - bnx2x_del_all_napi_cnic(bp); - - bnx2x_free_irq(bp); - bp->nic_stopped = true; - } + /* Disable HW interrupts, delete NAPIs, Release IRQs */ + bnx2x_stop_nic(bp, 1); /* Report UNLOAD_DONE to MCP */ bnx2x_send_unload_done(bp, true); diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c index 8946a931e87e..e92e82423096 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c @@ -529,16 +529,8 @@ void bnx2x_vfpf_close_vf(struct bnx2x *bp) bnx2x_vfpf_finalize(bp, &req->first_tlv); free_irq: - if (!bp->nic_stopped) { - /* Disable HW interrupts, NAPI */ - bnx2x_netif_stop(bp, 0); - /* Delete all NAPI objects */ - bnx2x_del_all_napi(bp); - - /* Release IRQs */ - bnx2x_free_irq(bp); - bp->nic_stopped = true; - } + /* Disable HW interrupts, delete NAPIs, Release IRQs */ + bnx2x_stop_nic(bp, 0); } static void bnx2x_leading_vfq_init(struct bnx2x *bp, struct bnx2x_virtf *vf, -- 2.25.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v8 0/2] bnx2x: Fix error recovering in switch configuration 2024-02-01 17:48 [PATCH v7 0/2] bnx2x: Fix error recovering in switch configuration Thinh Tran 2024-02-01 17:48 ` [PATCH v7 1/2] net/bnx2x: Prevent access to a freed page in page_pool Thinh Tran 2024-02-01 17:48 ` [PATCH v7 2/2] net/bnx2x: refactor common code to bnx2x_stop_nic() Thinh Tran @ 2024-02-08 18:55 ` Thinh Tran 2024-02-08 18:55 ` [PATCH v8 1/2] net/bnx2x: Prevent access to a freed page in page_pool Thinh Tran ` (2 more replies) 2 siblings, 3 replies; 14+ messages in thread From: Thinh Tran @ 2024-02-08 18:55 UTC (permalink / raw) To: kuba Cc: netdev, aelior, davem, manishc, pabeni, skalluru, simon.horman, edumazet, VENKATA.SAI.DUGGI, drc, abdhalee, Thinh Tran Please refer to the initial cover letter https://lore.kernel.org/all/20230818161443.708785-1-thinhtr@linux.vnet.ibm.com In series Version 6, the patch [v6,1/4] bnx2x: new flag for tracking HW resource was successfully made it to the mainline kernel https://github.com/torvalds/linux/commit/bf23ffc8a9a777dfdeb04232e0946b803adbb6a9 but the rest of the patches did not. The following patch has been excluded from this series: net/bnx2x: prevent excessive debug information during a TX timeout based on concerns raised by some developers that it might omit valuable debugging details as in some other scenarios may cause the TX timeout. v8: adding stack trace to commit messages for patch net/bnx2x: Prevent access to a freed page in page_pool v7: resubmitting patches. Hereby resubmitting the two patches below: Thinh Tran (2): net/bnx2x: Prevent access to a freed page in page_pool net/bnx2x: refactor common code to bnx2x_stop_nic() .../net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 28 +++++++++++-------- .../net/ethernet/broadcom/bnx2x/bnx2x_cmn.h | 7 +++-- .../net/ethernet/broadcom/bnx2x/bnx2x_main.c | 25 +++-------------- .../net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c | 12 ++------ 4 files changed, 27 insertions(+), 45 deletions(-) -- 2.25.1 ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v8 1/2] net/bnx2x: Prevent access to a freed page in page_pool 2024-02-08 18:55 ` [PATCH v8 0/2] bnx2x: Fix error recovering in switch configuration Thinh Tran @ 2024-02-08 18:55 ` Thinh Tran 2024-02-08 19:14 ` Thinh Tran ` (2 more replies) 2024-02-08 18:55 ` [PATCH v8 2/2] net/bnx2x: refactor common code to bnx2x_stop_nic() Thinh Tran 2024-02-12 20:47 ` [PATCH v9 0/2] bnx2x: Fix error recovering in switch configuration Thinh Tran 2 siblings, 3 replies; 14+ messages in thread From: Thinh Tran @ 2024-02-08 18:55 UTC (permalink / raw) To: kuba Cc: netdev, aelior, davem, manishc, pabeni, skalluru, simon.horman, edumazet, VENKATA.SAI.DUGGI, drc, abdhalee, Thinh Tran Fix race condition leading to system crash during EEH error handling During EEH error recovery, the bnx2x driver's transmit timeout logic could cause a race condition when handling reset tasks. The bnx2x_tx_timeout() schedules reset tasks via bnx2x_sp_rtnl_task(), which ultimately leads to bnx2x_nic_unload(). In bnx2x_nic_unload() SGEs are freed using bnx2x_free_rx_sge_range(). However, this could overlap with the EEH driver's attempt to reset the device using bnx2x_io_slot_reset(), which also frees SGEs. This race condition can result in system crashes due to accessing freed memory locations. [ 793.003930] EEH: Beginning: 'slot_reset' [ 793.003937] PCI 0011:01:00.0#10000: EEH: Invoking bnx2x->slot_reset() [ 793.003939] bnx2x: [bnx2x_io_slot_reset:14228(eth1)]IO slot reset initializing... [ 793.004037] bnx2x 0011:01:00.0: enabling device (0140 -> 0142) [ 793.008839] bnx2x: [bnx2x_io_slot_reset:14244(eth1)]IO slot reset --> driver unload [ 793.122134] Kernel attempted to read user page (0) - exploit attempt? (uid: 0) [ 793.122143] BUG: Kernel NULL pointer dereference on read at 0x00000000 [ 793.122147] Faulting instruction address: 0xc0080000025065fc [ 793.122152] Oops: Kernel access of bad area, sig: 11 [#1] ..... [ 793.122315] Call Trace: [ 793.122318] [c000000003c67a20] [c00800000250658c] bnx2x_io_slot_reset+0x204/0x610 [bnx2x] (unreliable) [ 793.122331] [c000000003c67af0] [c0000000000518a8] eeh_report_reset+0xb8/0xf0 [ 793.122338] [c000000003c67b60] [c000000000052130] eeh_pe_report+0x180/0x550 [ 793.122342] [c000000003c67c70] [c00000000005318c] eeh_handle_normal_event+0x84c/0xa60 [ 793.122347] [c000000003c67d50] [c000000000053a84] eeh_event_handler+0xf4/0x170 [ 793.122352] [c000000003c67da0] [c000000000194c58] kthread+0x1c8/0x1d0 [ 793.122356] [c000000003c67e10] [c00000000000cf64] ret_from_kernel_thread+0x5c/0x64 To solve this issue, we need to verify page pool allocations before freeing. Signed-off-by: Thinh Tran <thinhtr@linux.vnet.ibm.com> --- drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h index d8b1824c334d..0bc1367fd649 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h @@ -1002,9 +1002,6 @@ static inline void bnx2x_set_fw_mac_addr(__le16 *fw_hi, __le16 *fw_mid, static inline void bnx2x_free_rx_mem_pool(struct bnx2x *bp, struct bnx2x_alloc_pool *pool) { - if (!pool->page) - return; - put_page(pool->page); pool->page = NULL; @@ -1015,6 +1012,9 @@ static inline void bnx2x_free_rx_sge_range(struct bnx2x *bp, { int i; + if (!fp->page_pool.page) + return; + if (fp->mode == TPA_MODE_DISABLED) return; -- 2.25.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v8 1/2] net/bnx2x: Prevent access to a freed page in page_pool 2024-02-08 18:55 ` [PATCH v8 1/2] net/bnx2x: Prevent access to a freed page in page_pool Thinh Tran @ 2024-02-08 19:14 ` Thinh Tran 2024-02-08 19:18 ` Thinh Tran 2024-02-12 20:47 ` [PATCH v9 " Thinh Tran 2 siblings, 0 replies; 14+ messages in thread From: Thinh Tran @ 2024-02-08 19:14 UTC (permalink / raw) To: kuba Cc: netdev, aelior, davem, manishc, pabeni, skalluru, simon.horman, edumazet, VENKATA.SAI.DUGGI, drc, abdhalee Fixes: 4cace675d687 ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v8 1/2] net/bnx2x: Prevent access to a freed page in page_pool 2024-02-08 18:55 ` [PATCH v8 1/2] net/bnx2x: Prevent access to a freed page in page_pool Thinh Tran 2024-02-08 19:14 ` Thinh Tran @ 2024-02-08 19:18 ` Thinh Tran 2024-02-09 1:29 ` Jakub Kicinski 2024-02-12 20:47 ` [PATCH v9 " Thinh Tran 2 siblings, 1 reply; 14+ messages in thread From: Thinh Tran @ 2024-02-08 19:18 UTC (permalink / raw) To: kuba Cc: netdev, aelior, davem, manishc, pabeni, skalluru, simon.horman, edumazet, VENKATA.SAI.DUGGI, drc, abdhalee Fixes: 4cace675d687 ("bnx2x: Alloc 4k fragment for each rx ring buffer element") ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v8 1/2] net/bnx2x: Prevent access to a freed page in page_pool 2024-02-08 19:18 ` Thinh Tran @ 2024-02-09 1:29 ` Jakub Kicinski 2024-02-12 20:47 ` Thinh Tran 0 siblings, 1 reply; 14+ messages in thread From: Jakub Kicinski @ 2024-02-09 1:29 UTC (permalink / raw) To: Thinh Tran Cc: netdev, aelior, davem, manishc, pabeni, skalluru, simon.horman, edumazet, VENKATA.SAI.DUGGI, drc, abdhalee On Thu, 8 Feb 2024 13:18:14 -0600 Thinh Tran wrote: > Fixes: 4cace675d687 ("bnx2x: Alloc 4k fragment for each rx ring buffer > element") The Fixes tag should be on one line, without wrapping. Please post a v9 with the tag included, as a new thread. Don't use --in-reply-to on netdev (sorry for so many rules..) ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v8 1/2] net/bnx2x: Prevent access to a freed page in page_pool 2024-02-09 1:29 ` Jakub Kicinski @ 2024-02-12 20:47 ` Thinh Tran 0 siblings, 0 replies; 14+ messages in thread From: Thinh Tran @ 2024-02-12 20:47 UTC (permalink / raw) To: Jakub Kicinski Cc: netdev, aelior, davem, manishc, pabeni, skalluru, simon.horman, edumazet, VENKATA.SAI.DUGGI, drc, abdhalee On 2/8/2024 7:29 PM, Jakub Kicinski wrote: > On Thu, 8 Feb 2024 13:18:14 -0600 Thinh Tran wrote: >> Fixes: 4cace675d687 ("bnx2x: Alloc 4k fragment for each rx ring buffer >> element") > > The Fixes tag should be on one line, without wrapping. > Please post a v9 with the tag included, as a new thread. > Don't use --in-reply-to on netdev (sorry for so many rules..) Will do. Thank you. ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v9 1/2] net/bnx2x: Prevent access to a freed page in page_pool 2024-02-08 18:55 ` [PATCH v8 1/2] net/bnx2x: Prevent access to a freed page in page_pool Thinh Tran 2024-02-08 19:14 ` Thinh Tran 2024-02-08 19:18 ` Thinh Tran @ 2024-02-12 20:47 ` Thinh Tran 2 siblings, 0 replies; 14+ messages in thread From: Thinh Tran @ 2024-02-12 20:47 UTC (permalink / raw) To: kuba Cc: netdev, davem, manishc, pabeni, skalluru, simon.horman, edumazet, VENKATA.SAI.DUGGI, drc, abdhalee, thinhtr Fix race condition leading to system crash during EEH error handling During EEH error recovery, the bnx2x driver's transmit timeout logic could cause a race condition when handling reset tasks. The bnx2x_tx_timeout() schedules reset tasks via bnx2x_sp_rtnl_task(), which ultimately leads to bnx2x_nic_unload(). In bnx2x_nic_unload() SGEs are freed using bnx2x_free_rx_sge_range(). However, this could overlap with the EEH driver's attempt to reset the device using bnx2x_io_slot_reset(), which also frees SGEs. This race condition can result in system crashes due to accessing freed memory locations. [ 793.003930] EEH: Beginning: 'slot_reset' [ 793.003937] PCI 0011:01:00.0#10000: EEH: Invoking bnx2x->slot_reset() [ 793.003939] bnx2x: [bnx2x_io_slot_reset:14228(eth1)]IO slot reset initializing... [ 793.004037] bnx2x 0011:01:00.0: enabling device (0140 -> 0142) [ 793.008839] bnx2x: [bnx2x_io_slot_reset:14244(eth1)]IO slot reset --> driver unload [ 793.122134] Kernel attempted to read user page (0) - exploit attempt? (uid: 0) [ 793.122143] BUG: Kernel NULL pointer dereference on read at 0x00000000 [ 793.122147] Faulting instruction address: 0xc0080000025065fc [ 793.122152] Oops: Kernel access of bad area, sig: 11 [#1] ..... [ 793.122315] Call Trace: [ 793.122318] [c000000003c67a20] [c00800000250658c] bnx2x_io_slot_reset+0x204/0x610 [bnx2x] (unreliable) [ 793.122331] [c000000003c67af0] [c0000000000518a8] eeh_report_reset+0xb8/0xf0 [ 793.122338] [c000000003c67b60] [c000000000052130] eeh_pe_report+0x180/0x550 [ 793.122342] [c000000003c67c70] [c00000000005318c] eeh_handle_normal_event+0x84c/0xa60 [ 793.122347] [c000000003c67d50] [c000000000053a84] eeh_event_handler+0xf4/0x170 [ 793.122352] [c000000003c67da0] [c000000000194c58] kthread+0x1c8/0x1d0 [ 793.122356] [c000000003c67e10] [c00000000000cf64] ret_from_kernel_thread+0x5c/0x64 To solve this issue, we need to verify page pool allocations before freeing. Fixes: 4cace675d687 ("bnx2x: Alloc 4k fragment for each rx ring buffer element") Signed-off-by: Thinh Tran <thinhtr@linux.vnet.ibm.com> --- drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h index d8b1824c334d..0bc1367fd649 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h @@ -1002,9 +1002,6 @@ static inline void bnx2x_set_fw_mac_addr(__le16 *fw_hi, __le16 *fw_mid, static inline void bnx2x_free_rx_mem_pool(struct bnx2x *bp, struct bnx2x_alloc_pool *pool) { - if (!pool->page) - return; - put_page(pool->page); pool->page = NULL; @@ -1015,6 +1012,9 @@ static inline void bnx2x_free_rx_sge_range(struct bnx2x *bp, { int i; + if (!fp->page_pool.page) + return; + if (fp->mode == TPA_MODE_DISABLED) return; -- 2.25.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v8 2/2] net/bnx2x: refactor common code to bnx2x_stop_nic() 2024-02-08 18:55 ` [PATCH v8 0/2] bnx2x: Fix error recovering in switch configuration Thinh Tran 2024-02-08 18:55 ` [PATCH v8 1/2] net/bnx2x: Prevent access to a freed page in page_pool Thinh Tran @ 2024-02-08 18:55 ` Thinh Tran 2024-02-12 20:47 ` [PATCH v9 " Thinh Tran 2024-02-12 20:47 ` [PATCH v9 0/2] bnx2x: Fix error recovering in switch configuration Thinh Tran 2 siblings, 1 reply; 14+ messages in thread From: Thinh Tran @ 2024-02-08 18:55 UTC (permalink / raw) To: kuba Cc: netdev, aelior, davem, manishc, pabeni, skalluru, simon.horman, edumazet, VENKATA.SAI.DUGGI, drc, abdhalee, Thinh Tran Refactor common code which disables and releases HW interrupts, deletes NAPI objects, into a new bnx2x_stop_nic() function. Signed-off-by: Thinh Tran <thinhtr@linux.vnet.ibm.com> --- .../net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 28 +++++++++++-------- .../net/ethernet/broadcom/bnx2x/bnx2x_cmn.h | 1 + .../net/ethernet/broadcom/bnx2x/bnx2x_main.c | 25 +++-------------- .../net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c | 12 ++------ 4 files changed, 24 insertions(+), 42 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c index e9c1e1bb5580..e34aff5fb782 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c @@ -3097,17 +3097,8 @@ int bnx2x_nic_unload(struct bnx2x *bp, int unload_mode, bool keep_link) if (!CHIP_IS_E1x(bp)) bnx2x_pf_disable(bp); - if (!bp->nic_stopped) { - /* Disable HW interrupts, NAPI */ - bnx2x_netif_stop(bp, 1); - /* Delete all NAPI objects */ - bnx2x_del_all_napi(bp); - if (CNIC_LOADED(bp)) - bnx2x_del_all_napi_cnic(bp); - /* Release IRQs */ - bnx2x_free_irq(bp); - bp->nic_stopped = true; - } + /* Disable HW interrupts, delete NAPIs, Release IRQs */ + bnx2x_stop_nic(bp, 1); /* Report UNLOAD_DONE to MCP */ bnx2x_send_unload_done(bp, false); @@ -5139,3 +5130,18 @@ void bnx2x_schedule_sp_rtnl(struct bnx2x *bp, enum sp_rtnl_flag flag, flag); schedule_delayed_work(&bp->sp_rtnl_task, 0); } + +void bnx2x_stop_nic(struct bnx2x *bp, int disable_hw) +{ + if (!bp->nic_stopped) { + /* Disable HW interrupts, NAPI */ + bnx2x_netif_stop(bp, disable_hw); + /* Delete all NAPI objects */ + bnx2x_del_all_napi(bp); + if (CNIC_LOADED(bp)) + bnx2x_del_all_napi_cnic(bp); + /* Release IRQs */ + bnx2x_free_irq(bp); + bp->nic_stopped = true; + } +} diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h index 0bc1367fd649..a35a02299b33 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h @@ -553,6 +553,7 @@ void bnx2x_free_skbs(struct bnx2x *bp); void bnx2x_netif_stop(struct bnx2x *bp, int disable_hw); void bnx2x_netif_start(struct bnx2x *bp); int bnx2x_load_cnic(struct bnx2x *bp); +void bnx2x_stop_nic(struct bnx2x *bp, int disable_hw); /** * bnx2x_enable_msix - set msix configuration. diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c index 0d8e61c63c7c..ff75c883cffe 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c @@ -9474,18 +9474,8 @@ void bnx2x_chip_cleanup(struct bnx2x *bp, int unload_mode, bool keep_link) } } - if (!bp->nic_stopped) { - /* Disable HW interrupts, NAPI */ - bnx2x_netif_stop(bp, 1); - /* Delete all NAPI objects */ - bnx2x_del_all_napi(bp); - if (CNIC_LOADED(bp)) - bnx2x_del_all_napi_cnic(bp); - - /* Release IRQs */ - bnx2x_free_irq(bp); - bp->nic_stopped = true; - } + /* Disable HW interrupts, delete NAPIs, Release IRQs */ + bnx2x_stop_nic(bp, 1); /* Reset the chip, unless PCI function is offline. If we reach this * point following a PCI error handling, it means device is really @@ -14241,16 +14231,9 @@ static pci_ers_result_t bnx2x_io_slot_reset(struct pci_dev *pdev) } bnx2x_drain_tx_queues(bp); bnx2x_send_unload_req(bp, UNLOAD_RECOVERY); - if (!bp->nic_stopped) { - bnx2x_netif_stop(bp, 1); - bnx2x_del_all_napi(bp); - if (CNIC_LOADED(bp)) - bnx2x_del_all_napi_cnic(bp); - - bnx2x_free_irq(bp); - bp->nic_stopped = true; - } + /* Disable HW interrupts, delete NAPIs, Release IRQs */ + bnx2x_stop_nic(bp, 1); /* Report UNLOAD_DONE to MCP */ bnx2x_send_unload_done(bp, true); diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c index 8946a931e87e..e92e82423096 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c @@ -529,16 +529,8 @@ void bnx2x_vfpf_close_vf(struct bnx2x *bp) bnx2x_vfpf_finalize(bp, &req->first_tlv); free_irq: - if (!bp->nic_stopped) { - /* Disable HW interrupts, NAPI */ - bnx2x_netif_stop(bp, 0); - /* Delete all NAPI objects */ - bnx2x_del_all_napi(bp); - - /* Release IRQs */ - bnx2x_free_irq(bp); - bp->nic_stopped = true; - } + /* Disable HW interrupts, delete NAPIs, Release IRQs */ + bnx2x_stop_nic(bp, 0); } static void bnx2x_leading_vfq_init(struct bnx2x *bp, struct bnx2x_virtf *vf, -- 2.25.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v9 2/2] net/bnx2x: refactor common code to bnx2x_stop_nic() 2024-02-08 18:55 ` [PATCH v8 2/2] net/bnx2x: refactor common code to bnx2x_stop_nic() Thinh Tran @ 2024-02-12 20:47 ` Thinh Tran 0 siblings, 0 replies; 14+ messages in thread From: Thinh Tran @ 2024-02-12 20:47 UTC (permalink / raw) To: kuba Cc: netdev, davem, manishc, pabeni, skalluru, simon.horman, edumazet, VENKATA.SAI.DUGGI, drc, abdhalee, thinhtr Refactor common code which disables and releases HW interrupts, deletes NAPI objects, into a new bnx2x_stop_nic() function. Signed-off-by: Thinh Tran <thinhtr@linux.vnet.ibm.com> --- .../net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 28 +++++++++++-------- .../net/ethernet/broadcom/bnx2x/bnx2x_cmn.h | 1 + .../net/ethernet/broadcom/bnx2x/bnx2x_main.c | 25 +++-------------- .../net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c | 12 ++------ 4 files changed, 24 insertions(+), 42 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c index e9c1e1bb5580..e34aff5fb782 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c @@ -3097,17 +3097,8 @@ int bnx2x_nic_unload(struct bnx2x *bp, int unload_mode, bool keep_link) if (!CHIP_IS_E1x(bp)) bnx2x_pf_disable(bp); - if (!bp->nic_stopped) { - /* Disable HW interrupts, NAPI */ - bnx2x_netif_stop(bp, 1); - /* Delete all NAPI objects */ - bnx2x_del_all_napi(bp); - if (CNIC_LOADED(bp)) - bnx2x_del_all_napi_cnic(bp); - /* Release IRQs */ - bnx2x_free_irq(bp); - bp->nic_stopped = true; - } + /* Disable HW interrupts, delete NAPIs, Release IRQs */ + bnx2x_stop_nic(bp, 1); /* Report UNLOAD_DONE to MCP */ bnx2x_send_unload_done(bp, false); @@ -5139,3 +5130,18 @@ void bnx2x_schedule_sp_rtnl(struct bnx2x *bp, enum sp_rtnl_flag flag, flag); schedule_delayed_work(&bp->sp_rtnl_task, 0); } + +void bnx2x_stop_nic(struct bnx2x *bp, int disable_hw) +{ + if (!bp->nic_stopped) { + /* Disable HW interrupts, NAPI */ + bnx2x_netif_stop(bp, disable_hw); + /* Delete all NAPI objects */ + bnx2x_del_all_napi(bp); + if (CNIC_LOADED(bp)) + bnx2x_del_all_napi_cnic(bp); + /* Release IRQs */ + bnx2x_free_irq(bp); + bp->nic_stopped = true; + } +} diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h index 0bc1367fd649..a35a02299b33 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h @@ -553,6 +553,7 @@ void bnx2x_free_skbs(struct bnx2x *bp); void bnx2x_netif_stop(struct bnx2x *bp, int disable_hw); void bnx2x_netif_start(struct bnx2x *bp); int bnx2x_load_cnic(struct bnx2x *bp); +void bnx2x_stop_nic(struct bnx2x *bp, int disable_hw); /** * bnx2x_enable_msix - set msix configuration. diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c index 0d8e61c63c7c..ff75c883cffe 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c @@ -9474,18 +9474,8 @@ void bnx2x_chip_cleanup(struct bnx2x *bp, int unload_mode, bool keep_link) } } - if (!bp->nic_stopped) { - /* Disable HW interrupts, NAPI */ - bnx2x_netif_stop(bp, 1); - /* Delete all NAPI objects */ - bnx2x_del_all_napi(bp); - if (CNIC_LOADED(bp)) - bnx2x_del_all_napi_cnic(bp); - - /* Release IRQs */ - bnx2x_free_irq(bp); - bp->nic_stopped = true; - } + /* Disable HW interrupts, delete NAPIs, Release IRQs */ + bnx2x_stop_nic(bp, 1); /* Reset the chip, unless PCI function is offline. If we reach this * point following a PCI error handling, it means device is really @@ -14241,16 +14231,9 @@ static pci_ers_result_t bnx2x_io_slot_reset(struct pci_dev *pdev) } bnx2x_drain_tx_queues(bp); bnx2x_send_unload_req(bp, UNLOAD_RECOVERY); - if (!bp->nic_stopped) { - bnx2x_netif_stop(bp, 1); - bnx2x_del_all_napi(bp); - if (CNIC_LOADED(bp)) - bnx2x_del_all_napi_cnic(bp); - - bnx2x_free_irq(bp); - bp->nic_stopped = true; - } + /* Disable HW interrupts, delete NAPIs, Release IRQs */ + bnx2x_stop_nic(bp, 1); /* Report UNLOAD_DONE to MCP */ bnx2x_send_unload_done(bp, true); diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c index 8946a931e87e..e92e82423096 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c @@ -529,16 +529,8 @@ void bnx2x_vfpf_close_vf(struct bnx2x *bp) bnx2x_vfpf_finalize(bp, &req->first_tlv); free_irq: - if (!bp->nic_stopped) { - /* Disable HW interrupts, NAPI */ - bnx2x_netif_stop(bp, 0); - /* Delete all NAPI objects */ - bnx2x_del_all_napi(bp); - - /* Release IRQs */ - bnx2x_free_irq(bp); - bp->nic_stopped = true; - } + /* Disable HW interrupts, delete NAPIs, Release IRQs */ + bnx2x_stop_nic(bp, 0); } static void bnx2x_leading_vfq_init(struct bnx2x *bp, struct bnx2x_virtf *vf, -- 2.25.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v9 0/2] bnx2x: Fix error recovering in switch configuration 2024-02-08 18:55 ` [PATCH v8 0/2] bnx2x: Fix error recovering in switch configuration Thinh Tran 2024-02-08 18:55 ` [PATCH v8 1/2] net/bnx2x: Prevent access to a freed page in page_pool Thinh Tran 2024-02-08 18:55 ` [PATCH v8 2/2] net/bnx2x: refactor common code to bnx2x_stop_nic() Thinh Tran @ 2024-02-12 20:47 ` Thinh Tran 2 siblings, 0 replies; 14+ messages in thread From: Thinh Tran @ 2024-02-12 20:47 UTC (permalink / raw) To: kuba Cc: netdev, davem, manishc, pabeni, skalluru, simon.horman, edumazet, VENKATA.SAI.DUGGI, drc, abdhalee, thinhtr Please refer to the initial cover letter https://lore.kernel.org/all/20230818161443.708785-1-thinhtr@linux.vnet.ibm.com In series Version 6, the patch [v6,1/4] bnx2x: new flag for tracking HW resource was successfully made it to the mainline kernel https://github.com/torvalds/linux/commit/bf23ffc8a9a777dfdeb04232e0946b803adbb6a9 but the rest of the patches did not. The following patch has been excluded from this series: net/bnx2x: prevent excessive debug information during a TX timeout based on concerns raised by some developers that it might omit valuable debugging details as in some other scenarios may cause the TX timeout. v9: adding "Fixes:" tag to commit messages for patch net/bnx2x: Prevent access to a freed page in page_pool v8: adding stack trace to commit messages for patch net/bnx2x: Prevent access to a freed page in page_pool v7: resubmitting patches. Hereby resubmitting the two patches below: Thinh Tran (2): net/bnx2x: Prevent access to a freed page in page_pool net/bnx2x: refactor common code to bnx2x_stop_nic() .../net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 28 +++++++++++-------- .../net/ethernet/broadcom/bnx2x/bnx2x_cmn.h | 7 +++-- .../net/ethernet/broadcom/bnx2x/bnx2x_main.c | 25 +++-------------- .../net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c | 12 ++------ 4 files changed, 27 insertions(+), 45 deletions(-) -- 2.25.1 ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2024-02-12 20:48 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-02-01 17:48 [PATCH v7 0/2] bnx2x: Fix error recovering in switch configuration Thinh Tran 2024-02-01 17:48 ` [PATCH v7 1/2] net/bnx2x: Prevent access to a freed page in page_pool Thinh Tran 2024-02-08 2:37 ` Jakub Kicinski 2024-02-01 17:48 ` [PATCH v7 2/2] net/bnx2x: refactor common code to bnx2x_stop_nic() Thinh Tran 2024-02-08 18:55 ` [PATCH v8 0/2] bnx2x: Fix error recovering in switch configuration Thinh Tran 2024-02-08 18:55 ` [PATCH v8 1/2] net/bnx2x: Prevent access to a freed page in page_pool Thinh Tran 2024-02-08 19:14 ` Thinh Tran 2024-02-08 19:18 ` Thinh Tran 2024-02-09 1:29 ` Jakub Kicinski 2024-02-12 20:47 ` Thinh Tran 2024-02-12 20:47 ` [PATCH v9 " Thinh Tran 2024-02-08 18:55 ` [PATCH v8 2/2] net/bnx2x: refactor common code to bnx2x_stop_nic() Thinh Tran 2024-02-12 20:47 ` [PATCH v9 " Thinh Tran 2024-02-12 20:47 ` [PATCH v9 0/2] bnx2x: Fix error recovering in switch configuration Thinh Tran
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).