* [Patch v6 0/4] bnx2x: Fix error recovering in switch configuration
2023-07-28 21:11 [PATCH v4] " Thinh Tran
@ 2023-08-18 16:14 ` Thinh Tran
0 siblings, 0 replies; 6+ messages in thread
From: Thinh Tran @ 2023-08-18 16:14 UTC (permalink / raw)
To: kuba
Cc: aelior, davem, edumazet, manishc, netdev, pabeni, skalluru,
VENKATA.SAI.DUGGI, Thinh Tran, Abdul Haleem, David Christensen,
Simon Horman, Venkata Sai Duggi
While injecting PCIe errors to the upstream PCIe switch of
a BCM57810 NIC, system hangs/crashes were observed.
After several calls to bnx2x_tx_timeout() complete,
bnx2x_nic_unload() is called to free up HW resources
and bnx2x_napi_disable() is called to release NAPI objects.
Later, when the EEH driver calls bnx2x_io_slot_reset() to
complete the recovery process, bnx2x attempts to disable
NAPI again by calling bnx2x_napi_disable() and to free
resources that have already been freed, resulting in a
hang or crash.
This patch set introduces a new flag to track the HW
resource and NAPI allocation state, refactors duplicated
code into a single function, checks page pool allocation
status before freeing, and reduces debug output when
a TX timeout event occurs.
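The flag-based guard described above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the driver code: struct fake_nic, fake_nic_load(), fake_release_resources() and fake_stop_nic() are hypothetical names, and the counters stand in for the real IRQ and NAPI teardown.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's per-device state. */
struct fake_nic {
	bool nic_stopped;   /* true once IRQs/NAPI have been released */
	int napi_enabled;   /* 1 while NAPI is enabled (illustrative) */
	int teardown_calls; /* how many times teardown actually ran */
};

/* Bring the device up: resources are allocated, flag is cleared. */
static void fake_nic_load(struct fake_nic *nic)
{
	nic->napi_enabled = 1;
	nic->nic_stopped = false;
}

/* Unguarded teardown: must never run twice for one load. */
static void fake_release_resources(struct fake_nic *nic)
{
	assert(nic->napi_enabled == 1); /* would trip on a double release */
	nic->napi_enabled = 0;
	nic->teardown_calls++;
}

/* Guarded teardown, callable from both the unload and recovery paths. */
static void fake_stop_nic(struct fake_nic *nic)
{
	if (nic->nic_stopped)
		return; /* already torn down, nothing to do */
	fake_release_resources(nic);
	nic->nic_stopped = true;
}
```

With this shape, a second call to fake_stop_nic() after the unload path has already run returns immediately instead of releasing the same resources again.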
Signed-off-by: Thinh Tran <thinhtr@linux.vnet.ibm.com>
Reviewed-by: Manish Chopra <manishc@marvell.com>
Tested-by: Abdul Haleem <abdhalee@in.ibm.com>
Tested-by: David Christensen <drc@linux.vnet.ibm.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Venkata Sai Duggi <venkata.sai.duggi@ibm.com>
v6:
- Clarifying and updating commit messages
v5:
- Breaking down into a series of individual patches
v4:
- factoring common code into new function bnx2x_stop_nic()
that disables and releases IRQs and NAPIs
v3:
- no changes, just rebased onto the latest driver level
- updated the Reviewed-by from Manish (October 2022)
v2:
- Check the state of the NIC before disabling NAPI
and freeing the IRQ
- Prevent recurrence of TX timeout by turning off the carrier,
calling netif_carrier_off() in bnx2x_tx_timeout()
- Check and bail out early if fp->page_pool already freed
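The "check and bail out early" item can be illustrated with a small userspace C sketch. It assumes a NULL pointer marks a pool that is not allocated or was already freed; struct fake_fastpath and fake_free_page_pool() are hypothetical names, and malloc()/free() stand in for the driver's real page allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a fastpath structure owning a page pool. */
struct fake_fastpath {
	void *page_pool; /* NULL when not allocated or already freed */
};

/* Returns 1 if memory was freed, 0 if there was nothing to free. */
static int fake_free_page_pool(struct fake_fastpath *fp)
{
	assert(fp != NULL);

	if (!fp->page_pool)
		return 0; /* bail out early: pool already freed */

	free(fp->page_pool);
	fp->page_pool = NULL; /* make any later call a no-op */
	return 1;
}
```

Clearing the pointer after freeing is what makes the early check meaningful on a second call.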
Thinh Tran (4):
bnx2x: new the bp->nic_stopped variable for checking NIC status
bnx2x: factor out common code to bnx2x_stop_nic()
bnx2x: Prevent access to a freed page in page_pool
bnx2x: prevent excessive debug information during a TX timeout
drivers/net/ethernet/broadcom/bnx2x/bnx2x.h | 2 ++
.../net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 33 ++++++++++++++-----
.../net/ethernet/broadcom/bnx2x/bnx2x_cmn.h | 4 +++
.../net/ethernet/broadcom/bnx2x/bnx2x_main.c | 26 +++------------
.../net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c | 9 ++---
5 files changed, 37 insertions(+), 37 deletions(-)
--
2.27.0
* Re: [Patch v6 0/4] bnx2x: Fix error recovering in switch configuration
@ 2023-11-16 16:08 Thinh Tran
2023-11-16 20:56 ` Simon Horman
2024-01-17 21:56 ` Thinh Tran
0 siblings, 2 replies; 6+ messages in thread
From: Thinh Tran @ 2023-11-16 16:08 UTC (permalink / raw)
To: Jakub Kicinski
Cc: aelior, davem, edumazet, manishc, netdev, pabeni, skalluru,
VENKATA.SAI.DUGGI, Thinh Tran, Abdul Haleem, David Christensen,
Simon Horman
Hi,
Could we proceed with advancing these patches? They've been in the
"Awaiting Upstream" state for a while now. Notably, one of them has
successfully made it to the mainline kernel:
[v6,1/4] bnx2x: new flag for tracking HW resource
https://github.com/torvalds/linux/commit/bf23ffc8a9a777dfdeb04232e0946b803adbb6a9
When testing the latest kernel, we still encounter crashes due to
the absence of one of the patches:
[v6,3/4] bnx2x: Prevent access to a freed page in page_pool.
Is there anything specific I need to do to help move these patches
forward?
We would greatly appreciate it if they could be incorporated into the
mainline kernel.
Thank you,
Thinh Tran
* Re: [Patch v6 0/4] bnx2x: Fix error recovering in switch configuration
2023-11-16 16:08 [Patch v6 0/4] bnx2x: Fix error recovering in switch configuration Thinh Tran
@ 2023-11-16 20:56 ` Simon Horman
2024-01-17 21:56 ` Thinh Tran
1 sibling, 0 replies; 6+ messages in thread
From: Simon Horman @ 2023-11-16 20:56 UTC (permalink / raw)
To: Thinh Tran
Cc: Jakub Kicinski, aelior, davem, edumazet, manishc, netdev, pabeni,
skalluru, VENKATA.SAI.DUGGI, Abdul Haleem, David Christensen
On Thu, Nov 16, 2023 at 10:08:34AM -0600, Thinh Tran wrote:
> Hi,
>
> Could we proceed with advancing these patches? They've been in the
> "Awaiting Upstream" state for a while now. Notably, one of them has
> successfully made it to the mainline kernel:
> [v6,1/4] bnx2x: new flag for tracking HW resource
>
> https://github.com/torvalds/linux/commit/bf23ffc8a9a777dfdeb04232e0946b803adbb6a9
>
> When testing the latest kernel, we still encounter crashes due to
> the absence of one of the patches:
> [v6,3/4] bnx2x: Prevent access to a freed page in page_pool.
>
> Is there anything specific I need to do to help move these patches
> forward?
> We would greatly appreciate it if they could be incorporated into the
> mainline kernel.
Hi Thinh Tran,
I'd suggest that the best way to move this forward would be to
rebase the remaining patches on net-next and post them as a v7.
It would be useful to include the information above in the cover letter,
and to note in the subject of each patch and of the cover letter
that they target net-next:
Subject: [PATCH net-next v7 x/3] ...
* Re: [Patch v6 0/4] bnx2x: Fix error recovering in switch configuration
2023-11-16 16:08 [Patch v6 0/4] bnx2x: Fix error recovering in switch configuration Thinh Tran
2023-11-16 20:56 ` Simon Horman
@ 2024-01-17 21:56 ` Thinh Tran
2024-01-17 23:55 ` Jakub Kicinski
1 sibling, 1 reply; 6+ messages in thread
From: Thinh Tran @ 2024-01-17 21:56 UTC (permalink / raw)
To: Jakub Kicinski
Cc: aelior, davem, edumazet, manishc, netdev, pabeni, skalluru,
VENKATA.SAI.DUGGI, Abdul Haleem, David Christensen, Simon Horman
Hi all,
I hope this message finds you well. I'm reaching out to move forward
with these patches. If there are any remaining concerns or if
additional information is needed from my side, please let me know.
Your guidance on the next steps would be greatly appreciated.
Best regards,
Thinh Tran
On 11/16/2023 10:08 AM, Thinh Tran wrote:
> Hi,
>
> Could we proceed with advancing these patches? They've been in the
> "Awaiting Upstream" state for a while now. Notably, one of them has
> successfully made it to the mainline kernel:
> [v6,1/4] bnx2x: new flag for tracking HW resource
>
> https://github.com/torvalds/linux/commit/bf23ffc8a9a777dfdeb04232e0946b803adbb6a9
>
> When testing the latest kernel, we still encounter crashes due to
> the absence of one of the patches:
> [v6,3/4] bnx2x: Prevent access to a freed page in page_pool.
>
> Is there anything specific I need to do to help move these patches
> forward?
> We would greatly appreciate it if they could be incorporated into the
> mainline kernel.
>
> Thank you,
> Thinh Tran
* Re: [Patch v6 0/4] bnx2x: Fix error recovering in switch configuration
2024-01-17 21:56 ` Thinh Tran
@ 2024-01-17 23:55 ` Jakub Kicinski
2024-01-18 16:53 ` Thinh Tran
0 siblings, 1 reply; 6+ messages in thread
From: Jakub Kicinski @ 2024-01-17 23:55 UTC (permalink / raw)
To: Thinh Tran
Cc: aelior, davem, edumazet, manishc, netdev, pabeni, skalluru,
VENKATA.SAI.DUGGI, Abdul Haleem, David Christensen, Simon Horman
On Wed, 17 Jan 2024 15:56:21 -0600 Thinh Tran wrote:
> I hope this message finds you well. I'm reaching out to move forward
> with these patches. If there are any remaining concerns or if
> additional information is needed from my side, please let me know.
> Your guidance on the next steps would be greatly appreciated.
If there are any patches that got stuck in limbo for a long time,
please repost them in a new thread. If I'm reading the online
archives right, the thread is 6 months old; I've deleted the old
messages already :(
* Re: [Patch v6 0/4] bnx2x: Fix error recovering in switch configuration
2024-01-17 23:55 ` Jakub Kicinski
@ 2024-01-18 16:53 ` Thinh Tran
0 siblings, 0 replies; 6+ messages in thread
From: Thinh Tran @ 2024-01-18 16:53 UTC (permalink / raw)
To: Jakub Kicinski
Cc: aelior, davem, edumazet, manishc, netdev, pabeni, skalluru,
VENKATA.SAI.DUGGI, Abdul Haleem, David Christensen, Simon Horman
On 1/17/2024 5:55 PM, Jakub Kicinski wrote:
> If there are any patches that got stuck in limbo for a long time,
> please repost them in a new thread. If I'm reading the online
> archives right, the thread is 6 months old; I've deleted the old
> messages already :(
I will work on reposting them in a new thread.
Thank you.
Thinh Tran