* [PATCH net] bnxt_en: bring back rtnl_lock() in bnxt_fw_reset_task()
@ 2025-05-12 6:37 Michael Chan
2025-05-12 14:20 ` Stanislav Fomichev
0 siblings, 1 reply; 7+ messages in thread
From: Michael Chan @ 2025-05-12 6:37 UTC (permalink / raw)
To: davem
Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
andrew.gospodarek, sdf, Kalesh AP
RTNL assertion failed in netif_set_real_num_tx_queues() in the
error recovery path:
RTNL: assertion failed at net/core/dev.c (3178)
WARNING: CPU: 3 PID: 3392 at net/core/dev.c:3178 netif_set_real_num_tx_queues+0x1fd/0x210
Call Trace:
<TASK>
? __pfx_bnxt_msix+0x10/0x10 [bnxt_en]
__bnxt_open_nic+0x1ef/0xb20 [bnxt_en]
bnxt_open+0xda/0x130 [bnxt_en]
bnxt_fw_reset_task+0x21f/0x780 [bnxt_en]
process_scheduled_works+0x9d/0x400
Bring back the rtnl_lock() for now in bnxt_fw_reset_task().
Fixes: 004b5008016a ("eth: bnxt: remove most dependencies on RTNL")
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 86a5de44b6f3..8df602663e0d 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -14960,15 +14960,17 @@ static void bnxt_fw_reset_task(struct work_struct *work)
bp->fw_reset_state = BNXT_FW_RESET_STATE_OPENING;
fallthrough;
case BNXT_FW_RESET_STATE_OPENING:
- while (!netdev_trylock(bp->dev)) {
+ while (!rtnl_trylock()) {
bnxt_queue_fw_reset_work(bp, HZ / 10);
return;
}
+ netdev_lock(bp->dev);
rc = bnxt_open(bp->dev);
if (rc) {
netdev_err(bp->dev, "bnxt_open() failed during FW reset\n");
bnxt_fw_reset_abort(bp, rc);
netdev_unlock(bp->dev);
+ rtnl_unlock();
goto ulp_start;
}
@@ -14988,6 +14990,7 @@ static void bnxt_fw_reset_task(struct work_struct *work)
bnxt_dl_health_fw_status_update(bp, true);
}
netdev_unlock(bp->dev);
+ rtnl_unlock();
bnxt_ulp_start(bp, 0);
bnxt_reenable_sriov(bp);
netdev_lock(bp->dev);
--
2.30.1
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH net] bnxt_en: bring back rtnl_lock() in bnxt_fw_reset_task()
2025-05-12 6:37 [PATCH net] bnxt_en: bring back rtnl_lock() in bnxt_fw_reset_task() Michael Chan
@ 2025-05-12 14:20 ` Stanislav Fomichev
2025-05-12 22:08 ` Michael Chan
0 siblings, 1 reply; 7+ messages in thread
From: Stanislav Fomichev @ 2025-05-12 14:20 UTC (permalink / raw)
To: Michael Chan
Cc: davem, netdev, edumazet, kuba, pabeni, andrew+netdev,
pavan.chebbi, andrew.gospodarek, sdf, Kalesh AP
On 05/11, Michael Chan wrote:
> RTNL assertion failed in netif_set_real_num_tx_queues() in the
> error recovery path:
>
> RTNL: assertion failed at net/core/dev.c (3178)
> WARNING: CPU: 3 PID: 3392 at net/core/dev.c:3178 netif_set_real_num_tx_queues+0x1fd/0x210
>
> Call Trace:
> <TASK>
> ? __pfx_bnxt_msix+0x10/0x10 [bnxt_en]
> __bnxt_open_nic+0x1ef/0xb20 [bnxt_en]
> bnxt_open+0xda/0x130 [bnxt_en]
> bnxt_fw_reset_task+0x21f/0x780 [bnxt_en]
> process_scheduled_works+0x9d/0x400
>
> Bring back the rtnl_lock() for now in bnxt_fw_reset_task().
>
> Fixes: 004b5008016a ("eth: bnxt: remove most dependencies on RTNL")
> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
> Signed-off-by: Michael Chan <michael.chan@broadcom.com>
> ---
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 86a5de44b6f3..8df602663e0d 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -14960,15 +14960,17 @@ static void bnxt_fw_reset_task(struct work_struct *work)
> bp->fw_reset_state = BNXT_FW_RESET_STATE_OPENING;
> fallthrough;
> case BNXT_FW_RESET_STATE_OPENING:
> - while (!netdev_trylock(bp->dev)) {
> + while (!rtnl_trylock()) {
> bnxt_queue_fw_reset_work(bp, HZ / 10);
> return;
> }
> + netdev_lock(bp->dev);
> rc = bnxt_open(bp->dev);
> if (rc) {
> netdev_err(bp->dev, "bnxt_open() failed during FW reset\n");
> bnxt_fw_reset_abort(bp, rc);
> netdev_unlock(bp->dev);
> + rtnl_unlock();
> goto ulp_start;
> }
>
> @@ -14988,6 +14990,7 @@ static void bnxt_fw_reset_task(struct work_struct *work)
> bnxt_dl_health_fw_status_update(bp, true);
> }
> netdev_unlock(bp->dev);
> + rtnl_unlock();
> bnxt_ulp_start(bp, 0);
> bnxt_reenable_sriov(bp);
> netdev_lock(bp->dev);
> --
> 2.30.1
>
Will the following work instead? netdev_ops_assert_locked should take
care of asserting either ops lock or rtnl lock depending on the device
properties.
diff --git a/net/core/dev.c b/net/core/dev.c
index c9013632296f..d8d29729c685 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3177,7 +3177,6 @@ int netif_set_real_num_tx_queues(struct net_device *dev, unsigned int txq)
if (dev->reg_state == NETREG_REGISTERED ||
dev->reg_state == NETREG_UNREGISTERING) {
- ASSERT_RTNL();
netdev_ops_assert_locked(dev);
rc = netdev_queue_update_kobjects(dev, dev->real_num_tx_queues,
@@ -3227,7 +3226,6 @@ int netif_set_real_num_rx_queues(struct net_device *dev, unsigned int rxq)
return -EINVAL;
if (dev->reg_state == NETREG_REGISTERED) {
- ASSERT_RTNL();
netdev_ops_assert_locked(dev);
rc = net_rx_queue_update_kobjects(dev, dev->real_num_rx_queues,
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH net] bnxt_en: bring back rtnl_lock() in bnxt_fw_reset_task()
2025-05-12 14:20 ` Stanislav Fomichev
@ 2025-05-12 22:08 ` Michael Chan
2025-05-12 23:43 ` Stanislav Fomichev
0 siblings, 1 reply; 7+ messages in thread
From: Michael Chan @ 2025-05-12 22:08 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: davem, netdev, edumazet, kuba, pabeni, andrew+netdev,
pavan.chebbi, andrew.gospodarek, sdf, Kalesh AP
[-- Attachment #1: Type: text/plain, Size: 1397 bytes --]
On Mon, May 12, 2025 at 7:20 AM Stanislav Fomichev <stfomichev@gmail.com> wrote:
> Will the following work instead? netdev_ops_assert_locked should take
> care of asserting either ops lock or rtnl lock depending on the device
> properties.
It works for netif_set_real_num_tx_queues() but I also need to replace
the ASSERT_RTNL() with netdev_ops_assert_locked(dev) in
__udp_tunnel_nic_reset_ntf().
With the additional change, it works well with repeated NIC resets.
Thanks for the suggestion.
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index c9013632296f..d8d29729c685 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3177,7 +3177,6 @@ int netif_set_real_num_tx_queues(struct net_device *dev, unsigned int txq)
>
> if (dev->reg_state == NETREG_REGISTERED ||
> dev->reg_state == NETREG_UNREGISTERING) {
> - ASSERT_RTNL();
> netdev_ops_assert_locked(dev);
>
> rc = netdev_queue_update_kobjects(dev, dev->real_num_tx_queues,
> @@ -3227,7 +3226,6 @@ int netif_set_real_num_rx_queues(struct net_device *dev, unsigned int rxq)
> return -EINVAL;
>
> if (dev->reg_state == NETREG_REGISTERED) {
> - ASSERT_RTNL();
> netdev_ops_assert_locked(dev);
>
> rc = net_rx_queue_update_kobjects(dev, dev->real_num_rx_queues,
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4196 bytes --]
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH net] bnxt_en: bring back rtnl_lock() in bnxt_fw_reset_task()
2025-05-12 22:08 ` Michael Chan
@ 2025-05-12 23:43 ` Stanislav Fomichev
2025-05-13 0:26 ` Jakub Kicinski
0 siblings, 1 reply; 7+ messages in thread
From: Stanislav Fomichev @ 2025-05-12 23:43 UTC (permalink / raw)
To: Michael Chan
Cc: davem, netdev, edumazet, kuba, pabeni, andrew+netdev,
pavan.chebbi, andrew.gospodarek, sdf, Kalesh AP
On 05/12, Michael Chan wrote:
> On Mon, May 12, 2025 at 7:20 AM Stanislav Fomichev <stfomichev@gmail.com> wrote:
> > Will the following work instead? netdev_ops_assert_locked should take
> > care of asserting either ops lock or rtnl lock depending on the device
> > properties.
>
> It works for netif_set_real_num_tx_queues() but I also need to replace
> the ASSERT_RTNL() with netdev_ops_assert_locked(dev) in
> __udp_tunnel_nic_reset_ntf().
Sounds good!
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net] bnxt_en: bring back rtnl_lock() in bnxt_fw_reset_task()
2025-05-12 23:43 ` Stanislav Fomichev
@ 2025-05-13 0:26 ` Jakub Kicinski
2025-05-13 1:14 ` Stanislav Fomichev
0 siblings, 1 reply; 7+ messages in thread
From: Jakub Kicinski @ 2025-05-13 0:26 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: Michael Chan, davem, netdev, edumazet, pabeni, andrew+netdev,
pavan.chebbi, andrew.gospodarek, sdf, Kalesh AP
On Mon, 12 May 2025 16:43:12 -0700 Stanislav Fomichev wrote:
> On 05/12, Michael Chan wrote:
> > On Mon, May 12, 2025 at 7:20 AM Stanislav Fomichev <stfomichev@gmail.com> wrote:
> > > Will the following work instead? netdev_ops_assert_locked should take
> > > care of asserting either ops lock or rtnl lock depending on the device
> > > properties.
> >
> > It works for netif_set_real_num_tx_queues() but I also need to replace
> > the ASSERT_RTNL() with netdev_ops_assert_locked(dev) in
> > __udp_tunnel_nic_reset_ntf().
>
> Sounds good!
Mm... To me it sounds concerning. UDP tunnel port tracking doesn't have
any locks, it depends on RTNL. Are y'all sure we can just drop the
ASSERT_RTNL() and nothing will blow up? Or did I misunderstand?
I'd go with Michael's patch for net and revisit in net-next if you're
filling bold.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net] bnxt_en: bring back rtnl_lock() in bnxt_fw_reset_task()
2025-05-13 0:26 ` Jakub Kicinski
@ 2025-05-13 1:14 ` Stanislav Fomichev
2025-05-13 5:41 ` Michael Chan
0 siblings, 1 reply; 7+ messages in thread
From: Stanislav Fomichev @ 2025-05-13 1:14 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Michael Chan, davem, netdev, edumazet, pabeni, andrew+netdev,
pavan.chebbi, andrew.gospodarek, sdf, Kalesh AP
On 05/12, Jakub Kicinski wrote:
> On Mon, 12 May 2025 16:43:12 -0700 Stanislav Fomichev wrote:
> > On 05/12, Michael Chan wrote:
> > > On Mon, May 12, 2025 at 7:20 AM Stanislav Fomichev <stfomichev@gmail.com> wrote:
> > > > Will the following work instead? netdev_ops_assert_locked should take
> > > > care of asserting either ops lock or rtnl lock depending on the device
> > > > properties.
> > >
> > > It works for netif_set_real_num_tx_queues() but I also need to replace
> > > the ASSERT_RTNL() with netdev_ops_assert_locked(dev) in
> > > __udp_tunnel_nic_reset_ntf().
> >
> > Sounds good!
>
> Mm... To me it sounds concerning. UDP tunnel port tracking doesn't have
> any locks, it depends on RTNL. Are y'all sure we can just drop the
> ASSERT_RTNL() and nothing will blow up? Or did I misunderstand?
>
> I'd go with Michael's patch for net and revisit in net-next if you're
> filling bold.
Good point. But in this case, we need to cover more? bnxt_open from
bnxt_resume and the callers of bnxt_reset?
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net] bnxt_en: bring back rtnl_lock() in bnxt_fw_reset_task()
2025-05-13 1:14 ` Stanislav Fomichev
@ 2025-05-13 5:41 ` Michael Chan
0 siblings, 0 replies; 7+ messages in thread
From: Michael Chan @ 2025-05-13 5:41 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: Jakub Kicinski, davem, netdev, edumazet, pabeni, andrew+netdev,
pavan.chebbi, andrew.gospodarek, sdf, Kalesh AP
[-- Attachment #1: Type: text/plain, Size: 543 bytes --]
On Mon, May 12, 2025 at 6:14 PM Stanislav Fomichev <stfomichev@gmail.com> wrote:
>
> On 05/12, Jakub Kicinski wrote:
> > I'd go with Michael's patch for net and revisit in net-next if you're
> > filling bold.
>
> Good point. But in this case, we need to cover more? bnxt_open from
> bnxt_resume and the callers of bnxt_reset?
You are right. We haven't tested suspend/resume, PCI AER, and TX
timeout yet and these will have the same problem. Let me expand on
the original fix and get these code paths tested tomorrow. Thanks.
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4196 bytes --]
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-05-13 5:41 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-12 6:37 [PATCH net] bnxt_en: bring back rtnl_lock() in bnxt_fw_reset_task() Michael Chan
2025-05-12 14:20 ` Stanislav Fomichev
2025-05-12 22:08 ` Michael Chan
2025-05-12 23:43 ` Stanislav Fomichev
2025-05-13 0:26 ` Jakub Kicinski
2025-05-13 1:14 ` Stanislav Fomichev
2025-05-13 5:41 ` Michael Chan
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).