* [PATCH net] bnxt_en: bring back rtnl_lock() in bnxt_fw_reset_task()
@ 2025-05-12 6:37 Michael Chan
2025-05-12 14:20 ` Stanislav Fomichev
0 siblings, 1 reply; 7+ messages in thread
From: Michael Chan @ 2025-05-12 6:37 UTC (permalink / raw)
To: davem
Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
andrew.gospodarek, sdf, Kalesh AP
RTNL assertion failed in netif_set_real_num_tx_queues() in the
error recovery path:
RTNL: assertion failed at net/core/dev.c (3178)
WARNING: CPU: 3 PID: 3392 at net/core/dev.c:3178 netif_set_real_num_tx_queues+0x1fd/0x210
Call Trace:
<TASK>
? __pfx_bnxt_msix+0x10/0x10 [bnxt_en]
__bnxt_open_nic+0x1ef/0xb20 [bnxt_en]
bnxt_open+0xda/0x130 [bnxt_en]
bnxt_fw_reset_task+0x21f/0x780 [bnxt_en]
process_scheduled_works+0x9d/0x400
Bring back the rtnl_lock() for now in bnxt_fw_reset_task().
Fixes: 004b5008016a ("eth: bnxt: remove most dependencies on RTNL")
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 86a5de44b6f3..8df602663e0d 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -14960,15 +14960,17 @@ static void bnxt_fw_reset_task(struct work_struct *work)
bp->fw_reset_state = BNXT_FW_RESET_STATE_OPENING;
fallthrough;
case BNXT_FW_RESET_STATE_OPENING:
- while (!netdev_trylock(bp->dev)) {
+ while (!rtnl_trylock()) {
bnxt_queue_fw_reset_work(bp, HZ / 10);
return;
}
+ netdev_lock(bp->dev);
rc = bnxt_open(bp->dev);
if (rc) {
netdev_err(bp->dev, "bnxt_open() failed during FW reset\n");
bnxt_fw_reset_abort(bp, rc);
netdev_unlock(bp->dev);
+ rtnl_unlock();
goto ulp_start;
}
@@ -14988,6 +14990,7 @@ static void bnxt_fw_reset_task(struct work_struct *work)
bnxt_dl_health_fw_status_update(bp, true);
}
netdev_unlock(bp->dev);
+ rtnl_unlock();
bnxt_ulp_start(bp, 0);
bnxt_reenable_sriov(bp);
netdev_lock(bp->dev);
--
2.30.1
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH net] bnxt_en: bring back rtnl_lock() in bnxt_fw_reset_task() 2025-05-12 6:37 [PATCH net] bnxt_en: bring back rtnl_lock() in bnxt_fw_reset_task() Michael Chan @ 2025-05-12 14:20 ` Stanislav Fomichev 2025-05-12 22:08 ` Michael Chan 0 siblings, 1 reply; 7+ messages in thread From: Stanislav Fomichev @ 2025-05-12 14:20 UTC (permalink / raw) To: Michael Chan Cc: davem, netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi, andrew.gospodarek, sdf, Kalesh AP On 05/11, Michael Chan wrote: > RTNL assertion failed in netif_set_real_num_tx_queues() in the > error recovery path: > > RTNL: assertion failed at net/core/dev.c (3178) > WARNING: CPU: 3 PID: 3392 at net/core/dev.c:3178 netif_set_real_num_tx_queues+0x1fd/0x210 > > Call Trace: > <TASK> > ? __pfx_bnxt_msix+0x10/0x10 [bnxt_en] > __bnxt_open_nic+0x1ef/0xb20 [bnxt_en] > bnxt_open+0xda/0x130 [bnxt_en] > bnxt_fw_reset_task+0x21f/0x780 [bnxt_en] > process_scheduled_works+0x9d/0x400 > > Bring back the rtnl_lock() for now in bnxt_fw_reset_task(). > > Fixes: 004b5008016a ("eth: bnxt: remove most dependencies on RTNL") > Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> > Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> > Signed-off-by: Michael Chan <michael.chan@broadcom.com> > --- > drivers/net/ethernet/broadcom/bnxt/bnxt.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c > index 86a5de44b6f3..8df602663e0d 100644 > --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c > +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c > @@ -14960,15 +14960,17 @@ static void bnxt_fw_reset_task(struct work_struct *work) > bp->fw_reset_state = BNXT_FW_RESET_STATE_OPENING; > fallthrough; > case BNXT_FW_RESET_STATE_OPENING: > - while (!netdev_trylock(bp->dev)) { > + while (!rtnl_trylock()) { > bnxt_queue_fw_reset_work(bp, HZ / 10); > return; > } > + netdev_lock(bp->dev); > rc = bnxt_open(bp->dev); > if (rc) { > netdev_err(bp->dev, "bnxt_open() failed during FW reset\n"); > bnxt_fw_reset_abort(bp, rc); > netdev_unlock(bp->dev); > + rtnl_unlock(); > goto ulp_start; > } > > @@ -14988,6 +14990,7 @@ static void bnxt_fw_reset_task(struct work_struct *work) > bnxt_dl_health_fw_status_update(bp, true); > } > netdev_unlock(bp->dev); > + rtnl_unlock(); > bnxt_ulp_start(bp, 0); > bnxt_reenable_sriov(bp); > netdev_lock(bp->dev); > -- > 2.30.1 > Will the following work instead? netdev_ops_assert_locked should take care of asserting either ops lock or rtnl lock depending on the device properties. diff --git a/net/core/dev.c b/net/core/dev.c index c9013632296f..d8d29729c685 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -3177,7 +3177,6 @@ int netif_set_real_num_tx_queues(struct net_device *dev, unsigned int txq) if (dev->reg_state == NETREG_REGISTERED || dev->reg_state == NETREG_UNREGISTERING) { - ASSERT_RTNL(); netdev_ops_assert_locked(dev); rc = netdev_queue_update_kobjects(dev, dev->real_num_tx_queues, @@ -3227,7 +3226,6 @@ int netif_set_real_num_rx_queues(struct net_device *dev, unsigned int rxq) return -EINVAL; if (dev->reg_state == NETREG_REGISTERED) { - ASSERT_RTNL(); netdev_ops_assert_locked(dev); rc = net_rx_queue_update_kobjects(dev, dev->real_num_rx_queues, ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH net] bnxt_en: bring back rtnl_lock() in bnxt_fw_reset_task() 2025-05-12 14:20 ` Stanislav Fomichev @ 2025-05-12 22:08 ` Michael Chan 2025-05-12 23:43 ` Stanislav Fomichev 0 siblings, 1 reply; 7+ messages in thread From: Michael Chan @ 2025-05-12 22:08 UTC (permalink / raw) To: Stanislav Fomichev Cc: davem, netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi, andrew.gospodarek, sdf, Kalesh AP [-- Attachment #1: Type: text/plain, Size: 1397 bytes --] On Mon, May 12, 2025 at 7:20 AM Stanislav Fomichev <stfomichev@gmail.com> wrote: > Will the following work instead? netdev_ops_assert_locked should take > care of asserting either ops lock or rtnl lock depending on the device > properties. It works for netif_set_real_num_tx_queues() but I also need to replace the ASSERT_RTNL() with netdev_ops_assert_locked(dev) in __udp_tunnel_nic_reset_ntf(). With the additional change, it works well with repeated NIC resets. Thanks for the suggestion. > > diff --git a/net/core/dev.c b/net/core/dev.c > index c9013632296f..d8d29729c685 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -3177,7 +3177,6 @@ int netif_set_real_num_tx_queues(struct net_device *dev, unsigned int txq) > > if (dev->reg_state == NETREG_REGISTERED || > dev->reg_state == NETREG_UNREGISTERING) { > - ASSERT_RTNL(); > netdev_ops_assert_locked(dev); > > rc = netdev_queue_update_kobjects(dev, dev->real_num_tx_queues, > @@ -3227,7 +3226,6 @@ int netif_set_real_num_rx_queues(struct net_device *dev, unsigned int rxq) > return -EINVAL; > > if (dev->reg_state == NETREG_REGISTERED) { > - ASSERT_RTNL(); > netdev_ops_assert_locked(dev); > > rc = net_rx_queue_update_kobjects(dev, dev->real_num_rx_queues, [-- Attachment #2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 4196 bytes --] ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net] bnxt_en: bring back rtnl_lock() in bnxt_fw_reset_task() 2025-05-12 22:08 ` Michael Chan @ 2025-05-12 23:43 ` Stanislav Fomichev 2025-05-13 0:26 ` Jakub Kicinski 0 siblings, 1 reply; 7+ messages in thread From: Stanislav Fomichev @ 2025-05-12 23:43 UTC (permalink / raw) To: Michael Chan Cc: davem, netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi, andrew.gospodarek, sdf, Kalesh AP On 05/12, Michael Chan wrote: > On Mon, May 12, 2025 at 7:20 AM Stanislav Fomichev <stfomichev@gmail.com> wrote: > > Will the following work instead? netdev_ops_assert_locked should take > > care of asserting either ops lock or rtnl lock depending on the device > > properties. > > It works for netif_set_real_num_tx_queues() but I also need to replace > the ASSERT_RTNL() with netdev_ops_assert_locked(dev) in > __udp_tunnel_nic_reset_ntf(). Sounds good! ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net] bnxt_en: bring back rtnl_lock() in bnxt_fw_reset_task() 2025-05-12 23:43 ` Stanislav Fomichev @ 2025-05-13 0:26 ` Jakub Kicinski 2025-05-13 1:14 ` Stanislav Fomichev 0 siblings, 1 reply; 7+ messages in thread From: Jakub Kicinski @ 2025-05-13 0:26 UTC (permalink / raw) To: Stanislav Fomichev Cc: Michael Chan, davem, netdev, edumazet, pabeni, andrew+netdev, pavan.chebbi, andrew.gospodarek, sdf, Kalesh AP On Mon, 12 May 2025 16:43:12 -0700 Stanislav Fomichev wrote: > On 05/12, Michael Chan wrote: > > On Mon, May 12, 2025 at 7:20 AM Stanislav Fomichev <stfomichev@gmail.com> wrote: > > > Will the following work instead? netdev_ops_assert_locked should take > > > care of asserting either ops lock or rtnl lock depending on the device > > > properties. > > > > It works for netif_set_real_num_tx_queues() but I also need to replace > > the ASSERT_RTNL() with netdev_ops_assert_locked(dev) in > > __udp_tunnel_nic_reset_ntf(). > > Sounds good! Mm... To me it sounds concerning. UDP tunnel port tracking doesn't have any locks, it depends on RTNL. Are y'all sure we can just drop the ASSERT_RTNL() and nothing will blow up? Or did I misunderstand? I'd go with Michael's patch for net and revisit in net-next if you're filling bold. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net] bnxt_en: bring back rtnl_lock() in bnxt_fw_reset_task() 2025-05-13 0:26 ` Jakub Kicinski @ 2025-05-13 1:14 ` Stanislav Fomichev 2025-05-13 5:41 ` Michael Chan 0 siblings, 1 reply; 7+ messages in thread From: Stanislav Fomichev @ 2025-05-13 1:14 UTC (permalink / raw) To: Jakub Kicinski Cc: Michael Chan, davem, netdev, edumazet, pabeni, andrew+netdev, pavan.chebbi, andrew.gospodarek, sdf, Kalesh AP On 05/12, Jakub Kicinski wrote: > On Mon, 12 May 2025 16:43:12 -0700 Stanislav Fomichev wrote: > > On 05/12, Michael Chan wrote: > > > On Mon, May 12, 2025 at 7:20 AM Stanislav Fomichev <stfomichev@gmail.com> wrote: > > > > Will the following work instead? netdev_ops_assert_locked should take > > > > care of asserting either ops lock or rtnl lock depending on the device > > > > properties. > > > > > > It works for netif_set_real_num_tx_queues() but I also need to replace > > > the ASSERT_RTNL() with netdev_ops_assert_locked(dev) in > > > __udp_tunnel_nic_reset_ntf(). > > > > Sounds good! > > Mm... To me it sounds concerning. UDP tunnel port tracking doesn't have > any locks, it depends on RTNL. Are y'all sure we can just drop the > ASSERT_RTNL() and nothing will blow up? Or did I misunderstand? > > I'd go with Michael's patch for net and revisit in net-next if you're > filling bold. Good point. But in this case, we need to cover more? bnxt_open from bnxt_resume and the callers of bnxt_reset? ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net] bnxt_en: bring back rtnl_lock() in bnxt_fw_reset_task() 2025-05-13 1:14 ` Stanislav Fomichev @ 2025-05-13 5:41 ` Michael Chan 0 siblings, 0 replies; 7+ messages in thread From: Michael Chan @ 2025-05-13 5:41 UTC (permalink / raw) To: Stanislav Fomichev Cc: Jakub Kicinski, davem, netdev, edumazet, pabeni, andrew+netdev, pavan.chebbi, andrew.gospodarek, sdf, Kalesh AP [-- Attachment #1: Type: text/plain, Size: 543 bytes --] On Mon, May 12, 2025 at 6:14 PM Stanislav Fomichev <stfomichev@gmail.com> wrote: > > On 05/12, Jakub Kicinski wrote: > > I'd go with Michael's patch for net and revisit in net-next if you're > > filling bold. > > Good point. But in this case, we need to cover more? bnxt_open from > bnxt_resume and the callers of bnxt_reset? You are right. We haven't tested suspend/resume, PCI AER, and TX timeout yet and these will have the same problem. Let me expand on the original fix and get these code paths tested tomorrow. Thanks. [-- Attachment #2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 4196 bytes --] ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-05-13 5:41 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-05-12 6:37 [PATCH net] bnxt_en: bring back rtnl_lock() in bnxt_fw_reset_task() Michael Chan 2025-05-12 14:20 ` Stanislav Fomichev 2025-05-12 22:08 ` Michael Chan 2025-05-12 23:43 ` Stanislav Fomichev 2025-05-13 0:26 ` Jakub Kicinski 2025-05-13 1:14 ` Stanislav Fomichev 2025-05-13 5:41 ` Michael Chan
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox