* [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
2025-05-19 20:41 [PATCH net 0/3] bnxt_en: 2 bug fixes Michael Chan
@ 2025-05-19 20:41 ` Michael Chan
2025-05-20 14:42 ` Simon Horman
` (2 more replies)
0 siblings, 3 replies; 23+ messages in thread
From: Michael Chan @ 2025-05-19 20:41 UTC (permalink / raw)
To: davem
Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
andrew.gospodarek, David Wei
From: Pavan Chebbi <pavan.chebbi@broadcom.com>
The commit under the Fixes tag below which updates the VNICs' RSS
and MRU during .ndo_queue_start(), needs to be extended to cover any
non-default RSS contexts which have their own VNICs. Without this
step, packets that are destined to a non-default RSS context may be
dropped after .ndo_queue_start().
Fixes: 5ac066b7b062 ("bnxt_en: Fix queue start to update vnic RSS table")
Reported-by: David Wei <dw@davidwei.uk>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
Cc: David Wei <dw@davidwei.uk>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 27 +++++++++++++++++++----
1 file changed, 23 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index a45c5ce81111..71e428710a26 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -10758,6 +10758,24 @@ static int bnxt_set_vnic_mru_p5(struct bnxt *bp, struct bnxt_vnic_info *vnic,
return 0;
}
+static int bnxt_set_rss_ctx_vnic_mru(struct bnxt *bp, u16 mru)
+{
+ struct ethtool_rxfh_context *ctx;
+ unsigned long context;
+ int rc;
+
+ xa_for_each(&bp->dev->ethtool->rss_ctx, context, ctx) {
+ struct bnxt_rss_ctx *rss_ctx = ethtool_rxfh_context_priv(ctx);
+ struct bnxt_vnic_info *vnic = &rss_ctx->vnic;
+
+ rc = bnxt_set_vnic_mru_p5(bp, vnic, mru);
+ if (rc)
+ return rc;
+ }
+
+ return 0;
+}
+
static void bnxt_hwrm_realloc_rss_ctx_vnic(struct bnxt *bp)
{
bool set_tpa = !!(bp->flags & BNXT_FLAG_TPA);
@@ -15904,6 +15922,7 @@ static int bnxt_queue_start(struct net_device *dev, void *qmem, int idx)
struct bnxt_vnic_info *vnic;
struct bnxt_napi *bnapi;
int i, rc;
+ u16 mru;
rxr = &bp->rx_ring[idx];
clone = qmem;
@@ -15953,16 +15972,15 @@ static int bnxt_queue_start(struct net_device *dev, void *qmem, int idx)
napi_enable_locked(&bnapi->napi);
bnxt_db_nq_arm(bp, &cpr->cp_db, cpr->cp_raw_cons);
+ mru = bp->dev->mtu + ETH_HLEN + VLAN_HLEN;
for (i = 0; i < bp->nr_vnics; i++) {
vnic = &bp->vnic_info[i];
- rc = bnxt_set_vnic_mru_p5(bp, vnic,
- bp->dev->mtu + ETH_HLEN + VLAN_HLEN);
+ rc = bnxt_set_vnic_mru_p5(bp, vnic, mru);
if (rc)
return rc;
}
-
- return 0;
+ return bnxt_set_rss_ctx_vnic_mru(bp, mru);
err_reset:
netdev_err(bp->dev, "Unexpected HWRM error during queue start rc: %d\n",
@@ -15987,6 +16005,7 @@ static int bnxt_queue_stop(struct net_device *dev, void *qmem, int idx)
bnxt_set_vnic_mru_p5(bp, vnic, 0);
}
+ bnxt_set_rss_ctx_vnic_mru(bp, 0);
/* Make sure NAPI sees that the VNIC is disabled */
synchronize_net();
rxr = &bp->rx_ring[idx];
--
2.30.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
2025-05-19 20:41 ` [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset Michael Chan
@ 2025-05-20 14:42 ` Simon Horman
2025-05-20 14:59 ` David Wei
2025-05-21 1:28 ` Jakub Kicinski
2 siblings, 0 replies; 23+ messages in thread
From: Simon Horman @ 2025-05-20 14:42 UTC (permalink / raw)
To: Michael Chan
Cc: davem, netdev, edumazet, kuba, pabeni, andrew+netdev,
pavan.chebbi, andrew.gospodarek, David Wei
On Mon, May 19, 2025 at 01:41:30PM -0700, Michael Chan wrote:
> From: Pavan Chebbi <pavan.chebbi@broadcom.com>
>
> The commit under the Fixes tag below which updates the VNICs' RSS
> and MRU during .ndo_queue_start(), needs to be extended to cover any
> non-default RSS contexts which have their own VNICs. Without this
> step, packets that are destined to a non-default RSS context may be
> dropped after .ndo_queue_start().
>
> Fixes: 5ac066b7b062 ("bnxt_en: Fix queue start to update vnic RSS table")
> Reported-by: David Wei <dw@davidwei.uk>
> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
> Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
2025-05-19 20:41 ` [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset Michael Chan
2025-05-20 14:42 ` Simon Horman
@ 2025-05-20 14:59 ` David Wei
2025-05-21 1:28 ` Jakub Kicinski
2 siblings, 0 replies; 23+ messages in thread
From: David Wei @ 2025-05-20 14:59 UTC (permalink / raw)
To: Michael Chan, davem
Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
andrew.gospodarek
On 5/19/25 13:41, Michael Chan wrote:
> From: Pavan Chebbi <pavan.chebbi@broadcom.com>
>
> The commit under the Fixes tag below which updates the VNICs' RSS
> and MRU during .ndo_queue_start(), needs to be extended to cover any
> non-default RSS contexts which have their own VNICs. Without this
> step, packets that are destined to a non-default RSS context may be
> dropped after .ndo_queue_start().
>
> Fixes: 5ac066b7b062 ("bnxt_en: Fix queue start to update vnic RSS table")
> Reported-by: David Wei <dw@davidwei.uk>
> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
> Signed-off-by: Michael Chan <michael.chan@broadcom.com>
> ---
> Cc: David Wei <dw@davidwei.uk>
> ---
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 27 +++++++++++++++++++----
> 1 file changed, 23 insertions(+), 4 deletions(-)
>
Thanks for the fix.
Reviewed-by: David Wei <dw@davidwei.uk>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
2025-05-19 20:41 ` [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset Michael Chan
2025-05-20 14:42 ` Simon Horman
2025-05-20 14:59 ` David Wei
@ 2025-05-21 1:28 ` Jakub Kicinski
2025-05-21 1:38 ` Michael Chan
2 siblings, 1 reply; 23+ messages in thread
From: Jakub Kicinski @ 2025-05-21 1:28 UTC (permalink / raw)
To: Michael Chan
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, pavan.chebbi,
andrew.gospodarek, David Wei
On Mon, 19 May 2025 13:41:30 -0700 Michael Chan wrote:
> @@ -15987,6 +16005,7 @@ static int bnxt_queue_stop(struct net_device *dev, void *qmem, int idx)
>
> bnxt_set_vnic_mru_p5(bp, vnic, 0);
> }
> + bnxt_set_rss_ctx_vnic_mru(bp, 0);
> /* Make sure NAPI sees that the VNIC is disabled */
> synchronize_net();
> rxr = &bp->rx_ring[idx];
What does setting MRU to zero do? All traffic will be dropped?
Traffic will no longer be filtered based on MRU? Or.. ?
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
2025-05-21 1:28 ` Jakub Kicinski
@ 2025-05-21 1:38 ` Michael Chan
2025-05-21 1:51 ` Jakub Kicinski
0 siblings, 1 reply; 23+ messages in thread
From: Michael Chan @ 2025-05-21 1:38 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, pavan.chebbi,
andrew.gospodarek, David Wei
[-- Attachment #1: Type: text/plain, Size: 798 bytes --]
On Tue, May 20, 2025 at 6:28 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Mon, 19 May 2025 13:41:30 -0700 Michael Chan wrote:
> > @@ -15987,6 +16005,7 @@ static int bnxt_queue_stop(struct net_device *dev, void *qmem, int idx)
> >
> > bnxt_set_vnic_mru_p5(bp, vnic, 0);
> > }
> > + bnxt_set_rss_ctx_vnic_mru(bp, 0);
> > /* Make sure NAPI sees that the VNIC is disabled */
> > synchronize_net();
> > rxr = &bp->rx_ring[idx];
>
> What does setting MRU to zero do? All traffic will be dropped?
> Traffic will no longer be filtered based on MRU? Or.. ?
That VNIC with MRU set to zero will not receive any more traffic.
This step was recommended by the FW team when we first started working
with David to implement the queue_mgmt_ops.
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4196 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
2025-05-21 1:38 ` Michael Chan
@ 2025-05-21 1:51 ` Jakub Kicinski
2025-05-21 2:10 ` Michael Chan
0 siblings, 1 reply; 23+ messages in thread
From: Jakub Kicinski @ 2025-05-21 1:51 UTC (permalink / raw)
To: Michael Chan
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, pavan.chebbi,
andrew.gospodarek, David Wei
On Tue, 20 May 2025 18:38:45 -0700 Michael Chan wrote:
> On Tue, May 20, 2025 at 6:28 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > On Mon, 19 May 2025 13:41:30 -0700 Michael Chan wrote:
> > > @@ -15987,6 +16005,7 @@ static int bnxt_queue_stop(struct net_device *dev, void *qmem, int idx)
> > >
> > > bnxt_set_vnic_mru_p5(bp, vnic, 0);
> > > }
> > > + bnxt_set_rss_ctx_vnic_mru(bp, 0);
> > > /* Make sure NAPI sees that the VNIC is disabled */
> > > synchronize_net();
> > > rxr = &bp->rx_ring[idx];
> >
> > What does setting MRU to zero do? All traffic will be dropped?
> > Traffic will no longer be filtered based on MRU? Or.. ?
>
> That VNIC with MRU set to zero will not receive any more traffic.
> This step was recommended by the FW team when we first started working
> with David to implement the queue_mgmt_ops.
Shutting down traffic to ZC queues is one thing, but now you
seem to be walking all RSS contexts and shutting them all down.
The whole point of the queue API is to avoid shutting down
the entire device.
--
pw-bot: cr
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
2025-05-21 1:51 ` Jakub Kicinski
@ 2025-05-21 2:10 ` Michael Chan
2025-05-21 2:17 ` Jakub Kicinski
0 siblings, 1 reply; 23+ messages in thread
From: Michael Chan @ 2025-05-21 2:10 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, pavan.chebbi,
andrew.gospodarek, David Wei
[-- Attachment #1: Type: text/plain, Size: 1349 bytes --]
On Tue, May 20, 2025 at 6:51 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 20 May 2025 18:38:45 -0700 Michael Chan wrote:
> > On Tue, May 20, 2025 at 6:28 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > > On Mon, 19 May 2025 13:41:30 -0700 Michael Chan wrote:
> > > > @@ -15987,6 +16005,7 @@ static int bnxt_queue_stop(struct net_device *dev, void *qmem, int idx)
> > > >
> > > > bnxt_set_vnic_mru_p5(bp, vnic, 0);
> > > > }
> > > > + bnxt_set_rss_ctx_vnic_mru(bp, 0);
> > > > /* Make sure NAPI sees that the VNIC is disabled */
> > > > synchronize_net();
> > > > rxr = &bp->rx_ring[idx];
> > >
> > > What does setting MRU to zero do? All traffic will be dropped?
> > > Traffic will no longer be filtered based on MRU? Or.. ?
> >
> > That VNIC with MRU set to zero will not receive any more traffic.
> > This step was recommended by the FW team when we first started working
> > with David to implement the queue_mgmt_ops.
>
> Shutting down traffic to ZC queues is one thing, but now you
> seem to be walking all RSS contexts and shutting them all down.
> The whole point of the queue API is to avoid shutting down
> the entire device.
The existing code has been setting the MRU to 0 for the default RSS
context's VNIC. They found that this sequence was reliable.
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4196 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
2025-05-21 2:10 ` Michael Chan
@ 2025-05-21 2:17 ` Jakub Kicinski
2025-05-21 2:29 ` Michael Chan
0 siblings, 1 reply; 23+ messages in thread
From: Jakub Kicinski @ 2025-05-21 2:17 UTC (permalink / raw)
To: Michael Chan
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, pavan.chebbi,
andrew.gospodarek, David Wei
On Tue, 20 May 2025 19:10:37 -0700 Michael Chan wrote:
> > Shutting down traffic to ZC queues is one thing, but now you
> > seem to be walking all RSS contexts and shutting them all down.
> > The whole point of the queue API is to avoid shutting down
> > the entire device.
>
> The existing code has been setting the MRU to 0 for the default RSS
> context's VNIC.
:/ I must have misunderstood. I wouldn't have merged this if I knew.
You can't be shutting down system queues because some application
decided to bind a ZC queue.
> They found that this sequence was reliable.
"reliable" is a bit of a big word that some people would reserve
for code which is production tested or at the very least very
heavily validated.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
2025-05-21 2:17 ` Jakub Kicinski
@ 2025-05-21 2:29 ` Michael Chan
2025-05-22 11:01 ` David Wei
0 siblings, 1 reply; 23+ messages in thread
From: Michael Chan @ 2025-05-21 2:29 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, pavan.chebbi,
andrew.gospodarek, David Wei
[-- Attachment #1: Type: text/plain, Size: 517 bytes --]
On Tue, May 20, 2025 at 7:17 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 20 May 2025 19:10:37 -0700 Michael Chan wrote:
> > They found that this sequence was reliable.
>
> "reliable" is a bit of a big word that some people would reserve
> for code which is production tested or at the very least very
> heavily validated.
FWIW, queue_mgmt_ops was heavily tested by Somnath under heavy traffic
conditions. Obviously RSS contexts were not included during testing
and this problem was missed.
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4196 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
2025-05-21 2:29 ` Michael Chan
@ 2025-05-22 11:01 ` David Wei
2025-05-22 15:26 ` Jakub Kicinski
0 siblings, 1 reply; 23+ messages in thread
From: David Wei @ 2025-05-22 11:01 UTC (permalink / raw)
To: Michael Chan, Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, pavan.chebbi,
andrew.gospodarek
On 5/20/25 19:29, Michael Chan wrote:
> On Tue, May 20, 2025 at 7:17 PM Jakub Kicinski <kuba@kernel.org> wrote:
>>
>> On Tue, 20 May 2025 19:10:37 -0700 Michael Chan wrote:
>>> They found that this sequence was reliable.
>>
>> "reliable" is a bit of a big word that some people would reserve
>> for code which is production tested or at the very least very
>> heavily validated.
>
> FWIW, queue_mgmt_ops was heavily tested by Somnath under heavy traffic
> conditions. Obviously RSS contexts were not included during testing
> and this problem was missed.
IIRC from the initial testing w/ Somnath even though the VNICs are reset
the traffic on unrelated queues are unaffected. If we ensure that is the
cse with this patchset, would that resolve your concerns Jakub?
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
2025-05-22 11:01 ` David Wei
@ 2025-05-22 15:26 ` Jakub Kicinski
2025-05-27 15:37 ` David Wei
0 siblings, 1 reply; 23+ messages in thread
From: Jakub Kicinski @ 2025-05-22 15:26 UTC (permalink / raw)
To: David Wei
Cc: Michael Chan, davem, netdev, edumazet, pabeni, andrew+netdev,
pavan.chebbi, andrew.gospodarek
On Thu, 22 May 2025 12:01:34 +0100 David Wei wrote:
> On 5/20/25 19:29, Michael Chan wrote:
> > On Tue, May 20, 2025 at 7:17 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >> "reliable" is a bit of a big word that some people would reserve
> >> for code which is production tested or at the very least very
> >> heavily validated.
> >
> > FWIW, queue_mgmt_ops was heavily tested by Somnath under heavy traffic
> > conditions. Obviously RSS contexts were not included during testing
> > and this problem was missed.
>
> IIRC from the initial testing w/ Somnath even though the VNICs are reset
> the traffic on unrelated queues are unaffected.
How did you check that? IIUC the device does not currently report
packet loss due to MRU clamp (!?!)
> If we ensure that is the cse with this patchset, would that resolve
> your concerns Jakub?
For ZC we expect the queues to be taken out of the main context.
IIUC it'd be a significant improvement over the status quo if
we could check which contexts the queue is in (incl. context 0)
and only clamp MRU on those.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
2025-05-22 15:26 ` Jakub Kicinski
@ 2025-05-27 15:37 ` David Wei
2025-05-27 16:50 ` Michael Chan
0 siblings, 1 reply; 23+ messages in thread
From: David Wei @ 2025-05-27 15:37 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Michael Chan, davem, netdev, edumazet, pabeni, andrew+netdev,
pavan.chebbi, andrew.gospodarek
On 2025-05-22 16:26, Jakub Kicinski wrote:
> On Thu, 22 May 2025 12:01:34 +0100 David Wei wrote:
>> On 5/20/25 19:29, Michael Chan wrote:
>>> On Tue, May 20, 2025 at 7:17 PM Jakub Kicinski <kuba@kernel.org> wrote:
>>>> "reliable" is a bit of a big word that some people would reserve
>>>> for code which is production tested or at the very least very
>>>> heavily validated.
>>>
>>> FWIW, queue_mgmt_ops was heavily tested by Somnath under heavy traffic
>>> conditions. Obviously RSS contexts were not included during testing
>>> and this problem was missed.
>>
>> IIRC from the initial testing w/ Somnath even though the VNICs are reset
>> the traffic on unrelated queues are unaffected.
>
> How did you check that? IIUC the device does not currently report
> packet loss due to MRU clamp (!?!)
Only from iperf3. On the server side while it is running, resetting
queues do not affect it. Didn't check for packet drops, though...
>
>> If we ensure that is the cse with this patchset, would that resolve
>> your concerns Jakub?
>
> For ZC we expect the queues to be taken out of the main context.
> IIUC it'd be a significant improvement over the status quo if
> we could check which contexts the queue is in (incl. context 0)
> and only clamp MRU on those.
Got it, thanks. Michael, is that something the FW is able to handle
without affecting the queue reset behaviour?
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
2025-05-27 15:37 ` David Wei
@ 2025-05-27 16:50 ` Michael Chan
0 siblings, 0 replies; 23+ messages in thread
From: Michael Chan @ 2025-05-27 16:50 UTC (permalink / raw)
To: David Wei
Cc: Jakub Kicinski, davem, netdev, edumazet, pabeni, andrew+netdev,
pavan.chebbi, andrew.gospodarek
[-- Attachment #1: Type: text/plain, Size: 622 bytes --]
On Tue, May 27, 2025 at 8:37 AM David Wei <dw@davidwei.uk> wrote:
>
> On 2025-05-22 16:26, Jakub Kicinski wrote:
> > For ZC we expect the queues to be taken out of the main context.
> > IIUC it'd be a significant improvement over the status quo if
> > we could check which contexts the queue is in (incl. context 0)
> > and only clamp MRU on those.
>
> Got it, thanks. Michael, is that something the FW is able to handle
> without affecting the queue reset behaviour?
We are looking into ways to limit the reset to the VNIC containing the
ring being reset. If it works, it should address Jakub's concern.
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4196 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH net 0/3] bnxt_en: Bug fixes
@ 2025-06-13 23:18 Michael Chan
2025-06-13 23:18 ` [PATCH net 1/3] bnxt_en: Fix double invocation of bnxt_ulp_stop()/bnxt_ulp_start() Michael Chan
` (3 more replies)
0 siblings, 4 replies; 23+ messages in thread
From: Michael Chan @ 2025-06-13 23:18 UTC (permalink / raw)
To: davem
Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
andrew.gospodarek
Thie first patch fixes a crash during PCIe AER when the bnxt_re RoCE
driver is loaded. The second patch is a refactor patch needed by
patch 3. Patch 3 fixes a packet drop issue if queue restart is done
on a ring belonging to a non-default RSS context. Patch 2 and 3 are
version 2 that has addressed the v1 issue by reducing the scope of
the traffic disruptions:
https://lore.kernel.org/netdev/CACKFLi=P9xYHVF4h2Ovjd-8DaoyzFAHnY6Y6H+1b7eGq+BQZzA@mail.gmail.com/
Kalesh AP (1):
bnxt_en: Fix double invocation of bnxt_ulp_stop()/bnxt_ulp_start()
Pavan Chebbi (2):
bnxt_en: Add a helper function to configure MRU and RSS
bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 87 ++++++++++++++++---
drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 24 +++--
2 files changed, 84 insertions(+), 27 deletions(-)
--
2.30.1
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH net 1/3] bnxt_en: Fix double invocation of bnxt_ulp_stop()/bnxt_ulp_start()
2025-06-13 23:18 [PATCH net 0/3] bnxt_en: Bug fixes Michael Chan
@ 2025-06-13 23:18 ` Michael Chan
2025-06-16 13:39 ` Simon Horman
2025-06-13 23:18 ` [PATCH net 2/3] bnxt_en: Add a helper function to configure MRU and RSS Michael Chan
` (2 subsequent siblings)
3 siblings, 1 reply; 23+ messages in thread
From: Michael Chan @ 2025-06-13 23:18 UTC (permalink / raw)
To: davem
Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
andrew.gospodarek, Kalesh AP
From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Before the commit under the Fixes tag below, bnxt_ulp_stop() and
bnxt_ulp_start() were always invoked in pairs. After that commit,
the new bnxt_ulp_restart() can be invoked after bnxt_ulp_stop()
has been called. This may result in the RoCE driver's aux driver
.suspend() method being invoked twice. The 2nd bnxt_re_suspend()
call will crash when it dereferences a NULL pointer:
(NULL ib_device): Handle device suspend call
BUG: kernel NULL pointer dereference, address: 0000000000000b78
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP PTI
CPU: 20 UID: 0 PID: 181 Comm: kworker/u96:5 Tainted: G S 6.15.0-rc1 #4 PREEMPT(voluntary)
Tainted: [S]=CPU_OUT_OF_SPEC
Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017
Workqueue: bnxt_pf_wq bnxt_sp_task [bnxt_en]
RIP: 0010:bnxt_re_suspend+0x45/0x1f0 [bnxt_re]
Code: 8b 05 a7 3c 5b f5 48 89 44 24 18 31 c0 49 8b 5c 24 08 4d 8b 2c 24 e8 ea 06 0a f4 48 c7 c6 04 60 52 c0 48 89 df e8 1b ce f9 ff <48> 8b 83 78 0b 00 00 48 8b 80 38 03 00 00 a8 40 0f 85 b5 00 00 00
RSP: 0018:ffffa2e84084fd88 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000001
RDX: 0000000000000000 RSI: ffffffffb4b6b934 RDI: 00000000ffffffff
RBP: ffffa1760954c9c0 R08: 0000000000000000 R09: c0000000ffffdfff
R10: 0000000000000001 R11: ffffa2e84084fb50 R12: ffffa176031ef070
R13: ffffa17609775000 R14: ffffa17603adc180 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffffa17daa397000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000b78 CR3: 00000004aaa30003 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
bnxt_ulp_stop+0x69/0x90 [bnxt_en]
bnxt_sp_task+0x678/0x920 [bnxt_en]
? __schedule+0x514/0xf50
process_scheduled_works+0x9d/0x400
worker_thread+0x11c/0x260
? __pfx_worker_thread+0x10/0x10
kthread+0xfe/0x1e0
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2b/0x40
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
Check the BNXT_EN_FLAG_ULP_STOPPED flag and do not proceed if the flag
is already set. This will preserve the original symmetrical
bnxt_ulp_stop() and bnxt_ulp_start().
Also, inside bnxt_ulp_start(), clear the BNXT_EN_FLAG_ULP_STOPPED
flag after taking the mutex to avoid any race condition. And for
symmetry, only proceed in bnxt_ulp_start() if the
BNXT_EN_FLAG_ULP_STOPPED is set.
Fixes: 3c163f35bd50 ("bnxt_en: Optimize recovery path ULP locking in the driver")
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Co-developed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 24 ++++++++-----------
1 file changed, 10 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
index 84c4812414fd..2450a369b792 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
@@ -231,10 +231,9 @@ void bnxt_ulp_stop(struct bnxt *bp)
return;
mutex_lock(&edev->en_dev_lock);
- if (!bnxt_ulp_registered(edev)) {
- mutex_unlock(&edev->en_dev_lock);
- return;
- }
+ if (!bnxt_ulp_registered(edev) ||
+ (edev->flags & BNXT_EN_FLAG_ULP_STOPPED))
+ goto ulp_stop_exit;
edev->flags |= BNXT_EN_FLAG_ULP_STOPPED;
if (aux_priv) {
@@ -250,6 +249,7 @@ void bnxt_ulp_stop(struct bnxt *bp)
adrv->suspend(adev, pm);
}
}
+ulp_stop_exit:
mutex_unlock(&edev->en_dev_lock);
}
@@ -258,19 +258,13 @@ void bnxt_ulp_start(struct bnxt *bp, int err)
struct bnxt_aux_priv *aux_priv = bp->aux_priv;
struct bnxt_en_dev *edev = bp->edev;
- if (!edev)
- return;
-
- edev->flags &= ~BNXT_EN_FLAG_ULP_STOPPED;
-
- if (err)
+ if (!edev || err)
return;
mutex_lock(&edev->en_dev_lock);
- if (!bnxt_ulp_registered(edev)) {
- mutex_unlock(&edev->en_dev_lock);
- return;
- }
+ if (!bnxt_ulp_registered(edev) ||
+ !(edev->flags & BNXT_EN_FLAG_ULP_STOPPED))
+ goto ulp_start_exit;
if (edev->ulp_tbl->msix_requested)
bnxt_fill_msix_vecs(bp, edev->msix_entries);
@@ -287,6 +281,8 @@ void bnxt_ulp_start(struct bnxt *bp, int err)
adrv->resume(adev);
}
}
+ulp_start_exit:
+ edev->flags &= ~BNXT_EN_FLAG_ULP_STOPPED;
mutex_unlock(&edev->en_dev_lock);
}
--
2.30.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH net 2/3] bnxt_en: Add a helper function to configure MRU and RSS
2025-06-13 23:18 [PATCH net 0/3] bnxt_en: Bug fixes Michael Chan
2025-06-13 23:18 ` [PATCH net 1/3] bnxt_en: Fix double invocation of bnxt_ulp_stop()/bnxt_ulp_start() Michael Chan
@ 2025-06-13 23:18 ` Michael Chan
2025-06-13 23:18 ` [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset Michael Chan
2025-06-17 23:10 ` [PATCH net 0/3] bnxt_en: Bug fixes patchwork-bot+netdevbpf
3 siblings, 0 replies; 23+ messages in thread
From: Michael Chan @ 2025-06-13 23:18 UTC (permalink / raw)
To: davem
Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
andrew.gospodarek, Simon Horman, David Wei
From: Pavan Chebbi <pavan.chebbi@broadcom.com>
Add a new helper function that will configure MRU and RSS table
of a VNIC. This will be useful when we configure both on a VNIC
when resetting an RX ring. This function will be used again in
the next bug fix patch where we have to reconfigure VNICs for RSS
contexts.
Suggested-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: David Wei <dw@davidwei.uk>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
Cc: David Wei <dw@davidwei.uk>
v1: https://lore.kernel.org/netdev/20250519204130.3097027-3-michael.chan@broadcom.com/
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 37 ++++++++++++++++-------
1 file changed, 26 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 869580b6f70d..dfd2366d4c8c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -10780,6 +10780,26 @@ void bnxt_del_one_rss_ctx(struct bnxt *bp, struct bnxt_rss_ctx *rss_ctx,
bp->num_rss_ctx--;
}
+static int bnxt_set_vnic_mru_p5(struct bnxt *bp, struct bnxt_vnic_info *vnic,
+ u16 mru)
+{
+ int rc;
+
+ if (mru) {
+ rc = bnxt_hwrm_vnic_set_rss_p5(bp, vnic, true);
+ if (rc) {
+ netdev_err(bp->dev, "hwrm vnic %d set rss failure rc: %d\n",
+ vnic->vnic_id, rc);
+ return rc;
+ }
+ }
+ vnic->mru = mru;
+ bnxt_hwrm_vnic_update(bp, vnic,
+ VNIC_UPDATE_REQ_ENABLES_MRU_VALID);
+
+ return 0;
+}
+
static void bnxt_hwrm_realloc_rss_ctx_vnic(struct bnxt *bp)
{
bool set_tpa = !!(bp->flags & BNXT_FLAG_TPA);
@@ -15927,6 +15947,7 @@ static int bnxt_queue_start(struct net_device *dev, void *qmem, int idx)
struct bnxt_vnic_info *vnic;
struct bnxt_napi *bnapi;
int i, rc;
+ u16 mru;
rxr = &bp->rx_ring[idx];
clone = qmem;
@@ -15977,18 +15998,13 @@ static int bnxt_queue_start(struct net_device *dev, void *qmem, int idx)
napi_enable_locked(&bnapi->napi);
bnxt_db_nq_arm(bp, &cpr->cp_db, cpr->cp_raw_cons);
+ mru = bp->dev->mtu + ETH_HLEN + VLAN_HLEN;
for (i = 0; i < bp->nr_vnics; i++) {
vnic = &bp->vnic_info[i];
- rc = bnxt_hwrm_vnic_set_rss_p5(bp, vnic, true);
- if (rc) {
- netdev_err(bp->dev, "hwrm vnic %d set rss failure rc: %d\n",
- vnic->vnic_id, rc);
+ rc = bnxt_set_vnic_mru_p5(bp, vnic, mru);
+ if (rc)
return rc;
- }
- vnic->mru = bp->dev->mtu + ETH_HLEN + VLAN_HLEN;
- bnxt_hwrm_vnic_update(bp, vnic,
- VNIC_UPDATE_REQ_ENABLES_MRU_VALID);
}
return 0;
@@ -16013,9 +16029,8 @@ static int bnxt_queue_stop(struct net_device *dev, void *qmem, int idx)
for (i = 0; i < bp->nr_vnics; i++) {
vnic = &bp->vnic_info[i];
- vnic->mru = 0;
- bnxt_hwrm_vnic_update(bp, vnic,
- VNIC_UPDATE_REQ_ENABLES_MRU_VALID);
+
+ bnxt_set_vnic_mru_p5(bp, vnic, 0);
}
/* Make sure NAPI sees that the VNIC is disabled */
synchronize_net();
--
2.30.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
2025-06-13 23:18 [PATCH net 0/3] bnxt_en: Bug fixes Michael Chan
2025-06-13 23:18 ` [PATCH net 1/3] bnxt_en: Fix double invocation of bnxt_ulp_stop()/bnxt_ulp_start() Michael Chan
2025-06-13 23:18 ` [PATCH net 2/3] bnxt_en: Add a helper function to configure MRU and RSS Michael Chan
@ 2025-06-13 23:18 ` Michael Chan
2025-06-16 13:38 ` Simon Horman
2025-06-17 23:10 ` [PATCH net 0/3] bnxt_en: Bug fixes patchwork-bot+netdevbpf
3 siblings, 1 reply; 23+ messages in thread
From: Michael Chan @ 2025-06-13 23:18 UTC (permalink / raw)
To: davem
Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
andrew.gospodarek, David Wei
From: Pavan Chebbi <pavan.chebbi@broadcom.com>
The commit under the Fixes tag below which updates the VNICs' RSS
and MRU during .ndo_queue_start(), needs to be extended to cover any
non-default RSS contexts which have their own VNICs. Without this
step, packets that are destined to a non-default RSS context may be
dropped after .ndo_queue_start().
We further optimize this scheme by updating the VNIC only if the
RX ring being restarted is in the RSS table of the VNIC. Updating
the VNIC (in particular setting the MRU to 0) will momentarily stop
all traffic to all rings in the RSS table. Any VNIC that has the
RX ring excluded from the RSS table can skip this step and avoid the
traffic disruption.
Note that this scheme is just an improvement. A VNIC with multiple
rings in the RSS table will still see traffic disruptions to all rings
in the RSS table when one of the rings is being restarted. We are
working on a FW scheme that will improve upon this further.
Fixes: 5ac066b7b062 ("bnxt_en: Fix queue start to update vnic RSS table")
Reported-by: David Wei <dw@davidwei.uk>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
Cc: David Wei <dw@davidwei.uk>
v2: Reduce scope of traffic disruptions.
v1: https://lore.kernel.org/netdev/20250519204130.3097027-4-michael.chan@broadcom.com/
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 56 +++++++++++++++++++++--
1 file changed, 51 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index dfd2366d4c8c..2cb3185c442c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -10780,11 +10780,39 @@ void bnxt_del_one_rss_ctx(struct bnxt *bp, struct bnxt_rss_ctx *rss_ctx,
bp->num_rss_ctx--;
}
+static bool bnxt_vnic_has_rx_ring(struct bnxt *bp, struct bnxt_vnic_info *vnic,
+ int rxr_id)
+{
+ u16 tbl_size = bnxt_get_rxfh_indir_size(bp->dev);
+ int i, vnic_rx;
+
+ /* Ntuple VNIC always has all the rx rings. Any change of ring id
+ * must be updated because a future filter may use it.
+ */
+ if (vnic->flags & BNXT_VNIC_NTUPLE_FLAG)
+ return true;
+
+ for (i = 0; i < tbl_size; i++) {
+ if (vnic->flags & BNXT_VNIC_RSSCTX_FLAG)
+ vnic_rx = ethtool_rxfh_context_indir(vnic->rss_ctx)[i];
+ else
+ vnic_rx = bp->rss_indir_tbl[i];
+
+ if (rxr_id == vnic_rx)
+ return true;
+ }
+
+ return false;
+}
+
static int bnxt_set_vnic_mru_p5(struct bnxt *bp, struct bnxt_vnic_info *vnic,
- u16 mru)
+ u16 mru, int rxr_id)
{
int rc;
+ if (!bnxt_vnic_has_rx_ring(bp, vnic, rxr_id))
+ return 0;
+
if (mru) {
rc = bnxt_hwrm_vnic_set_rss_p5(bp, vnic, true);
if (rc) {
@@ -10800,6 +10828,24 @@ static int bnxt_set_vnic_mru_p5(struct bnxt *bp, struct bnxt_vnic_info *vnic,
return 0;
}
+static int bnxt_set_rss_ctx_vnic_mru(struct bnxt *bp, u16 mru, int rxr_id)
+{
+ struct ethtool_rxfh_context *ctx;
+ unsigned long context;
+ int rc;
+
+ xa_for_each(&bp->dev->ethtool->rss_ctx, context, ctx) {
+ struct bnxt_rss_ctx *rss_ctx = ethtool_rxfh_context_priv(ctx);
+ struct bnxt_vnic_info *vnic = &rss_ctx->vnic;
+
+ rc = bnxt_set_vnic_mru_p5(bp, vnic, mru, rxr_id);
+ if (rc)
+ return rc;
+ }
+
+ return 0;
+}
+
static void bnxt_hwrm_realloc_rss_ctx_vnic(struct bnxt *bp)
{
bool set_tpa = !!(bp->flags & BNXT_FLAG_TPA);
@@ -16002,12 +16048,11 @@ static int bnxt_queue_start(struct net_device *dev, void *qmem, int idx)
for (i = 0; i < bp->nr_vnics; i++) {
vnic = &bp->vnic_info[i];
- rc = bnxt_set_vnic_mru_p5(bp, vnic, mru);
+ rc = bnxt_set_vnic_mru_p5(bp, vnic, mru, idx);
if (rc)
return rc;
}
-
- return 0;
+ return bnxt_set_rss_ctx_vnic_mru(bp, mru, idx);
err_reset:
netdev_err(bp->dev, "Unexpected HWRM error during queue start rc: %d\n",
@@ -16030,8 +16075,9 @@ static int bnxt_queue_stop(struct net_device *dev, void *qmem, int idx)
for (i = 0; i < bp->nr_vnics; i++) {
vnic = &bp->vnic_info[i];
- bnxt_set_vnic_mru_p5(bp, vnic, 0);
+ bnxt_set_vnic_mru_p5(bp, vnic, 0, idx);
}
+ bnxt_set_rss_ctx_vnic_mru(bp, 0, idx);
/* Make sure NAPI sees that the VNIC is disabled */
synchronize_net();
rxr = &bp->rx_ring[idx];
--
2.30.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
2025-06-13 23:18 ` [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset Michael Chan
@ 2025-06-16 13:38 ` Simon Horman
2025-06-16 17:40 ` Michael Chan
0 siblings, 1 reply; 23+ messages in thread
From: Simon Horman @ 2025-06-16 13:38 UTC (permalink / raw)
To: Michael Chan
Cc: davem, netdev, edumazet, kuba, pabeni, andrew+netdev,
pavan.chebbi, andrew.gospodarek, David Wei
On Fri, Jun 13, 2025 at 04:18:41PM -0700, Michael Chan wrote:
> From: Pavan Chebbi <pavan.chebbi@broadcom.com>
>
> The commit under the Fixes tag below which updates the VNICs' RSS
> and MRU during .ndo_queue_start(), needs to be extended to cover any
> non-default RSS contexts which have their own VNICs. Without this
> step, packets that are destined to a non-default RSS context may be
> dropped after .ndo_queue_start().
Hi,
This patch seems to be doing two things:
1. Addressing the bug described above
2. Implementing the optimisation below
As such I think it would be best split into two patches.
And I'd lean towards targeting the optimisation at net-next
since "this scheme is just an improvement".
>
> We further optimize this scheme by updating the VNIC only if the
> RX ring being restarted is in the RSS table of the VNIC. Updating
> the VNIC (in particular setting the MRU to 0) will momentarily stop
> all traffic to all rings in the RSS table. Any VNIC that has the
> RX ring excluded from the RSS table can skip this step and avoid the
> traffic disruption.
>
> Note that this scheme is just an improvement. A VNIC with multiple
> rings in the RSS table will still see traffic disruptions to all rings
> in the RSS table when one of the rings is being restarted. We are
> working on a FW scheme that will improve upon this further.
>
> Fixes: 5ac066b7b062 ("bnxt_en: Fix queue start to update vnic RSS table")
> Reported-by: David Wei <dw@davidwei.uk>
> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
> Signed-off-by: Michael Chan <michael.chan@broadcom.com>
...
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net 1/3] bnxt_en: Fix double invocation of bnxt_ulp_stop()/bnxt_ulp_start()
2025-06-13 23:18 ` [PATCH net 1/3] bnxt_en: Fix double invocation of bnxt_ulp_stop()/bnxt_ulp_start() Michael Chan
@ 2025-06-16 13:39 ` Simon Horman
0 siblings, 0 replies; 23+ messages in thread
From: Simon Horman @ 2025-06-16 13:39 UTC (permalink / raw)
To: Michael Chan
Cc: davem, netdev, edumazet, kuba, pabeni, andrew+netdev,
pavan.chebbi, andrew.gospodarek, Kalesh AP
On Fri, Jun 13, 2025 at 04:18:39PM -0700, Michael Chan wrote:
> From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
>
> Before the commit under the Fixes tag below, bnxt_ulp_stop() and
> bnxt_ulp_start() were always invoked in pairs. After that commit,
> the new bnxt_ulp_restart() can be invoked after bnxt_ulp_stop()
> has been called. This may result in the RoCE driver's aux driver
> .suspend() method being invoked twice. The 2nd bnxt_re_suspend()
> call will crash when it dereferences a NULL pointer:
>
> (NULL ib_device): Handle device suspend call
> BUG: kernel NULL pointer dereference, address: 0000000000000b78
> PGD 0 P4D 0
> Oops: Oops: 0000 [#1] SMP PTI
> CPU: 20 UID: 0 PID: 181 Comm: kworker/u96:5 Tainted: G S 6.15.0-rc1 #4 PREEMPT(voluntary)
> Tainted: [S]=CPU_OUT_OF_SPEC
> Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017
> Workqueue: bnxt_pf_wq bnxt_sp_task [bnxt_en]
> RIP: 0010:bnxt_re_suspend+0x45/0x1f0 [bnxt_re]
> Code: 8b 05 a7 3c 5b f5 48 89 44 24 18 31 c0 49 8b 5c 24 08 4d 8b 2c 24 e8 ea 06 0a f4 48 c7 c6 04 60 52 c0 48 89 df e8 1b ce f9 ff <48> 8b 83 78 0b 00 00 48 8b 80 38 03 00 00 a8 40 0f 85 b5 00 00 00
> RSP: 0018:ffffa2e84084fd88 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000001
> RDX: 0000000000000000 RSI: ffffffffb4b6b934 RDI: 00000000ffffffff
> RBP: ffffa1760954c9c0 R08: 0000000000000000 R09: c0000000ffffdfff
> R10: 0000000000000001 R11: ffffa2e84084fb50 R12: ffffa176031ef070
> R13: ffffa17609775000 R14: ffffa17603adc180 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffffa17daa397000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000b78 CR3: 00000004aaa30003 CR4: 00000000003706f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> bnxt_ulp_stop+0x69/0x90 [bnxt_en]
> bnxt_sp_task+0x678/0x920 [bnxt_en]
> ? __schedule+0x514/0xf50
> process_scheduled_works+0x9d/0x400
> worker_thread+0x11c/0x260
> ? __pfx_worker_thread+0x10/0x10
> kthread+0xfe/0x1e0
> ? __pfx_kthread+0x10/0x10
> ret_from_fork+0x2b/0x40
> ? __pfx_kthread+0x10/0x10
> ret_from_fork_asm+0x1a/0x30
>
> Check the BNXT_EN_FLAG_ULP_STOPPED flag and do not proceed if the flag
> is already set. This will preserve the original symmetrical
> bnxt_ulp_stop() and bnxt_ulp_start().
>
> Also, inside bnxt_ulp_start(), clear the BNXT_EN_FLAG_ULP_STOPPED
> flag after taking the mutex to avoid any race condition. And for
> symmetry, only proceed in bnxt_ulp_start() if the
> BNXT_EN_FLAG_ULP_STOPPED is set.
>
> Fixes: 3c163f35bd50 ("bnxt_en: Optimize recovery path ULP locking in the driver")
> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> Co-developed-by: Michael Chan <michael.chan@broadcom.com>
> Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
2025-06-16 13:38 ` Simon Horman
@ 2025-06-16 17:40 ` Michael Chan
2025-06-16 20:48 ` Jakub Kicinski
0 siblings, 1 reply; 23+ messages in thread
From: Michael Chan @ 2025-06-16 17:40 UTC (permalink / raw)
To: Simon Horman
Cc: davem, netdev, edumazet, kuba, pabeni, andrew+netdev,
pavan.chebbi, andrew.gospodarek, David Wei
[-- Attachment #1: Type: text/plain, Size: 980 bytes --]
On Mon, Jun 16, 2025 at 6:39 AM Simon Horman <horms@kernel.org> wrote:
>
> On Fri, Jun 13, 2025 at 04:18:41PM -0700, Michael Chan wrote:
> > From: Pavan Chebbi <pavan.chebbi@broadcom.com>
> >
> > The commit under the Fixes tag below which updates the VNICs' RSS
> > and MRU during .ndo_queue_start(), needs to be extended to cover any
> > non-default RSS contexts which have their own VNICs. Without this
> > step, packets that are destined to a non-default RSS context may be
> > dropped after .ndo_queue_start().
>
> Hi,
>
> This patch seems to be doing two things:
> 1. Addressing the bug described above
> 2. Implementing the optimisation below
>
> As such I think it would be best split into two patches.
> And I'd lean towards targeting the optimisation at net-next
> since "this scheme is just an improvement".
The original fix (without the improvement) was rejected by Jakub and
that's why we are improving it here.
Jakub, what do you think?
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4196 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
2025-06-16 17:40 ` Michael Chan
@ 2025-06-16 20:48 ` Jakub Kicinski
2025-06-16 21:00 ` Michael Chan
0 siblings, 1 reply; 23+ messages in thread
From: Jakub Kicinski @ 2025-06-16 20:48 UTC (permalink / raw)
To: Michael Chan
Cc: Simon Horman, davem, netdev, edumazet, pabeni, andrew+netdev,
pavan.chebbi, andrew.gospodarek, David Wei
On Mon, 16 Jun 2025 10:40:27 -0700 Michael Chan wrote:
> On Mon, Jun 16, 2025 at 6:39 AM Simon Horman <horms@kernel.org> wrote:
> > On Fri, Jun 13, 2025 at 04:18:41PM -0700, Michael Chan wrote:
> > > From: Pavan Chebbi <pavan.chebbi@broadcom.com>
> > >
> > > The commit under the Fixes tag below which updates the VNICs' RSS
> > > and MRU during .ndo_queue_start(), needs to be extended to cover any
> > > non-default RSS contexts which have their own VNICs. Without this
> > > step, packets that are destined to a non-default RSS context may be
> > > dropped after .ndo_queue_start().
> >
> > This patch seems to be doing two things:
> > 1. Addressing the bug described above
> > 2. Implementing the optimisation below
> >
> > As such I think it would be best split into two patches.
> > And I'd lean towards targeting the optimisation at net-next
> > since "this scheme is just an improvement".
>
> The original fix (without the improvement) was rejected by Jakub and
> that's why we are improving it here.
>
> Jakub, what do you think?
I think the phrasing of the commit message could be better, but the fix
is correct as is. We were shutting down just the main vNIC, now we shut
down all the vNICs to which the queue belongs.
It's not an "optimization" in the sense of an improvement to status quo,
IIUC Pavan means that shutting down the vNIC is still not 100% correct
for single queue reset, but best we can do with current FW. If we were
to split this into 2 changes, I don't think those changes would form a
logical progression: reset vNIC 0 (current) -> reset all (net) -> reset
the correct set of vNICs (net-next).. ?
I'd chalk this up to people writing sassy / opinion-tainted commit
messages after reviewers disagree with their initial implementation.
I tend not to fight it :)
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
2025-06-16 20:48 ` Jakub Kicinski
@ 2025-06-16 21:00 ` Michael Chan
0 siblings, 0 replies; 23+ messages in thread
From: Michael Chan @ 2025-06-16 21:00 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Simon Horman, davem, netdev, edumazet, pabeni, andrew+netdev,
pavan.chebbi, andrew.gospodarek, David Wei
[-- Attachment #1: Type: text/plain, Size: 958 bytes --]
On Mon, Jun 16, 2025 at 1:48 PM Jakub Kicinski <kuba@kernel.org> wrote:
> I think the phrasing of the commit message could be better, but the fix
> is correct as is. We were shutting down just the main vNIC, now we shut
> down all the vNICs to which the queue belongs.
>
> It's not an "optimization" in the sense of an improvement to status quo,
> IIUC Pavan means that shutting down the vNIC is still not 100% correct
> for single queue reset, but best we can do with current FW.
Correct. This is the best we can do at the moment to limit the scope
of the reset. In the future with new FW, we should be able to limit
the traffic disruption to only the queue being reset for any vNIC.
> If we were
> to split this into 2 changes, I don't think those changes would form a
> logical progression: reset vNIC 0 (current) -> reset all (net) -> reset
> the correct set of vNICs (net-next).. ?
>
Agreed. This progression is not ideal.
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4196 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net 0/3] bnxt_en: Bug fixes
2025-06-13 23:18 [PATCH net 0/3] bnxt_en: Bug fixes Michael Chan
` (2 preceding siblings ...)
2025-06-13 23:18 ` [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset Michael Chan
@ 2025-06-17 23:10 ` patchwork-bot+netdevbpf
3 siblings, 0 replies; 23+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-06-17 23:10 UTC (permalink / raw)
To: Michael Chan
Cc: davem, netdev, edumazet, kuba, pabeni, andrew+netdev,
pavan.chebbi, andrew.gospodarek
Hello:
This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Fri, 13 Jun 2025 16:18:38 -0700 you wrote:
> Thie first patch fixes a crash during PCIe AER when the bnxt_re RoCE
> driver is loaded. The second patch is a refactor patch needed by
> patch 3. Patch 3 fixes a packet drop issue if queue restart is done
> on a ring belonging to a non-default RSS context. Patch 2 and 3 are
> version 2 that has addressed the v1 issue by reducing the scope of
> the traffic disruptions:
>
> [...]
Here is the summary with links:
- [net,1/3] bnxt_en: Fix double invocation of bnxt_ulp_stop()/bnxt_ulp_start()
https://git.kernel.org/netdev/net/c/1e9ac33fa271
- [net,2/3] bnxt_en: Add a helper function to configure MRU and RSS
https://git.kernel.org/netdev/net/c/e11baaea94e2
- [net,3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset
https://git.kernel.org/netdev/net/c/5dacc94c6fe6
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2025-06-17 23:10 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-13 23:18 [PATCH net 0/3] bnxt_en: Bug fixes Michael Chan
2025-06-13 23:18 ` [PATCH net 1/3] bnxt_en: Fix double invocation of bnxt_ulp_stop()/bnxt_ulp_start() Michael Chan
2025-06-16 13:39 ` Simon Horman
2025-06-13 23:18 ` [PATCH net 2/3] bnxt_en: Add a helper function to configure MRU and RSS Michael Chan
2025-06-13 23:18 ` [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset Michael Chan
2025-06-16 13:38 ` Simon Horman
2025-06-16 17:40 ` Michael Chan
2025-06-16 20:48 ` Jakub Kicinski
2025-06-16 21:00 ` Michael Chan
2025-06-17 23:10 ` [PATCH net 0/3] bnxt_en: Bug fixes patchwork-bot+netdevbpf
-- strict thread matches above, loose matches on Subject: below --
2025-05-19 20:41 [PATCH net 0/3] bnxt_en: 2 bug fixes Michael Chan
2025-05-19 20:41 ` [PATCH net 3/3] bnxt_en: Update MRU and RSS table of RSS contexts on queue reset Michael Chan
2025-05-20 14:42 ` Simon Horman
2025-05-20 14:59 ` David Wei
2025-05-21 1:28 ` Jakub Kicinski
2025-05-21 1:38 ` Michael Chan
2025-05-21 1:51 ` Jakub Kicinski
2025-05-21 2:10 ` Michael Chan
2025-05-21 2:17 ` Jakub Kicinski
2025-05-21 2:29 ` Michael Chan
2025-05-22 11:01 ` David Wei
2025-05-22 15:26 ` Jakub Kicinski
2025-05-27 15:37 ` David Wei
2025-05-27 16:50 ` Michael Chan
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).