netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/4] bnxt_en: Update for net-next
@ 2025-04-15 17:48 Michael Chan
  2025-04-15 17:48 ` [PATCH net-next 1/4] bnxt_en: Change FW message timeout warning Michael Chan
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Michael Chan @ 2025-04-15 17:48 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek

The first patch changes the FW message timeout threshold for a warning
message.  The second patch adjusts the ethtool -w coredump length to
suppress a warning.  The last 2 patches are small cleanup patches for
the bnxt_ulp RoCE auxbus code.

Kalesh AP (2):
  bnxt_en: Remove unused field "ref_count" in struct bnxt_ulp
  bnxt_en: Remove unused macros in bnxt_ulp.h

Michael Chan (1):
  bnxt_en: Change FW message timeout warning

Shruti Parab (1):
  bnxt_en: Report the ethtool coredump length after copying the coredump

 drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.c | 11 +++++++++--
 drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.h     |  2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c      |  5 -----
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h      |  4 ----
 4 files changed, 10 insertions(+), 12 deletions(-)

-- 
2.30.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH net-next 1/4] bnxt_en: Change FW message timeout warning
  2025-04-15 17:48 [PATCH net-next 0/4] bnxt_en: Update for net-next Michael Chan
@ 2025-04-15 17:48 ` Michael Chan
  2025-04-16  3:14   ` Jakub Kicinski
  2025-04-15 17:48 ` [PATCH net-next 2/4] bnxt_en: Report the ethtool coredump length after copying the coredump Michael Chan
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Michael Chan @ 2025-04-15 17:48 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, Kalesh AP

The firmware advertises a "hwrm_cmd_max_timeout" value to the driver
for NVRAM and coredump related functions that can take tens of seconds
to complete.  The driver polls for the operation to complete under
mutex and may trigger hung task watchdog warning if the wait is too long.
To warn the user about this, the driver currently prints a warning if
this advertised value exceeds 40 seconds:

Device requests max timeout of %d seconds, may trigger hung task watchdog

The original hope was that newer FW would be able to reduce this timeout
below 40 seconds.  But 60 seconds is the timeout on most production FW
and cannot be reduced further.  Change the driver's warning threshold to
60 seconds to avoid triggering this warning on all production devices.

Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.h
index 15ca51b5d204..fb5f5b063c3d 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.h
@@ -58,7 +58,7 @@ void hwrm_update_token(struct bnxt *bp, u16 seq, enum bnxt_hwrm_wait_state s);
 
 #define BNXT_HWRM_MAX_REQ_LEN		(bp->hwrm_max_req_len)
 #define BNXT_HWRM_SHORT_REQ_LEN		sizeof(struct hwrm_short_input)
-#define HWRM_CMD_MAX_TIMEOUT		40000U
+#define HWRM_CMD_MAX_TIMEOUT		60000U
 #define SHORT_HWRM_CMD_TIMEOUT		20
 #define HWRM_CMD_TIMEOUT		(bp->hwrm_cmd_timeout)
 #define HWRM_RESET_TIMEOUT		((HWRM_CMD_TIMEOUT) * 4)
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net-next 2/4] bnxt_en: Report the ethtool coredump length after copying the coredump
  2025-04-15 17:48 [PATCH net-next 0/4] bnxt_en: Update for net-next Michael Chan
  2025-04-15 17:48 ` [PATCH net-next 1/4] bnxt_en: Change FW message timeout warning Michael Chan
@ 2025-04-15 17:48 ` Michael Chan
  2025-04-15 17:48 ` [PATCH net-next 3/4] bnxt_en: Remove unused field "ref_count" in struct bnxt_ulp Michael Chan
  2025-04-15 17:48 ` [PATCH net-next 4/4] bnxt_en: Remove unused macros in bnxt_ulp.h Michael Chan
  3 siblings, 0 replies; 9+ messages in thread
From: Michael Chan @ 2025-04-15 17:48 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, Shruti Parab, Damodharam Ammepalli

From: Shruti Parab <shruti.parab@broadcom.com>

ethtool first calls .get_dump_flags() to get the dump length.  For
coredump, the driver calls the FW to get the coredump length (L1).  The
min. of L1 and the user specified length is then passed to
.get_dump_data() (L2) to get the coredump.  The actual coredump length
retrieved by the FW (L3) during .get_dump_data() may be smaller than L1.
This length discrepancy will trigger a WARN_ON() in
ethtool_get_dump_data().

ethtool has already vzalloc'ed a buffer with size L1.  Just report
the coredump length as L2 even though the actual coredump length L3
may be smaller.  The extra zero padding does not matter.  This will
prevent the warning that may alarm the user.

For correctness, only do the final length update if there is no error.

Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Signed-off-by: Shruti Parab <shruti.parab@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.c
index 5576e7cf8463..9b6489e417fc 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.c
@@ -496,9 +496,16 @@ static int __bnxt_get_coredump(struct bnxt *bp, u16 dump_type, void *buf,
 					  start_utc, coredump.total_segs + 1,
 					  rc);
 	kfree(coredump.data);
-	*dump_len += sizeof(struct bnxt_coredump_record);
-	if (rc == -ENOBUFS)
+	if (!rc) {
+		*dump_len += sizeof(struct bnxt_coredump_record);
+		/* The actual coredump length can be smaller than the FW
+		 * reported length earlier.  Use the ethtool provided length.
+		 */
+		if (buf_len)
+			*dump_len = buf_len;
+	} else if (rc == -ENOBUFS) {
 		netdev_err(bp->dev, "Firmware returned large coredump buffer\n");
+	}
 	return rc;
 }
 
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net-next 3/4] bnxt_en: Remove unused field "ref_count" in struct bnxt_ulp
  2025-04-15 17:48 [PATCH net-next 0/4] bnxt_en: Update for net-next Michael Chan
  2025-04-15 17:48 ` [PATCH net-next 1/4] bnxt_en: Change FW message timeout warning Michael Chan
  2025-04-15 17:48 ` [PATCH net-next 2/4] bnxt_en: Report the ethtool coredump length after copying the coredump Michael Chan
@ 2025-04-15 17:48 ` Michael Chan
  2025-04-15 17:48 ` [PATCH net-next 4/4] bnxt_en: Remove unused macros in bnxt_ulp.h Michael Chan
  3 siblings, 0 replies; 9+ messages in thread
From: Michael Chan @ 2025-04-15 17:48 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, Kalesh AP, Somnath Kotur

From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>

The "ref_count" field in struct bnxt_ulp is unused after
commit a43c26fa2e6c ("RDMA/bnxt_re: Remove the sriov config callback").
So we can just remove it now.

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 5 -----
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h | 1 -
 2 files changed, 6 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
index a8e930d5dbb0..238db9a1aebf 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
@@ -148,7 +148,6 @@ void bnxt_unregister_dev(struct bnxt_en_dev *edev)
 	struct net_device *dev = edev->net;
 	struct bnxt *bp = netdev_priv(dev);
 	struct bnxt_ulp *ulp;
-	int i = 0;
 
 	ulp = edev->ulp_tbl;
 	netdev_lock(dev);
@@ -164,10 +163,6 @@ void bnxt_unregister_dev(struct bnxt_en_dev *edev)
 	synchronize_rcu();
 	ulp->max_async_event_id = 0;
 	ulp->async_events_bmap = NULL;
-	while (atomic_read(&ulp->ref_count) != 0 && i < 10) {
-		msleep(100);
-		i++;
-	}
 	mutex_unlock(&edev->en_dev_lock);
 	netdev_unlock(dev);
 	return;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h
index 7fa3b8d1ebd2..f6b5efb5e775 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h
@@ -50,7 +50,6 @@ struct bnxt_ulp {
 	unsigned long	*async_events_bmap;
 	u16		max_async_event_id;
 	u16		msix_requested;
-	atomic_t	ref_count;
 };
 
 struct bnxt_en_dev {
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net-next 4/4] bnxt_en: Remove unused macros in bnxt_ulp.h
  2025-04-15 17:48 [PATCH net-next 0/4] bnxt_en: Update for net-next Michael Chan
                   ` (2 preceding siblings ...)
  2025-04-15 17:48 ` [PATCH net-next 3/4] bnxt_en: Remove unused field "ref_count" in struct bnxt_ulp Michael Chan
@ 2025-04-15 17:48 ` Michael Chan
  3 siblings, 0 replies; 9+ messages in thread
From: Michael Chan @ 2025-04-15 17:48 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, Kalesh AP, Shruti Parab

From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>

BNXT_ROCE_ULP and BNXT_MAX_ULP are no longer used.  Remove them to
clean up the code.

Reviewed-by: Shruti Parab <shruti.parab@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h
index f6b5efb5e775..7b9dd8ebe4bc 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h
@@ -10,9 +10,6 @@
 #ifndef BNXT_ULP_H
 #define BNXT_ULP_H
 
-#define BNXT_ROCE_ULP	0
-#define BNXT_MAX_ULP	1
-
 #define BNXT_MIN_ROCE_CP_RINGS	2
 #define BNXT_MIN_ROCE_STAT_CTXS	1
 
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 1/4] bnxt_en: Change FW message timeout warning
  2025-04-15 17:48 ` [PATCH net-next 1/4] bnxt_en: Change FW message timeout warning Michael Chan
@ 2025-04-16  3:14   ` Jakub Kicinski
  2025-04-16 21:41     ` Michael Chan
  0 siblings, 1 reply; 9+ messages in thread
From: Jakub Kicinski @ 2025-04-16  3:14 UTC (permalink / raw)
  To: Michael Chan
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, Kalesh AP

On Tue, 15 Apr 2025 10:48:15 -0700 Michael Chan wrote:
> The firmware advertises a "hwrm_cmd_max_timeout" value to the driver
> for NVRAM and coredump related functions that can take tens of seconds
> to complete.  The driver polls for the operation to complete under
> mutex and may trigger hung task watchdog warning if the wait is too long.
> To warn the user about this, the driver currently prints a warning if
> this advertised value exceeds 40 seconds:
> 
> Device requests max timeout of %d seconds, may trigger hung task watchdog
> 
> The original hope was that newer FW would be able to reduce this timeout
> below 40 seconds.  But 60 seconds is the timeout on most production FW
> and cannot be reduced further.  Change the driver's warning threshold to
> 60 seconds to avoid triggering this warning on all production devices.
> 
> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
> Signed-off-by: Michael Chan <michael.chan@broadcom.com>
> ---
>  drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.h
> index 15ca51b5d204..fb5f5b063c3d 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.h
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.h
> @@ -58,7 +58,7 @@ void hwrm_update_token(struct bnxt *bp, u16 seq, enum bnxt_hwrm_wait_state s);
>  
>  #define BNXT_HWRM_MAX_REQ_LEN		(bp->hwrm_max_req_len)
>  #define BNXT_HWRM_SHORT_REQ_LEN		sizeof(struct hwrm_short_input)
> -#define HWRM_CMD_MAX_TIMEOUT		40000U
> +#define HWRM_CMD_MAX_TIMEOUT		60000U
>  #define SHORT_HWRM_CMD_TIMEOUT		20
>  #define HWRM_CMD_TIMEOUT		(bp->hwrm_cmd_timeout)
>  #define HWRM_RESET_TIMEOUT		((HWRM_CMD_TIMEOUT) * 4)

sysctl_hung_task_timeout_secs is an exported symbol, and it defaults to 120.
Should you not use it in the warning (assuming I understand the intent
there)?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 1/4] bnxt_en: Change FW message timeout warning
  2025-04-16  3:14   ` Jakub Kicinski
@ 2025-04-16 21:41     ` Michael Chan
  2025-04-16 21:44       ` Jakub Kicinski
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Chan @ 2025-04-16 21:41 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, Kalesh AP

[-- Attachment #1: Type: text/plain, Size: 629 bytes --]

On Tue, Apr 15, 2025 at 8:14 PM Jakub Kicinski <kuba@kernel.org> wrote:

> sysctl_hung_task_timeout_secs is an exported symbol, and it defaults to 120.
> Should you not use it in the warning (assuming I understand the intent
> there)?
Yes, we have considered that.  This is only printed once at driver
load time, but the sysctl value can be changed at any time after the
driver is loaded.  So we just want to use a reasonable value well
below the default sysctl_hung_task_timeout_secs value as the
threshold.

But we can reference and compare with the
sysctl_hung_task_timeout_secs value if that makes more sense.

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4196 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 1/4] bnxt_en: Change FW message timeout warning
  2025-04-16 21:41     ` Michael Chan
@ 2025-04-16 21:44       ` Jakub Kicinski
  2025-04-16 22:07         ` Michael Chan
  0 siblings, 1 reply; 9+ messages in thread
From: Jakub Kicinski @ 2025-04-16 21:44 UTC (permalink / raw)
  To: Michael Chan
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, Kalesh AP

On Wed, 16 Apr 2025 14:41:10 -0700 Michael Chan wrote:
> On Tue, Apr 15, 2025 at 8:14 PM Jakub Kicinski <kuba@kernel.org> wrote:
> 
> > sysctl_hung_task_timeout_secs is an exported symbol, and it defaults to 120.
> > Should you not use it in the warning (assuming I understand the intent
> > there)?  
> Yes, we have considered that.  This is only printed once at driver
> load time, but the sysctl value can be changed at any time after the
> driver is loaded.  So we just want to use a reasonable value well
> below the default sysctl_hung_task_timeout_secs value as the
> threshold.
> 
> But we can reference and compare with the
> sysctl_hung_task_timeout_secs value if that makes more sense.

I see your point.  We could also check against
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT ?

I noticed that some arches set this value really low (10 or 20 sec),
it may be worth warning the users in such cases.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 1/4] bnxt_en: Change FW message timeout warning
  2025-04-16 21:44       ` Jakub Kicinski
@ 2025-04-16 22:07         ` Michael Chan
  0 siblings, 0 replies; 9+ messages in thread
From: Michael Chan @ 2025-04-16 22:07 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, Kalesh AP

[-- Attachment #1: Type: text/plain, Size: 1122 bytes --]

On Wed, Apr 16, 2025 at 2:44 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 16 Apr 2025 14:41:10 -0700 Michael Chan wrote:
> > On Tue, Apr 15, 2025 at 8:14 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > > sysctl_hung_task_timeout_secs is an exported symbol, and it defaults to 120.
> > > Should you not use it in the warning (assuming I understand the intent
> > > there)?
> > Yes, we have considered that.  This is only printed once at driver
> > load time, but the sysctl value can be changed at any time after the
> > driver is loaded.  So we just want to use a reasonable value well
> > below the default sysctl_hung_task_timeout_secs value as the
> > threshold.
> >
> > But we can reference and compare with the
> > sysctl_hung_task_timeout_secs value if that makes more sense.
>
> I see your point.  We could also check against
> CONFIG_DEFAULT_HUNG_TASK_TIMEOUT ?
>
> I noticed that some arches set this value really low (10 or 20 sec),
> it may be worth warning the users in such cases.

OK, that sounds good.  I will check CONFIG_DEFAULT_HUNG_TASK_TIMEOUT
in v2.  Thanks.

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4196 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2025-04-16 22:07 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-15 17:48 [PATCH net-next 0/4] bnxt_en: Update for net-next Michael Chan
2025-04-15 17:48 ` [PATCH net-next 1/4] bnxt_en: Change FW message timeout warning Michael Chan
2025-04-16  3:14   ` Jakub Kicinski
2025-04-16 21:41     ` Michael Chan
2025-04-16 21:44       ` Jakub Kicinski
2025-04-16 22:07         ` Michael Chan
2025-04-15 17:48 ` [PATCH net-next 2/4] bnxt_en: Report the ethtool coredump length after copying the coredump Michael Chan
2025-04-15 17:48 ` [PATCH net-next 3/4] bnxt_en: Remove unused field "ref_count" in struct bnxt_ulp Michael Chan
2025-04-15 17:48 ` [PATCH net-next 4/4] bnxt_en: Remove unused macros in bnxt_ulp.h Michael Chan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).