* [PATCH net-next 1/3] net: ethtool: Track pause storm events
2026-01-22 19:21 [PATCH net-next 0/3] net: ethtool: Track TX pause storm Mohsin Bashir
@ 2026-01-22 19:21 ` Mohsin Bashir
2026-01-23 21:27 ` Oleksij Rempel
2026-01-22 19:21 ` [PATCH net-next 2/3] eth: fbnic: Add protection against pause storm Mohsin Bashir
` (3 subsequent siblings)
4 siblings, 1 reply; 18+ messages in thread
From: Mohsin Bashir @ 2026-01-22 19:21 UTC (permalink / raw)
To: netdev
Cc: alexanderduyck, alok.a.tiwari, andrew+netdev, andrew, chuck.lever,
davem, donald.hunter, edumazet, gal, horms, idosch,
jacob.e.keller, kernel-team, kory.maincent, kuba, lee,
mohsin.bashr, o.rempel, pabeni, vadim.fedorenko
With TX pause enabled, if a device is unable to pass packets up to the
stack (e.g., CPU is hanged), the device can cause pause storm. Given
that devices can have native support to protect the neighbor from such
flooding, such events need some tracking. This support is to track TX
pause storm events for better observability.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
---
Documentation/netlink/specs/ethtool.yaml | 13 +++++++++++++
include/linux/ethtool.h | 2 ++
include/uapi/linux/ethtool_netlink_generated.h | 1 +
net/ethtool/pause.c | 4 +++-
4 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/Documentation/netlink/specs/ethtool.yaml b/Documentation/netlink/specs/ethtool.yaml
index 0a2d2343f79a..4707063af3b4 100644
--- a/Documentation/netlink/specs/ethtool.yaml
+++ b/Documentation/netlink/specs/ethtool.yaml
@@ -879,6 +879,19 @@ attribute-sets:
-
name: rx-frames
type: u64
+ -
+ name: tx-pause-storm-events
+ type: u64
+ doc: >-
+ TX pause storm event count. Increments each time device
+ detects that its pause assertion condition has been true
+ for too long for normal operation. As a result, the device
+ has temporarily disabled its own Pause TX function to
+ protect the network from itself.
+ This counter should never increment under normal overload
+ conditions; it indicates catastrophic failure like an OS
+ crash. The rate of incrementing is implementation specific.
+
-
name: pause
attr-cnt-name: __ethtool-a-pause-cnt
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 798abec67a1b..83c375840835 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -512,12 +512,14 @@ struct ethtool_eth_ctrl_stats {
*
* Equivalent to `30.3.4.3 aPAUSEMACCtrlFramesReceived`
* from the standard.
+ * @tx_pause_storm_events: TX pause storm event count (see ethtool.yaml).
*/
struct ethtool_pause_stats {
enum ethtool_mac_stats_src src;
struct_group(stats,
u64 tx_pause_frames;
u64 rx_pause_frames;
+ u64 tx_pause_storm_events;
);
};
diff --git a/include/uapi/linux/ethtool_netlink_generated.h b/include/uapi/linux/ethtool_netlink_generated.h
index 556a0c834df5..114b83017297 100644
--- a/include/uapi/linux/ethtool_netlink_generated.h
+++ b/include/uapi/linux/ethtool_netlink_generated.h
@@ -381,6 +381,7 @@ enum {
ETHTOOL_A_PAUSE_STAT_PAD,
ETHTOOL_A_PAUSE_STAT_TX_FRAMES,
ETHTOOL_A_PAUSE_STAT_RX_FRAMES,
+ ETHTOOL_A_PAUSE_STAT_TX_PAUSE_STORM_EVENTS,
__ETHTOOL_A_PAUSE_STAT_CNT,
ETHTOOL_A_PAUSE_STAT_MAX = (__ETHTOOL_A_PAUSE_STAT_CNT - 1)
diff --git a/net/ethtool/pause.c b/net/ethtool/pause.c
index 0f9af1e66548..5d28f642764c 100644
--- a/net/ethtool/pause.c
+++ b/net/ethtool/pause.c
@@ -130,7 +130,9 @@ static int pause_put_stats(struct sk_buff *skb,
if (ethtool_put_stat(skb, pause_stats->tx_pause_frames,
ETHTOOL_A_PAUSE_STAT_TX_FRAMES, pad) ||
ethtool_put_stat(skb, pause_stats->rx_pause_frames,
- ETHTOOL_A_PAUSE_STAT_RX_FRAMES, pad))
+ ETHTOOL_A_PAUSE_STAT_RX_FRAMES, pad) ||
+ ethtool_put_stat(skb, pause_stats->tx_pause_storm_events,
+ ETHTOOL_A_PAUSE_STAT_TX_PAUSE_STORM_EVENTS, pad))
goto err_cancel;
nla_nest_end(skb, nest);
--
2.47.3
^ permalink raw reply related [flat|nested] 18+ messages in thread* Re: [PATCH net-next 1/3] net: ethtool: Track pause storm events
2026-01-22 19:21 ` [PATCH net-next 1/3] net: ethtool: Track pause storm events Mohsin Bashir
@ 2026-01-23 21:27 ` Oleksij Rempel
2026-01-23 22:15 ` Jakub Kicinski
0 siblings, 1 reply; 18+ messages in thread
From: Oleksij Rempel @ 2026-01-23 21:27 UTC (permalink / raw)
To: Mohsin Bashir
Cc: netdev, alexanderduyck, alok.a.tiwari, andrew+netdev, andrew,
chuck.lever, davem, donald.hunter, edumazet, gal, horms, idosch,
jacob.e.keller, kernel-team, kory.maincent, kuba, lee, pabeni,
vadim.fedorenko
On Thu, Jan 22, 2026 at 11:21:56AM -0800, Mohsin Bashir wrote:
> With TX pause enabled, if a device is unable to pass packets up to the
> stack (e.g., CPU is hanged), the device can cause pause storm. Given
> that devices can have native support to protect the neighbor from such
> flooding, such events need some tracking. This support is to track TX
> pause storm events for better observability.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
> ---
> Documentation/netlink/specs/ethtool.yaml | 13 +++++++++++++
> include/linux/ethtool.h | 2 ++
> include/uapi/linux/ethtool_netlink_generated.h | 1 +
> net/ethtool/pause.c | 4 +++-
> 4 files changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/netlink/specs/ethtool.yaml b/Documentation/netlink/specs/ethtool.yaml
> index 0a2d2343f79a..4707063af3b4 100644
> --- a/Documentation/netlink/specs/ethtool.yaml
> +++ b/Documentation/netlink/specs/ethtool.yaml
> @@ -879,6 +879,19 @@ attribute-sets:
> -
> name: rx-frames
> type: u64
> + -
> + name: tx-pause-storm-events
> + type: u64
> + doc: >-
> + TX pause storm event count. Increments each time device
> + detects that its pause assertion condition has been true
> + for too long for normal operation. As a result, the device
> + has temporarily disabled its own Pause TX function to
> + protect the network from itself.
> + This counter should never increment under normal overload
> + conditions; it indicates catastrophic failure like an OS
> + crash. The rate of incrementing is implementation specific.
Hm, we already have the tx pause frame counters. So, the anomaly is
visible to the user anyway (even if it isn't explicitly labeled as an
anomaly).
What is not visible to the user is when HW or SW disables flow control.
Maybe that is what the counter should represent and be named? Would
tx-pause-auto-disabled-events make sense?
The reason I do not like tx-pause-storm-events is that the meaning is
device specific; the user has to read the device manual to know what it
actually means.
tx-pause-auto-disabled-events can be reused in more cases - every time
we try to pause flow control for some reason.
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH net-next 1/3] net: ethtool: Track pause storm events
2026-01-23 21:27 ` Oleksij Rempel
@ 2026-01-23 22:15 ` Jakub Kicinski
2026-01-24 9:28 ` Oleksij Rempel
0 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2026-01-23 22:15 UTC (permalink / raw)
To: Oleksij Rempel
Cc: Mohsin Bashir, netdev, alexanderduyck, alok.a.tiwari,
andrew+netdev, andrew, chuck.lever, davem, donald.hunter,
edumazet, gal, horms, idosch, jacob.e.keller, kernel-team,
kory.maincent, lee, pabeni, vadim.fedorenko
On Fri, 23 Jan 2026 22:27:19 +0100 Oleksij Rempel wrote:
> > + -
> > + name: tx-pause-storm-events
> > + type: u64
> > + doc: >-
> > + TX pause storm event count. Increments each time device
> > + detects that its pause assertion condition has been true
> > + for too long for normal operation. As a result, the device
> > + has temporarily disabled its own Pause TX function to
> > + protect the network from itself.
> > + This counter should never increment under normal overload
> > + conditions; it indicates catastrophic failure like an OS
> > + crash. The rate of incrementing is implementation specific.
>
> Hm, we already have the tx pause frame counters. So, the anomaly is
> visible to the user anyway (even if it isn't explicitly labeled as an
> anomaly).
We are trying to prove a negative here, that's why we need a new
counter. As the doc says this counter should indicate that storm
is never actually detected under normal conditions. Another thing
to keep in mind is that we're talking about metric collection at scale,
so every 1min to 5min.
> What is not visible to the user is when HW or SW disables flow control.
> Maybe that is what the counter should represent and be named? Would
> tx-pause-auto-disabled-events make sense?
According to our existing uAPI for PFC pause storm is the term of art.
> The reason I do not like tx-pause-storm-events is that the meaning is
> device specific; the user has to read the device manual to know what it
> actually means.
>
> tx-pause-auto-disabled-events can be reused in more cases - every time
> we try to pause flow control for some reason.
TBH I feel like you may be overestimating your ability to do anything
like that in the SW here. The silicon can do this cycle-accurate, FIFO
pressure never relieved. In SW you have to poll, and if you can poll
why not just read the packets from the fifo and let the pipe move?
On the "device manual" point, pause frames as an estimate of congestion
are also quite useless device to device. You have to "read the manual".
Different devices use different pause quanta so to speak.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH net-next 1/3] net: ethtool: Track pause storm events
2026-01-23 22:15 ` Jakub Kicinski
@ 2026-01-24 9:28 ` Oleksij Rempel
0 siblings, 0 replies; 18+ messages in thread
From: Oleksij Rempel @ 2026-01-24 9:28 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Mohsin Bashir, netdev, alexanderduyck, alok.a.tiwari,
andrew+netdev, andrew, chuck.lever, davem, donald.hunter,
edumazet, gal, horms, idosch, jacob.e.keller, kernel-team,
kory.maincent, lee, pabeni, vadim.fedorenko
On Fri, Jan 23, 2026 at 02:15:27PM -0800, Jakub Kicinski wrote:
> On Fri, 23 Jan 2026 22:27:19 +0100 Oleksij Rempel wrote:
> > > + -
> > > + name: tx-pause-storm-events
> > > + type: u64
> > > + doc: >-
> > > + TX pause storm event count. Increments each time device
> > > + detects that its pause assertion condition has been true
> > > + for too long for normal operation. As a result, the device
> > > + has temporarily disabled its own Pause TX function to
> > > + protect the network from itself.
> > > + This counter should never increment under normal overload
> > > + conditions; it indicates catastrophic failure like an OS
> > > + crash. The rate of incrementing is implementation specific.
> >
> > Hm, we already have the tx pause frame counters. So, the anomaly is
> > visible to the user anyway (even if it isn't explicitly labeled as an
> > anomaly).
>
> We are trying to prove a negative here, that's why we need a new
> counter. As the doc says this counter should indicate that storm
> is never actually detected under normal conditions. Another thing
> to keep in mind is that we're talking about metric collection at scale,
> so every 1min to 5min.
>
> > What is not visible to the user is when HW or SW disables flow control.
> > Maybe that is what the counter should represent and be named? Would
> > tx-pause-auto-disabled-events make sense?
>
> According to our existing uAPI for PFC pause storm is the term of art.
Fair enough. If it is aligned with the existing interface and doing the
same thing, there is no reason to continue this discussion.
For the uAPI part:
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Thank you,
Oleksij
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH net-next 2/3] eth: fbnic: Add protection against pause storm
2026-01-22 19:21 [PATCH net-next 0/3] net: ethtool: Track TX pause storm Mohsin Bashir
2026-01-22 19:21 ` [PATCH net-next 1/3] net: ethtool: Track pause storm events Mohsin Bashir
@ 2026-01-22 19:21 ` Mohsin Bashir
2026-01-22 19:21 ` [PATCH net-next 3/3] eth: fbnic: Fetch TX pause storm stats Mohsin Bashir
` (2 subsequent siblings)
4 siblings, 0 replies; 18+ messages in thread
From: Mohsin Bashir @ 2026-01-22 19:21 UTC (permalink / raw)
To: netdev
Cc: alexanderduyck, alok.a.tiwari, andrew+netdev, andrew, chuck.lever,
davem, donald.hunter, edumazet, gal, horms, idosch,
jacob.e.keller, kernel-team, kory.maincent, kuba, lee,
mohsin.bashr, o.rempel, pabeni, vadim.fedorenko
Enable protection against TX pause storm. When detected, the device
tolerates the storm for 500ms before it stops sending pause frames.
Use service task to bring the device back to the normal operation,
preparing it for any subsequent episode of pause storm.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
---
drivers/net/ethernet/meta/fbnic/fbnic_csr.h | 10 ++++
drivers/net/ethernet/meta/fbnic/fbnic_irq.c | 2 +
drivers/net/ethernet/meta/fbnic/fbnic_mac.c | 62 +++++++++++++++++++--
drivers/net/ethernet/meta/fbnic/fbnic_mac.h | 15 +++++
drivers/net/ethernet/meta/fbnic/fbnic_pci.c | 4 ++
5 files changed, 89 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_csr.h b/drivers/net/ethernet/meta/fbnic/fbnic_csr.h
index 422265dc7abd..39eb7b94a310 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_csr.h
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_csr.h
@@ -230,6 +230,7 @@ enum {
#define FBNIC_INTR_MSIX_CTRL_VECTOR_MASK CSR_GENMASK(7, 0)
#define FBNIC_INTR_MSIX_CTRL_ENABLE CSR_BIT(31)
enum {
+ FBNIC_INTR_MSIX_CTRL_RXB_IDX = 7,
FBNIC_INTR_MSIX_CTRL_PCS_IDX = 34,
};
@@ -560,6 +561,11 @@ enum {
#define FBNIC_RXB_DROP_THLD_CNT 8
#define FBNIC_RXB_DROP_THLD_ON CSR_GENMASK(12, 0)
#define FBNIC_RXB_DROP_THLD_OFF CSR_GENMASK(25, 13)
+#define FBNIC_RXB_PAUSE_STORM(n) (0x08019 + (n)) /* 0x20064 + 4*n */
+#define FBNIC_RXB_PAUSE_STORM_CNT 4
+#define FBNIC_RXB_PAUSE_STORM_FORCE_NORMAL CSR_BIT(20)
+#define FBNIC_RXB_PAUSE_STORM_THLD_TIME CSR_GENMASK(19, 0)
+#define FBNIC_RXB_PAUSE_STORM_UNIT_WR 0x0801d /* 0x20074 */
#define FBNIC_RXB_ECN_THLD(n) (0x0801e + (n)) /* 0x20078 + 4*n */
#define FBNIC_RXB_ECN_THLD_CNT 8
#define FBNIC_RXB_ECN_THLD_ON CSR_GENMASK(12, 0)
@@ -596,6 +602,9 @@ enum {
#define FBNIC_RXB_INTF_CREDIT_MASK2 CSR_GENMASK(11, 8)
#define FBNIC_RXB_INTF_CREDIT_MASK3 CSR_GENMASK(15, 12)
+#define FBNIC_RXB_ERR_INTR_STS 0x08050 /* 0x20140 */
+#define FBNIC_RXB_ERR_INTR_STS_PS CSR_GENMASK(15, 12)
+#define FBNIC_RXB_ERR_INTR_MASK 0x08052 /* 0x20148 */
#define FBNIC_RXB_PAUSE_EVENT_CNT(n) (0x08053 + (n)) /* 0x2014c + 4*n */
#define FBNIC_RXB_DROP_FRMS_STS(n) (0x08057 + (n)) /* 0x2015c + 4*n */
#define FBNIC_RXB_DROP_BYTES_STS_L(n) \
@@ -636,6 +645,7 @@ enum {
#define FBNIC_RXB_PBUF_FIFO_LEVEL(n) (0x0811d + (n)) /* 0x20474 + 4*n */
+#define FBNIC_RXB_PAUSE_STORM_UNIT_RD 0x08125 /* 0x20494 */
#define FBNIC_RXB_INTEGRITY_ERR(n) (0x0812f + (n)) /* 0x204bc + 4*n */
#define FBNIC_RXB_MAC_ERR(n) (0x08133 + (n)) /* 0x204cc + 4*n */
#define FBNIC_RXB_PARSER_ERR(n) (0x08137 + (n)) /* 0x204dc + 4*n */
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_irq.c b/drivers/net/ethernet/meta/fbnic/fbnic_irq.c
index 02e8b0b257fe..1e6a8fd6f702 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_irq.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_irq.c
@@ -170,6 +170,8 @@ int fbnic_mac_request_irq(struct fbnic_dev *fbd)
fbnic_wr32(fbd, FBNIC_INTR_MSIX_CTRL(FBNIC_INTR_MSIX_CTRL_PCS_IDX),
FBNIC_PCS_MSIX_ENTRY | FBNIC_INTR_MSIX_CTRL_ENABLE);
+ fbnic_wr32(fbd, FBNIC_INTR_MSIX_CTRL(FBNIC_INTR_MSIX_CTRL_RXB_IDX), 0);
+
fbd->mac_msix_vector = vector;
return 0;
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_mac.c b/drivers/net/ethernet/meta/fbnic/fbnic_mac.c
index 9d0e4b2cc9ac..9cdf21192f98 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_mac.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_mac.c
@@ -143,6 +143,7 @@ static void fbnic_mac_init_qm(struct fbnic_dev *fbd)
#define FBNIC_DROP_EN_MASK 0x7d
#define FBNIC_PAUSE_EN_MASK 0x14
#define FBNIC_ECN_EN_MASK 0x10
+#define FBNIC_PS_EN_MASK 0x01
struct fbnic_fifo_config {
unsigned int addr;
@@ -424,13 +425,17 @@ static void fbnic_mac_tx_pause_config(struct fbnic_dev *fbd, bool tx_pause)
{
u32 rxb_pause_ctrl;
- /* Enable generation of pause frames if enabled */
+ /* Enable generation of pause frames based on tx_pause setting */
rxb_pause_ctrl = rd32(fbd, FBNIC_RXB_PAUSE_DROP_CTRL);
- rxb_pause_ctrl &= ~FBNIC_RXB_PAUSE_DROP_CTRL_PAUSE_ENABLE;
- if (tx_pause)
+ rxb_pause_ctrl &= ~(FBNIC_RXB_PAUSE_DROP_CTRL_PAUSE_ENABLE |
+ FBNIC_RXB_PAUSE_DROP_CTRL_PS_ENABLE);
+ if (tx_pause) {
rxb_pause_ctrl |=
FIELD_PREP(FBNIC_RXB_PAUSE_DROP_CTRL_PAUSE_ENABLE,
- FBNIC_PAUSE_EN_MASK);
+ FBNIC_PAUSE_EN_MASK) |
+ FIELD_PREP(FBNIC_RXB_PAUSE_DROP_CTRL_PS_ENABLE,
+ FBNIC_PS_EN_MASK);
+ }
wr32(fbd, FBNIC_RXB_PAUSE_DROP_CTRL, rxb_pause_ctrl);
}
@@ -445,6 +450,33 @@ static int fbnic_mac_get_link_event(struct fbnic_dev *fbd)
FBNIC_LINK_EVENT_UP : FBNIC_LINK_EVENT_NONE;
}
+static void fbnic_mac_pause_storm_config(struct fbnic_dev *fbd, bool tx_pause)
+{
+ u32 reg;
+
+ wr32(fbd, FBNIC_RXB_PAUSE_STORM_UNIT_WR,
+ FBNIC_RXB_PAUSE_STORM_CLK_DIV);
+
+ wr32(fbd, FBNIC_RXB_PAUSE_STORM(0),
+ FIELD_PREP(FBNIC_RXB_PAUSE_STORM_THLD_TIME,
+ FBNIC_MAC_PAUSE_STORM_INTERVAL));
+
+ /* There is a possibility that in transition from tx_pause enable to
+ * disable, the interrupt status was set and we did not clear it. To
+ * handle this case, we will clear the interrupt status here.
+ */
+ wr32(fbd, FBNIC_RXB_ERR_INTR_STS,
+ FIELD_PREP(FBNIC_RXB_ERR_INTR_STS_PS, FBNIC_PS_EN_MASK));
+
+ /* Unmask the Network to Host PS interrupt if tx_pause is on */
+ reg = rd32(fbd, FBNIC_RXB_ERR_INTR_MASK);
+ reg |= FBNIC_RXB_ERR_INTR_STS_PS;
+ if (tx_pause)
+ reg &= ~FIELD_PREP(FBNIC_RXB_ERR_INTR_STS_PS,
+ FBNIC_PS_EN_MASK);
+ wr32(fbd, FBNIC_RXB_ERR_INTR_MASK, reg);
+}
+
static u32 __fbnic_mac_cmd_config_asic(struct fbnic_dev *fbd,
bool tx_pause, bool rx_pause)
{
@@ -658,6 +690,7 @@ static void fbnic_mac_link_up_asic(struct fbnic_dev *fbd,
u32 cmd_cfg, mac_ctrl;
fbnic_mac_tx_pause_config(fbd, tx_pause);
+ fbnic_mac_pause_storm_config(fbd, tx_pause);
cmd_cfg = __fbnic_mac_cmd_config_asic(fbd, tx_pause, rx_pause);
mac_ctrl = rd32(fbd, FBNIC_SIG_MAC_IN0);
@@ -918,3 +951,24 @@ int fbnic_mac_init(struct fbnic_dev *fbd)
return 0;
}
+
+void fbnic_mac_rxb_pause_storm_handler(struct fbnic_dev *fbd)
+{
+ u32 rxb_err_sts = rd32(fbd, FBNIC_RXB_ERR_INTR_STS);
+
+ /* Check if pause storm interrupt for network was triggered */
+ if (rxb_err_sts & FBNIC_RXB_ERR_INTR_STS_PS) {
+ wr32(fbd, FBNIC_RXB_PAUSE_STORM(0),
+ FIELD_PREP(FBNIC_RXB_PAUSE_STORM_THLD_TIME,
+ FBNIC_MAC_PAUSE_STORM_INTERVAL) |
+ FBNIC_RXB_PAUSE_STORM_FORCE_NORMAL);
+ wrfl(fbd);
+ wr32(fbd, FBNIC_RXB_PAUSE_STORM(0),
+ FIELD_PREP(FBNIC_RXB_PAUSE_STORM_THLD_TIME,
+ FBNIC_MAC_PAUSE_STORM_INTERVAL));
+
+ /* Write 1 to clear the interrupt status */
+ wr32(fbd, FBNIC_RXB_ERR_INTR_STS,
+ FIELD_PREP(FBNIC_RXB_ERR_INTR_STS_PS, FBNIC_PS_EN_MASK));
+ }
+}
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_mac.h b/drivers/net/ethernet/meta/fbnic/fbnic_mac.h
index f08fe8b7c497..812023a33d5c 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_mac.h
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_mac.h
@@ -8,6 +8,20 @@
struct fbnic_dev;
+/* The RXB clock runs at 600 MHZ in the ASIC and the PAUSE_STORM_UNIT_WR
+ * is 10us granularity, so set the clock to 6000 (0x1770)
+ */
+#define FBNIC_RXB_PAUSE_STORM_CLK_DIV 0x1770
+
+/* Configure the timer to 500msec which should be longer than any
+ * reasonable period of continuous pausing. The service task, which runs
+ * once per second, periodically resets the pause storm trigger.
+
+ * As a result on a functioning system we enforce a 50% duty cycle should
+ * continuous pause ever occur. A crashed system will not have the service
+ * task and therefore pause will remain disabled until reboot recovery.
+ */
+#define FBNIC_MAC_PAUSE_STORM_INTERVAL 50000
#define FBNIC_MAX_JUMBO_FRAME_SIZE 9742
/* States loosely based on section 136.8.11.7.5 of IEEE 802.3-2022 Ethernet
@@ -119,4 +133,5 @@ struct fbnic_mac {
int fbnic_mac_init(struct fbnic_dev *fbd);
void fbnic_mac_get_fw_settings(struct fbnic_dev *fbd, u8 *aui, u8 *fec);
+void fbnic_mac_rxb_pause_storm_handler(struct fbnic_dev *fbd);
#endif /* _FBNIC_MAC_H_ */
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_pci.c b/drivers/net/ethernet/meta/fbnic/fbnic_pci.c
index 9240673c7533..a8c9dc2d0d6c 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_pci.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_pci.c
@@ -216,6 +216,10 @@ static void fbnic_service_task(struct work_struct *work)
fbnic_get_hw_stats32(fbd);
+ if (!FIELD_GET(FBNIC_MAC_COMMAND_CONFIG_TX_PAUSE_DIS,
+ rd32(fbd, FBNIC_MAC_COMMAND_CONFIG)))
+ fbnic_mac_rxb_pause_storm_handler(fbd);
+
fbnic_fw_check_heartbeat(fbd);
fbnic_health_check(fbd);
--
2.47.3
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH net-next 3/3] eth: fbnic: Fetch TX pause storm stats
2026-01-22 19:21 [PATCH net-next 0/3] net: ethtool: Track TX pause storm Mohsin Bashir
2026-01-22 19:21 ` [PATCH net-next 1/3] net: ethtool: Track pause storm events Mohsin Bashir
2026-01-22 19:21 ` [PATCH net-next 2/3] eth: fbnic: Add protection against pause storm Mohsin Bashir
@ 2026-01-22 19:21 ` Mohsin Bashir
2026-01-23 9:34 ` [PATCH net-next 0/3] net: ethtool: Track TX pause storm Oleksij Rempel
2026-01-23 19:36 ` Florian Fainelli
4 siblings, 0 replies; 18+ messages in thread
From: Mohsin Bashir @ 2026-01-22 19:21 UTC (permalink / raw)
To: netdev
Cc: alexanderduyck, alok.a.tiwari, andrew+netdev, andrew, chuck.lever,
davem, donald.hunter, edumazet, gal, horms, idosch,
jacob.e.keller, kernel-team, kory.maincent, kuba, lee,
mohsin.bashr, o.rempel, pabeni, vadim.fedorenko
With pause storm protection in place, track the occurrence of pause
storm events. Since there is a one-to-one mapping between pause storm
interrupts and events, use the interrupt count to track this metric.
./ethtool -I -a eth0
Pause parameters for eth0:
Autonegotiate: off
RX: off
TX: on
Statistics:
tx_pause_frames: 759657
rx_pause_frames: 0
tx_pause_storm_events: 219
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
---
drivers/net/ethernet/meta/fbnic/fbnic_csr.h | 1 +
drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c | 3 +++
drivers/net/ethernet/meta/fbnic/fbnic_hw_stats.h | 1 +
drivers/net/ethernet/meta/fbnic/fbnic_mac.c | 15 +++++++++++++++
4 files changed, 20 insertions(+)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_csr.h b/drivers/net/ethernet/meta/fbnic/fbnic_csr.h
index 39eb7b94a310..53c04293083c 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_csr.h
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_csr.h
@@ -627,6 +627,7 @@ enum {
FBNIC_RXB_ENQUEUE_INDICES = 4
};
+#define FBNIC_RXB_INTR_PS_COUNT(n) (0x080e9 + (n)) /* 0x203a4 + 4*n */
#define FBNIC_RXB_DRBO_FRM_CNT_SRC(n) (0x080f9 + (n)) /* 0x203e4 + 4*n */
#define FBNIC_RXB_DRBO_BYTE_CNT_SRC_L(n) \
(0x080fd + (n)) /* 0x203f4 + 4*n */
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
index 61b8005a0db5..ad7bf876a044 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
@@ -1707,6 +1707,7 @@ fbnic_get_pause_stats(struct net_device *netdev,
struct fbnic_net *fbn = netdev_priv(netdev);
struct fbnic_mac_stats *mac_stats;
struct fbnic_dev *fbd = fbn->fbd;
+ u64 tx_ps_events;
mac_stats = &fbd->hw_stats.mac;
@@ -1714,6 +1715,8 @@ fbnic_get_pause_stats(struct net_device *netdev,
pause_stats->tx_pause_frames = mac_stats->pause.tx_pause_frames.value;
pause_stats->rx_pause_frames = mac_stats->pause.rx_pause_frames.value;
+ tx_ps_events = mac_stats->pause.tx_pause_storm_events.value;
+ pause_stats->tx_pause_storm_events = tx_ps_events;
}
static void
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_hw_stats.h b/drivers/net/ethernet/meta/fbnic/fbnic_hw_stats.h
index aa3f429a9aed..caea4be46762 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_hw_stats.h
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_hw_stats.h
@@ -54,6 +54,7 @@ struct fbnic_rmon_stats {
struct fbnic_pause_stats {
struct fbnic_stat_counter tx_pause_frames;
struct fbnic_stat_counter rx_pause_frames;
+ struct fbnic_stat_counter tx_pause_storm_events;
};
struct fbnic_eth_mac_stats {
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_mac.c b/drivers/net/ethernet/meta/fbnic/fbnic_mac.c
index 9cdf21192f98..97ae7086d7bd 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_mac.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_mac.c
@@ -418,6 +418,18 @@ static void __fbnic_mac_stat_rd64(struct fbnic_dev *fbd, bool reset, u32 reg,
stat->reported = true;
}
+static void fbnic_mac_stat_rd32(struct fbnic_dev *fbd, bool reset, u32 reg,
+ struct fbnic_stat_counter *stat)
+{
+ u32 new_reg_value;
+
+ new_reg_value = rd32(fbd, reg);
+ if (!reset)
+ stat->value += new_reg_value - stat->u.old_reg_value_32;
+ stat->u.old_reg_value_32 = new_reg_value;
+ stat->reported = true;
+}
+
#define fbnic_mac_stat_rd64(fbd, reset, __stat, __CSR) \
__fbnic_mac_stat_rd64(fbd, reset, FBNIC_##__CSR##_L, &(__stat))
@@ -792,6 +804,9 @@ fbnic_mac_get_pause_stats(struct fbnic_dev *fbd, bool reset,
MAC_STAT_TX_XOFF_STB);
fbnic_mac_stat_rd64(fbd, reset, pause_stats->rx_pause_frames,
MAC_STAT_RX_XOFF_STB);
+ fbnic_mac_stat_rd32(fbd, reset,
+ FBNIC_RXB_INTR_PS_COUNT(FBNIC_RXB_INTF_NET),
+ &pause_stats->tx_pause_storm_events);
}
static void
--
2.47.3
^ permalink raw reply related [flat|nested] 18+ messages in thread* Re: [PATCH net-next 0/3] net: ethtool: Track TX pause storm
2026-01-22 19:21 [PATCH net-next 0/3] net: ethtool: Track TX pause storm Mohsin Bashir
` (2 preceding siblings ...)
2026-01-22 19:21 ` [PATCH net-next 3/3] eth: fbnic: Fetch TX pause storm stats Mohsin Bashir
@ 2026-01-23 9:34 ` Oleksij Rempel
2026-01-23 11:28 ` Oleksij Rempel
2026-01-23 19:36 ` Florian Fainelli
4 siblings, 1 reply; 18+ messages in thread
From: Oleksij Rempel @ 2026-01-23 9:34 UTC (permalink / raw)
To: Mohsin Bashir
Cc: netdev, alexanderduyck, alok.a.tiwari, andrew+netdev, andrew,
chuck.lever, davem, donald.hunter, edumazet, gal, horms, idosch,
jacob.e.keller, kernel-team, kory.maincent, kuba, lee, pabeni,
vadim.fedorenko, kernel
Hi Mohsin, hi Jakub!
On Thu, Jan 22, 2026 at 11:21:55AM -0800, Mohsin Bashir wrote:
> With TX pause enabled, if a device cannot deliver received frames to
> the stack (e.g., during a system hang), it may generate excessive pause
> frames causing a pause storm. This series updates the uAPI to track TX
> pause storm events as part of the pause stats (p1), adds pause storm
> protection support for fbnic (p2), and leverages p1 to provide
> observability into these events for fbnic (p3).
>
> Mohsin Bashir (3):
> net: ethtool: Track pause storm events
> eth: fbnic: Add protection against pause storm
> eth: fbnic: Fetch TX pause storm stats
>
Very interesting series! Before the Christmas break, I started looking into a
similar issue on different hardware (LAN78xx USB Ethernet), which has counters
but no automatic pause storm mitigation in silicon. For reference, here is the
RFC patch set I'm working on:
https://lore.kernel.org/all/20260123090741.1566469-1-o.rempel@pengutronix.de/
In my implementation, I added the detection logic to the driver but exposed it
via the devlink health interface.
I have a few questions regarding the definitions and the architectural approach
here.
- Definition of "Pause Storm"
It seems fbnic defines a storm as a continuous assertion of the pause signal
(duration-based). The code defines FBNIC_RXB_PAUSE_STORM_UNIT_WR with a 10us
granularity and a 500ms threshold.
Does this hardware detector handle "flapping" (high rate of distinct pause
frames) or is it strictly for a stuck-high condition (deadlock/hang)? If the RX
queue drains slightly and fills up again every 100ms, does the internal timer
reset? If so, a "stuttering" storm might bypass this protection. Except the
primary goal of this feature is to prevent the link partner's TX watchdog from
firing (saving the neighbor from a reset).
- Policy and Tunability
The current implementation enforces an auto-recovery policy that is invisible
to userspace. By resetting the storm protection in fbnic_service_task (runs
every 1s), you effectively enforce a ~50% duty cycle on a hung system.
Since this logic decides when to stop flow control and start dropping packets,
shouldn't these thresholds be configurable?
I understand that in a complete OS crash, software cannot intervene, and we
rely entirely on the hardware's 500ms safety breaker. However, for partial
stalls (livelocks, soft lockups) where the service task might still run, the
hardcoded "force normal" policy might not be ideal.
Furthermore, having the ability to disable auto-recovery is critical for
testing. If we want to develop/validate a driver for the link partner that
detects incoming pause storms, we need a way to generate a sustained storm from
this device without the safety breaker constantly resetting it. A "controllable
reference" is very valuable here.
In my LAN78xx work, I used the devlink reporter's .recover callback. This lets
userspace control the policy (devlink health set ... recover
true/false) - determining whether the driver should auto-reset the MAC/PHY or
wait for manual intervention. Baking the "force normal" logic into the service
task removes this control.
Architecture:
This feels very similar to the existing ndo_tx_timeout watchdog, but for the RX
side. We are essentially trying to handle an "RX Timeout" or "RX Stall" where
the device cannot push packets to the host.
I assume the 500ms hardware threshold is specifically chosen to be well below
the standard 5-second ndo_tx_timeout of link partners, preventing a hung node
from causing watchdog resets on its neighbors. I've observed cases where
triggering a full ndo_tx_timeout reset on a stalled device leaves the hardware
in a partially reconfigured "zombie" state, unable to receive anything
thereafter.
Did you consider if we should approach this as a central networking mechanism
(an "RX Watchdog")? I am debating this for my own driver:
- Option A: Export needed metrics/events to userspace (Devlink/Ethtool) and let
a userspace agent decide when to reset.
- Option B: Handle it entirely in the kernel (like this patch), but potentially
standardize the "RX Stall" behavior so we don't end up with different
heuristics in every driver.
Best Regards,
Oleksij
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH net-next 0/3] net: ethtool: Track TX pause storm
2026-01-23 9:34 ` [PATCH net-next 0/3] net: ethtool: Track TX pause storm Oleksij Rempel
@ 2026-01-23 11:28 ` Oleksij Rempel
2026-01-23 18:40 ` Jakub Kicinski
0 siblings, 1 reply; 18+ messages in thread
From: Oleksij Rempel @ 2026-01-23 11:28 UTC (permalink / raw)
To: Mohsin Bashir
Cc: netdev, alexanderduyck, alok.a.tiwari, andrew+netdev, andrew,
chuck.lever, davem, donald.hunter, edumazet, gal, horms, idosch,
jacob.e.keller, kernel-team, kory.maincent, kuba, lee, pabeni,
vadim.fedorenko, kernel
Here is a TL;DR summary of my questions regarding the pause storm logic
:)
- Does the 500ms hardware timer reset on "flapping" pause signals? If so,
a stuttering storm might still crash the link partner (tx watchdog
timeout).
- The auto-recovery (service task) enforces a fixed policy. Can we make
this configurable? I used devlink health (.recover) to let userspace
decide between auto-reset or manual intervention.
- Should we standardize an "RX Watchdog" mechanism in the core instead of
or in addition to driver-specific stats?
- If main case where we will run in to tx pause storm is OS crash, what
instance will be able to read this stats? Are they preserved on reboot
or kexec?
On Fri, Jan 23, 2026 at 10:34:04AM +0100, Oleksij Rempel wrote:
> Hi Mohsin, hi Jakub!
>
> On Thu, Jan 22, 2026 at 11:21:55AM -0800, Mohsin Bashir wrote:
> > With TX pause enabled, if a device cannot deliver received frames to
> > the stack (e.g., during a system hang), it may generate excessive pause
> > frames causing a pause storm. This series updates the uAPI to track TX
> > pause storm events as part of the pause stats (p1), adds pause storm
> > protection support for fbnic (p2), and leverages p1 to provide
> > observability into these events for fbnic (p3).
> >
> > Mohsin Bashir (3):
> > net: ethtool: Track pause storm events
> > eth: fbnic: Add protection against pause storm
> > eth: fbnic: Fetch TX pause storm stats
> >
>
> Very interesting series! Before the Christmas break, I started looking into a
> similar issue on different hardware (LAN78xx USB Ethernet), which has counters
> but no automatic pause storm mitigation in silicon. For reference, here is the
> RFC patch set I'm working on:
> https://lore.kernel.org/all/20260123090741.1566469-1-o.rempel@pengutronix.de/
>
> In my implementation, I added the detection logic to the driver but exposed it
> via the devlink health interface.
>
> I have a few questions regarding the definitions and the architectural approach
> here.
>
> - Definition of "Pause Storm"
>
> It seems fbnic defines a storm as a continuous assertion of the pause signal
> (duration-based). The code defines FBNIC_RXB_PAUSE_STORM_UNIT_WR with a 10us
> granularity and a 500ms threshold.
>
> Does this hardware detector handle "flapping" (high rate of distinct pause
> frames) or is it strictly for a stuck-high condition (deadlock/hang)? If the RX
> queue drains slightly and fills up again every 100ms, does the internal timer
> reset? If so, a "stuttering" storm might bypass this protection. Except the
> primary goal of this feature is to prevent the link partner's TX watchdog from
> firing (saving the neighbor from a reset).
>
> - Policy and Tunability
>
> The current implementation enforces an auto-recovery policy that is invisible
> to userspace. By resetting the storm protection in fbnic_service_task (runs
> every 1s), you effectively enforce a ~50% duty cycle on a hung system.
>
> Since this logic decides when to stop flow control and start dropping packets,
> shouldn't these thresholds be configurable?
>
> I understand that in a complete OS crash, software cannot intervene, and we
> rely entirely on the hardware's 500ms safety breaker. However, for partial
> stalls (livelocks, soft lockups) where the service task might still run, the
> hardcoded "force normal" policy might not be ideal.
>
> Furthermore, having the ability to disable auto-recovery is critical for
> testing. If we want to develop/validate a driver for the link partner that
> detects incoming pause storms, we need a way to generate a sustained storm from
> this device without the safety breaker constantly resetting it. A "controllable
> reference" is very valuable here.
>
> In my LAN78xx work, I used the devlink reporter's .recover callback. This lets
> userspace control the policy (devlink health set ... recover
> true/false) - determining whether the driver should auto-reset the MAC/PHY or
> wait for manual intervention. Baking the "force normal" logic into the service
> task removes this control.
>
> Architecture:
>
> This feels very similar to the existing ndo_tx_timeout watchdog, but for the RX
> side. We are essentially trying to handle an "RX Timeout" or "RX Stall" where
> the device cannot push packets to the host.
>
> I assume the 500ms hardware threshold is specifically chosen to be well below
> the standard 5-second ndo_tx_timeout of link partners, preventing a hung node
> from causing watchdog resets on its neighbors. I've observed cases where
> triggering a full ndo_tx_timeout reset on a stalled device leaves the hardware
> in a partially reconfigured "zombie" state, unable to receive anything
> thereafter.
>
> Did you consider if we should approach this as a central networking mechanism
> (an "RX Watchdog")? I am debating this for my own driver:
>
> - Option A: Export needed metrics/events to userspace (Devlink/Ethtool) and let
> a userspace agent decide when to reset.
>
> - Option B: Handle it entirely in the kernel (like this patch), but potentially
> standardize the "RX Stall" behavior so we don't end up with different
> heuristics in every driver.
>
> Best Regards,
> Oleksij
> --
> Pengutronix e.K. | |
> Steuerwalder Str. 21 | http://www.pengutronix.de/ |
> 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
> Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH net-next 0/3] net: ethtool: Track TX pause storm
2026-01-23 11:28 ` Oleksij Rempel
@ 2026-01-23 18:40 ` Jakub Kicinski
2026-01-23 19:31 ` Mohsin Bashir
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Jakub Kicinski @ 2026-01-23 18:40 UTC (permalink / raw)
To: Oleksij Rempel
Cc: Mohsin Bashir, netdev, alexanderduyck, alok.a.tiwari,
andrew+netdev, andrew, chuck.lever, davem, donald.hunter,
edumazet, gal, horms, idosch, jacob.e.keller, kernel-team,
kory.maincent, lee, pabeni, vadim.fedorenko, kernel
On Fri, 23 Jan 2026 12:28:13 +0100 Oleksij Rempel wrote:
> Here is a TL;DR summary of my questions regarding the pause storm logic
> :)
Eh, did you get AI to help write the full version? :) So much text :)
> - Does the 500ms hardware timer reset on "flapping" pause signals? If so,
> a stuttering storm might still crash the link partner (tx watchdog
> timeout).
Yes any discontinuity resets AFAIU, Mohsin keep me honest.
There's a conflict here between respecting user configuration (pause
enabled) vs safety of the network. We're trying to err on the side of
respecting the config. We haven't seen any stutter, yet.
> - The auto-recovery (service task) enforces a fixed policy. Can we make
> this configurable? I used devlink health (.recover) to let userspace
> decide between auto-reset or manual intervention.
There is already a tunable for this exact feature but for PFC:
ETHTOOL_PFC_PREVENTION_TOUT. Should be trivial to add the same thing for
non-PFC pause. But we didn't want to open the uAPI can of warms unless
there's a clear ask and consensus. We don't need tuning (or so we
think), and there was some talk about not adding uAPI for fbnic because
it's a "private device".
> - Should we standardize an "RX Watchdog" mechanism in the core instead of
> or in addition to driver-specific stats?
Our primary use case is machine is hard-wedged. Either Linux crash, or
kexec died, or UEFI issue. So it must be the device that implements the
logic.
Florian was proposing a hook to auto-disable pause from the crash
notifier. It sounds like your use case is closer to that?
> - If main case where we will run in to tx pause storm is OS crash, what
> instance will be able to read this stats? Are they preserved on reboot
> or kexec?
Good question! I was wondering the same thing. In the end I couldn't
figure out which behavior would be less confusing. We want to make sure
that the stat never increments on a live system, if the machines come
out of boot with non-zero value some alerting system could fire.
OTOH as you say we may want to know that it did happen while machine
was out. So IDK. The fbnic implementation starts with 0.
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH net-next 0/3] net: ethtool: Track TX pause storm
2026-01-23 18:40 ` Jakub Kicinski
@ 2026-01-23 19:31 ` Mohsin Bashir
2026-01-23 21:04 ` Oleksij Rempel
2026-01-25 9:59 ` Gal Pressman
2 siblings, 0 replies; 18+ messages in thread
From: Mohsin Bashir @ 2026-01-23 19:31 UTC (permalink / raw)
To: Jakub Kicinski, Oleksij Rempel
Cc: netdev, alexanderduyck, alok.a.tiwari, andrew+netdev, andrew,
chuck.lever, davem, donald.hunter, edumazet, gal, horms, idosch,
jacob.e.keller, kernel-team, kory.maincent, lee, pabeni,
vadim.fedorenko, kernel, mohsin.bashr
> Eh, did you get AI to help write the full version? :) So much text :)
>
>> - Does the 500ms hardware timer reset on "flapping" pause signals? If so,
>> a stuttering storm might still crash the link partner (tx watchdog
>> timeout).
>
> Yes any discontinuity resets AFAIU, Mohsin keep me honest.
>
Correct, stuttering resets the timer. A pause storm is only detected
when we remain continuously in the PAUSE state for more than 500ms.
>> - If main case where we will run in to tx pause storm is OS crash, what
>> instance will be able to read this stats? Are they preserved on reboot
>> or kexec?
>
> Good question! I was wondering the same thing. In the end I couldn't
> figure out which behavior would be less confusing. We want to make sure
> that the stat never increments on a live system, if the machines come
> out of boot with non-zero value some alerting system could fire.
> OTOH as you say we may want to know that it did happen while machine
> was out. So IDK. The fbnic implementation starts with 0.
Right, a non-zero stat would enable monitoring and alerting systems to
detect anomalies.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH net-next 0/3] net: ethtool: Track TX pause storm
2026-01-23 18:40 ` Jakub Kicinski
2026-01-23 19:31 ` Mohsin Bashir
@ 2026-01-23 21:04 ` Oleksij Rempel
2026-01-23 22:21 ` Jakub Kicinski
2026-01-25 9:59 ` Gal Pressman
2 siblings, 1 reply; 18+ messages in thread
From: Oleksij Rempel @ 2026-01-23 21:04 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Mohsin Bashir, netdev, alexanderduyck, alok.a.tiwari,
andrew+netdev, andrew, chuck.lever, davem, donald.hunter,
edumazet, gal, horms, idosch, jacob.e.keller, kernel-team,
kory.maincent, lee, pabeni, vadim.fedorenko, kernel
On Fri, Jan 23, 2026 at 10:40:31AM -0800, Jakub Kicinski wrote:
> On Fri, 23 Jan 2026 12:28:13 +0100 Oleksij Rempel wrote:
> > Here is a TL;DR summary of my questions regarding the pause storm logic
> > :)
>
> Eh, did you get AI to help write the full version? :) So much text :)
Yes, to reduce the text! Sorry, sometimes I have word diarrhea :D
> > - Should we standardize an "RX Watchdog" mechanism in the core instead of
> > or in addition to driver-specific stats?
>
> Our primary use case is machine is hard-wedged. Either Linux crash, or
> kexec died, or UEFI issue. So it must be the device that implements the
> logic.
>
> Florian was proposing a hook to auto-disable pause from the crash
> notifier. It sounds like your use case is closer to that?
It is valid use case - it will be nice to have it too.
In my tests, I was able to trigger an Rx stall and a pause storm (if
flow control is enabled), for example by partially disrupting the USB
connection. Since this controller is used in medical devices, it will
be good to detect these anomalies and attempt recovery.
Sorry, here I want to hijack this discussion for my purpose :)
Since a pause storm is only a symptom of an Rx stall, should we have a
common method to detect it? Is it even reasonably possible? In my cases,
I tried to detect it by monitoring the level of the Rx queue, Rx HW
counters, and Rx SW counters. But maybe I just have a blind spot and
this is a naive way to detect things.
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH net-next 0/3] net: ethtool: Track TX pause storm
2026-01-23 21:04 ` Oleksij Rempel
@ 2026-01-23 22:21 ` Jakub Kicinski
0 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2026-01-23 22:21 UTC (permalink / raw)
To: Oleksij Rempel
Cc: Mohsin Bashir, netdev, alexanderduyck, alok.a.tiwari,
andrew+netdev, andrew, chuck.lever, davem, donald.hunter,
edumazet, gal, horms, idosch, jacob.e.keller, kernel-team,
kory.maincent, lee, pabeni, vadim.fedorenko, kernel
On Fri, 23 Jan 2026 22:04:39 +0100 Oleksij Rempel wrote:
> In my tests, I was able to trigger an Rx stall and a pause storm (if
> flow control is enabled), for example by partially disrupting the USB
> connection. Since this controller is used in medical devices, it will
> be good to detect these anomalies and attempt recovery.
>
> Sorry, here I want to hijack this discussion for my purpose :)
>
> Since a pause storm is only a symptom of an Rx stall, should we have a
> common method to detect it? Is it even reasonably possible? In my cases,
> I tried to detect it by monitoring the level of the Rx queue, Rx HW
> counters, and Rx SW counters. But maybe I just have a blind spot and
> this is a naive way to detect things.
Not sure what USB connection disruption entails exactly but there's
presumably a lot of things which can go wrong in critical infra and
for which some daemon must periodically check and remediate.
IDK if this belongs in the kernel, but perhaps folks with more embedded
experience would find it useful.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH net-next 0/3] net: ethtool: Track TX pause storm
2026-01-23 18:40 ` Jakub Kicinski
2026-01-23 19:31 ` Mohsin Bashir
2026-01-23 21:04 ` Oleksij Rempel
@ 2026-01-25 9:59 ` Gal Pressman
2026-01-25 22:30 ` Jakub Kicinski
2 siblings, 1 reply; 18+ messages in thread
From: Gal Pressman @ 2026-01-25 9:59 UTC (permalink / raw)
To: Jakub Kicinski, Oleksij Rempel
Cc: Mohsin Bashir, netdev, alexanderduyck, alok.a.tiwari,
andrew+netdev, andrew, chuck.lever, davem, donald.hunter,
edumazet, horms, idosch, jacob.e.keller, kernel-team,
kory.maincent, lee, pabeni, vadim.fedorenko, kernel
On 23/01/2026 20:40, Jakub Kicinski wrote:
>> - The auto-recovery (service task) enforces a fixed policy. Can we make
>> this configurable? I used devlink health (.recover) to let userspace
>> decide between auto-reset or manual intervention.
>
> There is already a tunable for this exact feature but for PFC:
> ETHTOOL_PFC_PREVENTION_TOUT. Should be trivial to add the same thing for
> non-PFC pause. But we didn't want to open the uAPI can of warms unless
> there's a clear ask and consensus. We don't need tuning (or so we
> think), and there was some talk about not adding uAPI for fbnic because
> it's a "private device".
I had to refresh my memory on this, but I think we've chosen a non-ideal
name back then.
We use this value for both PFC and global pause, I recommend you do the
same (perhaps with better documentation?).
mlx5 exposes tx_pause_storm_warning_events/tx_pause_storm_error_events
through 'ethtool -S', we can probably assign one of them into
tx-pause-storm-events.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH net-next 0/3] net: ethtool: Track TX pause storm
2026-01-25 9:59 ` Gal Pressman
@ 2026-01-25 22:30 ` Jakub Kicinski
2026-01-26 6:51 ` Gal Pressman
0 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2026-01-25 22:30 UTC (permalink / raw)
To: Gal Pressman
Cc: Oleksij Rempel, Mohsin Bashir, netdev, alexanderduyck,
alok.a.tiwari, andrew+netdev, andrew, chuck.lever, davem,
donald.hunter, edumazet, horms, idosch, jacob.e.keller,
kernel-team, kory.maincent, lee, pabeni, vadim.fedorenko, kernel
On Sun, 25 Jan 2026 11:59:07 +0200 Gal Pressman wrote:
> On 23/01/2026 20:40, Jakub Kicinski wrote:
> >> - The auto-recovery (service task) enforces a fixed policy. Can we make
> >> this configurable? I used devlink health (.recover) to let userspace
> >> decide between auto-reset or manual intervention.
> >
> > There is already a tunable for this exact feature but for PFC:
> > ETHTOOL_PFC_PREVENTION_TOUT. Should be trivial to add the same thing for
> > non-PFC pause. But we didn't want to open the uAPI can of warms unless
> > there's a clear ask and consensus. We don't need tuning (or so we
> > think), and there was some talk about not adding uAPI for fbnic because
> > it's a "private device".
>
> I had to refresh my memory on this, but I think we've chosen a non-ideal
> name back then.
> We use this value for both PFC and global pause, I recommend you do the
> same (perhaps with better documentation?).
Excellent, I was wondering if that may indeed be the case. Thanks for
doing the digging. Let's respin, use that knob and document it better
as you suggest.
> mlx5 exposes tx_pause_storm_warning_events/tx_pause_storm_error_events
> through 'ethtool -S', we can probably assign one of them into
> tx-pause-storm-events.
Do you prefer to take care of that or should Mohsin do it? According to
https://enterprise-support.nvidia.com/s/article/understanding-mlx5-ethtool-counters
sounds like the "error" counter is the one that matches.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH net-next 0/3] net: ethtool: Track TX pause storm
2026-01-25 22:30 ` Jakub Kicinski
@ 2026-01-26 6:51 ` Gal Pressman
0 siblings, 0 replies; 18+ messages in thread
From: Gal Pressman @ 2026-01-26 6:51 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Oleksij Rempel, Mohsin Bashir, netdev, alexanderduyck,
alok.a.tiwari, andrew+netdev, andrew, chuck.lever, davem,
donald.hunter, edumazet, horms, idosch, jacob.e.keller,
kernel-team, kory.maincent, lee, pabeni, vadim.fedorenko, kernel
On 26/01/2026 0:30, Jakub Kicinski wrote:
> On Sun, 25 Jan 2026 11:59:07 +0200 Gal Pressman wrote:
>> On 23/01/2026 20:40, Jakub Kicinski wrote:
>>>> - The auto-recovery (service task) enforces a fixed policy. Can we make
>>>> this configurable? I used devlink health (.recover) to let userspace
>>>> decide between auto-reset or manual intervention.
>>>
>>> There is already a tunable for this exact feature but for PFC:
>>> ETHTOOL_PFC_PREVENTION_TOUT. Should be trivial to add the same thing for
>>> non-PFC pause. But we didn't want to open the uAPI can of warms unless
>>> there's a clear ask and consensus. We don't need tuning (or so we
>>> think), and there was some talk about not adding uAPI for fbnic because
>>> it's a "private device".
>>
>> I had to refresh my memory on this, but I think we've chosen a non-ideal
>> name back then.
>> We use this value for both PFC and global pause, I recommend you do the
>> same (perhaps with better documentation?).
>
> Excellent, I was wondering if that may indeed be the case. Thanks for
> doing the digging. Let's respin, use that knob and document it better
> as you suggest.
FWIW, I see bnxt also supports this tunable.
>
>> mlx5 exposes tx_pause_storm_warning_events/tx_pause_storm_error_events
>> through 'ethtool -S', we can probably assign one of them into
>> tx-pause-storm-events.
>
> Do you prefer to take care of that or should Mohsin do it? According to
> https://enterprise-support.nvidia.com/s/article/understanding-mlx5-ethtool-counters
> sounds like the "error" counter is the one that matches.
Yes, tx_pause_storm_error_events should be used.
I don't have a preference on who adds it.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH net-next 0/3] net: ethtool: Track TX pause storm
2026-01-22 19:21 [PATCH net-next 0/3] net: ethtool: Track TX pause storm Mohsin Bashir
` (3 preceding siblings ...)
2026-01-23 9:34 ` [PATCH net-next 0/3] net: ethtool: Track TX pause storm Oleksij Rempel
@ 2026-01-23 19:36 ` Florian Fainelli
2026-01-23 20:05 ` Jakub Kicinski
4 siblings, 1 reply; 18+ messages in thread
From: Florian Fainelli @ 2026-01-23 19:36 UTC (permalink / raw)
To: Mohsin Bashir, netdev, Jakub Kicinski
Cc: alexanderduyck, alok.a.tiwari, andrew+netdev, andrew, chuck.lever,
davem, donald.hunter, edumazet, gal, horms, idosch,
jacob.e.keller, kernel-team, kory.maincent, lee, o.rempel, pabeni,
vadim.fedorenko
On 1/22/26 11:21, Mohsin Bashir wrote:
> With TX pause enabled, if a device cannot deliver received frames to
> the stack (e.g., during a system hang), it may generate excessive pause
> frames causing a pause storm. This series updates the uAPI to track TX
> pause storm events as part of the pause stats (p1), adds pause storm
> protection support for fbnic (p2), and leverages p1 to provide
> observability into these events for fbnic (p3).
Jakub, should I go ahead and re-submit this patch series then:
https://patchew.org/linux/20251107002510.1678369-1-florian.fainelli@broadcom.com/
Looks like this is exactly what you ran into here.
--
Florian
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH net-next 0/3] net: ethtool: Track TX pause storm
2026-01-23 19:36 ` Florian Fainelli
@ 2026-01-23 20:05 ` Jakub Kicinski
0 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2026-01-23 20:05 UTC (permalink / raw)
To: Florian Fainelli
Cc: Mohsin Bashir, netdev, alexanderduyck, alok.a.tiwari,
andrew+netdev, andrew, chuck.lever, davem, donald.hunter,
edumazet, gal, horms, idosch, jacob.e.keller, kernel-team,
kory.maincent, lee, o.rempel, pabeni, vadim.fedorenko
On Fri, 23 Jan 2026 11:36:33 -0800 Florian Fainelli wrote:
> On 1/22/26 11:21, Mohsin Bashir wrote:
> > With TX pause enabled, if a device cannot deliver received frames to
> > the stack (e.g., during a system hang), it may generate excessive pause
> > frames causing a pause storm. This series updates the uAPI to track TX
> > pause storm events as part of the pause stats (p1), adds pause storm
> > protection support for fbnic (p2), and leverages p1 to provide
> > observability into these events for fbnic (p3).
>
> Jakub, should I go ahead and re-submit this patch series then:
>
> https://patchew.org/linux/20251107002510.1678369-1-florian.fainelli@broadcom.com/
Sounded like Oleksij is also interested so let's go ahead.
Presumably an attr in ethtool -a/-A ?
> Looks like this is exactly what you ran into here.
Kinda, the machines reboot on panic. The problematic cases are those
that crashed in a way that didn't trigger normal reboot cycle.
Or got wedge somewhere funny.
^ permalink raw reply [flat|nested] 18+ messages in thread