From: Jakub Kicinski <kuba@kernel.org>
To: Oleksij Rempel <o.rempel@pengutronix.de>
Cc: Mohsin Bashir <mohsin.bashr@gmail.com>,
netdev@vger.kernel.org, alexanderduyck@fb.com,
alok.a.tiwari@oracle.com, andrew+netdev@lunn.ch, andrew@lunn.ch,
chuck.lever@oracle.com, davem@davemloft.net,
donald.hunter@gmail.com, edumazet@google.com, gal@nvidia.com,
horms@kernel.org, idosch@nvidia.com, jacob.e.keller@intel.com,
kernel-team@meta.com, kory.maincent@bootlin.com, lee@trager.us,
pabeni@redhat.com, vadim.fedorenko@linux.dev
Subject: Re: [PATCH net-next 1/3] net: ethtool: Track pause storm events
Date: Fri, 23 Jan 2026 14:15:27 -0800
Message-ID: <20260123141527.358506c6@kernel.org>
In-Reply-To: <aXPntx9wiUqbKGRN@pengutronix.de>

On Fri, 23 Jan 2026 22:27:19 +0100 Oleksij Rempel wrote:
> > +      -
> > +        name: tx-pause-storm-events
> > +        type: u64
> > +        doc: >-
> > +          TX pause storm event count. Increments each time device
> > +          detects that its pause assertion condition has been true
> > +          for too long for normal operation. As a result, the device
> > +          has temporarily disabled its own Pause TX function to
> > +          protect the network from itself.
> > +          This counter should never increment under normal overload
> > +          conditions; it indicates catastrophic failure like an OS
> > +          crash. The rate of incrementing is implementation specific.
>
> Hm, we already have the tx pause frame counters. So, the anomaly is
> visible to the user anyway (even if it isn't explicitly labeled as an
> anomaly).

We are trying to prove a negative here, that's why we need a new
counter. As the doc says, this counter should show that a storm is
never actually detected under normal conditions. Another thing to keep
in mind is that we're talking about metric collection at scale, so
sampling every 1 to 5 minutes.

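To illustrate the collection argument, here is a minimal sketch (not part
of the patch) of how a fleet collector might consume such a counter. The
`tx-pause-storm-events` key comes from the quoted spec; the sample dicts
and both helper functions are hypothetical, and how the samples are read
from the device is deliberately left out.

```python
# Hypothetical collector-side check. The two dicts stand in for two
# consecutive reads of the pause stats, taken minutes apart.
STORM_KEY = "tx-pause-storm-events"  # name taken from the quoted spec


def storm_events_delta(prev: dict, cur: dict) -> int:
    """Return how many storm events were recorded between two samples.

    A missing key is treated as 0 (assumption: the device does not
    report the counter).
    """
    return cur.get(STORM_KEY, 0) - prev.get(STORM_KEY, 0)


def is_anomalous(prev: dict, cur: dict) -> bool:
    # Any increment at all is the signal; no per-device threshold needed.
    return storm_events_delta(prev, cur) > 0
```

With only the pause frame counters, the same collector would likely need
a device-specific threshold to decide whether a delta over a 1 to 5
minute window is normal back-pressure or a storm.
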
> What is not visible to the user is when HW or SW disables flow control.
> Maybe that is what the counter should represent and be named? Would
> tx-pause-auto-disabled-events make sense?

According to our existing uAPI for PFC, "pause storm" is the term of art.

> The reason I do not like tx-pause-storm-events is that the meaning is
> device specific; the user has to read the device manual to know what it
> actually means.
>
> tx-pause-auto-disabled-events can be reused in more cases - every time
> we try to pause flow control for some reason.

TBH I feel like you may be overestimating your ability to do anything
like that in SW here. The silicon can detect this cycle-accurately:
FIFO pressure that is never relieved. In SW you have to poll, and if
you can poll, why not just read the packets from the FIFO and let the
pipe move?

On the "device manual" point, pause frame counts as an estimate of
congestion are also quite useless when compared device to device. You
have to "read the manual" there too. Different devices use different
pause quanta, so to speak.
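
As a rough illustration of the last point (a sketch, not anything from
the patch): a PAUSE frame carries a 16-bit pause time expressed in quanta
of 512 bit times, so the number of frames needed to hold a link paused
depends on the value the device puts in that field. The function name,
and the assumption that a device re-sends a frame exactly when the
previous pause expires, are mine; real devices typically refresh earlier.

```python
BIT_TIMES_PER_QUANTUM = 512  # one pause quantum, per IEEE 802.3 Annex 31B


def pause_frames_needed(duration_ms: int, link_bps: int, quanta: int) -> int:
    """Frames needed to keep a link paused for duration_ms.

    Simplification: a new frame is sent exactly when the previous
    pause time expires.
    """
    if not 0 < quanta <= 0xFFFF:
        raise ValueError("quanta must fit in the 16-bit pause_time field")
    paused_bits = link_bps * duration_ms // 1000
    bits_per_frame = quanta * BIT_TIMES_PER_QUANTUM
    return -(-paused_bits // bits_per_frame)  # ceiling division


if __name__ == "__main__":
    # Same one second of back-pressure on a 1 Gb/s link, two quanta values.
    for quanta in (1000, 0xFFFF):
        print(quanta, pause_frames_needed(1000, 1_000_000_000, quanta))
```

Under these assumptions the same one second of back-pressure on a 1 Gb/s
link shows up as 1954 frames with a pause time of 1000 quanta and as 30
frames with 0xFFFF, which is why a raw frame count means little without
knowing the device.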

Thread overview: 18+ messages
2026-01-22 19:21 [PATCH net-next 0/3] net: ethtool: Track TX pause storm Mohsin Bashir
2026-01-22 19:21 ` [PATCH net-next 1/3] net: ethtool: Track pause storm events Mohsin Bashir
2026-01-23 21:27 ` Oleksij Rempel
2026-01-23 22:15 ` Jakub Kicinski [this message]
2026-01-24 9:28 ` Oleksij Rempel
2026-01-22 19:21 ` [PATCH net-next 2/3] eth: fbnic: Add protection against pause storm Mohsin Bashir
2026-01-22 19:21 ` [PATCH net-next 3/3] eth: fbnic: Fetch TX pause storm stats Mohsin Bashir
2026-01-23 9:34 ` [PATCH net-next 0/3] net: ethtool: Track TX pause storm Oleksij Rempel
2026-01-23 11:28 ` Oleksij Rempel
2026-01-23 18:40 ` Jakub Kicinski
2026-01-23 19:31 ` Mohsin Bashir
2026-01-23 21:04 ` Oleksij Rempel
2026-01-23 22:21 ` Jakub Kicinski
2026-01-25 9:59 ` Gal Pressman
2026-01-25 22:30 ` Jakub Kicinski
2026-01-26 6:51 ` Gal Pressman
2026-01-23 19:36 ` Florian Fainelli
2026-01-23 20:05 ` Jakub Kicinski