public inbox for netdev@vger.kernel.org
From: Jakub Kicinski <kuba@kernel.org>
To: Oleksij Rempel <o.rempel@pengutronix.de>
Cc: Mohsin Bashir <mohsin.bashr@gmail.com>,
	netdev@vger.kernel.org, alexanderduyck@fb.com,
	alok.a.tiwari@oracle.com, andrew+netdev@lunn.ch, andrew@lunn.ch,
	chuck.lever@oracle.com, davem@davemloft.net,
	donald.hunter@gmail.com, edumazet@google.com, gal@nvidia.com,
	horms@kernel.org, idosch@nvidia.com, jacob.e.keller@intel.com,
	kernel-team@meta.com, kory.maincent@bootlin.com, lee@trager.us,
	pabeni@redhat.com, vadim.fedorenko@linux.dev,
	kernel@pengutronix.de
Subject: Re: [PATCH net-next 0/3] net: ethtool: Track TX pause storm
Date: Fri, 23 Jan 2026 10:40:31 -0800
Message-ID: <20260123104031.16d914e4@kernel.org>
In-Reply-To: <aXNbTVF5KNz1yV-1@pengutronix.de>

On Fri, 23 Jan 2026 12:28:13 +0100 Oleksij Rempel wrote:
> Here is a TL;DR summary of my questions regarding the pause storm logic
> :)

Eh, did you get AI to help write the full version? :) So much text :)

> - Does the 500ms hardware timer reset on "flapping" pause signals? If so,
>   a stuttering storm might still crash the link partner (tx watchdog
>   timeout).

Yes, any discontinuity resets the timer AFAIU. Mohsin, keep me honest.

There's a conflict here between respecting user configuration (pause
enabled) vs safety of the network. We're trying to err on the side of
respecting the config. We haven't seen any stutter, yet.
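
To illustrate the reset-on-discontinuity behavior discussed above, here is a
minimal sketch in plain C. It is a toy model, not the fbnic logic: the struct,
the function names, the 1 ms sampling step and the choice to restart the
timer after a trip are all assumptions made for illustration. Only the 500ms
figure comes from the thread.

```c
#include <stdbool.h>

/* The 500ms figure mentioned in the thread; everything else is hypothetical. */
#define STORM_THRESHOLD_MS 500

struct storm_detector {
	unsigned int asserted_ms;  /* length of the current uninterrupted pause run */
	unsigned int storm_events; /* number of times the threshold was reached */
};

/*
 * Feed one sample covering step_ms of time.
 * Any sample with pause deasserted resets the timer (the "discontinuity").
 * Returns true if this sample made the run reach the threshold.
 */
static bool storm_detector_sample(struct storm_detector *d,
				  bool pause_asserted, unsigned int step_ms)
{
	if (!pause_asserted) {
		d->asserted_ms = 0;
		return false;
	}

	d->asserted_ms += step_ms;
	if (d->asserted_ms < STORM_THRESHOLD_MS)
		return false;

	/* Assumption: the timer restarts after an event is counted. */
	d->storm_events++;
	d->asserted_ms = 0;
	return true;
}

/*
 * Simulate total_ms of a repeating pattern: pause asserted for on_ms,
 * then deasserted for off_ms. Returns the number of storm events seen.
 */
static unsigned int simulate(unsigned int on_ms, unsigned int off_ms,
			     unsigned int total_ms)
{
	struct storm_detector d = { 0, 0 };
	unsigned int period = on_ms + off_ms;
	unsigned int t;

	if (period == 0)
		return 0;

	for (t = 0; t < total_ms; t++)
		storm_detector_sample(&d, (t % period) < on_ms, 1);

	return d.storm_events;
}
```

In this model simulate(499, 1, 10000) never counts an event, even though
pause is asserted 99.8% of the time, which is the "stuttering storm" case
raised in the question.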

> - The auto-recovery (service task) enforces a fixed policy. Can we make
>   this configurable? I used devlink health (.recover) to let userspace
>   decide between auto-reset or manual intervention.

There is already a tunable for this exact feature but for PFC:
ETHTOOL_PFC_PREVENTION_TOUT. Should be trivial to add the same thing for
non-PFC pause. But we didn't want to open the uAPI can of worms unless
there's a clear ask and consensus. We don't need tuning (or so we
think), and there was some talk about not adding uAPI for fbnic because
it's a "private device".

> - Should we standardize an "RX Watchdog" mechanism in the core instead of
>   or in addition to driver-specific stats?

Our primary use case is a hard-wedged machine: a Linux crash, a dead
kexec, or a UEFI issue. So it must be the device that implements the
logic.

Florian was proposing a hook to auto-disable pause from the crash
notifier. It sounds like your use case is closer to that?

> - If the main case where we will run into a TX pause storm is an OS crash,
>   what instance will be able to read these stats? Are they preserved on
>   reboot or kexec?

Good question! I was wondering the same thing. In the end I couldn't
figure out which behavior would be less confusing. We want to make sure
that the stat never increments on a live system; if machines come
out of boot with a non-zero value, some alerting system could fire.
OTOH, as you say, we may want to know that it did happen while the machine
was out. So IDK. The fbnic implementation starts with 0.

