Intel-Wired-Lan Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [Intel-wired-lan] [PATCH net v1] ice: stop counting UDP csum mismatch as rx_errors
@ 2025-12-01 23:38 Jesse Brandeburg
  2025-12-02 22:08 ` Jacob Keller
  2025-12-05  8:26 ` Loktionov, Aleksandr
  0 siblings, 2 replies; 9+ messages in thread
From: Jesse Brandeburg @ 2025-12-01 23:38 UTC (permalink / raw)
  To: netdev
  Cc: Jesse Brandeburg, Tony Nguyen, Jake Keller, IWL, Przemek Kitszel,
	Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Brett Creeley

From: Jesse Brandeburg <jbrandeburg@cloudflare.com>

Since the beginning, the Intel ice driver has counted receive checksum
offload mismatches into the rx_errors member of the rtnl_link_stats64
struct. In ethtool -S these show up as rx_csum_bad.nic.

I believe counting these in rx_errors is fundamentally wrong, as it's
pretty clear from the comments in if_link.h and from every other statistic
the driver is summing into rx_errors, that all of them would cause a
"hardware drop" except for the UDP checksum mismatch, as well as the fact
that all the other causes for rx_errors are L2 reasons, and this L4 UDP
"mismatch" is an outlier.

A last nail in the coffin is that rx_errors is monitored in production and
can indicate a bad NIC/cable/Switch port, but instead some random series of
UDP packets with bad checksums will now trigger this alert. This false
positive makes the alert useless and affects us as well as other companies.

This packet with presumably a bad UDP checksum is *already* passed to the
stack, just not marked as offloaded by the hardware/driver. If it is
dropped by the stack it will show up as UDP_MIB_CSUMERRORS.

And one more thing, none of the other Intel drivers, and at least bnxt_en
and mlx5 both don't appear to count UDP offload mismatches as rx_errors.

Here is a related customer complaint:
https://community.intel.com/t5/Ethernet-Products/ice-rx-errros-is-too-sensitive-to-IP-TCP-attack-packets-Intel/td-p/1662125

Fixes: 4f1fe43c920b ("ice: Add more Rx errors to netdev's rx_error counter")
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Cc: Jake Keller <jacob.e.keller@intel.com>
Cc: IWL <intel-wired-lan@lists.osuosl.org>
Signed-off-by: Jesse Brandeburg <jbrandeburg@cloudflare.com>
--
I am sending this to net as I consider it a bug, and it will backport
cleanly.
---
 drivers/net/ethernet/intel/ice/ice_main.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 86f5859e88ef..d004acfa0f36 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -6995,7 +6995,6 @@ void ice_update_vsi_stats(struct ice_vsi *vsi)
 		cur_ns->rx_errors = pf->stats.crc_errors +
 				    pf->stats.illegal_bytes +
 				    pf->stats.rx_undersize +
-				    pf->hw_csum_rx_error +
 				    pf->stats.rx_jabber +
 				    pf->stats.rx_fragments +
 				    pf->stats.rx_oversize;
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2026-01-27 18:25 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-01 23:38 [Intel-wired-lan] [PATCH net v1] ice: stop counting UDP csum mismatch as rx_errors Jesse Brandeburg
2025-12-02 22:08 ` Jacob Keller
2025-12-05  8:26 ` Loktionov, Aleksandr
2025-12-05 19:05   ` Jesse Brandeburg
2025-12-05 20:56     ` Loktionov, Aleksandr
2025-12-05 22:44       ` jbrandeburg
2025-12-08 22:20         ` Jacob Keller
2026-01-27 17:52           ` Jesse Brandeburg via Intel-wired-lan
2026-01-27 18:25             ` Tony Nguyen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox