[PATCH] net: enetc: fix sirq-storm by clearing IDR registers

public inbox for netdev@vger.kernel.org

* [PATCH] net: enetc: fix sirq-storm by clearing IDR registers
@ 2026-02-20 13:29 Zefir Kurtisi
  2026-02-23 16:32 ` Vladimir Oltean
  0 siblings, 1 reply; 8+ messages in thread
From: Zefir Kurtisi @ 2026-02-20 13:29 UTC
  To: claudiu.manoil, vladimir.oltean, wei.fang, xiaoning.wang
  Cc: davem, kuba, netdev, linux-kernel, Zefir Kurtisi

From: Zefir Kurtisi <zefir.kurtisi@westermo.com>

The fsl_enetc driver experiences soft-IRQ storms on LS1028A systems
where up to 500k interrupts/sec are generated, completely saturating
one CPU core. When running with a single core, this causes watchdog
timeouts and system reboots.

Root cause:
The driver was writing to SITXIDR/SIRXIDR (Station Interface summary
registers) to acknowledge interrupts, but these are W1C registers that
only provide a summary view. According to the LS1028A Reference Manual
(Rev. 0, Chapter 16.3):

- TBaIDR/RBaIDR (per-ring, offset 0xa4): RO, "Reading will
  automatically clear all events"
- SITXIDR/SIRXIDR (summary, offset 0xa18/0xa28): W1C, "provides a
  non-destructive read access"

The actual interrupt sources are the per-ring TBaIDR/RBaIDR registers.
The summary registers merely reflect their combined state. Writing to
SITXIDR/SIRXIDR does not clear the underlying per-ring sources, causing
the hardware to immediately re-assert the interrupt.

Fix:
1. Point ring->idr to per-ring TBaIDR/RBaIDR instead of summary
   registers
2. Remove per-packet writes to SITXIDR/SIRXIDR from packet processing
3. Read TBaIDR/RBaIDR once per NAPI poll (in enetc_poll) before
   re-enabling interrupts

This properly acknowledges interrupts at the hardware level and
eliminates the interrupt storm. The optimization of clearing once per
NAPI poll rather than per packet also reduces register access overhead.

Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
Tested-on: LS1028A (NXP Layerscape), Linux 6.6.93
Signed-off-by: Zefir Kurtisi <zefir.kurtisi@westermo.com>
---
 drivers/net/ethernet/freescale/enetc/enetc.c | 18 +++++++++---------
 drivers/net/ethernet/freescale/enetc/enetc.h |  1 +
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index e380a4f39855..8442e87b9b86 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -1291,9 +1291,6 @@ static bool enetc_clean_tx_ring(struct enetc_bdr *tx_ring, int napi_budget)
 		/* BD iteration loop end */
 		if (is_eof) {
 			tx_frm_cnt++;
-			/* re-arm interrupt source */
-			enetc_wr_reg_hot(tx_ring->idr, BIT(tx_ring->index) |
-					 BIT(16 + tx_ring->index));
 		}
 
 		if (unlikely(!bds_to_clean))
@@ -1620,7 +1617,6 @@ static int enetc_clean_rx_ring(struct enetc_bdr *rx_ring,
 		if (!bd_status)
 			break;
 
-		enetc_wr_reg_hot(rx_ring->idr, BIT(rx_ring->index));
 		dma_rmb(); /* for reading other rxbd fields */
 
 		if (enetc_check_bd_errors_and_consume(rx_ring, bd_status,
@@ -1977,7 +1973,6 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring,
 		if (!bd_status)
 			break;
 
-		enetc_wr_reg_hot(rx_ring->idr, BIT(rx_ring->index));
 		dma_rmb(); /* for reading other rxbd fields */
 
 		if (enetc_check_bd_errors_and_consume(rx_ring, bd_status,
@@ -2143,12 +2138,16 @@ static int enetc_poll(struct napi_struct *napi, int budget)
 	v->rx_napi_work = false;
 
 	enetc_lock_mdio();
-	/* enable interrupts */
+	/* Read RBaIDR to acknowledge interrupt (RO, read-to-clear) */
+	enetc_rd_reg_hot(rx_ring->idr);
 	enetc_wr_reg_hot(v->rbier, ENETC_RBIER_RXTIE);
 
-	for_each_set_bit(i, &v->tx_rings_map, ENETC_MAX_NUM_TXQS)
+	for_each_set_bit(i, &v->tx_rings_map, ENETC_MAX_NUM_TXQS) {
+		/* Read TBaIDR to acknowledge interrupt (RO, read-to-clear) */
+		enetc_rd_reg_hot(v->tbidr_base + ENETC_BDR_OFF(i));
 		enetc_wr_reg_hot(v->tbier_base + ENETC_BDR_OFF(i),
 				 ENETC_TBIER_TXTIE);
+	}
 
 	enetc_unlock_mdio();
 
@@ -2608,7 +2607,7 @@ static void enetc_setup_txbdr(struct enetc_hw *hw, struct enetc_bdr *tx_ring)
 
 	tx_ring->tpir = hw->reg + ENETC_BDR(TX, idx, ENETC_TBPIR);
 	tx_ring->tcir = hw->reg + ENETC_BDR(TX, idx, ENETC_TBCIR);
-	tx_ring->idr = hw->reg + ENETC_SITXIDR;
+	tx_ring->idr = hw->reg + ENETC_BDR(TX, idx, ENETC_TBIDR);
 }
 
 static void enetc_setup_rxbdr(struct enetc_hw *hw, struct enetc_bdr *rx_ring,
@@ -2650,7 +2649,7 @@ static void enetc_setup_rxbdr(struct enetc_hw *hw, struct enetc_bdr *rx_ring,
 		rbmr |= ENETC_RBMR_VTE;
 
 	rx_ring->rcir = hw->reg + ENETC_BDR(RX, idx, ENETC_RBCIR);
-	rx_ring->idr = hw->reg + ENETC_SIRXIDR;
+	rx_ring->idr = hw->reg + ENETC_BDR(RX, idx, ENETC_RBIDR);
 
 	rx_ring->next_to_clean = 0;
 	rx_ring->next_to_use = 0;
@@ -2793,6 +2792,7 @@ static int enetc_setup_irqs(struct enetc_ndev_priv *priv)
 		}
 
 		v->tbier_base = hw->reg + ENETC_BDR(TX, 0, ENETC_TBIER);
+		v->tbidr_base = hw->reg + ENETC_BDR(TX, 0, ENETC_TBIDR);
 		v->rbier = hw->reg + ENETC_BDR(RX, i, ENETC_RBIER);
 		v->ricr1 = hw->reg + ENETC_BDR(RX, i, ENETC_RBICR1);
 
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
index aecd40aeef9c..2b4b052e43c8 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc.h
@@ -374,6 +374,7 @@ static inline bool enetc_is_pseudo_mac(struct enetc_si *si)
 struct enetc_int_vector {
 	void __iomem *rbier;
 	void __iomem *tbier_base;
+	void __iomem *tbidr_base;
 	void __iomem *ricr1;
 	unsigned long tx_rings_map;
 	int count_tx_rings;
-- 
2.43.0



* Re: [PATCH] net: enetc: fix sirq-storm by clearing IDR registers
  2026-02-20 13:29 [PATCH] net: enetc: fix sirq-storm by clearing IDR registers Zefir Kurtisi
@ 2026-02-23 16:32 ` Vladimir Oltean
  2026-02-26 12:00   ` Vladimir Oltean
  2026-02-26 19:28   ` Zefir Kurtisi
  0 siblings, 2 replies; 8+ messages in thread
From: Vladimir Oltean @ 2026-02-23 16:32 UTC
  To: Zefir Kurtisi
  Cc: claudiu.manoil, wei.fang, xiaoning.wang, davem, kuba, netdev,
	linux-kernel, Zefir Kurtisi

Hi Zefir,

On Fri, Feb 20, 2026 at 02:29:30PM +0100, Zefir Kurtisi wrote:
> From: Zefir Kurtisi <zefir.kurtisi@westermo.com>
> 
> The fsl_enetc driver experiences soft-IRQ storms on LS1028A systems
> where up to 500k interrupts/sec are generated, completely saturating
> one CPU core. When running with a single core, this causes watchdog
> timeouts and system reboots.
> 
> Root cause:
> The driver was writing to SITXIDR/SIRXIDR (Station Interface summary
> registers) to acknowledge interrupts, but these are W1C registers that
> only provide a summary view. According to the LS1028A Reference Manual
> (Rev. 0, Chapter 16.3):
> 
> - TBaIDR/RBaIDR (per-ring, offset 0xa4): RO, "Reading will
>   automatically clear all events"
> - SITXIDR/SIRXIDR (summary, offset 0xa18/0xa28): W1C, "provides a
>   non-destructive read access"
> 
> The actual interrupt sources are the per-ring TBaIDR/RBaIDR registers.
> The summary registers merely reflect their combined state. Writing to
> SITXIDR/SIRXIDR does not clear the underlying per-ring sources, causing
> the hardware to immediately re-assert the interrupt.
> 
> Fix:
> 1. Point ring->idr to per-ring TBaIDR/RBaIDR instead of summary
>    registers
> 2. Remove per-packet writes to SITXIDR/SIRXIDR from packet processing
> 3. Read TBaIDR/RBaIDR once per NAPI poll (in enetc_poll) before
>    re-enabling interrupts
> 
> This properly acknowledges interrupts at the hardware level and
> eliminates the interrupt storm. The optimization of clearing once per
> NAPI poll rather than per packet also reduces register access overhead.
> 
> Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
> Tested-on: LS1028A (NXP Layerscape), Linux 6.6.93
> Signed-off-by: Zefir Kurtisi <zefir.kurtisi@westermo.com>
> ---

Thank you for your patch and for debugging.

I am not sure whether your interpretation of the documentation is
correct. I have asked a colleague familiar with the hardware design and
will come back when I am 100% sure.

Superficially, I believe you may have mixed up the documentation for
SITXIDR/SIRXIDR with PSIIDR/VSIIDR. There, indeed, it says "Summary of
detected interrupts for all transmit rings belonging to the SI (...)
Read only, clear using SITXIDR."

I wonder whether it's possible you are looking at a different issue
instead, completely unrelated to hardirq masking. I notice that stable
tag v6.6.93 is lacking this commit:
https://github.com/torvalds/linux/commit/50bd33f6b392
which is high on my list of suspiciously similar issues in terms of behaviour.

(note: when submitting a patch to mainline net.git main branch, it's a
good idea to also test *on* the net.git main branch, aka
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/)

I also note that I have put prints each time the driver clears the
interrupts by writing to SITXIDR/SIRXIDR, and with various workloads on
eno0/eno2/eno3, not once have I noticed the interrupt to still be pending
in TBaIDR/RBaIDR.

Is there something special about your setup? What interfaces and traffic
pattern are you using?

This patch should be put on hold until it is clear to everybody what is
going on.


* Re: [PATCH] net: enetc: fix sirq-storm by clearing IDR registers
  2026-02-23 16:32 ` Vladimir Oltean
@ 2026-02-26 12:00   ` Vladimir Oltean
  2026-02-26 19:28   ` Zefir Kurtisi
  1 sibling, 0 replies; 8+ messages in thread
From: Vladimir Oltean @ 2026-02-26 12:00 UTC
  To: Zefir Kurtisi
  Cc: claudiu.manoil, wei.fang, xiaoning.wang, davem, kuba, netdev,
	linux-kernel, Zefir Kurtisi

On Mon, Feb 23, 2026 at 06:32:27PM +0200, Vladimir Oltean wrote:
> I am not sure whether your interpretation of the documentation is
> correct. I have asked a colleague familiar with the hardware design and
> will come back when I am 100% sure.

I got confirmation that clearing ring interrupts through the
SITXIDR/SIRXIDR registers is fine, which is contrary to the main claim
of this patch.
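
The W1C behaviour confirmed above can be pictured with a small software
model. This is only a sketch, assuming the summary register and the
per-ring state reflect the same latched events; struct toy_si and the
toy_* helpers are hypothetical names for illustration, not driver or
hardware definitions.

```c
#include <stdint.h>

/* Toy model: bit n of `pending` stands for an event latched by ring n. */
struct toy_si {
	uint32_t pending;
};

/* Reading the summary register is non-destructive. */
static uint32_t toy_summary_read(const struct toy_si *si)
{
	return si->pending;
}

/* W1C: every 1 bit written clears the matching event, 0 bits are ignored. */
static void toy_summary_w1c(struct toy_si *si, uint32_t val)
{
	si->pending &= ~val;
}
```

In this model, writing BIT(n) acknowledges ring n only and leaves events
of the other rings pending.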


* Re: [PATCH] net: enetc: fix sirq-storm by clearing IDR registers
  2026-02-23 16:32 ` Vladimir Oltean
  2026-02-26 12:00   ` Vladimir Oltean
@ 2026-02-26 19:28   ` Zefir Kurtisi
  2026-02-26 19:51     ` Vladimir Oltean
  1 sibling, 1 reply; 8+ messages in thread
From: Zefir Kurtisi @ 2026-02-26 19:28 UTC
  To: Vladimir Oltean
  Cc: claudiu.manoil, wei.fang, xiaoning.wang, davem, kuba, netdev,
	linux-kernel, Zefir Kurtisi

Hi Vladimir,

On 2/23/26 17:32, Vladimir Oltean wrote:
> Hi Zefir,
> 
> On Fri, Feb 20, 2026 at 02:29:30PM +0100, Zefir Kurtisi wrote:
>> From: Zefir Kurtisi <zefir.kurtisi@westermo.com>
>>
>> The fsl_enetc driver experiences soft-IRQ storms on LS1028A systems
>> where up to 500k interrupts/sec are generated, completely saturating
>> one CPU core. When running with a single core, this causes watchdog
>> timeouts and system reboots.
>>
>> Root cause:
>> The driver was writing to SITXIDR/SIRXIDR (Station Interface summary
>> registers) to acknowledge interrupts, but these are W1C registers that
>> only provide a summary view. According to the LS1028A Reference Manual
>> (Rev. 0, Chapter 16.3):
>>
>> - TBaIDR/RBaIDR (per-ring, offset 0xa4): RO, "Reading will
>>    automatically clear all events"
>> - SITXIDR/SIRXIDR (summary, offset 0xa18/0xa28): W1C, "provides a
>>    non-destructive read access"
>>
>> The actual interrupt sources are the per-ring TBaIDR/RBaIDR registers.
>> The summary registers merely reflect their combined state. Writing to
>> SITXIDR/SIRXIDR does not clear the underlying per-ring sources, causing
>> the hardware to immediately re-assert the interrupt.
>>
>> Fix:
>> 1. Point ring->idr to per-ring TBaIDR/RBaIDR instead of summary
>>     registers
>> 2. Remove per-packet writes to SITXIDR/SIRXIDR from packet processing
>> 3. Read TBaIDR/RBaIDR once per NAPI poll (in enetc_poll) before
>>     re-enabling interrupts
>>
>> This properly acknowledges interrupts at the hardware level and
>> eliminates the interrupt storm. The optimization of clearing once per
>> NAPI poll rather than per packet also reduces register access overhead.
>>
>> Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
>> Tested-on: LS1028A (NXP Layerscape), Linux 6.6.93
>> Signed-off-by: Zefir Kurtisi <zefir.kurtisi@westermo.com>
>> ---
> 
> Thank you for your patch and for debugging.
> 
> I am not sure whether your interpretation of the documentation is
> correct. I have asked a colleague familiar with the hardware design and
> will come back when I am 100% sure.
> 
> Superficially, I believe you may have mixed up the documentation for
> SITXIDR/SIRXIDR with PSIIDR/VSIIDR. There, indeed, it says "Summary of
> detected interrupts for all transmit rings belonging to the SI (...)
> Read only, clear using SITXIDR."
> 
> I wonder whether it's possible you are looking at a different issue
> instead, completely unrelated to hardirq masking. I notice that stable
> tag v6.6.93 is lacking this commit:
> https://github.com/torvalds/linux/commit/50bd33f6b392
> which is high on my list of suspiciously similar issues in terms of behaviour.
> 
> (note: when submitting a patch to mainline net.git main branch, it's a
> good idea to also test *on* the net.git main branch, aka
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/)
> 
> I also note that I have put prints each time the driver clears the
> interrupts by writing to SITXIDR/SIRXIDR, and with various workloads on
> eno0/eno2/eno3, not once have I noticed the interrupt to still be pending
> in TBaIDR/RBaIDR.
> 
> Is there something special about your setup? What interfaces and traffic
> pattern are you using?
> 
> This patch should be put on hold until it is clear to everybody what is
> going on.

Thank you for the feedback and clarifications.

The statement that SITXIDR/SIRXIDR bits are directly linked to 
TBaIDR/RBaIDR is missing in the reference manual, i.e. it states that 
the former represent a summary of the latter, but not that W1C-ing the 
bits in SITXIDR would also move the other direction and clear TBaIDR. 
That clarifies quite a bit, thanks.

As for your request to take the mainline branch, I am depending on an
OpenWRT build system, and shifting Linux kernel versions is unfortunately
not something I can do quickly. As for the potentially missing patch
you pointed me to, I backported that one, but it makes no difference.

Luckily meanwhile I was able to narrow down the issue and can provide 
you a means to hopefully reproduce it. This is the tl;dr version:

* enetc operates eth0
* ath9k operates wlan0
* both are bridged over OVS
* device is AP with an active STA connected to it
* STA regularly sends an L2 WNM keep-alive frame
  * that frame is 'buggy' as being tagged IPv4 but without payload
* through the OVS bridge that frame makes it into eth0 TX path
* enetc_start_xmit() enqueues it into TX-BD
* HW processes that descriptor, sets IDR and issues interrupt
* enetc_clean_tx_ring()
  * gets a bds_to_clean=0 (tx_ring->tcir = tx_ring->next_to_clean)
   * i.e. HW signals it completed the BD but did not advance TCIR
  * skips the while() loop
  * and hence never clears the corresponding SITXIDR bit
* enetc_poll() after completion of ring processing
  * re-enables interrupts
  * but the one bit in SITXIDR is now sticky
* interrupt is re-asserted immediately
* the affected core remains 100% SIRQing
* it only recovers when the affected TX ring advances

So in short, enetc breaks when sending 0-byte frames.
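
The sequence above can be reproduced in a toy model. This is a minimal
sketch, assuming the interrupt-detect bit is only ever cleared inside the
cleanup loop; struct toy_tx_ring and the toy_* functions are hypothetical
and only mirror the shape of the driver logic, not its actual code.

```c
#include <stdbool.h>

#define TOY_RING_SIZE 8

struct toy_tx_ring {
	unsigned int tcir;          /* consumer index as reported by "hardware" */
	unsigned int next_to_clean; /* driver's cleanup position */
	bool idr_pending;           /* interrupt-detect bit for this ring */
};

/* Number of completed descriptors the driver has not cleaned yet. */
static unsigned int toy_bds_ready(const struct toy_tx_ring *r)
{
	return (r->tcir + TOY_RING_SIZE - r->next_to_clean) % TOY_RING_SIZE;
}

/* The detect bit is acknowledged only while descriptors are cleaned. */
static void toy_clean_tx_ring(struct toy_tx_ring *r)
{
	unsigned int bds = toy_bds_ready(r);

	while (bds--) {
		r->next_to_clean = (r->next_to_clean + 1) % TOY_RING_SIZE;
		r->idr_pending = false; /* "re-arm interrupt source" */
	}
}

/* Count polls until the interrupt stops re-firing, capped at max_polls. */
static unsigned int toy_polls_until_quiet(struct toy_tx_ring *r,
					  unsigned int max_polls)
{
	unsigned int polls = 0;

	while (r->idr_pending && polls < max_polls) {
		toy_clean_tx_ring(r);
		polls++;
	}
	return polls;
}
```

With tcir one ahead of next_to_clean, a single poll clears the bit. With
tcir equal to next_to_clean and the bit set, the loop body never runs and
the model keeps polling until the cap is reached, which corresponds to
the storm described above.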

The patch that I provided resolves the problem by force-cleaning all 
IDRs before interrupts are re-enabled. That is the sledge-hammer 
approach, since it also unmasks BDs that were just completed during
execution of enetc_poll() or no_eof BDs. Hence it is not the final
solution, but currently anything is better than a freezing box.

Below is the tool I wrote to fire such a frame-of-death. If you can
reproduce the observation, I'd prepare a v2 patch to unblock the issue
once it happens. Preventing enetc_start_xmit() from sending such frames
I'd leave to you, since handling that properly looks complex to me.

Cheers,
Zefir

---

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/socket.h>
#include <net/ethernet.h>
#include <netpacket/packet.h>
#include <net/if.h>
#include <arpa/inet.h>
#include <sys/ioctl.h>

/*
  * Enetc-Killer
  *
  * This is a PoC for fsl_enetc Ethernet driver to detect an
  * issue the driver has when zero-payload IP packets are sent.
  *
  * It was detected when using an enetc Ethernet interface bridged
  * with a wireless interface operating as AP. A connected client
  * regularly sends L2 WNM keep-alive frames without IP payload.
  * Through the bridge this 'buggy' packet makes it into the
  * enetc TX path, which the driver enqueues for sending and
  * the HW signals transmission done but without providing a
  * completed TX-BD. This leads to a sticky interrupt detected
  * flag causing a SIRQ-storm.
  *
  * This has been tested on a LS1028A based system under an
  * OpenWRT derivative / linux 6.6.93
  *
  * To test:
  * * build and copy binary to device
  * * connect over serial, leave eth0 idle
  * * ensure device runs with multiple cores enabled (otherwise it freezes)
  * * run the program
  * * with top, observe that one core is fully loaded with SIRQ
  * * to recover, storm-ping eth0 from outside to
  *   enforce TX-BD advance
  */

int main()
{
	int sock = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
	if (sock < 0) {
		perror("socket");
		return 1;
	}

	struct ifreq ifr;
	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, "eth0", IFNAMSIZ);
	if (ioctl(sock, SIOCGIFINDEX, &ifr) < 0) {
		perror("ioctl");
		return 1;
	}

	struct sockaddr_ll addr = { 0 };
	addr.sll_family = AF_PACKET;
	addr.sll_ifindex = ifr.ifr_ifindex;
	addr.sll_halen = ETH_ALEN;
	addr.sll_protocol = htons(ETH_P_IP);
	// Destination MAC (Broadcast)
	addr.sll_addr[0] = 0xff;
	addr.sll_addr[1] = 0xff;
	addr.sll_addr[2] = 0xff;
	addr.sll_addr[3] = 0xff;
	addr.sll_addr[4] = 0xff;
	addr.sll_addr[5] = 0xff;

	// "broken" packet: only Ethernet-header, no IP-payload
	// as sent by wpa_supplicant as L2 WNM keep-alive frame
	unsigned char buf[14] = {
		0xff, 0xff, 0xff, 0xff, 0xff, 0xff,	// DST MAC
		0x00, 0x11, 0x22, 0x33, 0x44, 0x55,	// SRC MAC
		0x08, 0x00				// EtherType = IPv4
	};

	if (sendto(sock, buf, sizeof(buf), 0, (struct sockaddr *)&addr,
		   sizeof(addr)) < 0) {
		perror("sendto");
		return 1;
	}
	close(sock);
	return 0;
}


* Re: [PATCH] net: enetc: fix sirq-storm by clearing IDR registers
  2026-02-26 19:28   ` Zefir Kurtisi
@ 2026-02-26 19:51     ` Vladimir Oltean
  2026-02-27 11:28       ` Zefir Kurtisi
  0 siblings, 1 reply; 8+ messages in thread
From: Vladimir Oltean @ 2026-02-26 19:51 UTC
  To: Zefir Kurtisi
  Cc: claudiu.manoil, wei.fang, xiaoning.wang, davem, kuba, netdev,
	linux-kernel, Zefir Kurtisi

On Thu, Feb 26, 2026 at 08:28:53PM +0100, Zefir Kurtisi wrote:
> Thank you for the feedback and clarifications.
> 
> The statement that SITXIDR/SIRXIDR bits are directly linked to TBaIDR/RBaIDR
> is missing in the reference manual, i.e. it states that the former represent
> a summary of the latter, but not that W1C-ing the bits in SITXIDR would also
> move the other direction and clear TBaIDR. That clarifies quite a bit,
> thanks.
> 
> As for your request to take the mainline branch, I am depending on a OpenWRT
> build-system and shifting linux kernel-versions is unfortunately not
> something I can do in no time. As for the potentially missing patch you
> pointed me to, I backported that one, but it makes no difference.
> 
> Luckily meanwhile I was able to narrow down the issue and can provide you a
> means to hopefully reproduce it. This is the tl;dr version:
> 
> * enetc operates eth0
> * ath9k operates wlan0
> * both are bridged over OVS
> * device is AP with an active STA connected to it
> * STA regularly sends an L2 WNM keep-alive frame
>  * that frame is 'buggy' as being tagged IPv4 but without payload
> * through the OVS bridge that frame makes it into eth0 TX path
> * enetc_start_xmit() enqueues it into TX-BD
> * HW processes that descriptor, sets IDR and issues interrupt
> * enetc_clean_tx_ring()
>  * gets a bds_to_clean=0 (tx_ring->tcir = tx_ring->next_to_clean)
>   * i.e. HW signals it completed the BD but did not advance TCIR
>  * skips the while() loop
>  * and hence never clears the according SITXIDR bit
> * enetc_poll() after completion of ring processing
>  * re-enables interrupts
>  * but the one bit in SITXIDR is now sticky
> * interrupt is re-asserted immediately
> * the affected core remains 100% SIRQing
> * it only recovers when the affected TX ring advances
> 
> So in short, enetc breaks when sending 0-byte frames.
> 
> The patch that I provided resolves the problem by force-cleaning all IDRs
> before interrupts are re-enabled. That is the sledge-hammer approach, since
> it also unmasks BDs that were just completed during
> execution of enetc_poll() or no_eof BDs. Hence it is not the final
> solution, but currently anything is better than a freezing box.
> 
> Below is the tool I wrote to fire such a frame-of-death. If you can
> reproduce the observation, I'd prepare a v2 patch to unblock the issue once
> it happens - preventing enetc_start_xmit() from sending such frames I'd
> leave to you, since that part looks complex to me to handle it properly.
> 
> 
> Cheers,
> Zefir
> 
> ---
> 
> #include <stdio.h>
> #include <string.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <sys/socket.h>
> #include <net/ethernet.h>
> #include <netpacket/packet.h>
> #include <net/if.h>
> #include <arpa/inet.h>
> #include <sys/ioctl.h>
> 
> /*
>  * Enetc-Killer
>  *
>  * This is a PoC for fsl_enetc Ethernet driver to detect an
>  * issue the driver has when zero-payload IP packets are sent.
>  *
>  * It was detected when using an enetc Ethernet interface bridged
>  * with a wireless interface operating as AP. A connected client
>  * regularly sends L2 WNM keep-alive frames without IP payload.
>  * Through the bridge this 'buggy' packet makes it into the
>  * enetc TX path, which the driver enqueues for sending and
>  * the HW signals transmission done but without providing a
>  * completed TX-BD. This leads to a sticky interrupt detected
>  * flag causing a SIRQ-storm.
>  *
>  * This has been tested on a LS1028A based system under an
>  * OpenWRT derivative / linux 6.6.93
>  *
>  * To test:
>  * * build and copy binary to device
>  * * connect over serial, leave eth0 idle
>  * * ensure device runs with multiple cores enabled (otherwise it freezes)
>  * * run the program
>  * * with top, observe that one core is fully loaded with SIRQ
>  * * to recover, storm-ping eth0 from outside to
>  *   enforce TX-BD advance
>  */
> 
> int main()
> {
> 	int sock = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
> 	if (sock < 0) {
> 		perror("socket");
> 		return 1;
> 	}
> 
> 	struct ifreq ifr;
> 	memset(&ifr, 0, sizeof(ifr));
> 	strncpy(ifr.ifr_name, "eth0", IFNAMSIZ);
> 	if (ioctl(sock, SIOCGIFINDEX, &ifr) < 0) {
> 		perror("ioctl");
> 		return 1;
> 	}
> 
> 	struct sockaddr_ll addr = { 0 };
> 	addr.sll_family = AF_PACKET;
> 	addr.sll_ifindex = ifr.ifr_ifindex;
> 	addr.sll_halen = ETH_ALEN;
> 	addr.sll_protocol = htons(ETH_P_IP);
> 	// Destination MAC (Broadcast)
> 	addr.sll_addr[0] = 0xff;
> 	addr.sll_addr[1] = 0xff;
> 	addr.sll_addr[2] = 0xff;
> 	addr.sll_addr[3] = 0xff;
> 	addr.sll_addr[4] = 0xff;
> 	addr.sll_addr[5] = 0xff;
> 
> 	// "broken" packet: only Ethernet-header, no IP-payload
> 	// as sent by wpa_supplicant as L2 WNM keep-alive frame
> 	unsigned char buf[14] = {
> 		0xff, 0xff, 0xff, 0xff, 0xff, 0xff,	// DST MAC
> 		0x00, 0x11, 0x22, 0x33, 0x44, 0x55,	// SRC MAC
> 		0x08, 0x00				// EtherType = IPv4
> 	};
> 
> 	if (sendto(sock, buf, sizeof(buf), 0, (struct sockaddr*) &addr,
> sizeof(addr)) < 0) {
> 		perror("sendto");
> 		return 1;
> 	}
> 	close(sock);
> 	return 0;
> }
> 

If I understand correctly, this patch should resolve your issue?

From 887cf74648fb10b7dee3c60199349d184c5a851e Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: Thu, 26 Feb 2026 21:50:13 +0200
Subject: [PATCH] net: enetc: avoid sending too short Ethernet frames

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 drivers/net/ethernet/freescale/enetc/enetc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index b0b47b0f7723..4f5e593b348a 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -982,6 +982,9 @@ static netdev_tx_t enetc_start_xmit(struct sk_buff *skb,
 	struct enetc_bdr *tx_ring;
 	int count;
 
+	if (eth_skb_pad(skb))
+		return NETDEV_TX_OK;
+
 	/* Queue one-step Sync packet if already locked */
 	if (enetc_cb->flag & ENETC_F_TX_ONESTEP_SYNC_TSTAMP) {
 		if (test_and_set_bit_lock(ENETC_TX_ONESTEP_TSTAMP_IN_PROGRESS,
-- 
2.43.0



* Re: [PATCH] net: enetc: fix sirq-storm by clearing IDR registers
  2026-02-26 19:51     ` Vladimir Oltean
@ 2026-02-27 11:28       ` Zefir Kurtisi
  2026-02-27 11:57         ` Vladimir Oltean
  0 siblings, 1 reply; 8+ messages in thread
From: Zefir Kurtisi @ 2026-02-27 11:28 UTC
  To: Vladimir Oltean
  Cc: claudiu.manoil, wei.fang, xiaoning.wang, davem, kuba, netdev,
	linux-kernel, Zefir Kurtisi



On 2/26/26 20:51, Vladimir Oltean wrote:
> [...]
> If I understand correctly, this patch should resolve your issue?
> 
>  From 887cf74648fb10b7dee3c60199349d184c5a851e Mon Sep 17 00:00:00 2001
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
> Date: Thu, 26 Feb 2026 21:50:13 +0200
> Subject: [PATCH] net: enetc: avoid sending too short Ethernet frames
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---
>   drivers/net/ethernet/freescale/enetc/enetc.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
> index b0b47b0f7723..4f5e593b348a 100644
> --- a/drivers/net/ethernet/freescale/enetc/enetc.c
> +++ b/drivers/net/ethernet/freescale/enetc/enetc.c
> @@ -982,6 +982,9 @@ static netdev_tx_t enetc_start_xmit(struct sk_buff *skb,
>   	struct enetc_bdr *tx_ring;
>   	int count;
>   
> +	if (eth_skb_pad(skb))
> +		return NETDEV_TX_OK;
> +
>   	/* Queue one-step Sync packet if already locked */
>   	if (enetc_cb->flag & ENETC_F_TX_ONESTEP_SYNC_TSTAMP) {
>   		if (test_and_set_bit_lock(ENETC_TX_ONESTEP_TSTAMP_IN_PROGRESS,

Hi Vladimir,

yes, that fixes it.

I expected identifying invalid frames with all corner-cases to be 
difficult, not realizing that making them valid is way easier. Thank you 
for that insight.

Tested-by: Zefir Kurtisi <zefir.kurtisi@westermo.com>
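
For illustration, the same idea applied in user space to the 14-byte
frame from the reproducer would look roughly like this. It is only a
sketch: toy_pad_frame() and TOY_ETH_ZLEN are hypothetical names, with 60
chosen to match the kernel's ETH_ZLEN (minimum frame length without FCS),
and the helper is not what eth_skb_pad() does internally with the skb.

```c
#include <stddef.h>
#include <string.h>

#define TOY_ETH_ZLEN 60 /* same value as the kernel's ETH_ZLEN */

/*
 * Zero-pad a frame up to TOY_ETH_ZLEN bytes.
 * Returns the resulting length, or 0 if the buffer cannot hold the padding.
 */
static size_t toy_pad_frame(unsigned char *buf, size_t len, size_t cap)
{
	if (len >= TOY_ETH_ZLEN)
		return len;	/* already long enough, leave untouched */
	if (cap < TOY_ETH_ZLEN)
		return 0;	/* no room for the padding */

	memset(buf + len, 0, TOY_ETH_ZLEN - len);
	return TOY_ETH_ZLEN;
}
```

Declaring the reproducer's buffer with at least 60 bytes and passing the
returned length to sendto() would then send a minimum-size frame instead
of the bare header.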


While debugging the issue I extended enetc with two features that might
still be useful. The first selectively processes only the pending BDs
instead of looping over all available ones, which saves a significant
number of unneeded calls to enetc_clean_rx/tx_ring(). The second adds a
fail-safe mechanism for other issues that might take the same code path.
It relies on the fact that a BD is now only processed when its IDR is
active, which means there must be BDs available to process.

Please find the patches below, if you think they are of some value, take 
as you see fit or let me know to prepare a v2.

Thank you for looking into it and resolving it so swiftly.

Cheers,
Zefir

---
 From 54791dad5555d6dea0c0276268058d491b25548a Mon Sep 17 00:00:00 2001
From: Zefir Kurtisi <zefir.kurtisi@westermo.com>
Date: Fri, 27 Feb 2026 11:14:08 +0100
Subject: [PATCH 1/2] net: enetc: process only those BDs that have IDR set

The enetc_poll() function currently tries to process all
BDs without considering which one caused the interrupt.

When a frame is received, it still loops over all TX BDs,
only for enetc_clean_tx_ring() to find out that there are no
pending BDs to clean.

SITXIDR/SIRXIDR provide us with the list of BDs that were
triggering the interrupt, so we can limit processing to
those affected.

This changes enetc_poll processing accordingly.

Signed-off-by: Zefir Kurtisi <zefir.kurtisi@westermo.com>
---
  drivers/net/ethernet/freescale/enetc/enetc.c | 24 +++++++++++++++-----
  1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index e380a4f39855..e9d600506d83 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -2114,19 +2114,31 @@ static int enetc_poll(struct napi_struct *napi, int budget)
  	bool complete = true;
  	int work_done;
  	int i;
+	u32 si_idr;

  	enetc_lock_mdio();

-	for (i = 0; i < v->count_tx_rings; i++)
+	si_idr = enetc_rd_reg_hot(v->tx_ring[0].idr);
+	for (i = 0; i < v->count_tx_rings; i++) {
+		/* only process TX BDs that have IDR bit set */
+		if (!(si_idr & BIT(v->tx_ring[i].index)))
+			continue;
  		if (!enetc_clean_tx_ring(&v->tx_ring[i], budget))
  			complete = false;
+	}
+
+	si_idr = enetc_rd_reg_hot(rx_ring->idr);
  	enetc_unlock_mdio();

-	prog = rx_ring->xdp.prog;
-	if (prog)
-		work_done = enetc_clean_rx_ring_xdp(rx_ring, napi, budget, prog);
-	else
-		work_done = enetc_clean_rx_ring(rx_ring, napi, budget);
+	if (si_idr & BIT(rx_ring->index)) {
+		prog = rx_ring->xdp.prog;
+		if (prog)
+			work_done = enetc_clean_rx_ring_xdp(rx_ring, napi, budget, prog);
+		else
+			work_done = enetc_clean_rx_ring(rx_ring, napi, budget);
+	} else {
+		work_done = 0;
+	}
  	if (work_done == budget)
  		complete = false;
  	if (work_done)
-- 
2.43.0
---
 From a144fc991784ea7ee51c4a46d03b46ee4a17956a Mon Sep 17 00:00:00 2001
From: Zefir Kurtisi <zefir.kurtisi@westermo.com>
Date: Fri, 27 Feb 2026 11:21:49 +0100
Subject: [PATCH 2/2] net: enetc: clear sticky IDR bit causing IRQ-storms

We observed enetc entering a state with a sticky IDR bit
set causing IRQ-storms. It could be tracked down to
a zero-sized IP frame being transmitted, which causes the
HW to issue IDR but not advance the TX BD.

Vladimir Oltean fixed this by ensuring TX frames are padded
to the minimum size.

This adds a fail-safe mechanism to recover from such issues
in case other stimuli end up taking the same code path.

The mechanism is based on the fact that the current TX BD is
being processed because it raised an IDR. If, contrary to
that, no descriptors are pending, a warning is issued and
the TBaIDR is cleared.

Signed-off-by: Zefir Kurtisi <zefir.kurtisi@westermo.com>
---
  drivers/net/ethernet/freescale/enetc/enetc.c | 8 ++++++++
  1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index e9d600506d83..d8680a5768b5 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -1229,6 +1229,14 @@ static bool enetc_clean_tx_ring(struct enetc_bdr *tx_ring, int napi_budget)

  	bds_to_clean = enetc_bd_ready_count(tx_ring, i);

+	if (!bds_to_clean) {
+		/* this shall not happen, since IDR indicated data available */
+		netdev_warn(ndev, "TX[%d]: IDR set, but no BDs to clean => "
+			    "clearing IDR to recover\n", tx_ring->index);
+		enetc_wr_reg_hot(tx_ring->idr, BIT(tx_ring->index) |
+				 BIT(16 + tx_ring->index));
+	}
+
  	do_twostep_tstamp = false;

  	while (bds_to_clean && tx_frm_cnt < ENETC_DEFAULT_TX_WORK) {
-- 
2.43.0



* Re: [PATCH] net: enetc: fix sirq-storm by clearing IDR registers
  2026-02-27 11:28       ` Zefir Kurtisi
@ 2026-02-27 11:57         ` Vladimir Oltean
  2026-02-27 13:03           ` Zefir Kurtisi
  0 siblings, 1 reply; 8+ messages in thread
From: Vladimir Oltean @ 2026-02-27 11:57 UTC (permalink / raw)
  To: Zefir Kurtisi
  Cc: claudiu.manoil, wei.fang, xiaoning.wang, davem, kuba, netdev,
	linux-kernel, Zefir Kurtisi

On Fri, Feb 27, 2026 at 12:28:17PM +0100, Zefir Kurtisi wrote:
> While I was debugging the issue I extended enetc with features that still
> might be useful; the first change was to selectively process only the
> pending BDs instead of looping over all available, saving a significant
> number of unneeded calls to enetc_clean_rx/tx_ring(); the second adds a
> fail-safe mechanism for potentially other issues taking the same code path
> based on the fact that now a BD is only processed when its IDR is active,
> which means there must be BDs available to process.

We'll perform our own investigation of what is happening when
transmitting short frames. It will take some time.

But I'm not sure I understand what you mean about processing a BD ring
when its interrupt is active. NAPI (Documentation/networking/napi.rst)
works on the basic premise that a single hardirq is sufficient to process
a very large batch of frames, and keeping hardirqs enabled is detrimental
to performance. Instead, NAPI (re-)schedules softirqs until there are no
further frames to process, and only then re-enables the hardirq.

Perhaps I didn't understand very well what you mean, but it sounds like
you want to circumvent NAPI, essentially. When enetc_poll() is called,
you seem to assume it's the first time it's been called after enetc_msix()
has called napi_schedule(). But it's not. It can also reschedule itself,
when the work done is equal to the NAPI budget (complete==false), with
the hardirq _still_ masked.

The correct NAPI behaviour _is_ for the hardirq to be masked for a very
long while, when subject to continuous streams of traffic.
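
The rescheduling behaviour described above can be illustrated with a
small user-space model. This is only a sketch under stated assumptions:
struct napi_model and the model_*() functions are hypothetical names
invented for illustration, and nothing here is kernel or enetc code. It
shows the hardirq being masked once and staying masked across several
poll passes until a pass does less work than its budget.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct napi_model {
	int pending;      /* frames waiting in the ring */
	bool irq_masked;  /* true while the hardirq is disabled */
	int polls;        /* number of poll passes so far */
};

/* Hardirq handler: mask the interrupt and hand over to polling. */
void model_hardirq(struct napi_model *m)
{
	m->irq_masked = true;
}

/* One poll pass: process at most budget frames and return the work done. */
int model_poll(struct napi_model *m, int budget)
{
	int work_done;

	assert(budget > 0);
	assert(m->irq_masked); /* polling only runs with the hardirq masked */

	work_done = m->pending < budget ? m->pending : budget;
	m->pending -= work_done;
	m->polls++;

	/* Budget exhausted: stay in polling mode, hardirq remains masked. */
	if (work_done == budget)
		return work_done;

	/* Less work than budget: polling is complete, unmask the hardirq. */
	m->irq_masked = false;
	return work_done;
}

/* Take one hardirq, then poll until complete; returns the pass count. */
int model_run(struct napi_model *m, int budget)
{
	model_hardirq(m);
	while (m->irq_masked)
		model_poll(m, budget);
	return m->polls;
}
```

With 150 pending frames and a budget of 64, model_run() makes three
passes (64, 64 and 22 frames) after a single hardirq, and the interrupt
is unmasked only at the end of the third pass.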


* Re: [PATCH] net: enetc: fix sirq-storm by clearing IDR registers
  2026-02-27 11:57         ` Vladimir Oltean
@ 2026-02-27 13:03           ` Zefir Kurtisi
  0 siblings, 0 replies; 8+ messages in thread
From: Zefir Kurtisi @ 2026-02-27 13:03 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: claudiu.manoil, wei.fang, xiaoning.wang, davem, kuba, netdev,
	linux-kernel, Zefir Kurtisi



On 2/27/26 12:57, Vladimir Oltean wrote:
> On Fri, Feb 27, 2026 at 12:28:17PM +0100, Zefir Kurtisi wrote:
>> While I was debugging the issue I extended enetc with features that still
>> might be useful; the first change was to selectively process only the
>> pending BDs instead of looping over all available, saving a significant
>> number of unneeded calls to enetc_clean_rx/tx_ring(); the second adds a
>> fail-safe mechanism for potentially other issues taking the same code path
>> based on the fact that now a BD is only processed when its IDR is active,
>> which means there must be BDs available to process.
> 
> We'll perform our own investigation of what is happening when
> transmitting short frames. It will take some time.
> 
> But I'm not sure I understand what you mean about processing a BD ring
> when its interrupt is active. NAPI (Documentation/networking/napi.rst)
> works on the basic premise that a single hardirq is sufficient to process
> a very large batch of frames, and keeping hardirqs enabled is detrimental
> to performance. Instead, NAPI (re-)schedules softirqs until there are no
> further frames to process, and only then re-enables the hardirq.
> 
> Perhaps I didn't understand very well what you mean, but it sounds like
> you want to circumvent NAPI, essentially. When enetc_poll() is called,
> you seem to assume it's the first time it's been called after enetc_msix()
> has called napi_schedule(). But it's not. It can also reschedule itself,
> when the work done is equal to the NAPI budget (complete==false), with
> the hardirq _still_ masked.
> 
> The correct NAPI behaviour _is_ for the hardirq to be masked for a very
> long while, when subject to continuous streams of traffic.

The proposed change in no way alters how and when interrupts are
re-enabled or NAPI is re-scheduled. What it does is:

Assume there are completed RX frames, but no completed TX frames:
SIRXIDR=0x0001, SITXIDR=0x0000

enetc_poll today does:
* enetc_clean_tx_ring(0) -> nop
  * read TB0CIR and find out it did not progress
* enetc_clean_tx_ring(2) -> nop
* enetc_clean_tx_ring(4) -> nop
* enetc_clean_tx_ring(6) -> nop
* enetc_clean_rx_ring(0)

Proposal instead: as long as SITXIDR==0x0000, limit to
* enetc_clean_rx_ring(0)

The IRQ remains masked until NAPI completes; the behaviour is the same
as before. If there is an RX burst, NAPI is re-scheduled multiple times
and a TX BD completes during that time, SITXIDR gets the related bit set
and the TX BD is processed in enetc_poll, all before the interrupt is
re-armed.
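
The difference in call counts can be sketched with a small user-space
model. It is not driver code: struct poll_stats, the model_*() functions
and the assumption that bit n of a summary value corresponds to ring n
are made up for illustration, and the TX ring indices 0, 2, 4 and 6 are
simply taken from the example above.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NUM_TX_RINGS 4 /* hypothetical: TX rings per interrupt vector */
#define MODEL_RX_RING 0      /* hypothetical: RX ring index of this vector */

/* TX ring indices served by one vector, taken from the example above. */
static const int model_tx_rings[MODEL_NUM_TX_RINGS] = { 0, 2, 4, 6 };

/* Counts how many ring clean calls one poll pass would make. */
struct poll_stats {
	int tx_clean_calls;
	int rx_clean_calls;
};

/* Behaviour described as "today": visit every TX ring, then the RX ring. */
struct poll_stats model_poll_all(void)
{
	struct poll_stats s = { 0, 0 };
	int i;

	for (i = 0; i < MODEL_NUM_TX_RINGS; i++)
		s.tx_clean_calls++;
	s.rx_clean_calls++;
	return s;
}

/*
 * Proposed behaviour: visit a ring only if its summary bit is set.
 * Assumption: bit n of each summary value corresponds to ring n.
 */
struct poll_stats model_poll_gated(uint32_t sitxidr, uint32_t sirxidr)
{
	struct poll_stats s = { 0, 0 };
	int i;

	for (i = 0; i < MODEL_NUM_TX_RINGS; i++) {
		assert(model_tx_rings[i] < 32); /* must fit a 32-bit mask */
		if (sitxidr & (UINT32_C(1) << model_tx_rings[i]))
			s.tx_clean_calls++;
	}
	if (sirxidr & (UINT32_C(1) << MODEL_RX_RING))
		s.rx_clean_calls++;
	return s;
}
```

For SIRXIDR=0x0001 and SITXIDR=0x0000, model_poll_all() counts four TX
clean calls and one RX clean call, while model_poll_gated(0x0000, 0x0001)
counts only the RX call.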


But if the improvement is not obvious, it may not justify adding the
change. In the end it only spares some function calls and register
reads, which might not be that significant.

Cheers,
Zefir



end of thread, other threads:[~2026-02-27 13:03 UTC | newest]

Thread overview: 8+ messages
2026-02-20 13:29 [PATCH] net: enetc: fix sirq-storm by clearing IDR registers Zefir Kurtisi
2026-02-23 16:32 ` Vladimir Oltean
2026-02-26 12:00   ` Vladimir Oltean
2026-02-26 19:28   ` Zefir Kurtisi
2026-02-26 19:51     ` Vladimir Oltean
2026-02-27 11:28       ` Zefir Kurtisi
2026-02-27 11:57         ` Vladimir Oltean
2026-02-27 13:03           ` Zefir Kurtisi
