public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Zefir Kurtisi <zefir.kurtisi@gmail.com>
To: Vladimir Oltean <vladimir.oltean@nxp.com>
Cc: claudiu.manoil@nxp.com, wei.fang@nxp.com, xiaoning.wang@nxp.com,
	davem@davemloft.net, kuba@kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Zefir Kurtisi <zefir.kurtisi@westermo.com>
Subject: Re: [PATCH] net: enetc: fix sirq-storm by clearing IDR registers
Date: Thu, 26 Feb 2026 20:28:53 +0100	[thread overview]
Message-ID: <b3d9136c-2803-4203-b1ea-1f9e62de80a1@gmail.com> (raw)
In-Reply-To: <20260223163227.y3yzzpncyf5a7klq@skbuf>

Hi Vladimir,

On 2/23/26 17:32, Vladimir Oltean wrote:
> Hi Zefir,
> 
> On Fri, Feb 20, 2026 at 02:29:30PM +0100, Zefir Kurtisi wrote:
>> From: Zefir Kurtisi <zefir.kurtisi@westermo.com>
>>
>> The fsl_enetc driver experiences soft-IRQ storms on LS1028A systems
>> where up to 500k interrupts/sec are generated, completely saturating
>> one CPU core. When running with a single core, this causes watchdog
>> timeouts and system reboots.
>>
>> Root cause:
>> The driver was writing to SITXIDR/SIRXIDR (Station Interface summary
>> registers) to acknowledge interrupts, but these are W1C registers that
>> only provide a summary view. According to the LS1028A Reference Manual
>> (Rev. 0, Chapter 16.3):
>>
>> - TBaIDR/RBaIDR (per-ring, offset 0xa4): RO, "Reading will
>>    automatically clear all events"
>> - SITXIDR/SIRXIDR (summary, offset 0xa18/0xa28): W1C, "provides a
>>    non-destructive read access"
>>
>> The actual interrupt sources are the per-ring TBaIDR/RBaIDR registers.
>> The summary registers merely reflect their combined state. Writing to
>> SITXIDR/SIRXIDR does not clear the underlying per-ring sources, causing
>> the hardware to immediately re-assert the interrupt.
>>
>> Fix:
>> 1. Point ring->idr to per-ring TBaIDR/RBaIDR instead of summary
>>     registers
>> 2. Remove per-packet writes to SITXIDR/SIRXIDR from packet processing
>> 3. Read TBaIDR/RBaIDR once per NAPI poll (in enetc_poll) before
>>     re-enabling interrupts
>>
>> This properly acknowledges interrupts at the hardware level and
>> eliminates the interrupt storm. The optimization of clearing once per
>> NAPI poll rather than per packet also reduces register access overhead.
>>
>> Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
>> Tested-on: LS1028A (NXP Layerscape), Linux 6.6.93
>> Signed-off-by: Zefir Kurtisi <zefir.kurtisi@westermo.com>
>> ---
> 
> Thank you for your patch and for debugging.
> 
> I am not sure whether your interpretation of the documentation is
> correct. I have asked a colleague familiar with the hardware design and
> will come back when I am 100% sure.
> 
> Superficially, I believe you may have mixed up the documentation for
> SITXIDR/SIRXIDR with PSIIDR/VSIIDR. There, indeed, it says "Summary of
> detected interrupts for all transmit rings belonging to the SI (...)
> Read only, clear using SITXIDR."
> 
> I wonder whether it's possible you are looking at a different issue
> instead, completely unrelated to hardirq masking. I notice that stable
> tag v6.6.93 is lacking this commit:
> https://github.com/torvalds/linux/commit/50bd33f6b392
> which is high on my list of suspiciously similar issues in terms of behaviour.
> 
> (note: when submitting a patch to mainline net.git main branch, it's a
> good idea to also test *on* the net.git main branch, aka
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/)
> 
> I also note that I have put prints each time the driver clears the
> interrupts by writing to SITXIDR/SIRXIDR, and with various workloads on
> eno0/eno2/eno3, not once have I noticed the interrupt to still be pending
> in TBaIDR/RBaIDR.
> 
> Is there something special about your setup? What interfaces and traffic
> pattern are you using?
> 
> This patch should be put on hold until it is clear to everybody what is
> going on.

Thank you for the feedback and clarifications.

The statement that SITXIDR/SIRXIDR bits are directly linked to 
TBaIDR/RBaIDR is missing in the reference manual, i.e. it states that 
the former represent a summary of the latter, but not that W1C-ing the 
bits in SITXIDR would also move the other direction and clear TBaIDR. 
That clarifies quite a bit, thanks.

As for your request to take the mainline branch, I am depending on a 
OpenWRT build-system and shifting linux kernel-versions is unfortunately 
not something I can do in no time. As for the potentially missing patch 
you pointed me to, I backported that one, but it makes no difference.

Luckily meanwhile I was able to narrow down the issue and can provide 
you a means to hopefully reproduce it. This is the tl;dr version:

* enetc operates eth0
* ath9k operates wlan0
* both are bridged over OVS
* device is AP with an active STA connected to it
* STA regularly sends an L2 WNM keep-alive frame
  * that frame is 'buggy' as being tagged IPv4 but without payload
* through the OVS bridge that frame makes it into eth0 TX path
* enetc_start_xmit() enqueues it into TX-BD
* HW processes that descriptor, sets IDR and issues interrupt
* enetc_clean_tx_ring()
  * gets a bds_to_clean=0 (tx_ring->tcir = tx_ring->next_to_clean)
   * i.e. HW signals it completed the BD but did not advance TCIR
  * skips the while() loop
  * and hence never clears the according SITXIDR bit
* enetc_poll() after completion of ring processing
  * re-enables interrupts
  * but the one bit in SITXIDR is now sticky
* interrupt is re-asserted immediately
* the affected core remains 100% SIRQing
* it only recovers when the affected TX ring advances

So in short, enetc breaks when sending 0-byte frames.

The patch that I provided resolves the problem by force-cleaning all 
IDRs before interrupts are re-enabled. That is the sledge-hammer 
approach, since it also unmasks BDs that were just completed during
execution of enetc_poll() or no_eof BDs. Hence it is not the final
solution, but currently anything is better than a freezing box.

Below is the tool I wrote to fire such a frame-of-death. If you can 
reproduce the observation, I'd prepare a v2 patch to unblock the issue 
once it happens - preventing enetc_start_xmit() from sending such frames 
I'd leave to you, since that part looks complex to me to handle it properly.


Cheers,
Zefir

---

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/socket.h>
#include <net/ethernet.h>
#include <netpacket/packet.h>
#include <net/if.h>
#include <arpa/inet.h>
#include <sys/ioctl.h>

/*
  * Enetc-Killer
  *
  * This is a PoC for fsl_enetc Ethernet driver to detect an
  * issue the driver has when zero-payload IP packets are sent.
  *
  * It was detected when using an enetc Ethernet interface bridged
  * with a wireless interface operating as AP. A connected client
  * regularly sends L2 WNM keep-alive frames without IP payload.
  * Through the bridge this 'buggy' packet makes it into the
  * enetc TX path, which the driver enqueues for sending and
  * the HW signals transmission done but without providing a
  * completed TX-BD. This leads to a sticky interrupt detected
  * flag causing a SIRQ-storm.
  *
  * This has been tested on a LS1028A based system under an
  * OpenWRT derivative / linux 6.6.93
  *
  * To test:
  * * build and copy binary to device
  * * connect over serial, leave eth0 idle
  * * ensure device runs with multiple cores enabled (otherwise it freezes)
  * * run the program
  * * with top, observe that one core is fully loaded with SIRQ
  * * to recover, storm-ping eth0 from outside to
  *   enforce TX-BD advance
  */

int main()
{
	int sock = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
	if (sock < 0) {
		perror("socket");
		return 1;
	}

	struct ifreq ifr;
	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, "eth0", IFNAMSIZ);
	if (ioctl(sock, SIOCGIFINDEX, &ifr) < 0) {
		perror("ioctl");
		return 1;
	}

	struct sockaddr_ll addr = { 0 };
	addr.sll_family = AF_PACKET;
	addr.sll_ifindex = ifr.ifr_ifindex;
	addr.sll_halen = ETH_ALEN;
	addr.sll_protocol = htons(ETH_P_IP);
	// Destination MAC (Broadcast)
	addr.sll_addr[0] = 0xff;
	addr.sll_addr[1] = 0xff;
	addr.sll_addr[2] = 0xff;
	addr.sll_addr[3] = 0xff;
	addr.sll_addr[4] = 0xff;
	addr.sll_addr[5] = 0xff;

	// "broken" packet: only Ethernet-header, no IP-payload
	// as sent by wpa_supplicant as L2 WNM keep-alive frame
	unsigned char buf[14] = {
		0xff, 0xff, 0xff, 0xff, 0xff, 0xff,	// DST MAC
		0x00, 0x11, 0x22, 0x33, 0x44, 0x55,	// SRC MAC
		0x08, 0x00				// EtherType = IPv4
	};

	if (sendto(sock, buf, sizeof(buf), 0, (struct sockaddr*) &addr, 
sizeof(addr)) < 0) {
		perror("sendto");
		return 1;
	}
	close(sock);
	return 0;
}


  parent reply	other threads:[~2026-02-26 19:28 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-20 13:29 [PATCH] net: enetc: fix sirq-storm by clearing IDR registers Zefir Kurtisi
2026-02-23 16:32 ` Vladimir Oltean
2026-02-26 12:00   ` Vladimir Oltean
2026-02-26 19:28   ` Zefir Kurtisi [this message]
2026-02-26 19:51     ` Vladimir Oltean
2026-02-27 11:28       ` Zefir Kurtisi
2026-02-27 11:57         ` Vladimir Oltean
2026-02-27 13:03           ` Zefir Kurtisi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b3d9136c-2803-4203-b1ea-1f9e62de80a1@gmail.com \
    --to=zefir.kurtisi@gmail.com \
    --cc=claudiu.manoil@nxp.com \
    --cc=davem@davemloft.net \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=vladimir.oltean@nxp.com \
    --cc=wei.fang@nxp.com \
    --cc=xiaoning.wang@nxp.com \
    --cc=zefir.kurtisi@westermo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox