From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: netdev@vger.kernel.org, imx@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	Kieran Bingham <kieran.bingham@ideasonboard.com>,
	Stefan Klug <stefan.klug@ideasonboard.com>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	Clark Wang <xiaoning.wang@nxp.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Fabio Estevam <festevam@denx.de>,
	Fabio Estevam <festevam@gmail.com>,
	Francesco Dolcini <francesco.dolcini@toradex.com>,
	Frank Li <Frank.Li@nxp.com>, Heiko Schocher <hs@denx.de>,
	Jakub Kicinski <kuba@kernel.org>,
	Joakim Zhang <qiangqing.zhang@nxp.com>, Joy Zou <joy.zou@nxp.com>,
	Marcel Ziswiler <marcel.ziswiler@toradex.com>,
	Marco Felsch <m.felsch@pengutronix.de>,
	Martyn Welch <martyn.welch@collabora.com>,
	Mathieu Othacehe <othacehe@gnu.org>,
	Paolo Abeni <pabeni@redhat.com>,
	Pengutronix Kernel Team <kernel@pengutronix.de>,
	Richard Hu <richard.hu@technexion.com>,
	Sascha Hauer <s.hauer@pengutronix.de>,
	Shawn Guo <shawnguo@kernel.org>,
	Shenwei Wang <shenwei.wang@nxp.com>,
	Stefano Radaelli <stefano.radaelli21@gmail.com>,
	Wei Fang <wei.fang@nxp.com>,
	Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Subject: Re: [PATCH] net: stmmac: imx: Do not stop RX_CLK in Rx LPI state for i.MX8MP
Date: Sun, 23 Nov 2025 08:47:41 +0000
Message-ID: <aSLKLYuz0WA2LpFF@shell.armlinux.org.uk>
In-Reply-To: <20251123053518.8478-1-laurent.pinchart@ideasonboard.com>

On Sun, Nov 23, 2025 at 02:35:18PM +0900, Laurent Pinchart wrote:
> The i.MX8MP-based Debix Model A board experiences an interrupt storm
> on the ENET_EQOS IRQ (135) when connected to an EEE-enabled peer.
> 
> Setting the eee-broken-1000t DT property in the PHY node solves the
> problem, which confirms that the issue is related to EEE. Device trees
> for 8 boards in the mainline kernel, including the i.MX8MP EVK, set the
> property, which indicates the issue is likely not limited to the Debix
> board, although some of those device trees may have blindly copied the
> property from the EVK.
> 
> The IRQ is documented in the reference manual as the logical OR of 4
> signals:
> 
> - ENET QOS TSN LPI RX Exit Interrupt
> - ENET QOS TSN Host System Interrupt
> - ENET QOS TSN Host System RX Channel Interrupts, Logical OR of
>   channels[4:0]
> - ENET QOS TSN Host System TX Channel Interrupts, Logical OR of
>   channels[4:0]
> 
> Debugging the issue showed no unmasked interrupt sources from the Host
> System Interrupt (GMAC_INT_STATUS), Host System RX Channel Interrupts or
> Host System TX Channel Interrupts (MTL_INT_STATUS, MTL_CHAN_INT_CTRL and
> DMA_CHAN_STATUS) that were flagged at an unexpectedly high rate. This
> leaves the LPI RX Exit Interrupt as the most likely culprit.
> 
> The reference manual doesn't clearly indicate what the interrupt signal
> is, but from its name we can reasonably infer that it would be connected
> to the EQOS lpi_intr_o output. That interrupt is cleared when reading
> the LPI control/status register. However, its deassertion is synchronous
> to the RX clock domain, so it will take time to clear. It appears that
> it could even fail to clear at all, as in the following sequence of
> events:
> 
> - When the PHY exits LPI mode, it restarts generating the RX clock
>   (clk_rx_i input signal to the GMAC).
> - The MAC detects exit from LPI, and asserts lpi_intr_o. This triggers
>   the ENET_EQOS interrupt.
> - Before the CPU has time to process the interrupt, the PHY enters LPI
>   mode again, and stops generating the RX clock.
> - The CPU processes the interrupt and reads the GMAC4_LPI_CTRL_STATUS
>   register. This does not clear lpi_intr_o as there's no clk_rx_i.
> 
> The ENET_EQOS interrupt will keep firing until the PHY resumes
> generating the RX clock when it eventually exits LPI mode again.
> 
> As LPI exit is reported by the LPIIS bit in GMAC_INT_STATUS, the
> lpi_intr_o signal may not have been meant to be wired to a CPU
> interrupt. It can't be masked in GMAC registers, and OR'ing it to the
> other GMAC interrupt signals seems to be a design mistake as it makes it
> impossible to selectively mask the interrupt in the GIC either.
> 
> Setting the STMMAC_FLAG_RX_CLK_RUNS_IN_LPI platform data flag gets rid
> of the interrupt storm, which confirms the above theory.
> 
> The i.MX8DXL and i.MX93, which also integrate an EQOS, may also be
> affected, as hinted by the eee-broken-1000t property being set in the
> i.MX8DXL EVK and the i.MX93 Variscite SoM device trees. The reference
> manual of the i.MX93 indicates that the ENET_EQOS interrupt also OR's
> the "ENET QOS TSN LPI RX exit Interrupt", while the i.MX8DXL reference
> manual doesn't provide details about the ENET_EQOS interrupt.
> 
> Additional testing is needed with the i.MX8DXL and i.MX93, so for now
> set the flag for the i.MX8MP only. The eee-broken-1000t property could
> possibly be removed from some of the i.MX8MP device trees, but that
> would also require per-board testing.
> 
> Suggested-by: Russell King <linux@armlinux.org.uk>
> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> ---
> I have CC'ed authors and maintainers of the i.MX8DXL, i.MX8MP and i.MX93
> device trees that set the eee-broken-1000t property for awareness. To
> test if the property can be dropped, you will need to
> 
> - Connect the EQOS interface to an EEE-enabled peer with a 1000T link.
> - Drop the eee-broken-1000t property from the device tree.
> - Boot the board and check with `ethtool --show-eee` that EEE is active.
> - Check the number of interrupts received from the EQOS in
>   /proc/interrupts. After boot on my system (with an NFS root) I have
>   ~6000 interrupts when no interrupt storm occurs, and hundreds of
>   thousands otherwise.
> - Apply this patch and check that EEE works as expected without any
>   interrupt storm. For i.MX8DXL and i.MX93, you will need to set the
>   STMMAC_FLAG_RX_CLK_RUNS_IN_LPI in the corresponding imx_dwmac_ops
>   instances in drivers/net/ethernet/stmicro/stmmac/dwmac-imx.c.

Hang on... also check 100M connections. As I indicated, lpi_intr_o is
slow to clear even when the receive clock is running (it takes four
receive clock cycles: 160ns for 100M, 32ns for 1G).
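
The figures above follow from the receive clock period. As a minimal
sketch, assuming the usual MII/RGMII receive clock rates of 25 MHz at
100M and 125 MHz at 1G (the function name and table are illustrative,
not taken from the driver):

```python
# Assumed receive clock frequencies per link speed (MII/RGMII convention).
RX_CLK_HZ = {
    100: 25_000_000,    # 100M link: 25 MHz, 40 ns period
    1000: 125_000_000,  # 1G link: 125 MHz, 8 ns period
}


def lpi_intr_clear_latency_ns(link_speed_mbps, cycles=4):
    """Time for lpi_intr_o to deassert, in ns, given a running RX clock.

    'cycles' defaults to the four receive clock cycles quoted above.
    """
    period_ns = 1_000_000_000 // RX_CLK_HZ[link_speed_mbps]
    return cycles * period_ns


print(lpi_intr_clear_latency_ns(100))   # 160
print(lpi_intr_clear_latency_ns(1000))  # 32
```

The window during which the interrupt can re-fire is five times longer
at 100M than at 1G, which is why the slower link is worth testing.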

So, I suspect you still get a storm, but it's not as severe.
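
To compare the severity at 100M and 1G, the per-interface interrupt
count can be sampled from /proc/interrupts before and after a fixed
interval. A rough sketch, assuming the interface's IRQ line can be
identified by a substring of its name (the "eth0" name and the sample
contents below are hypothetical):

```python
def irq_total(interrupts_text, name):
    """Sum the per-CPU counts of every /proc/interrupts line whose
    description contains 'name'."""
    total = 0
    for line in interrupts_text.splitlines():
        # Data lines look like " 58:  123  456  GICv3 167 Level  eth0";
        # the CPU header line has no colon and is skipped.
        label, sep, rest = line.partition(":")
        if not sep or name not in rest:
            continue
        for field in rest.split():
            if not field.isdigit():
                break  # reached the interrupt controller name
            total += int(field)
    return total


# Hypothetical sample; on a real system use open("/proc/interrupts").read()
sample = """           CPU0       CPU1
 58:     250000     150000     GICv3 167 Level     eth0
 59:         10          0     GICv3  52 Level     mmc0
"""
print(irq_total(sample, "eth0"))  # 400000
```

Taking two readings a known time apart and dividing the difference by
the interval gives an interrupt rate that can be compared across link
speeds, with and without the patch.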

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

Thread overview: 3+ messages
2025-11-23  5:35 [PATCH] net: stmmac: imx: Do not stop RX_CLK in Rx LPI state for i.MX8MP Laurent Pinchart
2025-11-23  8:47 ` Russell King (Oracle) [this message]
2025-11-23 15:22   ` Laurent Pinchart
