From: Bjorn Helgaas <helgaas@kernel.org>
To: Harshank Matkar <harshankmatkar1304@outlook.com>
Cc: "intel-wired-lan@lists.osuosl.org"
<intel-wired-lan@lists.osuosl.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"tony.nguyen@intel.com" <tony.nguyen@intel.com>,
"davem@davemloft.net" <davem@davemloft.net>,
"kuba@kernel.org" <kuba@kernel.org>,
"pabeni@redhat.com" <pabeni@redhat.com>,
"edumazet@google.com" <edumazet@google.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] igc: Add PCIe link recovery for I225/I226
Date: Wed, 11 Feb 2026 12:29:09 -0600
Message-ID: <20260211182909.GA117627@bhelgaas>
In-Reply-To: <20260210203332.23200-1-harshankmatkar1304@outlook.com>

On Tue, Feb 10, 2026 at 08:34:02PM +0000, Harshank Matkar wrote:
> From: Harshank Matkar <harshankmatkar1304@outlook.com>
>
> When ASPM L0s transitions occur on Intel I225/I226 controllers,
> transient PCIe link instability can cause register read failures
> (0xFFFFFFFF responses).

At the PCIe level, the failure is some uncorrectable PCIe error like a
Completion Timeout or Unsupported Request. The 0xFFFFFFFF response is
implementation-specific behavior determined by the Root Complex
design.

> Implement a multi-layer recovery strategy:
> 1. Immediate retries: 3 attempts with 100-200μs delays
> 2. Link retraining: Trigger PCIe link retraining via capabilities
> 3. Device detachment: Only as last resort after max attempts
>
> The recovery mechanism includes rate limiting, maximum attempt
> tracking, and device presence validation to prevent false detaches
> on transient ASPM glitches while maintaining safety through
> bounded retry limits.

I assume the glitch is a hardware erratum and should be documented as
such by Intel, although it's possible ASPM L0s isn't configured
correctly.

If it's a hardware erratum, I think you should use a quirk to disable
L0s on these devices, e.g., pci_disable_link_state(pdev,
PCIE_LINK_STATE_L0S). Even if this patch allows recovery, the PCIe
errors will be logged and reported via AER, which will be confusing to
users.
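
A minimal sketch of what such a quirk could look like, not a tested
patch. Assumptions: the function name quirk_igc_disable_l0s is
hypothetical, 0x15f2 is taken to be the I225-LM device ID (verify
against the igc driver's ID table and add the other I225/I226 IDs),
and the #else branch holds user-space stand-ins that exist only so the
sketch can be compiled and exercised outside the kernel.

```c
#ifdef __KERNEL__
#include <linux/pci.h>
#else
/*
 * User-space stand-ins (assumptions, NOT the kernel definitions) so the
 * sketch compiles outside the kernel tree.
 */
#include <stdint.h>

#define PCIE_LINK_STATE_L0S	0x1	/* placeholder value */

struct pci_dev {
	uint16_t vendor;
	uint16_t device;
	int disabled_link_states;	/* records what the stub was asked to disable */
};

static int pci_disable_link_state(struct pci_dev *pdev, int state)
{
	pdev->disabled_link_states |= state;
	return 0;
}
#endif

/*
 * Hypothetical quirk: keep the link out of ASPM L0s on affected devices so
 * the register read failures (and the resulting AER reports) are avoided
 * rather than recovered from.
 */
static void quirk_igc_disable_l0s(struct pci_dev *pdev)
{
	pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S);
}

#ifdef __KERNEL__
/* Assumed device ID (I225-LM); repeat for each affected I225/I226 ID. */
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x15f2, quirk_igc_disable_l0s);
#endif
```

DECLARE_PCI_FIXUP_FINAL() entries only take effect when built into the
kernel (e.g. in drivers/pci/quirks.c). A driver-local alternative would
be to make the same pci_disable_link_state() call from the probe path.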
Bjorn