public inbox for u-boot@lists.denx.de
 help / color / mirror / Atom feed
From: Tom Rini <trini@konsulko.com>
To: "Maciej W. Rozycki" <macro@orcam.me.uk>
Cc: u-boot@lists.denx.de, "Stefan Roese" <sr@denx.de>,
	"Simon Glass" <sjg@chromium.org>, "Pali Rohár" <pali@kernel.org>,
	"Phil Sutter" <phil@nwl.cc>,
	"Vladimir Oltean" <vladimir.oltean@nxp.com>,
	"Bin Meng" <bmeng.cn@gmail.com>,
	"Tim Harvey" <tharvey@gateworks.com>,
	"Jim Wilson" <wilson@tuliptree.org>,
	"David Abdurachmanov" <david.abdurachmanov@sifive.com>
Subject: Re: [PATCH v3] pci: Work around PCIe link training failures
Date: Sat, 15 Jan 2022 07:37:38 -0500	[thread overview]
Message-ID: <20220115123738.GW9207@bill-the-cat> (raw)
In-Reply-To: <alpine.DEB.2.21.2111201114100.10804@angie.orcam.me.uk>

[-- Attachment #1: Type: text/plain, Size: 5171 bytes --]

On Sat, Nov 20, 2021 at 11:03:30PM +0000, Maciej W. Rozycki wrote:

> Attempt to handle cases with a downstream port of a PCIe switch where
> link training never completes and the link continues switching between 
> speeds indefinitely with the data link layer never reaching the active 
> state.
> 
> It has been observed with a downstream port of the ASMedia ASM2824 Gen 3 
> switch wired to the upstream port of the Pericom PI7C9X2G304 Gen 2 
> switch, using a Delock Riser Card PCI Express x1 > 2 x PCIe x1 device, 
> P/N 41433, wired to a SiFive HiFive Unmatched board.  In this setup the 
> switches are supposed to negotiate the link speed of preferably 5.0GT/s, 
> falling back to 2.5GT/s.
> 
> However the link continues oscillating between the two speeds, at the 
> rate of 34-35 times per second, with link training reported repeatedly 
> active ~84% of the time, e.g.:
> 
> 02:03.0 PCI bridge [0604]: ASMedia Technology Inc. ASM2824 PCIe Gen3 Packet Switch [1b21:2824] (rev 01) (prog-if 00 [Normal decode])
> [...]
> 	Bus: primary=02, secondary=05, subordinate=05, sec-latency=0
> [...]
> 	Capabilities: [80] Express (v2) Downstream Port (Slot+), MSI 00
> [...]
> 		LnkSta:	Speed 5GT/s (downgraded), Width x1 (ok)
> 			TrErr- Train+ SlotClk+ DLActive- BWMgmt+ ABWMgmt-
> [...]
> 		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis+, Selectable De-emphasis: -3.5dB
> 			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> 			 Compliance De-emphasis: -6dB
> [...]
> 
> Forcibly limiting the target link speed to 2.5GT/s with the upstream 
> ASM2824 device makes the two switches communicate correctly however:
> 
> 02:03.0 PCI bridge [0604]: ASMedia Technology Inc. ASM2824 PCIe Gen3 Packet Switch [1b21:2824] (rev 01) (prog-if 00 [Normal decode])
> [...]
> 	Bus: primary=02, secondary=05, subordinate=09, sec-latency=0
> [...]
> 	Capabilities: [80] Express (v2) Downstream Port (Slot+), MSI 00
> [...]
> 		LnkSta:	Speed 2.5GT/s (downgraded), Width x1 (ok)
> 			TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
> [...]
> 		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis+, Selectable De-emphasis: -3.5dB
> 			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> 			 Compliance De-emphasis: -6dB
> [...]
> 
> and then:
> 
> 05:00.0 PCI bridge [0604]: Pericom Semiconductor PI7C9X2G304 EL/SL PCIe2 3-Port/4-Lane Packet Switch [12d8:2304] (rev 05) (prog-if 00 [Normal decode])
> [...]
> 	Bus: primary=05, secondary=06, subordinate=09, sec-latency=0
> [...]
> 	Capabilities: [c0] Express (v2) Upstream Port, MSI 00
> [...]
> 		LnkSta:	Speed 2.5GT/s (downgraded), Width x1 (downgraded)
> 			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> [...]
> 		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
> 			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> 			 Compliance De-emphasis: -6dB
> [...]
> 
> Make use of this observation then and attempt to detect the inability to 
> negotiate the link speed automatically, and then handle it by hand.  Use 
> the Data Link Layer Link Active status flag as the primary indicator of 
> successful link speed negotiation, but given that the flag is optional 
> by hardware to implement (the ASM2824 does have it though), resort to 
> checking for the mandatory Link Bandwidth Management Status flag showing 
> that the link speed or width has been changed in an attempt to correct 
> unreliable link operation (the ASM2824 does set it too).
> 
> If these checks indicate that link may not operate correctly, then poll 
> the Data Link Layer Link Active status flag along with the Link Training 
> flag for the duration of 200ms to see if the link has stabilised, that 
> is either that the Data Link Layer Link Active status flag has been set 
> or that Link Training has been inactive during at least the second half 
> of the interval.
> 
> If that has indicated failure, restrict the target speed to 2.5GT/s, 
> request a link retrain and check again if the link has stabilised.  If 
> that does not work either, then restore the original speed setting and 
> claim defeat, otherwise we are done.
> 
> NB interestingly enough with the ASM2824 vs PI7C9X2G304 configuration 
> referred above asking the ASM2824 to retrain with a higher target link 
> speed once the 2.5GT/s speed has been negotiated makes the two devices 
> successfully negotiate 5.0GT/s.  Lifting the 2.5GT/s speed restriction 
> would however prevent our workaround from working with an OS that issues 
> a reset and that is unaware of the problem.  This is because the devices 
> would then try to negotiate a higher link speed from scratch and fail, 
> while the sticky property of the Target Link Speed setting will keep the 
> 2.5GT/s speed restriction across a reset.
> 
> Keep the 2.5GT/s speed restriction then, conservatively, if functional 
> once applied.
> 
> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
> Reviewed-by: Stefan Roese <sr@denx.de>

Applied to u-boot/master, thanks!

-- 
Tom

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 659 bytes --]

  parent reply	other threads:[~2022-01-15 12:37 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-20 23:03 [PATCH v3] pci: Work around PCIe link training failures Maciej W. Rozycki
2022-01-02 23:23 ` [PING][PATCH " Maciej W. Rozycki
2022-01-12 19:19 ` [PATCH " Tom Rini
2022-01-12 22:43   ` Maciej W. Rozycki
2022-01-13 12:57 ` Stefan Roese
2022-01-15 12:37 ` Tom Rini [this message]
2022-01-17 11:21   ` Maciej W. Rozycki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220115123738.GW9207@bill-the-cat \
    --to=trini@konsulko.com \
    --cc=bmeng.cn@gmail.com \
    --cc=david.abdurachmanov@sifive.com \
    --cc=macro@orcam.me.uk \
    --cc=pali@kernel.org \
    --cc=phil@nwl.cc \
    --cc=sjg@chromium.org \
    --cc=sr@denx.de \
    --cc=tharvey@gateworks.com \
    --cc=u-boot@lists.denx.de \
    --cc=vladimir.oltean@nxp.com \
    --cc=wilson@tuliptree.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox