From: Tom Rini <trini@konsulko.com>
To: "Maciej W. Rozycki" <macro@orcam.me.uk>
Cc: u-boot@lists.denx.de, "Stefan Roese" <sr@denx.de>,
	"Simon Glass" <sjg@chromium.org>, "Pali Rohár" <pali@kernel.org>,
	"Phil Sutter" <phil@nwl.cc>,
	"Vladimir Oltean" <vladimir.oltean@nxp.com>,
	"Bin Meng" <bmeng.cn@gmail.com>,
	"Tim Harvey" <tharvey@gateworks.com>,
	"Jim Wilson" <wilson@tuliptree.org>,
	"David Abdurachmanov" <david.abdurachmanov@sifive.com>
Subject: Re: [PATCH v3] pci: Work around PCIe link training failures
Date: Sat, 15 Jan 2022 07:37:38 -0500
Message-ID: <20220115123738.GW9207@bill-the-cat>
In-Reply-To: <alpine.DEB.2.21.2111201114100.10804@angie.orcam.me.uk>

On Sat, Nov 20, 2021 at 11:03:30PM +0000, Maciej W. Rozycki wrote:

> Attempt to handle cases with a downstream port of a PCIe switch where
> link training never completes and the link continues switching between 
> speeds indefinitely with the data link layer never reaching the active 
> state.
> 
> It has been observed with a downstream port of the ASMedia ASM2824 Gen 3 
> switch wired to the upstream port of the Pericom PI7C9X2G304 Gen 2 
> switch, using a Delock Riser Card PCI Express x1 > 2 x PCIe x1 device, 
> P/N 41433, wired to a SiFive HiFive Unmatched board.  In this setup the 
> switches are supposed to negotiate the link speed of preferably 5.0GT/s, 
> falling back to 2.5GT/s.
> 
> However the link continues oscillating between the two speeds, at the 
> rate of 34-35 times per second, with link training reported repeatedly 
> active ~84% of the time, e.g.:
> 
> 02:03.0 PCI bridge [0604]: ASMedia Technology Inc. ASM2824 PCIe Gen3 Packet Switch [1b21:2824] (rev 01) (prog-if 00 [Normal decode])
> [...]
> 	Bus: primary=02, secondary=05, subordinate=05, sec-latency=0
> [...]
> 	Capabilities: [80] Express (v2) Downstream Port (Slot+), MSI 00
> [...]
> 		LnkSta:	Speed 5GT/s (downgraded), Width x1 (ok)
> 			TrErr- Train+ SlotClk+ DLActive- BWMgmt+ ABWMgmt-
> [...]
> 		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis+, Selectable De-emphasis: -3.5dB
> 			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> 			 Compliance De-emphasis: -6dB
> [...]
> 
> Forcibly limiting the target link speed to 2.5GT/s with the upstream 
> ASM2824 device makes the two switches communicate correctly however:
> 
> 02:03.0 PCI bridge [0604]: ASMedia Technology Inc. ASM2824 PCIe Gen3 Packet Switch [1b21:2824] (rev 01) (prog-if 00 [Normal decode])
> [...]
> 	Bus: primary=02, secondary=05, subordinate=09, sec-latency=0
> [...]
> 	Capabilities: [80] Express (v2) Downstream Port (Slot+), MSI 00
> [...]
> 		LnkSta:	Speed 2.5GT/s (downgraded), Width x1 (ok)
> 			TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
> [...]
> 		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis+, Selectable De-emphasis: -3.5dB
> 			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> 			 Compliance De-emphasis: -6dB
> [...]
> 
> and then:
> 
> 05:00.0 PCI bridge [0604]: Pericom Semiconductor PI7C9X2G304 EL/SL PCIe2 3-Port/4-Lane Packet Switch [12d8:2304] (rev 05) (prog-if 00 [Normal decode])
> [...]
> 	Bus: primary=05, secondary=06, subordinate=09, sec-latency=0
> [...]
> 	Capabilities: [c0] Express (v2) Upstream Port, MSI 00
> [...]
> 		LnkSta:	Speed 2.5GT/s (downgraded), Width x1 (downgraded)
> 			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> [...]
> 		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
> 			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> 			 Compliance De-emphasis: -6dB
> [...]
> 
> Make use of this observation then and attempt to detect the inability to 
> negotiate the link speed automatically, and then handle it by hand.  Use 
> the Data Link Layer Link Active status flag as the primary indicator of 
> successful link speed negotiation, but given that the flag is optional
> for hardware to implement (the ASM2824 does have it though), resort to
> checking for the mandatory Link Bandwidth Management Status flag showing 
> that the link speed or width has been changed in an attempt to correct 
> unreliable link operation (the ASM2824 does set it too).
> 
> If these checks indicate that the link may not operate correctly, then poll
> the Data Link Layer Link Active status flag along with the Link Training 
> flag for the duration of 200ms to see if the link has stabilised, that 
> is either that the Data Link Layer Link Active status flag has been set 
> or that Link Training has been inactive during at least the second half 
> of the interval.
> 
> If that has indicated failure, restrict the target speed to 2.5GT/s, 
> request a link retrain and check again if the link has stabilised.  If 
> that does not work either, then restore the original speed setting and 
> claim defeat, otherwise we are done.
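
For readers following along, the detect/poll/restrict/retrain sequence
described above can be sketched roughly as below.  This is a minimal
sketch against a simulated port, not the code from the patch: struct
sim_port, read_lnksta(), retrain(), link_stable() and fix_link() are
hypothetical names, the 1 ms polling step is an assumption, and only
the bit values follow the PCIe Link Status and Link Control 2 register
layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values per the PCIe Link Status / Link Control 2 registers. */
#define LNKSTA_LT          0x0800u /* Link Training in progress */
#define LNKSTA_DLLLA       0x2000u /* Data Link Layer Link Active */
#define LNKSTA_LBMS        0x4000u /* Link Bandwidth Management Status */
#define LNKCTL2_TLS        0x000fu /* Target Link Speed field */
#define LNKCTL2_TLS_2_5GT  0x0001u /* Target Link Speed: 2.5GT/s */

#define POLL_MS 200 /* polling window from the description above */

/* Hypothetical simulated downstream port; stands in for config space. */
struct sim_port {
	uint16_t lnkctl2;   /* Link Control 2 register */
	bool has_dllla;     /* port implements the optional DLLLA flag */
	bool ok_fast;       /* link trains at speeds above 2.5GT/s */
	bool ok_2_5gt;      /* link trains at 2.5GT/s */
	unsigned retrains;  /* number of retrain requests issued */
};

/* Simulated Link Status read: a failing link looks stuck in training. */
static uint16_t read_lnksta(const struct sim_port *p)
{
	bool slow = (p->lnkctl2 & LNKCTL2_TLS) == LNKCTL2_TLS_2_5GT;
	bool up = slow ? p->ok_2_5gt : p->ok_fast;

	if (!up)
		return LNKSTA_LT | LNKSTA_LBMS;
	return p->has_dllla ? LNKSTA_DLLLA : 0;
}

/* Simulated retrain request (a real port would set Retrain Link). */
static void retrain(struct sim_port *p)
{
	p->retrains++;
}

/*
 * Poll for POLL_MS: stable if DLLLA is seen, or if Link Training was
 * not observed during the second half of the window.
 */
static bool link_stable(const struct sim_port *p)
{
	int last_lt = -1;

	for (int ms = 0; ms < POLL_MS; ms++) {
		uint16_t sta = read_lnksta(p);

		if (sta & LNKSTA_DLLLA)
			return true;
		if (sta & LNKSTA_LT)
			last_lt = ms;
		/* Real code would delay about 1 ms here (assumed step). */
	}
	return last_lt < POLL_MS / 2;
}

enum fix_result { LINK_OK, LINK_FIXED_AT_2_5GT, LINK_FAILED };

static enum fix_result fix_link(struct sim_port *p)
{
	assert(p);

	uint16_t sta = read_lnksta(p);
	/* DLLLA is the primary indicator; LBMS is the fallback. */
	bool suspect = p->has_dllla ? !(sta & LNKSTA_DLLLA)
				    : (sta & LNKSTA_LBMS) != 0;

	if (!suspect || link_stable(p))
		return LINK_OK;

	/* Restrict the target speed to 2.5GT/s and retrain. */
	uint16_t saved = p->lnkctl2;
	p->lnkctl2 = (uint16_t)((saved & ~LNKCTL2_TLS) | LNKCTL2_TLS_2_5GT);
	retrain(p);
	if (link_stable(p))
		return LINK_FIXED_AT_2_5GT; /* keep the restriction */

	/* Restore the original speed setting and claim defeat. */
	p->lnkctl2 = saved;
	return LINK_FAILED;
}
```

Note that on success the sketch leaves Target Link Speed at 2.5GT/s
rather than lifting it again, which matches the conservative choice
argued for further down.
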
> 
> NB interestingly enough, with the ASM2824 vs PI7C9X2G304 configuration
> referred to above, asking the ASM2824 to retrain with a higher target link
> speed once the 2.5GT/s speed has been negotiated makes the two devices 
> successfully negotiate 5.0GT/s.  Lifting the 2.5GT/s speed restriction 
> would however prevent our workaround from working with an OS that issues 
> a reset and that is unaware of the problem.  This is because the devices 
> would then try to negotiate a higher link speed from scratch and fail, 
> while the sticky property of the Target Link Speed setting will keep the 
> 2.5GT/s speed restriction across a reset.
> 
> Keep the 2.5GT/s speed restriction then, conservatively, if functional 
> once applied.
> 
> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
> Reviewed-by: Stefan Roese <sr@denx.de>

Applied to u-boot/master, thanks!

-- 
Tom
