From mboxrd@z Thu Jan 1 00:00:00 1970 From: Kurt Miller Date: Tue, 30 Jun 2020 13:57:08 -0400 Subject: [PATCH] rockchip: Add delay after link-training In-Reply-To: <8e5fb3ba-0935-78de-568e-5354fb6cd7ed@rock-chips.com> References: <20200601203041.58631-1-kurt@intricatesoftware.com> <4158a56b-f780-81a7-7895-35bcf9b0ac91@rock-chips.com> <1591110727.9708.109.camel@intricatesoftware.com> <8e5fb3ba-0935-78de-568e-5354fb6cd7ed@rock-chips.com> Message-ID: <1593539828.8506.263.camel@intricatesoftware.com> List-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: u-boot@lists.denx.de On Sat, 2020-06-27 at 20:57 +0800, Kever Yang wrote: > Hi Kurt, > > > On 2020/6/4 ??5:17, Peter Geis wrote: > > > > > On Tue, Jun 2, 2020 at 11:12 AM Kurt Miller wrote: > > > > > > On Tue, 2020-06-02 at 10:23 +0800, Shawn Lin wrote: > > > > > > > > ? 2020/6/2 9:59, Kever Yang ??: > > > > > > > > > > Hi Kurt, > > > > > > > > > > On 2020/6/2 ??4:30, Kurt Miller wrote: > > > > > > > > > > > > On at least the RockPro64, many cards will trip a > > > > > > synchronous abort when first accessing PCIe config space > > > > > > during bus scanning. A delay after link training allows > > > > > > some of these cards to function. > > > > > > > > > > > > Signed-off-by: Kurt Miller > > > > > > --- > > > > > > On the RockPro64, some pci cards trip a synchronous abort when > > > > > > scanning the > > > > > > pci bus. For example with HighPoint Rocket Raid 640L which is based on > > > > > > Marvell 88SE9230 I see this: > > > > > > > > > > > > => pci > > > > > > "Synchronous Abort"?handler, esr 0x96000210 > > > > > > elr: 000000000022d034 lr : 000000000022cfd0 (reloc) > > > > > > elr: 00000000f4568034 lr : 00000000f4567fd0 > > > > > > x0 : 0000000000100000 x1 : 00000000f8000000 > > > > > > x2 : 0000000000000000 x3 : 0000000000100000 > > > > > > x4 : 00000000f2559290 x5 : 0000000000000000 > > > > > > x6 : 0000000000000001 x7 : 00000000f2559860 > > > > > > x8 : 0000000000000030 x9 : 0000000000000008 > > > > > > x10: 0000000000000010 x11: 00000000f251fd1c > > > > > > x12: 0000000000001421 x13: 0000000000001468 > > > > > > x14: 00000000f251fd4c x15: 00000000ffffffff > > > > > > x16: 0000000000060001 x17: 000000000000001f > > > > > > x18: 00000000f2532dc0 x19: 00000000f251fcd0 > > > > > > x20: 0000000000000001 x21: 0000000000000000 > > > > > > x22: 0000000000010000 x23: 00000000f45d4000 > > > > > > x24: 0000000000000000 x25: 00000000f45bc000 > > > > > > x26: 0000000000000000 x27: 0000000000000000 > > > > > > x28: 00000000f2541440 x29: 00000000f251fc20 > > > > > > > > > > > > Code: 540000c1 350000a5 93407c00 f9400081 (b8616800) > > > > > > Resetting CPU ... > > > > > > > > > > > > Adding a delay after link training works-around the problem. I added this > > > > > > delay to the OpenBSD rkpcie driver as well: > > > > > > > > > > > > https://github.com/openbsd/src/commit/9857dee3520d8ca5bec68538f4b0708d7e64fc87 > > > > > > > > > > > > > > > > > > HighPoint Rocket Raid 640L needs a 1.75 sec delay and Crossfield > > > > > > SAS9211-4i > > > > > > needs a 1 second delay, so I arbitrarily decided on 2 seconds. > > > > > > > > > > > > The delay work-around was originally discovered by nuumio: > > > > > > https://github.com/nuumio/linux-kernel/commit/5a65b17686002dc84d461bffa324a2cb68e67aee > > > > > > > > > > > > > > > > > > ???drivers/pci/pcie_rockchip.c | 8 ++++++++ > > > > > > ???1 file changed, 8 insertions(+) > > > > > > > > > > > > diff --git a/drivers/pci/pcie_rockchip.c b/drivers/pci/pcie_rockchip.c > > > > > > index 0edc2464a8..51cfbf6b18 100644 > > > > > > --- a/drivers/pci/pcie_rockchip.c > > > > > > +++ b/drivers/pci/pcie_rockchip.c > > > > > > @@ -288,6 +288,14 @@ static int rockchip_pcie_init_port(struct udevice > > > > > > *dev) > > > > > > ???????????goto err_power_off_phy; > > > > > > ???????} > > > > > > +????/* > > > > > > +?????* XXX: On at least the RockPro64, many cards will trip a > > > > > > +?????* synchronous abort when first accessing PCIe config space > > > > > > +?????* during bus scanning. A delay after link training allows > > > > > > +?????* some of these cards to function. > > > > > > +?????*/ > > > > > > +????mdelay(2000); > > > > > I don't understand what kind of delay for init needs 2 Seconds, root > > > > > cause will preferred. > > > Hi Kever, > > > > > > While working on this issue for the OpenBSD PCIe driver I was not > > > able to determine the root cause. I tested the following adapters: > > > > > > ROCKPro64 2 Port SATA > > > StarTech PEXSAT32 2 Port SATA > > > Samsung 970 Evo NVMe w/m.2 adapter > > > IO CREST SI-PEX40148 2 Port SATA > > > IO CREST SI-PEX40057 4 port > > > HighPoint Rocket Raid 640L > > > Crossfield SAS9211-4i > > > Del PERC H200 > > > Dell PERC 6/i > > > Intel Gigabit VT Quad Port Server > > > > > > All of the above adapters successfully link trained, however > > > three of them would panic upon the first read of the PCI config > > > space. nuumio's work-around of placing a delay after link-training > > > allows these cards to work. > > > > > > HighPoint Rocket Raid 640L ~1.75 sec delay > > > Crossfield SAS9211-4i ~1 sec delay > > > Dell PERC H200 ~1 sec delay > > > > > > In attempt to determine if there was a way to detect how long > > > to wait, I compared every status and debug register documented > > > in the rk3399 TRM part 2 for the PCI controller. I compared the > > > values pre-delay and post-delay. I was not able to find a value > > > that would indicate it was safe to proceed. > > > > > > > > > > > Strictly speaking, how long should we need for this had been provided, > > > > see this: > > > > > > > > https://patchwork.kernel.org/patch/11561977/ > I can accept the same fix like kernel which is 100ms, but 2 Second is? > really too much for most cases. > Hi Kever, Thank you for following up on this. The 100ms delay before link-training appears to be solving a different issue where some cards fail to link train. From the comment in the referenced patch "Otherwise we do see some failures for link training." The delay I am proposing is after successful link-training to avoid the synchronous abort that happens when scanning the pci bus and attempting read the pci config space for some cards. I performed a test where I added the 100ms delay before link-training and it did not correct the synchronous abort on the?HighPoint Rocket Raid 640L. Shawn Lin suggested I check if the power supply is stable before enabling training. How would I go about doing this?? Regards, -Kurt > > Thanks, > > - Kever > > > > > > > > > > > > > > > > > > If you need more delay, I??highly suspect you should check if > > > > the power supply is stable before enabling training. > > > Hi Shawn, > > > > > > Thank you for pointing out the need for the 100ms delay before > > > beginning link-training. I believe this is to correct link > > > training failures, while the delay after link training is > > > to work-around post link training reading of PCI config > > > space on certain cards. > > There are similar issues on the Linux driver with various cards > > randomly throwing an abort. > > If it is power, a 2 second wait to allow cards to stabilize after > > power on might be wise for the Linux driver as well. > > > > I imagine some cards take longer to complete power on reset than > > others, and attempting to read them immediately after powering them up > > could be the issue here. > > > > > > > > In my testing I discussed above, I suspected that power > > > was a likely cause. The HighPoint Rocket Raid 640L is > > > a good test card because it documents its 3.3v power at > > > 0.7 watts: > > > > > > https://highpoint-tech.com/USA_new/series_r600-specifications.htm > > > > > > How can I check if the power supply is stable? > > > > > > Regards, > > > -Kurt > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > - Kever > > > > > > > > > > > > > > > > > + > > > > > > ???????/* Initialize Root Complex registers. */ > > > > > > ???????writel(PCIE_LM_VENDOR_ROCKCHIP, priv->apb_base + > > > > > > PCIE_LM_VENDOR_ID); > > > > > > ???????writel(PCI_CLASS_BRIDGE_PCI << 16, >