devicetree.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Damien Le Moal <damien.lemoal@opensource.wdc.com>
To: Rick Wertenbroek <rick.wertenbroek@gmail.com>
Cc: alberto.dassatti@heig-vd.ch, xxm@rock-chips.com,
	rick.wertenbroek@heig-vd.ch, "Rob Herring" <robh+dt@kernel.org>,
	"Krzysztof Kozlowski" <krzysztof.kozlowski+dt@linaro.org>,
	"Heiko Stuebner" <heiko@sntech.de>,
	"Shawn Lin" <shawn.lin@rock-chips.com>,
	"Lorenzo Pieralisi" <lpieralisi@kernel.org>,
	"Krzysztof Wilczyński" <kw@linux.com>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Jani Nikula" <jani.nikula@intel.com>,
	"Rodrigo Vivi" <rodrigo.vivi@intel.com>,
	"Mikko Kovanen" <mikko.kovanen@aavamobile.com>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-rockchip@lists.infradead.org, linux-kernel@vger.kernel.org,
	linux-pci@vger.kernel.org
Subject: Re: [PATCH v2 0/9] PCI: rockchip: Fix RK3399 PCIe endpoint controller driver
Date: Wed, 15 Mar 2023 07:54:14 +0900	[thread overview]
Message-ID: <8392a7de-666a-bce6-dc9f-b60d6dd93013@opensource.wdc.com> (raw)
In-Reply-To: <CAAEEuhqk0scWd3wFbVb9fSgHxPBKotpEPNi+YPG4GD9vLO94mw@mail.gmail.com>

On 3/14/23 23:53, Rick Wertenbroek wrote:
> Hello Damien,
> I also noticed random issues I suspect to be related to link status or power
> state, in my case it sometimes happens that the BARs (0-6) in the config
> space get reset to 0. This is not due to the driver because the driver never
> ever accesses these registers (@0xfd80'0010 to 0xfd80'0024 TRM
> 17.6.4.1.5-17.6.4.1.10).
> I don't think the host rewrites them because lspci shows the BARs as
> "[virtual]" which means they have been assigned by host but have 0
> value in the endpoint device (when lspci rereads the PCI config header).
> See https://github.com/pciutils/pciutils/blob/master/lspci.c#L422
> 
> So I suspect the controller detects something related to link status or
> power state and internally (in hardware) resets those registers. It's not
> the kernel code, it never accesses these regs. The problem occurs
> very randomly, sometimes in a few seconds, sometimes I cannot see
> it for a whole day.
> 
> Is this similar to what you are experiencing ?

Yes. I sometimes get NMIs after starting the function driver, when my function
driver starts probing the bar registers after seeing the host changing one
register. And the link also comes up with 4 lanes or 2 lanes, random.

> Do you have any idea as to what could make these registers to be reset
> (I could not find anything in the TRM, also nothing in the driver seems to
> cause it).

My thinking is that since we do not have a linkup notifier, the function driver
starts setting things up without the link established (e.g. when the host is
still powered down). Once the host start booting and pic link is established,
things may be reset in the hardware... That is the only thing I can think of.

And yes, there are definitely something going on with the power states too I
think: if I let things idle for a few minutes, everything stops working: no
activity seen on the endpoint over the BARs. I tried enabling the sys and client
interrupts to see if I can see power state changes, or if clearing the
interrupts helps (they are masked by default), but no change. And booting the
host with pci_aspm=off does not help either. Also tried to change all the
capabilities related to link & power states to "off" (not supported), and no
change either. So currently, I am out of ideas regarding that one.

I am trying to make progress on my endpoint driver (nvme function) to be sure it
is not a bug there that breaks things. I may still have something bad because
when I enable the BIOS native NVMe driver on the host, either the host does not
boot, or grub crashes with memory corruptions. Overall, not yet very stable and
still trying to sort out the root cause of that.


> Do you want me to include this patch in the V3 series or will you
> submit another patch series for the changes you applied on the RK3399 PCIe
> endpoint controller ? I don't know if you prefer to build the V3
> together or if you
> prefer to submit another patch series on top of mine. Let me know.

If it is no trouble, please include it with your series. Will be easier to
retest everything together :)

-- 
Damien Le Moal
Western Digital Research


  reply	other threads:[~2023-03-14 22:54 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-14 14:08 [PATCH v2 0/9] PCI: rockchip: Fix RK3399 PCIe endpoint controller driver Rick Wertenbroek
2023-02-14 14:08 ` [PATCH v2 1/9] PCI: rockchip: Remove writes to unused registers Rick Wertenbroek
2023-02-14 23:56   ` Damien Le Moal
2023-02-15  9:04     ` Rick Wertenbroek
2023-02-15  9:17       ` Damien Le Moal
2023-02-15  9:58       ` Damien Le Moal
2023-02-16  7:28         ` Damien Le Moal
2023-02-16  8:43           ` Rick Wertenbroek
2023-02-16  9:54             ` Damien Le Moal
2023-02-14 14:08 ` [PATCH v2 2/9] PCI: rockchip: Write PCI Device ID to correct register Rick Wertenbroek
2023-02-14 23:57   ` Damien Le Moal
2023-02-14 14:08 ` [PATCH v2 3/9] PCI: rockchip: Assert PCI Configuration Enable bit after probe Rick Wertenbroek
2023-02-14 23:58   ` Damien Le Moal
2023-02-14 14:08 ` [PATCH v2 4/9] PCI: rockchip: Add poll and timeout to wait for PHY PLLs to be locked Rick Wertenbroek
2023-02-15  1:01   ` Damien Le Moal
2023-02-14 14:08 ` [PATCH v2 5/9] arm64: dts: rockchip: Add dtsi entry for RK3399 PCIe endpoint core Rick Wertenbroek
2023-02-15  1:03   ` Damien Le Moal
2023-02-14 14:08 ` [PATCH v2 6/9] PCI: rockchip: Fix window mapping and address translation for endpoint Rick Wertenbroek
2023-02-15  1:20   ` Damien Le Moal
2023-02-14 14:08 ` [PATCH v2 7/9] PCI: rockchip: Fix legacy IRQ generation for RK3399 PCIe endpoint core Rick Wertenbroek
2023-02-15  1:26   ` Damien Le Moal
2023-02-15  2:38   ` Damien Le Moal
2023-02-14 14:08 ` [PATCH v2 8/9] PCI: rockchip: Use u32 variable to access 32-bit registers Rick Wertenbroek
2023-02-15  1:34   ` Damien Le Moal
2023-02-15 10:46     ` David Laight
2023-02-15 11:20       ` Damien Le Moal
2023-03-14 15:45     ` Rick Wertenbroek
2023-02-14 14:08 ` [PATCH v2 9/9] PCI: rockchip: Add parameter check for RK3399 PCIe endpoint core set_msi() Rick Wertenbroek
2023-02-15  1:39   ` Damien Le Moal
2023-02-21 10:47     ` Rick Wertenbroek
2023-02-21 10:55       ` Damien Le Moal
2023-02-21 13:19         ` Rick Wertenbroek
2023-02-21 16:37           ` Rick Wertenbroek
2023-02-21 22:01             ` Damien Le Moal
2023-02-21 21:57           ` Damien Le Moal
2023-02-15  1:51 ` [PATCH v2 0/9] PCI: rockchip: Fix RK3399 PCIe endpoint controller driver Damien Le Moal
2023-02-15 10:28   ` Rick Wertenbroek
2023-02-15 10:41     ` Damien Le Moal
2023-03-14  0:02 ` Damien Le Moal
2023-03-14  7:57   ` Rick Wertenbroek
2023-03-14  8:10     ` Damien Le Moal
2023-03-14 14:53       ` Rick Wertenbroek
2023-03-14 22:54         ` Damien Le Moal [this message]
2023-03-15  0:00           ` Damien Le Moal
2023-03-16 12:52             ` Rick Wertenbroek
2023-03-16 16:34               ` Rick Wertenbroek
2023-03-16 22:09                 ` Damien Le Moal
2023-04-13 13:49                   ` Lorenzo Pieralisi
2023-04-13 14:34                     ` Rick Wertenbroek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8392a7de-666a-bce6-dc9f-b60d6dd93013@opensource.wdc.com \
    --to=damien.lemoal@opensource.wdc.com \
    --cc=alberto.dassatti@heig-vd.ch \
    --cc=bhelgaas@google.com \
    --cc=devicetree@vger.kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=heiko@sntech.de \
    --cc=jani.nikula@intel.com \
    --cc=krzysztof.kozlowski+dt@linaro.org \
    --cc=kw@linux.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-rockchip@lists.infradead.org \
    --cc=lpieralisi@kernel.org \
    --cc=mikko.kovanen@aavamobile.com \
    --cc=rick.wertenbroek@gmail.com \
    --cc=rick.wertenbroek@heig-vd.ch \
    --cc=robh+dt@kernel.org \
    --cc=rodrigo.vivi@intel.com \
    --cc=shawn.lin@rock-chips.com \
    --cc=xxm@rock-chips.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).