From: Bjorn Helgaas <helgaas@kernel.org>
To: Naveen Kumar P <naveenkumar.parna@gmail.com>
Cc: linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org,
linux-kernel@vger.kernel.org,
kernelnewbies <kernelnewbies@kernelnewbies.org>
Subject: Re: PCI: hotplug_event: PCIe PLDA Device BAR Reset
Date: Mon, 24 Feb 2025 13:54:23 -0600
Message-ID: <20250224195423.GA473540@bhelgaas>
In-Reply-To: <CAMciSVVV9tHH1M2bOnwqCJCQ8OjNFGjuQB7R-fY7JHHD5tQHoA@mail.gmail.com>
On Tue, Feb 25, 2025 at 12:29:00AM +0530, Naveen Kumar P wrote:
> On Mon, Feb 24, 2025 at 11:03 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Mon, Feb 24, 2025 at 05:45:35PM +0530, Naveen Kumar P wrote:
> > > On Wed, Feb 19, 2025 at 10:36 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > On Wed, Feb 19, 2025 at 05:52:47PM +0530, Naveen Kumar P wrote:
> > > > > Hi all,
> > > > >
> > > > > I am writing to seek assistance with an issue we are experiencing with
> > > > > a PCIe device (PLDA Device 5555) connected through PCI Express Root
> > > > > Port 1 to the host bridge.
> > > > >
> > > > > We have observed that after booting the system, the Base Address
> > > > > Register (BAR0) memory of this device gets reset to 0x0 after
> > > > > approximately one hour or more (the timing is inconsistent). This was
> > > > > verified using the lspci output and the setpci -s 01:00.0
> > > > > BASE_ADDRESS_0 command.
> ...
> I booted with the pcie_aspm=off kernel parameter, which means that
> PCIe Active State Power Management (ASPM) is disabled. Given this
> context, should I consider removing this setting to see if it affects
> the occurrence of the Bus Check notifications and the BAR0 reset
> issue?

Doesn't seem likely to be related.  Once configured, ASPM operates
without any software intervention.  But note that "pcie_aspm=off"
means the kernel doesn't touch ASPM configuration at all, and any
configuration done by firmware remains in effect.

You can tell whether ASPM has been enabled by firmware with "sudo
lspci -vv" before the problem occurs.
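
A minimal sketch of that check, assuming the usual "lspci -vv" layout
in which each PCIe function prints a "LnkCtl:" line that begins with
the ASPM state.  The helper name aspm_state is hypothetical, and
01:00.0 is the endpoint address used elsewhere in this thread; the
Root Port on the other end of the link is worth checking the same way.

```shell
# Hypothetical helper: print the ASPM field of every "LnkCtl:" line on stdin.
# "LnkCap:" and "LnkCtl2:" lines are ignored because the pattern requires
# a colon immediately after "LnkCtl".
aspm_state() {
    sed -n 's/^[[:space:]]*LnkCtl:[[:space:]]*\(ASPM[^;]*\);.*/\1/p'
}

# Usage, run before the problem occurs:
#   sudo lspci -vv -s 01:00.0 | aspm_state
```

Output of "ASPM Disabled" means firmware left ASPM off on that
function; anything like "ASPM L1 Enabled" means firmware turned it on
and "pcie_aspm=off" is leaving that setting in place.
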
> > > During the ACPI_NOTIFY_BUS_CHECK event, the lspci output initially
> > > showed all FF's, and then the next run of the same command showed
> > > BASE_ADDRESS_0 reset to zero:
> > > $ sudo lspci -xxx -s 01:00.0 | grep "10:"
> > > 10: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> >
> > Looks like the device isn't responding at all here. Could happen if
> > the device is reset or powered down.
>
> From the kernel driver or user space tools, is it possible to
> determine whether the device has been reset or powered down? Are
> there any power management settings or configurations that could be
> causing the device to reset or power down unexpectedly?

Not really.  By "powered down", I meant D3cold, where the main power
is removed.  Config space is readable in all other power states.
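
The two readings quoted above (all FF's, then BAR0 back at zero) can
be told apart mechanically.  This is only a sketch: classify_bar_row
is a hypothetical helper, it looks at nothing but the "10:" row of
"lspci -xxx" output, and an all-FF row says the device did not answer
the read, not why.

```shell
# Hypothetical helper: classify the "10:" row (BAR0..BAR3) of `lspci -xxx`
# output read from stdin.
classify_bar_row() {
    # Take the first "10:" row and keep only its hex digits.
    row=$(sed -n '/^10:/{s/^10://p;q;}' | tr -d ' \t')
    case "$row" in
        "")
            echo "no-data" ;;            # no "10:" row in the input
        *[!f]*)
            # At least one digit is not 'f', so the device answered.
            # Bytes are little-endian; the low 4 bits of a memory BAR are
            # type flags, so the low nibble of the first byte is ignored.
            case "$row" in
                0?000000*) echo "bar0-unassigned" ;;
                *)         echo "bar0-assigned" ;;
            esac ;;
        *)
            echo "no-response" ;;        # every byte read back as ff
    esac
}

# Usage:
#   sudo lspci -xxx -s 01:00.0 | classify_bar_row
```

Logging this result periodically with a timestamp would narrow down
when the device stops responding relative to the Bus Check
notifications.
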
> > What is this device? What driver is bound to it? I don't see
> > anything in dmesg that identifies a driver.
>
> The PCIe device in question is a Xilinx FPGA endpoint, which is
> flashed with RTL code to expose several host interfaces to the system
> via the PCIe link.
>
> We have an out-of-tree driver for this device, but to eliminate the
> driver's role in this issue, I renamed the driver to prevent it from
> loading automatically after rebooting the machine. Despite not using
> the driver, the issue still occurred.

Oh, right, I forgot that you mentioned this before.

> > You're seeing the problem on v5.4 (Nov 2019), which is much newer than
> > v4.4 (Jan 2016). But v5.4 is still really too old to spend a lot of
> > time on unless the problem still happens on a current kernel.

This part is important.  We don't want to spend a lot of time
debugging an issue that may have already been fixed upstream.

Bjorn