* PCIe: Unexpected .remove() and .probe() Callback Invocation Without Device Removal
@ 2025-05-07 14:31 Subhashini Rao Beerisetty
2025-05-27 17:43 ` Bjorn Helgaas
0 siblings, 1 reply; 2+ messages in thread
From: Subhashini Rao Beerisetty @ 2025-05-07 14:31 UTC (permalink / raw)
To: linux-pci
Hi All,
I’m reaching out for some guidance on a behavior we're observing with
a Xilinx FPGA-based PCIe device on our test systems. This device uses
an out-of-tree driver.
We are seeing that, without any actual physical or hotplug
removal/reinsertion of the PCIe device, the kernel invokes the
pci_driver's .remove() callback followed shortly by the .probe()
callback. This appears to be an unexpected re-enumeration or reset of
the PCIe device.
Below is a snippet of the relevant kernel log captured during one such event.
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.847385] XILINX_FPGA PCI 0000:01:00.0: PME# disabled
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.847410] XILINX_FPGA PCI: XILINX_FPGA_pci_remove
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.848110] pci 0000:01:00.0: EDR: Notify handler removed
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.848240] pci 0000:01:00.0: device released
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.876157] pci_bus 0000:00: scanning bus
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.876419] pcieport 0000:00:1c.0: scanning [bus 01-01] behind bridge, pass 0
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.876445] pci_bus 0000:01: scanning bus
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.876594] pci 0000:01:00.0: [1556:5555] type 00 class 0x050000
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.876678] pci 0000:01:00.0: reg 0x10: [mem 0xd0400000-0xd07fffff]
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.877426] pci 0000:01:00.0: EDR: Notify handler installed
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.877850] pci_bus 0000:01: bus scan returning with max=01
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.877872] pcieport 0000:00:1c.2: scanning [bus 02-02] behind bridge, pass 0
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.877898] pci_bus 0000:02: scanning bus
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.877915] pci_bus 0000:02: bus scan returning with max=02
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.877932] pcieport 0000:00:1c.3: scanning [bus 03-03] behind bridge, pass 0
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.877956] pci_bus 0000:03: scanning bus
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.877965] pci_bus 0000:03: bus scan returning with max=03
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.877982] pcieport 0000:00:1c.0: scanning [bus 01-01] behind bridge, pass 1
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.878013] pcieport 0000:00:1c.2: scanning [bus 02-02] behind bridge, pass 1
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.878043] pcieport 0000:00:1c.3: scanning [bus 03-03] behind bridge, pass 1
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.878066] pci_bus 0000:00: bus scan returning with max=03
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.878094] pci 0000:01:00.0: BAR 0: assigned [mem 0xd0400000-0xd07fffff]
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.878204] XILINX_FPGA PCI 0000:01:00.0: runtime IRQ mapping not provided by arch
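
To put a number on the sequence in the excerpt above, the kernel timestamps can be compared directly. The following is a minimal Python sketch and not part of the original report: the function names are hypothetical, and it assumes each kernel message sits on a single line, as the journal emits it. The three sample lines are copied from the excerpt.

```python
import re

# Captures the "[seconds.fraction]" kernel timestamp and the message after it.
LINE_RE = re.compile(r"kernel: \[\s*(\d+)\.(\d+)\] (.*)$")

SAMPLE = """\
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.847410] XILINX_FPGA PCI: XILINX_FPGA_pci_remove
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.848240] pci 0000:01:00.0: device released
Apr 30 20:01:05 xilinxtest1 kernel: [6612195.876157] pci_bus 0000:00: scanning bus
"""


def parse_events(log_text):
    """Return a list of (timestamp_in_microseconds, message) tuples."""
    events = []
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue  # not a kernel line with a timestamp
        seconds = int(match.group(1))
        # Normalise the fraction to exactly six digits (microseconds).
        micros = int(match.group(2).ljust(6, "0")[:6])
        events.append((seconds * 1_000_000 + micros, match.group(3)))
    return events


def first_time(events, needle):
    """Timestamp of the first message containing needle, or None."""
    for micros, message in events:
        if needle in message:
            return micros
    return None


def release_to_rescan_gap_us(log_text):
    """Microseconds between 'device released' and the first 'scanning bus'."""
    events = parse_events(log_text)
    released = first_time(events, "device released")
    rescan = first_time(events, "scanning bus")
    if released is None or rescan is None:
        return None
    return rescan - released


if __name__ == "__main__":
    print(release_to_rescan_gap_us(SAMPLE), "us between release and rescan")
```

On the three sample lines the gap is 27917 microseconds, so the scan of root bus 0000:00 starts roughly 28 ms after the device is released.
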
Our Questions:
1. What could trigger such a PCIe device re-enumeration without a physical event?
2. Are there any known kernel or platform-level triggers that might cause this?
3. Are there any debug hooks or sysfs entries we should monitor or enable to
   catch the root cause?
Any guidance, insights, or debugging suggestions would be greatly appreciated.
Thanks,
* Re: PCIe: Unexpected .remove() and .probe() Callback Invocation Without Device Removal
From: Bjorn Helgaas @ 2025-05-27 17:43 UTC (permalink / raw)
To: Subhashini Rao Beerisetty; +Cc: linux-pci
On Wed, May 07, 2025 at 08:01:09PM +0530, Subhashini Rao Beerisetty wrote:
> Hi All,
>
> I’m reaching out for some guidance on a behavior we're observing with
> a Xilinx FPGA-based PCIe device on our test systems. This device uses
> an out-of-tree driver.
>
> We are seeing that, without any actual physical or hotplug
> removal/reinsertion of the PCIe device, the kernel invokes the
> pci_driver's .remove() callback followed shortly by the .probe()
> callback. This appears to be an unexpected re-enumeration or reset of
> the PCIe device.
>
> Below is a snippet of the relevant kernel log captured during one such event.
>
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.847385] XILINX_FPGA PCI 0000:01:00.0: PME# disabled
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.847410] XILINX_FPGA PCI: XILINX_FPGA_pci_remove
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.848110] pci 0000:01:00.0: EDR: Notify handler removed
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.848240] pci 0000:01:00.0: device released
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.876157] pci_bus 0000:00: scanning bus
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.876419] pcieport 0000:00:1c.0: scanning [bus 01-01] behind bridge, pass 0
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.876445] pci_bus 0000:01: scanning bus
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.876594] pci 0000:01:00.0: [1556:5555] type 00 class 0x050000
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.876678] pci 0000:01:00.0: reg 0x10: [mem 0xd0400000-0xd07fffff]
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.877426] pci 0000:01:00.0: EDR: Notify handler installed
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.877850] pci_bus 0000:01: bus scan returning with max=01
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.877872] pcieport 0000:00:1c.2: scanning [bus 02-02] behind bridge, pass 0
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.877898] pci_bus 0000:02: scanning bus
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.877915] pci_bus 0000:02: bus scan returning with max=02
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.877932] pcieport 0000:00:1c.3: scanning [bus 03-03] behind bridge, pass 0
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.877956] pci_bus 0000:03: scanning bus
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.877965] pci_bus 0000:03: bus scan returning with max=03
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.877982] pcieport 0000:00:1c.0: scanning [bus 01-01] behind bridge, pass 1
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.878013] pcieport 0000:00:1c.2: scanning [bus 02-02] behind bridge, pass 1
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.878043] pcieport 0000:00:1c.3: scanning [bus 03-03] behind bridge, pass 1
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.878066] pci_bus 0000:00: bus scan returning with max=03
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.878094] pci 0000:01:00.0: BAR 0: assigned [mem 0xd0400000-0xd07fffff]
> Apr 30 20:01:05 xilinxtest1 kernel: [6612195.878204] XILINX_FPGA PCI 0000:01:00.0: runtime IRQ mapping not provided by arch
>
>
> Our Questions:
>
> 1. What could trigger such a PCIe device re-enumeration without a physical event?
> 2. Are there any known kernel or platform-level triggers that might cause this?
> 3. Are there any debug hooks or sysfs entries we should monitor or enable to
>    catch the root cause?
If you can point us to the source for the driver you're using, we
might be able to help.
Otherwise, adding dump_stack() in your .probe() and .remove()
functions might give clues.
Bjorn
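
As a userspace complement to the dump_stack() suggestion, the kernel uevents for the device can be logged as they arrive. This is a minimal Python sketch under stated assumptions, not something proposed in the thread: the names parse_uevent, is_relevant and watch are hypothetical, the slot 0000:01:00.0 is taken from the log above, and it assumes a Linux host where the process is allowed to open a NETLINK_KOBJECT_UEVENT socket. It records when add/remove/bind/unbind events occur for the device; it does not show what triggered them, which is what the in-kernel stack dump is for.

```python
import socket
import time

NETLINK_KOBJECT_UEVENT = 15  # protocol number from <linux/netlink.h>
KERNEL_UEVENT_GROUP = 1      # multicast group carrying kernel-originated uevents


def parse_uevent(payload: bytes) -> dict:
    """Split a kernel uevent datagram into its KEY=VALUE fields.

    The datagram is a NUL-separated list: an "action@devpath" header
    followed by KEY=VALUE strings such as ACTION=, DEVPATH=, SUBSYSTEM=.
    """
    fields = {}
    parts = payload.split(b"\0")
    for part in parts[1:]:  # skip the "action@devpath" header
        key, sep, value = part.decode("utf-8", "replace").partition("=")
        if sep:
            fields[key] = value
    return fields


def is_relevant(fields: dict, slot: str) -> bool:
    """True for add/remove/bind/unbind events of the given PCI slot."""
    return (
        fields.get("SUBSYSTEM") == "pci"
        and fields.get("PCI_SLOT_NAME") == slot
        and fields.get("ACTION") in ("add", "remove", "bind", "unbind")
    )


def watch(slot: str = "0000:01:00.0") -> None:
    """Print a timestamped line for every relevant uevent until interrupted."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_DGRAM,
                         NETLINK_KOBJECT_UEVENT)
    sock.bind((0, KERNEL_UEVENT_GROUP))
    while True:
        fields = parse_uevent(sock.recv(16384))
        if is_relevant(fields, slot):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(stamp, fields["ACTION"], fields.get("DEVPATH"),
                  "seq", fields.get("SEQNUM"))


if __name__ == "__main__":
    watch()
```

Left running in the background, the output can be lined up against the journal timestamps and against anything else on the system that runs at the same moment, such as cron jobs or management scripts.
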