* reset_slot() callback not respecting MPS config
@ 2025-05-22 16:19 Niklas Cassel
2025-05-23 5:30 ` Wilfred Mallawa
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Niklas Cassel @ 2025-05-22 16:19 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Wilfred Mallawa, Bjorn Helgaas, Damien Le Moal, linux-pci
Hello there,
As you know the reset_slot() callback patches were merged recently.
Wilfred and I (mostly Wilfred), have been debugging DMA issues after the
reset_slot() callback has been invoked. The issue is reproduced when MPS
configuration is set to performance, but might be applicable for other
MPS configurations as well. The problem appears to be that reset_slot()
feature does not respect/restore the MPS configuration.
When having two Rock 5Bs, one in RC mode, one in EP mode:
# Boot with MPS bus performance
pci=pcie_bus_perf on kernel command line
# lspci RC
lspci -vvvs 0000:00:00.0 | grep -E MaxPayload
DevCap: MaxPayload 256 bytes, PhantFunc 0
DevCtl: MaxPayload 256 bytes, MaxReadReq 256 bytes
# Run pcitests
All tests pass
# Perform hot reset of EP
echo 1 > /sys/bus/pci/devices/0000:01:00.0/reset
# lspci RC
lspci -vvvs 0000:00:00.0 | grep -E MaxPayload
DevCap: MaxPayload 256 bytes, PhantFunc 0
DevCtl: MaxPayload 128 bytes, MaxReadReq 512 bytes
# Run pcitests
FAIL pci_ep_data_transfer.dma.READ_TEST
Compared to before, DMA read test now fails.
Wilfred has been able to track this down to being because MPS in the RC
is different before/after a ->reset_slot(), and simply re-storing MPS using
a dirty hack, the DMA read issues go away.
My first thinking was that pci_configure_device() (and thus pci_configure_mps())
was not called,
so I added some prints in pci_configure_mps().
pci_configure_mps() was not called for RC nor EP during ->reset_slot()
(I did not check if pcie_bus_configure_settings() was called.)
I tried Mani's earlier patch instead of what is queued up:
https://lore.kernel.org/linux-pci/20250221172309.120009-2-manivannan.sadhasivam@linaro.org/
And in this version pci_configure_mps() does get called,
for both the RC and EP during ->retrain_link()
(I did not check if pcie_bus_configure_settings() was called.)
But... MaxPayload was still incorrectly set after retrain_link().
I think the solution is to add a call to add a pcie_bus_configure_settings()
call in pcie_do_recovery() / pci_host_recover_slots() / pci_host_reset_slot() /
pcibios_reset_secondary_bus().
Or possibly a:
list_for_each_entry(child, &bus->children, node)
pcie_bus_configure_settings(child);
as done in pci_host_probe().
I'm not sure if we need to make sure that pci_configure_device() /
pci_configure_mps() gets called as well. Since we did reset the endpoint,
it seems that calling pci_configure_mps() does make sense.
Normally, I would wait for Wilfred and myself to come up with a nice fix,
and test it, but, considering that the ->reset_slot() patches are queued
up already, and the merge window is opening soon, I'm sharing our findings
on the list, as it might take some time to come up with a nice solution.
Kind regards,
Niklas
P.S. Thanks to Wilfred for all the hours spent root-causing this.
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: reset_slot() callback not respecting MPS config 2025-05-22 16:19 reset_slot() callback not respecting MPS config Niklas Cassel @ 2025-05-23 5:30 ` Wilfred Mallawa 2025-05-23 5:33 ` Lukas Wunner 2025-05-23 15:49 ` Manivannan Sadhasivam 2 siblings, 0 replies; 12+ messages in thread From: Wilfred Mallawa @ 2025-05-23 5:30 UTC (permalink / raw) To: cassel@kernel.org, manivannan.sadhasivam@linaro.org Cc: linux-pci@vger.kernel.org, helgaas@kernel.org, dlemoal@kernel.org On Thu, 2025-05-22 at 18:19 +0200, Niklas Cassel wrote: [snip] > > I think the solution is to add a call to add a > pcie_bus_configure_settings() > call in pcie_do_recovery() / pci_host_recover_slots() / > pci_host_reset_slot() / > pcibios_reset_secondary_bus(). > > > Or possibly a: > list_for_each_entry(child, &bus->children, node) > pcie_bus_configure_settings(child); Hey Niklas, Thanks for writing this out in detail. Just to add to this, I have tested your suggestion, and it does work. That's is: ``` @@ -4990,6 +4991,8 @@ void __weak pcibios_reset_secondary_bus(struct pci_dev *dev) if (ret) pci_err(dev, "failed to reset slot: %d\n", ret); + list_for_each_entry(child, &dev->bus->children, node) + pcie_bus_configure_settings(child); return; } -- ``` To clarify, with this patch applied, between a Rock5B EP and Rock5B RC both CONFIG_PCIE_BUS_PERFORMANCE=y. When the host issues a sysfs initiated bus reset to the endpoint. Everything works as expected. Whereas before, DMA from EP to host would not work after the bus reset. Regards, Wilfred ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: reset_slot() callback not respecting MPS config 2025-05-22 16:19 reset_slot() callback not respecting MPS config Niklas Cassel 2025-05-23 5:30 ` Wilfred Mallawa @ 2025-05-23 5:33 ` Lukas Wunner 2025-05-23 6:23 ` Wilfred Mallawa ` (2 more replies) 2025-05-23 15:49 ` Manivannan Sadhasivam 2 siblings, 3 replies; 12+ messages in thread From: Lukas Wunner @ 2025-05-23 5:33 UTC (permalink / raw) To: Niklas Cassel Cc: Manivannan Sadhasivam, Wilfred Mallawa, Bjorn Helgaas, Damien Le Moal, linux-pci On Thu, May 22, 2025 at 06:19:56PM +0200, Niklas Cassel wrote: > As you know the reset_slot() callback patches were merged recently. > > Wilfred and I (mostly Wilfred), have been debugging DMA issues after the > reset_slot() callback has been invoked. The issue is reproduced when MPS > configuration is set to performance, but might be applicable for other > MPS configurations as well. The problem appears to be that reset_slot() > feature does not respect/restore the MPS configuration. The Device Control register (and thus the MPS setting) is saved via: pci_save_state() pci_save_pcie_state() So either you're missing a call to pci_restore_state() after reset, or you're missing a call to pci_save_state() after changing MPS, or MPS is somehow overwritten after pci_restore_state(). Which one is it? Thanks, Lukas ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: reset_slot() callback not respecting MPS config 2025-05-23 5:33 ` Lukas Wunner @ 2025-05-23 6:23 ` Wilfred Mallawa 2025-05-23 6:39 ` Manivannan Sadhasivam 2025-05-23 7:02 ` Niklas Cassel 2 siblings, 0 replies; 12+ messages in thread From: Wilfred Mallawa @ 2025-05-23 6:23 UTC (permalink / raw) To: lukas@wunner.de, cassel@kernel.org Cc: linux-pci@vger.kernel.org, manivannan.sadhasivam@linaro.org, helgaas@kernel.org, dlemoal@kernel.org On Fri, 2025-05-23 at 07:33 +0200, Lukas Wunner wrote: > On Thu, May 22, 2025 at 06:19:56PM +0200, Niklas Cassel wrote: > > As you know the reset_slot() callback patches were merged recently. > > > > Wilfred and I (mostly Wilfred), have been debugging DMA issues > > after the > > reset_slot() callback has been invoked. The issue is reproduced > > when MPS > > configuration is set to performance, but might be applicable for > > other > > MPS configurations as well. The problem appears to be that > > reset_slot() > > feature does not respect/restore the MPS configuration. > > The Device Control register (and thus the MPS setting) is saved via: > > pci_save_state() > pci_save_pcie_state() > > So either you're missing a call to pci_restore_state() after reset, > or you're missing a call to pci_save_state() after changing MPS, > or MPS is somehow overwritten after pci_restore_state(). > Which one is it? > Hey Lukas, Just wanted to add to this: I do believe this is the case, in that we are missing a call to pci_save/restore_state(). My initial "fix" for this problem was to do "pci_save_state()" and "pci_restore_state()" in reset_slot(). Which does work, as you mentioned it saves and restores the Device Control register (thus MPS) amongst others...however, I have also tried the fix that Niklas suggested as well. Cheers, Wilfred > Thanks, > > Lukas ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: reset_slot() callback not respecting MPS config 2025-05-23 5:33 ` Lukas Wunner 2025-05-23 6:23 ` Wilfred Mallawa @ 2025-05-23 6:39 ` Manivannan Sadhasivam 2025-05-23 14:51 ` Lukas Wunner 2025-05-23 7:02 ` Niklas Cassel 2 siblings, 1 reply; 12+ messages in thread From: Manivannan Sadhasivam @ 2025-05-23 6:39 UTC (permalink / raw) To: Lukas Wunner Cc: Niklas Cassel, Wilfred Mallawa, Bjorn Helgaas, Damien Le Moal, linux-pci On Fri, May 23, 2025 at 07:33:16AM +0200, Lukas Wunner wrote: > On Thu, May 22, 2025 at 06:19:56PM +0200, Niklas Cassel wrote: > > As you know the reset_slot() callback patches were merged recently. > > > > Wilfred and I (mostly Wilfred), have been debugging DMA issues after the > > reset_slot() callback has been invoked. The issue is reproduced when MPS > > configuration is set to performance, but might be applicable for other > > MPS configurations as well. The problem appears to be that reset_slot() > > feature does not respect/restore the MPS configuration. > > The Device Control register (and thus the MPS setting) is saved via: > > pci_save_state() > pci_save_pcie_state() > > So either you're missing a call to pci_restore_state() after reset, > or you're missing a call to pci_save_state() after changing MPS, > or MPS is somehow overwritten after pci_restore_state(). > Which one is it? > I think the issue is that the PCI bridge is getting reset while trying to reset the PCI device. And in the reset path, we only save the config space of the *device*, not the bridge. As seen from the lspci output shared by Niklas, the content of the PCI bridge seem to be diverged. Since the slot_reset() callback resets the whole Root Complex (if there is a single Root port), then the config space of the Root port/bridge needs to be saved and restored as well. I believe pcibios_reset_secondary_bus() is not supposed to change the config space of the root port. As per the definition of the "Secondary Bus Reset" field in the Bridge Control Register, r3.0, sec 7.5.3.6: "Port configuration registers must not be changed, except as required to update Port status." So pci_reset_secondary_bus() is not changing the config space, but slot_reset() does. Are we plugging slot_reset() at the wrong place? - Mani -- மணிவண்ணன் சதாசிவம் ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: reset_slot() callback not respecting MPS config 2025-05-23 6:39 ` Manivannan Sadhasivam @ 2025-05-23 14:51 ` Lukas Wunner 2025-05-23 15:30 ` Manivannan Sadhasivam 0 siblings, 1 reply; 12+ messages in thread From: Lukas Wunner @ 2025-05-23 14:51 UTC (permalink / raw) To: Manivannan Sadhasivam Cc: Niklas Cassel, Wilfred Mallawa, Bjorn Helgaas, Damien Le Moal, linux-pci On Fri, May 23, 2025 at 12:09:06PM +0530, Manivannan Sadhasivam wrote: > On Fri, May 23, 2025 at 07:33:16AM +0200, Lukas Wunner wrote: > > On Thu, May 22, 2025 at 06:19:56PM +0200, Niklas Cassel wrote: > > > As you know the reset_slot() callback patches were merged recently. > > > > > > Wilfred and I (mostly Wilfred), have been debugging DMA issues after the > > > reset_slot() callback has been invoked. The issue is reproduced when MPS > > > configuration is set to performance, but might be applicable for other > > > MPS configurations as well. The problem appears to be that reset_slot() > > > feature does not respect/restore the MPS configuration. > > > > The Device Control register (and thus the MPS setting) is saved via: > > > > pci_save_state() > > pci_save_pcie_state() > > > > So either you're missing a call to pci_restore_state() after reset, > > or you're missing a call to pci_save_state() after changing MPS, > > or MPS is somehow overwritten after pci_restore_state(). > > I think the issue is that the PCI bridge is getting reset while trying > to reset the PCI device. And in the reset path, we only save the config > space of the *device*, not the bridge. > > As seen from the lspci output shared by Niklas, the content of the PCI > bridge seem to be diverged. Since the reset_slot() callback resets the > whole Root Complex (if there is a single Root port), then the config > space of the Root port/bridge needs to be saved and restored as well. > > I believe pcibios_reset_secondary_bus() is not supposed to change the > config space of the root port. As per the definition of the "Secondary > Bus Reset" field in the Bridge Control Register, r3.0, sec 7.5.3.6: > > "Port configuration registers must not be changed, except as required > to update Port status." > > So pci_reset_secondary_bus() is not changing the config space, > but reset_slot() does. Are we plugging reset_slot() at the wrong place? On ACPI-based platforms (x86 etc), I'm not aware that it's possible to reset the Root Complex. If it is, I don't think we've exposed that feature and hence we don't really have a better place to hook into. There's the pci_reset_fn_methods[] array and conceivably, an entry could be added there to reset the Root Port on capable platforms. However that array is meant to reset a single PCI function, whereas the ->reset_slot() also resets the entire hierarchy below the Root Port (IIUC). So that's not really what the array is meant to be used for. You wanted to use ->reset_slot() for aer_root_reset(). It performs a Secondary Bus Reset via: pci_bus_error_reset() pci_bus_reset() pci_bridge_secondary_bus_reset() or: pci_bus_error_reset() pci_slot_reset() pci_reset_hotplug_slot() hotplug->ops->reset_slot() pciehp_reset_slot() # or other hotplug driver pci_bridge_secondary_bus_reset() ...and that's the reason I suggested to plumb ->reset_slot() into pcibios_reset_secondary_bus(). I don't think we have a better place. If all host bridge drivers reset the Root Complex as part of ->reset_slot(), then it should be fine to just call pci_save_state(dev) before and pci_restore_state(dev) after invoking host->reset_slot() in pcibios_reset_secondary_bus(). If however this behavior is specific only to certain host bridge drivers, then you want to call pci_save_state() and pci_restore_state() directly in their ->reset_slot() implementations. I note that if you have a deeper hierarchy with PCIe switches below the host bridge, you'll reset the Root Complex even if the error was reported further down in the hierarchy by some Switch Downstream Port. I think in that case you may not want to reset the Root Complex, but only perform a Secondary Bus Reset at that Downstream Port. In other words, I'm wondering if pcibios_reset_secondary_bus() should invoke host->reset_slot() only if dev is a Root Port / is sitting on the root bus. I'm also wondering if ->reset_slot() should be renamed to something like ->reset_root_complex() or ->reset_root_port() or somesuch to more aptly describe what it does. I guess the name ->reset_slot() came about because these Root Complexes typically consist of a single Root Port with a single slot. Thanks, Lukas ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: reset_slot() callback not respecting MPS config 2025-05-23 14:51 ` Lukas Wunner @ 2025-05-23 15:30 ` Manivannan Sadhasivam 2025-05-24 12:40 ` Lukas Wunner 0 siblings, 1 reply; 12+ messages in thread From: Manivannan Sadhasivam @ 2025-05-23 15:30 UTC (permalink / raw) To: Lukas Wunner Cc: Niklas Cassel, Wilfred Mallawa, Bjorn Helgaas, Damien Le Moal, linux-pci On Fri, May 23, 2025 at 04:51:14PM +0200, Lukas Wunner wrote: > On Fri, May 23, 2025 at 12:09:06PM +0530, Manivannan Sadhasivam wrote: > > On Fri, May 23, 2025 at 07:33:16AM +0200, Lukas Wunner wrote: > > > On Thu, May 22, 2025 at 06:19:56PM +0200, Niklas Cassel wrote: > > > > As you know the reset_slot() callback patches were merged recently. > > > > > > > > Wilfred and I (mostly Wilfred), have been debugging DMA issues after the > > > > reset_slot() callback has been invoked. The issue is reproduced when MPS > > > > configuration is set to performance, but might be applicable for other > > > > MPS configurations as well. The problem appears to be that reset_slot() > > > > feature does not respect/restore the MPS configuration. > > > > > > The Device Control register (and thus the MPS setting) is saved via: > > > > > > pci_save_state() > > > pci_save_pcie_state() > > > > > > So either you're missing a call to pci_restore_state() after reset, > > > or you're missing a call to pci_save_state() after changing MPS, > > > or MPS is somehow overwritten after pci_restore_state(). > > > > I think the issue is that the PCI bridge is getting reset while trying > > to reset the PCI device. And in the reset path, we only save the config > > space of the *device*, not the bridge. > > > > As seen from the lspci output shared by Niklas, the content of the PCI > > bridge seem to be diverged. Since the reset_slot() callback resets the > > whole Root Complex (if there is a single Root port), then the config > > space of the Root port/bridge needs to be saved and restored as well. > > > > I believe pcibios_reset_secondary_bus() is not supposed to change the > > config space of the root port. As per the definition of the "Secondary > > Bus Reset" field in the Bridge Control Register, r3.0, sec 7.5.3.6: > > > > "Port configuration registers must not be changed, except as required > > to update Port status." > > > > So pci_reset_secondary_bus() is not changing the config space, > > but reset_slot() does. Are we plugging reset_slot() at the wrong place? > > On ACPI-based platforms (x86 etc), I'm not aware that it's possible > to reset the Root Complex. If it is, I don't think we've exposed > that feature and hence we don't really have a better place to hook > into. > > There's the pci_reset_fn_methods[] array and conceivably, an entry > could be added there to reset the Root Port on capable platforms. > However that array is meant to reset a single PCI function, > whereas the ->reset_slot() also resets the entire hierarchy below > the Root Port (IIUC). So that's not really what the array is > meant to be used for. > > You wanted to use ->reset_slot() for aer_root_reset(). It performs > a Secondary Bus Reset via: > > pci_bus_error_reset() > pci_bus_reset() > pci_bridge_secondary_bus_reset() > > or: > > pci_bus_error_reset() > pci_slot_reset() > pci_reset_hotplug_slot() > hotplug->ops->reset_slot() > pciehp_reset_slot() # or other hotplug driver > pci_bridge_secondary_bus_reset() > > ...and that's the reason I suggested to plumb ->reset_slot() > into pcibios_reset_secondary_bus(). I don't think we have > a better place. > > If all host bridge drivers reset the Root Complex as part of > ->reset_slot(), then it should be fine to just call > pci_save_state(dev) before and pci_restore_state(dev) after > invoking host->reset_slot() in pcibios_reset_secondary_bus(). > > If however this behavior is specific only to certain host > bridge drivers, then you want to call pci_save_state() and > pci_restore_state() directly in their ->reset_slot() > implementations. > The callback is supposed to reset the specific PCI slots through platform specific means. But I've worked only with single root port/slot devices so far, so I'm not sure if the controller driver can reset individual root ports on a multi port setup or not. I thought that it *might* be possible to reset individual ports, so that's why I passed the root port 'pci_dev' to the callback in a hope that the controller drivers could use it to identify the root port they are resetting. But since both of the controller drivers implementing the slot_reset() callback are single root port implementations, I guess it is fine to do save/restore state for the root port in pcibios_reset_secondary_bus() itself. > I note that if you have a deeper hierarchy with PCIe switches > below the host bridge, you'll reset the Root Complex even if > the error was reported further down in the hierarchy by some > Switch Downstream Port. I think in that case you may not > want to reset the Root Complex, but only perform a Secondary > Bus Reset at that Downstream Port. In other words, > I'm wondering if pcibios_reset_secondary_bus() should invoke > host->reset_slot() only if dev is a Root Port / is sitting > on the root bus. > You are right. We should check if the parent bus of the bridge is a root bus or not. > I'm also wondering if ->reset_slot() should be renamed to > something like ->reset_root_complex() or ->reset_root_port() > or somesuch to more aptly describe what it does. > I guess the name ->reset_slot() came about because these > Root Complexes typically consist of a single Root Port with > a single slot. > Yes, pretty much so. I could rename it to reset_root_port(), since I still believe that multi root port setups may be able to reset them separately. Will submit patches for the above changes. - Mani -- மணிவண்ணன் சதாசிவம் ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: reset_slot() callback not respecting MPS config 2025-05-23 15:30 ` Manivannan Sadhasivam @ 2025-05-24 12:40 ` Lukas Wunner 2025-05-25 7:59 ` Manivannan Sadhasivam 0 siblings, 1 reply; 12+ messages in thread From: Lukas Wunner @ 2025-05-24 12:40 UTC (permalink / raw) To: Manivannan Sadhasivam Cc: Niklas Cassel, Wilfred Mallawa, Bjorn Helgaas, Damien Le Moal, linux-pci On Fri, May 23, 2025 at 09:00:27PM +0530, Manivannan Sadhasivam wrote: > I thought that it *might* be possible to reset individual ports, > so that's why I passed the root port 'pci_dev' to the callback > in a hope that the controller drivers could use it to identify > the root port they are resetting. Makes sense. > You are right. We should check if the parent bus of the bridge > is a root bus or not. Okay, that's simple enough: pci_is_root_bus(dev->bus) > Yes, pretty much so. I could rename it to reset_root_port(), > since I still believe that multi root port setups may be able > to reset them separately. Conceivably, a PCIe host controller might also possess RCiEPs in addition to Root Ports. Those are allowed to be FLR-capable, but could also be reset through a platform-specific means. PCIe r6.3 page 121 defines the term "Root Complex Component", which encompasses Root Ports but also RCiEPs. So if you want to be super generic, you could use that term in lieu of Root Port, though it consumes more characters. Thanks, Lukas ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: reset_slot() callback not respecting MPS config 2025-05-24 12:40 ` Lukas Wunner @ 2025-05-25 7:59 ` Manivannan Sadhasivam 0 siblings, 0 replies; 12+ messages in thread From: Manivannan Sadhasivam @ 2025-05-25 7:59 UTC (permalink / raw) To: Lukas Wunner Cc: Niklas Cassel, Wilfred Mallawa, Bjorn Helgaas, Damien Le Moal, linux-pci On Sat, May 24, 2025 at 02:40:23PM +0200, Lukas Wunner wrote: > On Fri, May 23, 2025 at 09:00:27PM +0530, Manivannan Sadhasivam wrote: > > I thought that it *might* be possible to reset individual ports, > > so that's why I passed the root port 'pci_dev' to the callback > > in a hope that the controller drivers could use it to identify > > the root port they are resetting. > > Makes sense. > > > You are right. We should check if the parent bus of the bridge > > is a root bus or not. > > Okay, that's simple enough: pci_is_root_bus(dev->bus) > > > Yes, pretty much so. I could rename it to reset_root_port(), > > since I still believe that multi root port setups may be able > > to reset them separately. > > Conceivably, a PCIe host controller might also possess RCiEPs > in addition to Root Ports. Those are allowed to be FLR-capable, > but could also be reset through a platform-specific means. > > PCIe r6.3 page 121 defines the term "Root Complex Component", > which encompasses Root Ports but also RCiEPs. So if you want to > be super generic, you could use that term in lieu of Root Port, > though it consumes more characters. > We don't have any PCI controller driver incorporating RCiEPs afaik. So I'll stick to 'reset_root_port' for now. - Mani -- மணிவண்ணன் சதாசிவம் ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: reset_slot() callback not respecting MPS config 2025-05-23 5:33 ` Lukas Wunner 2025-05-23 6:23 ` Wilfred Mallawa 2025-05-23 6:39 ` Manivannan Sadhasivam @ 2025-05-23 7:02 ` Niklas Cassel 2025-05-23 14:59 ` Lukas Wunner 2 siblings, 1 reply; 12+ messages in thread From: Niklas Cassel @ 2025-05-23 7:02 UTC (permalink / raw) To: Lukas Wunner Cc: Manivannan Sadhasivam, Wilfred Mallawa, Bjorn Helgaas, Damien Le Moal, linux-pci On Fri, May 23, 2025 at 07:33:16AM +0200, Lukas Wunner wrote: > On Thu, May 22, 2025 at 06:19:56PM +0200, Niklas Cassel wrote: > > As you know the reset_slot() callback patches were merged recently. > > > > Wilfred and I (mostly Wilfred), have been debugging DMA issues after the > > reset_slot() callback has been invoked. The issue is reproduced when MPS > > configuration is set to performance, but might be applicable for other > > MPS configurations as well. The problem appears to be that reset_slot() > > feature does not respect/restore the MPS configuration. > > The Device Control register (and thus the MPS setting) is saved via: > > pci_save_state() > pci_save_pcie_state() > > So either you're missing a call to pci_restore_state() after reset, > or you're missing a call to pci_save_state() after changing MPS, > or MPS is somehow overwritten after pci_restore_state(). > Which one is it? I kind of liked the earlier revision of Mani's series where we kicked the devices off the bus, that way, we would re-use the exact same code paths as when doing the initial enumeration. Also, by removing the device, the exact same solution works fine both for link-down (since the device might never come back again), and for a sysfs initiated reset. Anyway, I'm happy with whatever solution that works. Adding pci_save_state() + pci_restore_state() seems fine, but since things are not working, I assume that those calls are missing, at least of the bridge. Are we also missing similar pci_save_state() + pci_restore_state() calls for the EP? Kind regards, Niklas ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: reset_slot() callback not respecting MPS config 2025-05-23 7:02 ` Niklas Cassel @ 2025-05-23 14:59 ` Lukas Wunner 0 siblings, 0 replies; 12+ messages in thread From: Lukas Wunner @ 2025-05-23 14:59 UTC (permalink / raw) To: Niklas Cassel Cc: Manivannan Sadhasivam, Wilfred Mallawa, Bjorn Helgaas, Damien Le Moal, linux-pci On Fri, May 23, 2025 at 09:02:38AM +0200, Niklas Cassel wrote: > I kind of liked the earlier revision of Mani's series where we kicked the > devices off the bus, that way, we would re-use the exact same code paths > as when doing the initial enumeration. > > Also, by removing the device, the exact same solution works fine both for > link-down (since the device might never come back again), and for a sysfs > initiated reset. PCI drivers such as NVMe are capable of recovering gracefully from reset. That's a very important feature for data center use cases. The PCIe hotplug driver goes to great lengths to avoid de-enumeration and re-enumeration and instead allow for recovery from reset (see e.g. the invocation of pci_dpc_recovered() and pciehp_ignore_dpc_link_change() in pciehp_ist()). So no, we do not want to remove devices in response to reset if we can help it. We generally only de-enumerate and re-enumerate devices on hotplug or if explicitly told to do so via sysfs "remove" / "rescan" attributes. Thanks, Lukas ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: reset_slot() callback not respecting MPS config 2025-05-22 16:19 reset_slot() callback not respecting MPS config Niklas Cassel 2025-05-23 5:30 ` Wilfred Mallawa 2025-05-23 5:33 ` Lukas Wunner @ 2025-05-23 15:49 ` Manivannan Sadhasivam 2 siblings, 0 replies; 12+ messages in thread From: Manivannan Sadhasivam @ 2025-05-23 15:49 UTC (permalink / raw) To: Niklas Cassel; +Cc: Wilfred Mallawa, Bjorn Helgaas, Damien Le Moal, linux-pci On Thu, May 22, 2025 at 06:19:56PM +0200, Niklas Cassel wrote: > Hello there, > > > As you know the reset_slot() callback patches were merged recently. > > Wilfred and I (mostly Wilfred), have been debugging DMA issues after the > reset_slot() callback has been invoked. The issue is reproduced when MPS > configuration is set to performance, but might be applicable for other > MPS configurations as well. The problem appears to be that reset_slot() > feature does not respect/restore the MPS configuration. > > > When having two Rock 5Bs, one in RC mode, one in EP mode: > > # Boot with MPS bus performance > pci=pcie_bus_perf on kernel command line > > # lspci RC > lspci -vvvs 0000:00:00.0 | grep -E MaxPayload > DevCap: MaxPayload 256 bytes, PhantFunc 0 > DevCtl: MaxPayload 256 bytes, MaxReadReq 256 bytes > > # Run pcitests > All tests pass > > # Perform hot reset of EP > echo 1 > /sys/bus/pci/devices/0000:01:00.0/reset > > # lspci RC > lspci -vvvs 0000:00:00.0 | grep -E MaxPayload > DevCap: MaxPayload 256 bytes, PhantFunc 0 > DevCtl: MaxPayload 128 bytes, MaxReadReq 512 bytes > > # Run pcitests > FAIL pci_ep_data_transfer.dma.READ_TEST > Compared to before, DMA read test now fails. > > Wilfred has been able to track this down to being because MPS in the RC > is different before/after a ->reset_slot(), and simply re-storing MPS using > a dirty hack, the DMA read issues go away. > > > > My first thinking was that pci_configure_device() (and thus pci_configure_mps()) > was not called, > so I added some prints in pci_configure_mps(). > pci_configure_mps() was not called for RC nor EP during ->reset_slot() > (I did not check if pcie_bus_configure_settings() was called.) > > > I tried Mani's earlier patch instead of what is queued up: > https://lore.kernel.org/linux-pci/20250221172309.120009-2-manivannan.sadhasivam@linaro.org/ > > And in this version pci_configure_mps() does get called, > for both the RC and EP during ->retrain_link() > (I did not check if pcie_bus_configure_settings() was called.) > > But... MaxPayload was still incorrectly set after retrain_link(). > > > I think the solution is to add a call to add a pcie_bus_configure_settings() > call in pcie_do_recovery() / pci_host_recover_slots() / pci_host_reset_slot() / > pcibios_reset_secondary_bus(). > > > Or possibly a: > list_for_each_entry(child, &bus->children, node) > pcie_bus_configure_settings(child); > > as done in pci_host_probe(). > > I'm not sure if we need to make sure that pci_configure_device() / > pci_configure_mps() gets called as well. Since we did reset the endpoint, > it seems that calling pci_configure_mps() does make sense. > > > Normally, I would wait for Wilfred and myself to come up with a nice fix, > and test it, but, considering that the ->reset_slot() patches are queued > up already, and the merge window is opening soon, I'm sharing our findings > on the list, as it might take some time to come up with a nice solution. > Thanks Niklas for reporting and Wilfred for debugging! I will submit the patches for fixing this issue along with couple of other comments raised by Lukas. - Mani -- மணிவண்ணன் சதாசிவம் ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2025-05-25 7:59 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-05-22 16:19 reset_slot() callback not respecting MPS config Niklas Cassel 2025-05-23 5:30 ` Wilfred Mallawa 2025-05-23 5:33 ` Lukas Wunner 2025-05-23 6:23 ` Wilfred Mallawa 2025-05-23 6:39 ` Manivannan Sadhasivam 2025-05-23 14:51 ` Lukas Wunner 2025-05-23 15:30 ` Manivannan Sadhasivam 2025-05-24 12:40 ` Lukas Wunner 2025-05-25 7:59 ` Manivannan Sadhasivam 2025-05-23 7:02 ` Niklas Cassel 2025-05-23 14:59 ` Lukas Wunner 2025-05-23 15:49 ` Manivannan Sadhasivam
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox