From: Niklas Cassel <cassel@kernel.org>
To: Manivannan Sadhasivam <mani@kernel.org>
Cc: manivannan.sadhasivam@oss.qualcomm.com,
"Bjorn Helgaas" <bhelgaas@google.com>,
"Mahesh J Salgaonkar" <mahesh@linux.ibm.com>,
"Oliver O'Halloran" <oohall@gmail.com>,
"Will Deacon" <will@kernel.org>,
"Lorenzo Pieralisi" <lpieralisi@kernel.org>,
"Krzysztof Wilczyński" <kwilczynski@kernel.org>,
"Rob Herring" <robh@kernel.org>,
"Heiko Stuebner" <heiko@sntech.de>,
"Philipp Zabel" <p.zabel@pengutronix.de>,
linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
linuxppc-dev@lists.ozlabs.org,
linux-arm-kernel@lists.infradead.org,
linux-arm-msm@vger.kernel.org,
linux-rockchip@lists.infradead.org,
"Wilfred Mallawa" <wilfred.mallawa@wdc.com>,
"Krishna Chaitanya Chundru" <krishna.chundru@oss.qualcomm.com>,
"Lukas Wunner" <lukas@wunner.de>
Subject: Re: [PATCH v6 0/4] PCI: Add support for resetting the Root Ports in a platform specific way
Date: Thu, 4 Sep 2025 16:03:39 +0200
Message-ID: <aLmcO8ukT-CDZMuT@ryzen>
In-Reply-To: <lakgphb7ym3cybwmpdqyipzi4tlkwbfijzhd4r6hvhho3pc7iu@6ludgw6wqkjh>

Hello Mani,

On Fri, Aug 29, 2025 at 09:44:08PM +0530, Manivannan Sadhasivam wrote:
> On Fri, Aug 15, 2025 at 11:07:42AM GMT, Niklas Cassel wrote:

(snip)

> > > > > ## On EP side:
> > > > > # echo 0 > /sys/kernel/config/pci_ep/controllers/a40000000.pcie-ep/start && \
> > > > > sleep 0.1 && echo 1 > /sys/kernel/config/pci_ep/controllers/a40000000.pcie-ep/start
> > > > >
> > > > > Basically all tests timeout
> > > > > # FAILED: 1 / 16 tests passed.
> > > > >
> > > > > Which is the same as before this patch series.
> > >
> > > This is kind of expected since the pci_endpoint_test driver doesn't have the AER
> > > err_handlers defined.
> >
> > I see.
> > Would be nice if we could add them then, so that we can verify that this
> > series is working as intended.
(snip)
> Ok, thanks for the logs. I guess what is happening here is that we are not
> saving/restoring the config space of the devices under the Root Port if linkdown
> happens. TBH, we cannot do that from the PCI core since once linkdown
> happens, we cannot access any devices underneath the Root Port. But if
> err_handlers are available for drivers for all devices, they could do something
> smart like below:
>
> diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
> index c4e5e2c977be..9aabf1fe902e 100644
> --- a/drivers/misc/pci_endpoint_test.c
> +++ b/drivers/misc/pci_endpoint_test.c
> @@ -989,6 +989,8 @@ static int pci_endpoint_test_probe(struct pci_dev *pdev,
>
>  	pci_set_drvdata(pdev, test);
>
> +	pci_save_state(pdev);
> +
>  	id = ida_alloc(&pci_endpoint_test_ida, GFP_KERNEL);
>  	if (id < 0) {
>  		ret = id;
> @@ -1140,12 +1142,31 @@ static const struct pci_device_id pci_endpoint_test_tbl[] = {
>  };
>  MODULE_DEVICE_TABLE(pci, pci_endpoint_test_tbl);
>
> +static pci_ers_result_t pci_endpoint_test_error_detected(struct pci_dev *pdev,
> +							  pci_channel_state_t state)
> +{
> +	return PCI_ERS_RESULT_NEED_RESET;
> +}
> +
> +static pci_ers_result_t pci_endpoint_test_slot_reset(struct pci_dev *pdev)
> +{
> +	pci_restore_state(pdev);
> +
> +	return PCI_ERS_RESULT_RECOVERED;
> +}
> +
> +static const struct pci_error_handlers pci_endpoint_test_err_handler = {
> +	.error_detected = pci_endpoint_test_error_detected,
> +	.slot_reset = pci_endpoint_test_slot_reset,
> +};
> +
>  static struct pci_driver pci_endpoint_test_driver = {
>  	.name = DRV_MODULE_NAME,
>  	.id_table = pci_endpoint_test_tbl,
>  	.probe = pci_endpoint_test_probe,
>  	.remove = pci_endpoint_test_remove,
>  	.sriov_configure = pci_sriov_configure_simple,
> +	.err_handler = &pci_endpoint_test_err_handler,
>  };
>  module_pci_driver(pci_endpoint_test_driver);
>
> This essentially saves the good known config space during probe and restores it
> during the slot_reset callback. Ofc, the state would've been overwritten if
> suspend/resume happens in-between, but the point I'm making is that unless all
> device drivers restore their known config space, devices cannot be resumed
> properly post linkdown recovery.
>
> I can add a patch based on the above diff in next revision if that helps. Right
> now, I do not have access to my endpoint test setup. So can't test anything.
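
To make the idea in the quoted diff concrete, here is a minimal userspace
sketch of the save-at-probe / restore-in-slot_reset pattern. It is a toy
model, not kernel code: every toy_* name is hypothetical, the "config space"
is a four-word array, and the reset is modelled as simply zeroing it.

```c
/* Userspace toy model, NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_CFG_WORDS 4

enum toy_ers_result { TOY_NEED_RESET, TOY_RECOVERED };

struct toy_dev {
	uint32_t cfg[TOY_CFG_WORDS];	/* live "config space" */
	uint32_t saved[TOY_CFG_WORDS];	/* snapshot taken by toy_save_state() */
	bool have_saved;
};

/* Stand-in for pci_save_state(): copy the live registers. */
static void toy_save_state(struct toy_dev *d)
{
	memcpy(d->saved, d->cfg, sizeof(d->cfg));
	d->have_saved = true;
}

/* Stand-in for pci_restore_state(): write the snapshot back. */
static void toy_restore_state(struct toy_dev *d)
{
	assert(d->have_saved);
	memcpy(d->cfg, d->saved, sizeof(d->cfg));
}

/* Probe programs the device, then snapshots the known-good state. */
static void toy_probe(struct toy_dev *d)
{
	d->cfg[0] = 0x6;	/* e.g. memory space + bus master enable */
	d->cfg[1] = 0xa0000000;	/* e.g. a BAR address */
	toy_save_state(d);
}

static enum toy_ers_result toy_error_detected(struct toy_dev *d)
{
	(void)d;
	return TOY_NEED_RESET;
}

static enum toy_ers_result toy_slot_reset(struct toy_dev *d)
{
	toy_restore_state(d);
	return TOY_RECOVERED;
}

/* Link-down recovery as modelled here: the reset wipes the registers. */
static enum toy_ers_result toy_recover(struct toy_dev *d, bool has_handlers)
{
	enum toy_ers_result res = TOY_NEED_RESET;

	if (has_handlers)
		res = toy_error_detected(d);

	memset(d->cfg, 0, sizeof(d->cfg));	/* programmed state is lost */

	if (has_handlers && res == TOY_NEED_RESET)
		res = toy_slot_reset(d);

	return res;
}
```

In this model, toy_probe() followed by toy_recover(&d, true) leaves cfg as it
was after probe, while toy_recover(&d, false) leaves the registers zeroed,
which corresponds to a driver without err_handlers.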

I tested your patch series + your suggested change above, and after running:

## On EP side:
# echo 0 > /sys/kernel/config/pci_ep/controllers/a40000000.pcie-ep/start && \
sleep 0.1 && echo 1 > /sys/kernel/config/pci_ep/controllers/a40000000.pcie-ep/start

Instead of:
# FAILED: 1 / 16 tests passed.

I now get:
# FAILED: 7 / 16 tests passed.

Test cases 1-7 now pass (the test cases related to BARs),
all other test cases still fail:

# /pcitest
TAP version 13
1..16
# Starting 16 tests from 9 test cases.
# RUN pci_ep_bar.BAR0.BAR_TEST ...
# OK pci_ep_bar.BAR0.BAR_TEST
ok 1 pci_ep_bar.BAR0.BAR_TEST
# RUN pci_ep_bar.BAR1.BAR_TEST ...
# OK pci_ep_bar.BAR1.BAR_TEST
ok 2 pci_ep_bar.BAR1.BAR_TEST
# RUN pci_ep_bar.BAR2.BAR_TEST ...
# OK pci_ep_bar.BAR2.BAR_TEST
ok 3 pci_ep_bar.BAR2.BAR_TEST
# RUN pci_ep_bar.BAR3.BAR_TEST ...
# OK pci_ep_bar.BAR3.BAR_TEST
ok 4 pci_ep_bar.BAR3.BAR_TEST
# RUN pci_ep_bar.BAR4.BAR_TEST ...
# SKIP BAR is disabled
# OK pci_ep_bar.BAR4.BAR_TEST
ok 5 pci_ep_bar.BAR4.BAR_TEST # SKIP BAR is disabled
# RUN pci_ep_bar.BAR5.BAR_TEST ...
# OK pci_ep_bar.BAR5.BAR_TEST
ok 6 pci_ep_bar.BAR5.BAR_TEST
# RUN pci_ep_basic.CONSECUTIVE_BAR_TEST ...
# OK pci_ep_basic.CONSECUTIVE_BAR_TEST
ok 7 pci_ep_basic.CONSECUTIVE_BAR_TEST
# RUN pci_ep_basic.LEGACY_IRQ_TEST ...
# pci_endpoint_test.c:106:LEGACY_IRQ_TEST:Expected 0 (0) == ret (-110)
# pci_endpoint_test.c:106:LEGACY_IRQ_TEST:Test failed for Legacy IRQ
# LEGACY_IRQ_TEST: Test failed
# FAIL pci_ep_basic.LEGACY_IRQ_TEST
not ok 8 pci_ep_basic.LEGACY_IRQ_TEST
# RUN pci_ep_basic.MSI_TEST ...
# pci_endpoint_test.c:118:MSI_TEST:Expected 0 (0) == ret (-110)
# pci_endpoint_test.c:118:MSI_TEST:Test failed for MSI1
# pci_endpoint_test.c:118:MSI_TEST:Expected 0 (0) == ret (-110)
# pci_endpoint_test.c:118:MSI_TEST:Test failed for MSI2
# pci_endpoint_test.c:118:MSI_TEST:Expected 0 (0) == ret (-110)
# pci_endpoint_test.c:118:MSI_TEST:Test failed for MSI3
...
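
A side note on the log above: with the generic Linux errno numbering (an
assumption here; it holds for arm64 and x86, among others), 110 is ETIMEDOUT,
so ret (-110) is consistent with the tests timing out. A small sketch of how
such a return value could be classified; the helper name is hypothetical:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper: true if a negative-errno return value is a timeout. */
static bool ret_is_timeout(int ret)
{
	return ret == -ETIMEDOUT;
}
```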

I think I know the reason: you save the state before the IRQs have been allocated.
Perhaps we need to save the state after enabling IRQs?

I tried this patch on top of your patch:

--- a/drivers/misc/pci_endpoint_test.c
+++ b/drivers/misc/pci_endpoint_test.c
@@ -851,6 +851,8 @@ static int pci_endpoint_test_set_irq(struct pci_endpoint_test *test,
 		return ret;
 	}
+	pci_save_state(pdev);
+
 	return 0;
 }

But still:
# FAILED: 7 / 16 tests passed.

So... apparently that did not help...

I tried with the following change as well (on top of my patch above):

+static pci_ers_result_t pci_endpoint_test_slot_reset(struct pci_dev *pdev)
+{
+	struct pci_endpoint_test *test = pci_get_drvdata(pdev);
+	int irq_type = test->irq_type;
+
+	pci_restore_state(pdev);
+
+	if (irq_type != PCITEST_IRQ_TYPE_UNDEFINED) {
+		pci_endpoint_test_clear_irq(test);
+		pci_endpoint_test_set_irq(test, irq_type);
+	}
+
+	return PCI_ERS_RESULT_RECOVERED;
+}

But still only:
# FAILED: 7 / 16 tests passed.

Do you have any suggestions?


Kind regards,
Niklas