From: Li Ming <ming4.li@intel.com>
To: linux-cxl@vger.kernel.org
Cc: dan.j.williams@intel.com, terry.bowman@amd.com, rrichter@amd.com,
Jonathan.Cameron@huawei.com, dave.jiang@intel.com,
Li Ming <ming4.li@intel.com>
Subject: [PATCH 1/1] cxl/pci: Skip to handle RAS errors if CXL.mem device is detached
Date: Thu, 25 Jan 2024 08:14:14 +0000 [thread overview]
Message-ID: <20240125081414.2189572-1-ming4.li@intel.com> (raw)
CXL.mem protocol errors are logged in CXL RAS capability, if CXL.mem
device is unbound from CXL.mem driver, will not expect any CXL.mem
protocol errors happen on the endpoint or the dport connected to the
endpoint. Giving up these unexpected errors to avoid error handler to
access unmapped RCH dport's RAS capability. The error handler of CXL PCI
device helps to handle RAS errors happened on RCH dport. The host of the
RCH dport's RAS capability mapping is CXL.mem device, so the error
handler will access unmapped RCH dport's RAS capability after CXL.mem
device is unbound from the CXL.mem driver.
Fixes: 6ac07883dbb5 ("cxl/pci: Add RCH downstream port error logging")
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Li Ming <ming4.li@intel.com>
---
drivers/cxl/core/pci.c | 43 ++++++++++++++++++++++++++++++------------
1 file changed, 31 insertions(+), 12 deletions(-)
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 6c9c8d92f8f7..480489f5644e 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -932,11 +932,21 @@ static void cxl_handle_rdport_errors(struct cxl_dev_state *cxlds) { }
void cxl_cor_error_detected(struct pci_dev *pdev)
{
struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
+ struct device *dev = &cxlds->cxlmd->dev;
+
+ scoped_guard(device, dev) {
+ if (!dev->driver) {
+ dev_warn(&pdev->dev,
+ "%s: memdev disabled, abort error handling\n",
+ dev_name(dev));
+ return;
+ }
- if (cxlds->rcd)
- cxl_handle_rdport_errors(cxlds);
+ if (cxlds->rcd)
+ cxl_handle_rdport_errors(cxlds);
- cxl_handle_endpoint_cor_ras(cxlds);
+ cxl_handle_endpoint_cor_ras(cxlds);
+ }
}
EXPORT_SYMBOL_NS_GPL(cxl_cor_error_detected, CXL);
@@ -948,16 +958,25 @@ pci_ers_result_t cxl_error_detected(struct pci_dev *pdev,
struct device *dev = &cxlmd->dev;
bool ue;
- if (cxlds->rcd)
- cxl_handle_rdport_errors(cxlds);
+ scoped_guard(device, dev) {
+ if (!dev->driver) {
+ dev_warn(&pdev->dev,
+ "%s: memdev disabled, abort error handling\n",
+ dev_name(dev));
+ return PCI_ERS_RESULT_DISCONNECT;
+ }
+
+ if (cxlds->rcd)
+ cxl_handle_rdport_errors(cxlds);
+ /*
+ * A frozen channel indicates an impending reset which is fatal to
+ * CXL.mem operation, and will likely crash the system. On the off
+ * chance the situation is recoverable dump the status of the RAS
+ * capability registers and bounce the active state of the memdev.
+ */
+ ue = cxl_handle_endpoint_ras(cxlds);
+ }
- /*
- * A frozen channel indicates an impending reset which is fatal to
- * CXL.mem operation, and will likely crash the system. On the off
- * chance the situation is recoverable dump the status of the RAS
- * capability registers and bounce the active state of the memdev.
- */
- ue = cxl_handle_endpoint_ras(cxlds);
switch (state) {
case pci_channel_io_normal:
--
2.40.1
next reply other threads:[~2024-01-25 8:40 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-25 8:14 Li Ming [this message]
2024-01-26 6:37 ` [PATCH 1/1] cxl/pci: Skip to handle RAS errors if CXL.mem device is detached Dan Williams
2024-01-26 14:04 ` Bowman, Terry
2024-01-27 3:05 ` Dan Williams
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240125081414.2189572-1-ming4.li@intel.com \
--to=ming4.li@intel.com \
--cc=Jonathan.Cameron@huawei.com \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=linux-cxl@vger.kernel.org \
--cc=rrichter@amd.com \
--cc=terry.bowman@amd.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox