From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: "Bowman, Terry" <terry.bowman@amd.com>
Cc: <linux-cxl@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
<linux-pci@vger.kernel.org>, <nifan.cxl@gmail.com>,
<dave@stgolabs.net>, <dave.jiang@intel.com>,
<alison.schofield@intel.com>, <vishal.l.verma@intel.com>,
<dan.j.williams@intel.com>, <bhelgaas@google.com>,
<mahesh@linux.ibm.com>, <ira.weiny@intel.com>, <oohall@gmail.com>,
<Benjamin.Cheatham@amd.com>, <rrichter@amd.com>,
<nathan.fontenot@amd.com>,
<Smita.KoralahalliChannabasappa@amd.com>, <lukas@wunner.de>,
<PradeepVineshReddy.Kodamati@amd.com>,
Li Ming <ming.li@zohomail.com>
Subject: Re: [PATCH v4 14/15] cxl/pci: Add support to assign and clear pci_driver::cxl_err_handlers
Date: Tue, 7 Jan 2025 11:32:40 +0000
Message-ID: <20250107113240.00003eda@huawei.com>
In-Reply-To: <0d552424-150e-4b92-8326-0fe6387e0ce6@amd.com>
On Thu, 26 Dec 2024 11:07:13 -0600
"Bowman, Terry" <terry.bowman@amd.com> wrote:
> On 12/24/2024 12:50 PM, Jonathan Cameron wrote:
> > On Wed, 11 Dec 2024 17:40:01 -0600
> > Terry Bowman <terry.bowman@amd.com> wrote:
> >
> >> pci_driver::cxl_err_handlers are not currently assigned handler callbacks.
> >> The handlers can't be set in the pci_driver static definition because the
> >> CXL PCIe Port devices are bound to the portdrv driver which is not CXL
> >> driver aware.
> >>
> >> Add cxl_assign_port_error_handlers() in the cxl_core module. This
> >> function will assign the default handlers for a CXL PCIe Port device.
> >>
> >> When the CXL Port (cxl_port or cxl_dport) is destroyed the device's
> >> pci_driver::cxl_err_handlers must be set to NULL indicating they should no
> >> longer be used.
> >>
> >> Create cxl_clear_port_error_handlers() and register it to be called
> >> when the CXL Port device (cxl_port or cxl_dport) is destroyed.
> >>
> >> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> >> ---
> >> drivers/cxl/core/pci.c | 40 ++++++++++++++++++++++++++++++++++++++++
> >> 1 file changed, 40 insertions(+)
> >>
> >> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> >> index 3294ad5ff28f..9734a4c55b29 100644
> >> --- a/drivers/cxl/core/pci.c
> >> +++ b/drivers/cxl/core/pci.c
> >> @@ -841,8 +841,38 @@ static bool cxl_port_error_detected(struct pci_dev *pdev)
> >>  	return __cxl_handle_ras(&pdev->dev, ras_base);
> >>  }
> >>
> >> +static const struct cxl_error_handlers cxl_port_error_handlers = {
> >> +	.error_detected = cxl_port_error_detected,
> >> +	.cor_error_detected = cxl_port_cor_error_detected,
> >> +};
> >> +
> >> +static void cxl_assign_port_error_handlers(struct pci_dev *pdev)
> >> +{
> >> +	struct pci_driver *pdrv;
> >> +
> >> +	if (!pdev || !pdev->driver)
> >> +		return;
> >> +
> >> +	pdrv = pdev->driver;
> > What stops a race here? It's fiddly to remove that driver but
> > it can be done. At least I think we are messing with the portdrv
> > but this is such a fiddly stack I'm not 100% sure.
> >
> >> +	pdrv->cxl_err_handler = &cxl_port_error_handlers;
> >> +}
> >> +
> >> +static void cxl_clear_port_error_handlers(void *data)
> >> +{
> >> +	struct pci_dev *pdev = data;
> >> +	struct pci_driver *pdrv;
> >> +
> >> +	if (!pdev || !pdev->driver)
> >> +		return;
> >> +
> >> +	pdrv = pdev->driver;
> > Likewise. Smells like a possible race.
> >
> >> +	pdrv->cxl_err_handler = NULL;
> >> +}
> >> +
>
> I can add a get_device()/put_device() for both cxl_clear_port_error_handlers()
> and cxl_assign_port_error_handlers() to prevent operating on a recently
> destroyed pci_dev. Is that sufficient?
>
> Regards,
> Terry

Probably (by which I mean I think it is, but haven't checked in detail)
Jonathan
> >> void cxl_uport_init_ras_reporting(struct cxl_port *port)
> >> {
> >> +	struct pci_dev *pdev = to_pci_dev(port->uport_dev);
> >> +
> >>  	/* uport may have more than 1 downstream EP. Check if already mapped. */
> >>  	if (port->uport_regs.ras)
> >>  		return;
> >> @@ -853,6 +883,9 @@ void cxl_uport_init_ras_reporting(struct cxl_port *port)
> >>  		dev_err(&port->dev, "Failed to map RAS capability.\n");
> >>  		return;
> >>  	}
> >> +
> >> +	cxl_assign_port_error_handlers(pdev);
> >> +	devm_add_action_or_reset(port->uport_dev, cxl_clear_port_error_handlers, pdev);
> >> }
> >> EXPORT_SYMBOL_NS_GPL(cxl_uport_init_ras_reporting, CXL);
> >>
> >> @@ -864,6 +897,7 @@ void cxl_dport_init_ras_reporting(struct cxl_dport *dport)
> >> {
> >>  	struct device *dport_dev = dport->dport_dev;
> >>  	struct pci_host_bridge *host_bridge = to_pci_host_bridge(dport_dev);
> >> +	struct pci_dev *pdev = to_pci_dev(dport_dev);
> >>
> >>  	dport->reg_map.host = dport_dev;
> >>  	if (dport->rch && host_bridge->native_aer) {
> >> @@ -880,6 +914,12 @@ void cxl_dport_init_ras_reporting(struct cxl_dport *dport)
> >>  		dev_err(dport_dev, "Failed to map RAS capability.\n");
> >>  		return;
> >>  	}
> >> +
> >> +	if (dport->rch)
> >> +		return;
> >> +
> >> +	cxl_assign_port_error_handlers(pdev);
> >> +	devm_add_action_or_reset(dport_dev, cxl_clear_port_error_handlers, pdev);
> >> }
> >> EXPORT_SYMBOL_NS_GPL(cxl_dport_init_ras_reporting, CXL);
> >>
>
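
For reference, the get_device()/put_device() idea proposed above can be modelled in userspace roughly as below. This is only a sketch under stated assumptions, not the patch author's implementation: the struct layouts are cut-down stand-ins for the kernel types, and model_get()/model_put() are hypothetical helpers standing in for get_device()/put_device(). It shows only the shape of holding a reference around the handler update. Whether that is enough to close the race was left open in the thread ("Probably").

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct cxl_error_handlers { int placeholder; };

struct pci_driver {
	const struct cxl_error_handlers *cxl_err_handler;
};

struct pci_dev {
	int refcount;			/* models the struct device refcount */
	struct pci_driver *driver;	/* NULL when no driver is bound */
};

static const struct cxl_error_handlers cxl_port_error_handlers = { 0 };

/* Hypothetical helper standing in for get_device(): take a reference. */
static struct pci_dev *model_get(struct pci_dev *pdev)
{
	if (pdev)
		pdev->refcount++;
	return pdev;
}

/* Hypothetical helper standing in for put_device(): drop a reference. */
static void model_put(struct pci_dev *pdev)
{
	if (!pdev)
		return;
	assert(pdev->refcount > 0);
	pdev->refcount--;
}

static void cxl_assign_port_error_handlers(struct pci_dev *pdev)
{
	struct pci_driver *pdrv;

	/* Hold a reference so the pci_dev cannot go away while we use it. */
	if (!model_get(pdev))
		return;

	/*
	 * Assumption to note: a reference keeps the object alive, but by
	 * itself it does not stop the driver pointer from changing.
	 */
	pdrv = pdev->driver;
	if (pdrv)
		pdrv->cxl_err_handler = &cxl_port_error_handlers;

	model_put(pdev);
}

static void cxl_clear_port_error_handlers(void *data)
{
	struct pci_dev *pdev = data;
	struct pci_driver *pdrv;

	if (!model_get(pdev))
		return;

	pdrv = pdev->driver;
	if (pdrv)
		pdrv->cxl_err_handler = NULL;

	model_put(pdev);
}
```

In this model every path that takes a reference drops it again before returning, so the count is unchanged after either call, and an unbound driver (driver == NULL) is a no-op.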
Thread overview: 45+ messages
2024-12-11 23:39 [PATCH v4 0/15] Enable CXL PCIe Port protocol error handling and logging Terry Bowman
2024-12-11 23:39 ` [PATCH v4 01/15] PCI/AER: Introduce 'struct cxl_err_handlers' and add to 'struct pci_driver' Terry Bowman
2024-12-11 23:39 ` [PATCH v4 02/15] PCI/AER: Rename AER driver's interfaces to also indicate CXL PCIe Port support Terry Bowman
2024-12-11 23:39 ` [PATCH v4 03/15] cxl/pci: Introduce PCIe helper functions pcie_is_cxl() and pcie_is_cxl_port() Terry Bowman
2024-12-11 23:39 ` [PATCH v4 04/15] PCI/AER: Modify AER driver logging to report CXL or PCIe bus error type Terry Bowman
2024-12-12 1:34 ` Li Ming
2024-12-12 19:59 ` Bowman, Terry
2024-12-14 13:34 ` Li Ming
2024-12-11 23:39 ` [PATCH v4 05/15] PCI/AER: Add CXL PCIe Port correctable error support in AER service driver Terry Bowman
2024-12-11 23:39 ` [PATCH v4 06/15] PCI/AER: Change AER driver to read UCE fatal status for all CXL PCIe Port devices Terry Bowman
2024-12-24 18:28 ` Jonathan Cameron
2024-12-11 23:39 ` [PATCH v4 07/15] PCI/AER: Add CXL PCIe Port Uncorrectable Error recovery in AER service driver Terry Bowman
2024-12-12 9:28 ` Alejandro Lucero Palau
2024-12-13 15:07 ` Bowman, Terry
2024-12-24 18:31 ` Jonathan Cameron
2024-12-11 23:39 ` [PATCH v4 08/15] cxl/pci: Map CXL PCIe Root Port and Downstream Switch Port RAS registers Terry Bowman
2024-12-12 10:36 ` Alejandro Lucero Palau
2024-12-13 15:10 ` Bowman, Terry
2024-12-24 18:38 ` Jonathan Cameron
2024-12-11 23:39 ` [PATCH v4 09/15] cxl/pci: Map CXL PCIe Upstream " Terry Bowman
2024-12-24 18:41 ` Jonathan Cameron
2024-12-11 23:39 ` [PATCH v4 10/15] cxl/pci: Update RAS handler interfaces to also support CXL PCIe Ports Terry Bowman
2024-12-12 10:38 ` Alejandro Lucero Palau
2024-12-24 18:42 ` Jonathan Cameron
2024-12-11 23:39 ` [PATCH v4 11/15] cxl/pci: Change find_cxl_port() to non-static Terry Bowman
2024-12-11 23:39 ` [PATCH v4 12/15] cxl/pci: Add error handler for CXL PCIe Port RAS errors Terry Bowman
2024-12-12 2:19 ` Li Ming
2024-12-24 18:43 ` Jonathan Cameron
2024-12-11 23:40 ` [PATCH v4 13/15] cxl/pci: Add trace logging " Terry Bowman
2024-12-12 9:46 ` Alejandro Lucero Palau
2024-12-24 18:46 ` Jonathan Cameron
2024-12-26 17:01 ` Bowman, Terry
2024-12-11 23:40 ` [PATCH v4 14/15] cxl/pci: Add support to assign and clear pci_driver::cxl_err_handlers Terry Bowman
2024-12-12 2:31 ` Li Ming
2024-12-17 14:39 ` Bowman, Terry
2024-12-24 18:50 ` Jonathan Cameron
2024-12-26 17:07 ` Bowman, Terry
2025-01-07 11:32 ` Jonathan Cameron [this message]
2024-12-11 23:40 ` [PATCH v4 15/15] PCI/AER: Enable internal errors for CXL Upstream and Downstream Switch Ports Terry Bowman
2024-12-12 9:44 ` Alejandro Lucero Palau
2024-12-12 10:44 ` Alejandro Lucero Palau
2024-12-13 15:22 ` Bowman, Terry
2024-12-13 15:34 ` Bowman, Terry
2024-12-24 18:53 ` Jonathan Cameron
2024-12-26 17:19 ` Bowman, Terry