From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Robert Richter <rrichter@amd.com>,
Terry Bowman <terry.bowman@amd.com>, <alison.schofield@intel.com>,
<vishal.l.verma@intel.com>, <ira.weiny@intel.com>,
<bwidawsk@kernel.org>, <dan.j.williams@intel.com>,
<dave.jiang@intel.com>, <linux-cxl@vger.kernel.org>,
<linux-kernel@vger.kernel.org>, <bhelgaas@google.com>
Subject: Re: [PATCH v4 23/23] PCI/AER: Unmask RCEC internal errors to enable RCH downstream port error handling
Date: Thu, 1 Jun 2023 15:11:34 +0100 [thread overview]
Message-ID: <20230601151134.00006281@Huawei.com> (raw)
In-Reply-To: <ZHEXOlxfCCApI+NE@bhelgaas>
> > > > @@ -1432,6 +1495,7 @@ static int aer_probe(struct pcie_device *dev)
> > > > return status;
> > > > }
> > > >
> > > > + cxl_rch_enable_rcec(port);
> > >
> > > Could this be done by the driver that claims the CXL RCiEP? There's
> > > no point in unmasking the errors before there's a driver with
> > > pci_error_handlers that can do something with them anyway.
> >
> > This sounds reasonable at the first glance. The problem is there could
> > be many devices associated with the RCEC. Not all of them will be
> > bound to a driver and handler at the same time. We would need to
> > refcount it or maintain a list of enabled devices. But there is
> > already something similar by checking dev->driver. But right, AER
> > errors could be seen and handled then at least on PCI level. I tent to
> > permanently enable RCEC AER, but that could cause side-effects. What
> > do you think?
>
> IIUC, this really just affects CXL devices, so I think the choice is
> (1) always unmask internal errors for RCECs where those CXL devices
> report errors (as this patch does), or (2) unmask when first CXL
> driver that can handle the errors is loaded and restore previous state
> when last one is unloaded.
>
> If the RCEC *only* handles errors for CXL devices, i.e., not for a mix
> of vanilla PCIe RCiEPs and CXL RCiEPs, I think I'm OK with (1). I
> think you said only the CXL driver knows how to collect and interpret
> the error data. Is it OK that when no such driver is loaded, we field
> error interrupts silently, without even mentioning that an error
> occurred? I guess without the driver, the device is probably not in
> use.
It might be in use. Firmware may well have set up the CXL device and
even have put the kernel image in that memory for example. OS first RAS
handling won't be up until the driver loads though. Would be a bit
odd to mix OS first handling with firmware setup. I'd expect firmware
first handling in that case, but I don't think anything stops the two
being mixed.
Jonathan
next prev parent reply other threads:[~2023-06-01 14:11 UTC|newest]
Thread overview: 70+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-23 23:21 [PATCH v4 00/23] cxl/pci: Add support for RCH RAS error handling Terry Bowman
2023-05-23 23:21 ` [PATCH v4 01/23] cxl/acpi: Probe RCRB later during RCH downstream port creation Terry Bowman
2023-06-01 10:13 ` Jonathan Cameron
2023-06-02 14:16 ` Robert Richter
2023-05-23 23:21 ` [PATCH v4 02/23] cxl/rch: Prepare for caching the MMIO mapped PCIe AER capability Terry Bowman
2023-06-01 10:38 ` Jonathan Cameron
2023-06-02 14:53 ` Robert Richter
2023-05-23 23:21 ` [PATCH v4 03/23] cxl: Rename member @dport of struct cxl_dport to @dev Terry Bowman
2023-06-01 10:41 ` Jonathan Cameron
2023-05-23 23:21 ` [PATCH v4 04/23] cxl/core/regs: Add @dev to cxl_register_map Terry Bowman
2023-06-01 10:49 ` Jonathan Cameron
2023-06-02 15:11 ` Robert Richter
2023-05-23 23:21 ` [PATCH v4 05/23] cxl/pci: Refactor component register discovery for reuse Terry Bowman
2023-06-01 10:52 ` Jonathan Cameron
2023-05-23 23:21 ` [PATCH v4 06/23] cxl/acpi: Moving add_host_bridge_uport() around Terry Bowman
2023-06-01 10:54 ` Jonathan Cameron
2023-05-23 23:21 ` [PATCH v4 07/23] cxl/acpi: Directly bind the CEDT detected CHBCR to the Host Bridge's port Terry Bowman
2023-06-01 12:45 ` Jonathan Cameron
2023-06-02 15:42 ` Robert Richter
2023-05-23 23:21 ` [PATCH v4 08/23] cxl/regs: Remove early capability checks in Component Register setup Terry Bowman
2023-06-01 12:49 ` Jonathan Cameron
2023-05-23 23:22 ` [PATCH v4 09/23] cxl/pci: Early setup RCH dport component registers from RCRB Terry Bowman
2023-06-01 12:59 ` Jonathan Cameron
2023-06-02 15:45 ` Robert Richter
2023-05-23 23:22 ` [PATCH v4 10/23] cxl/port: Store the port's Component Register mappings in struct cxl_port Terry Bowman
2023-06-01 13:06 ` Jonathan Cameron
2023-06-02 15:58 ` Robert Richter
2023-05-23 23:22 ` [PATCH v4 11/23] cxl/port: Store the downstream port's Component Register mappings in struct cxl_dport Terry Bowman
2023-06-01 13:07 ` Jonathan Cameron
2023-05-23 23:22 ` [PATCH v4 12/23] cxl/pci: Store the endpoint's Component Register mappings in struct cxl_dev_state Terry Bowman
2023-06-01 13:07 ` Jonathan Cameron
2023-05-23 23:22 ` [PATCH v4 13/23] cxl/hdm: Use stored Component Register mappings to map HDM decoder capability Terry Bowman
2023-05-24 1:12 ` kernel test robot
2023-05-24 9:49 ` Robert Richter
2023-05-25 20:23 ` kernel test robot
2023-06-01 13:11 ` Jonathan Cameron
2023-05-23 23:22 ` [PATCH v4 14/23] cxl/port: Remove Component Register base address from struct cxl_port Terry Bowman
2023-06-01 13:27 ` Jonathan Cameron
2023-05-23 23:22 ` [PATCH v4 15/23] cxl/port: Remove Component Register base address from struct cxl_dport Terry Bowman
2023-06-01 13:28 ` Jonathan Cameron
2023-05-23 23:22 ` [PATCH v4 16/23] cxl/pci: Remove Component Register base address from struct cxl_dev_state Terry Bowman
2023-06-01 13:28 ` Jonathan Cameron
2023-05-23 23:22 ` [PATCH v4 17/23] cxl/pci: Add RCH downstream port AER register discovery Terry Bowman
2023-06-01 13:36 ` Jonathan Cameron
2023-06-01 13:38 ` Jonathan Cameron
2023-05-23 23:22 ` [PATCH v4 18/23] PCI/AER: Refactor cper_print_aer() for use by CXL driver module Terry Bowman
2023-05-24 16:55 ` Bjorn Helgaas
2023-05-25 21:38 ` Terry Bowman
2023-05-23 23:22 ` [PATCH v4 19/23] cxl/pci: Update CXL error logging to use RAS register address Terry Bowman
2023-06-01 13:42 ` Jonathan Cameron
2023-05-23 23:22 ` [PATCH v4 20/23] cxl/pci: Prepare for logging RCH downstream port protocol errors Terry Bowman
2023-06-01 13:49 ` Jonathan Cameron
2023-06-01 14:06 ` Terry Bowman
2023-06-01 14:12 ` Jonathan Cameron
2023-05-23 23:22 ` [PATCH v4 21/23] cxl/pci: Add RCH downstream port error logging Terry Bowman
2023-06-01 14:03 ` Jonathan Cameron
2023-05-23 23:22 ` [PATCH v4 22/23] PCI/AER: Forward RCH downstream port-detected errors to the CXL.mem dev handler Terry Bowman
2023-05-24 21:32 ` Bjorn Helgaas
2023-05-25 21:29 ` Robert Richter
2023-05-25 22:01 ` Bjorn Helgaas
2023-05-25 22:28 ` Robert Richter
2023-06-01 14:06 ` Jonathan Cameron
2023-05-23 23:22 ` [PATCH v4 23/23] PCI/AER: Unmask RCEC internal errors to enable RCH downstream port error handling Terry Bowman
2023-05-24 21:45 ` Bjorn Helgaas
2023-05-25 22:08 ` Robert Richter
2023-05-26 20:31 ` Bjorn Helgaas
2023-06-01 14:11 ` Jonathan Cameron [this message]
2023-06-02 16:41 ` Robert Richter
2023-05-23 23:29 ` [PATCH v4 00/23] cxl/pci: Add support for RCH RAS error handling - CHANGELOG Terry Bowman
2023-05-24 1:39 ` [PATCH v4 00/23] cxl/pci: Add support for RCH RAS error handling Terry Bowman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230601151134.00006281@Huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=alison.schofield@intel.com \
--cc=bhelgaas@google.com \
--cc=bwidawsk@kernel.org \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=helgaas@kernel.org \
--cc=ira.weiny@intel.com \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=rrichter@amd.com \
--cc=terry.bowman@amd.com \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox