From: Terry Bowman <Terry.Bowman@amd.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: alison.schofield@intel.com, vishal.l.verma@intel.com,
	ira.weiny@intel.com, bwidawsk@kernel.org,
	dan.j.williams@intel.com, dave.jiang@intel.com,
	Jonathan.Cameron@huawei.com, linux-cxl@vger.kernel.org,
	rrichter@amd.com, linux-kernel@vger.kernel.org,
	bhelgaas@google.com
Subject: Re: [PATCH v2 4/5] cxl/pci: Forward RCH downstream port-detected errors to the CXL.mem dev handler
Date: Tue, 28 Mar 2023 08:41:38 -0500
Message-ID: <c03b8d66-65d9-40fc-dd0a-1b8154e24f6e@amd.com>
In-Reply-To: <20230324223656.GA2660301@bhelgaas>

Hi Bjorn,

On 3/24/23 17:36, Bjorn Helgaas wrote:
> I'd call this a "PCI/AER: ..." patch since that's where all the
> changes are.
> 
> On Thu, Mar 23, 2023 at 04:38:07PM -0500, Terry Bowman wrote:
>> From: Robert Richter <rrichter@amd.com>
>>
>> In RCD mode a CXL device (RCD) is exposed as an RCiEP, but CXL
>> downstream and upstream ports are not enumerated and not visible in
>> the PCIe hierarchy. Protocol and link errors are sent to an RCEC.
> 
> "RCD" isn't a common term in drivers/pci; can you expand it once here?
> 
>> Now, RCH downstream port-detected errors are signaled as internal AER
>> errors (UIE/CIE) with the RCEC's source ID. A CXL handler must then
> 
> Similarly, "UIE" and "CIE" are new to drivers/pci; can you expand them
> before using?  I assume Uncorrectable Internal Error (UIE) and
> Corrected Internal Error (CIE)?  (Annoying that the PCIe spec uses
> "Correctable" in general, but "Corrected" for Internal Errors.)
> 
>> inspect the error status in various CXL registers residing in the
>> dport's component register space (CXL RAS cap) or the dport's RCRB
>> (AER ext cap). [1]
>>
>> This patch connects errors showing up in the RCEC's error handler with
> 
> "Connect errors ..." (we already know this text applies to *this
> patch*).
> 
>> the CXL subsystem. Implement this by forwarding the error to all CXL
>> devices below the RCEC. Since the entire CXL device is controlled only
>> using PCIe Configuration Space of device 0, Function 0, only pass it
>> there [2]. These devices have the Memory Device class code set
>> (PCI_CLASS_MEMORY_CXL, 502h) and the existing cxl_pci driver can
>> implement the handler.
> 
>> The CXL device driver is then responsible to
>> enable error reporting in the RCEC's AER cap
> 
> I don't know exactly what you mean by "error reporting in the RCEC's
> AER cap", but IIUC, for non-Root Port devices, generation of ERR_COR/
> ERR_NONFATAL/ERR_FATAL messages is controlled by the Device Control
> register and should already be enabled by pci_aer_init().
> 
> Maybe you mean setting AER mask/severity specifically for Internal
> Errors?  I'm hoping to get as much of AER management as we can in the
> PCI core and out of drivers, so maybe we need a new PCI interface to
> do that.
> 
> In any event, I assume this sort of configuration would be an
> enumeration-time thing, while *this* patch is a run-time thing, so
> maybe this information belongs with a different patch?
> 
>> (esp. CIE and UIE) and to
>> inspect the dport's CXL registers in addition (CXL RAS cap and AER ext
>> cap).
>>
>> The reason for choosing this implementation is that a CXL RCEC device
>> is bound to the AER port driver, but the driver does not allow it to
>> register a custom specific handler to support CXL. Connecting the RCEC
>> hard-wired with a CXL handler does not work, as the CXL subsystem
>> might not be present all the time. The alternative to add an
>> implementation to the portdrv to allow the registration of a custom
>> RCEC error handler isn't worth doing it as CXL would be its only user.
>> Instead, just check for an CXL RCEC and pass it down to the connected
>> CXL device's error handler.
>>
>> [1] CXL 3.0 spec, 12.2.1.1 RCH Downstream Port-detected Errors
>> [2] CXL 3.0 spec, 8.1.3 PCIe DVSEC for CXL Devices
>>
>> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Robert Richter <rrichter@amd.com>
> 
> Since you're sending this patch (Terry) your Signed-off-by should be
> last.
> 

I'll move my Signed-off-by to the end.

Regards,
Terry
