public inbox for linux-cxl@vger.kernel.org
 help / color / mirror / Atom feed
From: Terry Bowman <Terry.Bowman@amd.com>
To: Dan Williams <dan.j.williams@intel.com>,
	ming4.li@intel.com, linux-cxl@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	dave@stgolabs.net, jonathan.cameron@huawei.com,
	dave.jiang@intel.com, alison.schofield@intel.com,
	vishal.l.verma@intel.com, bhelgaas@google.com,
	mahesh@linux.ibm.com, oohall@gmail.com,
	Benjamin.Cheatham@amd.com, rrichter@amd.com,
	nathan.fontenot@amd.com, smita.koralahallichannabasappa@amd.com
Subject: Re: [PATCH 0/15] Enable CXL PCIe port protocol error handling and logging
Date: Tue, 22 Oct 2024 08:29:43 -0500	[thread overview]
Message-ID: <aaec3243-afb3-4021-a91a-a868595ffa69@amd.com> (raw)
In-Reply-To: <671703521fd1f_2312294e3@dwillia2-xfh.jf.intel.com.notmuch>

Hi Dan,

On 10/21/24 20:43, Dan Williams wrote:
> Terry Bowman wrote:
> [..]
>> Testing:
>>
>> Below are test results for this patchset. This is using Qemu with a root
>> port (0c:00.0), upstream switch port (0d:00.0),and downstream switch port
>> (0e:00.0).
>>
>> This was tested using aer-inject updated to support CE and UCE internal
>> error injection. CXL RAS was set using a test patch (not upstreamed).
> 
> Thanks for these test outputs!
> 
>>
>>     Root port UCE:
>>     root@tbowman-cxl:~/aer-inject# ./root-uce-inject.sh
>>     [   27.318920] pcieport 0000:0c:00.0: aer_inject: Injecting errors 00000000/00400000 into device 0000:0c:00.0
>>     [   27.320164] pcieport 0000:0c:00.0: AER: Uncorrectable (Fatal) error message received from 0000:0c:00.0
>>     [   27.321518] pcieport 0000:0c:00.0: PCIe Bus Error: severity=Uncorrectable (Fatal), type=Transaction Layer, (Receiver ID)
>>     [   27.322483] pcieport 0000:0c:00.0:   device [8086:7075] error status/mask=00400000/02000000
>>     [   27.323243] pcieport 0000:0c:00.0:    [22] UncorrIntErr
>>     [   27.325584] aer_event: 0000:0c:00.0 PCIe Bus Error: severity=Fatal, Uncorrectable Internal Error, TLP Header=Not available
> 
> It strikes that by this point the code knows that it is a "CXL Bus"
> error and no longer a "PCIe Bus" error. Given the divergent  responses
> to Fatal errors based on bus I think it would help to clarify that the
> kernel is panicking due to "CXL Bus", not "PCIe Bus" errors.
> 
>>     [   27.325584]
>>     [   27.327171] cxl_port_aer_uncorrectable_error: device=0000:0c:00.0 host=pci0000:0c status: 'Memory Address Parity Error'
> 
> ...i.e. someone may not notice that this is "cxl" reference in the
> backtrace.

Good idea. I'll add logic to print 'CXL' bus in the case of a CXL erroring device.

Regards,
Terry

      reply	other threads:[~2024-10-22 13:29 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-08 22:16 [PATCH 0/15] Enable CXL PCIe port protocol error handling and logging Terry Bowman
2024-10-08 22:16 ` [PATCH 01/15] cxl/aer/pci: Add CXL PCIe port error handler callbacks in AER service driver Terry Bowman
2024-10-22  1:53   ` Dan Williams
2024-10-22 13:50     ` Terry Bowman
2024-10-22 17:09       ` Dan Williams
2024-10-22 18:40         ` Terry Bowman
2024-10-22 23:43           ` Dan Williams
2024-10-24 15:20             ` Bowman, Terry
2024-10-24 19:10               ` Dan Williams
2024-10-08 22:16 ` [PATCH 02/15] cxl/aer/pci: Update is_internal_error() to be callable w/o CONFIG_PCIEAER_CXL Terry Bowman
2024-10-16 16:11   ` Jonathan Cameron
2024-10-22  2:17   ` Dan Williams
2024-10-22 13:54     ` Terry Bowman
2024-10-08 22:16 ` [PATCH 03/15] cxl/aer/pci: Refactor AER driver's existing interfaces to support CXL PCIe ports Terry Bowman
2024-10-10 19:11   ` Bjorn Helgaas
2024-10-14 17:27     ` Terry Bowman
2024-10-08 22:16 ` [PATCH 04/15] cxl/aer/pci: Add CXL PCIe port correctable error support in AER service driver Terry Bowman
2024-10-16 16:22   ` Jonathan Cameron
2024-10-16 17:18     ` Terry Bowman
2024-10-16 17:29       ` Jonathan Cameron
2024-10-08 22:16 ` [PATCH 05/15] cxl/aer/pci: Update AER driver to read UCE fatal status for all CXL PCIe port devices Terry Bowman
2024-10-16 16:28   ` Jonathan Cameron
2024-10-08 22:16 ` [PATCH 06/15] cxl/aer/pci: Introduce PCI_ERS_RESULT_PANIC to pci_ers_result type Terry Bowman
2024-10-16 16:30   ` Jonathan Cameron
2024-10-16 17:31     ` Terry Bowman
2024-10-17 13:31       ` Jonathan Cameron
2024-10-17 14:50         ` Bowman, Terry
2024-10-08 22:16 ` [PATCH 07/15] cxl/aer/pci: Add CXL PCIe port uncorrectable error recovery in AER service driver Terry Bowman
2024-10-16 16:54   ` Jonathan Cameron
2024-10-16 18:07     ` Terry Bowman
2024-10-17 13:43       ` Jonathan Cameron
2024-10-17 16:21         ` Bowman, Terry
2024-10-17 17:08           ` Jonathan Cameron
2024-10-08 22:16 ` [PATCH 08/15] cxl/pci: Change find_cxl_ports() to be non-static Terry Bowman
2024-10-08 22:16 ` [PATCH 09/15] cxl/pci: Map CXL PCIe downstream port RAS registers Terry Bowman
2024-10-16 17:14   ` Jonathan Cameron
2024-10-16 18:16     ` Terry Bowman
2024-10-17 13:50       ` Jonathan Cameron
2024-10-17 16:26         ` Bowman, Terry
2024-10-08 22:16 ` [PATCH 10/15] cxl/pci: Map CXL PCIe upstream " Terry Bowman
2024-10-08 22:16 ` [PATCH 11/15] cxl/pci: Update RAS handler interfaces to support CXL PCIe ports Terry Bowman
2024-10-08 22:16 ` [PATCH 12/15] cxl/pci: Add error handler for CXL PCIe port RAS errors Terry Bowman
2024-10-17 13:57   ` Jonathan Cameron
2024-10-17 16:42     ` Bowman, Terry
2024-10-08 22:16 ` [PATCH 13/15] cxl/pci: Add trace logging " Terry Bowman
2024-10-17 14:04   ` Jonathan Cameron
2024-10-08 22:16 ` [PATCH 14/15] cxl/aer/pci: Export pci_aer_unmask_internal_errors() Terry Bowman
2024-10-16 17:22   ` Jonathan Cameron
2024-10-08 22:16 ` [PATCH 15/15] cxl/pci: Enable internal CE/UCE interrupts for CXL PCIe port devices Terry Bowman
2024-10-16 17:21   ` Jonathan Cameron
2024-10-16 17:24     ` Terry Bowman
2024-10-10 19:07 ` [PATCH 0/15] Enable CXL PCIe port protocol error handling and logging Bjorn Helgaas
2024-10-14 17:22   ` Terry Bowman
2024-10-14 17:29     ` Bjorn Helgaas
2024-10-14 17:33       ` Terry Bowman
2024-10-17 16:34 ` Fan Ni
2024-10-17 17:27   ` Bowman, Terry
2024-10-21 22:19     ` Fan Ni
2024-10-18 23:22 ` Bjorn Helgaas
2024-10-21 19:22   ` Terry Bowman
2024-10-22  1:43 ` Dan Williams
2024-10-22 13:29   ` Terry Bowman [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aaec3243-afb3-4021-a91a-a868595ffa69@amd.com \
    --to=terry.bowman@amd.com \
    --cc=Benjamin.Cheatham@amd.com \
    --cc=alison.schofield@intel.com \
    --cc=bhelgaas@google.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=dave@stgolabs.net \
    --cc=jonathan.cameron@huawei.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=mahesh@linux.ibm.com \
    --cc=ming4.li@intel.com \
    --cc=nathan.fontenot@amd.com \
    --cc=oohall@gmail.com \
    --cc=rrichter@amd.com \
    --cc=smita.koralahallichannabasappa@amd.com \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox