From: Bjorn Helgaas <helgaas@kernel.org>
To: Karolina Stolarek <karolina.stolarek@oracle.com>
Cc: linux-pci@vger.kernel.org, Bjorn Helgaas <bhelgaas@google.com>,
Jon Pan-Doh <pandoh@google.com>,
Terry Bowman <terry.bowman@amd.com>, Len Brown <lenb@kernel.org>,
James Morse <james.morse@arm.com>,
Tony Luck <tony.luck@intel.com>, Borislav Petkov <bp@alien8.de>,
Ben Cheatham <Benjamin.Cheatham@amd.com>,
Ira Weiny <ira.weiny@intel.com>,
Shuai Xue <xueshuai@linux.alibaba.com>,
Liu Xinpeng <liuxp11@chinatelecom.cn>,
Darren Hart <darren@os.amperecomputing.com>,
Dan Williams <dan.j.williams@intel.com>
Subject: Re: [PATCH] PCI/AER: Consolidate CXL and native AER reporting paths
Date: Thu, 20 Mar 2025 13:17:36 -0500
Message-ID: <20250320181736.GA1091349@bhelgaas>
In-Reply-To: <b919c39e-bb0f-40f6-84c9-f712404c0ac0@oracle.com>
On Thu, Mar 20, 2025 at 04:14:04PM +0100, Karolina Stolarek wrote:
> On 19/03/2025 23:21, Bjorn Helgaas wrote:
> > On Mon, Mar 17, 2025 at 10:14:46AM +0000, Karolina Stolarek wrote:
> > > Make CXL devices use aer_print_error() when reporting AER errors.
> > > Add a helper function to populate aer_err_info struct before logging
> > > an error. Move struct aer_err_info definition to the aer.h header
> > > to make it visible to CXL.
> >
> > Previously, pci_print_aer() was used by both CXL (via
> > cxl_handle_rdport_errors()) and by ACPI GHES via aer_recover_queue()
> > and aer_recover_work_func(), right?
> >
> > And after this patch, they would use aer_print_error() like native
> > AER, native DPC, and the ACPI EDR DPC path?
>
> That is correct.
>
> > I think this consolidation is a good thing, because I don't think we
> > should log errors differently just because we learned about them via a
> > different path.
> >
> > But I think this also changes the text we put in dmesg, which is
> > potentially disruptive to users and scripts that consume it, so I
> > think we should include a comparison of the previous and new text in
> > the commit log.
>
> Like I said in a comment to the patch, I tested CXL error reporting in QEMU
> with and without my patch, and the output is the same:
>
> pcieport 0000:0c:00.0: aer_inject: Injecting errors 00004000/00000000 into device 0000:0c:00.0
> pcieport 0000:0c:00.0: AER: Correctable error message received from 0000:0c:00.0
> pcieport 0000:0c:00.0: CXL Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
> pcieport 0000:0c:00.0: device [8086:7075] error status/mask=00004000/0000a000
> pcieport 0000:0c:00.0: [14] CorrIntErr

Maybe there's CXL magic that I missed.  It looks like Terry's series
changes some of this path.  And GHES also currently uses
pci_print_aer().  Some sample logs at [1,2].

Looking at v6.14-rc1, only aer_print_error() logs the "error status"
string, and only pci_print_aer() logs "aer_status", "aer_layer", etc.

The previous path is:

  pci_print_aer
    pci_err("aer_status: 0x%08x, aer_mask: 0x%08x\n")    <--
    __aer_print_error
    pci_err("aer_layer=%s, aer_agent=%s\n")              <--
    pcie_print_tlp_log

New path is:

  aer_print_error
    pci_printk("PCIe Bus Error: severity=%s, type=%s, (%s)\n")
    pci_printk(" device [%04x:%04x] error status/mask=%08x/%08x\n")
    __aer_print_error
    pcie_print_tlp_log

So I expected that the lines I marked in pci_print_aer() would be
different.
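
To make the expected difference concrete, here is a minimal Python sketch that renders the marked lines from both paths. The format strings are copied from the call paths above; the values (status/mask 00004000/0000a000, device 8086:7075, "Transaction Layer", "Receiver ID") are borrowed from the QEMU log quoted earlier. The helper names old_lines() and new_lines() are hypothetical, the per-device prefix that pci_err()/pci_printk() add is omitted, and the assumption that the old path would use the same layer and agent strings is mine, not something verified on hardware.

```python
# Format strings as quoted in the call paths above (trailing "\n" dropped).
OLD_FORMATS = (
    "aer_status: 0x%08x, aer_mask: 0x%08x",
    "aer_layer=%s, aer_agent=%s",
)
NEW_FORMATS = (
    "PCIe Bus Error: severity=%s, type=%s, (%s)",
    " device [%04x:%04x] error status/mask=%08x/%08x",
)


def old_lines(status, mask, layer, agent):
    """Render the two lines marked '<--' in the pci_print_aer() path."""
    return [
        OLD_FORMATS[0] % (status, mask),
        OLD_FORMATS[1] % (layer, agent),
    ]


def new_lines(severity, layer, agent, vendor, device, status, mask):
    """Render the two header lines from the aer_print_error() path."""
    return [
        NEW_FORMATS[0] % (severity, layer, agent),
        NEW_FORMATS[1] % (vendor, device, status, mask),
    ]


if __name__ == "__main__":
    # Sample values taken from the QEMU log quoted in this thread.
    print("previous path:")
    for line in old_lines(0x4000, 0xA000, "Transaction Layer", "Receiver ID"):
        print("  " + line)
    print("new path:")
    for line in new_lines("Correctable", "Transaction Layer", "Receiver ID",
                          0x8086, 0x7075, 0x4000, 0xA000):
        print("  " + line)
```

A script that matches on "aer_status:" or "aer_layer=" would find nothing in the new-path output, which is the kind of change worth showing in the commit log.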
Bjorn
[1] https://lore.kernel.org/lkml/2149597.8uJZFlvqrj@xrated/T/
[2] https://lore.kernel.org/all/e8a58616-aeae-ad78-d496-6dfcef4ddcaa@arm.com/T/