From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Dave Jiang <dave.jiang@intel.com>
Cc: <linux-cxl@vger.kernel.org>, <alison.schofield@intel.com>,
<vishal.l.verma@intel.com>, <bwidawsk@kernel.org>,
<dan.j.williams@intel.com>, <shiju.jose@huawei.com>,
<rrichter@amd.com>
Subject: Re: [PATCH RFC v2 0/9] cxl/pci: Add fundamental error handling
Date: Mon, 24 Oct 2022 17:01:02 +0100
Message-ID: <20221024170102.00000c4b@huawei.com>
In-Reply-To: <ae8330db-ab77-7952-e846-de7dc527890c@intel.com>
On Wed, 19 Oct 2022 10:38:13 -0700
Dave Jiang <dave.jiang@intel.com> wrote:
> On 10/19/2022 10:30 AM, Jonathan Cameron wrote:
> > On Tue, 11 Oct 2022 18:19:15 +0100
> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> >
> >> On Tue, 11 Oct 2022 08:18:34 -0700
> >> Dave Jiang <dave.jiang@intel.com> wrote:
> >>
> >>> On 10/11/2022 7:17 AM, Jonathan Cameron wrote:
> >>>> On Fri, 16 Sep 2022 16:10:53 -0700
> >>>> Dave Jiang <dave.jiang@intel.com> wrote:
> >>>>
> >>>>> Series set to RFC since there's no means to test. Would like to get opinion
> >>>>> on whether going with using trace events as reporting mechanism is ok.
> >>>>>
> >>>>> Jonathan,
> >>>>> We currently don't have any ways to test AER events. Do you have any plans
> >>>>> to support AER events via QEMU emulation?
> >>>> Sorry - missed this entirely as gotten a bit behind reading CXL emails.
> > Hi Dave,
> >
> > Quick update.
> >
> > Working QEMU emulation - but needs some/lots of cleanup. Particularly fun was
> > figuring out why I wasn't getting messages past the upstream switch port.
> > Turned out the serial number ECAP was on top of the AER ECAP. Oops - thankfully
> > that patch isn't upstream yet.
> > Also QEMU AER routing seems to be based on some older PCIe spec
> > so needed some tweaks to get the device to actually issue ERR_FATAL etc.
> >
> > Anyhow, should have something you can play with in a day or two.
>
> Awesome! Thanks! :)
Took a little longer than expected...
Anyhow, it's now at
https://gitlab.com/jic23/qemu/-/commits/cxl-2022-10-24
That tree is carrying far too many things right now for it to make much sense
to me to email this to qemu-devel - though I may pull
hw/pci/aer: Add missing routing for AER errors
out in advance as that's closing a difference between QEMU emulation of AER
and what the PCI spec says.
Hopefully the set of out-of-tree patches will start to shrink soon - v9 of the DOE
patches has been on list for a week or so.
Top patch includes a very short 'how to' in the patch description. Basically fire
up QMP: add something like -qmp tcp:localhost:444,server=on,wait=off to your
qemu command line and use commands like:
{ "execute": "qmp_capabilities" }
...
{ "execute": "cxl-inject-uncorrectable-error",
  "arguments": {
      "path": "/machine/peripheral/cxl-pmem0",
      "type": "cache-address-parity",
      "header": [ 3, 4 ]
  } }
...
{ "execute": "cxl-inject-correctable-error",
  "arguments": {
      "path": "/machine/peripheral/cxl-pmem0",
      "type": "physical",
      "header": [ 3, 4 ]
  } }
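
If typing those into a QMP session by hand gets tedious, something along
these lines might serve as a starting point. It is only a rough sketch in
Python, not part of the tree: the helper names (qmp_message, inject, run,
main) are made up for illustration, the port and device path are simply
the ones from the example above, and the cxl-inject-* commands are assumed
to behave as described in the top patch of that branch. It also assumes
QMP is in its default mode of sending one JSON object per line.

```python
import json
import socket


def qmp_message(command, arguments=None):
    """Serialise one QMP command as a newline-terminated JSON line."""
    msg = {"execute": command}
    if arguments is not None:
        msg["arguments"] = arguments
    return (json.dumps(msg) + "\n").encode("utf-8")


def inject(kind, path, error_type, header):
    """Build a cxl-inject-<kind>-error command.

    kind is "uncorrectable" or "correctable"; the command names and the
    argument layout are copied from the example above and are specific
    to that branch (assumption).
    """
    if kind not in ("uncorrectable", "correctable"):
        raise ValueError("unknown error kind: %s" % kind)
    return qmp_message(
        "cxl-inject-%s-error" % kind,
        {"path": path, "type": error_type, "header": header},
    )


def run(commands, host="localhost", port=444):
    """Send pre-built commands to a QMP socket and return the replies."""
    replies = []
    with socket.create_connection((host, port)) as sock:
        stream = sock.makefile("rwb")
        json.loads(stream.readline())  # consume the QMP greeting banner
        for cmd in commands:
            stream.write(cmd)
            stream.flush()
            # Asynchronous events may arrive before the actual reply;
            # skip anything that carries an "event" key.
            while True:
                reply = json.loads(stream.readline())
                if "event" not in reply:
                    break
            replies.append(reply)
    return replies


def main():
    """Replay the sequence from the example above (needs QEMU running)."""
    dev = "/machine/peripheral/cxl-pmem0"
    for reply in run([
        qmp_message("qmp_capabilities"),
        inject("uncorrectable", dev, "cache-address-parity", [3, 4]),
        inject("correctable", dev, "physical", [3, 4]),
    ]):
        print(reply)
```

Calling main() only makes sense with QEMU started with the -qmp option
above; the message builders can be exercised without a guest.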
>
>
> > In meantime an example dump (not writing the header log yet!)
> >
> > pcieport 0000:0c:00.0: AER: Uncorrected (Non-Fatal) error received: 0000:0f:00.0
> > cxl_pci 0000:0f:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
> > cxl_pci 0000:0f:00.0: device [8086:0d93] error status/mask=00004000/00000000
> > cxl_pci 0000:0f:00.0: [14] CmpltTO (First)
> > cxl_ras_uc: mem3: status: 'Cache Data Parity Error' first_error: 'Cache Data Parity Error' header log: {0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0}
> > cxl_pci 0000:0f:00.0: mem3: restart CXL.mem after slot reset
> > cxl_port endpoint6: No CMA mailbox
> > cxl_pci 0000:0f:00.0: mem3: error resume successful
> > pcieport 0000:0e:00.0: AER: device recovery successful
> >
> > Jonathan
>