From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Dave Jiang <dave.jiang@intel.com>
Cc: <linux-cxl@vger.kernel.org>, <alison.schofield@intel.com>,
<vishal.l.verma@intel.com>, <bwidawsk@kernel.org>,
<dan.j.williams@intel.com>, <shiju.jose@huawei.com>,
<rrichter@amd.com>
Subject: Re: [PATCH RFC v2 0/9] cxl/pci: Add fundamental error handling
Date: Thu, 17 Nov 2022 13:50:26 +0000 [thread overview]
Message-ID: <20221117135026.00006422@Huawei.com> (raw)
In-Reply-To: <eb64ef4f-6680-8983-81cc-06c470422c50@intel.com>
On Wed, 16 Nov 2022 16:20:20 -0700
Dave Jiang <dave.jiang@intel.com> wrote:
> On 11/3/2022 6:27 AM, Jonathan Cameron wrote:
> > On Thu, 3 Nov 2022 12:58:51 +0000
> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> >
> >> On Mon, 24 Oct 2022 17:01:02 +0100
> >> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> >>
> >>> On Wed, 19 Oct 2022 10:38:13 -0700
> >>> Dave Jiang <dave.jiang@intel.com> wrote:
> >>>
> >>>> On 10/19/2022 10:30 AM, Jonathan Cameron wrote:
> >>>>> On Tue, 11 Oct 2022 18:19:15 +0100
> >>>>> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> >>>>>
> >>>>>> On Tue, 11 Oct 2022 08:18:34 -0700
> >>>>>> Dave Jiang <dave.jiang@intel.com> wrote:
> >>>>>>
> >>>>>>> On 10/11/2022 7:17 AM, Jonathan Cameron wrote:
> >>>>>>>> On Fri, 16 Sep 2022 16:10:53 -0700
> >>>>>>>> Dave Jiang <dave.jiang@intel.com> wrote:
> >>>>>>>>
> >>>>>>>>> Series set to RFC since there's no means to test. Would like to get opinion
> >>>>>>>>> on whether going with using trace events as reporting mechanism is ok.
> >>>>>>>>>
> >>>>>>>>> Jonathan,
> >>>>>>>>> We currently don't have any ways to test AER events. Do you have any plans
> >>>>>>>>> to support AER events via QEMU emulation?
> >>>>>>>> Sorry - missed this entirely as gotten a bit behind reading CXL emails.
> >>>>> Hi Dave,
> >>>>>
> >>>>> Quick update.
> >>>>>
> >>>>> Working QEMU emulation - but needs some/lots of cleanup. Particularly fun was
> >>>>> figuring out why I wasn't getting messages past the upstream switch port.
> >>>>> Turned out the serial number ECAP was on top of the AER ECAP. Oops - thankfully
> >>>>> that patch isn't upstream yet.
> >>>>> Also QEMU AER rooting seems to be based on some older PCIE spec
> >>>>> so needed some tweaks to get the device to actually issue ERR_FATAL etc.
> >>>>>
> >>>>> Anyhow, should have something you can play with in a day or two.
> >>>>
> >>>> Awesome! Thanks! :)
> >>>
> >>> Took a little longer than expected..
> >>>
> >>> Anyhow, now at
> >>> https://gitlab.com/jic23/qemu/-/commits/cxl-2022-10-24
> >>>
> >>> That tree is carrying far too many things right now for it make much sense
> >>> to me to email this to qemu-devel - though I may pull
> >>> hw/pci/aer: Add missing routing for AER errors
> >>> out in advance as that's closing a spec different between QEMU emulation of AER
> >>> and what the PCI spec says.
> >>>
> >>> Hopefully set of out of tree patches will start to shrink soon - v9 of the DOE
> >>> patches have been on list for a week or so.
> >>>
> >>> Top patch includes a very short 'how to' in patch description. Basically fire
> >>> up QMP: Add something like -qmp tcp:localhost:444,server=on,wait=off to your
> >>> qemu commandline and use commands like:
> >>>
> >>> { "execute": "qmp_capabilities" }
> >>> ...
> >>> { "execute": "cxl-inject-uncorrectable-error",
> >>> "arguments": {
> >>> "path": "/machine/peripheral/cxl-pmem0",
> >>> "type": "cache-address-parity",
> >>> "header": [ 3, 4]
> >>> } }
> >>> ...
> >>> { "execute": "cxl-inject-correctable-error",
> >>> "arguments": {
> >>> "path": "/machine/peripheral/cxl-pmem0",
> >>> "type": "physical",
> >>> "header": [ 3, 4]
> >>> } }
> >>>
> >>
> >> So Dave reported that this wasn't working on x86 qemu machines.
> >>
> >> A fun bit of debugging later (I hate AML) and I think I have find the issue +
> >> have a hack to workaround it for now.
> >>
> >> So need some background.
> >> 1) CXL code is based on QEMU's pci expander bridge root bridge - there is a complex
> >> bit of handling to create appropriate ACPI DSDT magic.
> >> 2) The CXL root port is based on pcie_root_port.c
> >> 3) Both CXL root port and pcie root port use traditional PCI interrupts, not MSI/MSIX
> >> for their signaling.
> >> 4) Q35 machine uses an IOAPIC and the resulting PCI bus interrupt routing lands the
> >> actual interrupt on line 23 for my particular configuration
> >> 5) The ACPI table says it's on line 11.
> >> 6) x86 code for creating the PRT has an informative comment...
> >> https://elixir.bootlin.com/qemu/latest/source/hw/i386/acpi-build.c#L697
> >> * The main goal is to equaly distribute the interrupts
> >> * over the 4 existing ACPI links (works only for i440fx).
> >>
> >> So the hack I'm running is below (note the UID thing is a separate bug that stops
> >> iasl from disassembling the DSDT due to a duplicate entry - I'll send out a fix
> >> for that shortly).
> >>
> >> There are a bunch of possible approaches to fix this if my identification of
> >> the issue is correct.
> >>
> >> 1) Clean equivalent of this hack that runs on appropriate machines only.
> >> 2) Use MSI instead. (ioh3420 root port takes this approach I think).
> >
> > This turned out to be trivial case of cut and pate. So MSI support it is (mostly because it doesn't
> > involve fixing AML generation :)
> >
> > Very lightly tested on one config of x86 machine and one of arm64.
> > I've not thought about this at all yet, so it's a direct copy of the ioh3420 with
> > appropriate renames for it being in the cxl_rp code.
>
> Hi Jonathan, do you have a qemu branch with the latest code? Maybe I'll
> give it a try. Thanks.
>
https://gitlab.com/jic23/qemu/-/tree/cxl-2022-11-17
should work... (famous last words).
Jonathan
> >
> >
> > From a5e85d90cb734fc1de81e0d99e443747348e65e7 Mon Sep 17 00:00:00 2001
> > From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Date: Thu, 3 Nov 2022 13:10:24 +0000
> > Subject: [PATCH] hw: pci-bridge: cxl_root_port: Write up MSI
> >
> > Done to avoid fixing ACPI rout description of traditional PCI interrupts on q35.
> >
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> > hw/pci-bridge/cxl_root_port.c | 63 +++++++++++++++++++++++++++++++++++
> > 1 file changed, 63 insertions(+)
> >
> > diff --git a/hw/pci-bridge/cxl_root_port.c b/hw/pci-bridge/cxl_root_port.c
> > index 1a9363a6de..a7475c427e 100644
> > --- a/hw/pci-bridge/cxl_root_port.c
> > +++ b/hw/pci-bridge/cxl_root_port.c
> > @@ -22,6 +22,7 @@
> > #include "qemu/range.h"
> > #include "hw/pci/pci_bridge.h"
> > #include "hw/pci/pcie_port.h"
> > +#include "hw/pci/msi.h"
> > #include "hw/qdev-properties.h"
> > #include "hw/sysbus.h"
> > #include "qapi/error.h"
> > @@ -29,6 +30,10 @@
> >
> > #define CXL_ROOT_PORT_DID 0x7075
> >
> > +#define CXL_RP_MSI_OFFSET 0x60
> > +#define CXL_RP_MSI_SUPPORTED_FLAGS PCI_MSI_FLAGS_MASKBIT
> > +#define CXL_RP_MSI_NR_VECTOR 2
> > +
> > /* Copied from the gen root port which we derive */
> > #define GEN_PCIE_ROOT_PORT_AER_OFFSET 0x100
> > #define GEN_PCIE_ROOT_PORT_ACS_OFFSET \
> > @@ -36,6 +41,7 @@
> > #define CXL_ROOT_PORT_DVSEC_OFFSET \
> > (GEN_PCIE_ROOT_PORT_ACS_OFFSET + PCI_ACS_SIZEOF)
> >
> > +
> > typedef struct CXLRootPort {
> > /*< private >*/
> > PCIESlot parent_obj;
> > @@ -47,6 +53,50 @@ typedef struct CXLRootPort {
> > #define TYPE_CXL_ROOT_PORT "cxl-rp"
> > DECLARE_INSTANCE_CHECKER(CXLRootPort, CXL_ROOT_PORT, TYPE_CXL_ROOT_PORT)
> >
> > +/*
> > + * If two MSI vector are allocated, Advanced Error Interrupt Message Number
> > + * is 1. otherwise 0.
> > + * 17.12.5.10 RPERRSTS, 32:27 bit Advanced Error Interrupt Message Number.
> > + */
> > +static uint8_t cxl_rp_aer_vector(const PCIDevice *d)
> > +{
> > + switch (msi_nr_vectors_allocated(d)) {
> > + case 1:
> > + return 0;
> > + case 2:
> > + return 1;
> > + case 4:
> > + case 8:
> > + case 16:
> > + case 32:
> > + default:
> > + break;
> > + }
> > + abort();
> > + return 0;
> > +}
> > +
> > +static int cxl_rp_interrupts_init(PCIDevice *d, Error **errp)
> > +{
> > + int rc;
> > +
> > + rc = msi_init(d, CXL_RP_MSI_OFFSET, CXL_RP_MSI_NR_VECTOR,
> > + CXL_RP_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT,
> > + CXL_RP_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT,
> > + errp);
> > + if (rc < 0) {
> > + assert(rc == -ENOTSUP);
> > + }
> > +
> > + return rc;
> > +}
> > +
> > +static void cxl_rp_interrupts_uninit(PCIDevice *d)
> > +{
> > + msi_uninit(d);
> > +}
> > +
> > +
> > static void latch_registers(CXLRootPort *crp)
> > {
> > uint32_t *reg_state = crp->cxl_cstate.crb.cache_mem_registers;
> > @@ -177,6 +227,15 @@ static void cxl_rp_dvsec_write_config(PCIDevice *dev, uint32_t addr,
> > }
> > }
> >
> > +static void cxl_rp_aer_vector_update(PCIDevice *d)
> > +{
> > + PCIERootPortClass *rpc = PCIE_ROOT_PORT_GET_CLASS(d);
> > +
> > + if (rpc->aer_vector) {
> > + pcie_aer_root_set_vector(d, rpc->aer_vector(d));
> > + }
> > +}
> > +
> > static void cxl_rp_write_config(PCIDevice *d, uint32_t address, uint32_t val,
> > int len)
> > {
> > @@ -203,6 +262,7 @@ static void cxl_rp_write_config(PCIDevice *d, uint32_t address, uint32_t val,
> > }
> > }
> > pci_bridge_write_config(d, address, val, len);
> > + cxl_rp_aer_vector_update(d);
> > pcie_cap_flr_write_config(d, address, val, len);
> > pcie_cap_slot_write_config(d, slt_ctl, slt_sta, address, val, len);
> > pcie_aer_write_config(d, address, val, len);
> > @@ -229,6 +289,9 @@ static void cxl_root_port_class_init(ObjectClass *oc, void *data)
> >
> > rpc->aer_offset = GEN_PCIE_ROOT_PORT_AER_OFFSET;
> > rpc->acs_offset = GEN_PCIE_ROOT_PORT_ACS_OFFSET;
> > + rpc->aer_vector = cxl_rp_aer_vector;
> > + rpc->interrupts_init = cxl_rp_interrupts_init;
> > + rpc->interrupts_uninit = cxl_rp_interrupts_uninit;
> >
> > dc->hotpluggable = false;
> > }
next prev parent reply other threads:[~2022-11-17 13:50 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-09-16 23:10 [PATCH RFC v2 0/9] cxl/pci: Add fundamental error handling Dave Jiang
2022-09-16 23:11 ` [PATCH RFC v2 1/9] cxl/pci: Cleanup repeated code in cxl_probe_regs() helpers Dave Jiang
2022-09-16 23:11 ` [PATCH RFC v2 2/9] cxl/pci: Cleanup cxl_map_device_regs() Dave Jiang
2022-09-16 23:11 ` [PATCH RFC v2 3/9] cxl/pci: Kill cxl_map_regs() Dave Jiang
2022-10-18 13:43 ` Jonathan Cameron
2022-09-16 23:11 ` [PATCH RFC v2 4/9] cxl/core/regs: Make cxl_map_{component, device}_regs() device generic Dave Jiang
2022-09-16 23:11 ` [PATCH RFC v2 5/9] cxl/port: Limit the port driver to just the HDM Decoder Capability Dave Jiang
2022-10-20 16:54 ` Jonathan Cameron
2022-09-16 23:11 ` [PATCH RFC v2 6/9] cxl/pci: Prepare for mapping RAS Capability Structure Dave Jiang
2022-09-16 23:11 ` [PATCH RFC v2 7/9] cxl/pci: Find and map the " Dave Jiang
2022-09-16 23:11 ` [PATCH RFC v2 8/9] cxl/pci: add tracepoint events for CXL RAS Dave Jiang
2022-10-20 17:02 ` Jonathan Cameron
2022-10-20 17:07 ` Dave Jiang
2022-10-20 17:52 ` Steven Rostedt
2022-09-16 23:11 ` [PATCH RFC v2 9/9] cxl/pci: Add (hopeful) error handling support Dave Jiang
2022-10-20 13:45 ` Jonathan Cameron
2022-10-20 14:50 ` Dave Jiang
2022-10-20 14:03 ` Jonathan Cameron
2022-10-20 14:57 ` Dave Jiang
2022-10-20 15:52 ` Jonathan Cameron
2022-10-20 16:06 ` Dave Jiang
2022-10-20 16:11 ` Jonathan Cameron
2022-10-11 14:17 ` [PATCH RFC v2 0/9] cxl/pci: Add fundamental error handling Jonathan Cameron
2022-10-11 15:18 ` Dave Jiang
2022-10-11 17:19 ` Jonathan Cameron
2022-10-19 17:30 ` Jonathan Cameron
2022-10-19 17:38 ` Dave Jiang
2022-10-24 16:01 ` Jonathan Cameron
2022-10-25 15:22 ` Dave Jiang
2022-11-03 12:58 ` Jonathan Cameron
2022-11-03 13:27 ` Jonathan Cameron
2022-11-16 23:20 ` Dave Jiang
2022-11-17 13:50 ` Jonathan Cameron [this message]
2022-11-18 17:15 ` Dave Jiang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20221117135026.00006422@Huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=alison.schofield@intel.com \
--cc=bwidawsk@kernel.org \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=linux-cxl@vger.kernel.org \
--cc=rrichter@amd.com \
--cc=shiju.jose@huawei.com \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.