From: Ira Weiny <ira.weiny@intel.com>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
Alison Schofield <alison.schofield@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>,
"Ben Widawsky" <bwidawsk@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
Davidlohr Bueso <dave@stgolabs.net>,
<linux-kernel@vger.kernel.org>, <linux-cxl@vger.kernel.org>
Subject: Re: [PATCH 08/11] cxl/mem: Wire up event interrupts
Date: Wed, 30 Nov 2022 01:11:56 -0800
Message-ID: <Y4ceXGYg8MXzZCwP@iweiny-desk3>
In-Reply-To: <20221116144021.00007a7c@Huawei.com>

On Wed, Nov 16, 2022 at 02:40:21PM +0000, Jonathan Cameron wrote:
> On Thu, 10 Nov 2022 10:57:55 -0800
> ira.weiny@intel.com wrote:
>
> > From: Ira Weiny <ira.weiny@intel.com>
> >
> > CXL device events are signaled via interrupts. Each event log may have
> > a different interrupt message number. These message numbers are
> > reported in the Get Event Interrupt Policy mailbox command.
> >
> > Add interrupt support for event logs. Interrupts are allocated as
> > shared interrupts. Therefore, all or some event logs can share the same
> > message number.
> >
> > The driver must deal with the possibility that dynamic capacity is not
> > yet supported by a device it sees. Fallback and retry without dynamic
> > capacity if the first attempt fails.
> >
> > Device capacity event logs interrupt as part of the informational event
> > log. Check the event status to see which log has data.
> >
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> >
> Hi Ira,
>
> A few comments inline.
Thanks for the review!
>
> Thanks,
>
> Jonathan
>
> > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > index 879b228a98a0..1e6762af2a00 100644
> > --- a/drivers/cxl/core/mbox.c
> > +++ b/drivers/cxl/core/mbox.c
>
> > /**
> > * cxl_mem_get_event_records - Get Event Records from the device
> > @@ -867,6 +870,52 @@ void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
> > }
> > EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
> >
> > +int cxl_event_config_msgnums(struct cxl_dev_state *cxlds)
> > +{
> > + struct cxl_event_interrupt_policy *policy = &cxlds->evt_int_policy;
> > + size_t policy_size = sizeof(*policy);
> > + bool retry = true;
> > + int rc;
> > +
> > + policy->info_settings = CXL_INT_MSI_MSIX;
> > + policy->warn_settings = CXL_INT_MSI_MSIX;
> > + policy->failure_settings = CXL_INT_MSI_MSIX;
> > + policy->fatal_settings = CXL_INT_MSI_MSIX;
> > + policy->dyn_cap_settings = CXL_INT_MSI_MSIX;
> > +
> > +again:
> > + rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_SET_EVT_INT_POLICY,
> > + policy, policy_size, NULL, 0);
> > + if (rc < 0) {
> > + /*
> > + * If the device does not support dynamic capacity it may fail
> > + * the command due to an invalid payload. Retry without
> > + * dynamic capacity.
> > + */
>
> There are a number of ways to discover if DCD is supported that aren't based
> on try and retry like this. 9.13.3 has "basic sequence to utilize Dynamic Capacity"
> That calls out:
> Verify the necessary Dynamic Capacity commands are returned in the CEL.
>
> First I'm not sure we should set the interrupt on for DCD until we have a lot
> more of the flow handled, secondly even then we should figure out if it is supported
> at a higher level than this command and pass that info down here.
I'm not sure I really agree. The events are just traced, so I think this
functionality is really orthogonal to whether any other DCD support is there.
Regardless, as I said on the call, I think deferring this is the right way to
go for now.
>
>
> > + if (retry) {
> > + retry = false;
> > + policy->dyn_cap_settings = 0;
> > + policy_size = sizeof(*policy) - sizeof(policy->dyn_cap_settings);
> > + goto again;
> > + }
> > + dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
> > + rc);
> > + memset(policy, CXL_INT_NONE, sizeof(*policy));
>
> Relying on all the fields being 1 byte is a bit error prone. I'd just set them all
> individually in the interests of more readable code.
Done.
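
For illustration, a minimal userspace sketch of the per-field reset suggested
above. The struct is a stand-in that only mirrors the field names visible in
the patch, the helper name cxl_event_policy_disable() is hypothetical, and the
numeric values of CXL_INT_NONE and CXL_INT_MSI_MSIX are assumptions, since the
quoted hunk does not show them.

```c
#include <stdint.h>

/* Assumed values; the quoted patch names these constants but does not show them. */
enum cxl_event_int_mode {
	CXL_INT_NONE     = 0x00,
	CXL_INT_MSI_MSIX = 0x01,
};

/* Userspace stand-in mirroring the field names used in the patch. */
struct cxl_event_interrupt_policy {
	uint8_t info_settings;
	uint8_t warn_settings;
	uint8_t failure_settings;
	uint8_t fatal_settings;
	uint8_t dyn_cap_settings;
};

/*
 * Hypothetical helper: reset every field by name rather than with memset(),
 * so the code does not depend on each field being exactly one byte wide.
 */
static inline void cxl_event_policy_disable(struct cxl_event_interrupt_policy *policy)
{
	policy->info_settings = CXL_INT_NONE;
	policy->warn_settings = CXL_INT_NONE;
	policy->failure_settings = CXL_INT_NONE;
	policy->fatal_settings = CXL_INT_NONE;
	policy->dyn_cap_settings = CXL_INT_NONE;
}
```

Unlike memset(policy, CXL_INT_NONE, sizeof(*policy)), this keeps working if a
field is ever widened or if CXL_INT_NONE stops being a value that can be
replicated byte by byte.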
>
> > + return rc;
> > + }
> > +
> > + rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_GET_EVT_INT_POLICY, NULL, 0,
> > + policy, policy_size);
>
> Add a comment on why you are reading this back (to get the msgnums in the upper
> bits) as it's not obvious to a casual reader.
Done.
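
As a rough sketch of why the read-back matters: each settings byte returned by
Get Event Interrupt Policy carries both the interrupt mode and the message
number the device picked. The layout below (mode in bits 1:0, message number
in bits 7:4) is an assumption based on my reading of the spec table, and the
helper names are hypothetical, not the macros used in the patch.

```c
#include <stdint.h>

/* Assumed layout of one settings byte. */
#define EVT_INT_MODE_MASK     0x03u	/* bits 1:0, interrupt mode */
#define EVT_INT_MSGNUM_MASK   0xF0u	/* bits 7:4, message number */
#define EVT_INT_MSGNUM_SHIFT  4

/* Hypothetical helper: extract the interrupt mode from a settings byte. */
static inline unsigned int evt_int_mode(uint8_t setting)
{
	return setting & EVT_INT_MODE_MASK;
}

/* Hypothetical helper: extract the device-assigned message number. */
static inline unsigned int evt_int_msgnum(uint8_t setting)
{
	return (setting & EVT_INT_MSGNUM_MASK) >> EVT_INT_MSGNUM_SHIFT;
}
```

With that layout, a driver that wrote only the mode would see the upper bits
populated after the read-back, and those bits select the vector to request.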
>
> > + if (rc < 0) {
> > + dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
> > + rc);
> > + return rc;
> > + }
> > +
> > + return 0;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_event_config_msgnums, CXL);
> > +
>
> ...
>
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index e0d511575b45..64b2e2671043 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -458,6 +458,138 @@ static void cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
> > cxlds->nr_irq_vecs = nvecs;
> > }
> >
> > +struct cxl_event_irq_id {
> > + struct cxl_dev_state *cxlds;
> > + u32 status;
> > + unsigned int msgnum;
> msgnum is only here for freeing the interrupt - I'd rather we fixed
> that by using standard infrastructure (or adding some - see below).
>
> status is an indirect way of allowing us to share an interrupt handler.
> You could do that by registering a trivial wrapper for each instead.
> Then all you have left is the cxl_dev_state which could be passed
> in directly as the callback parameter removing need to have this
> structure at all. I think that might be neater.
It does avoid allocating this structure, which I like.
I've made the change.
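
The shape of that change can be sketched in userspace roughly as follows.
Everything here is a hypothetical stand-in: mock_dev_state replaces
cxl_dev_state, the EVENT_STATUS_* bit positions are assumed, and the thread
functions return an int instead of irqreturn_t. The point is only that each
log gets a trivial wrapper, so the device state itself can be the callback
cookie.

```c
#include <stdint.h>

/* Assumed bit positions for the per-log status bits. */
#define EVENT_STATUS_INFO   (1u << 0)
#define EVENT_STATUS_WARN   (1u << 1)
#define EVENT_STATUS_FAIL   (1u << 2)
#define EVENT_STATUS_FATAL  (1u << 3)

/* Hypothetical stand-in for struct cxl_dev_state. */
struct mock_dev_state {
	uint32_t hw_status;	/* stands in for the event status register */
	uint32_t logs_read;	/* records which logs the handler drained */
};

/* Shared body: drain whichever logs in @mask currently have data. */
static inline int mock_event_thread(struct mock_dev_state *ds, uint32_t mask)
{
	uint32_t pending = ds->hw_status & mask;

	if (!pending)
		return 0;
	ds->logs_read |= pending;
	ds->hw_status &= ~pending;
	return 1;
}

/* One trivial wrapper per log; the cookie is the device state itself. */
static inline int mock_info_thread(void *id)
{
	return mock_event_thread(id, EVENT_STATUS_INFO);
}

static inline int mock_warn_thread(void *id)
{
	return mock_event_thread(id, EVENT_STATUS_WARN);
}
```

Because the mask is baked into each wrapper, no per-interrupt id structure
has to be allocated or freed.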
>
> > +};
> > +
> > +static irqreturn_t cxl_event_int_thread(int irq, void *id)
> > +{
> > + struct cxl_event_irq_id *cxlid = id;
> > + struct cxl_dev_state *cxlds = cxlid->cxlds;
> > +
> > + if (cxlid->status & CXLDEV_EVENT_STATUS_INFO)
> > + cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
> > + if (cxlid->status & CXLDEV_EVENT_STATUS_WARN)
> > + cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
> > + if (cxlid->status & CXLDEV_EVENT_STATUS_FAIL)
> > + cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> > + if (cxlid->status & CXLDEV_EVENT_STATUS_FATAL)
> > + cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
> > + if (cxlid->status & CXLDEV_EVENT_STATUS_DYNAMIC_CAP)
> > + cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_DYNAMIC_CAP);
> > +
> > + return IRQ_HANDLED;
> > +}
> > +
> > +static irqreturn_t cxl_event_int_handler(int irq, void *id)
> > +{
> > + struct cxl_event_irq_id *cxlid = id;
> > + struct cxl_dev_state *cxlds = cxlid->cxlds;
> > + u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> > +
> > + if (cxlid->status & status)
> > + return IRQ_WAKE_THREAD;
> > + return IRQ_HANDLED;
>
> If status not set IRQ_NONE.
> Ah. I see Dave raised this as well.
Yep done.
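
A minimal userspace sketch of the corrected return logic. The enum mirrors
the kernel's irqreturn values but uses mock names, and the function signature
is simplified (the status and mask are passed in directly), so treat it as an
illustration rather than the actual handler.

```c
#include <stdint.h>

/* Userspace stand-ins for the kernel's irqreturn_t values. */
enum mock_irqreturn {
	MOCK_IRQ_NONE        = 0,
	MOCK_IRQ_HANDLED     = 1,
	MOCK_IRQ_WAKE_THREAD = 2,
};

/*
 * Hypothetical hard-irq handler for a shared vector: claim the interrupt
 * only when one of this handler's status bits is set, otherwise report
 * that it was not ours so other handlers on the line get a chance.
 */
static inline enum mock_irqreturn mock_event_int_handler(uint32_t hw_status,
							 uint32_t my_mask)
{
	if (hw_status & my_mask)
		return MOCK_IRQ_WAKE_THREAD;
	return MOCK_IRQ_NONE;
}
```

Returning the "none" value matters for shared interrupts, since the core uses
it to detect spurious or unhandled interrupts.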
>
> > +}
>
> ...
>
> > +static int cxl_request_event_irq(struct cxl_dev_state *cxlds,
> > + enum cxl_event_log_type log_type,
> > + u8 setting)
> > +{
> > + struct device *dev = cxlds->dev;
> > + struct pci_dev *pdev = to_pci_dev(dev);
> > + struct cxl_event_irq_id *id;
> > + unsigned int msgnum = CXL_EVENT_INT_MSGNUM(setting);
> > + int irq;
> > +
> > + /* Disabled irq is not an error */
> > + if (!cxl_evt_int_is_msi(setting) || msgnum > cxlds->nr_irq_vecs) {
>
> I don't think that second condition can occur. The language under table 8-52
> (I think) means that it will move around if there aren't enough vectors
> (for MSI - MSI-X is more complex, but result the same).
Based on the other review, this is now just a bool, msi_enabled, which is used
to determine whether this should be set up at all.
>
> > + dev_dbg(dev, "Event interrupt not enabled; %s %u %d\n",
> > + cxl_event_log_type_str(CXL_EVENT_TYPE_INFO),
> > + msgnum, cxlds->nr_irq_vecs);
> > + return 0;
> > + }
> > +
> > + id = devm_kzalloc(dev, sizeof(*id), GFP_KERNEL);
> > + if (!id)
> > + return -ENOMEM;
> > +
> > + id->cxlds = cxlds;
> > + id->msgnum = msgnum;
> > + id->status = log_type_to_status(log_type);
> > +
> > + irq = pci_request_irq(pdev, id->msgnum, cxl_event_int_handler,
> > + cxl_event_int_thread, id,
> > + "%s:event-log-%s", dev_name(dev),
> > + cxl_event_log_type_str(log_type));
> > + if (irq)
> > + return irq;
> > +
> > + devm_add_action_or_reset(dev, cxl_free_event_irq, id);
>
> Hmm. no pcim_request_irq() maybe this is the time to propose one
> (separate from this patch so we don't get delayed by that!)
Perhaps. But not tonight... ;-)
>
> We discussed this way back in DOE series (I'd forgotten but lore found
> it for me). There I suggested just calling
> devm_request_threaded_irq() directly as a work around.
Yeah, that works fine. One issue is that we lose the printf-style formatting of
the irq name:
...
29: ... PCI-MSI 100663300-edge 0000:c0:00.0:event-log-Fatal
30: ... PCI-MSI 100663301-edge 0000:c0:00.0:event-log-Failure
31: ... PCI-MSI 100663302-edge 0000:c0:00.0:event-log-Warning
32: ... PCI-MSI 100663303-edge 0000:c0:00.0:event-log-Informational
...
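
One possible way to keep names of that shape when the request call only takes
a fixed string is to format the name up front. The sketch below does this in
userspace with snprintf(); format_event_irq_name() is a hypothetical helper,
and in the kernel something like devm_kasprintf() would be the likely
counterpart. This is an assumption about how it could be done, not what the
next revision does.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: build "<device>:event-log-<log>" into @buf.
 * Returns 0 on success, -1 if the name did not fit or formatting failed.
 */
static inline int format_event_irq_name(char *buf, size_t len,
					const char *dev_name,
					const char *log_name)
{
	int n = snprintf(buf, len, "%s:event-log-%s", dev_name, log_name);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

The formatted string must outlive the registration, which is why a
device-managed allocation would be the natural fit in driver code.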
Thanks,
Ira
>
> > + return 0;
> > +}
> > +
> > +static void cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> > +{
> > + struct device *dev = cxlds->dev;
> > + u8 setting;
> > +
> > + if (cxl_event_config_msgnums(cxlds))
> > + return;
> > +
> > + /*
> > + * Dynamic Capacity shares the info message number
> > + * Nothing to be done except check the status bit in the
> > + * irq thread.
> > + */
> > + setting = cxlds->evt_int_policy.info_settings;
> > + if (cxl_request_event_irq(cxlds, CXL_EVENT_TYPE_INFO, setting))
> > + dev_err(dev, "Failed to get interrupt for %s event log\n",
> > + cxl_event_log_type_str(CXL_EVENT_TYPE_INFO));
> > +
> > + setting = cxlds->evt_int_policy.warn_settings;
> > + if (cxl_request_event_irq(cxlds, CXL_EVENT_TYPE_WARN, setting))
> > + dev_err(dev, "Failed to get interrupt for %s event log\n",
> > + cxl_event_log_type_str(CXL_EVENT_TYPE_WARN));
> > +
> > + setting = cxlds->evt_int_policy.failure_settings;
> > + if (cxl_request_event_irq(cxlds, CXL_EVENT_TYPE_FAIL, setting))
> > + dev_err(dev, "Failed to get interrupt for %s event log\n",
> > + cxl_event_log_type_str(CXL_EVENT_TYPE_FAIL));
> > +
> > + setting = cxlds->evt_int_policy.fatal_settings;
> > + if (cxl_request_event_irq(cxlds, CXL_EVENT_TYPE_FATAL, setting))
> > + dev_err(dev, "Failed to get interrupt for %s event log\n",
> > + cxl_event_log_type_str(CXL_EVENT_TYPE_FATAL));
> > +}
>