From: Ira Weiny <ira.weiny@intel.com>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	Alison Schofield <alison.schofield@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	"Ben Widawsky" <bwidawsk@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Davidlohr Bueso <dave@stgolabs.net>,
	<linux-kernel@vger.kernel.org>, <linux-cxl@vger.kernel.org>
Subject: Re: [PATCH 08/11] cxl/mem: Wire up event interrupts
Date: Wed, 30 Nov 2022 01:11:56 -0800
Message-ID: <Y4ceXGYg8MXzZCwP@iweiny-desk3>
In-Reply-To: <20221116144021.00007a7c@Huawei.com>

On Wed, Nov 16, 2022 at 02:40:21PM +0000, Jonathan Cameron wrote:
> On Thu, 10 Nov 2022 10:57:55 -0800
> ira.weiny@intel.com wrote:
> 
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > CXL device events are signaled via interrupts.  Each event log may have
> > a different interrupt message number.  These message numbers are
> > reported in the Get Event Interrupt Policy mailbox command.
> > 
> > Add interrupt support for event logs.  Interrupts are allocated as
> > shared interrupts.  Therefore, all or some event logs can share the same
> > message number.
> > 
> > The driver must deal with the possibility that dynamic capacity is not
> > yet supported by a device it sees.  Fallback and retry without dynamic
> > capacity if the first attempt fails.
> > 
> > Device capacity event logs interrupt as part of the informational event
> > log.  Check the event status to see which log has data.
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > 
> Hi Ira,
> 
> A few comments inline.

Thanks for the review!

> 
> Thanks,
> 
> Jonathan
> 
> > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > index 879b228a98a0..1e6762af2a00 100644
> > --- a/drivers/cxl/core/mbox.c
> > +++ b/drivers/cxl/core/mbox.c
> 
> >  /**
> >   * cxl_mem_get_event_records - Get Event Records from the device
> > @@ -867,6 +870,52 @@ void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
> >  }
> >  EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
> >  
> > +int cxl_event_config_msgnums(struct cxl_dev_state *cxlds)
> > +{
> > +	struct cxl_event_interrupt_policy *policy = &cxlds->evt_int_policy;
> > +	size_t policy_size = sizeof(*policy);
> > +	bool retry = true;
> > +	int rc;
> > +
> > +	policy->info_settings = CXL_INT_MSI_MSIX;
> > +	policy->warn_settings = CXL_INT_MSI_MSIX;
> > +	policy->failure_settings = CXL_INT_MSI_MSIX;
> > +	policy->fatal_settings = CXL_INT_MSI_MSIX;
> > +	policy->dyn_cap_settings = CXL_INT_MSI_MSIX;
> > +
> > +again:
> > +	rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_SET_EVT_INT_POLICY,
> > +			       policy, policy_size, NULL, 0);
> > +	if (rc < 0) {
> > +		/*
> > +		 * If the device does not support dynamic capacity it may fail
> > +		 * the command due to an invalid payload.  Retry without
> > +		 * dynamic capacity.
> > +		 */
> 
> There are a number of ways to discover if DCD is supported that aren't based
> on try and retry like this. 9.13.3 has "basic sequence to utilize Dynamic Capacity"
> That calls out:
> Verify the necessary Dynamic Capacity commands are returned in the CEL.
> 
> First I'm not sure we should set the interrupt on for DCD until we have a lot
> more of the flow handled, secondly even then we should figure out if it is supported
> at a higher level than this command and pass that info down here.

I'm not sure I really agree.  The events are just traced.  I think this
functionality is really orthogonal to whether any other DCD support is there.

Regardless, as I said on the call, I think deferring this is the right way to
go for now.

> 
> 
> > +		if (retry) {
> > +			retry = false;
> > +			policy->dyn_cap_settings = 0;
> > +			policy_size = sizeof(*policy) - sizeof(policy->dyn_cap_settings);
> > +			goto again;
> > +		}
> > +		dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
> > +			rc);
> > +		memset(policy, CXL_INT_NONE, sizeof(*policy));
> 
> Relying on all the fields being 1 byte is a bit error prone. I'd just set them all
> individually in the interests of more readable code.

Done.
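
For illustration, a minimal userspace sketch of the per-field version.  The
struct layout, the enum values and the helper name cxl_event_policy_disable()
are stand-ins assumed for this example, not the driver's definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; the real ones live in the driver headers. */
enum { CXL_INT_NONE = 0x00, CXL_INT_MSI_MSIX = 0x01 };

/* Hypothetical mirror of the policy payload: one settings byte per log. */
struct cxl_event_interrupt_policy {
	uint8_t info_settings;
	uint8_t warn_settings;
	uint8_t failure_settings;
	uint8_t fatal_settings;
	uint8_t dyn_cap_settings;
};

/*
 * Disable every log explicitly.  Unlike memset(), this keeps working if a
 * field ever grows beyond one byte or CXL_INT_NONE stops being a value that
 * can be replicated byte by byte.
 */
static void cxl_event_policy_disable(struct cxl_event_interrupt_policy *policy)
{
	assert(policy);

	policy->info_settings = CXL_INT_NONE;
	policy->warn_settings = CXL_INT_NONE;
	policy->failure_settings = CXL_INT_NONE;
	policy->fatal_settings = CXL_INT_NONE;
	policy->dyn_cap_settings = CXL_INT_NONE;
}
```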

> 
> > +		return rc;
> > +	}
> > +
> > +	rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_GET_EVT_INT_POLICY, NULL, 0,
> > +			       policy, policy_size);
> 
> Add a comment on why you are reading this back (to get the msgnums in the upper
> bits) as it's not obvious to a casual reader.

Done.
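
For the casual reader, a standalone sketch of why the read-back matters: the
driver writes only the interrupt mode, and the message numbers come back in
the upper bits of each settings byte.  The masks below assume the mode sits in
bits [1:0] and the message number in bits [7:4]; that layout and the helper
names are assumptions made for this example, not the driver's definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout of one settings byte (not taken from the patch). */
#define EVT_INT_MODE_MASK   0x03u	/* bits [1:0]: interrupt mode */
#define EVT_INT_MSGNUM_MASK 0xf0u	/* bits [7:4]: message number */

/* Extract the interrupt mode the driver asked for. */
static inline unsigned int evt_int_mode(uint8_t setting)
{
	return setting & EVT_INT_MODE_MASK;
}

/* Extract the message number the device reported back. */
static inline unsigned int evt_int_msgnum(uint8_t setting)
{
	return (setting & EVT_INT_MSGNUM_MASK) >> 4;
}

/* Build a settings byte the way a device might report it back. */
static inline uint8_t evt_int_pack(unsigned int mode, unsigned int msgnum)
{
	assert(mode <= 0x3 && msgnum <= 0xf);
	return (uint8_t)((msgnum << 4) | mode);
}
```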

> 
> > +	if (rc < 0) {
> > +		dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
> > +			rc);
> > +		return rc;
> > +	}
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_event_config_msgnums, CXL);
> > +
> 
> ...
> 
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index e0d511575b45..64b2e2671043 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -458,6 +458,138 @@ static void cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
> >  	cxlds->nr_irq_vecs = nvecs;
> >  }
> >  
> > +struct cxl_event_irq_id {
> > +	struct cxl_dev_state *cxlds;
> > +	u32 status;
> > +	unsigned int msgnum;
> msgnum is only here for freeing the interrupt - I'd rather we fixed
> that by using standard infrastructure (or adding some - see below).
> 
> status is an indirect way of allowing us to share an interrupt handler.
> You could do that by registering a trivial wrapper for each instead.
> Then all you have left is the cxl_dev_state which could be passed
> in directly as the callback parameter removing need to have this
> structure at all.  I think that might be neater.

It does avoid allocating this structure, which I like.

I've made the change.
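
A rough userspace sketch of the wrapper approach, with the device state
passed directly as the callback parameter.  The types, the mock
cxl_mem_get_records_log() and the wrapper names are stand-ins assumed for
illustration only:

```c
#include <assert.h>
#include <stdint.h>

/* Mock of the kernel's irqreturn_t, for illustration only. */
typedef enum { IRQ_NONE, IRQ_HANDLED, IRQ_WAKE_THREAD } irqreturn_t;

enum cxl_event_log_type {
	CXL_EVENT_TYPE_INFO,
	CXL_EVENT_TYPE_WARN,
	CXL_EVENT_TYPE_FAIL,
	CXL_EVENT_TYPE_FATAL,
	CXL_EVENT_TYPE_MAX
};

/* Mock device state: counts how often each log was drained. */
struct cxl_dev_state {
	unsigned int drained[CXL_EVENT_TYPE_MAX];
};

/* Mock of the record reader; the real one talks to the mailbox. */
static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
				    enum cxl_event_log_type type)
{
	assert(cxlds && type < CXL_EVENT_TYPE_MAX);
	cxlds->drained[type]++;
}

/* One trivial wrapper per log; 'id' is the device state itself. */
static irqreturn_t cxl_event_info_thread(int irq, void *id)
{
	(void)irq;
	cxl_mem_get_records_log(id, CXL_EVENT_TYPE_INFO);
	return IRQ_HANDLED;
}

static irqreturn_t cxl_event_warn_thread(int irq, void *id)
{
	(void)irq;
	cxl_mem_get_records_log(id, CXL_EVENT_TYPE_WARN);
	return IRQ_HANDLED;
}

static irqreturn_t cxl_event_fail_thread(int irq, void *id)
{
	(void)irq;
	cxl_mem_get_records_log(id, CXL_EVENT_TYPE_FAIL);
	return IRQ_HANDLED;
}

static irqreturn_t cxl_event_fatal_thread(int irq, void *id)
{
	(void)irq;
	cxl_mem_get_records_log(id, CXL_EVENT_TYPE_FATAL);
	return IRQ_HANDLED;
}
```

With this shape no per-interrupt allocation is needed, since each wrapper
already knows which log it serves.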

> 
> > +};
> > +
> > +static irqreturn_t cxl_event_int_thread(int irq, void *id)
> > +{
> > +	struct cxl_event_irq_id *cxlid = id;
> > +	struct cxl_dev_state *cxlds = cxlid->cxlds;
> > +
> > +	if (cxlid->status & CXLDEV_EVENT_STATUS_INFO)
> > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
> > +	if (cxlid->status & CXLDEV_EVENT_STATUS_WARN)
> > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
> > +	if (cxlid->status & CXLDEV_EVENT_STATUS_FAIL)
> > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> > +	if (cxlid->status & CXLDEV_EVENT_STATUS_FATAL)
> > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
> > +	if (cxlid->status & CXLDEV_EVENT_STATUS_DYNAMIC_CAP)
> > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_DYNAMIC_CAP);
> > +
> > +	return IRQ_HANDLED;
> > +}
> > +
> > +static irqreturn_t cxl_event_int_handler(int irq, void *id)
> > +{
> > +	struct cxl_event_irq_id *cxlid = id;
> > +	struct cxl_dev_state *cxlds = cxlid->cxlds;
> > +	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> > +
> > +	if (cxlid->status & status)
> > +		return IRQ_WAKE_THREAD;
> > +	return IRQ_HANDLED;
> 
> If status not set IRQ_NONE.
> Ah. I see Dave raised this as well.

Yep, done.
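
A small userspace sketch of the fixed check.  Since the interrupts are shared,
returning IRQ_NONE tells the core that this handler did not own the
interrupt.  The mock state, the status bit positions and the function name
are assumptions for this example:

```c
#include <assert.h>
#include <stdint.h>

/* Mock of the kernel's irqreturn_t, for illustration only. */
typedef enum { IRQ_NONE, IRQ_HANDLED, IRQ_WAKE_THREAD } irqreturn_t;

/* Assumed bit positions for the event status register. */
#define EVENT_STATUS_INFO	(1u << 0)
#define EVENT_STATUS_WARN	(1u << 1)
#define EVENT_STATUS_FAIL	(1u << 2)
#define EVENT_STATUS_FATAL	(1u << 3)

/* Stand-in for the device; event_status replaces the MMIO readl(). */
struct mock_cxl_dev_state {
	uint32_t event_status;
};

/*
 * Hard-irq half: wake the thread only if one of the logs in 'mask' has
 * data, otherwise report that the interrupt was not ours.
 */
static irqreturn_t mock_event_int_handler(struct mock_cxl_dev_state *cxlds,
					  uint32_t mask)
{
	assert(cxlds);

	if (cxlds->event_status & mask)
		return IRQ_WAKE_THREAD;
	return IRQ_NONE;
}
```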

> 
> > +}
> 
> ...
> 
> > +static int cxl_request_event_irq(struct cxl_dev_state *cxlds,
> > +				 enum cxl_event_log_type log_type,
> > +				 u8 setting)
> > +{
> > +	struct device *dev = cxlds->dev;
> > +	struct pci_dev *pdev = to_pci_dev(dev);
> > +	struct cxl_event_irq_id *id;
> > +	unsigned int msgnum = CXL_EVENT_INT_MSGNUM(setting);
> > +	int irq;
> > +
> > +	/* Disabled irq is not an error */
> > +	if (!cxl_evt_int_is_msi(setting) || msgnum > cxlds->nr_irq_vecs) {
> 
> I don't think that second condition can occur.  The language under table 8-52
> (I think) means that it will move around if there aren't enough vectors
> (for MSI - MSI-X is more complex, but result the same).

Based on the other review, this is now just a bool, msi_enabled, which
determines whether this should be set up at all.

> 
> > +		dev_dbg(dev, "Event interrupt not enabled; %s %u %d\n",
> > +			cxl_event_log_type_str(CXL_EVENT_TYPE_INFO),
> > +			msgnum, cxlds->nr_irq_vecs);
> > +		return 0;
> > +	}
> > +
> > +	id = devm_kzalloc(dev, sizeof(*id), GFP_KERNEL);
> > +	if (!id)
> > +		return -ENOMEM;
> > +
> > +	id->cxlds = cxlds;
> > +	id->msgnum = msgnum;
> > +	id->status = log_type_to_status(log_type);
> > +
> > +	irq = pci_request_irq(pdev, id->msgnum, cxl_event_int_handler,
> > +			      cxl_event_int_thread, id,
> > +			      "%s:event-log-%s", dev_name(dev),
> > +			      cxl_event_log_type_str(log_type));
> > +	if (irq)
> > +		return irq;
> > +
> > +	devm_add_action_or_reset(dev, cxl_free_event_irq, id);
> 
> Hmm. no pcim_request_irq()  maybe this is the time to propose one
> (separate from this patch so we don't get delayed by that!)

Perhaps.  But not tonight...  ;-)

> 
> We discussed this way back in DOE series (I'd forgotten but lore found
> it for me).  There I suggested just calling
> devm_request_threaded_irq() directly as a work around.

Yeah, that works fine.  One issue is that we lose the printf-style formatting
of the irq name that pci_request_irq() gives us:

...
 29:  ...  PCI-MSI 100663300-edge      0000:c0:00.0:event-log-Fatal
 30:  ...  PCI-MSI 100663301-edge      0000:c0:00.0:event-log-Failure
 31:  ...  PCI-MSI 100663302-edge      0000:c0:00.0:event-log-Warning
 32:  ...  PCI-MSI 100663303-edge      0000:c0:00.0:event-log-Informational
...
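
One possible way to keep such names would be to format the string first (in
the kernel, devm_kasprintf() could do that) and pass the result as the name
to devm_request_threaded_irq().  A userspace sketch of just the formatting
step, with snprintf() standing in and a hypothetical helper name:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<device>:event-log-<log>" into buf.
 * Returns 0 on success, -1 if the name did not fit or formatting failed.
 */
static int cxl_event_irq_name(char *buf, size_t len,
			      const char *dev_name, const char *log_name)
{
	int n;

	assert(buf && len > 0 && dev_name && log_name);

	n = snprintf(buf, len, "%s:event-log-%s", dev_name, log_name);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```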

Thanks,
Ira

> 
> > +	return 0;
> > +}
> > +
> > +static void cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> > +{
> > +	struct device *dev = cxlds->dev;
> > +	u8 setting;
> > +
> > +	if (cxl_event_config_msgnums(cxlds))
> > +		return;
> > +
> > +	/*
> > +	 * Dynamic Capacity shares the info message number
> > +	 * Nothing to be done except check the status bit in the
> > +	 * irq thread.
> > +	 */
> > +	setting = cxlds->evt_int_policy.info_settings;
> > +	if (cxl_request_event_irq(cxlds, CXL_EVENT_TYPE_INFO, setting))
> > +		dev_err(dev, "Failed to get interrupt for %s event log\n",
> > +			cxl_event_log_type_str(CXL_EVENT_TYPE_INFO));
> > +
> > +	setting = cxlds->evt_int_policy.warn_settings;
> > +	if (cxl_request_event_irq(cxlds, CXL_EVENT_TYPE_WARN, setting))
> > +		dev_err(dev, "Failed to get interrupt for %s event log\n",
> > +			cxl_event_log_type_str(CXL_EVENT_TYPE_WARN));
> > +
> > +	setting = cxlds->evt_int_policy.failure_settings;
> > +	if (cxl_request_event_irq(cxlds, CXL_EVENT_TYPE_FAIL, setting))
> > +		dev_err(dev, "Failed to get interrupt for %s event log\n",
> > +			cxl_event_log_type_str(CXL_EVENT_TYPE_FAIL));
> > +
> > +	setting = cxlds->evt_int_policy.fatal_settings;
> > +	if (cxl_request_event_irq(cxlds, CXL_EVENT_TYPE_FATAL, setting))
> > +		dev_err(dev, "Failed to get interrupt for %s event log\n",
> > +			cxl_event_log_type_str(CXL_EVENT_TYPE_FATAL));
> > +}
> 
