public inbox for linux-acpi@vger.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Ira Weiny <ira.weiny@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Alison Schofield <alison.schofield@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Dave Jiang <dave.jiang@intel.com>, <linux-kernel@vger.kernel.org>,
	<linux-pci@vger.kernel.org>, <linux-acpi@vger.kernel.org>,
	<linux-cxl@vger.kernel.org>
Subject: Re: [PATCH V3 1/8] cxl/mem: Read, trace, and clear events on driver load
Date: Mon, 12 Dec 2022 17:58:47 +0000	[thread overview]
Message-ID: <20221212175847.00006f85@Huawei.com> (raw)
In-Reply-To: <Y5PGEr4Ngrp07CMU@iweiny-desk3>

On Fri, 9 Dec 2022 15:34:42 -0800
Ira Weiny <ira.weiny@intel.com> wrote:

> On Fri, Dec 09, 2022 at 02:33:20PM -0800, Dan Williams wrote:
> > Ira Weiny wrote:
> > [..]  
> > > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > > > index 3a66aadb4df0..86c84611a168 100644
> > > > > --- a/drivers/cxl/pci.c
> > > > > +++ b/drivers/cxl/pci.c
> > > > > @@ -417,8 +417,44 @@ static void disable_aer(void *pdev)
> > > > >  	pci_disable_pcie_error_reporting(pdev);
> > > > >  }
> > > > >  
> > > > > +static void cxl_mem_free_event_buffer(void *buf)
> > > > > +{
> > > > > +	kvfree(buf);
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * There is a single buffer for reading event logs from the mailbox.  All logs
> > > > > + * share this buffer protected by the cxlds->event_log_lock.
> > > > > + */
> > > > > +static void cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
> > > > > +{
> > > > > +	struct cxl_get_event_payload *buf;
> > > > > +
> > > > > +	dev_dbg(cxlds->dev, "Allocating event buffer size %zu\n",
> > > > > +		cxlds->payload_size);
> > > > > +
> > > > > +	buf = kvmalloc(cxlds->payload_size, GFP_KERNEL);
> > > > > +	if (WARN_ON_ONCE(!buf))  
> > > > 
> > > > No, why is event init so special that it behaves differently than all
> > > > the other init-time allocations this driver does?  
> > > 
> > > Previous review agreed that a warn on once would be printed if this universal
> > > buffer was not allocated.
> > >   
> > > >   
> > > > > +		return;  
> > > > 
> > > > return -ENOMEM;
> > > >   
> > > > > +
> > > > > +	if (WARN_ON_ONCE(devm_add_action_or_reset(cxlds->dev,
> > > > > +			 cxl_mem_free_event_buffer, buf)))
> > > > > +		return;  
> > > > 
> > > > ditto.  
> > > 
> > > I'll change both of these with a dev_err() and bail during init.  
> > 
> > No real need to dev_err() for a simple memory allocation faliure, but
> > at least it is better than a WARN  
> 
> Ok no error then.
> 
> >   
> > >   
> > > >   
> > > > > +
> > > > > +	cxlds->event.buf = buf;
> > > > > +}
> > > > > +
> > > > > +static void cxl_clear_event_logs(struct cxl_dev_state *cxlds)
> > > > > +{
> > > > > +	/* Force read and clear of all logs */
> > > > > +	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > > > > +	/* Ensure prior partial reads are handled, by starting over again */  
> > > > 
> > > > What partial reads? cxl_mem_get_event_records() reads every log until
> > > > each returns an empty result. Any remaining events after this returns
> > > > are events that fired during the retrieval.  
> > > 
> > > Jonathan was concerned that something could read part of the log and because of
> > > the statefullness of the log processing this reading of the log could start in
> > > the beginning.  Perhaps from a previous driver unload while reading?  
> > 
> > The driver will not unload without completing any current executions of
> > the event retrieval thread otherwise that's an irq shutdown bug.
> >   
> > > I guess I was also thinking the BIOS could leave things this way?  But I think
> > > we should not be here if the BIOS was ever involved right?  
> > 
> > If the OS has CXL Error control and all Event irqs are steered to the OS
> > then the driver must be allowed to assume that it has exclusive control
> > over event retrieval and clearing.

The OS has control once it's asked for it ;)  We have no idea what the firmware
did before that.

> >   
> > > > So I do not think cxl_clear_event_logs() needs to exist, just call
> > > > cxl_mem_get_event_records(CXLDEV_EVENT_STATUS_ALL) once and that's it.  
> > > 
> > > That was my inclination but Jonathan's comments got me thinking I was wrong.  
> > 
> > Perhaps that was before we realized the recent CXL _OSC entanglement.  
> 
> Yea that could have been.  I'm not clear on the order of the comments.

I'm just paranoid - particularly when my excellent firmware writing colleagues
are involved (or our test teams who like to simulate weird sequences of events).

I'm fine crossing fingers until we know there is someone doing this sort of
crazy in the wild.

Jonathan


> 
> Ok this should be good to go.  Reworking the rest of the series.
> 
> Thanks for the review!
> Ira


  reply	other threads:[~2022-12-12 17:58 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-08  5:21 [PATCH V3 0/8] CXL: Process event logs ira.weiny
2022-12-08  5:21 ` [PATCH V3 1/8] cxl/mem: Read, trace, and clear events on driver load ira.weiny
2022-12-09 17:56   ` Dan Williams
2022-12-09 21:00     ` Ira Weiny
2022-12-09 22:33       ` Dan Williams
2022-12-09 23:34         ` Ira Weiny
2022-12-12 17:58           ` Jonathan Cameron [this message]
2022-12-08  5:21 ` [PATCH V3 2/8] cxl/mem: Wire up event interrupts ira.weiny
2022-12-09 21:49   ` Dan Williams
2022-12-10  1:44     ` Ira Weiny
2022-12-08  5:21 ` [PATCH V3 3/8] cxl/mem: Trace General Media Event Record ira.weiny
2022-12-09 22:04   ` Dan Williams
2022-12-11 16:08     ` Ira Weiny
2022-12-08  5:21 ` [PATCH V3 4/8] cxl/mem: Trace DRAM " ira.weiny
2022-12-09 22:14   ` Dan Williams
2022-12-11 16:21     ` Ira Weiny
2022-12-08  5:21 ` [PATCH V3 5/8] cxl/mem: Trace Memory Module " ira.weiny
2022-12-09 22:18   ` Dan Williams
2022-12-08  5:21 ` [PATCH V3 6/8] cxl/test: Add generic mock events ira.weiny
2022-12-09 22:48   ` Dan Williams
2022-12-11 17:26     ` Ira Weiny
2022-12-08  5:21 ` [PATCH V3 7/8] cxl/test: Add specific events ira.weiny
2022-12-08  5:21 ` [PATCH V3 8/8] cxl/test: Simulate event log overflow ira.weiny
2022-12-09 22:52   ` Dan Williams
2022-12-12  4:21     ` Ira Weiny

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221212175847.00006f85@Huawei.com \
    --to=jonathan.cameron@huawei.com \
    --cc=alison.schofield@intel.com \
    --cc=bhelgaas@google.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=dave@stgolabs.net \
    --cc=ira.weiny@intel.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox