Linux PCI subsystem development
 help / color / mirror / Atom feed
From: Ira Weiny <ira.weiny@intel.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	"Li, Ming" <ming4.li@intel.com>,
	Bjorn Helgaas <helgaas@kernel.org>,
	Lukas Wunner <lukas@wunner.de>,
	Alison Schofield <alison.schofield@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	"Dave Jiang" <dave.jiang@intel.com>,
	Ben Widawsky <bwidawsk@kernel.org>,
	<linux-kernel@vger.kernel.org>, <linux-cxl@vger.kernel.org>,
	<linux-pci@vger.kernel.org>
Subject: Re: [PATCH V13 3/9] PCI: Create PCIe library functions in support of DOE mailboxes.
Date: Thu, 14 Jul 2022 07:19:33 -0700	[thread overview]
Message-ID: <YtAl9WlqbiCFslcD@iweiny-desk3> (raw)
In-Reply-To: <62cf8aa560ecf_1643dc2945c@dwillia2-xfh.jf.intel.com.notmuch>

On Wed, Jul 13, 2022 at 08:16:53PM -0700, Dan Williams wrote:
> ira.weiny@ wrote:
> > From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > 

[snip]

> > +
> > +static int pci_doe_abort(struct pci_doe_mb *doe_mb)
> > +{
> > +	struct pci_dev *pdev = doe_mb->pdev;
> > +	int offset = doe_mb->cap_offset;
> > +	unsigned long timeout_jiffies;
> > +
> > +	pci_dbg(pdev, "[%x] Issuing Abort\n", offset);
> > +
> > +	timeout_jiffies = jiffies + PCI_DOE_TIMEOUT;
> > +	pci_doe_write_ctrl(doe_mb, PCI_DOE_CTRL_ABORT);
> > +
> > +	do {
> > +		u32 val;
> > +
> > +		if (pci_doe_wait(doe_mb, PCI_DOE_POLL_INTERVAL))
> > +			return -EIO;
> 
> nit, why translate the pci_doe_wait() return value? Not worth respinning
> just for this though.

Ok.

[snip]

> > +static int pci_doe_recv_resp(struct pci_doe_mb *doe_mb, struct pci_doe_task *task)
> > +{
> > +	struct pci_dev *pdev = doe_mb->pdev;
> > +	int offset = doe_mb->cap_offset;
> > +	size_t length, payload_length;
> > +	u32 val;
> > +	int i;
> > +
> > +	/* Read the first dword to get the protocol */
> > +	pci_read_config_dword(pdev, offset + PCI_DOE_READ, &val);
> > +	if ((FIELD_GET(PCI_DOE_DATA_OBJECT_HEADER_1_VID, val) != task->prot.vid) ||
> > +	    (FIELD_GET(PCI_DOE_DATA_OBJECT_HEADER_1_TYPE, val) != task->prot.type)) {
> > +		pci_err(pdev,
> > +			"[%x] expected [VID, Protocol] = [%04x, %02x], got [%04x, %02x]\n",
> > +			doe_mb->cap_offset,
> > +			task->prot.vid, task->prot.type,
> > +			FIELD_GET(PCI_DOE_DATA_OBJECT_HEADER_1_VID, val),
> > +			FIELD_GET(PCI_DOE_DATA_OBJECT_HEADER_1_TYPE, val));
> 
> Is this an error that happens when userspace collides DOE accesses with
> the kernel? Or is this a "*should* never happen" error?
> 
> If this has the potential to spam the log it probably deserves to be
> pci_err_ratelimited() (dev_err_ratelimited()).
> 
> ...but just a question not a request to change.

I was originally viewing other actors as BIOS or something like an FM.  But
thinking about it yea userspace could be poking at this.

Of course we still need to close on restricting user space access.

Yes, ratelimited would be best.

[snip]

> > +static void signal_task_abort(struct pci_doe_task *task, int rv)
> > +{
> > +	struct pci_doe_mb *doe_mb = task->doe_mb;
> > +	struct pci_dev *pdev = doe_mb->pdev;
> > +
> > +	if (pci_doe_abort(doe_mb)) {
> > +		/*
> > +		 * If the device can't process an abort; set the mailbox dead
> > +		 *	- no more submissions
> > +		 */
> > +		pci_err(pdev, "[%x] Abort failed marking mailbox dead\n",
> > +			doe_mb->cap_offset);
> 
> Feels like a dev_dbg(), no?

No.  If this happens we are stopping all processing on the mailbox.  So I think
it needs to be an error such that the admin knows what happened.

> 
> > +		set_bit(PCI_DOE_FLAG_DEAD, &doe_mb->flags);
> > +	}
> > +	signal_task_complete(task, rv);
> > +}
> > +
> > +static void doe_statemachine_work(struct work_struct *work)
> > +{
> > +	struct pci_doe_task *task = container_of(work, struct pci_doe_task,
> > +						 work);
> > +	struct pci_doe_mb *doe_mb = task->doe_mb;
> > +	struct pci_dev *pdev = doe_mb->pdev;
> > +	int offset = doe_mb->cap_offset;
> > +	unsigned long timeout_jiffies;
> > +	u32 val;
> > +	int rc;
> > +
> > +	if (test_bit(PCI_DOE_FLAG_DEAD, &doe_mb->flags)) {
> > +		signal_task_complete(task, -EIO);
> > +		return;
> > +	}
> > +
> > +	/* Send request */
> > +	rc = pci_doe_send_req(doe_mb, task);
> > +
> > +	if (rc) {
> > +		/*
> > +		 * The specification does not provide any guidance on how to
> > +		 * resolve conflicting requests from other entities.
> > +		 * Furthermore, it is likely that busy will not be detected
> > +		 * most of the time.  Flag any detection of status busy with an
> > +		 * error.
> > +		 */
> > +		if (rc == -EBUSY) {
> > +			pci_err(pdev,
> > +				"[%x] busy detected; another entity is sending conflicting requests\n",
> > +				offset);
> 
> This definitely feels like something that needs to be ratelimited.

I think you are right with the potential for user space to be poking at the
registers.

[snip]

> > +
> > +static int pci_doe_cache_protocols(struct pci_doe_mb *doe_mb)
> > +{
> > +	u8 index = 0;
> > +	u8 xa_idx = 0;
> > +
> > +	do {
> > +		int rc;
> > +		u16 vid;
> > +		u8 prot;
> > +
> > +		rc = pci_doe_discovery(doe_mb, &index, &vid, &prot);
> > +		if (rc)
> > +			return rc;
> > +
> > +		pci_dbg(doe_mb->pdev,
> > +			"[%x] Found protocol %d vid: %x prot: %x\n",
> > +			doe_mb->cap_offset, xa_idx, vid, prot);
> > +
> > +		rc = xa_insert(&doe_mb->prots, xa_idx++,
> > +			       pci_doe_xa_prot_entry(vid, prot), GFP_KERNEL);
> > +		if (rc)
> > +			return -ENOMEM;
> 
> Why translate the xa_insert() return value when xa_insert() can also
> report -EBUSY?

By definition there will never be a entry present because xa_idx is being
incremented.  But I think you are correct on principle that it is best to let
the error flow up.  Especially if this call is ever used for other purposes.

[snip]

> > +
> > +/**
> > + * pcim_doe_create_mb() - Create a DOE mailbox object
> > + *
> > + * @pdev: PCI device to create the DOE mailbox for
> > + * @cap_offset: Offset of the DOE mailbox
> > + *
> > + * Create a single mailbox object to manage the mailbox protocol at the
> > + * cap_offset specified.
> > + *
> > + * RETURNS: created mailbox object on success
> > + *	    ERR_PTR(-errno) on failure
> > + */
> > +struct pci_doe_mb *pcim_doe_create_mb(struct pci_dev *pdev, u16 cap_offset)
> > +{
> > +	struct pci_doe_mb *doe_mb;
> > +	struct device *dev = &pdev->dev;
> > +	int rc;
> > +
> > +	doe_mb = devm_kzalloc(dev, sizeof(*doe_mb), GFP_KERNEL);
> > +	if (!doe_mb)
> > +		return ERR_PTR(-ENOMEM);
> > +
> > +	doe_mb->pdev = pdev;
> > +	doe_mb->cap_offset = cap_offset;
> > +	init_waitqueue_head(&doe_mb->wq);
> > +
> > +	xa_init(&doe_mb->prots);
> > +	rc = devm_add_action(dev, pci_doe_xa_destroy, doe_mb);
> > +	if (rc)
> > +		return ERR_PTR(rc);
> > +
> > +	doe_mb->work_queue = alloc_ordered_workqueue("DOE: [%x]", 0,
> > +						     doe_mb->cap_offset);
> 
> I don't see much in the way of end user useful information in that name.
> Not that the thread names matter all that much, but for debug it might
> be nice to be able to tie queue threads back to the corresponding
> device. So maybe put the PCI device name and dev_driver_string() in the
> name?

Good point.  I got too used to the pci_* calls including that.

[snip]

> > +
> > +/**
> > + * pci_doe_submit_task() - Submit a task to be processed by the state machine
> > + *
> > + * @doe_mb: DOE mailbox capability to submit to
> > + * @task: task to be queued
> > + *
> > + * Submit a DOE task (request/response) to the DOE mailbox to be processed.
> > + * Returns upon queueing the task object.  If the queue is full this function
> > + * will sleep until there is room in the queue.
> > + *
> > + * task->complete will be called when the state machine is done processing this
> > + * task.
> > + *
> > + * Excess data will be discarded.
> > + *
> > + * RETURNS: 0 when task has been successful queued, -ERRNO on error
> > + */
> > +int pci_doe_submit_task(struct pci_doe_mb *doe_mb, struct pci_doe_task *task)
> > +{
> > +	if (!pci_doe_supports_prot(doe_mb, task->prot.vid, task->prot.type))
> > +		return -EINVAL;
> > +
> > +	/*
> > +	 * DOE requests must be a whole number of DW
> > +	 * and the response needs to be big enough for at least 1 DW
> > +	 */
> > +	if (task->request_pl_sz % sizeof(u32) ||
> > +	    task->response_pl_sz < sizeof(u32))
> > +		return -EINVAL;
> > +
> > +	if (test_bit(PCI_DOE_FLAG_DEAD, &doe_mb->flags))
> > +		return -EIO;
> > +
> > +	task->doe_mb = doe_mb;
> > +	INIT_WORK(&task->work, doe_statemachine_work);
> 
> Would be nice to move all init to the callers and, similar to
> submit_bio(), rely on the task already knowing all its submit
> parameters.

That could work.  But I think that is awkward.  doe_statemachine_work() is not
exported and is an internal implementation detail of the way the MB processes
the task.  The doe_mb parameter could be set up.  But again seems safer to deal
with it here.

I thought about wrapping struct pci_doe_task with an internal structure but it
seemed overly complex.

> Can be a follow-on cleanup.

Maybe I'll wrap struct pci_doe_task with an internal structure as a follow on.

[snip]

> > 
> 
> Hey, this looks pretty good, thanks for all the hard work on this.

I'll spin soon thanks,
Ira


  reply	other threads:[~2022-07-14 14:19 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-05 15:49 [PATCH V13 0/9] CXL: Read CDAT and DSMAS data ira.weiny
2022-07-05 15:49 ` [PATCH V13 1/9] PCI: Add vendor ID for the PCI SIG ira.weiny
2022-07-05 15:49 ` [PATCH V13 2/9] PCI: Replace magic constant for PCI Sig Vendor ID ira.weiny
2022-07-05 15:49 ` [PATCH V13 3/9] PCI: Create PCIe library functions in support of DOE mailboxes ira.weiny
2022-07-07 16:22   ` Bjorn Helgaas
2022-07-11 19:13     ` Ira Weiny
2022-07-11 22:27   ` [PATCH V13.1] PCI/DOE: Add DOE mailbox support functions ira.weiny
2022-07-11 22:47     ` Matthew Wilcox
2022-07-14 14:54       ` Ira Weiny
2022-07-14  3:16   ` [PATCH V13 3/9] PCI: Create PCIe library functions in support of DOE mailboxes Dan Williams
2022-07-14 14:19     ` Ira Weiny [this message]
2022-07-05 15:49 ` [PATCH V13 4/9] cxl/pci: Create PCI DOE mailbox's for memory devices ira.weiny
2022-07-14  3:30   ` Dan Williams
2022-07-14 18:50     ` Ira Weiny
2022-07-05 15:49 ` [PATCH V13 5/9] driver-core: Introduce BIN_ATTR_ADMIN_{RO,RW} ira.weiny
2022-07-05 15:49 ` [PATCH V13 6/9] cxl/port: Read CDAT table ira.weiny
2022-07-14 15:38   ` Dan Williams
2022-07-14 19:55     ` Ira Weiny
2022-07-14 20:19       ` Dan Williams
2022-07-15  1:09         ` Ira Weiny
2022-07-05 15:49 ` [PATCH V13 7/9] cxl/port: Introduce cxl_cdat_valid() ira.weiny
2022-07-14 16:05   ` Dan Williams
2022-07-05 15:49 ` [PATCH V13 8/9] cxl/port: Retry reading CDAT on failure ira.weiny
2022-07-14 16:27   ` Dan Williams
2022-07-14 20:05     ` Ira Weiny
2022-07-14 20:13       ` Ira Weiny
2022-07-14 20:44         ` Dan Williams
2022-07-19 15:07     ` Jonathan Cameron
2022-07-05 15:49 ` [PATCH V13 9/9] cxl/port: Parse out DSMAS data from CDAT table ira.weiny
2022-07-14 16:32   ` Dan Williams
2022-07-14 19:58     ` Ira Weiny

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YtAl9WlqbiCFslcD@iweiny-desk3 \
    --to=ira.weiny@intel.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=alison.schofield@intel.com \
    --cc=bhelgaas@google.com \
    --cc=bwidawsk@kernel.org \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=helgaas@kernel.org \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lukas@wunner.de \
    --cc=ming4.li@intel.com \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox