Linux PCI subsystem development
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Aksh Garg" <a-garg7@ti.com>
Cc: linux-pci@vger.kernel.org
Subject: Re: [PATCH v5 3/4] PCI: endpoint: Add support for DOE initialization and setup in EPC core
Date: Wed, 10 Jun 2026 10:17:51 +0000	[thread overview]
Message-ID: <20260610101752.5A8E21F00893@smtp.kernel.org> (raw)
In-Reply-To: <20260610100256.1889111-4-a-garg7@ti.com>

Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
- [High] Lockless check of `epc->init_complete` in `pci_epc_init_notify()` and `pci_epc_deinit_notify()` creates a race condition leading to double initialization or double destruction of DOE mailboxes.
- [High] Queueing a task on a mailbox's workqueue races with workqueue destruction.
- [Medium] DOE capability discovery completely skips Virtual Functions (VFs).
- [Medium] Inconsistent capability check assumes Physical Function 0 (PF0) dictates controller-wide DOE features.
- [Medium] DOE mailboxes and workqueues are leaked if the endpoint controller is destroyed without a prior `pci_epc_deinit_notify()` call.
--

commit f40c46636c83315eee171f04597bcd6e1bbbedab
Author: Aksh Garg <a-garg7@ti.com>

PCI: endpoint: Add support for DOE initialization and setup in EPC core

This commit adds pci_epc_init_capabilities() and pci_epc_deinit_capabilities()
to the EPC core driver to initialize and tear down DOE capabilities. It
discovers and registers DOE mailboxes for the functions in the endpoint
controller.

> diff --git a/drivers/pci/endpoint/pci-epc-core.c b/drivers/pci/endpoint/pci-epc-core.c
> index 6c3c58185fc5d..e48f40eeed290 100644
> --- a/drivers/pci/endpoint/pci-epc-core.c
> +++ b/drivers/pci/endpoint/pci-epc-core.c

[ ... ]

> @@ -842,6 +844,81 @@ void pci_epc_linkdown(struct pci_epc *epc)
>  }
>  EXPORT_SYMBOL_GPL(pci_epc_linkdown);
>  
> +/**
> + * pci_epc_doe_setup() - Discover and setup DOE mailboxes for all functions

[ ... ]

> +static int pci_epc_doe_setup(struct pci_epc *epc)
> +{
> +	u8 func_no, vfunc_no = 0;
> +	u16 cap_offset;
> +	int ret;
> +
> +	if (!epc->ops || !epc->ops->find_ext_capability)
> +		return -EINVAL;
> +
> +	/* Discover DOE capabilities for all functions */
> +	for (func_no = 0; func_no < epc->max_functions; func_no++) {
> +		mutex_lock(&epc->lock);
> +		cap_offset = epc->ops->find_ext_capability(epc, func_no,
> +							   vfunc_no, 0,
> +							   PCI_EXT_CAP_ID_DOE);

[Severity: Medium]
Will this skip Virtual Functions during DOE capability discovery?

The loop iterates over physical functions, but vfunc_no is permanently
hardcoded to 0 here. VFs that advertise DOE capabilities won't be processed.

> +		mutex_unlock(&epc->lock);

[ ... ]

> +static void pci_epc_init_capabilities(struct pci_epc *epc)
> +{
> +	const struct pci_epc_features *epc_features;
> +	int ret;
> +
> +	epc_features = pci_epc_get_features(epc, 0, 0);

[Severity: Medium]
Is it intentional to check capability features exclusively against Physical
Function 0 (PF0)?

If the controller implements DOE on a non-zero function but not PF0, it
seems initialization and cleanup will be skipped entirely. Capability
checks might need to be dynamically evaluated per-function.

> +	if (!epc_features)
> +		return;

[ ... ]

> @@ -857,6 +934,9 @@ void pci_epc_init_notify(struct pci_epc *epc)
>  	if (IS_ERR_OR_NULL(epc))
>  		return;
>  
> +	if (!epc->init_complete)
> +		pci_epc_init_capabilities(epc);
> +

[Severity: High]
Could checking epc->init_complete without acquiring epc->list_lock lead to
a data race?

If pci_epc_init_notify() is invoked concurrently by different threads,
both might see !epc->init_complete and execute pci_epc_init_capabilities(),
which calls pci_ep_doe_init() twice. This overwrites the xarray internal
lock and leaks previously registered mailboxes.

>  	mutex_lock(&epc->list_lock);
>  	list_for_each_entry(epf, &epc->pci_epf, list) {

[ ... ]

> @@ -890,6 +970,27 @@ void pci_epc_notify_pending_init(struct pci_epc *epc, struct pci_epf *epf)
>  }
>  EXPORT_SYMBOL_GPL(pci_epc_notify_pending_init);
>  
> +/**
> + * pci_epc_deinit_capabilities() - Cleanup EPC capabilities

[ ... ]

> +static void pci_epc_deinit_capabilities(struct pci_epc *epc)
> +{
> +	const struct pci_epc_features *epc_features;
> +
> +	epc_features = pci_epc_get_features(epc, 0, 0);
> +	if (!epc_features)
> +		return;
> +
> +	if (IS_ENABLED(CONFIG_PCI_ENDPOINT_DOE) && epc_features->doe_capable) {
> +		pci_ep_doe_destroy(epc);

[Severity: High]
Does destroying the workqueue here race with task queueing?

When pci_ep_doe_destroy() is called, it eventually destroys the workqueue.
However, in pci_ep_doe_submit_task(), there's a lockless test_bit() check:

    if (test_bit(PCI_DOE_FLAG_DEAD, &doe_mb->flags))
        return -EIO;

    ... queue_work(doe_mb->work_queue, &task->work);

If the workqueue is destroyed right after test_bit() passes, the submission
thread might attempt to queue work on the freed workqueue, leading to a
use-after-free.

[Severity: Medium]
What happens to the DOE mailboxes and workqueues if the endpoint
controller is destroyed without a prior pci_epc_deinit_notify() call?

If the controller driver unbinds and calls pci_epc_destroy() directly,
these resources might be leaked since pci_epc_destroy() does not have a
fallback to clean up dynamically allocated DOE capabilities.

> +		dev_dbg(&epc->dev, "DOE mailboxes destroyed\n");
> +	}
> +}

[ ... ]

> @@ -903,6 +1004,9 @@ void pci_epc_deinit_notify(struct pci_epc *epc)
>  	if (IS_ERR_OR_NULL(epc))
>  		return;
>  
> +	if (epc->init_complete)
> +		pci_epc_deinit_capabilities(epc);
> +

[Severity: High]
Can this lockless check of epc->init_complete race similarly to the one in
pci_epc_init_notify()?

A concurrent call could cause double destruction of DOE mailboxes and
corrupt the xarray.

>  	mutex_lock(&epc->list_lock);
>  	list_for_each_entry(epf, &epc->pci_epf, list) {

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260610100256.1889111-1-a-garg7@ti.com?part=3

  reply	other threads:[~2026-06-10 10:17 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-10 10:02 [PATCH v5 0/4] PCI: Add DOE support for endpoint Aksh Garg
2026-06-10 10:02 ` [PATCH v5 1/4] PCI/DOE: Move common definitions to the header file Aksh Garg
2026-06-10 10:02 ` [PATCH v5 2/4] PCI: endpoint: Add DOE mailbox support for endpoint functions Aksh Garg
2026-06-10 10:17   ` sashiko-bot
2026-06-10 11:19     ` Aksh Garg
2026-06-10 10:02 ` [PATCH v5 3/4] PCI: endpoint: Add support for DOE initialization and setup in EPC core Aksh Garg
2026-06-10 10:17   ` sashiko-bot [this message]
2026-06-10 10:02 ` [PATCH v5 4/4] Documentation: PCI: Add documentation for DOE endpoint support Aksh Garg
2026-06-10 23:21   ` Randy Dunlap

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260610101752.5A8E21F00893@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=a-garg7@ti.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox