Linux PCI subsystem development
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Manivannan Sadhasivam via B4 Relay"
	<devnull+manivannan.sadhasivam.oss.qualcomm.com@kernel.org>
Cc: vkoul@kernel.org, linux-pci@vger.kernel.org,
	dmaengine@vger.kernel.org, Frank.Li@kernel.org
Subject: Re: [PATCH 2/3] bus: mhi: ep: Add mhi_cntrl->flush_async() callback to flush the async read/write
Date: Mon, 29 Jun 2026 08:59:16 +0000	[thread overview]
Message-ID: <20260629085917.350211F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260629-mhi-ep-flush-v1-2-714e0d56e87c@oss.qualcomm.com>

Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:

New issues:
- [High] Incorrect ordering of channel teardown, disconnect notification, and DMA flushing in mhi_ep_remove() causes data delivery after disconnect.
- [High] The newly introduced flush_async() callback lacks channel context, forcing a controller-wide flush that aborts DMA operations for all active channels.
- [High] Dropping the channel lock during flush_async exposes a window for worker threads to process disconnected channels, defeating the fix and re-introducing the UAF.

Pre-existing issues:
- [Critical] In-flight asynchronous DMA operations are not flushed before mhi_ep_ring_reset() frees ring->ring_cache.
- [High] Lockless access to mhi_chan->xfer_cb in completion handlers causes a TOCTOU data race with mhi_ep_remove().
--

commit 191c1d6e4eab8c4a6ee949c2b5b9499a56060800
Author: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>

bus: mhi: ep: Add mhi_cntrl->flush_async() callback to flush the async read/write

This commit introduces a flush_async() callback in the MHI endpoint
controller. It is called during device removal to ensure any in-flight
asynchronous DMA operations are completed before continuing teardown.
This aims to prevent use-after-free issues caused by delayed DMA callbacks.

> diff --git a/drivers/bus/mhi/ep/main.c b/drivers/bus/mhi/ep/main.c
> --- a/drivers/bus/mhi/ep/main.c
> +++ b/drivers/bus/mhi/ep/main.c
[ ... ]
> @@ -1636,6 +1637,12 @@ static void mhi_ep_remove(struct device *dev)
>  		}
>  
>  		mhi_chan->state = MHI_CH_STATE_DISABLED;
> +		mutex_unlock(&mhi_chan->lock);

[Severity: High]
Does dropping the channel lock here leave a window for worker threads to
process disconnected channels?

If mhi_ep_ch_ring_worker wakes up while the lock is dropped, it sees
ring->started is still true. For Downlink channels, mhi_ep_process_ch_ring()
skips the state check and fires xfer_cb() directly:

drivers/bus/mhi/ep/main.c:mhi_ep_process_ch_ring() {
    if (ring->ch_id % 2) {
        /* DL channel */
        result.dir = mhi_chan->dir;
        mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
    } else {
    ...
}

Could this execute xfer_cb() mid-teardown and cause a use-after-free?

> +
> +		if (mhi_cntrl->flush_async)
> +			mhi_cntrl->flush_async(mhi_cntrl);

[Severity: High]
Does this order of operations allow successful data delivery callbacks after
disconnecting the client?

Since the channel lock is dropped, completing in-flight DMA operations during
flush_async() will invoke their completion handlers. Those handlers find
xfer_cb still populated and call it with a success status.

Should flush_async() be called before sending the -ENOTCONN disconnect
status to avoid notifying the client after it may have freed its contexts?

> +
> +		mutex_lock(&mhi_chan->lock);
>  		mhi_chan->xfer_cb = NULL;

[Severity: High]
This is a pre-existing issue, but is there a data race when clearing xfer_cb
concurrently with completion handlers?

In mhi_ep_read_completion(), the callback is checked locklessly:

drivers/bus/mhi/ep/main.c:mhi_ep_read_completion() {
    if (mhi_chan->xfer_cb) {
        ...
        mhi_chan->xfer_cb(mhi_dev, &result);
    }
}

If mhi_ep_remove() clears mhi_chan->xfer_cb to NULL after the lockless check
but before execution, could this result in a NULL pointer dereference or
execution of freed memory?

>  		mutex_unlock(&mhi_chan->lock);
>  	}
> diff --git a/include/linux/mhi_ep.h b/include/linux/mhi_ep.h
> --- a/include/linux/mhi_ep.h
> +++ b/include/linux/mhi_ep.h
[ ... ]
> @@ -164,6 +165,7 @@ struct mhi_ep_cntrl {
>  	int (*write_sync)(struct mhi_ep_cntrl *mhi_cntrl, struct mhi_ep_buf_info *buf_info);
>  	int (*read_async)(struct mhi_ep_cntrl *mhi_cntrl, struct mhi_ep_buf_info *buf_info);
>  	int (*write_async)(struct mhi_ep_cntrl *mhi_cntrl, struct mhi_ep_buf_info *buf_info);
> +	void (*flush_async)(struct mhi_ep_cntrl *mhi_cntrl);

[Severity: High]
Does the flush_async callback need a channel argument to avoid flushing
all channels on the controller?

When mhi_ep_remove() calls this without specifying a channel, an
implementation like pci_epf_mhi_edma_flush_async() terminates the shared
DMA channels indiscriminately:

drivers/pci/endpoint/functions/pci-epf-mhi.c:pci_epf_mhi_edma_flush_async() {
    dmaengine_terminate_sync(epf_mhi->dma_chan_rx);
    dmaengine_terminate_sync(epf_mhi->dma_chan_tx);
}

Will this abruptly discard in-flight operations and cause data loss for all
other active MHI clients sharing the controller?

[Severity: Critical]
This is a pre-existing issue, but is a similar flush required in
mhi_ep_ring_reset() before freeing the ring cache?

When the host issues a reset command, mhi_ep_ring_reset() executes:

drivers/bus/mhi/ep/ring.c:mhi_ep_ring_reset() {
    ring->started = false;
    kfree(ring->ring_cache);
    ring->ring_cache = NULL;
}

If in-flight asynchronous DMA operations complete after this,
mhi_ep_read_completion() unconditionally accesses the freed cache:

drivers/bus/mhi/ep/main.c:mhi_ep_read_completion() {
    struct mhi_ring_element *el = &ring->ring_cache[ring->rd_offset];
    ...
}

Could this cause a NULL pointer dereference when a remote host triggers a
channel reset?

>  
>  	enum mhi_state mhi_state;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260629-mhi-ep-flush-v1-0-714e0d56e87c@oss.qualcomm.com?part=2

  reply	other threads:[~2026-06-29  8:59 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-29  8:45 [PATCH 0/3] bus: mhi: ep: Implement flush_async() callback to flush async read/write Manivannan Sadhasivam via B4 Relay
2026-06-29  8:45 ` [PATCH 1/3] dmaengine: dw-edma: Implement device_synchronize() callback Manivannan Sadhasivam via B4 Relay
2026-06-29  8:57   ` sashiko-bot
2026-06-29 14:20   ` Frank Li
2026-06-29  8:45 ` [PATCH 2/3] bus: mhi: ep: Add mhi_cntrl->flush_async() callback to flush the async read/write Manivannan Sadhasivam via B4 Relay
2026-06-29  8:59   ` sashiko-bot [this message]
2026-06-29  8:45 ` [PATCH 3/3] PCI: epf-mhi: Implement mhi_cntrl->flush_async() to flush DMA read/write Manivannan Sadhasivam via B4 Relay
2026-06-29  8:59   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260629085917.350211F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=Frank.Li@kernel.org \
    --cc=devnull+manivannan.sadhasivam.oss.qualcomm.com@kernel.org \
    --cc=dmaengine@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=vkoul@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox