All of lore.kernel.org
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Chandrashekar Devegowda" <chandrashekar.devegowda@intel.com>
Cc: linux-pci@vger.kernel.org
Subject: Re: [PATCH v1 2/2] Bluetooth: btintel_pcie: Fix TOCTOU race in reset path
Date: Fri, 12 Jun 2026 01:46:56 +0000	[thread overview]
Message-ID: <20260612014656.9D5751F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260612012832.2395034-2-chandrashekar.devegowda@intel.com>

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:

Pre-existing issues:
- [High] Non-atomic assignment `data->flags = 0;` in `btintel_pcie_reset_work` wipes out flags set concurrently via atomic bitops.
- [High] PCI device reference leak when `schedule_work()` fails due to reset work being disabled during device removal.
- [High] Null pointer dereference of `data->hdev` in suspend/resume handlers if device recovery fails.
- [High] Missing `dma_rmb()` after reading hardware-updated completion ring indices in MSI-X receive/transmit handlers.
--

commit 9ad44c0c51fefc58aa3ecc2842ab5a316618a24a
Author: Chandrashekar Devegowda <chandrashekar.devegowda@intel.com>

Bluetooth: btintel_pcie: Fix TOCTOU race in reset path

Move the test_and_set_bit(BTINTEL_PCIE_RECOVERY_IN_PROGRESS) check
before the SETUP_DONE check to fix a Time-Of-Check to Time-Of-Use
race. Previously, multiple callers could pass the SETUP_DONE check
concurrently and then race on the RECOVERY_IN_PROGRESS flag,
potentially scheduling conflicting removal work.

> diff --git a/drivers/bluetooth/btintel_pcie.c b/drivers/bluetooth/btintel_pcie.c
> index d3a03cf964216..f4f400421833b 100644
> --- a/drivers/bluetooth/btintel_pcie.c
> +++ b/drivers/bluetooth/btintel_pcie.c
> @@ -2686,11 +2686,15 @@ static void btintel_pcie_reset(struct hci_dev *hdev, u8 reset_type)
>  
>  	data = hci_get_drvdata(hdev);

[Severity: High]
This isn't a bug introduced by this patch, but is it possible for
btintel_pcie_reset() to dereference a null hdev pointer here if device
recovery failed previously?

During an FLR reset triggered via btintel_pcie_reset_work(), the code calls
btintel_pcie_release_hdev(data), explicitly unregistering and setting
data->hdev = NULL. If the subsequent hardware reset operations fail (e.g.,
pci_try_reset_function() errors out), the worker exits early and abandons
recovery:

drivers/bluetooth/btintel_pcie.c:btintel_pcie_reset_work() {
        ...
        btintel_pcie_release_hdev(data);
        ...
        err = pci_try_reset_function(pdev);
        if (err) {
                BT_ERR("Failed resetting the pcie device (%d)", err);
                goto out_enable;
        }
        ...
}

This leaves data->hdev permanently as NULL. Later, if the system undergoes a
sleep/wake cycle, the PCI PM callback unconditionally checks for a halted
controller state and invokes btintel_pcie_reset():

drivers/bluetooth/btintel_pcie.c:btintel_pcie_resume() {
        ...
        btintel_pcie_reset(data->hdev, 0);
        ...
}

Because data->hdev is null, won't btintel_pcie_reset() immediately
dereference the null pointer via hci_get_drvdata(hdev)?

> -	if (!test_bit(BTINTEL_PCIE_SETUP_DONE, &data->flags))
> +	if (test_and_set_bit(BTINTEL_PCIE_RECOVERY_IN_PROGRESS, &data->flags)) {
> +		bt_dev_warn(hdev, "Reset rejected: recovery already in progress");
>  		return;
> +	}
>  
> -	if (test_and_set_bit(BTINTEL_PCIE_RECOVERY_IN_PROGRESS, &data->flags))
> +	if (!test_bit(BTINTEL_PCIE_SETUP_DONE, &data->flags)) {
> +		clear_bit(BTINTEL_PCIE_RECOVERY_IN_PROGRESS, &data->flags);
>  		return;
> +	}
>  
>  	data->reset_type = (reset_type == 1) ?
>  			    BTINTEL_PCIE_IOSF_PRR_PLDR : BTINTEL_PCIE_IOSF_PRR_FLR;

[Severity: High]
This isn't a bug introduced by this patch, but does this function leak a
PCI device reference if schedule_work() fails during driver teardown?

In btintel_pcie_reset(), the driver unconditionally increments the PCI
device reference before calling schedule_work():

drivers/bluetooth/btintel_pcie.c:btintel_pcie_reset() {
        ...
        pci_dev_get(data->pdev);
        schedule_work(&data->reset_work);
}

Normally, this reference is balanced by a pci_dev_put() inside the worker
thread. However, during driver teardown, disable_work_sync() is
intentionally called to block further reset executions:

drivers/bluetooth/btintel_pcie.c:btintel_pcie_remove() {
        ...
        disable_work_sync(&data->reset_work);

        btintel_pcie_disable_interrupts(data);
        ...
}

Because disable_work_sync() is called before disabling interrupts, if a
hardware exception interrupt fires within this deterministic window, it will
invoke btintel_pcie_reset(). The schedule_work() call will return false
(since the work is disabled), but the driver fails to check this return
value. Could this permanently leak the pci_dev reference and prevent the
PCI subsystem from fully releasing the device?


[Severity: High]
This isn't a bug introduced by this patch, but can the non-atomic assignment
in btintel_pcie_reset_work() wipe out flags set concurrently via atomic
bitops?

In btintel_pcie_reset_work(), the driver attempts to clear all flags by
executing a direct assignment:

drivers/bluetooth/btintel_pcie.c:btintel_pcie_reset_work() {
        ...
        data->flags = 0;
        ...
}

However, this occurs shortly after interrupts are unmasked via
btintel_pcie_enable_interrupts(data). If an MSI-X interrupt fires between
the unmasking and the assignment, the IRQ handler might use atomic bitops
(e.g., test_and_set_bit(BTINTEL_PCIE_CORE_HALTED, &data->flags) in
btintel_pcie_msix_hw_exp_handler) to record a critical hardware event.

Will the non-atomic store silently overwrite and lose these concurrently set
flags, potentially dropping hardware exceptions or blocking core dumps from
executing?


[Severity: High]
This isn't a bug introduced by this patch, but are memory barriers required
after reading the hardware-updated completion ring indices in the MSI-X
interrupt handlers?

In the MSI-X interrupt handlers, the driver reads the hardware-updated
completion ring head indices to determine how many descriptors are ready:

drivers/bluetooth/btintel_pcie.c:btintel_pcie_msix_rx_handle() {
        ...
        cr_hia = data->ia.cr_hia[BTINTEL_PCIE_RXQ_NUM];
        ...
}

And similarly in the TX handler:

drivers/bluetooth/btintel_pcie.c:btintel_pcie_msix_tx_handle() {
        ...
        cr_hia = data->ia.cr_hia[BTINTEL_PCIE_TXQ_NUM];
        ...
}

It then immediately iterates into the ring to process the descriptors from
coherent DMA memory. There is no read memory barrier (dma_rmb()) between
fetching the index and reading the descriptor contents.

On weakly ordered CPU architectures, can these reads be reordered by the CPU
or compiler, leading to the host fetching and processing stale descriptor
data before the hardware's updates to the memory have become visible?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260612012832.2395034-1-chandrashekar.devegowda@intel.com?part=2

  reply	other threads:[~2026-06-12  1:46 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-12  1:28 [PATCH v1 1/2] Bluetooth: hci_core: Add reset_type parameter to hdev->reset() callback Chandrashekar Devegowda
2026-06-12  1:28 ` [PATCH v1 2/2] Bluetooth: btintel_pcie: Fix TOCTOU race in reset path Chandrashekar Devegowda
2026-06-12  1:46   ` sashiko-bot [this message]
2026-06-12  1:41 ` [PATCH v1 1/2] Bluetooth: hci_core: Add reset_type parameter to hdev->reset() callback sashiko-bot
2026-06-12  5:36 ` [v1,1/2] " bluez.test.bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260612014656.9D5751F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=chandrashekar.devegowda@intel.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.