From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 29A1E28E0 for ; Fri, 12 Jun 2026 01:46:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781228818; cv=none; b=FWxOuD6tZjfS9V1iwegW0V0k158ygOBu/QFuVn+a0Wat49bjbjpJ4XhwByBCKSzc7a+md3eX+q87QnyWnkff1Mghwj3yCN8Xo/v3+3BFClAiCMN/Ft1j14C1Sco/dmEtZklucrHDhsmIoPHvfEA+I8/fI/BPdCTSXvy3besgICE= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781228818; c=relaxed/simple; bh=5zNvtFDpjjP+qdIgTw+zqjjDKL9gTTW8PWMZZa+iFYA=; h=From:Subject:To:Cc:In-Reply-To:References:Content-Type:Date: Message-Id; b=e3C92LvIgantrOMim0Q7+3O9HZtqp45dXcAu88nR5HSektHsxojhTKKTou2nPr0H+EiZhHP7zcBErEHdPtR/ENjEYi2rrTEpDAlwaYOMWtL35yeyNbs0OYyVJ4rXf793OF+XrUDqq+QNSvanth9CYbZl0EAliTitX/zSjazP42M= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=aY61Axk7; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="aY61Axk7" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9D5751F000E9; Fri, 12 Jun 2026 01:46:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1781228816; bh=21zhGL68opO892nQljyuzcH3M3dHl3yP8lM3pX5pU2Y=; h=From:Subject:Reply-To:To:Cc:In-Reply-To:References:Date; b=aY61Axk7XJ4lWaxfTIHcZmdCeFuUOq2pD/KNfuQbCMgkp5GPNeuXJJEQBKSaksasV WqkD70EpugnK4AefhPCyTtueu+E03mDr6LSn5tTT8GrgINmmPrnr4dkKH6kHnt3ciF LDh80m9mVgoNepRecPVQb4rRH6/rOuI24eOvEB/90+U+5AbpJ7XHaqXZKCBc3V3R/w i6vO4iParbZHZ+DyH7Pnr+0pmVFNFRH88LOuzuRGVIqphr0Jo+87JiHWXtJ3q9QGC0 kC/+pkJ0Rq7VcisprynTUETiO5x2GA88dDmyWeJqxjdT+J7QkpmcPkCVXFhLsdRkV+ oCNgdeOASD2Hg== From: sashiko-bot@kernel.org Subject: Re: [PATCH v1 2/2] Bluetooth: btintel_pcie: Fix TOCTOU race in reset path Reply-To: sashiko-reviews@lists.linux.dev To: "Chandrashekar Devegowda" Cc: linux-pci@vger.kernel.org In-Reply-To: <20260612012832.2395034-2-chandrashekar.devegowda@intel.com> References: <20260612012832.2395034-2-chandrashekar.devegowda@intel.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Date: Fri, 12 Jun 2026 01:46:56 +0000 Message-Id: <20260612014656.9D5751F000E9@smtp.kernel.org> Precedence: bulk X-Mailing-List: linux-pci@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Thank you for your contribution! Sashiko AI review found 4 potential issue(= s) to consider: Pre-existing issues: - [High] Non-atomic assignment `data->flags =3D 0;` in `btintel_pcie_reset_= work` wipes out flags set concurrently via atomic bitops. - [High] PCI device reference leak when `schedule_work()` fails due to rese= t work being disabled during device removal. - [High] Null pointer dereference of `data->hdev` in suspend/resume handler= s if device recovery fails. - [High] Missing `dma_rmb()` after reading hardware-updated completion ring= indices in MSI-X receive/transmit handlers. -- commit 9ad44c0c51fefc58aa3ecc2842ab5a316618a24a Author: Chandrashekar Devegowda Bluetooth: btintel_pcie: Fix TOCTOU race in reset path Move the test_and_set_bit(BTINTEL_PCIE_RECOVERY_IN_PROGRESS) check before the SETUP_DONE check to fix a Time-Of-Check to Time-Of-Use race. Previously, multiple callers could pass the SETUP_DONE check concurrently and then race on the RECOVERY_IN_PROGRESS flag, potentially scheduling conflicting removal work. > diff --git a/drivers/bluetooth/btintel_pcie.c b/drivers/bluetooth/btintel= _pcie.c > index d3a03cf964216..f4f400421833b 100644 > --- a/drivers/bluetooth/btintel_pcie.c > +++ b/drivers/bluetooth/btintel_pcie.c > @@ -2686,11 +2686,15 @@ static void btintel_pcie_reset(struct hci_dev *hd= ev, u8 reset_type) > =20 > data =3D hci_get_drvdata(hdev); [Severity: High] This isn't a bug introduced by this patch, but is it possible for btintel_pcie_reset() to dereference a null hdev pointer here if device recovery failed previously? During an FLR reset triggered via btintel_pcie_reset_work(), the code calls btintel_pcie_release_hdev(data), explicitly unregistering and setting data->hdev =3D NULL. If the subsequent hardware reset operations fail (e.g., pci_try_reset_function() errors out), the worker exits early and abandons recovery: drivers/bluetooth/btintel_pcie.c:btintel_pcie_reset_work() { ... btintel_pcie_release_hdev(data); ... err =3D pci_try_reset_function(pdev); if (err) { BT_ERR("Failed resetting the pcie device (%d)", err); goto out_enable; } ... } This leaves data->hdev permanently as NULL. Later, if the system undergoes a sleep/wake cycle, the PCI PM callback unconditionally checks for a halted controller state and invokes btintel_pcie_reset(): drivers/bluetooth/btintel_pcie.c:btintel_pcie_resume() { ... btintel_pcie_reset(data->hdev, 0); ... } Because data->hdev is null, won't btintel_pcie_reset() immediately dereference the null pointer via hci_get_drvdata(hdev)? > - if (!test_bit(BTINTEL_PCIE_SETUP_DONE, &data->flags)) > + if (test_and_set_bit(BTINTEL_PCIE_RECOVERY_IN_PROGRESS, &data->flags)) { > + bt_dev_warn(hdev, "Reset rejected: recovery already in progress"); > return; > + } > =20 > - if (test_and_set_bit(BTINTEL_PCIE_RECOVERY_IN_PROGRESS, &data->flags)) > + if (!test_bit(BTINTEL_PCIE_SETUP_DONE, &data->flags)) { > + clear_bit(BTINTEL_PCIE_RECOVERY_IN_PROGRESS, &data->flags); > return; > + } > =20 > data->reset_type =3D (reset_type =3D=3D 1) ? > BTINTEL_PCIE_IOSF_PRR_PLDR : BTINTEL_PCIE_IOSF_PRR_FLR; [Severity: High] This isn't a bug introduced by this patch, but does this function leak a PCI device reference if schedule_work() fails during driver teardown? In btintel_pcie_reset(), the driver unconditionally increments the PCI device reference before calling schedule_work(): drivers/bluetooth/btintel_pcie.c:btintel_pcie_reset() { ... pci_dev_get(data->pdev); schedule_work(&data->reset_work); } Normally, this reference is balanced by a pci_dev_put() inside the worker thread. However, during driver teardown, disable_work_sync() is intentionally called to block further reset executions: drivers/bluetooth/btintel_pcie.c:btintel_pcie_remove() { ... disable_work_sync(&data->reset_work); btintel_pcie_disable_interrupts(data); ... } Because disable_work_sync() is called before disabling interrupts, if a hardware exception interrupt fires within this deterministic window, it will invoke btintel_pcie_reset(). The schedule_work() call will return false (since the work is disabled), but the driver fails to check this return value. Could this permanently leak the pci_dev reference and prevent the PCI subsystem from fully releasing the device? [Severity: High] This isn't a bug introduced by this patch, but can the non-atomic assignment in btintel_pcie_reset_work() wipe out flags set concurrently via atomic bitops? In btintel_pcie_reset_work(), the driver attempts to clear all flags by executing a direct assignment: drivers/bluetooth/btintel_pcie.c:btintel_pcie_reset_work() { ... data->flags =3D 0; ... } However, this occurs shortly after interrupts are unmasked via btintel_pcie_enable_interrupts(data). If an MSI-X interrupt fires between the unmasking and the assignment, the IRQ handler might use atomic bitops (e.g., test_and_set_bit(BTINTEL_PCIE_CORE_HALTED, &data->flags) in btintel_pcie_msix_hw_exp_handler) to record a critical hardware event. Will the non-atomic store silently overwrite and lose these concurrently set flags, potentially dropping hardware exceptions or blocking core dumps from executing? [Severity: High] This isn't a bug introduced by this patch, but are memory barriers required after reading the hardware-updated completion ring indices in the MSI-X interrupt handlers? In the MSI-X interrupt handlers, the driver reads the hardware-updated completion ring head indices to determine how many descriptors are ready: drivers/bluetooth/btintel_pcie.c:btintel_pcie_msix_rx_handle() { ... cr_hia =3D data->ia.cr_hia[BTINTEL_PCIE_RXQ_NUM]; ... } And similarly in the TX handler: drivers/bluetooth/btintel_pcie.c:btintel_pcie_msix_tx_handle() { ... cr_hia =3D data->ia.cr_hia[BTINTEL_PCIE_TXQ_NUM]; ... } It then immediately iterates into the ring to process the descriptors from coherent DMA memory. There is no read memory barrier (dma_rmb()) between fetching the index and reading the descriptor contents. On weakly ordered CPU architectures, can these reads be reordered by the CPU or compiler, leading to the host fetching and processing stale descriptor data before the hardware's updates to the memory have become visible? --=20 Sashiko AI review =C2=B7 https://sashiko.dev/#/patchset/20260612012832.2395= 034-1-chandrashekar.devegowda@intel.com?part=3D2