* [PATCH v1] Bluetooth: btintel_pcie: split coredump worker into per-trigger works
@ 2026-07-02 17:03 Kiran K
2026-07-02 18:48 ` [v1] " bluez.test.bot
0 siblings, 1 reply; 2+ messages in thread
From: Kiran K @ 2026-07-02 17:03 UTC (permalink / raw)
To: linux-bluetooth
Cc: ravishankar.srivatsa, chethan.tumkur.narayan,
chandrashekar.devegowda, Kiran K
btintel_pcie_coredump_worker() handled three unrelated jobs in one
work item: collect a DRAM trace coredump, read the hardware exception
event, and read the firmware-trigger event. The worker walked three
flag bits at runtime and each interrupt path mutated multiple bits
to communicate which sub-jobs the worker should run, which made the
ownership rules for those bits hard to reason about and entangled
the trigger reason with the in-progress accounting.
Replace the single combined worker with three single-purpose ones,
each owning exactly one flag:
coredump_work -> btintel_pcie_dump_traces()
guarded by COREDUMP_INPROGRESS
hwexp_work -> btintel_pcie_read_hwexp()
guarded by CORE_HALTED (already permanent until
re-probe; HWEXP_INPROGRESS is now redundant
and removed)
fwtrigger_work -> btintel_pcie_dump_fwtrigger_event()
guarded by FWTRIGGER_DUMP_INPROGRESS
All three workers are queued on a shared ordered workqueue (renamed
coredump_workqueue -> dump_workqueue) so a companion event reader
(hwexp/fwtrigger) and the coredump always run FIFO. Companion work
is queued before coredump_work so dmp_hdr.event_type/event_id are
populated by the time dump_traces() consumes them, preserving the
original ordering.
Introduce btintel_pcie_queue_coredump() to centralize the coredump
trigger contract: it is the single writer of COREDUMP_INPROGRESS and
of dmp_hdr.trigger_reason, sets both atomically against concurrent
triggers, and rolls back the bit if the workqueue is disabled
(reset/remove in progress) so a later trigger after re-probe can
succeed. All four trigger sites (HWEXP IRQ, FW-trigger IRQ,
devcoredump user trigger, resume() D0 error path) go through the
helper.
Per-work guard bits are now cleared at the tail of each worker
rather than in the middle of the combined worker, which closes a
subtle race where a duplicate IRQ could observe a cleared bit and
requeue while the previous pass was still finalizing
dev_coredumpv().
reset_work() and remove() now disable_work_sync() all three workers
and, on the FLR-failure path, enable_work() all three to keep their
disable counters balanced. The PLDR/FLR-success contract (re-probe
re-INIT_WORKs everything with counter 0) is preserved.
No functional change to the dump payloads; this is a pure
restructuring of the worker dispatch and its synchronization.
Signed-off-by: Kiran K <kiran.k@intel.com>
Assisted-by: GitHub-Copilot:claude-4.7-opus
---
drivers/bluetooth/btintel_pcie.c | 185 ++++++++++++++++++++-----------
drivers/bluetooth/btintel_pcie.h | 12 +-
2 files changed, 130 insertions(+), 67 deletions(-)
diff --git a/drivers/bluetooth/btintel_pcie.c b/drivers/bluetooth/btintel_pcie.c
index 2b7231be5973..013568197a39 100644
--- a/drivers/bluetooth/btintel_pcie.c
+++ b/drivers/bluetooth/btintel_pcie.c
@@ -1446,72 +1446,134 @@ static int btintel_pcie_dump_fwtrigger_event(struct btintel_pcie_data *data)
return err;
}
+/* Queue a coredump dump_traces() pass.
+ *
+ * Returns true if a new coredump was queued, false if one was already
+ * in-flight (the BTINTEL_PCIE_COREDUMP_INPROGRESS bit serves as the
+ * single-writer guard for the @coredump_work item) or the workqueue is
+ * disabled (reset / remove in progress).
+ *
+ * Always queue this AFTER any companion event-reader work (hwexp /
+ * fwtrigger) so that, on the ordered @dump_workqueue, the event reader
+ * runs first and populates dmp_hdr.event_type / event_id before
+ * dump_traces consumes them.
+ */
+static bool btintel_pcie_queue_coredump(struct btintel_pcie_data *data,
+ u16 trigger_reason)
+{
+ if (test_and_set_bit(BTINTEL_PCIE_COREDUMP_INPROGRESS, &data->flags))
+ return false;
+
+ data->dmp_hdr.trigger_reason = trigger_reason;
+
+ if (queue_work(data->dump_workqueue, &data->coredump_work))
+ return true;
+
+ /* Workqueue is disabled (reset/remove drained it). Release the
+ * guard so a later trigger, after re-probe, can succeed.
+ */
+ clear_bit(BTINTEL_PCIE_COREDUMP_INPROGRESS, &data->flags);
+ return false;
+}
+
static void btintel_pcie_msix_fw_trigger_handler(struct btintel_pcie_data *data)
{
bt_dev_dbg(data->hdev, "Received firmware smart trigger cause");
- if (test_and_set_bit(BTINTEL_PCIE_FWTRIGGER_DUMP_INPROGRESS, &data->flags))
+ /* Per-work guard: deduplicate concurrent FW-trigger interrupts.
+ * Cleared at the tail of btintel_pcie_fwtrigger_worker().
+ */
+ if (test_and_set_bit(BTINTEL_PCIE_FWTRIGGER_DUMP_INPROGRESS,
+ &data->flags))
return;
- /* Trigger device core dump when there is FW assert */
- if (!test_and_set_bit(BTINTEL_PCIE_COREDUMP_INPROGRESS, &data->flags))
- data->dmp_hdr.trigger_reason = BTINTEL_PCIE_TRIGGER_REASON_FW_ASSERT;
+ if (!queue_work(data->dump_workqueue, &data->fwtrigger_work)) {
+ clear_bit(BTINTEL_PCIE_FWTRIGGER_DUMP_INPROGRESS, &data->flags);
+ return;
+ }
- queue_work(data->coredump_workqueue, &data->coredump_work);
+ /* Queue coredump after the fwtrigger event reader so dmp_hdr.event_*
+ * is populated before dump_traces consumes it.
+ */
+ btintel_pcie_queue_coredump(data, BTINTEL_PCIE_TRIGGER_REASON_FW_ASSERT);
}
static void btintel_pcie_msix_hw_exp_handler(struct btintel_pcie_data *data)
{
bt_dev_err(data->hdev, "Received hw exception interrupt");
+ /* CORE_HALTED is the single-writer guard for this handler. It is
+ * set once on first HW exception and cleared only by re-probe
+ * (data is reallocated), so it also serializes hwexp_work
+ * scheduling without needing a separate bit.
+ */
if (test_and_set_bit(BTINTEL_PCIE_CORE_HALTED, &data->flags))
return;
- if (test_and_set_bit(BTINTEL_PCIE_HWEXP_INPROGRESS, &data->flags))
- return;
-
- /* Trigger device core dump when there is HW exception */
- if (!test_and_set_bit(BTINTEL_PCIE_COREDUMP_INPROGRESS, &data->flags))
- data->dmp_hdr.trigger_reason = BTINTEL_PCIE_TRIGGER_REASON_FW_ASSERT;
+ /* Queue companion coredump first so it is appended after hwexp_work
+ * on the ordered @dump_workqueue (preserves the original
+ * coredump-then-hwexp ordering).
+ */
+ btintel_pcie_queue_coredump(data, BTINTEL_PCIE_TRIGGER_REASON_FW_ASSERT);
- queue_work(data->coredump_workqueue, &data->coredump_work);
+ queue_work(data->dump_workqueue, &data->hwexp_work);
}
static void btintel_pcie_coredump_worker(struct work_struct *work)
{
struct btintel_pcie_data *data = container_of(work,
struct btintel_pcie_data, coredump_work);
- int err;
/* hdev is NULL until setup_hdev() succeeds, and is cleared on
* teardown after disable_work_sync() drains us; bail in that case.
*/
+ if (!data->hdev)
+ goto out;
+
+ btintel_pcie_dump_traces(data->hdev);
+out:
+ /* Release guard last so a new trigger can run only after this
+ * pass has fully completed (including dev_coredumpv()).
+ */
+ clear_bit(BTINTEL_PCIE_COREDUMP_INPROGRESS, &data->flags);
+}
+
+static void btintel_pcie_hwexp_worker(struct work_struct *work)
+{
+ struct btintel_pcie_data *data = container_of(work,
+ struct btintel_pcie_data, hwexp_work);
+
if (!data->hdev)
return;
- if (test_bit(BTINTEL_PCIE_FWTRIGGER_DUMP_INPROGRESS, &data->flags)) {
- err = btintel_pcie_dump_fwtrigger_event(data);
- if (err)
- bt_dev_warn(data->hdev, "failed to log fwtrigger event");
- clear_bit(BTINTEL_PCIE_FWTRIGGER_DUMP_INPROGRESS, &data->flags);
- }
+ /* Unlike usb products, controller will not send hardware exception
+ * event on exception. Instead controller writes the hardware event
+ * to device memory along with optional debug events, raises MSIX
+ * and halts. Driver shall read the exception event from device
+ * memory and passes it to the stack for further processing.
+ *
+ * Re-entry is gated by BTINTEL_PCIE_CORE_HALTED in the IRQ
+ * handler, which is only cleared by re-probe; no per-work bit
+ * is needed here.
+ */
+ btintel_pcie_read_hwexp(data);
+}
- if (test_bit(BTINTEL_PCIE_COREDUMP_INPROGRESS, &data->flags)) {
- btintel_pcie_dump_traces(data->hdev);
- clear_bit(BTINTEL_PCIE_COREDUMP_INPROGRESS, &data->flags);
- }
+static void btintel_pcie_fwtrigger_worker(struct work_struct *work)
+{
+ struct btintel_pcie_data *data = container_of(work,
+ struct btintel_pcie_data, fwtrigger_work);
+ int err;
- if (test_bit(BTINTEL_PCIE_HWEXP_INPROGRESS, &data->flags)) {
- /* Unlike usb products, controller will not send hardware
- * exception event on exception. Instead controller writes the
- * hardware event to device memory along with optional debug
- * events, raises MSIX and halts. Driver shall read the
- * exception event from device memory and passes it stack for
- * further processing.
- */
- btintel_pcie_read_hwexp(data);
- clear_bit(BTINTEL_PCIE_HWEXP_INPROGRESS, &data->flags);
- }
+ if (!data->hdev)
+ goto out;
+
+ err = btintel_pcie_dump_fwtrigger_event(data);
+ if (err)
+ bt_dev_warn(data->hdev, "failed to log fwtrigger event");
+out:
+ /* Release guard last; matches set in fw_trigger handler. */
+ clear_bit(BTINTEL_PCIE_FWTRIGGER_DUMP_INPROGRESS, &data->flags);
}
static void btintel_pcie_rx_work(struct work_struct *work)
@@ -2650,20 +2712,22 @@ static void btintel_pcie_reset_work(struct work_struct *wk)
btintel_pcie_synchronize_irqs(data);
flush_work(&data->rx_work);
- /* Drain any in-flight coredump and block new ones across reset.
- * Safe from self-deadlock: coredump_work runs on a separate wq.
+ /* Drain any in-flight dump workers and block new ones across reset.
+ * Safe from self-deadlock: they all run on a separate wq.
*/
disable_work_sync(&data->coredump_work);
+ disable_work_sync(&data->hwexp_work);
+ disable_work_sync(&data->fwtrigger_work);
bt_dev_dbg(data->hdev, "Release bluetooth interface");
/* Both reset paths follow the same contract: on success they
* destroy 'data' via device_reprobe() (a fresh probe re-INIT_WORKs
- * the coredump_work with disable count 0), so enable_work() must
+ * the dump workers with disable count 0), so enable_work() must
* NOT be called on the success path. Only the FLR path can fail
* with 'data' still alive, in which case we balance the
- * disable_work_sync() above so a later successful reset is not
- * permanently blocked.
+ * disable_work_sync() calls above so a later successful reset is
+ * not permanently blocked.
*
* pci_lock_rescan_remove() (held above) serializes against PCI
* device addition/removal (hotplug), so no device can be added to
@@ -2674,8 +2738,11 @@ static void btintel_pcie_reset_work(struct work_struct *wk)
goto out;
}
- if (btintel_pcie_perform_flr(data))
+ if (btintel_pcie_perform_flr(data)) {
enable_work(&data->coredump_work);
+ enable_work(&data->hwexp_work);
+ enable_work(&data->fwtrigger_work);
+ }
out:
pci_dev_put(pdev);
@@ -2869,8 +2936,8 @@ static int btintel_pcie_probe(struct pci_dev *pdev,
if (!data->workqueue)
return -ENOMEM;
- data->coredump_workqueue = alloc_ordered_workqueue(KBUILD_MODNAME "_cd", 0);
- if (!data->coredump_workqueue) {
+ data->dump_workqueue = alloc_ordered_workqueue(KBUILD_MODNAME "_cd", 0);
+ if (!data->dump_workqueue) {
destroy_workqueue(data->workqueue);
return -ENOMEM;
}
@@ -2879,6 +2946,8 @@ static int btintel_pcie_probe(struct pci_dev *pdev,
INIT_WORK(&data->rx_work, btintel_pcie_rx_work);
INIT_WORK(&data->reset_work, btintel_pcie_reset_work);
INIT_WORK(&data->coredump_work, btintel_pcie_coredump_worker);
+ INIT_WORK(&data->hwexp_work, btintel_pcie_hwexp_worker);
+ INIT_WORK(&data->fwtrigger_work, btintel_pcie_fwtrigger_worker);
data->boot_stage_cache = 0x00;
data->img_resp_cache = 0x00;
@@ -2921,7 +2990,7 @@ static int btintel_pcie_probe(struct pci_dev *pdev,
/* reset device before exit */
btintel_pcie_reset_bt(data);
- destroy_workqueue(data->coredump_workqueue);
+ destroy_workqueue(data->dump_workqueue);
pci_clear_master(pdev);
@@ -2940,12 +3009,14 @@ static void btintel_pcie_remove(struct pci_dev *pdev)
return;
}
- /* Permanently block coredump triggers and drain the worker before
- * tearing down. Must run before cancel_work_sync(&reset_work) so
- * the disable counter stays >= 1 even after reset_work()'s
+ /* Permanently block all dump triggers and drain the workers before
+ * tearing down. Must run before disable_work_sync(&reset_work) so
+ * the disable counters stay >= 1 even after reset_work()'s
* balanced enable_work() (counter 2 -> 1, never reaching 0).
*/
disable_work_sync(&data->coredump_work);
+ disable_work_sync(&data->hwexp_work);
+ disable_work_sync(&data->fwtrigger_work);
/* Cancel pending reset work. Skip only when remove() is called from
* within the reset work itself (PLDR device_reprobe path) to avoid
@@ -2973,7 +3044,7 @@ static void btintel_pcie_remove(struct pci_dev *pdev)
btintel_pcie_release_hdev(data);
- destroy_workqueue(data->coredump_workqueue);
+ destroy_workqueue(data->dump_workqueue);
destroy_workqueue(data->workqueue);
btintel_pcie_free(data);
@@ -2992,16 +3063,8 @@ static void btintel_pcie_coredump(struct device *dev)
if (!data)
return;
- if (test_and_set_bit(BTINTEL_PCIE_COREDUMP_INPROGRESS, &data->flags))
- return;
-
- data->dmp_hdr.trigger_reason = BTINTEL_PCIE_TRIGGER_REASON_USER_TRIGGER;
- /* queue_work() returns false if the work is disabled (reset or
- * remove in progress); clear the in-progress bit so a later
- * trigger can succeed once the work is re-enabled.
- */
- if (!queue_work(data->coredump_workqueue, &data->coredump_work))
- clear_bit(BTINTEL_PCIE_COREDUMP_INPROGRESS, &data->flags);
+ btintel_pcie_queue_coredump(data,
+ BTINTEL_PCIE_TRIGGER_REASON_USER_TRIGGER);
}
#endif
@@ -3138,12 +3201,8 @@ static int btintel_pcie_resume(struct device *dev)
if (btintel_pcie_in_error(data) ||
btintel_pcie_in_device_halt(data)) {
bt_dev_err(data->hdev, "Controller in error state for D0 entry");
- if (!test_and_set_bit(BTINTEL_PCIE_COREDUMP_INPROGRESS,
- &data->flags)) {
- data->dmp_hdr.trigger_reason =
- BTINTEL_PCIE_TRIGGER_REASON_FW_ASSERT;
- queue_work(data->coredump_workqueue, &data->coredump_work);
- }
+ btintel_pcie_queue_coredump(data,
+ BTINTEL_PCIE_TRIGGER_REASON_FW_ASSERT);
set_bit(BTINTEL_PCIE_CORE_HALTED, &data->flags);
btintel_pcie_reset(data->hdev);
}
diff --git a/drivers/bluetooth/btintel_pcie.h b/drivers/bluetooth/btintel_pcie.h
index 7caee093e316..749369b24031 100644
--- a/drivers/bluetooth/btintel_pcie.h
+++ b/drivers/bluetooth/btintel_pcie.h
@@ -118,7 +118,6 @@ enum {
enum {
BTINTEL_PCIE_CORE_HALTED,
- BTINTEL_PCIE_HWEXP_INPROGRESS,
BTINTEL_PCIE_COREDUMP_INPROGRESS,
BTINTEL_PCIE_FWTRIGGER_DUMP_INPROGRESS,
BTINTEL_PCIE_RECOVERY_IN_PROGRESS,
@@ -466,8 +465,11 @@ struct btintel_pcie_dump_header {
* @workqueue: workqueue for RX work
* @rx_skb_q: SKB queue for RX packet
* @rx_work: RX work struct to process the RX packet in @rx_skb_q
- * @coredump_workqueue: dedicated workqueue for coredump collection
- * @coredump_work: work struct for coredump trace collection
+ * @dump_workqueue: dedicated ordered workqueue serializing the coredump,
+ * hardware exception, and firmware-trigger dump workers
+ * @coredump_work: work struct for DRAM trace coredump collection
+ * @hwexp_work: work struct for hardware exception event read
+ * @fwtrigger_work: work struct for firmware-triggered diagnostic event read
* @dma_pool: DMA pool for descriptors, index array and ci
* @dma_p_addr: DMA address for pool
* @dma_v_addr: address of pool
@@ -516,8 +518,10 @@ struct btintel_pcie_data {
struct work_struct rx_work;
struct work_struct reset_work;
- struct workqueue_struct *coredump_workqueue;
+ struct workqueue_struct *dump_workqueue;
struct work_struct coredump_work;
+ struct work_struct hwexp_work;
+ struct work_struct fwtrigger_work;
struct dma_pool *dma_pool;
dma_addr_t dma_p_addr;
--
2.54.0
^ permalink raw reply related [flat|nested] 2+ messages in thread
* RE: [v1] Bluetooth: btintel_pcie: split coredump worker into per-trigger works
2026-07-02 17:03 [PATCH v1] Bluetooth: btintel_pcie: split coredump worker into per-trigger works Kiran K
@ 2026-07-02 18:48 ` bluez.test.bot
0 siblings, 0 replies; 2+ messages in thread
From: bluez.test.bot @ 2026-07-02 18:48 UTC (permalink / raw)
To: linux-bluetooth, kiran.k
[-- Attachment #1: Type: text/plain, Size: 1181 bytes --]
This is automated email and please do not reply to this email!
Dear submitter,
Thank you for submitting the patches to the linux bluetooth mailing list.
This is a CI test results with your patch series:
PW Link:https://patchwork.kernel.org/project/bluetooth/list/?series=1120618
---Test result---
Test Summary:
CheckPatch PASS 0.77 seconds
VerifyFixes PASS 0.09 seconds
VerifySignedoff PASS 0.09 seconds
GitLint PASS 0.22 seconds
SubjectPrefix PASS 0.08 seconds
BuildKernel PASS 20.77 seconds
CheckAllWarning PASS 22.84 seconds
CheckSparse PASS 22.37 seconds
BuildKernel32 PASS 20.13 seconds
CheckKernelLLVM SKIP 0.00 seconds
TestRunnerSetup PASS 362.20 seconds
IncrementalBuild PASS 19.90 seconds
Details
##############################
Test: CheckKernelLLVM - SKIP
Desc: Build kernel with LLVM + context analysis
Output:
Clang not found
https://github.com/bluez/bluetooth-next/pull/386
---
Regards,
Linux Bluetooth
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2026-07-02 18:48 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-07-02 17:03 [PATCH v1] Bluetooth: btintel_pcie: split coredump worker into per-trigger works Kiran K
2026-07-02 18:48 ` [v1] " bluez.test.bot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox