public inbox for linux-arm-msm@vger.kernel.org
* [PATCH v6] Bluetooth: qca: Fix delayed hw_error handling due to missing wakeup during SSR
@ 2026-04-10  8:52 Shuai Zhang
  2026-04-10  9:38 ` Paul Menzel
  2026-04-10 14:20 ` patchwork-bot+bluetooth
  0 siblings, 2 replies; 4+ messages in thread
From: Shuai Zhang @ 2026-04-10  8:52 UTC
  To: Bartosz Golaszewski, Marcel Holtmann, Luiz Augusto von Dentz
  Cc: linux-arm-msm, linux-bluetooth, linux-kernel, cheng.jiang,
	quic_chezhou, wei.deng, jinwang.li, mengshi.wu, shuai.zhang,
	Bartosz Golaszewski

When the Bluetooth controller encounters a coredump, it triggers
the Subsystem Restart (SSR) mechanism. The controller first
reports the coredump data, and once the data upload is complete,
it sends a hw_error event. The host relies on this event to
proceed with subsequent recovery actions.

If the host has not finished processing the coredump data when the
hw_error event is received, it sets a timer to wait until either the
data processing is complete or the timeout expires before handling
the event.

The current implementation lacks a wakeup trigger. As a result,
even if the coredump data has already been processed, the host
continues to wait until the timer expires, causing unnecessary
delays in handling the hw_error event.

To fix this issue, adds a `wake_up_bit()` call after the host finishes
processing the coredump data. This ensures that the waiting thread is
promptly notified and can proceed to handle the hw_error event without
waiting for the timeout.
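
As a rough userspace analogy of the behaviour described above (a sketch
under assumptions, not the driver code): a pthread condition variable
stands in for the kernel's bit wait queue, `pthread_cond_timedwait()` for
the timed wait in the hw_error path, and `pthread_cond_broadcast()` for
the wakeup that this patch adds. The names `dump_state`, `memdump_worker`
and `wait_for_dump` are hypothetical and exist only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical userspace stand-in for the driver's flag and its waiters. */
struct dump_state {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool collecting;	/* plays the role of QCA_MEMDUMP_COLLECTION */
	bool wake;		/* true: clear and wake; false: clear only */
};

/* Milliseconds elapsed since *from. */
static long elapsed_ms(const struct timespec *from)
{
	struct timespec now;

	clock_gettime(CLOCK_REALTIME, &now);
	return (now.tv_sec - from->tv_sec) * 1000L +
	       (now.tv_nsec - from->tv_nsec) / 1000000L;
}

/* Stand-in for the memdump worker: finishes collection after 50 ms. */
static void *memdump_worker(void *arg)
{
	struct dump_state *s = arg;
	struct timespec delay = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000 };

	nanosleep(&delay, NULL);
	pthread_mutex_lock(&s->lock);
	s->collecting = false;			/* analogous to clearing the bit */
	if (s->wake)
		pthread_cond_broadcast(&s->cond);	/* the wakeup the fix adds */
	pthread_mutex_unlock(&s->lock);
	return NULL;
}

/*
 * Stand-in for the hw_error path: sleep until the flag is cleared and a
 * wakeup arrives, or until timeout_ms passes. Returns the time spent, in ms.
 */
static long wait_for_dump(bool wake, long timeout_ms)
{
	struct dump_state s = { .collecting = true, .wake = wake };
	struct timespec start, deadline;
	pthread_t worker;
	long waited;
	int rc;

	pthread_mutex_init(&s.lock, NULL);
	pthread_cond_init(&s.cond, NULL);

	/* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME deadline. */
	clock_gettime(CLOCK_REALTIME, &start);
	deadline = start;
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	/* Hold the lock so the worker cannot clear the flag before we sleep. */
	pthread_mutex_lock(&s.lock);
	rc = pthread_create(&worker, NULL, memdump_worker, &s);
	assert(rc == 0);
	while (s.collecting) {
		/* The flag is only re-checked when something wakes us up. */
		if (pthread_cond_timedwait(&s.cond, &s.lock, &deadline) == ETIMEDOUT)
			break;
	}
	pthread_mutex_unlock(&s.lock);
	waited = elapsed_ms(&start);

	pthread_join(worker, NULL);
	pthread_cond_destroy(&s.cond);
	pthread_mutex_destroy(&s.lock);
	return waited;
}
```

In this sketch, `wait_for_dump(false, 400)` should return only after
roughly the full 400 ms even though the flag was cleared after 50 ms,
while `wait_for_dump(true, 400)` should return shortly after the 50 ms
delay. POSIX permits spurious wakeups, so the clear-only case may
occasionally return early; the timings are illustrative only.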

Test case:
- Trigger controller coredump using the command: `hcitool cmd 0x3f 0c 26`.
- Use `btmon` to capture HCI logs.
- Observe the time interval between receiving the hw_error event
  and the execution of the power-off sequence in the HCI log.

Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Shuai Zhang <shuai.zhang@oss.qualcomm.com>
---
Changes v6:
- Replace wake_up_bit with clear_and_wake_up_bit 
- Link to v5
  https://lore.kernel.org/all/20260409112233.3326467-1-shuai.zhang@oss.qualcomm.com/

Changes v5:
- Replace clear_and_wake_up_bit with wake_up_bit
- Link to v4
  https://lore.kernel.org/all/20260327083258.1398450-1-shuai.zhang@oss.qualcomm.com/

Changes v4:
- add Acked-by signoff
- Link to v3
  https://lore.kernel.org/all/20251107033924.3707495-1-quic_shuaz@quicinc.com/

Changes v3:
- add Fixes tag
- Link to v2
  https://lore.kernel.org/all/20251106140103.1406081-1-quic_shuaz@quicinc.com/

Changes v2:
- Split timeout conversion into a separate patch.
- Clarified commit messages and added test case description.
- Link to v1
  https://lore.kernel.org/all/20251104112601.2670019-1-quic_shuaz@quicinc.com/
---
 drivers/bluetooth/hci_qca.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
index c17a462ae..228a754a9 100644
--- a/drivers/bluetooth/hci_qca.c
+++ b/drivers/bluetooth/hci_qca.c
@@ -1108,7 +1108,7 @@ static void qca_controller_memdump(struct work_struct *work)
 				qca->qca_memdump = NULL;
 				qca->memdump_state = QCA_MEMDUMP_COLLECTED;
 				cancel_delayed_work(&qca->ctrl_memdump_timeout);
-				clear_bit(QCA_MEMDUMP_COLLECTION, &qca->flags);
+				clear_and_wake_up_bit(QCA_MEMDUMP_COLLECTION, &qca->flags);
 				clear_bit(QCA_IBS_DISABLED, &qca->flags);
 				mutex_unlock(&qca->hci_memdump_lock);
 				return;
@@ -1186,7 +1186,7 @@ static void qca_controller_memdump(struct work_struct *work)
 			kfree(qca->qca_memdump);
 			qca->qca_memdump = NULL;
 			qca->memdump_state = QCA_MEMDUMP_COLLECTED;
-			clear_bit(QCA_MEMDUMP_COLLECTION, &qca->flags);
+			clear_and_wake_up_bit(QCA_MEMDUMP_COLLECTION, &qca->flags);
 		}
 
 		mutex_unlock(&qca->hci_memdump_lock);
-- 
2.34.1



* Re: [PATCH v6] Bluetooth: qca: Fix delayed hw_error handling due to missing wakeup during SSR
  2026-04-10  8:52 [PATCH v6] Bluetooth: qca: Fix delayed hw_error handling due to missing wakeup during SSR Shuai Zhang
@ 2026-04-10  9:38 ` Paul Menzel
  2026-04-10  9:55   ` Shuai Zhang
  2026-04-10 14:20 ` patchwork-bot+bluetooth
  1 sibling, 1 reply; 4+ messages in thread
From: Paul Menzel @ 2026-04-10  9:38 UTC
  To: Shuai Zhang
  Cc: Bartosz Golaszewski, Marcel Holtmann, Luiz Augusto von Dentz,
	linux-arm-msm, linux-bluetooth, linux-kernel, cheng.jiang,
	quic_chezhou, wei.deng, jinwang.li, mengshi.wu,
	Bartosz Golaszewski

Dear Shuai,


Thank you for your patch. Just some last style things. It’d be great if 
you could re-flow the commit message to 75 characters per line. This 
would save some lines.

Am 10.04.26 um 10:52 schrieb Shuai Zhang:
> When Bluetooth controller encounters a coredump, it triggers
> the Subsystem Restart (SSR) mechanism. The controller first
> reports the coredump data, and once the data upload is complete,
> it sends a hw_error event. The host relies on this event to
> proceed with subsequent recovery actions.
> 
> If the host has not finished processing the coredump data
> when the hw_error event is received,
> it sets a timer to wait until either the data processing is complete
> or the timeout expires before handling the event.

Maybe mention the timer value?

> The current implementation lacks a wakeup trigger. As a result,
> even if the coredump data has already been processed, the host
> continues to wait until the timer expires, causing unnecessary
> delays in handling the hw_error event.
> 
> To fix this issue, adds a `wake_up_bit()` call after the host finishes

s/adds/add/

Now that you use `clear_and_wake_up_bit()`, this might confuse readers. 
Maybe:

To fix this issue, also wake up the other thread by using 
`clear_and_wake_up_bit()`.

Feel free to ignore though.

> processing the coredump data. This ensures that the waiting thread is
> promptly notified and can proceed to handle the hw_error event without
> waiting for the timeout.
> 
> Test case:
> - Trigger controller coredump using the command: `hcitool cmd 0x3f 0c 26`.

It’d be great if you mentioned one affected controller.

> - Use `btmon` to capture HCI logs.
> - Observe the time interval between receiving the hw_error event
> and the execution of the power-off sequence in the HCI log.

Please paste the logs.

> Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
> Signed-off-by: Shuai Zhang <shuai.zhang@oss.qualcomm.com>
> ---
> Changes v6:
> - Replace wake_up_bit with clear_and_wake_up_bit
> - Link to v5
>    https://lore.kernel.org/all/20260409112233.3326467-1-shuai.zhang@oss.qualcomm.com/
> 
> Changes v5:
> - Replace clear_and_wake_up_bit with wake_up_bit
> - Link to v4
>    https://lore.kernel.org/all/20260327083258.1398450-1-shuai.zhang@oss.qualcomm.com/
> 
> Changes v4:
> - add Acked-by signoff
> - Link to v3
>    https://lore.kernel.org/all/20251107033924.3707495-1-quic_shuaz@quicinc.com/
> 
> Changes v3:
> - add Fixes tag
> - Link to v2
>    https://lore.kernel.org/all/20251106140103.1406081-1-quic_shuaz@quicinc.com/
> 
> Changes v2:
> - Split timeout conversion into a separate patch.
> - Clarified commit messages and added test case description.
> - Link to v1
>    https://lore.kernel.org/all/20251104112601.2670019-1-quic_shuaz@quicinc.com/
> ---
>   drivers/bluetooth/hci_qca.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
> index c17a462ae..228a754a9 100644
> --- a/drivers/bluetooth/hci_qca.c
> +++ b/drivers/bluetooth/hci_qca.c
> @@ -1108,7 +1108,7 @@ static void qca_controller_memdump(struct work_struct *work)
>   				qca->qca_memdump = NULL;
>   				qca->memdump_state = QCA_MEMDUMP_COLLECTED;
>   				cancel_delayed_work(&qca->ctrl_memdump_timeout);
> -				clear_bit(QCA_MEMDUMP_COLLECTION, &qca->flags);
> +				clear_and_wake_up_bit(QCA_MEMDUMP_COLLECTION, &qca->flags);
>   				clear_bit(QCA_IBS_DISABLED, &qca->flags);
>   				mutex_unlock(&qca->hci_memdump_lock);
>   				return;
> @@ -1186,7 +1186,7 @@ static void qca_controller_memdump(struct work_struct *work)
>   			kfree(qca->qca_memdump);
>   			qca->qca_memdump = NULL;
>   			qca->memdump_state = QCA_MEMDUMP_COLLECTED;
> -			clear_bit(QCA_MEMDUMP_COLLECTION, &qca->flags);
> +			clear_and_wake_up_bit(QCA_MEMDUMP_COLLECTION, &qca->flags);
>   		}
>   
>   		mutex_unlock(&qca->hci_memdump_lock);

With the comments above addressed:

Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>


Kind regards,

Paul


PS: gemini/gemini-3.1-pro-preview has some unrelated(?) comments [1]. 
It’d be great if Qualcomm could look into this.


[1]: 
https://sashiko.dev/#/patchset/20260410085202.4128000-1-shuai.zhang%40oss.qualcomm.com


* Re: [PATCH v6] Bluetooth: qca: Fix delayed hw_error handling due to missing wakeup during SSR
  2026-04-10  9:38 ` Paul Menzel
@ 2026-04-10  9:55   ` Shuai Zhang
  0 siblings, 0 replies; 4+ messages in thread
From: Shuai Zhang @ 2026-04-10  9:55 UTC
  To: Paul Menzel
  Cc: Bartosz Golaszewski, Marcel Holtmann, Luiz Augusto von Dentz,
	linux-arm-msm, linux-bluetooth, linux-kernel, cheng.jiang,
	quic_chezhou, wei.deng, jinwang.li, mengshi.wu,
	Bartosz Golaszewski

Hi Paul,

Thank you for the suggestion. I have updated it in v7.


Thanks,
shuai



* Re: [PATCH v6] Bluetooth: qca: Fix delayed hw_error handling due to missing wakeup during SSR
  2026-04-10  8:52 [PATCH v6] Bluetooth: qca: Fix delayed hw_error handling due to missing wakeup during SSR Shuai Zhang
  2026-04-10  9:38 ` Paul Menzel
@ 2026-04-10 14:20 ` patchwork-bot+bluetooth
  1 sibling, 0 replies; 4+ messages in thread
From: patchwork-bot+bluetooth @ 2026-04-10 14:20 UTC
  To: Shuai Zhang
  Cc: brgl, marcel, luiz.dentz, linux-arm-msm, linux-bluetooth,
	linux-kernel, cheng.jiang, quic_chezhou, wei.deng, jinwang.li,
	mengshi.wu, bartosz.golaszewski

Hello:

This patch was applied to bluetooth/bluetooth-next.git (master)
by Luiz Augusto von Dentz <luiz.von.dentz@intel.com>:

On Fri, 10 Apr 2026 16:52:02 +0800 you wrote:
> When Bluetooth controller encounters a coredump, it triggers
> the Subsystem Restart (SSR) mechanism. The controller first
> reports the coredump data, and once the data upload is complete,
> it sends a hw_error event. The host relies on this event to
> proceed with subsequent recovery actions.
> 
> If the host has not finished processing the coredump data
> when the hw_error event is received,
> it sets a timer to wait until either the data processing is complete
> or the timeout expires before handling the event.
> 
> [...]

Here is the summary with links:
  - [v6] Bluetooth: qca: Fix delayed hw_error handling due to missing wakeup during SSR
    https://git.kernel.org/bluetooth/bluetooth-next/c/9f07d5d04826

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



