* [PATCH v5] Fix SSR(SubSystem Restart) fail when BT_EN is pulled up by hw
@ 2025-08-20 12:06 Shuai Zhang
  2025-08-20 13:32 ` Paul Menzel
  2025-08-23 10:20 ` kernel test robot
  0 siblings, 2 replies; 4+ messages in thread
From: Shuai Zhang @ 2025-08-20 12:06 UTC
  To: linux-bluetooth, linux-arm-msm; +Cc: quic_bt, Shuai Zhang

When the host actively triggers SSR and collects coredump data,
the Bluetooth stack sends a reset command to the controller. However, due
to the inability to clear the QCA_SSR_TRIGGERED and QCA_IBS_DISABLED bits,
the reset command times out.

To address this, this patch clears the QCA_SSR_TRIGGERED and
QCA_IBS_DISABLED flags and adds a 50ms delay after SSR, but only when
HCI_QUIRK_NON_PERSISTENT_SETUP is not set. This ensures the controller
completes the SSR process when BT_EN is always high due to hardware.

For the purpose of HCI_QUIRK_NON_PERSISTENT_SETUP, refer to
commit 740011cfe948 ("Bluetooth: Add new quirk for non-persistent setup
settings").

The HCI_QUIRK_NON_PERSISTENT_SETUP quirk is associated with BT_EN,
and its presence can be used to determine whether BT_EN is defined in DTS.

After SSR, the host does not download the firmware again, so the
controller remains in the IBS_WAKE state. The host needs to
synchronize with the controller to maintain proper operation.

When SSR is triggered multiple times, only the first trigger generates
a coredump file, because the memdump state is not cleared.

Clear the memdump state when SSR completes.

When the SSR duration exceeds 2 seconds, the host TX idle timeout
fires and sets the host TX state to sleep. Because the hardware pulls
up bt_en, the firmware is not downloaded after the SSR. As a result,
the controller does not enter sleep mode. Consequently, when the host
sends a command afterward, it first sends 0xFD to the controller, but
the controller does not respond, leading to a command timeout.

So reset tx_idle_timer after SSR to prevent the host from entering the
TX IBS sleep state.

Signed-off-by: Shuai Zhang <quic_shuaz@quicinc.com>
---
 drivers/bluetooth/hci_qca.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
index 4e56782b0..403d65952 100644
--- a/drivers/bluetooth/hci_qca.c
+++ b/drivers/bluetooth/hci_qca.c
@@ -1653,6 +1653,37 @@ static void qca_hw_error(struct hci_dev *hdev, u8 code)
 		skb_queue_purge(&qca->rx_memdump_q);
 	}
 
+	/*
+	 * If the BT chip's bt_en pin is connected to a 3.3V power supply via
+	 * hardware and always stays high, driver cannot control the bt_en pin.
+	 * As a result, during SSR(SubSystem Restart), QCA_SSR_TRIGGERED and
+	 * QCA_IBS_DISABLED flags cannot be cleared, which leads to a reset
+	 * command timeout.
+	 * Add an msleep delay to ensure controller completes the SSR process.
+	 *
+	 * Host will not download the firmware after SSR, controller to remain
+	 * in the IBS_WAKE state, and the host needs to synchronize with it
+	 *
+	 * Since the bluetooth chip has been reset, clear the memdump state.
+	 */
+	if (!test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) {
+		/*
+		 * When the SSR (Sub-System Restart) duration exceeds 2 seconds,
+		 * it triggers host tx_idle_delay, which sets host TX state
+		 * to sleep. Reset tx_idle_timer after SSR to prevent
+		 * host enter TX IBS_Sloeep mode.
+		 */
+		mod_timer(&qca->tx_idle_timer, jiffies +
+				  msecs_to_jiffies(qca->tx_idle_delay));
+		msleep(50);
+
+		clear_bit(QCA_SSR_TRIGGERED, &qca->flags);
+		clear_bit(QCA_IBS_DISABLED, &qca->flags);
+
+		qca->tx_ibs_state = HCI_IBS_TX_AWAKE;
+		qca->memdump_state = QCA_MEMDUMP_IDLE;
+	}
+
 	clear_bit(QCA_HW_ERROR_EVENT, &qca->flags);
 }
 
-- 
2.34.1
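
The idle-timer problem described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the driver code: `struct ibs_model` and every function name below are hypothetical, and time is passed in explicitly instead of using kernel timers. It shows that if the idle deadline passes during a long SSR, the host would need a wake indication (0xFD) before its next command, while re-arming the timer and forcing the TX state to awake avoids that.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of the host-side TX IBS state. */
enum tx_ibs_state { TX_ASLEEP, TX_AWAKE };

struct ibs_model {
	enum tx_ibs_state tx_state;
	unsigned int idle_deadline_ms;	/* time at which the idle timer fires */
	unsigned int tx_idle_delay_ms;	/* idle timeout, 2000 ms in the report */
};

/* Re-arm the idle timer relative to "now" (the role mod_timer() plays). */
static void ibs_rearm_idle_timer(struct ibs_model *m, unsigned int now_ms)
{
	m->idle_deadline_ms = now_ms + m->tx_idle_delay_ms;
}

/* Advance the model to "now": an expired idle timer puts host TX to sleep. */
static void ibs_tick(struct ibs_model *m, unsigned int now_ms)
{
	if (m->tx_state == TX_AWAKE && now_ms >= m->idle_deadline_ms)
		m->tx_state = TX_ASLEEP;
}

/* True if the host would have to send a wake indication before a command. */
static bool ibs_needs_wake_ind(const struct ibs_model *m)
{
	return m->tx_state != TX_AWAKE;
}

/* Model of the post-SSR recovery: re-arm the timer and mark TX awake. */
static void ibs_recover_after_ssr(struct ibs_model *m, unsigned int now_ms)
{
	ibs_rearm_idle_timer(m, now_ms);
	m->tx_state = TX_AWAKE;
}
```

With a 2000 ms delay, ticking the model at 2500 ms without recovery leaves it asleep, whereas calling `ibs_recover_after_ssr()` at 2500 ms keeps it awake for a command issued shortly afterward.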



* Re: [PATCH v5] Fix SSR(SubSystem Restart) fail when BT_EN is pulled up by hw
  2025-08-20 12:06 [PATCH v5] Fix SSR(SubSystem Restart) fail when BT_EN is pulled up by hw Shuai Zhang
@ 2025-08-20 13:32 ` Paul Menzel
  2025-08-21 11:34   ` Shuai Zhang
  2025-08-23 10:20 ` kernel test robot
  1 sibling, 1 reply; 4+ messages in thread
From: Paul Menzel @ 2025-08-20 13:32 UTC
  To: Shuai Zhang; +Cc: linux-bluetooth, linux-arm-msm, quic_bt

Dear Shuai,


Thank you for the improved version. The commit message summary/title 
still has the space missing before the ( and should be prefixed with 
`Bluetooth:` to pass the linters.

Am 20.08.25 um 14:06 schrieb Shuai Zhang:
> When the host actively triggers SSR and collects coredump data,
> the Bluetooth stack sends a reset command to the controller. However, due
> to the inability to clear the QCA_SSR_TRIGGERED and QCA_IBS_DISABLED bits,
> the reset command times out.
> 
> To address this, this patch clears the QCA_SSR_TRIGGERED and
> QCA_IBS_DISABLED flags and adds a 50ms delay after SSR, but only when
> HCI_QUIRK_NON_PERSISTENT_SETUP is not set. This ensures the controller
> completes the SSR process when BT_EN is always high due to hardware.
> 
> For the purpose of HCI_QUIRK_NON_PERSISTENT_SETUP, please refer to
> commit 740011cfe948 ("Bluetooth: Add new quirk for non-persistent setup
> settings")

Missing dot/period at the end.

Also, the comment in `include/net/bluetooth/hci.h` is more helpful to me 
than the commit.

> The HCI_QUIRK_NON_PERSISTENT_SETUP quirk is associated with BT_EN,
> and its presence can be used to determine whether BT_EN is defined in DTS.
> 
> After SSR, host will not download the firmware, causing
> controller to remain in the IBS_WAKE state. Host needs
> to synchronize with the controller to maintain proper operation.
> 
> Multiple triggers of SSR only first generate coredump file,
> duo to memcoredump_flag no clear.

due to

> add clear coredump flag when ssr completed.
> 
> When the SSR duration exceeds 2 seconds, it triggers
> host tx_idle_timeout, which sets host TX state to sleep. due to the
> hardware pulling up bt_en, the firmware is not downloaded after the SSR.
> As a result, the controller does not enter sleep mode. Consequently,
> when the host sends a command afterward, it sends 0xFD to the controller,
> but the controller does not respond, leading to a command timeout.
> 
> So reset tx_idle_timer after SSR to prevent host enter TX IBS_Sloeep mode.

Sleep

> Signed-off-by: Shuai Zhang <quic_shuaz@quicinc.com>
> ---
>   drivers/bluetooth/hci_qca.c | 31 +++++++++++++++++++++++++++++++
>   1 file changed, 31 insertions(+)
> 
> diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
> index 4e56782b0..403d65952 100644
> --- a/drivers/bluetooth/hci_qca.c
> +++ b/drivers/bluetooth/hci_qca.c
> @@ -1653,6 +1653,37 @@ static void qca_hw_error(struct hci_dev *hdev, u8 code)
>   		skb_queue_purge(&qca->rx_memdump_q);
>   	}
>   
> +	/*
> +	 * If the BT chip's bt_en pin is connected to a 3.3V power supply via
> +	 * hardware and always stays high, driver cannot control the bt_en pin.
> +	 * As a result, during SSR(SubSystem Restart), QCA_SSR_TRIGGERED and

Missing space before (.

> +	 * QCA_IBS_DISABLED flags cannot be cleared, which leads to a reset
> +	 * command timeout.
> +	 * Add an msleep delay to ensure controller completes the SSR process.
> +	 *
> +	 * Host will not download the firmware after SSR, controller to remain
> +	 * in the IBS_WAKE state, and the host needs to synchronize with it
> +	 *
> +	 * Since the bluetooth chip has been reset, clear the memdump state.
> +	 */
> +	if (!test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) {
> +		/*
> +		 * When the SSR (Sub-System Restart) duration exceeds 2 seconds,
> +		 * it triggers host tx_idle_delay, which sets host TX state
> +		 * to sleep. Reset tx_idle_timer after SSR to prevent
> +		 * host enter TX IBS_Sloeep mode.

Sleep?

> +		 */
> +		mod_timer(&qca->tx_idle_timer, jiffies +
> +				  msecs_to_jiffies(qca->tx_idle_delay));
> +		msleep(50);

Add a comment, why 50 ms and not 20 ms or 100 ms?

> +
> +		clear_bit(QCA_SSR_TRIGGERED, &qca->flags);
> +		clear_bit(QCA_IBS_DISABLED, &qca->flags);
> +
> +		qca->tx_ibs_state = HCI_IBS_TX_AWAKE;
> +		qca->memdump_state = QCA_MEMDUMP_IDLE;
> +	}
> +
>   	clear_bit(QCA_HW_ERROR_EVENT, &qca->flags);
>   }
>   


Kind regards,

Paul


* Re: [PATCH v5] Fix SSR(SubSystem Restart) fail when BT_EN is pulled up by hw
  2025-08-20 13:32 ` Paul Menzel
@ 2025-08-21 11:34   ` Shuai Zhang
  0 siblings, 0 replies; 4+ messages in thread
From: Shuai Zhang @ 2025-08-21 11:34 UTC
  To: Paul Menzel; +Cc: linux-bluetooth, linux-arm-msm, quic_bt

Dear Paul,

On 8/20/2025 9:32 PM, Paul Menzel wrote:
> Dear Shuai,
> 
> 
> Thank you for the improved version. The commit message summary/title still has the space missing before the ( and should be prefixed with `Bluetooth:` to pass the linters.
> 
> Am 20.08.25 um 14:06 schrieb Shuai Zhang:
>> When the host actively triggers SSR and collects coredump data,
>> the Bluetooth stack sends a reset command to the controller. However, due
>> to the inability to clear the QCA_SSR_TRIGGERED and QCA_IBS_DISABLED bits,
>> the reset command times out.
>>
>> To address this, this patch clears the QCA_SSR_TRIGGERED and
>> QCA_IBS_DISABLED flags and adds a 50ms delay after SSR, but only when
>> HCI_QUIRK_NON_PERSISTENT_SETUP is not set. This ensures the controller
>> completes the SSR process when BT_EN is always high due to hardware.
>>
>> For the purpose of HCI_QUIRK_NON_PERSISTENT_SETUP, please refer to
>> commit 740011cfe948 ("Bluetooth: Add new quirk for non-persistent setup
>> settings")
> 
> Missing dot/period at the end.
> 
> Also, the comment in `include/net/bluetooth/hci.h` is more helpful to me than the commit.
> 
>> The HCI_QUIRK_NON_PERSISTENT_SETUP quirk is associated with BT_EN,
>> and its presence can be used to determine whether BT_EN is defined in DTS.
>>
>> After SSR, host will not download the firmware, causing
>> controller to remain in the IBS_WAKE state. Host needs
>> to synchronize with the controller to maintain proper operation.
>>
>> Multiple triggers of SSR only first generate coredump file,
>> duo to memcoredump_flag no clear.
> 
> due to
> 
>> add clear coredump flag when ssr completed.
>>
>> When the SSR duration exceeds 2 seconds, it triggers
>> host tx_idle_timeout, which sets host TX state to sleep. due to the
>> hardware pulling up bt_en, the firmware is not downloaded after the SSR.
>> As a result, the controller does not enter sleep mode. Consequently,
>> when the host sends a command afterward, it sends 0xFD to the controller,
>> but the controller does not respond, leading to a command timeout.
>>
>> So reset tx_idle_timer after SSR to prevent host enter TX IBS_Sloeep mode.
> 
> Sleep
> 
>> Signed-off-by: Shuai Zhang <quic_shuaz@quicinc.com>
>> ---
>>   drivers/bluetooth/hci_qca.c | 31 +++++++++++++++++++++++++++++++
>>   1 file changed, 31 insertions(+)
>>
>> diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
>> index 4e56782b0..403d65952 100644
>> --- a/drivers/bluetooth/hci_qca.c
>> +++ b/drivers/bluetooth/hci_qca.c
>> @@ -1653,6 +1653,37 @@ static void qca_hw_error(struct hci_dev *hdev, u8 code)
>>           skb_queue_purge(&qca->rx_memdump_q);
>>       }
>>   +    /*
>> +     * If the BT chip's bt_en pin is connected to a 3.3V power supply via
>> +     * hardware and always stays high, driver cannot control the bt_en pin.
>> +     * As a result, during SSR(SubSystem Restart), QCA_SSR_TRIGGERED and
> 
> Missing space before (.
> 
>> +     * QCA_IBS_DISABLED flags cannot be cleared, which leads to a reset
>> +     * command timeout.
>> +     * Add an msleep delay to ensure controller completes the SSR process.
>> +     *
>> +     * Host will not download the firmware after SSR, controller to remain
>> +     * in the IBS_WAKE state, and the host needs to synchronize with it
>> +     *
>> +     * Since the bluetooth chip has been reset, clear the memdump state.
>> +     */
>> +    if (!test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) {
>> +        /*
>> +         * When the SSR (Sub-System Restart) duration exceeds 2 seconds,
>> +         * it triggers host tx_idle_delay, which sets host TX state
>> +         * to sleep. Reset tx_idle_timer after SSR to prevent
>> +         * host enter TX IBS_Sloeep mode.
> 
> Sleep?
> 
>> +         */
>> +        mod_timer(&qca->tx_idle_timer, jiffies +
>> +                  msecs_to_jiffies(qca->tx_idle_delay));
>> +        msleep(50);
> 
> Add a comment, why 50 ms and not 20 ms or 100 ms?
> 
>> +
>> +        clear_bit(QCA_SSR_TRIGGERED, &qca->flags);
>> +        clear_bit(QCA_IBS_DISABLED, &qca->flags);
>> +
>> +        qca->tx_ibs_state = HCI_IBS_TX_AWAKE;
>> +        qca->memdump_state = QCA_MEMDUMP_IDLE;
>> +    }
>> +
>>       clear_bit(QCA_HW_ERROR_EVENT, &qca->flags);
>>   }
>>   
> 
> 
> Kind regards,
> 
> Paul

Thanks again for your thorough check.
I'll revise the content and share the updated version soon.


BR,
Shuai





* Re: [PATCH v5] Fix SSR(SubSystem Restart) fail when BT_EN is pulled up by hw
  2025-08-20 12:06 [PATCH v5] Fix SSR(SubSystem Restart) fail when BT_EN is pulled up by hw Shuai Zhang
  2025-08-20 13:32 ` Paul Menzel
@ 2025-08-23 10:20 ` kernel test robot
  1 sibling, 0 replies; 4+ messages in thread
From: kernel test robot @ 2025-08-23 10:20 UTC
  To: Shuai Zhang, linux-bluetooth, linux-arm-msm
  Cc: oe-kbuild-all, quic_bt, Shuai Zhang

Hi Shuai,

kernel test robot noticed the following build errors:

[auto build test ERROR on bluetooth-next/master]
[also build test ERROR on bluetooth/master linus/master v6.17-rc2 next-20250822]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Shuai-Zhang/Fix-SSR-SubSystem-Restart-fail-when-BT_EN-is-pulled-up-by-hw/20250820-200925
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git master
patch link:    https://lore.kernel.org/r/20250820120641.1622351-1-quic_shuaz%40quicinc.com
patch subject: [PATCH v5] Fix SSR(SubSystem Restart) fail when BT_EN is pulled up by hw
config: loongarch-allyesconfig (https://download.01.org/0day-ci/archive/20250823/202508231806.zApKGtbH-lkp@intel.com/config)
compiler: clang version 22.0.0git (https://github.com/llvm/llvm-project d26ea02060b1c9db751d188b2edb0059a9eb273d)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250823/202508231806.zApKGtbH-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202508231806.zApKGtbH-lkp@intel.com/

All errors (new ones prefixed by >>):

>> drivers/bluetooth/hci_qca.c:1669:55: error: no member named 'quirks' in 'struct hci_dev'
    1669 |         if (!test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) {
         |                                                        ~~~~  ^
>> drivers/bluetooth/hci_qca.c:1669:55: error: no member named 'quirks' in 'struct hci_dev'
    1669 |         if (!test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) {
         |                                                        ~~~~  ^
>> drivers/bluetooth/hci_qca.c:1669:55: error: no member named 'quirks' in 'struct hci_dev'
    1669 |         if (!test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) {
         |                                                        ~~~~  ^
>> drivers/bluetooth/hci_qca.c:1669:55: error: no member named 'quirks' in 'struct hci_dev'
    1669 |         if (!test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) {
         |                                                        ~~~~  ^
>> drivers/bluetooth/hci_qca.c:1669:55: error: no member named 'quirks' in 'struct hci_dev'
    1669 |         if (!test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) {
         |                                                        ~~~~  ^
   5 errors generated.
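
The error above says that `struct hci_dev` in the tested trees has no `quirks` member. This is presumably because recent Bluetooth trees store quirks in a bitmap (`quirk_flags`) that is read through helpers such as `hci_test_quirk()`; treat that as an assumption to verify against `include/net/bluetooth/hci_core.h` in the target tree. The userspace sketch below only mimics that pattern. Every `mock_`/`MOCK_` name is hypothetical and none of it is kernel API.

```c
#include <limits.h>
#include <stdbool.h>

#define MOCK_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical quirk numbers, chosen only for this illustration. */
enum {
	MOCK_QUIRK_NON_PERSISTENT_SETUP = 3,
	MOCK_NUM_QUIRKS = 40
};

/* Stand-in for a device struct that keeps its quirks in a bitmap. */
struct mock_hci_dev {
	unsigned long quirk_flags[(MOCK_NUM_QUIRKS + MOCK_BITS_PER_LONG - 1) /
				  MOCK_BITS_PER_LONG];
};

/* Accessor helpers: callers never touch the bitmap field directly. */
static void mock_set_quirk(struct mock_hci_dev *hdev, unsigned int nr)
{
	hdev->quirk_flags[nr / MOCK_BITS_PER_LONG] |=
		1UL << (nr % MOCK_BITS_PER_LONG);
}

static bool mock_test_quirk(const struct mock_hci_dev *hdev, unsigned int nr)
{
	return (hdev->quirk_flags[nr / MOCK_BITS_PER_LONG] >>
		(nr % MOCK_BITS_PER_LONG)) & 1UL;
}

/* The condition the patch checks: quirk absent means BT_EN is not driven. */
static bool mock_bt_en_is_hw_pulled_up(const struct mock_hci_dev *hdev)
{
	return !mock_test_quirk(hdev, MOCK_QUIRK_NON_PERSISTENT_SETUP);
}
```

If the assumption holds, the check at line 1669 would presumably become `if (!hci_test_quirk(hdev, HCI_QUIRK_NON_PERSISTENT_SETUP))` when rebased on bluetooth-next.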


vim +1669 drivers/bluetooth/hci_qca.c

  1609	
  1610	static void qca_hw_error(struct hci_dev *hdev, u8 code)
  1611	{
  1612		struct hci_uart *hu = hci_get_drvdata(hdev);
  1613		struct qca_data *qca = hu->priv;
  1614	
  1615		set_bit(QCA_SSR_TRIGGERED, &qca->flags);
  1616		set_bit(QCA_HW_ERROR_EVENT, &qca->flags);
  1617		bt_dev_info(hdev, "mem_dump_status: %d", qca->memdump_state);
  1618	
  1619		if (qca->memdump_state == QCA_MEMDUMP_IDLE) {
  1620			/* If hardware error event received for other than QCA
  1621			 * soc memory dump event, then we need to crash the SOC
  1622			 * and wait here for 8 seconds to get the dump packets.
  1623			 * This will block main thread to be on hold until we
  1624			 * collect dump.
  1625			 */
  1626			set_bit(QCA_MEMDUMP_COLLECTION, &qca->flags);
  1627			qca_send_crashbuffer(hu);
  1628			qca_wait_for_dump_collection(hdev);
  1629		} else if (qca->memdump_state == QCA_MEMDUMP_COLLECTING) {
  1630			/* Let us wait here until memory dump collected or
  1631			 * memory dump timer expired.
  1632			 */
  1633			bt_dev_info(hdev, "waiting for dump to complete");
  1634			qca_wait_for_dump_collection(hdev);
  1635		}
  1636	
  1637		mutex_lock(&qca->hci_memdump_lock);
  1638		if (qca->memdump_state != QCA_MEMDUMP_COLLECTED) {
  1639			bt_dev_err(hu->hdev, "clearing allocated memory due to memdump timeout");
  1640			hci_devcd_abort(hu->hdev);
  1641			if (qca->qca_memdump) {
  1642				kfree(qca->qca_memdump);
  1643				qca->qca_memdump = NULL;
  1644			}
  1645			qca->memdump_state = QCA_MEMDUMP_TIMEOUT;
  1646			cancel_delayed_work(&qca->ctrl_memdump_timeout);
  1647		}
  1648		mutex_unlock(&qca->hci_memdump_lock);
  1649	
  1650		if (qca->memdump_state == QCA_MEMDUMP_TIMEOUT ||
  1651		    qca->memdump_state == QCA_MEMDUMP_COLLECTED) {
  1652			cancel_work_sync(&qca->ctrl_memdump_evt);
  1653			skb_queue_purge(&qca->rx_memdump_q);
  1654		}
  1655	
  1656		/*
  1657		 * If the BT chip's bt_en pin is connected to a 3.3V power supply via
  1658		 * hardware and always stays high, driver cannot control the bt_en pin.
  1659		 * As a result, during SSR(SubSystem Restart), QCA_SSR_TRIGGERED and
  1660		 * QCA_IBS_DISABLED flags cannot be cleared, which leads to a reset
  1661		 * command timeout.
  1662		 * Add an msleep delay to ensure controller completes the SSR process.
  1663		 *
  1664		 * Host will not download the firmware after SSR, controller to remain
  1665		 * in the IBS_WAKE state, and the host needs to synchronize with it
  1666		 *
  1667		 * Since the bluetooth chip has been reset, clear the memdump state.
  1668		 */
> 1669		if (!test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) {
  1670			/*
  1671			 * When the SSR (Sub-System Restart) duration exceeds 2 seconds,
  1672			 * it triggers host tx_idle_delay, which sets host TX state
  1673			 * to sleep. Reset tx_idle_timer after SSR to prevent
  1674			 * host enter TX IBS_Sloeep mode.
  1675			 */
  1676			mod_timer(&qca->tx_idle_timer, jiffies +
  1677					  msecs_to_jiffies(qca->tx_idle_delay));
  1678			msleep(50);
  1679	
  1680			clear_bit(QCA_SSR_TRIGGERED, &qca->flags);
  1681			clear_bit(QCA_IBS_DISABLED, &qca->flags);
  1682	
  1683			qca->tx_ibs_state = HCI_IBS_TX_AWAKE;
  1684			qca->memdump_state = QCA_MEMDUMP_IDLE;
  1685		}
  1686	
  1687		clear_bit(QCA_HW_ERROR_EVENT, &qca->flags);
  1688	}
  1689	
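
The listing above also shows why a stale memdump state suppresses later coredumps: a new dump is requested only in the `QCA_MEMDUMP_IDLE` branch at line 1619. The following userspace model is a minimal sketch of that behaviour, assuming nothing else resets the state between errors and that each collection succeeds. `struct ssr_model` and its functions are hypothetical names, not driver code.

```c
#include <stdbool.h>

/* Hypothetical mirror of the memdump states used in the listing. */
enum memdump_state {
	MEMDUMP_IDLE,
	MEMDUMP_COLLECTING,
	MEMDUMP_COLLECTED,
	MEMDUMP_TIMEOUT
};

struct ssr_model {
	enum memdump_state state;
	unsigned int dumps_triggered;	/* how many coredumps were requested */
	bool reset_state_after_ssr;	/* models the behaviour the patch adds */
};

/* Simplified hardware-error handler: only an IDLE state starts a dump. */
static void ssr_model_hw_error(struct ssr_model *m)
{
	if (m->state == MEMDUMP_IDLE) {
		m->dumps_triggered++;
		/* Assumption for the sketch: collection always succeeds. */
		m->state = MEMDUMP_COLLECTED;
	}

	/* Without this reset, every later error sees a non-IDLE state. */
	if (m->reset_state_after_ssr)
		m->state = MEMDUMP_IDLE;
}
```

Calling `ssr_model_hw_error()` twice yields one dump when `reset_state_after_ssr` is false and two dumps when it is true, which matches the "only the first SSR generates a coredump" symptom in the commit message.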

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

