linux-kernel.vger.kernel.org archive mirror
* [PATCH v2] scsi: ufs: core: Fix data race in CPU latency PM QoS request handling
@ 2025-09-02  7:48 Zhongqiu Han
  2025-09-02 12:39 ` Peter Wang (王信友)
  0 siblings, 1 reply; 3+ messages in thread
From: Zhongqiu Han @ 2025-09-02  7:48 UTC (permalink / raw)
  To: alim.akhtar, avri.altman, bvanassche, James.Bottomley,
	martin.petersen
  Cc: peter.wang, tanghuan, liu.song13, quic_nguyenb, viro, huobean,
	adrian.hunter, can.guo, ebiggers, neil.armstrong,
	angelogioacchino.delregno, quic_narepall, quic_mnaresh,
	linux-scsi, linux-kernel, nitin.rawat, ziqi.chen, zhongqiu.han

The cpu_latency_qos_add/remove/update_request interfaces lack internal
synchronization by design, requiring the caller to ensure thread safety.
The current implementation relies on the `pm_qos_enabled` flag, which is
insufficient to prevent concurrent access and cannot serve as a proper
synchronization mechanism. This has led to data races and list corruption
issues.

A typical race condition call trace is:

[Thread A]
ufshcd_pm_qos_exit()
  --> cpu_latency_qos_remove_request()
    --> cpu_latency_qos_apply();
      --> pm_qos_update_target()
        --> plist_del              <--(1) delete plist node
    --> memset(req, 0, sizeof(*req));
  --> hba->pm_qos_enabled = false;

[Thread B]
ufshcd_devfreq_target
  --> ufshcd_devfreq_scale
    --> ufshcd_scale_clks
      --> ufshcd_pm_qos_update     <--(2) pm_qos_enabled is true
        --> cpu_latency_qos_update_request
          --> pm_qos_update_target
            --> plist_del          <--(3) plist node use-after-free

This patch introduces a dedicated mutex to serialize PM QoS operations,
preventing data races and ensuring safe access to PM QoS resources.
Additionally, READ_ONCE() is used in the sysfs interface to ensure atomic
read access to the pm_qos_enabled flag.
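For illustration, the serialized pattern can be sketched as a userspace
analogue, with a pthread mutex standing in for the kernel mutex and an
int standing in for struct pm_qos_request (all names here are
illustrative, not kernel APIs):

```c
#include <pthread.h>
#include <stdbool.h>

/*
 * Userspace sketch of the fix: the enabled flag and the request it
 * guards are only touched while holding one mutex, so an exit() racing
 * with update() can no longer free the request between update()'s flag
 * check and its use of the request.
 */
struct fake_hba {
	pthread_mutex_t pm_qos_mutex;
	bool pm_qos_enabled;
	int pm_qos_value;	/* stands in for struct pm_qos_request */
};

static void fake_pm_qos_exit(struct fake_hba *hba)
{
	pthread_mutex_lock(&hba->pm_qos_mutex);
	if (hba->pm_qos_enabled) {
		hba->pm_qos_value = 0;		/* "remove" the request */
		hba->pm_qos_enabled = false;
	}
	pthread_mutex_unlock(&hba->pm_qos_mutex);
}

static void fake_pm_qos_update(struct fake_hba *hba, int val)
{
	pthread_mutex_lock(&hba->pm_qos_mutex);
	if (hba->pm_qos_enabled)	/* check and use are one atomic step */
		hba->pm_qos_value = val;
	pthread_mutex_unlock(&hba->pm_qos_mutex);
}
```

After fake_pm_qos_exit() runs, a concurrent fake_pm_qos_update() sees the
flag cleared under the same lock and becomes a no-op, which is the
behavior the patch establishes for the real ufshcd paths.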

Fixes: 2777e73fc154 ("scsi: ufs: core: Add CPU latency QoS support for UFS driver")
Signed-off-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
---
v1 -> v2:
- Fix misleading indentation by adding braces to if statements in pm_qos logic.
- Resolve checkpatch strict mode warning by adding an inline comment for pm_qos_mutex.
- Link to v1: https://lore.kernel.org/all/20250901085117.86160-1-zhongqiu.han@oss.qualcomm.com/

 drivers/ufs/core/ufs-sysfs.c |  2 +-
 drivers/ufs/core/ufshcd.c    | 25 ++++++++++++++++++++++---
 include/ufs/ufshcd.h         |  3 +++
 3 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/drivers/ufs/core/ufs-sysfs.c b/drivers/ufs/core/ufs-sysfs.c
index 4bd7d491e3c5..8f7975010513 100644
--- a/drivers/ufs/core/ufs-sysfs.c
+++ b/drivers/ufs/core/ufs-sysfs.c
@@ -512,7 +512,7 @@ static ssize_t pm_qos_enable_show(struct device *dev,
 {
 	struct ufs_hba *hba = dev_get_drvdata(dev);
 
-	return sysfs_emit(buf, "%d\n", hba->pm_qos_enabled);
+	return sysfs_emit(buf, "%d\n", READ_ONCE(hba->pm_qos_enabled));
 }
 
 /**
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 926650412eaa..98b9ce583386 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -1047,14 +1047,19 @@ EXPORT_SYMBOL_GPL(ufshcd_is_hba_active);
  */
 void ufshcd_pm_qos_init(struct ufs_hba *hba)
 {
+	mutex_lock(&hba->pm_qos_mutex);
 
-	if (hba->pm_qos_enabled)
+	if (hba->pm_qos_enabled) {
+		mutex_unlock(&hba->pm_qos_mutex);
 		return;
+	}
 
 	cpu_latency_qos_add_request(&hba->pm_qos_req, PM_QOS_DEFAULT_VALUE);
 
 	if (cpu_latency_qos_request_active(&hba->pm_qos_req))
 		hba->pm_qos_enabled = true;
+
+	mutex_unlock(&hba->pm_qos_mutex);
 }
 
 /**
@@ -1063,11 +1068,16 @@ void ufshcd_pm_qos_init(struct ufs_hba *hba)
  */
 void ufshcd_pm_qos_exit(struct ufs_hba *hba)
 {
-	if (!hba->pm_qos_enabled)
+	mutex_lock(&hba->pm_qos_mutex);
+
+	if (!hba->pm_qos_enabled) {
+		mutex_unlock(&hba->pm_qos_mutex);
 		return;
+	}
 
 	cpu_latency_qos_remove_request(&hba->pm_qos_req);
 	hba->pm_qos_enabled = false;
+	mutex_unlock(&hba->pm_qos_mutex);
 }
 
 /**
@@ -1077,10 +1087,15 @@ void ufshcd_pm_qos_exit(struct ufs_hba *hba)
  */
 static void ufshcd_pm_qos_update(struct ufs_hba *hba, bool on)
 {
-	if (!hba->pm_qos_enabled)
+	mutex_lock(&hba->pm_qos_mutex);
+
+	if (!hba->pm_qos_enabled) {
+		mutex_unlock(&hba->pm_qos_mutex);
 		return;
+	}
 
 	cpu_latency_qos_update_request(&hba->pm_qos_req, on ? 0 : PM_QOS_DEFAULT_VALUE);
+	mutex_unlock(&hba->pm_qos_mutex);
 }
 
 /**
@@ -10764,6 +10779,10 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem *mmio_base, unsigned int irq)
 	mutex_init(&hba->ee_ctrl_mutex);
 
 	mutex_init(&hba->wb_mutex);
+
+	/* Initialize mutex for PM QoS request synchronization */
+	mutex_init(&hba->pm_qos_mutex);
+
 	init_rwsem(&hba->clk_scaling_lock);
 
 	ufshcd_init_clk_gating(hba);
diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
index 30ff169878dc..a16f857a052f 100644
--- a/include/ufs/ufshcd.h
+++ b/include/ufs/ufshcd.h
@@ -962,6 +962,7 @@ enum ufshcd_mcq_opr {
  * @ufs_rtc_update_work: A work for UFS RTC periodic update
  * @pm_qos_req: PM QoS request handle
  * @pm_qos_enabled: flag to check if pm qos is enabled
+ * @pm_qos_mutex: synchronizes PM QoS request and status updates
  * @critical_health_count: count of critical health exceptions
  * @dev_lvl_exception_count: count of device level exceptions since last reset
  * @dev_lvl_exception_id: vendor specific information about the
@@ -1135,6 +1136,8 @@ struct ufs_hba {
 	struct delayed_work ufs_rtc_update_work;
 	struct pm_qos_request pm_qos_req;
 	bool pm_qos_enabled;
+	/* synchronizes PM QoS request and status updates */
+	struct mutex pm_qos_mutex;
 
 	int critical_health_count;
 	atomic_t dev_lvl_exception_count;
-- 
2.43.0



* Re: [PATCH v2] scsi: ufs: core: Fix data race in CPU latency PM QoS request handling
  2025-09-02  7:48 [PATCH v2] scsi: ufs: core: Fix data race in CPU latency PM QoS request handling Zhongqiu Han
@ 2025-09-02 12:39 ` Peter Wang (王信友)
  2025-09-03  7:10   ` Zhongqiu Han
  0 siblings, 1 reply; 3+ messages in thread
From: Peter Wang (王信友) @ 2025-09-02 12:39 UTC (permalink / raw)
  To: martin.petersen@oracle.com, James.Bottomley@HansenPartnership.com,
	alim.akhtar@samsung.com, avri.altman@wdc.com,
	zhongqiu.han@oss.qualcomm.com, bvanassche@acm.org
  Cc: tanghuan@vivo.com, AngeloGioacchino Del Regno,
	linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org,
	adrian.hunter@intel.com, ebiggers@kernel.org,
	quic_mnaresh@quicinc.com, ziqi.chen@oss.qualcomm.com,
	viro@zeniv.linux.org.uk, quic_narepall@quicinc.com,
	nitin.rawat@oss.qualcomm.com, quic_nguyenb@quicinc.com,
	huobean@gmail.com, neil.armstrong@linaro.org,
	liu.song13@zte.com.cn, can.guo@oss.qualcomm.com

On Tue, 2025-09-02 at 15:48 +0800, Zhongqiu Han wrote:

Hi Zhongqiu,

Introducing an additional mutex lock would impact the efficiency of
devfreq.
Wouldn’t it be better to simply adjust the sequence to avoid race
conditions?
For instance,
ufshcd_pm_qos_exit(hba);
ufshcd_exit_clk_scaling(hba);
could be changed to
ufshcd_exit_clk_scaling(hba);
ufshcd_pm_qos_exit(hba);
This ensures that clock scaling is stopped before pm_qos is removed.

Thanks.
Peter




* Re: [PATCH v2] scsi: ufs: core: Fix data race in CPU latency PM QoS request handling
  2025-09-02 12:39 ` Peter Wang (王信友)
@ 2025-09-03  7:10   ` Zhongqiu Han
  0 siblings, 0 replies; 3+ messages in thread
From: Zhongqiu Han @ 2025-09-03  7:10 UTC (permalink / raw)
  To: Peter Wang (王信友), martin.petersen@oracle.com,
	James.Bottomley@HansenPartnership.com, alim.akhtar@samsung.com,
	avri.altman@wdc.com, bvanassche@acm.org
  Cc: tanghuan@vivo.com, AngeloGioacchino Del Regno,
	linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org,
	adrian.hunter@intel.com, ebiggers@kernel.org,
	quic_mnaresh@quicinc.com, ziqi.chen@oss.qualcomm.com,
	viro@zeniv.linux.org.uk, quic_narepall@quicinc.com,
	nitin.rawat@oss.qualcomm.com, quic_nguyenb@quicinc.com,
	huobean@gmail.com, neil.armstrong@linaro.org,
	liu.song13@zte.com.cn, can.guo@oss.qualcomm.com, zhongqiu.han

On 9/2/2025 8:39 PM, Peter Wang (王信友) wrote:
> 
> Hi Zhongqiu,
> 
> Introducing an additional mutex lock would impact the efficiency of
> devfreq.
> Wouldn’t it be better to simply adjust the sequence to avoid race
> conditions?
> For instance,
> ufshcd_pm_qos_exit(hba);
> ufshcd_exit_clk_scaling(hba);
> could be changed to
> ufshcd_exit_clk_scaling(hba);
> ufshcd_pm_qos_exit(hba);
> This ensures that clock scaling is stopped before pm_qos is removed.
> 
> Thanks.
> Peter
> 

Hi Peter,
Thank you for your review and suggestion.

1. While adjusting the sequence of operations may help avoid this
particular race, it is not a reliable solution and increases the risk
of future maintenance issues.

2. There are additional race paths beyond the one discussed. For
example, user-triggered writes to the pm_qos_enable sysfs attribute
(pm_qos_enable_store) can occur at unpredictable times, potentially
leading to concurrent access to PM QoS resources.

3. The critical section protected by the mutex is small, so the
performance impact is expected to be minimal.
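The sysfs path in point 2 is the same check-then-act window as the
devfreq one. Interleaving it by hand in a userspace sketch (illustrative
names, no locking, as before the patch) shows the stale use
deterministically:

```c
#include <stdbool.h>

/* Unprotected flag + request, modeling the pre-patch state. */
struct fake_hba {
	bool pm_qos_enabled;
	int pm_qos_value;	/* stands in for struct pm_qos_request */
};

/* Analogue of ufshcd_pm_qos_exit() / pm_qos_enable_store(0). */
static void fake_pm_qos_exit(struct fake_hba *hba)
{
	hba->pm_qos_value = 0;		/* request removed and zeroed */
	hba->pm_qos_enabled = false;
}

/* ufshcd_pm_qos_update() analogue, split to expose the race window. */
static bool fake_update_check(struct fake_hba *hba)
{
	return hba->pm_qos_enabled;	/* (2) flag still reads true */
}

static void fake_update_use(struct fake_hba *hba, int val)
{
	hba->pm_qos_value = val;	/* (3) touches the dead request */
}
```

Running check, then exit, then use reproduces steps (2) and (3) of the
trace: the check passes, the request is torn down, and the update still
writes through it. The mutex in this patch makes that interleaving
impossible.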

Thanks again for your feedback!


-- 
Thx and BRs,
Zhongqiu Han

