* [PATCH 0/1] scsi: ufs: core: Fix task management completion timeout race
@ 2021-10-13 15:01 Adrian Hunter
2021-10-13 15:01 ` [PATCH 1/1] " Adrian Hunter
0 siblings, 1 reply; 8+ messages in thread
From: Adrian Hunter @ 2021-10-13 15:01 UTC (permalink / raw)
To: Martin K . Petersen
Cc: James E . J . Bottomley, Bean Huo, Avri Altman, Alim Akhtar,
Can Guo, Asutosh Das, Bart Van Assche, linux-scsi
Hi
Please consider this for v5.15 fixes.
Adrian Hunter (1):
scsi: ufs: core: Fix task management completion timeout race
drivers/scsi/ufs/ufshcd.c | 1 -
1 file changed, 1 deletion(-)
Regards
Adrian
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 1/1] scsi: ufs: core: Fix task management completion timeout race
2021-10-13 15:01 [PATCH 0/1] scsi: ufs: core: Fix task management completion timeout race Adrian Hunter
@ 2021-10-13 15:01 ` Adrian Hunter
2021-10-14 4:14 ` Bart Van Assche
2021-10-14 19:18 ` Bart Van Assche
0 siblings, 2 replies; 8+ messages in thread
From: Adrian Hunter @ 2021-10-13 15:01 UTC (permalink / raw)
To: Martin K . Petersen
Cc: James E . J . Bottomley, Bean Huo, Avri Altman, Alim Akhtar,
Can Guo, Asutosh Das, Bart Van Assche, linux-scsi
__ufshcd_issue_tm_cmd() clears req->end_io_data after timing out,
which races with the completion function ufshcd_tmc_handler() which
expects req->end_io_data to have a value.
Note __ufshcd_issue_tm_cmd() and ufshcd_tmc_handler() are already
synchronized using hba->tmf_rqs and hba->outstanding_tasks under the
host_lock spinlock.
It is also not necessary (nor typical) to clear req->end_io_data because
the block layer does it before allocating out requests e.g. via
blk_get_request().
So fix by not clearing it.
Fixes: f5ef336fd2e4c3 ("scsi: ufs: core: Fix task management completion")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
drivers/scsi/ufs/ufshcd.c | 5 -----
1 file changed, 5 deletions(-)
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 95be7ecdfe10..f34b3994d1aa 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6550,11 +6550,6 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
err = wait_for_completion_io_timeout(&wait,
msecs_to_jiffies(TM_CMD_TIMEOUT));
if (!err) {
- /*
- * Make sure that ufshcd_compl_tm() does not trigger a
- * use-after-free.
- */
- req->end_io_data = NULL;
ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_ERR);
dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n",
__func__, tm_function);
--
2.25.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] scsi: ufs: core: Fix task management completion timeout race
2021-10-13 15:01 ` [PATCH 1/1] " Adrian Hunter
@ 2021-10-14 4:14 ` Bart Van Assche
2021-10-14 6:02 ` Adrian Hunter
2021-10-14 19:18 ` Bart Van Assche
1 sibling, 1 reply; 8+ messages in thread
From: Bart Van Assche @ 2021-10-14 4:14 UTC (permalink / raw)
To: Adrian Hunter, Martin K . Petersen
Cc: James E . J . Bottomley, Bean Huo, Avri Altman, Alim Akhtar,
Can Guo, Asutosh Das, linux-scsi
On 10/13/21 08:01, Adrian Hunter wrote:
> __ufshcd_issue_tm_cmd() clears req->end_io_data after timing out,
> which races with the completion function ufshcd_tmc_handler() which
> expects req->end_io_data to have a value.
>
> Note __ufshcd_issue_tm_cmd() and ufshcd_tmc_handler() are already
> synchronized using hba->tmf_rqs and hba->outstanding_tasks under the
> host_lock spinlock.
>
> It is also not necessary (nor typical) to clear req->end_io_data because
> the block layer does it before allocating out requests e.g. via
> blk_get_request().
>
> So fix by not clearing it.
>
> Fixes: f5ef336fd2e4c3 ("scsi: ufs: core: Fix task management completion")
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
> drivers/scsi/ufs/ufshcd.c | 5 -----
> 1 file changed, 5 deletions(-)
>
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 95be7ecdfe10..f34b3994d1aa 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -6550,11 +6550,6 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
> err = wait_for_completion_io_timeout(&wait,
> msecs_to_jiffies(TM_CMD_TIMEOUT));
> if (!err) {
> - /*
> - * Make sure that ufshcd_compl_tm() does not trigger a
> - * use-after-free.
> - */
> - req->end_io_data = NULL;
> ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_ERR);
> dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n",
> __func__, tm_function);
With this patch applied ufshcd_tmc_handler() can trigger a
use-after-free of the stack memory used for the 'wait' completion.
Wouldn't it be better to keep the code that clears req->end_io_data and
to change complete(c) into if(c) complete(c) in ufshcd_tmc_handler()?
Thanks,
Bart.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] scsi: ufs: core: Fix task management completion timeout race
2021-10-14 4:14 ` Bart Van Assche
@ 2021-10-14 6:02 ` Adrian Hunter
2021-10-14 16:47 ` Bart Van Assche
0 siblings, 1 reply; 8+ messages in thread
From: Adrian Hunter @ 2021-10-14 6:02 UTC (permalink / raw)
To: Bart Van Assche, Martin K . Petersen
Cc: James E . J . Bottomley, Bean Huo, Avri Altman, Alim Akhtar,
Can Guo, Asutosh Das, linux-scsi
On 14/10/2021 07:14, Bart Van Assche wrote:
> On 10/13/21 08:01, Adrian Hunter wrote:
>> __ufshcd_issue_tm_cmd() clears req->end_io_data after timing out,
>> which races with the completion function ufshcd_tmc_handler() which
>> expects req->end_io_data to have a value.
>>
>> Note __ufshcd_issue_tm_cmd() and ufshcd_tmc_handler() are already
>> synchronized using hba->tmf_rqs and hba->outstanding_tasks under the
>> host_lock spinlock.
>>
>> It is also not necessary (nor typical) to clear req->end_io_data because
>> the block layer does it before allocating out requests e.g. via
>> blk_get_request().
>>
>> So fix by not clearing it.
>>
>> Fixes: f5ef336fd2e4c3 ("scsi: ufs: core: Fix task management completion")
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>> ---
>> drivers/scsi/ufs/ufshcd.c | 5 -----
>> 1 file changed, 5 deletions(-)
>>
>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>> index 95be7ecdfe10..f34b3994d1aa 100644
>> --- a/drivers/scsi/ufs/ufshcd.c
>> +++ b/drivers/scsi/ufs/ufshcd.c
>> @@ -6550,11 +6550,6 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
>> err = wait_for_completion_io_timeout(&wait,
>> msecs_to_jiffies(TM_CMD_TIMEOUT));
>> if (!err) {
>> - /*
>> - * Make sure that ufshcd_compl_tm() does not trigger a
>> - * use-after-free.
>> - */
>> - req->end_io_data = NULL;
>> ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_ERR);
>> dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n",
>> __func__, tm_function);
>
> With this patch applied ufshcd_tmc_handler() can trigger a use-after-free of the stack memory used for the 'wait' completion.
AFAICT that would only happen because blk_mq_tagset_busy_iter() is not synchronized with respect to blk_put_request().
But ufshcd_tmc_handler() does not use blk_mq_tagset_busy_iter() anymore, so that can't happen.
Wouldn't it be better to keep the code that clears req->end_io_data and to change complete(c) into if(c) complete(c) in ufshcd_tmc_handler()?
If that were needed, it would imply the synchronization was broken i.e. why are we referencing a request that has already been through blk_put_request()?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] scsi: ufs: core: Fix task management completion timeout race
2021-10-14 6:02 ` Adrian Hunter
@ 2021-10-14 16:47 ` Bart Van Assche
2021-10-14 18:50 ` Adrian Hunter
0 siblings, 1 reply; 8+ messages in thread
From: Bart Van Assche @ 2021-10-14 16:47 UTC (permalink / raw)
To: Adrian Hunter, Martin K . Petersen
Cc: James E . J . Bottomley, Bean Huo, Avri Altman, Alim Akhtar,
Can Guo, Asutosh Das, linux-scsi
On 10/13/21 11:02 PM, Adrian Hunter wrote:
> On 14/10/2021 07:14, Bart Van Assche wrote:
>> Wouldn't it be better to keep the code that clears req->end_io_data
>> and to change complete(c) into if(c) complete(c) in
>> ufshcd_tmc_handler()?
>
> If that were needed, it would imply the synchronization was broken
> i.e. why are we referencing a request that has already been through
> blk_put_request()?
The scenario I'm worried about is as follows:
* __ufshcd_issue_tm_cmd() issues a task management function.
* No completion is received before TM_CMD_TIMEOUT has expired (100 ms).
* ufshcd_clear_tm_cmd() fails.
* The TMF completes, ufshcd_tmc_handler() is called and that function
calls complete(req->end_io_data).
Can this happen?
I agree that this scenario involves completion of a request that has
already been through blk_put_request().
Thanks,
Bart.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] scsi: ufs: core: Fix task management completion timeout race
2021-10-14 16:47 ` Bart Van Assche
@ 2021-10-14 18:50 ` Adrian Hunter
2021-10-15 5:41 ` Adrian Hunter
0 siblings, 1 reply; 8+ messages in thread
From: Adrian Hunter @ 2021-10-14 18:50 UTC (permalink / raw)
To: Bart Van Assche, Martin K . Petersen
Cc: James E . J . Bottomley, Bean Huo, Avri Altman, Alim Akhtar,
Can Guo, Asutosh Das, linux-scsi
On 14/10/2021 19:47, Bart Van Assche wrote:
> On 10/13/21 11:02 PM, Adrian Hunter wrote:
>> On 14/10/2021 07:14, Bart Van Assche wrote:
>>> Wouldn't it be better to keep the code that clears req->end_io_data
>>> and to change complete(c) into if(c) complete(c) in
>>> ufshcd_tmc_handler()?
>>
>> If that were needed, it would imply the synchronization was broken
>> i.e. why are we referencing a request that has already been through
>> blk_put_request()?
>
> The scenario I'm worried about is as follows:
> * __ufshcd_issue_tm_cmd() issues a task management function.
> * No completion is received before TM_CMD_TIMEOUT has expired (100 ms).
> * ufshcd_clear_tm_cmd() fails.
> * The TMF completes, ufshcd_tmc_handler() is called and that function calls complete(req->end_io_data).
>
> Can this happen?
No because the tag's bit is cleared from outstanding_tasks before blk_put_request() and
access to outstanding_tasks is protected by host_lock in both __ufshcd_issue_tm_cmd()
and ufshcd_clear_tm_cmd().
>
> I agree that this scenario involves completion of a request that has already been through blk_put_request().
>
> Thanks,
>
> Bart.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] scsi: ufs: core: Fix task management completion timeout race
2021-10-13 15:01 ` [PATCH 1/1] " Adrian Hunter
2021-10-14 4:14 ` Bart Van Assche
@ 2021-10-14 19:18 ` Bart Van Assche
1 sibling, 0 replies; 8+ messages in thread
From: Bart Van Assche @ 2021-10-14 19:18 UTC (permalink / raw)
To: Adrian Hunter, Martin K . Petersen
Cc: James E . J . Bottomley, Bean Huo, Avri Altman, Alim Akhtar,
Can Guo, Asutosh Das, linux-scsi
On 10/13/21 8:01 AM, Adrian Hunter wrote:
> __ufshcd_issue_tm_cmd() clears req->end_io_data after timing out,
> which races with the completion function ufshcd_tmc_handler() which
> expects req->end_io_data to have a value.
>
> Note __ufshcd_issue_tm_cmd() and ufshcd_tmc_handler() are already
> synchronized using hba->tmf_rqs and hba->outstanding_tasks under the
> host_lock spinlock.
>
> It is also not necessary (nor typical) to clear req->end_io_data because
> the block layer does it before allocating out requests e.g. via
> blk_get_request().
>
> So fix by not clearing it.
>
> Fixes: f5ef336fd2e4c3 ("scsi: ufs: core: Fix task management completion")
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
> drivers/scsi/ufs/ufshcd.c | 5 -----
> 1 file changed, 5 deletions(-)
>
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 95be7ecdfe10..f34b3994d1aa 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -6550,11 +6550,6 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
> err = wait_for_completion_io_timeout(&wait,
> msecs_to_jiffies(TM_CMD_TIMEOUT));
> if (!err) {
> - /*
> - * Make sure that ufshcd_compl_tm() does not trigger a
> - * use-after-free.
> - */
> - req->end_io_data = NULL;
> ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_ERR);
> dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n",
> __func__, tm_function);
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] scsi: ufs: core: Fix task management completion timeout race
2021-10-14 18:50 ` Adrian Hunter
@ 2021-10-15 5:41 ` Adrian Hunter
0 siblings, 0 replies; 8+ messages in thread
From: Adrian Hunter @ 2021-10-15 5:41 UTC (permalink / raw)
To: Bart Van Assche, Martin K . Petersen
Cc: James E . J . Bottomley, Bean Huo, Avri Altman, Alim Akhtar,
Can Guo, Asutosh Das, linux-scsi
On 14/10/2021 21:50, Adrian Hunter wrote:
> On 14/10/2021 19:47, Bart Van Assche wrote:
>> On 10/13/21 11:02 PM, Adrian Hunter wrote:
>>> On 14/10/2021 07:14, Bart Van Assche wrote:
>>>> Wouldn't it be better to keep the code that clears req->end_io_data
>>>> and to change complete(c) into if(c) complete(c) in
>>>> ufshcd_tmc_handler()?
>>>
>>> If that were needed, it would imply the synchronization was broken
>>> i.e. why are we referencing a request that has already been through
>>> blk_put_request()?
>>
>> The scenario I'm worried about is as follows:
>> * __ufshcd_issue_tm_cmd() issues a task management function.
>> * No completion is received before TM_CMD_TIMEOUT has expired (100 ms).
>> * ufshcd_clear_tm_cmd() fails.
>> * The TMF completes, ufshcd_tmc_handler() is called and that function calls complete(req->end_io_data).
>>
>> Can this happen?
>
> No because the tag's bit is cleared from outstanding_tasks before blk_put_request() and
> access to outstanding_tasks is protected by host_lock in both __ufshcd_issue_tm_cmd()
> and ufshcd_clear_tm_cmd().
Although I just noticed a different issue.
In ufshcd_tmc_handler() the task doorbell register needs to be read
in conjunction with outstanding_tasks i.e. under the spinlock
I will send a V2.
>
>>
>> I agree that this scenario involves completion of a request that has already been through blk_put_request().
>>
>> Thanks,
>>
>> Bart.
>
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2021-10-15 5:41 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-10-13 15:01 [PATCH 0/1] scsi: ufs: core: Fix task management completion timeout race Adrian Hunter
2021-10-13 15:01 ` [PATCH 1/1] " Adrian Hunter
2021-10-14 4:14 ` Bart Van Assche
2021-10-14 6:02 ` Adrian Hunter
2021-10-14 16:47 ` Bart Van Assche
2021-10-14 18:50 ` Adrian Hunter
2021-10-15 5:41 ` Adrian Hunter
2021-10-14 19:18 ` Bart Van Assche
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox