public inbox for linux-nvme@lists.infradead.org
* [PATCH] nvme: avoid double free special payload
@ 2024-06-11 10:02 brookxu.cn
  2024-06-11 10:53 ` Sagi Grimberg
  0 siblings, 1 reply; 4+ messages in thread
From: brookxu.cn @ 2024-06-11 10:02 UTC
  To: kbusch, axboe, hch, sagi, maxg; +Cc: linux-nvme

From: Chunguang Xu <chunguang.xu@shopee.com>

A request with a special payload, such as a discard, can currently have
that payload freed twice, which corrupts memory and can crash the
kernel. The special payload is freed before the request is retried. If
the device is disconnected before the reconnect succeeds, the retried
request is failed by nvme_fail_nonready_command(), and the special
payload is freed a second time as a result. Fix this by clearing the
RQF_SPECIAL_PAYLOAD flag after the command is cleaned up. This does not
break the subsequent blk-mq cleanup logic, because nvme requests are
never partially completed.

Fixes: 16686f3a6c3c ("nvme: move common call to nvme_cleanup_cmd to core layer")
Signed-off-by: Chunguang Xu <chunguang.xu@shopee.com>
---
 drivers/nvme/host/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f5d150c62955..c40930d10bd3 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -998,6 +998,7 @@ void nvme_cleanup_cmd(struct request *req)
 			clear_bit_unlock(0, &ctrl->discard_page_busy);
 		else
 			kfree(bvec_virt(&req->special_vec));
+		req->rq_flags &= ~RQF_SPECIAL_PAYLOAD;
 	}
 }
 EXPORT_SYMBOL_GPL(nvme_cleanup_cmd);
-- 
2.25.1




* Re: [PATCH] nvme: avoid double free special payload
  2024-06-11 10:02 [PATCH] nvme: avoid double free special payload brookxu.cn
@ 2024-06-11 10:53 ` Sagi Grimberg
  2024-06-11 11:47   ` Max Gurtovoy
  0 siblings, 1 reply; 4+ messages in thread
From: Sagi Grimberg @ 2024-06-11 10:53 UTC
  To: brookxu.cn, kbusch, axboe, hch, maxg; +Cc: linux-nvme

Looks reasonable.

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>






* Re: [PATCH] nvme: avoid double free special payload
  2024-06-11 10:53 ` Sagi Grimberg
@ 2024-06-11 11:47   ` Max Gurtovoy
  2024-06-12 18:01     ` Keith Busch
  0 siblings, 1 reply; 4+ messages in thread
From: Max Gurtovoy @ 2024-06-11 11:47 UTC
  To: Sagi Grimberg, brookxu.cn, kbusch, axboe, hch, maxg; +Cc: linux-nvme

Hi,

On 11/06/2024 13:53, Sagi Grimberg wrote:
> Looks reasonable.
>
> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
>
>
> On 11/06/2024 13:02, brookxu.cn wrote:
>> From: Chunguang Xu <chunguang.xu@shopee.com>
>>
>> A request with a special payload, such as a discard, can currently have
>> that payload freed twice, which corrupts memory and can crash the
>> kernel. The special payload is freed before the request is retried. If
>> the device is disconnected before the reconnect succeeds, the retried
>> request is failed by nvme_fail_nonready_command(), and the special
>> payload is freed a second time as a result. Fix this by clearing the
>> RQF_SPECIAL_PAYLOAD flag after the command is cleaned up. This does not
>> break the subsequent blk-mq cleanup logic, because nvme requests are
>> never partially completed.
>>
>> Fixes: 16686f3a6c3c ("nvme: move common call to nvme_cleanup_cmd to 
>> core layer")
I'm not sure that this commit caused the bug. The nvme_cleanup_cmd() was 
called in this path also before this commit.
>> Signed-off-by: Chunguang Xu <chunguang.xu@shopee.com>

The fix looks fine to me, but the commit message can be improved a bit 
to be more clear about the scenario.

Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>




* Re: [PATCH] nvme: avoid double free special payload
  2024-06-11 11:47   ` Max Gurtovoy
@ 2024-06-12 18:01     ` Keith Busch
  0 siblings, 0 replies; 4+ messages in thread
From: Keith Busch @ 2024-06-12 18:01 UTC
  To: Max Gurtovoy; +Cc: Sagi Grimberg, brookxu.cn, axboe, hch, maxg, linux-nvme

On Tue, Jun 11, 2024 at 02:47:24PM +0300, Max Gurtovoy wrote:
> On 11/06/2024 13:53, Sagi Grimberg wrote:
> > On 11/06/2024 13:02, brookxu.cn wrote:
> > > From: Chunguang Xu <chunguang.xu@shopee.com>
> > > 
> > > A request with a special payload, such as a discard, can currently have
> > > that payload freed twice, which corrupts memory and can crash the
> > > kernel. The special payload is freed before the request is retried. If
> > > the device is disconnected before the reconnect succeeds, the retried
> > > request is failed by nvme_fail_nonready_command(), and the special
> > > payload is freed a second time as a result. Fix this by clearing the
> > > RQF_SPECIAL_PAYLOAD flag after the command is cleaned up. This does not
> > > break the subsequent blk-mq cleanup logic, because nvme requests are
> > > never partially completed.
> > > 
> > > Fixes: 16686f3a6c3c ("nvme: move common call to nvme_cleanup_cmd to
> > > core layer")
> I'm not sure that this commit caused the bug. The nvme_cleanup_cmd() was
> called in this path also before this commit.
> > > Signed-off-by: Chunguang Xu <chunguang.xu@shopee.com>
> 
> The fix looks fine to me, but the commit message can be improved a bit to be
> more clear about the scenario.

Yeah, that's a difficult read. I modified the commit message and applied
to nvme-6.10. Thanks!

