* [PATCH] nvme: avoid double free special payload
@ 2024-06-11 10:02 brookxu.cn
2024-06-11 10:53 ` Sagi Grimberg
0 siblings, 1 reply; 4+ messages in thread
From: brookxu.cn @ 2024-06-11 10:02 UTC (permalink / raw)
To: kbusch, axboe, hch, sagi, maxg; +Cc: linux-nvme
From: Chunguang Xu <chunguang.xu@shopee.com>
Currently we may double free the special payload of some requests, such
as discard. This corrupts memory and leads to a kernel crash. We free
the special payload before retrying a request. If the device is
disconnected before the reconnect succeeds, the request is failed by
nvme_fail_nonready_command(), and as a result the special payload is
freed a second time. To fix this, clear the RQF_SPECIAL_PAYLOAD bit
after cleaning up the command. This does not break the subsequent
cleanup logic of blk-mq, as an nvme request is never partially completed.
Fixes: 16686f3a6c3c ("nvme: move common call to nvme_cleanup_cmd to core layer")
Signed-off-by: Chunguang Xu <chunguang.xu@shopee.com>
---
drivers/nvme/host/core.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f5d150c62955..c40930d10bd3 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -998,6 +998,7 @@ void nvme_cleanup_cmd(struct request *req)
 			clear_bit_unlock(0, &ctrl->discard_page_busy);
 		else
 			kfree(bvec_virt(&req->special_vec));
+		req->rq_flags &= ~RQF_SPECIAL_PAYLOAD;
 	}
 }
 EXPORT_SYMBOL_GPL(nvme_cleanup_cmd);
--
2.25.1
* Re: [PATCH] nvme: avoid double free special payload
2024-06-11 10:02 [PATCH] nvme: avoid double free special payload brookxu.cn
@ 2024-06-11 10:53 ` Sagi Grimberg
2024-06-11 11:47 ` Max Gurtovoy
0 siblings, 1 reply; 4+ messages in thread
From: Sagi Grimberg @ 2024-06-11 10:53 UTC (permalink / raw)
To: brookxu.cn, kbusch, axboe, hch, maxg; +Cc: linux-nvme
Looks reasonable.
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
On 11/06/2024 13:02, brookxu.cn wrote:
> From: Chunguang Xu <chunguang.xu@shopee.com>
>
> Currently we may double free the special payload of some requests, such
> as discard. This corrupts memory and leads to a kernel crash. We free
> the special payload before retrying a request. If the device is
> disconnected before the reconnect succeeds, the request is failed by
> nvme_fail_nonready_command(), and as a result the special payload is
> freed a second time. To fix this, clear the RQF_SPECIAL_PAYLOAD bit
> after cleaning up the command. This does not break the subsequent
> cleanup logic of blk-mq, as an nvme request is never partially completed.
>
> Fixes: 16686f3a6c3c ("nvme: move common call to nvme_cleanup_cmd to core layer")
> Signed-off-by: Chunguang Xu <chunguang.xu@shopee.com>
> ---
> drivers/nvme/host/core.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index f5d150c62955..c40930d10bd3 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -998,6 +998,7 @@ void nvme_cleanup_cmd(struct request *req)
>  			clear_bit_unlock(0, &ctrl->discard_page_busy);
>  		else
>  			kfree(bvec_virt(&req->special_vec));
> +		req->rq_flags &= ~RQF_SPECIAL_PAYLOAD;
>  	}
>  }
>  EXPORT_SYMBOL_GPL(nvme_cleanup_cmd);
* Re: [PATCH] nvme: avoid double free special payload
2024-06-11 10:53 ` Sagi Grimberg
@ 2024-06-11 11:47 ` Max Gurtovoy
2024-06-12 18:01 ` Keith Busch
0 siblings, 1 reply; 4+ messages in thread
From: Max Gurtovoy @ 2024-06-11 11:47 UTC (permalink / raw)
To: Sagi Grimberg, brookxu.cn, kbusch, axboe, hch, maxg; +Cc: linux-nvme
hi,
On 11/06/2024 13:53, Sagi Grimberg wrote:
> Looks reasonable.
>
> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
>
>
> On 11/06/2024 13:02, brookxu.cn wrote:
>> From: Chunguang Xu <chunguang.xu@shopee.com>
>>
>> Currently we may double free the special payload of some requests, such
>> as discard. This corrupts memory and leads to a kernel crash. We free
>> the special payload before retrying a request. If the device is
>> disconnected before the reconnect succeeds, the request is failed by
>> nvme_fail_nonready_command(), and as a result the special payload is
>> freed a second time. To fix this, clear the RQF_SPECIAL_PAYLOAD bit
>> after cleaning up the command. This does not break the subsequent
>> cleanup logic of blk-mq, as an nvme request is never partially completed.
>>
>> Fixes: 16686f3a6c3c ("nvme: move common call to nvme_cleanup_cmd to
>> core layer")
I'm not sure that this commit caused the bug. nvme_cleanup_cmd() was
already called in this path before this commit.
>> Signed-off-by: Chunguang Xu <chunguang.xu@shopee.com>
The fix looks fine to me, but the commit message can be improved a bit
to be more clear about the scenario.
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>> ---
>> drivers/nvme/host/core.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index f5d150c62955..c40930d10bd3 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -998,6 +998,7 @@ void nvme_cleanup_cmd(struct request *req)
>>  			clear_bit_unlock(0, &ctrl->discard_page_busy);
>>  		else
>>  			kfree(bvec_virt(&req->special_vec));
>> +		req->rq_flags &= ~RQF_SPECIAL_PAYLOAD;
>>  	}
>>  }
>>  EXPORT_SYMBOL_GPL(nvme_cleanup_cmd);
>
* Re: [PATCH] nvme: avoid double free special payload
2024-06-11 11:47 ` Max Gurtovoy
@ 2024-06-12 18:01 ` Keith Busch
0 siblings, 0 replies; 4+ messages in thread
From: Keith Busch @ 2024-06-12 18:01 UTC (permalink / raw)
To: Max Gurtovoy; +Cc: Sagi Grimberg, brookxu.cn, axboe, hch, maxg, linux-nvme
On Tue, Jun 11, 2024 at 02:47:24PM +0300, Max Gurtovoy wrote:
> On 11/06/2024 13:53, Sagi Grimberg wrote:
> > On 11/06/2024 13:02, brookxu.cn wrote:
> > > From: Chunguang Xu <chunguang.xu@shopee.com>
> > >
> > > > Currently we may double free the special payload of some requests,
> > > > such as discard. This corrupts memory and leads to a kernel crash.
> > > > We free the special payload before retrying a request. If the
> > > > device is disconnected before the reconnect succeeds, the request
> > > > is failed by nvme_fail_nonready_command(), and as a result the
> > > > special payload is freed a second time. To fix this, clear the
> > > > RQF_SPECIAL_PAYLOAD bit after cleaning up the command. This does
> > > > not break the subsequent cleanup logic of blk-mq, as an nvme
> > > > request is never partially completed.
> > >
> > > Fixes: 16686f3a6c3c ("nvme: move common call to nvme_cleanup_cmd to
> > > core layer")
> I'm not sure that this commit caused the bug. nvme_cleanup_cmd() was
> already called in this path before this commit.
> > > Signed-off-by: Chunguang Xu <chunguang.xu@shopee.com>
>
> The fix looks fine to me, but the commit message can be improved a bit to be
> more clear about the scenario.
Yeah, that's a difficult read. I modified the commit message and applied
to nvme-6.10. Thanks!