* [PATCH] blk-mq: always clear rq->bio in blk_complete_request()
@ 2025-12-17 17:18 Michael Liang
2025-12-18 8:51 ` Christoph Hellwig
0 siblings, 1 reply; 3+ messages in thread
From: Michael Liang @ 2025-12-17 17:18 UTC (permalink / raw)
To: axboe; +Cc: linux-block, linux-kernel, mliang
Commit ab3e1d3bbab9 ("block: allow end_io based requests in the
completion batch handling") changed blk_complete_request() so that
rq->bio and rq->__data_len are only cleared when ->end_io is NULL.

This conditional clearing is incorrect. The block layer guarantees that
all bios attached to the request are fully completed and released before
blk_complete_request() is called. Leaving rq->bio pointing to already
completed bios results in stale pointers that may be reused immediately
by a bioset allocator.

Stale rq->bio values have been observed to cause double-initialization
of cloned bios in request-based device-mapper targets, leading to
use-after-free and double-free scenarios. One such case occurs when
using dm-multipath on top of a PCIe NVMe namespace, where cloned request
bios are freed during blk_complete_request(), but rq->bio is left
intact. Subsequent clone teardown then attempts to free the same bios
again via blk_rq_unprep_clone(). Below is the code path of such a
double-free:

nvme_pci_complete_batch()
  nvme_complete_batch()
    blk_mq_end_request_batch()
      blk_complete_request() // called on a DM-target clone req
        bio_endio() // 1st free of all bios of the clone req
      ...
      rq->end_io() // calls end_clone_request() since @rq is a clone req
        dm_complete_request(tio->orig)
          dm_softirq_done() // Note this actually defers to softirq context
            dm_done()
              dm_end_request() // end the clone request
                blk_rq_unprep_clone() // 2nd free of bios on the clone req

There is no valid case where rq->bio may still reference live bios at
this point. Clear rq->bio and rq->__data_len unconditionally to avoid
leaking stale pointer state across completions.
Fixes: ab3e1d3bbab9 ("block: allow end_io based requests in the completion batch handling")
Signed-off-by: Michael Liang <mliang@purestorage.com>
---
block/blk-mq.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index d626d32f6e57..b8b9ca2200e4 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -905,10 +905,8 @@ static void blk_complete_request(struct request *req)
 	 * can find how many bytes remain in the request
 	 * later.
 	 */
-	if (!req->end_io) {
-		req->bio = NULL;
-		req->__data_len = 0;
-	}
+	req->bio = NULL;
+	req->__data_len = 0;
 }

 /**
--
2.34.1
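
[Editorial sketch, not part of the patch] The double-free path described
in the commit message can be modelled in userspace. This is a minimal
sketch under stated assumptions: every toy_* name is hypothetical, the
structures are simplified stand-ins rather than the kernel definitions,
and a "free" is only counted so that the example stays safe to run.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for struct bio / struct request (hypothetical, simplified). */
struct toy_bio {
	struct toy_bio *next;
	int put_count;		/* how many times this bio was "freed" */
};

struct toy_request {
	struct toy_bio *bio;
	unsigned int data_len;
	void (*end_io)(struct toy_request *rq);
};

/* Models the final put of a bio; only counts, never releases memory. */
static void toy_bio_put(struct toy_bio *bio)
{
	bio->put_count++;
}

/* Models the bio walk in blk_complete_request(). */
static void toy_complete_request(struct toy_request *rq, int always_clear)
{
	struct toy_bio *bio = rq->bio;

	while (bio) {
		struct toy_bio *next = bio->next;

		toy_bio_put(bio);	/* 1st free */
		bio = next;
	}

	/* always_clear == 0 models the conditional clearing being removed. */
	if (always_clear || !rq->end_io) {
		rq->bio = NULL;
		rq->data_len = 0;
	}
}

/* Models clone teardown: frees whatever rq->bio still points at. */
static void toy_unprep_clone(struct toy_request *rq)
{
	struct toy_bio *bio;

	while ((bio = rq->bio) != NULL) {
		rq->bio = bio->next;
		toy_bio_put(bio);	/* 2nd free if rq->bio was stale */
	}
}

/* Models an ->end_io handler that eventually tears down the clone. */
static void toy_end_clone(struct toy_request *rq)
{
	toy_unprep_clone(rq);
}

/* Completes one two-bio clone request; returns the highest put count. */
static int toy_run(int always_clear)
{
	struct toy_bio bios[2];
	struct toy_request rq;

	bios[0].next = &bios[1];
	bios[0].put_count = 0;
	bios[1].next = NULL;
	bios[1].put_count = 0;

	rq.bio = &bios[0];
	rq.data_len = 8192;
	rq.end_io = toy_end_clone;

	toy_complete_request(&rq, always_clear);
	assert(rq.end_io != NULL);
	rq.end_io(&rq);

	return bios[0].put_count > bios[1].put_count ?
	       bios[0].put_count : bios[1].put_count;
}
```

In this model toy_run(0) returns 2, since each bio is put once during
completion and once more during teardown, while toy_run(1) returns 1.
The model deliberately ignores the deferral to softirq context.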
* Re: [PATCH] blk-mq: always clear rq->bio in blk_complete_request()
2025-12-17 17:18 [PATCH] blk-mq: always clear rq->bio in blk_complete_request() Michael Liang
@ 2025-12-18 8:51 ` Christoph Hellwig
2025-12-18 16:37 ` Michael Liang
0 siblings, 1 reply; 3+ messages in thread
From: Christoph Hellwig @ 2025-12-18 8:51 UTC (permalink / raw)
To: Michael Liang; +Cc: axboe, linux-block, linux-kernel
On Wed, Dec 17, 2025 at 10:18:53AM -0700, Michael Liang wrote:
> Commit ab3e1d3bbab9 ("block: allow end_io based requests in the
> completion batch handling") changed blk_complete_request() so that
> rq->bio and rq->__data_len are only cleared when ->end_io is NULL.
>
> This conditional clearing is incorrect. The block layer guarantees that
> all bios attached to the request are fully completed and released before
> blk_complete_request() is called. Leaving rq->bio pointing to already
> completed bios results in stale pointers that may be reused immediately
> by a bioset allocator.
Passthrough commands keep an extra reference on the bio and need the
pointer to call blk_rq_unmap_user from the completion handler.
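
[Editorial sketch, not part of the reply] The objection can be
illustrated with a small reference-counting model. It assumes, as the
reply states, that the submitter holds one extra reference on the bio
and that the completion handler finds the bio only through rq->bio. All
toy_* names are hypothetical; this is not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct toy_bio {
	int refcount;
	int unmapped;		/* set once the user mapping was torn down */
};

struct toy_request {
	struct toy_bio *bio;
	void (*end_io)(struct toy_request *rq);
};

static void toy_bio_put(struct toy_bio *bio)
{
	assert(bio->refcount > 0);
	bio->refcount--;
}

/* Stands in for blk_rq_unmap_user(): it needs the bio pointer. */
static void toy_unmap_user(struct toy_bio *bio)
{
	bio->unmapped = 1;
	toy_bio_put(bio);	/* drops the submitter's extra reference */
}

/* A passthrough-style handler that relies on rq->bio being intact. */
static void toy_passthrough_end_io(struct toy_request *rq)
{
	if (rq->bio)
		toy_unmap_user(rq->bio);
}

/* Models completion: drop the I/O reference, optionally clear, call ->end_io. */
static void toy_complete(struct toy_request *rq, int always_clear)
{
	toy_bio_put(rq->bio);
	if (always_clear || !rq->end_io)
		rq->bio = NULL;
	if (rq->end_io)
		rq->end_io(rq);
}

/* Returns the references left on the bio; non-zero means it leaked. */
static int toy_passthrough_run(int always_clear)
{
	struct toy_bio bio = { 2, 0 };	/* I/O ref + submitter ref */
	struct toy_request rq = { &bio, toy_passthrough_end_io };

	toy_complete(&rq, always_clear);
	return bio.refcount;
}
```

In this model toy_passthrough_run(0) returns 0, while
toy_passthrough_run(1) returns 1: the handler no longer sees the bio, so
the extra reference is never dropped. A handler that keeps its own copy
of the pointer would not depend on rq->bio in this way.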
* Re: [PATCH] blk-mq: always clear rq->bio in blk_complete_request()
2025-12-18 8:51 ` Christoph Hellwig
@ 2025-12-18 16:37 ` Michael Liang
0 siblings, 0 replies; 3+ messages in thread
From: Michael Liang @ 2025-12-18 16:37 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: axboe, linux-block, linux-kernel
On Thu, Dec 18, 2025 at 12:51:14AM -0800, Christoph Hellwig wrote:
> On Wed, Dec 17, 2025 at 10:18:53AM -0700, Michael Liang wrote:
> > Commit ab3e1d3bbab9 ("block: allow end_io based requests in the
> > completion batch handling") changed blk_complete_request() so that
> > rq->bio and rq->__data_len are only cleared when ->end_io is NULL.
> >
> > This conditional clearing is incorrect. The block layer guarantees that
> > all bios attached to the request are fully completed and released before
> > blk_complete_request() is called. Leaving rq->bio pointing to already
> > completed bios results in stale pointers that may be reused immediately
> > by a bioset allocator.
>
> Passthrough commands keep an extra reference on the bio and need the
> pointer to call blk_rq_unmap_user from the completion handler.
>
Are you referring to nvme_uring_cmd_io() and nvme_submit_user_cmd()?
From what I see, req->bio is cached in both cases, and the comment in
nvme_uring_cmd_io() indicates that it actually expects req->bio to be
NULL after I/O completion. In any case, my point is that
blk_complete_request() is functionally similar to blk_update_request(),
and blk_update_request() updates req->bio and clears it to NULL once all
I/O has completed. So I think it makes sense to keep the logic
consistent here. Let me know if I am missing something.
Thanks,
Michael
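
[Editorial sketch, not part of the reply] The blk_update_request()
behaviour referred to above, where req->bio advances past completed bios
and ends up NULL once everything is done, can be modelled roughly as
follows. The toy_* names and the simplified structures are hypothetical;
the real function also handles errors, accounting and bio completion
callbacks that are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct toy_bio {
	struct toy_bio *next;
	unsigned int size;	/* bytes still outstanding in this bio */
};

struct toy_request {
	struct toy_bio *bio;
	unsigned int data_len;	/* bytes still outstanding in the request */
};

/*
 * Completes nr_bytes of the request. Returns true while the request
 * still has data outstanding, false once it is fully completed.
 */
static bool toy_update_request(struct toy_request *rq, unsigned int nr_bytes)
{
	while (rq->bio && nr_bytes) {
		struct toy_bio *bio = rq->bio;
		unsigned int done = nr_bytes < bio->size ? nr_bytes : bio->size;

		bio->size -= done;
		nr_bytes -= done;
		rq->data_len -= done;

		if (bio->size)
			break;		/* bio only partially completed */

		rq->bio = bio->next;	/* bio fully completed: unlink it */
	}

	if (!rq->bio) {
		/* Nothing left: no stale pointer or length remains. */
		rq->data_len = 0;
		return false;
	}
	return true;
}
```

With two 4096-byte bios, completing 4096 bytes leaves rq->bio pointing
at the second bio, and completing the remaining 4096 bytes leaves
rq->bio NULL and data_len 0, regardless of any completion handler.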