linux-block.vger.kernel.org archive mirror
* [PATCH] blk-mq: always clear rq->bio in blk_complete_request()
@ 2025-12-17 17:18 Michael Liang
  2025-12-18  8:51 ` Christoph Hellwig
  0 siblings, 1 reply; 3+ messages in thread
From: Michael Liang @ 2025-12-17 17:18 UTC (permalink / raw)
  To: axboe; +Cc: linux-block, linux-kernel, mliang

Commit ab3e1d3bbab9 ("block: allow end_io based requests in the
completion batch handling") changed blk_complete_request() so that
rq->bio and rq->__data_len are only cleared when ->end_io is NULL.

This conditional clearing is incorrect. The block layer guarantees that
all bios attached to the request are fully completed and released before
blk_complete_request() is called. Leaving rq->bio pointing to already
completed bios results in stale pointers that may be reused immediately
by a bioset allocator.

Stale rq->bio values have been observed to cause double-initialization
of cloned bios in request-based device-mapper targets, leading to
use-after-free and double-free scenarios. One such case occurs when
using dm-multipath on top of a PCIe NVMe namespace, where cloned request
bios are freed during blk_complete_request(), but rq->bio is left
intact. Subsequent clone teardown then attempts to free the same bios
again via blk_rq_unprep_clone(). The code path for this double free is:
nvme_pci_complete_batch()
    nvme_complete_batch()
        blk_mq_end_request_batch()
            blk_complete_request() // called on a DM-target clone req
                bio_endio() // 1st free of all bios of the clone req
                ...
            rq->end_io() // calls end_clone_request() since @rq is a clone req
                dm_complete_request(tio->orig)
                    dm_softirq_done() // Note this actually defers to softirq context
                        dm_done()
                            dm_end_request() // end the clone request
                                blk_rq_unprep_clone() // 2nd free of BIOs on the clone req

There is no valid case where rq->bio may still reference live bios at
this point. Clear rq->bio and rq->__data_len unconditionally to avoid
leaking stale pointer state across completions.
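
The double free described above can be illustrated with a small user-space
model. This is only a sketch under stated assumptions: every toy_* name is
hypothetical, the structures are simplified stand-ins rather than the real
struct request and struct bio, and a "free" is just a counter increment so
the second free can be observed safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; NOT the real definitions. */
struct toy_bio {
	int free_count;		/* how many times this bio was "freed" */
};

struct toy_request {
	struct toy_bio *bio;
	unsigned int data_len;
	void (*end_io)(struct toy_request *rq);
};

/* Models a bio being released; only counts, so a double free is visible. */
static void toy_bio_free(struct toy_bio *bio)
{
	bio->free_count++;
}

/* Models blk_rq_unprep_clone(): frees whatever rq->bio still points at. */
static void toy_unprep_clone(struct toy_request *rq)
{
	if (rq->bio) {
		toy_bio_free(rq->bio);
		rq->bio = NULL;
	}
}

/* Models end_clone_request() eventually reaching the clone teardown. */
static void toy_end_clone_request(struct toy_request *rq)
{
	toy_unprep_clone(rq);
}

/*
 * Models blk_complete_request(). With always_clear == false it mimics the
 * conditional clearing (only when ->end_io is NULL); with true it mimics
 * the unconditional clearing proposed in this patch.
 */
static void toy_complete_request(struct toy_request *rq, bool always_clear)
{
	assert(rq->bio != NULL);
	toy_bio_free(rq->bio);		/* 1st free, via bio completion */

	if (always_clear || !rq->end_io) {
		rq->bio = NULL;
		rq->data_len = 0;
	}
}

/* Runs one batched completion of a clone request; returns the free count. */
int toy_run_completion(bool always_clear)
{
	struct toy_bio bio = { .free_count = 0 };
	struct toy_request rq = {
		.bio = &bio,
		.data_len = 4096,
		.end_io = toy_end_clone_request,
	};

	toy_complete_request(&rq, always_clear);
	if (rq.end_io)
		rq.end_io(&rq);

	return bio.free_count;
}
```

In this model, toy_run_completion(false) frees the bio twice and
toy_run_completion(true) frees it once. It does not model bio reference
counts, so it says nothing about callers that hold an extra reference.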

Fixes: ab3e1d3bbab9 ("block: allow end_io based requests in the completion batch handling")
Signed-off-by: Michael Liang <mliang@purestorage.com>
---
 block/blk-mq.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index d626d32f6e57..b8b9ca2200e4 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -905,10 +905,8 @@ static void blk_complete_request(struct request *req)
 	 * can find how many bytes remain in the request
 	 * later.
 	 */
-	if (!req->end_io) {
-		req->bio = NULL;
-		req->__data_len = 0;
-	}
+	req->bio = NULL;
+	req->__data_len = 0;
 }
 
 /**
-- 
2.34.1



* Re: [PATCH] blk-mq: always clear rq->bio in blk_complete_request()
  2025-12-17 17:18 [PATCH] blk-mq: always clear rq->bio in blk_complete_request() Michael Liang
@ 2025-12-18  8:51 ` Christoph Hellwig
  2025-12-18 16:37   ` Michael Liang
  0 siblings, 1 reply; 3+ messages in thread
From: Christoph Hellwig @ 2025-12-18  8:51 UTC (permalink / raw)
  To: Michael Liang; +Cc: axboe, linux-block, linux-kernel

On Wed, Dec 17, 2025 at 10:18:53AM -0700, Michael Liang wrote:
> Commit ab3e1d3bbab9 ("block: allow end_io based requests in the
> completion batch handling") changed blk_complete_request() so that
> rq->bio and rq->__data_len are only cleared when ->end_io is NULL.
> 
> This conditional clearing is incorrect. The block layer guarantees that
> all bios attached to the request are fully completed and released before
> blk_complete_request() is called. Leaving rq->bio pointing to already
> completed bios results in stale pointers that may be reused immediately
> by a bioset allocator.

Passthrough commands keep an extra reference on the bio and need the
pointer to call blk_rq_unmap_user from the completion handler.



* Re: [PATCH] blk-mq: always clear rq->bio in blk_complete_request()
  2025-12-18  8:51 ` Christoph Hellwig
@ 2025-12-18 16:37   ` Michael Liang
  0 siblings, 0 replies; 3+ messages in thread
From: Michael Liang @ 2025-12-18 16:37 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, linux-block, linux-kernel

On Thu, Dec 18, 2025 at 12:51:14AM -0800, Christoph Hellwig wrote:
> On Wed, Dec 17, 2025 at 10:18:53AM -0700, Michael Liang wrote:
> > Commit ab3e1d3bbab9 ("block: allow end_io based requests in the
> > completion batch handling") changed blk_complete_request() so that
> > rq->bio and rq->__data_len are only cleared when ->end_io is NULL.
> > 
> > This conditional clearing is incorrect. The block layer guarantees that
> > all bios attached to the request are fully completed and released before
> > blk_complete_request() is called. Leaving rq->bio pointing to already
> > completed bios results in stale pointers that may be reused immediately
> > by a bioset allocator.
> 
> Passthrough commands keep an extra reference on the bio and need the
> pointer to call blk_rq_unmap_user from the completion handler.
> 
Are you referring to nvme_uring_cmd_io() and nvme_submit_user_cmd()?
From what I see, req->bio is cached in both cases, and the comment in
nvme_uring_cmd_io() says it actually expects req->bio to be NULL after
I/O completion. My point is that blk_complete_request() is functionally
similar to blk_update_request(), which updates req->bio and clears it to
NULL once all I/O has completed, so I think it makes sense to keep the
logic consistent here. Let me know if I'm missing something.
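
The difference between a completion handler that reads req->bio and one
that uses a cached pointer can be sketched in user space. This is only an
illustration under stated assumptions: all pt_* names are hypothetical,
the cached_bio field stands in for whatever per-command storage a driver
might use, and pt_unmap() only loosely models blk_rq_unmap_user().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy bio with a reference count; NOT the real struct bio. */
struct pt_bio {
	int refs;
	bool unmapped;
};

/* Toy request; cached_bio is a hypothetical copy taken at submit time. */
struct pt_request {
	struct pt_bio *bio;
	struct pt_bio *cached_bio;
};

static void pt_bio_put(struct pt_bio *bio)
{
	assert(bio->refs > 0);
	bio->refs--;
}

/* Models a completion that drops one reference and always clears rq->bio. */
static void pt_complete(struct pt_request *rq)
{
	pt_bio_put(rq->bio);
	rq->bio = NULL;
}

/* Loosely models blk_rq_unmap_user(): needs a valid bio pointer to work. */
static bool pt_unmap(struct pt_bio *bio)
{
	if (!bio)
		return false;	/* nothing to unmap through a NULL pointer */

	bio->unmapped = true;
	pt_bio_put(bio);	/* drop the extra passthrough reference */
	return true;
}

/* Returns true if the handler was able to unmap the user buffer. */
bool pt_run(bool use_cached)
{
	struct pt_bio bio = { .refs = 2, .unmapped = false };
	struct pt_request rq = { .bio = &bio, .cached_bio = &bio };

	pt_complete(&rq);

	return pt_unmap(use_cached ? rq.cached_bio : rq.bio);
}
```

In this model, pt_run(true) succeeds because the handler never looks at
rq->bio, while pt_run(false) cannot unmap once the pointer is cleared.
Whether every in-tree passthrough user caches the bio is exactly the
open question in this thread, and the sketch does not answer it.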

Thanks,
Michael
