From: Ming Lei <ming.lei@redhat.com>
To: Caleb Sander Mateos <csander@purestorage.com>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org,
Uday Shankar <ushankar@purestorage.com>
Subject: Re: [PATCH 1/2] ublk: fix handling recovery & reissue in ublk_abort_queue()
Date: Wed, 9 Apr 2025 11:17:44 +0800 [thread overview]
Message-ID: <Z_Xm2DBIaVrpjwNO@fedora> (raw)
In-Reply-To: <CADUfDZrHu=8Muss4zSvdLqq-EVoOE9t9PtqYEm343bTWaA-80g@mail.gmail.com>
On Tue, Apr 08, 2025 at 06:47:20PM -0700, Caleb Sander Mateos wrote:
> On Tue, Apr 8, 2025 at 12:25 AM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > Commit 8284066946e6 ("ublk: grab request reference when the request is handled
> > by userspace") doesn't grab request reference in case of recovery reissue.
> > Then the request can be requeued & re-dispatch & failed when canceling
> > uring command.
> >
> > If it is one zc request, the request can be freed before io_uring
> > returns the zc buffer back, then cause kernel panic:
> >
> > [ 126.773061] BUG: kernel NULL pointer dereference, address: 00000000000000c8
> > [ 126.773657] #PF: supervisor read access in kernel mode
> > [ 126.774052] #PF: error_code(0x0000) - not-present page
> > [ 126.774455] PGD 0 P4D 0
> > [ 126.774698] Oops: Oops: 0000 [#1] SMP NOPTI
> > [ 126.775034] CPU: 13 UID: 0 PID: 1612 Comm: kworker/u64:55 Not tainted 6.14.0_blk+ #182 PREEMPT(full)
> > [ 126.775676] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-1.fc39 04/01/2014
> > [ 126.776275] Workqueue: iou_exit io_ring_exit_work
> > [ 126.776651] RIP: 0010:ublk_io_release+0x14/0x130 [ublk_drv]
> >
> > Fixes it by always grabbing request reference for aborting the request.
> >
> > Reported-by: Caleb Sander Mateos <csander@purestorage.com>
> > Closes: https://lore.kernel.org/linux-block/CADUfDZodKfOGUeWrnAxcZiLT+puaZX8jDHoj_sfHZCOZwhzz6A@mail.gmail.com/
> > Fixes: 8284066946e6 ("ublk: grab request reference when the request is handled by userspace")
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> > drivers/block/ublk_drv.c | 30 ++++++++++++++++++++++++++----
> > 1 file changed, 26 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> > index 2fd05c1bd30b..41bed67508f2 100644
> > --- a/drivers/block/ublk_drv.c
> > +++ b/drivers/block/ublk_drv.c
> > @@ -1140,6 +1140,25 @@ static void ublk_complete_rq(struct kref *ref)
> > __ublk_complete_rq(req);
> > }
> >
> > +static void ublk_do_fail_rq(struct request *req)
> > +{
> > + struct ublk_queue *ubq = req->mq_hctx->driver_data;
> > +
> > + if (ublk_nosrv_should_reissue_outstanding(ubq->dev))
> > + blk_mq_requeue_request(req, false);
> > + else
> > + __ublk_complete_rq(req);
> > +}
> > +
> > +static void ublk_fail_rq_fn(struct kref *ref)
> > +{
> > + struct ublk_rq_data *data = container_of(ref, struct ublk_rq_data,
> > + ref);
> > + struct request *req = blk_mq_rq_from_pdu(data);
> > +
> > + ublk_do_fail_rq(req);
> > +}
> > +
> > /*
> > * Since ublk_rq_task_work_cb always fails requests immediately during
> > * exiting, __ublk_fail_req() is only called from abort context during
> > @@ -1153,10 +1172,13 @@ static void __ublk_fail_req(struct ublk_queue *ubq, struct ublk_io *io,
> > {
> > WARN_ON_ONCE(io->flags & UBLK_IO_FLAG_ACTIVE);
> >
> > - if (ublk_nosrv_should_reissue_outstanding(ubq->dev))
> > - blk_mq_requeue_request(req, false);
> > - else
> > - ublk_put_req_ref(ubq, req);
> > + if (ublk_need_req_ref(ubq)) {
> > + struct ublk_rq_data *data = blk_mq_rq_to_pdu(req);
> > +
> > + kref_put(&data->ref, ublk_fail_rq_fn);
>
> I think this looks good, just a small question. If __ublk_fail_req()
> is called but there is at least 1 other reference, ublk_do_fail_rq()
> won't get called here. When the last reference is dropped, it will
> call __ublk_complete_rq() instead. That checks for io->flags &
> UBLK_IO_FLAG_ABORTED and will terminate the I/O with BLK_STS_IOERR.
> But is that the desired behavior in the
> ublk_nosrv_should_reissue_outstanding() case? I would think it should
> call blk_mq_requeue_request() instead.
Good catch, I'd suggest to fix the kernel panic first, and this behavior
for ublk_nosrv_should_reissue_outstanding() is less serious and can be
fixed as one followup.
Especially, Uday's approach "[PATCH v3] ublk: improve detection and handling of ublk server exit"[1]
may simplify this area lot, with which requests aborting is moved to
ublk_ch_release() after all uring_cmd are completed. Then we can get
correct behavior for ublk_nosrv_should_reissue_outstanding() without any
change basically.
[1] https://lore.kernel.org/linux-block/20250403-ublk_timeout-v3-1-aa09f76c7451@purestorage.com/
thanks,
Ming
next prev parent reply other threads:[~2025-04-09 3:17 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-08 7:24 [PATCH 0/2] ublk: two fixes Ming Lei
2025-04-08 7:24 ` [PATCH 1/2] ublk: fix handling recovery & reissue in ublk_abort_queue() Ming Lei
2025-04-09 1:47 ` Caleb Sander Mateos
2025-04-09 3:17 ` Ming Lei [this message]
2025-04-09 15:47 ` Caleb Sander Mateos
2025-04-08 7:24 ` [PATCH 2/2] ublk: don't fail request for recovery & reissue in case of ubq->canceling Ming Lei
2025-04-08 19:11 ` Uday Shankar
2025-04-09 1:09 ` Ming Lei
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z_Xm2DBIaVrpjwNO@fedora \
--to=ming.lei@redhat.com \
--cc=axboe@kernel.dk \
--cc=csander@purestorage.com \
--cc=linux-block@vger.kernel.org \
--cc=ushankar@purestorage.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.