public inbox for linux-block@vger.kernel.org
From: Ming Lei <ming.lei@redhat.com>
To: alex+zkern@zazolabs.com
Cc: Yi Zhang <yi.zhang@redhat.com>, Jens Axboe <axboe@kernel.dk>,
	fengnanchang@gmail.com, linux-block <linux-block@vger.kernel.org>,
	Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Subject: Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
Date: Fri, 16 Jan 2026 20:41:29 +0800
Message-ID: <aWox-MEzRBKG5UjX@fedora>
In-Reply-To: <8c523b07-f868-41c9-88f1-753c77ef85fb@zazolabs.com>

On Fri, Jan 16, 2026 at 01:54:15PM +0200, Alexander Atanasov wrote:
> Hello Ming,
> 
> On 14.01.26 16:11, Ming Lei wrote:
> > On Wed, Jan 14, 2026 at 01:58:03PM +0800, Yi Zhang wrote:
> > > On Thu, Jan 8, 2026 at 2:39 PM Yi Zhang <yi.zhang@redhat.com> wrote:
> > > > 
> > > > On Thu, Jan 8, 2026 at 12:48 AM Jens Axboe <axboe@kernel.dk> wrote:
> > > > > 
> > > > > On 1/7/26 9:39 AM, Yi Zhang wrote:
> > > > > > Hi
> > > > > > The following issue[2] was triggered by blktests nvme/059 and it's
> > > > > 
> > > > > nvme/049 presumably?
> > > > > 
> > > > Yes.
> > > > 
> > > > > > 100% reproduced with commit[1]. Please help check it and let me know
> > > > > > if you need any info/test for it.
> > > > > > Seems it's one regression, I will try to test with the latest
> > > > > > linux-block/for-next and also bisect it tomorrow.
> > > > > 
> > > > > Doesn't reproduce for me on the current tree, but nothing since:
> > > > > 
> > > > > > commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
> > > > > > Merge: 29cefd61e0c6 fcf463b92a08
> > > > > > Author: Jens Axboe <axboe@kernel.dk>
> > > > > > Date:   Tue Jan 6 05:48:07 2026 -0700
> > > > > > 
> > > > > >      Merge branch 'for-7.0/blk-pvec' into for-next
> > > > > 
> > > > > should have impacted that. So please do bisect.
> > > > 
> > > > Hi Jens
> > > > The issue seems to have been introduced by the commit below,
> > > > and it cannot be reproduced after reverting this commit.
> > > 
> > > The issue can still be reproduced on the latest linux-block/for-next
> > 
> > Hi Yi,
> > 
> > Can you try the following patch?
> > 
> > 
> > diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
> > index a9c097dacad6..7b0e62b8322b 100644
> > --- a/drivers/nvme/host/ioctl.c
> > +++ b/drivers/nvme/host/ioctl.c
> > @@ -425,14 +425,23 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req,
> >   	pdu->result = le64_to_cpu(nvme_req(req)->result.u64);
> >   	/*
> > -	 * IOPOLL could potentially complete this request directly, but
> > -	 * if multiple rings are polling on the same queue, then it's possible
> > -	 * for one ring to find completions for another ring. Punting the
> > -	 * completion via task_work will always direct it to the right
> > -	 * location, rather than potentially complete requests for ringA
> > -	 * under iopoll invocations from ringB.
> > +	 * For IOPOLL, complete the request inline. The request's io_kiocb
> > +	 * uses a union for io_task_work and iopoll_node, so scheduling
> > +	 * task_work would corrupt the iopoll_list while the request is
> > +	 * still on it. io_uring_cmd_done() handles IOPOLL by setting
> > +	 * iopoll_completed rather than scheduling task_work.
> > +	 *
> > +	 * For non-IOPOLL, complete via task_work to ensure we run in the
> > +	 * submitter's context and handling multiple rings is safe.
> >   	 */
> > -	io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
> > +	if (blk_rq_is_poll(req)) {
> > +		if (pdu->bio)
> > +			blk_rq_unmap_user(pdu->bio);
> > +		io_uring_cmd_done32(ioucmd, pdu->status, pdu->result, 0);
> > +	} else {
> > +		io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
> > +	}
> > +
> >   	return RQ_END_IO_FREE;
> >   }
> 
> 
> While this is a good optimisation and will fix the list issue for a
> single user, it may crash with multiple users of the context. I am still
> learning this code, so excuse my ignorance here and there.

Jens has sent the following fix already:

https://lore.kernel.org/io-uring/aWhGEMsaOf752f5z@fedora/T/#t
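
For reference, the union aliasing described in the patch comment quoted
above can be illustrated in userspace. This is only a sketch under stated
assumptions: struct fake_req, struct task_work_entry, struct list_node and
node_survives_task_work() are hypothetical stand-ins, not the io_uring
definitions. It shows that writing one union member while the other is
still linked on a list breaks the list's next/prev invariant:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal doubly linked node, modelled loosely on struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* Hypothetical stand-in for a task_work entry (not the kernel type). */
struct task_work_entry {
	struct task_work_entry *next;
	void (*func)(void *);
};

/* Hypothetical request: both members occupy the same storage. */
struct fake_req {
	union {
		struct task_work_entry task_work;
		struct list_node iopoll_node;
	};
};

static void list_init(struct list_node *head)
{
	head->next = head->prev = head;
}

static void list_add_tail(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* A node is consistent if both neighbours point back at it. */
static bool list_node_is_consistent(const struct list_node *node)
{
	return node->next && node->prev &&
	       node->next->prev == node && node->prev->next == node;
}

/* Returns whether the node is still a valid list member afterwards. */
static bool node_survives_task_work(void)
{
	struct list_node iopoll_list;
	struct fake_req req;

	list_init(&iopoll_list);
	list_add_tail(&req.iopoll_node, &iopoll_list);
	assert(list_node_is_consistent(&req.iopoll_node));

	/* Simulate queuing task_work: this overwrites the aliased bytes. */
	req.task_work.next = NULL;
	req.task_work.func = NULL;

	return list_node_is_consistent(&req.iopoll_node);
}
```

In this sketch the list head still points at the request, but the
request's own links are gone, which is the kind of inconsistency a
list debug check would trip over.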

> 
> The bisected patch 3c7d76d6128a changed io_wq_work_list, which looks
> safe to use without locks (it is a derivative of llist); list_head
> requires proper locking to be safe.
> 
> A ctx can be used to poll multiple files; iopoll_list is a list for that
> reason.
> sqpoll calls io_iopoll_req_issued() without the lock, and that does a
> list_add_tail(). If that races with another list addition or deletion,
> it will corrupt the list.
> 
> Is there any mechanism to prevent that, or am I missing something?

io_iopoll_req_issued() will grab ctx->uring_lock if it isn't held.
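
The general pattern is "take the lock only if the caller does not already
hold it". A minimal userspace sketch of that pattern follows, assuming a
hypothetical struct poll_ctx with a spin flag standing in for
ctx->uring_lock and a counter standing in for the protected list; the
caller_holds_lock parameter is also an assumption for illustration and is
not how the kernel function is declared:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical context: the flag protects nr_queued. */
struct poll_ctx {
	atomic_flag lock;	/* stand-in for ctx->uring_lock */
	int nr_queued;		/* stand-in for the protected list */
};

static void ctx_init(struct poll_ctx *ctx)
{
	atomic_flag_clear(&ctx->lock);
	ctx->nr_queued = 0;
}

static void ctx_lock(struct poll_ctx *ctx)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&ctx->lock,
						 memory_order_acquire))
		;
}

static void ctx_unlock(struct poll_ctx *ctx)
{
	atomic_flag_clear_explicit(&ctx->lock, memory_order_release);
}

/*
 * Queue one request. If the caller does not hold the lock, take it
 * around the update and drop it again; otherwise rely on the caller.
 */
static void poll_req_issued(struct poll_ctx *ctx, bool caller_holds_lock)
{
	const bool needs_lock = !caller_holds_lock;

	assert(ctx != NULL);

	if (needs_lock)
		ctx_lock(ctx);

	ctx->nr_queued++;	/* the update that must be serialised */

	if (needs_lock)
		ctx_unlock(ctx);
}
```

Either way the update runs with the lock held, so two issuers cannot
modify the protected state at the same time.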


Thanks,
Ming


