Re: [PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset

Linux block layer
 help / color / mirror / Atom feed

From: Ming Lei <ming.lei@redhat.com>
To: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Cc: Sagi Grimberg <sagi@grimberg.me>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	linux-block <linux-block@vger.kernel.org>,
	Hannes Reinecke <hare@kernel.org>,
	Keith Busch <kbusch@kernel.org>, Jens Axboe <axboe@kernel.dk>,
	"hch@infradead.org" <hch@infradead.org>
Subject: Re: [PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset
Date: Fri, 13 Jun 2025 12:56:40 +0800	[thread overview]
Message-ID: <aEuviL5O2kD70Sij@fedora> (raw)
In-Reply-To: <6pt5u3fg3qts4jekun5ory5lr2jtfbibd76phqviheulpjqjtq@m3arkh44nrs2>

On Wed, Jun 04, 2025 at 11:17:05AM +0000, Shinichiro Kawasaki wrote:
> Cc+: Ming,
> 
> Hi Sagi, thanks for the background explanation and the suggestion.
> 
> On Jun 04, 2025 / 10:10, Sagi Grimberg wrote:
> ...
> > > This is a problem. We are flushing a work that is IO bound, which can
> > > take a long time to complete.
> > > Up until now, we have deliberately avoided introducing dependency
> > > between reset forward progress
> > > and scan work IO to completely finish.
> > > 
> > > I would like to keep it this way.
> > > 
> > > BTW, this is not TCP specific.
> 
> I see. The blktests test case nvme/063 is dedicated to tcp transport, so that's
> why it was reported for the TCP transport.
> 
> > 
> > blk_mq_unquiesce_queue is still very much safe to call as many times as we
> > want.
> > The only thing that comes in the way is this pesky WARN. How about we make
> > it go away and have
> > a debug print instead?
> > 
> > My preference would be to allow nvme to unquiesce queues that were not
> > previously quiesced (just
> > like it historically was) instead of having to block a controller reset
> > until the scan_work is completed (which
> > is admin I/O dependent, and may get stuck until admin timeout, which can be
> > changed by the user for 60
> > minutes or something arbitrarily long btw).
> > 
> > How about something like this patch instead:
> > --
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index c2697db59109..74f3ad16e812 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -327,8 +327,10 @@ void blk_mq_unquiesce_queue(struct request_queue *q)
> >         bool run_queue = false;
> > 
> >         spin_lock_irqsave(&q->queue_lock, flags);
> > -       if (WARN_ON_ONCE(q->quiesce_depth <= 0)) {
> > -               ;
> > +       if (q->quiesce_depth <= 0) {
> > +               printk(KERN_DEBUG
> > +                       "dev %s: unquiescing a non-quiesced queue,
> > expected?\n",
> > +                       q->disk ? q->disk->disk_name : "?", );
> >         } else if (!--q->quiesce_depth) {
> >                 blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q);
> >                 run_queue = true;
> > --
> 
> The WARN was introduced with the commit e70feb8b3e68 ("blk-mq: support
> concurrent queue quiesce/unquiesce") that Ming authored. Ming, may I
> ask your comment on the suggestion by Sagi?

I think it is bad to use one standard block layer API in this unbalanced way,
that is why WARN_ON_ONCE() is added. We shouldn't encourage driver to use
APIs in this way.

Question is why nvme have to unquiesce one non-quiesced queue?



Thanks,
Ming

next prev parent reply	other threads:[~2025-06-13  4:56 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20250602043522.55787-1-shinichiro.kawasaki@wdc.com>
     [not found] ` <20250602043522.55787-2-shinichiro.kawasaki@wdc.com>
     [not found]   ` <910b31ba-1982-4365-961e-435f5e7611b2@grimberg.me>
2025-06-04  7:10     ` [PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset Sagi Grimberg
2025-06-04 11:17       ` Shinichiro Kawasaki
2025-06-13  4:10         ` Shinichiro Kawasaki
2025-06-13  4:56         ` Ming Lei [this message]
2025-06-26  4:45           ` Shinichiro Kawasaki
2025-06-28 10:24           ` Sagi Grimberg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aEuviL5O2kD70Sij@fedora \
    --to=ming.lei@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=hare@kernel.org \
    --cc=hch@infradead.org \
    --cc=kbusch@kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=sagi@grimberg.me \
    --cc=shinichiro.kawasaki@wdc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox