From: Ming Lei <ming.lei@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
djeffery@redhat.com, Bart Van Assche <bvanassche@acm.org>,
linux-scsi@vger.kernel.org,
virtualization@lists.linux-foundation.org,
linux-block@vger.kernel.org, Christoph Hellwig <hch@lst.de>
Subject: Re: [Bug] double ->queue_rq() because of timeout in ->queue_rq()
Date: Mon, 24 Oct 2022 23:41:02 +0800
Message-ID: <Y1ayDrL5Z1JTT5OA@T590>
In-Reply-To: <Y1avnzv01gevnmXz@fedora>

On Mon, Oct 24, 2022 at 11:30:39AM -0400, Stefan Hajnoczi wrote:
> On Fri, Oct 21, 2022 at 10:23:57AM +0800, Ming Lei wrote:
> > On Thu, Oct 20, 2022 at 04:01:11PM -0400, Stefan Hajnoczi wrote:
> > > On Thu, Oct 20, 2022 at 05:10:13PM +0800, Ming Lei wrote:
> > > > Hi,
> > > >
> > > > David Jeffery found one double ->queue_rq() issue, so far it can
> > > > be triggered in the following two cases:
> > > >
> > > > 1) scsi driver in guest kernel
> > > >
> > > > - the story could be long vmexit latency or long preempt latency of
> > > > vCPU pthread, then IO req is timed out before queuing the request
> > > > to hardware but after calling blk_mq_start_request() during ->queue_rq(),
> > > > then timeout handler handles it by requeue, then double ->queue_rq() is
> > > > caused, and kernel panic
> > > >
> > > > 2) burst of kernel messages from irq handler
> > > >
> > > > For 1), I think it is one reasonable case, given latency from host side
> > > > can come anytime in theory because vCPU is emulated by one normal host
> > > > pthread which can be preempted anywhere. For 2), I guess kernel message is
> > > > supposed to be rate limited.
> > > >
> > > > Firstly, is this kind of so long(30sec) random latency when running kernel
> > > > code something normal? Or do we need to take care of it? IMO, it looks
> > > > reasonable in case of VM, but our VM experts may have better idea about this
> > > > situation. Also the default 30sec timeout could be reduced via sysfs or
> > > > drivers.
> > >
> > > 30 seconds is a long latency that does not occur during normal
> > > operation, but unfortunately does happen on occasion.
> >
> > Thanks for the confirmation!
> >
> > >
> > > I think there's an interest in understanding the root cause and solving
> > > long latencies (if possible) in the QEMU/KVM communities. We can
> > > investigate specific cases on kvm@vger.kernel.org and/or
> > > qemu-devel@nongnu.org.
> >
> > The issue was originally reported on a VMware VM, but maybe David can
> > figure out how to trigger it on QEMU/KVM.
>
> A very basic question:
>
> The virtio_blk driver has no q->mq_ops->timeout() callback. Why does the
> block layer still enable the timeout mechanism when the driver doesn't
> implement ->timeout()?

Whether or not ->timeout() is implemented, a request may still time
out. It is better for the block layer to detect that and, when there is
no ->timeout(), simply reset the timer.

>
> I saw there was some "idle" hctx logic and I guess the requests are

The timeout timer is reused for idle hctx detection.

> resubmitted (although it wasn't obvious to me how that happens in the
> code)? Maybe that's why the timer is still used if the driver doesn't
> care about timeouts...

Timeout handling is decided entirely by the driver's ->timeout()
callback. If the driver doesn't implement ->timeout(), the request's
timer is simply reset.
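
To make that concrete, here is a rough userspace model of the decision
described above. It is only a sketch, not the kernel code: all struct,
enum and function names below are made up for illustration, the 30-tick
timeout just mirrors the 30sec default mentioned earlier in the thread,
and the handler that "requeues" is a stand-in for what a SCSI-like
driver might do.

```c
#include <stddef.h>

/* Hypothetical model types; names are illustrative, not kernel symbols. */
enum eh_timer_return { EH_DONE, EH_RESET_TIMER };

struct request;

struct mq_ops {
	/* Optional: NULL when the driver provides no timeout handler. */
	enum eh_timer_return (*timeout)(struct request *rq);
};

struct request {
	const struct mq_ops *ops;
	unsigned long deadline;	/* arbitrary ticks */
	int timer_resets;	/* how often the timer was re-armed */
	int requeued;		/* set by a handler that requeues */
};

#define RQ_TIMEOUT_TICKS 30UL	/* assumption: mirrors the 30sec default */

/* Re-arm the request's timer relative to "now". */
static void rq_add_timer(struct request *rq, unsigned long now)
{
	rq->deadline = now + RQ_TIMEOUT_TICKS;
	rq->timer_resets++;
}

/*
 * Called when a request's deadline has expired.
 * With a handler: the driver decides; EH_DONE means it took ownership
 * (completed, aborted or requeued the request).
 * Without a handler, or on EH_RESET_TIMER: just re-arm the timer.
 */
static void rq_timed_out(struct request *rq, unsigned long now)
{
	if (rq->ops->timeout) {
		if (rq->ops->timeout(rq) == EH_DONE)
			return;
	}
	rq_add_timer(rq, now);
}

/* Example handler that handles the timeout by requeueing. */
static enum eh_timer_return requeue_on_timeout(struct request *rq)
{
	rq->requeued = 1;
	return EH_DONE;
}
```

With ops->timeout == NULL, calling rq_timed_out() only pushes the
deadline out and nothing is resubmitted. The double ->queue_rq() in
this report needs a handler like requeue_on_timeout() that requeues
while the original ->queue_rq() call is still in progress.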

Thanks,
Ming