From: Ming Lei <ming.lei@redhat.com>
To: Paolo Valente <paolo.valente@linaro.org>
Cc: Alban Browaeys <alban.browaeys@gmail.com>,
Jens Axboe <axboe@fb.com>,
linux-block <linux-block@vger.kernel.org>,
SERENA ZIVIANI <169364@studenti.unimore.it>,
Ulf Hansson <ulf.hansson@linaro.org>,
Linus Walleij <linus.walleij@linaro.org>
Subject: Re: blk-mq + bfq: udevd hang on usb2 storages
Date: Fri, 8 Dec 2017 09:28:04 +0800
Message-ID: <20171208012803.GC21488@ming.t460p>
In-Reply-To: <D03704C9-BF8E-49E1-B552-90393BC9DE4A@linaro.org>
Hi Paolo,
On Thu, Dec 07, 2017 at 07:04:54PM +0100, Paolo Valente wrote:
>
> > On 4 Dec 2017, at 11:57, Ming Lei <ming.lei@redhat.com> wrote:
> >
> > On Fri, Dec 01, 2017 at 06:04:29PM +0100, Alban Browaeys wrote:
> >> I initially reported as https://bugzilla.kernel.org/show_bug.cgi?id=198023 .
> >>
> >> I have now bisected this issue to commit a6a252e6491443c1c1
> >> "blk-mq-sched: decide how to handle flush rq via RQF_FLUSH_SEQ".
> >>
> >> This is with an USB stick Sandisk Cruzer (USB Version: 2.10) I
> >> regressed with.
> >> systemctl restart systemd-udevd restores sanity.
> >>
> >> PS: With an USB3 Lexar (USB Version: 3.00) I get more severe an issue
> >> (not bisected) where I find no way out of reboot. My report to bugzilla
> >> has logs when I was swapping between the these keys. The logs attached
> >> there mixes what looks like two different behaviors.
> >
> > Hi Paolo,
> >
> > From both Alban's trace and my trace, looks this issue is in BFQ,
> > since request can't be retrieved via e->type->ops.mq.dispatch_request()
> > in blk_mq_do_dispatch_sched() after it is inserted into BFQ's queue.
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=198023#c4
> > https://marc.info/?l=linux-block&m=151214241518562&w=2
> >
> > BTW, I have tried to reproduce the issue with scsi_debug, but not succeed,
> > and it can't be reproduced with other schedulers(mq-deadline, none) too.
> >
> > So could you take a look?
> >
>
> Hi Ming, all,
> sorry for the delay, but we preferred to reply directly after finding
> the cause of the problem. And the cause is that gdisk makes an I/O

Not a problem. :-)

In the previous mail I just wanted to share our findings with you.

> request that is dispatched to the drive, but apparently never
> completed (as Serena, in CC discovered). Or, at least, the execution
> of completed_request in bfq is never triggered.

I think I understand this case, and the following info may be helpful
to you:

1) USB's queue depth is one

2) the only pending request is completed, and scsi_finish_command() is called

3) inside scsi_finish_command(), scsi_device_unbusy() is called at the
beginning; once it is done, blk_mq_get_dispatch_budget() in
blk_mq_do_dispatch_sched() returns true, and we can start trying to
dispatch a request

4) e->type->ops.mq.dispatch_request() is called, but the request in 2)
isn't completed yet: completed_request in bfq hasn't been run yet, because
it is called later from scsi_end_request()
(<- scsi_io_completion() <- scsi_finish_command())

Then no request can be dispatched any more and the hang happens, even
though completed_request should still be run later.
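
The ordering in 1)-4) can be sketched with a small user-space model. This is
only an illustration, not kernel code: every toy_* name is hypothetical, and
the "awaiting_completion" flag is a simplifying assumption standing in for
bfq waiting on the last pending request of the in-service queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a device with a fixed queue depth. */
struct toy_dev {
	int queue_depth;	/* 1 for the USB stick discussed here */
	int busy;		/* commands currently accounted to the device */
};

/* Hypothetical model of the scheduler state relevant to the hang. */
struct toy_sched {
	int queued;			/* requests inserted, not yet dispatched */
	bool awaiting_completion;	/* assumption: nothing is handed out
					 * until the completion hook has run */
};

/* Stands in for blk_mq_get_dispatch_budget(). */
static bool toy_get_budget(struct toy_dev *d)
{
	if (d->busy >= d->queue_depth)
		return false;
	d->busy++;
	return true;
}

/* Stands in for releasing the budget / scsi_device_unbusy(). */
static void toy_put_budget(struct toy_dev *d)
{
	assert(d->busy > 0);
	d->busy--;
}

/* Stands in for e->type->ops.mq.dispatch_request(). */
static bool toy_sched_dispatch(struct toy_sched *s)
{
	if (s->awaiting_completion || s->queued == 0)
		return false;
	s->queued--;
	s->awaiting_completion = true;
	return true;
}

/* Stands in for completed_request in bfq. */
static void toy_sched_completed(struct toy_sched *s)
{
	s->awaiting_completion = false;
}

/* One dispatch attempt: budget first, then ask the scheduler. */
static bool toy_try_dispatch(struct toy_dev *d, struct toy_sched *s)
{
	if (!toy_get_budget(d))
		return false;
	if (!toy_sched_dispatch(s)) {
		toy_put_budget(d);
		return false;
	}
	return true;
}

/*
 * Stands in for scsi_finish_command(): the device is marked unbusy first,
 * a dispatch attempt may run in that window, and only afterwards does the
 * scheduler's completion hook run. Returns whether the in-window dispatch
 * attempt managed to dispatch a request.
 */
static bool toy_finish_command(struct toy_dev *d, struct toy_sched *s,
			       bool dispatch_in_window)
{
	bool dispatched = false;

	toy_put_budget(d);
	if (dispatch_in_window)
		dispatched = toy_try_dispatch(d, s);
	toy_sched_completed(s);
	return dispatched;
}
```

In this model the attempt made inside the window gets the budget but no
request, so the queued request stays put unless something triggers another
dispatch attempt after the completion hook has run.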
>
> In more detail: gdisk is a process for which bfq performs device idling
> (for good reasons), and, for one such process, bfq does not switch to
> serving another process until the last pending request of the process
> is completed, after which device idling is started, to wait for the
> next request of the process. So, if such a last request is never
> completed, bfq remains forever waiting for such an event, and then
> refuses forever to deliver requests of other queues.
>
> As for why bfq_completed_request is not executed for the above,
It should be run.
> dispatched request, the reason is either that the bfq_finish_request
> hook is not invoked at all, or that it is invoked, but the request
> does not have the RQF_STARTED flag set. Discovering which event

The RQF_STARTED flag is set only when __bfq_dispatch_request() finds a
request, which can never happen in this case: we observed that
__bfq_dispatch_request() finds no request even though one has already
been inserted into the BFQ queue.
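
To make the dependency explicit, here is a toy model of the relationship
between the dispatch path and the flag check. It is a sketch under
assumptions, not the bfq code: the toy_* names, the TOY_RQF_STARTED value and
the "blocked" field are all hypothetical, and the check in toy_finish() only
mirrors the behaviour described in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_RQF_STARTED (1u << 0)	/* hypothetical stand-in for RQF_STARTED */

struct toy_rq {
	unsigned int rq_flags;
};

struct toy_bfqd {
	struct toy_rq *pending;	/* request inserted into the scheduler */
	bool blocked;		/* assumption: scheduler refuses to hand it out */
	int completed;		/* how often the completion path actually ran */
};

/* Dispatch path: the flag is set only when a request is actually found. */
static struct toy_rq *toy_dispatch(struct toy_bfqd *b)
{
	struct toy_rq *rq = b->blocked ? NULL : b->pending;

	if (rq) {
		rq->rq_flags |= TOY_RQF_STARTED;
		b->pending = NULL;
	}
	return rq;
}

/* Finish hook: completion accounting is skipped for unstarted requests. */
static void toy_finish(struct toy_bfqd *b, struct toy_rq *rq)
{
	if (rq->rq_flags & TOY_RQF_STARTED)
		b->completed++;
}
```

A request that the dispatch path never returns keeps its flag clear, so in
this model it cannot be the one that reaches the completion accounting.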
> occurs is our next step.
>
> We'll let you know.
Thanks,
Ming