From: krisman@linux.vnet.ibm.com (Gabriel Krisman Bertazi)
Subject: Oops when completing request on the wrong queue
Date: Tue, 23 Aug 2016 17:54:03 -0300 [thread overview]
Message-ID: <87a8g39pg4.fsf@linux.vnet.ibm.com> (raw)
In-Reply-To: <87shu0hfye.fsf@linux.vnet.ibm.com> (Gabriel Krisman Bertazi's message of "Fri, 19 Aug 2016 13:38:17 -0300")
Gabriel Krisman Bertazi <krisman at linux.vnet.ibm.com> writes:
>> Can you share what you ran to online/offline CPUs? I can't reproduce
>> this here.
>
> I was using the ppc64_cpu tool, which shouldn't do anything more than
> write to sysfs. But I just reproduced it with the script below.
>
> Note that this is ppc64le. I don't have an x86 machine at hand to attempt
> to reproduce right now, but I'll look for one and see how it goes.
Hi,
Any luck reproducing it? We were initially reproducing with a
proprietary stress test, but I tried a generated fio job file combined
with the SMT script I shared earlier and could reproduce the crash
consistently in less than 10 minutes of execution. This was still on
ppc64le, though; I haven't been able to get my hands on NVMe on x86 yet.
Here are the job file I used and the smt.sh script, in case you want to
give them a try:
jobfile: http://krisman.be/k/nvmejob.fio
smt.sh: http://krisman.be/k/smt.sh
Still, the trigger consistently seems to be a heavy I/O load combined
with CPU addition/removal.
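In case the links above go stale, the CPU churn side of the reproducer
is nothing fancy. A minimal sketch of the kind of offline/online loop
smt.sh runs (just an illustration, not the actual script, which drives
SMT threads on ppc64le; the CPU number and sleep interval are
arbitrary):

/* Toggle a secondary CPU offline/online through sysfs, forever. */
#include <stdio.h>
#include <unistd.h>

static int set_cpu_online(int cpu, int online)
{
        char path[64];
        FILE *f;

        snprintf(path, sizeof(path),
                 "/sys/devices/system/cpu/cpu%d/online", cpu);
        f = fopen(path, "w");
        if (!f)
                return -1;
        fprintf(f, "%d\n", online);
        return fclose(f);
}

int main(void)
{
        for (;;) {
                set_cpu_online(1, 0);   /* take CPU 1 down */
                sleep(5);
                set_cpu_online(1, 1);   /* and bring it back */
                sleep(5);
        }
}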
Let me share my progress from the last couple of days in the hope that
it rings a bell for you.
Firstly, I verified that when we hit the BUG_ON in nvme_queue_rq, the
request_queue's freeze_depth is 0, which points away from a fault in the
freeze/unfreeze mechanism. If a request were escaping and going through
the block layer during a freeze, we'd see freeze_depth >= 1. Before
that, I had also tried keeping the q_usage_counter in atomic mode, in
case of a bug in the percpu refcount. No luck; the BUG_ON was still
hit.
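For reference, the freeze entry point I'm ruling out looks roughly like
this (paraphrased from block/blk-mq.c around v4.7, where the field is
spelled mq_freeze_depth; details vary between kernel versions). The
depth is bumped before the percpu ref is killed, so a request sneaking
through the block layer mid-freeze should always observe a depth >= 1:

void blk_mq_freeze_queue_start(struct request_queue *q)
{
        int freeze_depth;

        /* Depth goes up first... */
        freeze_depth = atomic_inc_return(&q->mq_freeze_depth);
        if (freeze_depth == 1) {
                /*
                 * ...and only the first freezer kills the percpu ref,
                 * which makes blk_queue_enter() start blocking new
                 * requests.
                 */
                percpu_ref_kill(&q->q_usage_counter);
                blk_mq_run_hw_queues(q, false);
        }
}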
Also, I don't see anything special about the request that reaches the
BUG_ON. It's a REQ_TYPE_FS request and, at least the last time I
reproduced it, it was a READ that came from the stress test task through
submit_bio. So nothing remarkable about it either, as far as I can see.
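Roughly, the kind of debug hook that yields that information looks like
the following (hypothetical sketch, not the actual instrumentation I
used; nvme_queue is the driver-private struct from
drivers/nvme/host/pci.c):

/* Hypothetical: dump the offending request right before the BUG_ON. */
static void nvme_dump_bad_rq(struct request *rq, struct nvme_queue *nvmeq)
{
        pr_err("nvme: rq %p cmd_type=%d dir=%s tag=%d on qid %d\n",
               rq, rq->cmd_type,
               rq_data_dir(rq) == READ ? "READ" : "WRITE",
               rq->tag, nvmeq->qid);
}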
I'm still thinking about a case in which the mapping gets screwed up,
where a ctx would appear in two hctxs' bitmaps after a remap, or where a
ctx got remapped to another hctx. I'm still learning my way through the
cpumap code, so I'm not sure it's a real possibility, but I'm not
convinced it isn't. Some preliminary tests don't suggest this is the
case at play, but I want to spend a little more time on this theory
(maybe for lack of better ideas :)
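To make the theory concrete, the sanity check I have in mind is along
these lines (hypothetical sketch; helper and field names as of ~v4.7
blk-mq internals, and it assumes the queue is quiet while it runs):

static void blk_mq_check_ctx_mapping(struct request_queue *q)
{
        struct blk_mq_hw_ctx *hctx;
        struct blk_mq_ctx *ctx;
        unsigned int i, j;

        /*
         * Every ctx hanging off an hctx should map back to that same
         * hctx; if a ctx ended up attached to two hctxs after a remap,
         * at least one of the checks below has to fail.
         */
        queue_for_each_hw_ctx(q, hctx, i) {
                hctx_for_each_ctx(hctx, ctx, j) {
                        if (blk_mq_map_queue(q, ctx->cpu) != hctx)
                                pr_warn("blk-mq: ctx of cpu %u attached to the wrong hctx\n",
                                        ctx->cpu);
                }
        }
}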
On a side note, probably unrelated to this crash, it also got me
thinking about the current usefulness of blk_mq_hctx_notify. Since the
CPU is dead, no more requests would be coming through its ctx. I think
we could force a queue run in blk_mq_queue_reinit_notify, before
remapping, which would cause the hctx to fetch the remaining requests
from that dead ctx (since it's not unmapped yet). This way, we could
maintain a single hotplug notification hook and simplify the hotplug
path. I haven't written code for it yet, but I'll see if I can come up
with something and send it to the list.
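Just to illustrate the direction (hypothetical, untested sketch; the
helper name is made up, and it would be called from
blk_mq_queue_reinit_notify() before the map is rebuilt):

static void blk_mq_drain_dead_cpu(struct request_queue *q, unsigned int cpu)
{
        struct blk_mq_hw_ctx *hctx;
        unsigned int i;

        /*
         * The dead CPU's ctx is still mapped at this point, so running
         * the hw queues that cover it lets them pull whatever requests
         * were left behind on that ctx.
         */
        queue_for_each_hw_ctx(q, hctx, i) {
                if (cpumask_test_cpu(cpu, hctx->cpumask))
                        blk_mq_run_hw_queue(hctx, false);
        }
}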
--
Gabriel Krisman Bertazi