From: axboe@kernel.dk (Jens Axboe)
Subject: Oops when completing request on the wrong queue
Date: Tue, 23 Aug 2016 15:14:23 -0600
Message-ID: <164a4c63-065b-b766-36f3-bcef4aa46a38@kernel.dk>
In-Reply-To: <43693064-dd37-92ce-7753-2a8edb43eab5@kernel.dk>

On 08/23/2016 03:11 PM, Jens Axboe wrote:
> On 08/23/2016 02:54 PM, Gabriel Krisman Bertazi wrote:
>> Gabriel Krisman Bertazi <krisman at linux.vnet.ibm.com> writes:
>>
>>>> Can you share what you ran to online/offline CPUs? I can't reproduce
>>>> this here.
>>>
>>> I was using the ppc64_cpu tool, which shouldn't do anything more than
>>> write to sysfs, but I just reproduced it with the script below.
>>>
>>> Note that this is ppc64le. I don't have an x86 machine at hand to try
>>> to reproduce it right now, but I'll look for one and see how it goes.
>>
>> Hi,
>>
>> Any luck reproducing it? We were initially reproducing it with a
>> proprietary stress test, but I tried a generated fio jobfile together
>> with the SMT script I shared earlier, and I could reproduce the crash
>> consistently in less than 10 minutes of execution. This was still
>> ppc64le, though. I couldn't get my hands on NVMe on x86 yet.
>
> Nope, I have not been able to reproduce it. How long do the CPU
> offline/online actions take on ppc64? They're pretty slow on x86, which
> may hide the issue. I took out the various printk's associated with
> bringing a CPU off/online, as well as the IRQ breaking parts, but that
> didn't help in reproducing it.
>
>> The job file I used, as well as the smt.sh script, in case you want to
>> give it a try:
>>
>> jobfile: http://krisman.be/k/nvmejob.fio
>> smt.sh: http://krisman.be/k/smt.sh
>>
>> Still, the trigger seems to be consistently a heavy load of IO
>> associated with CPU addition/removal.
>
> My workload looks similar to yours, in that it's high depth and with a
> lot of jobs to keep most CPUs loaded. My bash script is different from
> yours; I'll try yours and see if it helps here.

Actually, I take that back. You're not using O_DIRECT, hence all your
jobs are running at QD=1, not the 256 specified. That looks odd, but
I'll try it; maybe it'll hit something different.
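
To illustrate the O_DIRECT point, here is a minimal sketch of a fio job
section. It is not the jobfile linked above. It assumes the libaio
engine (the thread does not say which engine that jobfile uses), and the
device path, block size, job count and runtime are hypothetical
placeholders. With libaio, the fio documentation notes that Linux may
only support queued behavior for non-buffered I/O, so the requested
iodepth is only honored when direct=1 is set:

```ini
; Sketch only: engine, device path and sizes are assumptions,
; not taken from the jobfile discussed in this thread.
[global]
ioengine=libaio
; Open the target with O_DIRECT. Without this, buffered submissions
; tend to complete synchronously and the job effectively runs at QD=1.
direct=1
iodepth=256
; Read-only workload so the sketch does not overwrite data.
rw=randread
bs=4k
time_based
runtime=600
group_reporting

[nvme-qd256]
; Hypothetical device node; adjust for the system under test.
filename=/dev/nvme0n1
numjobs=16
```

The "IO depths" line in fio's per-job output shows the distribution of
in-flight depths, which is a quick way to check whether the requested
depth was actually reached.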
--
Jens Axboe
Thread overview: 16+ messages
2016-08-10 4:04 Oops when completing request on the wrong queue Gabriel Krisman Bertazi
2016-08-11 17:16 ` Keith Busch
2016-08-11 18:10 ` Gabriel Krisman Bertazi
2016-08-19 13:28 ` Gabriel Krisman Bertazi
2016-08-19 14:13 ` Jens Axboe
2016-08-19 15:51 ` Jens Axboe
2016-08-19 16:38 ` Gabriel Krisman Bertazi
2016-08-23 20:54 ` Gabriel Krisman Bertazi
2016-08-23 21:11 ` Jens Axboe
2016-08-23 21:14 ` Jens Axboe [this message]
2016-08-23 22:49 ` Keith Busch
2016-08-24 18:34 ` Jens Axboe
2016-08-24 20:36 ` Jens Axboe
2016-08-29 18:06 ` Gabriel Krisman Bertazi
2016-08-29 18:40 ` Jens Axboe
2016-09-05 12:02 ` Gabriel Krisman Bertazi