From: axboe@kernel.dk (Jens Axboe)
Subject: [GIT PULL] nvme fix for 4.16-rc6
Date: Thu, 22 Mar 2018 15:32:45 -0600
Message-ID: <71eda309-fa4a-7641-9292-fcc6c77d44af@kernel.dk>
In-Reply-To: <20180322210917.GK12909@localhost.localdomain>
On 3/22/18 3:09 PM, Keith Busch wrote:
> On Wed, Mar 21, 2018 at 03:44:32PM -0600, Jens Axboe wrote:
>>
>> [ 30.241598] nvme nvme2: pci function 0000:0b:00.0
>> [ 30.247205] nvme nvme3: pci function 0000:81:00.0
>> [ 30.252684] nvme nvme4: pci function 0000:82:00.0
>> [ 30.258144] nvme nvme5: pci function 0000:83:00.0
>> [ 30.263606] nvme nvme6: pci function 0000:84:00.0
>> [ 30.360555] nvme nvme3: could not set timestamp (8194)
>> [ 30.481649] nvme nvme6: Shutdown timeout set to 8 seconds
>> [ 38.790949] nvme nvme4: Device not ready; aborting initialisation
>> [ 38.797857] nvme nvme4: Removing after probe failure status: -19
>> [ 60.708816] nvme nvme3: I/O 363 QID 8 timeout, completion polled
>> [ 60.708820] nvme nvme6: I/O 781 QID 7 timeout, completion polled
>> [ 68.068772] nvme nvme2: I/O 769 QID 28 timeout, completion polled
>> [ 91.108626] nvme nvme6: I/O 781 QID 7 timeout, completion polled
>> [ 98.660581] nvme nvme2: I/O 769 QID 28 timeout, completion polled
>> [ 121.702691] nvme nvme6: I/O 100 QID 7 timeout, completion polled
>> [ 128.998648] nvme nvme3: I/O 387 QID 4 timeout, completion polled
>> [ 152.038523] nvme nvme6: I/O 781 QID 7 timeout, completion polled
>>
>> This is just doing an fdisk -l after load. No interrupts triggering,
>> looking at /proc/interrupts for the queues that timeout.
>
> So no interrupts triggered for the queues that timeout, but you are
> getting interrupts for other queues. Are there by chance many spurious
> interrupts for those other queues?
I picked one device, and I did:
for i in $(seq 0 47); do echo cpu $i; taskset -c $i dd if=/dev/nvme6n1 of=/dev/null bs=4k iflag=direct count=1; done
[...]
cpu 17
1+0 records in
1+0 records out
4096 bytes (4.1 kB, 4.0 KiB) copied, 30.608 s, 0.1 kB/s
Same slowness for CPUs 19, 21, 23, 41, 43, 45, and 47. Looking at dmesg:
[ 701.725191] nvme nvme6: I/O 912 QID 7 timeout, completion polled
[ 732.124646] nvme nvme6: I/O 896 QID 7 timeout, completion polled
[ 762.652479] nvme nvme6: I/O 840 QID 7 timeout, completion polled
[ 793.052259] nvme nvme6: I/O 754 QID 7 timeout, completion polled
[ 823.644082] nvme nvme6: I/O 339 QID 7 timeout, completion polled
[ 853.979878] nvme nvme6: I/O 86 QID 7 timeout, completion polled
[ 884.571686] nvme nvme6: I/O 997 QID 7 timeout, completion polled
[ 914.907506] nvme nvme6: I/O 694 QID 7 timeout, completion polled
We see them all timing out after 30s, and all on QID 7, which
is hctx6. There are 8 hw queues (and interrupts), but only some of
them seem to actually be triggering, judging by /proc/interrupts.
There seems to be some mismatch. nvme6q7 is IRQ 244:
# cat /proc/irq/244/smp_affinity_list
49,51,53,55,57,59,61,63
and 243 is nvme6q6:
# cat /proc/irq/243/smp_affinity_list
17,19,21,23,41,43,45,47
244 has never triggered. If I do:
# taskset -c 17 dd if=/dev/nvme6n1 of=/dev/null bs=4k iflag=direct count=1
and then look at /proc/interrupts, none of the nvme6-associated
interrupts have triggered.
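The affinity mismatch can also be checked mechanically. What follows is a minimal sketch, assuming the two smp_affinity_list values are copied as strings from the output above; the helper names expand_cpulist and cpu_in_list are hypothetical, and the parser only handles plain "a,b,c" and "a-b" entries, not the stride form:

```shell
#!/usr/bin/env bash
# Hypothetical helper: expand a cpulist such as "0-3,8" into one CPU per line.
expand_cpulist() {
    local list=$1 part lo hi
    local IFS=','
    for part in $list; do
        if [[ $part == *-* ]]; then
            lo=${part%-*}
            hi=${part#*-}
            seq "$lo" "$hi"
        else
            echo "$part"
        fi
    done
}

# Hypothetical helper: succeed if CPU $1 appears in cpulist $2.
cpu_in_list() {
    local cpu
    for cpu in $(expand_cpulist "$2"); do
        if [[ $cpu == "$1" ]]; then
            return 0
        fi
    done
    return 1
}

# Affinity lists copied from the /proc/irq output in this message.
q7_affinity="49,51,53,55,57,59,61,63"   # IRQ 244, nvme6q7
q6_affinity="17,19,21,23,41,43,45,47"   # IRQ 243, nvme6q6

# CPUs whose single 4k read stalled for ~30s and timed out on QID 7.
for cpu in 17 19 21 23 41 43 45 47; do
    if cpu_in_list "$cpu" "$q7_affinity"; then
        where="nvme6q7 (IRQ 244)"
    elif cpu_in_list "$cpu" "$q6_affinity"; then
        where="nvme6q6 (IRQ 243)"
    else
        where="neither list"
    fi
    echo "cpu $cpu: in affinity list of $where"
done
```

Every CPU that saw the stall lands in the IRQ 243 list, while the timeouts are reported against QID 7, whose IRQ 244 list contains none of them.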
--
Jens Axboe