From mboxrd@z Thu Jan  1 00:00:00 1970
From: axboe@kernel.dk (Jens Axboe)
Date: Thu, 22 Mar 2018 15:32:45 -0600
Subject: [GIT PULL] nvme fix for 4.16-rc6
In-Reply-To: <20180322210917.GK12909@localhost.localdomain>
References: <20180316160110.GC720@localhost.localdomain>
 <20180316161655.GA13293@lst.de>
 <8734416c-cd14-8bfb-8e2c-6eab84e0ce5d@kernel.dk>
 <20180321205958.GH12909@localhost.localdomain>
 <20180322210917.GK12909@localhost.localdomain>
Message-ID: <71eda309-fa4a-7641-9292-fcc6c77d44af@kernel.dk>

On 3/22/18 3:09 PM, Keith Busch wrote:
> On Wed, Mar 21, 2018 at 03:44:32PM -0600, Jens Axboe wrote:
>>
>> [ 30.241598] nvme nvme2: pci function 0000:0b:00.0
>> [ 30.247205] nvme nvme3: pci function 0000:81:00.0
>> [ 30.252684] nvme nvme4: pci function 0000:82:00.0
>> [ 30.258144] nvme nvme5: pci function 0000:83:00.0
>> [ 30.263606] nvme nvme6: pci function 0000:84:00.0
>> [ 30.360555] nvme nvme3: could not set timestamp (8194)
>> [ 30.481649] nvme nvme6: Shutdown timeout set to 8 seconds
>> [ 38.790949] nvme nvme4: Device not ready; aborting initialisation
>> [ 38.797857] nvme nvme4: Removing after probe failure status: -19
>> [ 60.708816] nvme nvme3: I/O 363 QID 8 timeout, completion polled
>> [ 60.708820] nvme nvme6: I/O 781 QID 7 timeout, completion polled
>> [ 68.068772] nvme nvme2: I/O 769 QID 28 timeout, completion polled
>> [ 91.108626] nvme nvme6: I/O 781 QID 7 timeout, completion polled
>> [ 98.660581] nvme nvme2: I/O 769 QID 28 timeout, completion polled
>> [ 121.702691] nvme nvme6: I/O 100 QID 7 timeout, completion polled
>> [ 128.998648] nvme nvme3: I/O 387 QID 4 timeout, completion polled
>> [ 152.038523] nvme nvme6: I/O 781 QID 7 timeout, completion polled
>>
>> This is just doing an fdisk -l after load. No interrupts triggering,
>> looking at /proc/interrupts for the queues that timeout.
>
> So no interrupts triggered for the queues that timeout, but you are
> getting interrupts for other queues.
> Are there by chance many spurious
> interrupts for those other queues?

I picked one device, and I did:

for i in $(seq 0 47); do echo cpu $i; taskset -c $i dd if=/dev/nvme6n1 of=/dev/null bs=4k iflag=direct count=1; done

[...]
cpu 17
1+0 records in
1+0 records out
4096 bytes (4.1 kB, 4.0 KiB) copied, 30.608 s, 0.1 kB/s

Same slowness for CPUs 19, 21, 23, 41, 43, 45, 47. Looking at dmesg:

[ 701.725191] nvme nvme6: I/O 912 QID 7 timeout, completion polled
[ 732.124646] nvme nvme6: I/O 896 QID 7 timeout, completion polled
[ 762.652479] nvme nvme6: I/O 840 QID 7 timeout, completion polled
[ 793.052259] nvme nvme6: I/O 754 QID 7 timeout, completion polled
[ 823.644082] nvme nvme6: I/O 339 QID 7 timeout, completion polled
[ 853.979878] nvme nvme6: I/O 86 QID 7 timeout, completion polled
[ 884.571686] nvme nvme6: I/O 997 QID 7 timeout, completion polled
[ 914.907506] nvme nvme6: I/O 694 QID 7 timeout, completion polled

We see them all timing out after 30s, and all on QID 7, which is hctx6.
There are 8 hw queues (and interrupts), but judging by /proc/interrupts
only some of them are actually triggering. There seems to be some
mismatch. nvme6q7 is IRQ 244:

# cat /proc/irq/244/smp_affinity_list
49,51,53,55,57,59,61,63

and 243 is nvme6q6:

# cat /proc/irq/243/smp_affinity_list
17,19,21,23,41,43,45,47

244 has never triggered. If I do:

# taskset -c 17 dd if=/dev/nvme6n1 of=/dev/null bs=4k iflag=direct count=1

and then look at /proc/interrupts, none of the nvme6-associated
interrupts have triggered.

-- 
Jens Axboe
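[Not part of the original mail: the manual check above, eyeballing /proc/interrupts for nvme6 vectors that never fire, can be scripted. A minimal sketch, assuming a standard Linux /proc/interrupts layout (IRQ number, one count column per CPU, then chip/name columns) and using the "nvme6" device name from the report as the example pattern:]

```shell
# find_idle_irqs PATTERN
# Reads /proc/interrupts-style text on stdin and prints the IRQ number
# and action name of any matching interrupt whose per-CPU counts sum
# to zero, i.e. vectors that have never triggered.
find_idle_irqs() {
    awk -v pat="$1" '
        $1 ~ /^[0-9]+:$/ && $0 ~ pat {
            total = 0
            for (i = 2; i <= NF; i++) {
                # count columns are pure digits; stop at the chip name
                if ($i ~ /^[0-9]+$/) total += $i
                else break
            }
            if (total == 0) print $1, $NF
        }'
}

# on a live system: find_idle_irqs nvme6 < /proc/interrupts
```

[The sample invocation and the affinity lists would then be cross-checked against /proc/irq/<n>/smp_affinity_list, as done by hand above.]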