* [PATCH] nvme: utilize two queue maps, one for reads and one for writes [not found] <20181114004148.GA29545@roeck-us.net> @ 2018-11-14 0:51 ` Jens Axboe 2018-11-14 1:28 ` Mike Snitzer 2018-11-14 4:52 ` [PATCH] " Guenter Roeck 0 siblings, 2 replies; 15+ messages in thread From: Jens Axboe @ 2018-11-14 0:51 UTC (permalink / raw) On 11/13/18 5:41 PM, Guenter Roeck wrote: > Hi, > > On Wed, Oct 31, 2018@08:36:31AM -0600, Jens Axboe wrote: >> NVMe does round-robin between queues by default, which means that >> sharing a queue map for both reads and writes can be problematic >> in terms of read servicing. It's much easier to flood the queue >> with writes and reduce the read servicing. >> >> Implement two queue maps, one for reads and one for writes. The >> write queue count is configurable through the 'write_queues' >> parameter. >> >> By default, we retain the previous behavior of having a single >> queue set, shared between reads and writes. Setting 'write_queues' >> to a non-zero value will create two queue sets, one for reads and >> one for writes, the latter using the configurable number of >> queues (hardware queue counts permitting). >> >> Reviewed-by: Hannes Reinecke <hare at suse.com> >> Reviewed-by: Keith Busch <keith.busch at intel.com> >> Signed-off-by: Jens Axboe <axboe at kernel.dk> > > This patch causes hangs when running recent versions of > -next with several architectures; see the -next column at > kerneltests.org/builders for details. Bisect log below; this > was run with qemu on alpha. Reverting this patch as well as > "nvme: add separate poll queue map" fixes the problem. I don't see anything related to what hung, the trace, and so on. Can you clue me in? Where are the test results with dmesg? How to reproduce? -- Jens Axboe ^ permalink raw reply [flat|nested] 15+ messages in thread
* nvme: utilize two queue maps, one for reads and one for writes 2018-11-14 0:51 ` [PATCH] nvme: utilize two queue maps, one for reads and one for writes Jens Axboe @ 2018-11-14 1:28 ` Mike Snitzer 2018-11-14 1:36 ` Mike Snitzer 2018-11-14 4:52 ` [PATCH] " Guenter Roeck 1 sibling, 1 reply; 15+ messages in thread From: Mike Snitzer @ 2018-11-14 1:28 UTC (permalink / raw) On Tue, Nov 13 2018 at 7:51pm -0500, Jens Axboe <axboe@kernel.dk> wrote: > On 11/13/18 5:41 PM, Guenter Roeck wrote: > > Hi, > > > > On Wed, Oct 31, 2018@08:36:31AM -0600, Jens Axboe wrote: > >> NVMe does round-robin between queues by default, which means that > >> sharing a queue map for both reads and writes can be problematic > >> in terms of read servicing. It's much easier to flood the queue > >> with writes and reduce the read servicing. > >> > >> Implement two queue maps, one for reads and one for writes. The > >> write queue count is configurable through the 'write_queues' > >> parameter. > >> > >> By default, we retain the previous behavior of having a single > >> queue set, shared between reads and writes. Setting 'write_queues' > >> to a non-zero value will create two queue sets, one for reads and > >> one for writes, the latter using the configurable number of > >> queues (hardware queue counts permitting). > >> > >> Reviewed-by: Hannes Reinecke <hare at suse.com> > >> Reviewed-by: Keith Busch <keith.busch at intel.com> > >> Signed-off-by: Jens Axboe <axboe at kernel.dk> > > > > This patch causes hangs when running recent versions of > > -next with several architectures; see the -next column at > > kerneltests.org/builders for details. Bisect log below; this > > was run with qemu on alpha. Reverting this patch as well as > > "nvme: add separate poll queue map" fixes the problem. > > I don't see anything related to what hung, the trace, and so on. > Can you clue me in? Where are the test results with dmesg? > > How to reproduce? Think Guenter should've provided a full kerneltests.org url, but I had a look and found this for powerpc with -next: https://kerneltests.org/builders/next-powerpc-next/builds/998/steps/buildcommand/logs/stdio Has useful logs of the build failure due to block. (not seeing any -next failure for alpha but Guenter said he was using qemu so the build failure could've been any arch qemu supports) Mike ^ permalink raw reply [flat|nested] 15+ messages in thread
* nvme: utilize two queue maps, one for reads and one for writes 2018-11-14 1:28 ` Mike Snitzer @ 2018-11-14 1:36 ` Mike Snitzer 0 siblings, 0 replies; 15+ messages in thread From: Mike Snitzer @ 2018-11-14 1:36 UTC (permalink / raw) On Tue, Nov 13 2018 at 8:28pm -0500, Mike Snitzer <snitzer@redhat.com> wrote: > On Tue, Nov 13 2018 at 7:51pm -0500, > Jens Axboe <axboe@kernel.dk> wrote: > > > On 11/13/18 5:41 PM, Guenter Roeck wrote: > > > Hi, > > > > > > On Wed, Oct 31, 2018@08:36:31AM -0600, Jens Axboe wrote: > > >> NVMe does round-robin between queues by default, which means that > > >> sharing a queue map for both reads and writes can be problematic > > >> in terms of read servicing. It's much easier to flood the queue > > >> with writes and reduce the read servicing. > > >> > > >> Implement two queue maps, one for reads and one for writes. The > > >> write queue count is configurable through the 'write_queues' > > >> parameter. > > >> > > >> By default, we retain the previous behavior of having a single > > >> queue set, shared between reads and writes. Setting 'write_queues' > > >> to a non-zero value will create two queue sets, one for reads and > > >> one for writes, the latter using the configurable number of > > >> queues (hardware queue counts permitting). > > >> > > >> Reviewed-by: Hannes Reinecke <hare at suse.com> > > >> Reviewed-by: Keith Busch <keith.busch at intel.com> > > >> Signed-off-by: Jens Axboe <axboe at kernel.dk> > > > > > > This patch causes hangs when running recent versions of > > > -next with several architectures; see the -next column at > > > kerneltests.org/builders for details. Bisect log below; this > > > was run with qemu on alpha. Reverting this patch as well as > > > "nvme: add separate poll queue map" fixes the problem. > > > > I don't see anything related to what hung, the trace, and so on. > > Can you clue me in? Where are the test results with dmesg? > > > > How to reproduce? > > Think Guenter should've provided a full kerneltests.org url, but I had a > look and found this for powerpc with -next: > https://kerneltests.org/builders/next-powerpc-next/builds/998/steps/buildcommand/logs/stdio > > Has useful logs of the build failure due to block. Take that back, of course I only had a quick look and first scrolled to this fragment and thought "yeap shows block build failure" (not _really_): opt/buildbot/slave/next-next/build/kernel/sched/psi.c: In function 'cgroup_move_task': /opt/buildbot/slave/next-next/build/include/linux/spinlock.h:273:32: warning: 'rq' may be used uninitialized in this function [-Wmaybe-uninitialized] #define raw_spin_unlock(lock) _raw_spin_unlock(lock) ^~~~~~~~~~~~~~~~ /opt/buildbot/slave/next-next/build/kernel/sched/psi.c:639:13: note: 'rq' was declared here struct rq *rq; ^~ ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] nvme: utilize two queue maps, one for reads and one for writes
  2018-11-14  0:51 ` [PATCH] nvme: utilize two queue maps, one for reads and one for writes Jens Axboe
  2018-11-14  1:28   ` Mike Snitzer
@ 2018-11-14  4:52   ` Guenter Roeck
  2018-11-14 17:12     ` Jens Axboe
  1 sibling, 1 reply; 15+ messages in thread
From: Guenter Roeck @ 2018-11-14  4:52 UTC (permalink / raw)

On Tue, Nov 13, 2018@05:51:08PM -0700, Jens Axboe wrote:
> On 11/13/18 5:41 PM, Guenter Roeck wrote:
> > Hi,
> >
> > On Wed, Oct 31, 2018@08:36:31AM -0600, Jens Axboe wrote:
> >> NVMe does round-robin between queues by default, which means that
> >> sharing a queue map for both reads and writes can be problematic
> >> in terms of read servicing. It's much easier to flood the queue
> >> with writes and reduce the read servicing.
> >>
> >> Implement two queue maps, one for reads and one for writes. The
> >> write queue count is configurable through the 'write_queues'
> >> parameter.
> >>
> >> By default, we retain the previous behavior of having a single
> >> queue set, shared between reads and writes. Setting 'write_queues'
> >> to a non-zero value will create two queue sets, one for reads and
> >> one for writes, the latter using the configurable number of
> >> queues (hardware queue counts permitting).
> >>
> >> Reviewed-by: Hannes Reinecke <hare at suse.com>
> >> Reviewed-by: Keith Busch <keith.busch at intel.com>
> >> Signed-off-by: Jens Axboe <axboe at kernel.dk>
> >
> > This patch causes hangs when running recent versions of
> > -next with several architectures; see the -next column at
> > kerneltests.org/builders for details. Bisect log below; this
> > was run with qemu on alpha. Reverting this patch as well as
> > "nvme: add separate poll queue map" fixes the problem.
>
> I don't see anything related to what hung, the trace, and so on.
> Can you clue me in? Where are the test results with dmesg?
>
alpha just stalls during boot. parisc reports a hung task
in nvme_reset_work. sparc64 reports EIO when instantiating
the nvme driver, called from nvme_reset_work, and then stalls.
In all three cases, reverting the two mentioned patches fixes
the problem.

https://kerneltests.org/builders/qemu-parisc-next/builds/173/steps/qemubuildcommand_1/logs/stdio

is an example log for parisc.

I didn't check if the other boot failures (ppc looks bad)
have the same root cause.

> How to reproduce?
>
parisc:

qemu-system-hppa -kernel vmlinux -no-reboot \
	-snapshot -device nvme,serial=foo,drive=d0 \
	-drive file=rootfs.ext2,if=none,format=raw,id=d0 \
	-append 'root=/dev/nvme0n1 rw rootwait panic=-1 console=ttyS0,115200 ' \
	-nographic -monitor null

alpha:

qemu-system-alpha -M clipper -kernel arch/alpha/boot/vmlinux -no-reboot \
	-snapshot -device nvme,serial=foo,drive=d0 \
	-drive file=rootfs.ext2,if=none,format=raw,id=d0 \
	-append 'root=/dev/nvme0n1 rw rootwait panic=-1 console=ttyS0' \
	-m 128M -nographic -monitor null -serial stdio

sparc64:

qemu-system-sparc64 -M sun4u -cpu 'TI UltraSparc IIi' -m 512 \
	-snapshot -device nvme,serial=foo,drive=d0,bus=pciB \
	-drive file=rootfs.ext2,if=none,format=raw,id=d0 \
	-kernel arch/sparc/boot/image -no-reboot \
	-append 'root=/dev/nvme0n1 rw rootwait panic=-1 console=ttyS0' \
	-nographic -monitor none

The root file systems are available from the respective subdirectories
of:

https://github.com/groeck/linux-build-test/tree/master/rootfs

Guenter

^ permalink raw reply	[flat|nested] 15+ messages in thread
* [PATCH] nvme: utilize two queue maps, one for reads and one for writes 2018-11-14 4:52 ` [PATCH] " Guenter Roeck @ 2018-11-14 17:12 ` Jens Axboe 0 siblings, 0 replies; 15+ messages in thread From: Jens Axboe @ 2018-11-14 17:12 UTC (permalink / raw) On 11/13/18 9:52 PM, Guenter Roeck wrote: > On Tue, Nov 13, 2018@05:51:08PM -0700, Jens Axboe wrote: >> On 11/13/18 5:41 PM, Guenter Roeck wrote: >>> Hi, >>> >>> On Wed, Oct 31, 2018@08:36:31AM -0600, Jens Axboe wrote: >>>> NVMe does round-robin between queues by default, which means that >>>> sharing a queue map for both reads and writes can be problematic >>>> in terms of read servicing. It's much easier to flood the queue >>>> with writes and reduce the read servicing. >>>> >>>> Implement two queue maps, one for reads and one for writes. The >>>> write queue count is configurable through the 'write_queues' >>>> parameter. >>>> >>>> By default, we retain the previous behavior of having a single >>>> queue set, shared between reads and writes. Setting 'write_queues' >>>> to a non-zero value will create two queue sets, one for reads and >>>> one for writes, the latter using the configurable number of >>>> queues (hardware queue counts permitting). >>>> >>>> Reviewed-by: Hannes Reinecke <hare at suse.com> >>>> Reviewed-by: Keith Busch <keith.busch at intel.com> >>>> Signed-off-by: Jens Axboe <axboe at kernel.dk> >>> >>> This patch causes hangs when running recent versions of >>> -next with several architectures; see the -next column at >>> kerneltests.org/builders for details. Bisect log below; this >>> was run with qemu on alpha. Reverting this patch as well as >>> "nvme: add separate poll queue map" fixes the problem. >> >> I don't see anything related to what hung, the trace, and so on. >> Can you clue me in? Where are the test results with dmesg? >> > alpha just stalls during boot. parisc reports a hung task > in nvme_reset_work. sparc64 reports EIO when instantiating > the nvme driver, called from nvme_reset_work, and then stalls. > In all three cases, reverting the two mentioned patches fixes > the problem. I think the below patch should fix it. > https://kerneltests.org/builders/qemu-parisc-next/builds/173/steps/qemubuildcommand_1/logs/stdio > > is an example log for parisc. > > I didn't check if the other boot failures (ppc looks bad) > have the same root cause. > >> How to reproduce? >> > parisc: > > qemu-system-hppa -kernel vmlinux -no-reboot \ > -snapshot -device nvme,serial=foo,drive=d0 \ > -drive file=rootfs.ext2,if=none,format=raw,id=d0 \ > -append 'root=/dev/nvme0n1 rw rootwait panic=-1 console=ttyS0,115200 ' \ > -nographic -monitor null > > alpha: > > qemu-system-alpha -M clipper -kernel arch/alpha/boot/vmlinux -no-reboot \ > -snapshot -device nvme,serial=foo,drive=d0 \ > -drive file=rootfs.ext2,if=none,format=raw,id=d0 \ > -append 'root=/dev/nvme0n1 rw rootwait panic=-1 console=ttyS0' \ > -m 128M -nographic -monitor null -serial stdio > > sparc64: > > qemu-system-sparc64 -M sun4u -cpu 'TI UltraSparc IIi' -m 512 \ > -snapshot -device nvme,serial=foo,drive=d0,bus=pciB \ > -drive file=rootfs.ext2,if=none,format=raw,id=d0 \ > -kernel arch/sparc/boot/image -no-reboot \ > -append 'root=/dev/nvme0n1 rw rootwait panic=-1 console=ttyS0' \ > -nographic -monitor none > > The root file systems are available from the respective subdirectories > of: > > https://github.com/groeck/linux-build-test/tree/master/rootfs This is useful, thanks! I haven't tried it yet, but I was able to reproduce on x86 with MSI turned off. 
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 8df868afa363..6c03461ad988 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2098,7 +2098,7 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int nr_io_queues)
 		.nr_sets = ARRAY_SIZE(irq_sets),
 		.sets = irq_sets,
 	};
-	int result;
+	int result = 0;
 
 	/*
 	 * For irq sets, we have to ask for minvec == maxvec. This passes
@@ -2113,9 +2113,16 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int nr_io_queues)
 			affd.nr_sets = 1;
 
 		/*
-		 * Need IRQs for read+write queues, and one for the admin queue
+		 * Need IRQs for read+write queues, and one for the admin queue.
+		 * If we can't get more than one vector, we have to share the
+		 * admin queue and IO queue vector. For that case, don't add
+		 * an extra vector for the admin queue, or we'll continue
+		 * asking for 2 and get -ENOSPC in return.
 		 */
-		nr_io_queues = irq_sets[0] + irq_sets[1] + 1;
+		if (result == -ENOSPC && nr_io_queues == 1)
+			nr_io_queues = 1;
+		else
+			nr_io_queues = irq_sets[0] + irq_sets[1] + 1;
 
 		result = pci_alloc_irq_vectors_affinity(pdev, nr_io_queues,
 				nr_io_queues,

-- 
Jens Axboe

^ permalink raw reply related	[flat|nested] 15+ messages in thread
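For readers without the driver source at hand, the hunks above only show fragments of nvme_setup_irqs(). The sketch below reconstructs the surrounding retry loop from the pieces quoted in this thread, as it would look after this first patch (the follow-up messages refine the error handling further). Anything not visible in the quoted hunks, in particular the set-size calculation and the allocation flags, is an assumption rather than the actual kernel code.

/*
 * Reconstructed sketch of the retry loop the hunks above modify; details
 * not visible in the quoted diffs are assumptions, not verbatim driver code.
 */
static int nvme_setup_irqs(struct nvme_dev *dev, int nr_io_queues)
{
	struct pci_dev *pdev = to_pci_dev(dev->dev);
	int irq_sets[2];
	struct irq_affinity affd = {
		.pre_vectors = 1,		/* one vector reserved for the admin queue */
		.nr_sets = ARRAY_SIZE(irq_sets),
		.sets = irq_sets,
	};
	int result = 0;	/* initialized: it is tested before the first allocation */

	for (;;) {
		/*
		 * Split nr_io_queues between the two sets; the real policy
		 * (and which index holds reads vs. writes) is not visible in
		 * the quoted hunks, so a trivial split stands in here.
		 */
		irq_sets[0] = (nr_io_queues + 1) / 2;
		irq_sets[1] = nr_io_queues - irq_sets[0];
		if (!irq_sets[1])
			affd.nr_sets = 1;

		/*
		 * Ask for read+write vectors plus one for the admin queue,
		 * unless a previous pass already failed with a single IO
		 * queue left: then keep asking for one shared vector so we
		 * don't bounce off -ENOSPC forever.
		 */
		if (result == -ENOSPC && nr_io_queues == 1)
			nr_io_queues = 1;
		else
			nr_io_queues = irq_sets[0] + irq_sets[1] + 1;

		result = pci_alloc_irq_vectors_affinity(pdev, nr_io_queues,
				nr_io_queues,
				PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
		if (result == -ENOSPC) {
			/* too few vectors: shed one IO queue and retry */
			nr_io_queues--;
			if (!nr_io_queues)
				return result;
			continue;
		} else if (result <= 0)
			return -EIO;
		break;
	}

	return result;
}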
[parent not found: <20181115182833.GA15729@roeck-us.net>]
* [PATCH] nvme: utilize two queue maps, one for reads and one for writes [not found] <20181115182833.GA15729@roeck-us.net> @ 2018-11-15 18:38 ` Jens Axboe 0 siblings, 0 replies; 15+ messages in thread From: Jens Axboe @ 2018-11-15 18:38 UTC (permalink / raw) On 11/15/18 11:28 AM, Guenter Roeck wrote: > Hi Jens, > >> I think the below patch should fix it. >> > Sorry I wasn't able to test this earlier. Looks like it does > fix the problem; the problem is no longer seen in next-20181115. > Minor comment below. That's fine, thanks for testing! >> /* >> - * Need IRQs for read+write queues, and one for the admin queue >> + * Need IRQs for read+write queues, and one for the admin queue. >> + * If we can't get more than one vector, we have to share the >> + * admin queue and IO queue vector. For that case, don't add >> + * an extra vector for the admin queue, or we'll continue >> + * asking for 2 and get -ENOSPC in return. >> */ >> - nr_io_queues = irq_sets[0] + irq_sets[1] + 1; >> + if (result == -ENOSPC && nr_io_queues == 1) >> + nr_io_queues = 1; > > Setting nr_io_queues to 1 when it already is set to 1 doesn't really do > anything. Is this for clarification ? Guess that does look a bit odd, alternative would be to flip the condition, but I think this one is easier to read. -- Jens Axboe ^ permalink raw reply [flat|nested] 15+ messages in thread
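To make the readability trade-off being discussed concrete, here are the two equivalent shapes of that test, simplified from the hunk above; only the two surrounding statements are shown, and the second form anticipates the flipped condition adopted in the follow-up patch further down the thread.

/* Shape used in the patch: the self-assignment spells out the intent. */
if (result == -ENOSPC && nr_io_queues == 1)
	nr_io_queues = 1;	/* keep sharing one vector between admin and IO */
else
	nr_io_queues = irq_sets[0] + irq_sets[1] + 1;

/*
 * Equivalent shape with the condition flipped, as later used in the
 * follow-up patch (there with `result < 0` instead of `== -ENOSPC`).
 */
if (!(result == -ENOSPC && nr_io_queues == 1))
	nr_io_queues = irq_sets[0] + irq_sets[1] + 1;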
[parent not found: <20181115191126.GA16973@roeck-us.net>]
* [PATCH] nvme: utilize two queue maps, one for reads and one for writes [not found] <20181115191126.GA16973@roeck-us.net> @ 2018-11-15 19:29 ` Jens Axboe 2018-11-15 19:38 ` Guenter Roeck 2018-11-15 19:36 ` Guenter Roeck 1 sibling, 1 reply; 15+ messages in thread From: Jens Axboe @ 2018-11-15 19:29 UTC (permalink / raw) On 11/15/18 12:11 PM, Guenter Roeck wrote: > On Wed, Nov 14, 2018@10:12:44AM -0700, Jens Axboe wrote: >> >> I think the below patch should fix it. >> > > I spoke too early. sparc64, next-20181115: > > [ 14.204370] nvme nvme0: pci function 0000:02:00.0 > [ 14.249956] nvme nvme0: Removing after probe failure status: -5 > [ 14.263496] ------------[ cut here ]------------ > [ 14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 __free_irq+0xa4/0x320 > [ 14.264265] Trying to free already-free IRQ 9 > [ 14.264519] Modules linked in: > [ 14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 4.20.0-rc2-next-20181115 #1 > [ 14.265555] Workqueue: nvme-reset-wq nvme_reset_work > [ 14.265899] Call Trace: > [ 14.266118] [000000000046944c] __warn+0xcc/0x100 > [ 14.266375] [00000000004694b0] warn_slowpath_fmt+0x30/0x40 > [ 14.266635] [00000000004d4ce4] __free_irq+0xa4/0x320 > [ 14.266867] [00000000004d4ff8] free_irq+0x38/0x80 > [ 14.267092] [00000000007b1874] pci_free_irq+0x14/0x40 > [ 14.267327] [00000000008a5444] nvme_dev_disable+0xe4/0x520 > [ 14.267576] [00000000008a69b8] nvme_reset_work+0x138/0x1c60 > [ 14.267827] [0000000000488dd0] process_one_work+0x230/0x6e0 > [ 14.268079] [00000000004894f4] worker_thread+0x274/0x520 > [ 14.268321] [0000000000490624] kthread+0xe4/0x120 > [ 14.268544] [00000000004060c4] ret_from_fork+0x1c/0x2c > [ 14.268825] [0000000000000000] (null) > [ 14.269089] irq event stamp: 32796 > [ 14.269350] hardirqs last enabled at (32795): [<0000000000b624a4>] _raw_spin_unlock_irqrestore+0x24/0x80 > [ 14.269757] hardirqs last disabled at (32796): [<0000000000b622f4>] _raw_spin_lock_irqsave+0x14/0x60 > [ 14.270566] softirqs last enabled at (32780): [<0000000000b64c18>] __do_softirq+0x238/0x520 > [ 14.271206] softirqs last disabled at (32729): [<000000000042ceec>] do_softirq_own_stack+0x2c/0x40 > [ 14.272288] ---[ end trace cb79ccd2a0a03f3c ]--- > > Looks like an error during probe followed by an error cleanup problem. Did it previous probe fine? Or is the new thing just the fact that we spew a warning on trying to free a non-existing vector? -- Jens Axboe ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] nvme: utilize two queue maps, one for reads and one for writes 2018-11-15 19:29 ` Jens Axboe @ 2018-11-15 19:38 ` Guenter Roeck 2018-11-15 19:40 ` Jens Axboe 0 siblings, 1 reply; 15+ messages in thread From: Guenter Roeck @ 2018-11-15 19:38 UTC (permalink / raw) On Thu, Nov 15, 2018@12:29:04PM -0700, Jens Axboe wrote: > On 11/15/18 12:11 PM, Guenter Roeck wrote: > > On Wed, Nov 14, 2018@10:12:44AM -0700, Jens Axboe wrote: > >> > >> I think the below patch should fix it. > >> > > > > I spoke too early. sparc64, next-20181115: > > > > [ 14.204370] nvme nvme0: pci function 0000:02:00.0 > > [ 14.249956] nvme nvme0: Removing after probe failure status: -5 > > [ 14.263496] ------------[ cut here ]------------ > > [ 14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 __free_irq+0xa4/0x320 > > [ 14.264265] Trying to free already-free IRQ 9 > > [ 14.264519] Modules linked in: > > [ 14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 4.20.0-rc2-next-20181115 #1 > > [ 14.265555] Workqueue: nvme-reset-wq nvme_reset_work > > [ 14.265899] Call Trace: > > [ 14.266118] [000000000046944c] __warn+0xcc/0x100 > > [ 14.266375] [00000000004694b0] warn_slowpath_fmt+0x30/0x40 > > [ 14.266635] [00000000004d4ce4] __free_irq+0xa4/0x320 > > [ 14.266867] [00000000004d4ff8] free_irq+0x38/0x80 > > [ 14.267092] [00000000007b1874] pci_free_irq+0x14/0x40 > > [ 14.267327] [00000000008a5444] nvme_dev_disable+0xe4/0x520 > > [ 14.267576] [00000000008a69b8] nvme_reset_work+0x138/0x1c60 > > [ 14.267827] [0000000000488dd0] process_one_work+0x230/0x6e0 > > [ 14.268079] [00000000004894f4] worker_thread+0x274/0x520 > > [ 14.268321] [0000000000490624] kthread+0xe4/0x120 > > [ 14.268544] [00000000004060c4] ret_from_fork+0x1c/0x2c > > [ 14.268825] [0000000000000000] (null) > > [ 14.269089] irq event stamp: 32796 > > [ 14.269350] hardirqs last enabled at (32795): [<0000000000b624a4>] _raw_spin_unlock_irqrestore+0x24/0x80 > > [ 14.269757] hardirqs last disabled at (32796): [<0000000000b622f4>] _raw_spin_lock_irqsave+0x14/0x60 > > [ 14.270566] softirqs last enabled at (32780): [<0000000000b64c18>] __do_softirq+0x238/0x520 > > [ 14.271206] softirqs last disabled at (32729): [<000000000042ceec>] do_softirq_own_stack+0x2c/0x40 > > [ 14.272288] ---[ end trace cb79ccd2a0a03f3c ]--- > > > > Looks like an error during probe followed by an error cleanup problem. > > Did it previous probe fine? Or is the new thing just the fact that > we spew a warning on trying to free a non-existing vector? > This works fine in mainline, if that is your question. Guenter ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] nvme: utilize two queue maps, one for reads and one for writes 2018-11-15 19:38 ` Guenter Roeck @ 2018-11-15 19:40 ` Jens Axboe 2018-11-15 19:43 ` Jens Axboe 0 siblings, 1 reply; 15+ messages in thread From: Jens Axboe @ 2018-11-15 19:40 UTC (permalink / raw) On 11/15/18 12:38 PM, Guenter Roeck wrote: > On Thu, Nov 15, 2018@12:29:04PM -0700, Jens Axboe wrote: >> On 11/15/18 12:11 PM, Guenter Roeck wrote: >>> On Wed, Nov 14, 2018@10:12:44AM -0700, Jens Axboe wrote: >>>> >>>> I think the below patch should fix it. >>>> >>> >>> I spoke too early. sparc64, next-20181115: >>> >>> [ 14.204370] nvme nvme0: pci function 0000:02:00.0 >>> [ 14.249956] nvme nvme0: Removing after probe failure status: -5 >>> [ 14.263496] ------------[ cut here ]------------ >>> [ 14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 __free_irq+0xa4/0x320 >>> [ 14.264265] Trying to free already-free IRQ 9 >>> [ 14.264519] Modules linked in: >>> [ 14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 4.20.0-rc2-next-20181115 #1 >>> [ 14.265555] Workqueue: nvme-reset-wq nvme_reset_work >>> [ 14.265899] Call Trace: >>> [ 14.266118] [000000000046944c] __warn+0xcc/0x100 >>> [ 14.266375] [00000000004694b0] warn_slowpath_fmt+0x30/0x40 >>> [ 14.266635] [00000000004d4ce4] __free_irq+0xa4/0x320 >>> [ 14.266867] [00000000004d4ff8] free_irq+0x38/0x80 >>> [ 14.267092] [00000000007b1874] pci_free_irq+0x14/0x40 >>> [ 14.267327] [00000000008a5444] nvme_dev_disable+0xe4/0x520 >>> [ 14.267576] [00000000008a69b8] nvme_reset_work+0x138/0x1c60 >>> [ 14.267827] [0000000000488dd0] process_one_work+0x230/0x6e0 >>> [ 14.268079] [00000000004894f4] worker_thread+0x274/0x520 >>> [ 14.268321] [0000000000490624] kthread+0xe4/0x120 >>> [ 14.268544] [00000000004060c4] ret_from_fork+0x1c/0x2c >>> [ 14.268825] [0000000000000000] (null) >>> [ 14.269089] irq event stamp: 32796 >>> [ 14.269350] hardirqs last enabled at (32795): [<0000000000b624a4>] _raw_spin_unlock_irqrestore+0x24/0x80 >>> [ 14.269757] hardirqs last disabled at (32796): [<0000000000b622f4>] _raw_spin_lock_irqsave+0x14/0x60 >>> [ 14.270566] softirqs last enabled at (32780): [<0000000000b64c18>] __do_softirq+0x238/0x520 >>> [ 14.271206] softirqs last disabled at (32729): [<000000000042ceec>] do_softirq_own_stack+0x2c/0x40 >>> [ 14.272288] ---[ end trace cb79ccd2a0a03f3c ]--- >>> >>> Looks like an error during probe followed by an error cleanup problem. >> >> Did it previous probe fine? Or is the new thing just the fact that >> we spew a warning on trying to free a non-existing vector? >> > This works fine in mainline, if that is your question. Yeah, as soon as I sent the other email I realized that. Let me send you a quick patch. -- Jens Axboe ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] nvme: utilize two queue maps, one for reads and one for writes
  2018-11-15 19:40 ` Jens Axboe
@ 2018-11-15 19:43   ` Jens Axboe
  2018-11-15 22:06     ` Guenter Roeck
  0 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2018-11-15 19:43 UTC (permalink / raw)

On 11/15/18 12:40 PM, Jens Axboe wrote:
> On 11/15/18 12:38 PM, Guenter Roeck wrote:
>> On Thu, Nov 15, 2018@12:29:04PM -0700, Jens Axboe wrote:
>>> On 11/15/18 12:11 PM, Guenter Roeck wrote:
>>>> On Wed, Nov 14, 2018@10:12:44AM -0700, Jens Axboe wrote:
>>>>>
>>>>> I think the below patch should fix it.
>>>>>
>>>>
>>>> I spoke too early. sparc64, next-20181115:
>>>>
>>>> [ 14.204370] nvme nvme0: pci function 0000:02:00.0
>>>> [ 14.249956] nvme nvme0: Removing after probe failure status: -5
>>>> [ 14.263496] ------------[ cut here ]------------
>>>> [ 14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 __free_irq+0xa4/0x320
>>>> [ 14.264265] Trying to free already-free IRQ 9
>>>> [ 14.264519] Modules linked in:
>>>> [ 14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 4.20.0-rc2-next-20181115 #1
>>>> [ 14.265555] Workqueue: nvme-reset-wq nvme_reset_work
>>>> [ 14.265899] Call Trace:
>>>> [ 14.266118] [000000000046944c] __warn+0xcc/0x100
>>>> [ 14.266375] [00000000004694b0] warn_slowpath_fmt+0x30/0x40
>>>> [ 14.266635] [00000000004d4ce4] __free_irq+0xa4/0x320
>>>> [ 14.266867] [00000000004d4ff8] free_irq+0x38/0x80
>>>> [ 14.267092] [00000000007b1874] pci_free_irq+0x14/0x40
>>>> [ 14.267327] [00000000008a5444] nvme_dev_disable+0xe4/0x520
>>>> [ 14.267576] [00000000008a69b8] nvme_reset_work+0x138/0x1c60
>>>> [ 14.267827] [0000000000488dd0] process_one_work+0x230/0x6e0
>>>> [ 14.268079] [00000000004894f4] worker_thread+0x274/0x520
>>>> [ 14.268321] [0000000000490624] kthread+0xe4/0x120
>>>> [ 14.268544] [00000000004060c4] ret_from_fork+0x1c/0x2c
>>>> [ 14.268825] [0000000000000000] (null)
>>>> [ 14.269089] irq event stamp: 32796
>>>> [ 14.269350] hardirqs last enabled at (32795): [<0000000000b624a4>] _raw_spin_unlock_irqrestore+0x24/0x80
>>>> [ 14.269757] hardirqs last disabled at (32796): [<0000000000b622f4>] _raw_spin_lock_irqsave+0x14/0x60
>>>> [ 14.270566] softirqs last enabled at (32780): [<0000000000b64c18>] __do_softirq+0x238/0x520
>>>> [ 14.271206] softirqs last disabled at (32729): [<000000000042ceec>] do_softirq_own_stack+0x2c/0x40
>>>> [ 14.272288] ---[ end trace cb79ccd2a0a03f3c ]---
>>>>
>>>> Looks like an error during probe followed by an error cleanup problem.
>>>
>>> Did it previous probe fine? Or is the new thing just the fact that
>>> we spew a warning on trying to free a non-existing vector?
>>>
>> This works fine in mainline, if that is your question.
>
> Yeah, as soon as I sent the other email I realized that. Let me send
> you a quick patch.

How's this?


diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index ffbab5b01df4..fd73bfd2d1be 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2088,15 +2088,11 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int nr_io_queues)
 			affd.nr_sets = 1;
 
 		/*
-		 * Need IRQs for read+write queues, and one for the admin queue.
-		 * If we can't get more than one vector, we have to share the
-		 * admin queue and IO queue vector. For that case, don't add
-		 * an extra vector for the admin queue, or we'll continue
-		 * asking for 2 and get -ENOSPC in return.
+		 * If we got a failure and we're down to asking for just
+		 * 1 + 1 queues, just ask for a single vector. We'll share
+		 * that between the single IO queue and the admin queue.
 		 */
-		if (result == -ENOSPC && nr_io_queues == 1)
-			nr_io_queues = 1;
-		else
+		if (!(result < 0 && nr_io_queues == 1))
 			nr_io_queues = irq_sets[0] + irq_sets[1] + 1;
 
 		result = pci_alloc_irq_vectors_affinity(pdev, nr_io_queues,

-- 
Jens Axboe

^ permalink raw reply related	[flat|nested] 15+ messages in thread
* [PATCH] nvme: utilize two queue maps, one for reads and one for writes 2018-11-15 19:43 ` Jens Axboe @ 2018-11-15 22:06 ` Guenter Roeck 2018-11-15 22:12 ` Jens Axboe 0 siblings, 1 reply; 15+ messages in thread From: Guenter Roeck @ 2018-11-15 22:06 UTC (permalink / raw) On Thu, Nov 15, 2018@12:43:40PM -0700, Jens Axboe wrote: > On 11/15/18 12:40 PM, Jens Axboe wrote: > > On 11/15/18 12:38 PM, Guenter Roeck wrote: > >> On Thu, Nov 15, 2018@12:29:04PM -0700, Jens Axboe wrote: > >>> On 11/15/18 12:11 PM, Guenter Roeck wrote: > >>>> On Wed, Nov 14, 2018@10:12:44AM -0700, Jens Axboe wrote: > >>>>> > >>>>> I think the below patch should fix it. > >>>>> > >>>> > >>>> I spoke too early. sparc64, next-20181115: > >>>> > >>>> [ 14.204370] nvme nvme0: pci function 0000:02:00.0 > >>>> [ 14.249956] nvme nvme0: Removing after probe failure status: -5 > >>>> [ 14.263496] ------------[ cut here ]------------ > >>>> [ 14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 __free_irq+0xa4/0x320 > >>>> [ 14.264265] Trying to free already-free IRQ 9 > >>>> [ 14.264519] Modules linked in: > >>>> [ 14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 4.20.0-rc2-next-20181115 #1 > >>>> [ 14.265555] Workqueue: nvme-reset-wq nvme_reset_work > >>>> [ 14.265899] Call Trace: > >>>> [ 14.266118] [000000000046944c] __warn+0xcc/0x100 > >>>> [ 14.266375] [00000000004694b0] warn_slowpath_fmt+0x30/0x40 > >>>> [ 14.266635] [00000000004d4ce4] __free_irq+0xa4/0x320 > >>>> [ 14.266867] [00000000004d4ff8] free_irq+0x38/0x80 > >>>> [ 14.267092] [00000000007b1874] pci_free_irq+0x14/0x40 > >>>> [ 14.267327] [00000000008a5444] nvme_dev_disable+0xe4/0x520 > >>>> [ 14.267576] [00000000008a69b8] nvme_reset_work+0x138/0x1c60 > >>>> [ 14.267827] [0000000000488dd0] process_one_work+0x230/0x6e0 > >>>> [ 14.268079] [00000000004894f4] worker_thread+0x274/0x520 > >>>> [ 14.268321] [0000000000490624] kthread+0xe4/0x120 > >>>> [ 14.268544] [00000000004060c4] ret_from_fork+0x1c/0x2c > >>>> [ 14.268825] [0000000000000000] (null) > >>>> [ 14.269089] irq event stamp: 32796 > >>>> [ 14.269350] hardirqs last enabled at (32795): [<0000000000b624a4>] _raw_spin_unlock_irqrestore+0x24/0x80 > >>>> [ 14.269757] hardirqs last disabled at (32796): [<0000000000b622f4>] _raw_spin_lock_irqsave+0x14/0x60 > >>>> [ 14.270566] softirqs last enabled at (32780): [<0000000000b64c18>] __do_softirq+0x238/0x520 > >>>> [ 14.271206] softirqs last disabled at (32729): [<000000000042ceec>] do_softirq_own_stack+0x2c/0x40 > >>>> [ 14.272288] ---[ end trace cb79ccd2a0a03f3c ]--- > >>>> > >>>> Looks like an error during probe followed by an error cleanup problem. > >>> > >>> Did it previous probe fine? Or is the new thing just the fact that > >>> we spew a warning on trying to free a non-existing vector? > >>> > >> This works fine in mainline, if that is your question. > > > > Yeah, as soon as I sent the other email I realized that. Let me send > > you a quick patch. > > How's this? > > > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c > index ffbab5b01df4..fd73bfd2d1be 100644 > --- a/drivers/nvme/host/pci.c > +++ b/drivers/nvme/host/pci.c > @@ -2088,15 +2088,11 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int nr_io_queues) > affd.nr_sets = 1; > > /* > - * Need IRQs for read+write queues, and one for the admin queue. > - * If we can't get more than one vector, we have to share the > - * admin queue and IO queue vector. 
For that case, don't add > - * an extra vector for the admin queue, or we'll continue > - * asking for 2 and get -ENOSPC in return. > + * If we got a failure and we're down to asking for just > + * 1 + 1 queues, just ask for a single vector. We'll share > + * that between the single IO queue and the admin queue. > */ > - if (result == -ENOSPC && nr_io_queues == 1) > - nr_io_queues = 1; > - else > + if (!(result < 0 && nr_io_queues == 1)) > nr_io_queues = irq_sets[0] + irq_sets[1] + 1; > Unfortunately, the code doesn't even get here because the call of pci_alloc_irq_vectors_affinity in the first iteration fails with -EINVAL, which results in an immediate return with -EIO. Guenter > result = pci_alloc_irq_vectors_affinity(pdev, nr_io_queues, > > -- > Jens Axboe > ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] nvme: utilize two queue maps, one for reads and one for writes
  2018-11-15 22:06 ` Guenter Roeck
@ 2018-11-15 22:12   ` Jens Axboe
  0 siblings, 0 replies; 15+ messages in thread
From: Jens Axboe @ 2018-11-15 22:12 UTC (permalink / raw)

On 11/15/18 3:06 PM, Guenter Roeck wrote:
> On Thu, Nov 15, 2018@12:43:40PM -0700, Jens Axboe wrote:
>> On 11/15/18 12:40 PM, Jens Axboe wrote:
>>> On 11/15/18 12:38 PM, Guenter Roeck wrote:
>>>> On Thu, Nov 15, 2018@12:29:04PM -0700, Jens Axboe wrote:
>>>>> On 11/15/18 12:11 PM, Guenter Roeck wrote:
>>>>>> On Wed, Nov 14, 2018@10:12:44AM -0700, Jens Axboe wrote:
>>>>>>>
>>>>>>> I think the below patch should fix it.
>>>>>>>
>>>>>>
>>>>>> I spoke too early. sparc64, next-20181115:
>>>>>>
>>>>>> [ 14.204370] nvme nvme0: pci function 0000:02:00.0
>>>>>> [ 14.249956] nvme nvme0: Removing after probe failure status: -5
>>>>>> [ 14.263496] ------------[ cut here ]------------
>>>>>> [ 14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 __free_irq+0xa4/0x320
>>>>>> [ 14.264265] Trying to free already-free IRQ 9
>>>>>> [ 14.264519] Modules linked in:
>>>>>> [ 14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 4.20.0-rc2-next-20181115 #1
>>>>>> [ 14.265555] Workqueue: nvme-reset-wq nvme_reset_work
>>>>>> [ 14.265899] Call Trace:
>>>>>> [ 14.266118] [000000000046944c] __warn+0xcc/0x100
>>>>>> [ 14.266375] [00000000004694b0] warn_slowpath_fmt+0x30/0x40
>>>>>> [ 14.266635] [00000000004d4ce4] __free_irq+0xa4/0x320
>>>>>> [ 14.266867] [00000000004d4ff8] free_irq+0x38/0x80
>>>>>> [ 14.267092] [00000000007b1874] pci_free_irq+0x14/0x40
>>>>>> [ 14.267327] [00000000008a5444] nvme_dev_disable+0xe4/0x520
>>>>>> [ 14.267576] [00000000008a69b8] nvme_reset_work+0x138/0x1c60
>>>>>> [ 14.267827] [0000000000488dd0] process_one_work+0x230/0x6e0
>>>>>> [ 14.268079] [00000000004894f4] worker_thread+0x274/0x520
>>>>>> [ 14.268321] [0000000000490624] kthread+0xe4/0x120
>>>>>> [ 14.268544] [00000000004060c4] ret_from_fork+0x1c/0x2c
>>>>>> [ 14.268825] [0000000000000000] (null)
>>>>>> [ 14.269089] irq event stamp: 32796
>>>>>> [ 14.269350] hardirqs last enabled at (32795): [<0000000000b624a4>] _raw_spin_unlock_irqrestore+0x24/0x80
>>>>>> [ 14.269757] hardirqs last disabled at (32796): [<0000000000b622f4>] _raw_spin_lock_irqsave+0x14/0x60
>>>>>> [ 14.270566] softirqs last enabled at (32780): [<0000000000b64c18>] __do_softirq+0x238/0x520
>>>>>> [ 14.271206] softirqs last disabled at (32729): [<000000000042ceec>] do_softirq_own_stack+0x2c/0x40
>>>>>> [ 14.272288] ---[ end trace cb79ccd2a0a03f3c ]---
>>>>>>
>>>>>> Looks like an error during probe followed by an error cleanup problem.
>>>>>
>>>>> Did it previous probe fine? Or is the new thing just the fact that
>>>>> we spew a warning on trying to free a non-existing vector?
>>>>>
>>>> This works fine in mainline, if that is your question.
>>>
>>> Yeah, as soon as I sent the other email I realized that. Let me send
>>> you a quick patch.
>>
>> How's this?
>>
>>
>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>> index ffbab5b01df4..fd73bfd2d1be 100644
>> --- a/drivers/nvme/host/pci.c
>> +++ b/drivers/nvme/host/pci.c
>> @@ -2088,15 +2088,11 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int nr_io_queues)
>>  			affd.nr_sets = 1;
>>  
>>  		/*
>> -		 * Need IRQs for read+write queues, and one for the admin queue.
>> -		 * If we can't get more than one vector, we have to share the
>> -		 * admin queue and IO queue vector. For that case, don't add
>> -		 * an extra vector for the admin queue, or we'll continue
>> -		 * asking for 2 and get -ENOSPC in return.
>> +		 * If we got a failure and we're down to asking for just
>> +		 * 1 + 1 queues, just ask for a single vector. We'll share
>> +		 * that between the single IO queue and the admin queue.
>>  		 */
>> -		if (result == -ENOSPC && nr_io_queues == 1)
>> -			nr_io_queues = 1;
>> -		else
>> +		if (!(result < 0 && nr_io_queues == 1))
>>  			nr_io_queues = irq_sets[0] + irq_sets[1] + 1;
>>
>
> Unfortunately, the code doesn't even get here because the call of
> pci_alloc_irq_vectors_affinity in the first iteration fails with
> -EINVAL, which results in an immediate return with -EIO.

Oh yeah... How about this then?


diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index ffbab5b01df4..4d161daa9c3a 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2088,15 +2088,11 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int nr_io_queues)
 			affd.nr_sets = 1;
 
 		/*
-		 * Need IRQs for read+write queues, and one for the admin queue.
-		 * If we can't get more than one vector, we have to share the
-		 * admin queue and IO queue vector. For that case, don't add
-		 * an extra vector for the admin queue, or we'll continue
-		 * asking for 2 and get -ENOSPC in return.
+		 * If we got a failure and we're down to asking for just
+		 * 1 + 1 queues, just ask for a single vector. We'll share
+		 * that between the single IO queue and the admin queue.
 		 */
-		if (result == -ENOSPC && nr_io_queues == 1)
-			nr_io_queues = 1;
-		else
+		if (!(result < 0 && nr_io_queues == 1))
 			nr_io_queues = irq_sets[0] + irq_sets[1] + 1;
 
 		result = pci_alloc_irq_vectors_affinity(pdev, nr_io_queues,
@@ -2111,6 +2107,9 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int nr_io_queues)
 			if (!nr_io_queues)
 				return result;
 			continue;
+		} else if (result == -EINVAL) {
+			nr_io_queues = 1;
+			continue;
 		} else if (result <= 0)
 			return -EIO;
 		break;

-- 
Jens Axboe

^ permalink raw reply related	[flat|nested] 15+ messages in thread
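Taken together, the error handling at the bottom of the retry loop now distinguishes three cases. The annotated excerpt below restates the logic of the final diff; the comments are editorial and map each branch to the failure modes reported in this thread, while the allocation flags shown are an assumption (they are not visible in the quoted context lines).

		result = pci_alloc_irq_vectors_affinity(pdev, nr_io_queues,
				nr_io_queues,
				PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
		if (result == -ENOSPC) {
			/*
			 * Not enough MSI/MSI-X vectors for the request
			 * (e.g. qemu x86 with MSI turned off): drop one IO
			 * queue and retry with a smaller request.
			 */
			nr_io_queues--;
			if (!nr_io_queues)
				return result;
			continue;
		} else if (result == -EINVAL) {
			/*
			 * The affinity-set request was rejected outright
			 * (e.g. sparc64 with no MSI support at all): retry
			 * asking for a single vector, shared by the admin
			 * queue and the one remaining IO queue.
			 */
			nr_io_queues = 1;
			continue;
		} else if (result <= 0)
			return -EIO;	/* anything else is fatal for the probe */
		break;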
* [PATCH] nvme: utilize two queue maps, one for reads and one for writes [not found] <20181115191126.GA16973@roeck-us.net> 2018-11-15 19:29 ` Jens Axboe @ 2018-11-15 19:36 ` Guenter Roeck 2018-11-15 19:39 ` Jens Axboe 1 sibling, 1 reply; 15+ messages in thread From: Guenter Roeck @ 2018-11-15 19:36 UTC (permalink / raw) On Thu, Nov 15, 2018@11:11:26AM -0800, Guenter Roeck wrote: > On Wed, Nov 14, 2018@10:12:44AM -0700, Jens Axboe wrote: > > > > I think the below patch should fix it. > > > > I spoke too early. sparc64, next-20181115: > > [ 14.204370] nvme nvme0: pci function 0000:02:00.0 > [ 14.249956] nvme nvme0: Removing after probe failure status: -5 > [ 14.263496] ------------[ cut here ]------------ > [ 14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 __free_irq+0xa4/0x320 > [ 14.264265] Trying to free already-free IRQ 9 > [ 14.264519] Modules linked in: > [ 14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 4.20.0-rc2-next-20181115 #1 > [ 14.265555] Workqueue: nvme-reset-wq nvme_reset_work > [ 14.265899] Call Trace: > [ 14.266118] [000000000046944c] __warn+0xcc/0x100 > [ 14.266375] [00000000004694b0] warn_slowpath_fmt+0x30/0x40 > [ 14.266635] [00000000004d4ce4] __free_irq+0xa4/0x320 > [ 14.266867] [00000000004d4ff8] free_irq+0x38/0x80 > [ 14.267092] [00000000007b1874] pci_free_irq+0x14/0x40 > [ 14.267327] [00000000008a5444] nvme_dev_disable+0xe4/0x520 > [ 14.267576] [00000000008a69b8] nvme_reset_work+0x138/0x1c60 > [ 14.267827] [0000000000488dd0] process_one_work+0x230/0x6e0 > [ 14.268079] [00000000004894f4] worker_thread+0x274/0x520 > [ 14.268321] [0000000000490624] kthread+0xe4/0x120 > [ 14.268544] [00000000004060c4] ret_from_fork+0x1c/0x2c > [ 14.268825] [0000000000000000] (null) > [ 14.269089] irq event stamp: 32796 > [ 14.269350] hardirqs last enabled at (32795): [<0000000000b624a4>] _raw_spin_unlock_irqrestore+0x24/0x80 > [ 14.269757] hardirqs last disabled at (32796): [<0000000000b622f4>] _raw_spin_lock_irqsave+0x14/0x60 > [ 14.270566] softirqs last enabled at (32780): [<0000000000b64c18>] __do_softirq+0x238/0x520 > [ 14.271206] softirqs last disabled at (32729): [<000000000042ceec>] do_softirq_own_stack+0x2c/0x40 > [ 14.272288] ---[ end trace cb79ccd2a0a03f3c ]--- > > Looks like an error during probe followed by an error cleanup problem. > On sparc64, pci_alloc_irq_vectors_affinity() returns -EINVAL (possibly because the controller doesn't support MSI). [ 16.554753] nvme nvme0: pci function 0000:02:00.0 [ 16.622894] nvme 0000:02:00.0: pre alloc: nr_io_queues: 2 result: 0 [ 16.623814] nvme 0000:02:00.0: post alloc: nr_io_queues: 2 result: -22 [ 16.625047] nvme nvme0: Removing after probe failure status: -5 ... and, as result, allocating a single (legacy) interrupt isn't even tried. I didn't try to track down the cleanup failure. Guenter ^ permalink raw reply [flat|nested] 15+ messages in thread
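For context on why a single-vector request is the right fallback on such platforms: when a PCI host bridge offers no MSI/MSI-X at all, the PCI core can still hand out the legacy INTx line, but only when exactly one vector is acceptable and legacy interrupts are allowed. The snippet below is a minimal, hypothetical illustration of that call pattern (it is not the driver code, and the helper name is invented):

#include <linux/pci.h>

/*
 * Hypothetical helper, not actual driver code: request exactly one
 * vector and allow every interrupt type, so the PCI core can fall
 * back to the legacy INTx line when MSI/MSI-X is unavailable.
 */
static int nvme_fallback_to_single_vector(struct pci_dev *pdev)
{
	int ret;

	ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
	if (ret < 0)
		return ret;	/* not even a legacy interrupt is available */

	/* vector 0 is now shared by the admin queue and the single IO queue */
	return 0;
}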
* [PATCH] nvme: utilize two queue maps, one for reads and one for writes 2018-11-15 19:36 ` Guenter Roeck @ 2018-11-15 19:39 ` Jens Axboe 0 siblings, 0 replies; 15+ messages in thread From: Jens Axboe @ 2018-11-15 19:39 UTC (permalink / raw) On 11/15/18 12:36 PM, Guenter Roeck wrote: > On Thu, Nov 15, 2018@11:11:26AM -0800, Guenter Roeck wrote: >> On Wed, Nov 14, 2018@10:12:44AM -0700, Jens Axboe wrote: >>> >>> I think the below patch should fix it. >>> >> >> I spoke too early. sparc64, next-20181115: >> >> [ 14.204370] nvme nvme0: pci function 0000:02:00.0 >> [ 14.249956] nvme nvme0: Removing after probe failure status: -5 >> [ 14.263496] ------------[ cut here ]------------ >> [ 14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 __free_irq+0xa4/0x320 >> [ 14.264265] Trying to free already-free IRQ 9 >> [ 14.264519] Modules linked in: >> [ 14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 4.20.0-rc2-next-20181115 #1 >> [ 14.265555] Workqueue: nvme-reset-wq nvme_reset_work >> [ 14.265899] Call Trace: >> [ 14.266118] [000000000046944c] __warn+0xcc/0x100 >> [ 14.266375] [00000000004694b0] warn_slowpath_fmt+0x30/0x40 >> [ 14.266635] [00000000004d4ce4] __free_irq+0xa4/0x320 >> [ 14.266867] [00000000004d4ff8] free_irq+0x38/0x80 >> [ 14.267092] [00000000007b1874] pci_free_irq+0x14/0x40 >> [ 14.267327] [00000000008a5444] nvme_dev_disable+0xe4/0x520 >> [ 14.267576] [00000000008a69b8] nvme_reset_work+0x138/0x1c60 >> [ 14.267827] [0000000000488dd0] process_one_work+0x230/0x6e0 >> [ 14.268079] [00000000004894f4] worker_thread+0x274/0x520 >> [ 14.268321] [0000000000490624] kthread+0xe4/0x120 >> [ 14.268544] [00000000004060c4] ret_from_fork+0x1c/0x2c >> [ 14.268825] [0000000000000000] (null) >> [ 14.269089] irq event stamp: 32796 >> [ 14.269350] hardirqs last enabled at (32795): [<0000000000b624a4>] _raw_spin_unlock_irqrestore+0x24/0x80 >> [ 14.269757] hardirqs last disabled at (32796): [<0000000000b622f4>] _raw_spin_lock_irqsave+0x14/0x60 >> [ 14.270566] softirqs last enabled at (32780): [<0000000000b64c18>] __do_softirq+0x238/0x520 >> [ 14.271206] softirqs last disabled at (32729): [<000000000042ceec>] do_softirq_own_stack+0x2c/0x40 >> [ 14.272288] ---[ end trace cb79ccd2a0a03f3c ]--- >> >> Looks like an error during probe followed by an error cleanup problem. >> > On sparc64, pci_alloc_irq_vectors_affinity() returns -EINVAL (possibly > because the controller doesn't support MSI). > > [ 16.554753] nvme nvme0: pci function 0000:02:00.0 > [ 16.622894] nvme 0000:02:00.0: pre alloc: nr_io_queues: 2 result: 0 > [ 16.623814] nvme 0000:02:00.0: post alloc: nr_io_queues: 2 result: -22 > [ 16.625047] nvme nvme0: Removing after probe failure status: -5 > > ... and, as result, allocating a single (legacy) interrupt isn't even tried. > > I didn't try to track down the cleanup failure. OK, then this isn't a new failure in terms of whether the nvme device will work, it's just a cleanup issue. That's less severe than the previous hang :-) -- Jens Axboe ^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <20181115224634.GA13101@roeck-us.net>]
* [PATCH] nvme: utilize two queue maps, one for reads and one for writes [not found] <20181115224634.GA13101@roeck-us.net> @ 2018-11-15 23:03 ` Jens Axboe 0 siblings, 0 replies; 15+ messages in thread From: Jens Axboe @ 2018-11-15 23:03 UTC (permalink / raw) On 11/15/18 3:46 PM, Guenter Roeck wrote: > On Thu, Nov 15, 2018@03:12:48PM -0700, Jens Axboe wrote: >> On 11/15/18 3:06 PM, Guenter Roeck wrote: >>> On Thu, Nov 15, 2018@12:43:40PM -0700, Jens Axboe wrote: >>>> On 11/15/18 12:40 PM, Jens Axboe wrote: >>>>> On 11/15/18 12:38 PM, Guenter Roeck wrote: >>>>>> On Thu, Nov 15, 2018@12:29:04PM -0700, Jens Axboe wrote: >>>>>>> On 11/15/18 12:11 PM, Guenter Roeck wrote: >>>>>>>> On Wed, Nov 14, 2018@10:12:44AM -0700, Jens Axboe wrote: >>>>>>>>> >>>>>>>>> I think the below patch should fix it. >>>>>>>>> >>>>>>>> >>>>>>>> I spoke too early. sparc64, next-20181115: >>>>>>>> >>>>>>>> [ 14.204370] nvme nvme0: pci function 0000:02:00.0 >>>>>>>> [ 14.249956] nvme nvme0: Removing after probe failure status: -5 >>>>>>>> [ 14.263496] ------------[ cut here ]------------ >>>>>>>> [ 14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 __free_irq+0xa4/0x320 >>>>>>>> [ 14.264265] Trying to free already-free IRQ 9 >>>>>>>> [ 14.264519] Modules linked in: >>>>>>>> [ 14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 4.20.0-rc2-next-20181115 #1 >>>>>>>> [ 14.265555] Workqueue: nvme-reset-wq nvme_reset_work >>>>>>>> [ 14.265899] Call Trace: >>>>>>>> [ 14.266118] [000000000046944c] __warn+0xcc/0x100 >>>>>>>> [ 14.266375] [00000000004694b0] warn_slowpath_fmt+0x30/0x40 >>>>>>>> [ 14.266635] [00000000004d4ce4] __free_irq+0xa4/0x320 >>>>>>>> [ 14.266867] [00000000004d4ff8] free_irq+0x38/0x80 >>>>>>>> [ 14.267092] [00000000007b1874] pci_free_irq+0x14/0x40 >>>>>>>> [ 14.267327] [00000000008a5444] nvme_dev_disable+0xe4/0x520 >>>>>>>> [ 14.267576] [00000000008a69b8] nvme_reset_work+0x138/0x1c60 >>>>>>>> [ 14.267827] [0000000000488dd0] process_one_work+0x230/0x6e0 >>>>>>>> [ 14.268079] [00000000004894f4] worker_thread+0x274/0x520 >>>>>>>> [ 14.268321] [0000000000490624] kthread+0xe4/0x120 >>>>>>>> [ 14.268544] [00000000004060c4] ret_from_fork+0x1c/0x2c >>>>>>>> [ 14.268825] [0000000000000000] (null) >>>>>>>> [ 14.269089] irq event stamp: 32796 >>>>>>>> [ 14.269350] hardirqs last enabled at (32795): [<0000000000b624a4>] _raw_spin_unlock_irqrestore+0x24/0x80 >>>>>>>> [ 14.269757] hardirqs last disabled at (32796): [<0000000000b622f4>] _raw_spin_lock_irqsave+0x14/0x60 >>>>>>>> [ 14.270566] softirqs last enabled at (32780): [<0000000000b64c18>] __do_softirq+0x238/0x520 >>>>>>>> [ 14.271206] softirqs last disabled at (32729): [<000000000042ceec>] do_softirq_own_stack+0x2c/0x40 >>>>>>>> [ 14.272288] ---[ end trace cb79ccd2a0a03f3c ]--- >>>>>>>> >>>>>>>> Looks like an error during probe followed by an error cleanup problem. >>>>>>> >>>>>>> Did it previous probe fine? Or is the new thing just the fact that >>>>>>> we spew a warning on trying to free a non-existing vector? >>>>>>> >>>>>> This works fine in mainline, if that is your question. >>>>> >>>>> Yeah, as soon as I sent the other email I realized that. Let me send >>>>> you a quick patch. >>>> >>>> How's this? 
>>>> >>>> >>>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c >>>> index ffbab5b01df4..fd73bfd2d1be 100644 >>>> --- a/drivers/nvme/host/pci.c >>>> +++ b/drivers/nvme/host/pci.c >>>> @@ -2088,15 +2088,11 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int nr_io_queues) >>>> affd.nr_sets = 1; >>>> >>>> /* >>>> - * Need IRQs for read+write queues, and one for the admin queue. >>>> - * If we can't get more than one vector, we have to share the >>>> - * admin queue and IO queue vector. For that case, don't add >>>> - * an extra vector for the admin queue, or we'll continue >>>> - * asking for 2 and get -ENOSPC in return. >>>> + * If we got a failure and we're down to asking for just >>>> + * 1 + 1 queues, just ask for a single vector. We'll share >>>> + * that between the single IO queue and the admin queue. >>>> */ >>>> - if (result == -ENOSPC && nr_io_queues == 1) >>>> - nr_io_queues = 1; >>>> - else >>>> + if (!(result < 0 && nr_io_queues == 1)) >>>> nr_io_queues = irq_sets[0] + irq_sets[1] + 1; >>>> >>> >>> Unfortunately, the code doesn't even get here because the call of >>> pci_alloc_irq_vectors_affinity in the first iteration fails with >>> -EINVAL, which results in an immediate return with -EIO. >> >> Oh yeah... How about this then? >> > Yes, this one works (at least on sparc64). Do I need to test > on other architectures as well ? Should be fine, hopefully... Thanks for testing! >> @@ -2111,6 +2107,9 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int nr_io_queues) >> if (!nr_io_queues) >> return result; >> continue; >> + } else if (result == -EINVAL) { > > Add an explanation, maybe ? Yeah, I'll add a proper comment, this was just for testing. -- Jens Axboe ^ permalink raw reply [flat|nested] 15+ messages in thread
Thread overview: 15+ messages
[not found] <20181114004148.GA29545@roeck-us.net>
2018-11-14 0:51 ` [PATCH] nvme: utilize two queue maps, one for reads and one for writes Jens Axboe
2018-11-14 1:28 ` Mike Snitzer
2018-11-14 1:36 ` Mike Snitzer
2018-11-14 4:52 ` [PATCH] " Guenter Roeck
2018-11-14 17:12 ` Jens Axboe
[not found] <20181115182833.GA15729@roeck-us.net>
2018-11-15 18:38 ` Jens Axboe
[not found] <20181115191126.GA16973@roeck-us.net>
2018-11-15 19:29 ` Jens Axboe
2018-11-15 19:38 ` Guenter Roeck
2018-11-15 19:40 ` Jens Axboe
2018-11-15 19:43 ` Jens Axboe
2018-11-15 22:06 ` Guenter Roeck
2018-11-15 22:12 ` Jens Axboe
2018-11-15 19:36 ` Guenter Roeck
2018-11-15 19:39 ` Jens Axboe
[not found] <20181115224634.GA13101@roeck-us.net>
2018-11-15 23:03 ` Jens Axboe