From: Guenter Roeck <linux@roeck-us.net>
To: Jens Axboe <axboe@kernel.dk>
Cc: Keith Busch <keith.busch@intel.com>,
Sagi Grimberg <sagi@grimberg.me>,
linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] nvme: utilize two queue maps, one for reads and one for writes
Date: Thu, 15 Nov 2018 10:28:33 -0800 [thread overview]
Message-ID: <20181115182833.GA15729@roeck-us.net> (raw)
Hi Jens,
On Wed, Nov 14, 2018 at 10:12:44AM -0700, Jens Axboe wrote:
> On 11/13/18 9:52 PM, Guenter Roeck wrote:
> > On Tue, Nov 13, 2018 at 05:51:08PM -0700, Jens Axboe wrote:
> >> On 11/13/18 5:41 PM, Guenter Roeck wrote:
> >>> Hi,
> >>>
> >>> On Wed, Oct 31, 2018 at 08:36:31AM -0600, Jens Axboe wrote:
> >>>> NVMe does round-robin between queues by default, which means that
> >>>> sharing a queue map for both reads and writes can be problematic
> >>>> in terms of read servicing. It's much easier to flood the queue
> >>>> with writes and reduce the read servicing.
> >>>>
> >>>> Implement two queue maps, one for reads and one for writes. The
> >>>> write queue count is configurable through the 'write_queues'
> >>>> parameter.
> >>>>
> >>>> By default, we retain the previous behavior of having a single
> >>>> queue set, shared between reads and writes. Setting 'write_queues'
> >>>> to a non-zero value will create two queue sets, one for reads and
> >>>> one for writes, the latter using the configurable number of
> >>>> queues (hardware queue counts permitting).
> >>>>
> >>>> Reviewed-by: Hannes Reinecke <hare@suse.com>
> >>>> Reviewed-by: Keith Busch <keith.busch@intel.com>
> >>>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> >>>
> >>> This patch causes hangs when running recent versions of
> >>> -next with several architectures; see the -next column at
> >>> kerneltests.org/builders for details. Bisect log below; this
> >>> was run with qemu on alpha. Reverting this patch as well as
> >>> "nvme: add separate poll queue map" fixes the problem.
> >>
> >> I don't see anything related to what hung, the trace, and so on.
> >> Can you clue me in? Where are the test results with dmesg?
> >>
> > alpha just stalls during boot. parisc reports a hung task
> > in nvme_reset_work. sparc64 reports EIO when instantiating
> > the nvme driver, called from nvme_reset_work, and then stalls.
> > In all three cases, reverting the two mentioned patches fixes
> > the problem.
>
> I think the below patch should fix it.
>
Sorry I wasn't able to test this earlier. Looks like it does
fix the problem; the problem is no longer seen in next-20181115.
Minor comment below.
Guenter
> > https://kerneltests.org/builders/qemu-parisc-next/builds/173/steps/qemubuildcommand_1/logs/stdio
> >
> > is an example log for parisc.
> >
> > I didn't check if the other boot failures (ppc looks bad)
> > have the same root cause.
> >
> >> How to reproduce?
> >>
> > parisc:
> >
> > qemu-system-hppa -kernel vmlinux -no-reboot \
> > -snapshot -device nvme,serial=foo,drive=d0 \
> > -drive file=rootfs.ext2,if=none,format=raw,id=d0 \
> > -append 'root=/dev/nvme0n1 rw rootwait panic=-1 console=ttyS0,115200 ' \
> > -nographic -monitor null
> >
> > alpha:
> >
> > qemu-system-alpha -M clipper -kernel arch/alpha/boot/vmlinux -no-reboot \
> > -snapshot -device nvme,serial=foo,drive=d0 \
> > -drive file=rootfs.ext2,if=none,format=raw,id=d0 \
> > -append 'root=/dev/nvme0n1 rw rootwait panic=-1 console=ttyS0' \
> > -m 128M -nographic -monitor null -serial stdio
> >
> > sparc64:
> >
> > qemu-system-sparc64 -M sun4u -cpu 'TI UltraSparc IIi' -m 512 \
> > -snapshot -device nvme,serial=foo,drive=d0,bus=pciB \
> > -drive file=rootfs.ext2,if=none,format=raw,id=d0 \
> > -kernel arch/sparc/boot/image -no-reboot \
> > -append 'root=/dev/nvme0n1 rw rootwait panic=-1 console=ttyS0' \
> > -nographic -monitor none
> >
> > The root file systems are available from the respective subdirectories
> > of:
> >
> > https://github.com/groeck/linux-build-test/tree/master/rootfs
>
> This is useful, thanks! I haven't tried it yet, but I was able to
> reproduce on x86 with MSI turned off.
>
>
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 8df868afa363..6c03461ad988 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2098,7 +2098,7 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int nr_io_queues)
> .nr_sets = ARRAY_SIZE(irq_sets),
> .sets = irq_sets,
> };
> - int result;
> + int result = 0;
>
> /*
> * For irq sets, we have to ask for minvec == maxvec. This passes
> @@ -2113,9 +2113,16 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int nr_io_queues)
> affd.nr_sets = 1;
>
> /*
> - * Need IRQs for read+write queues, and one for the admin queue
> + * Need IRQs for read+write queues, and one for the admin queue.
> + * If we can't get more than one vector, we have to share the
> + * admin queue and IO queue vector. For that case, don't add
> + * an extra vector for the admin queue, or we'll continue
> + * asking for 2 and get -ENOSPC in return.
> */
> - nr_io_queues = irq_sets[0] + irq_sets[1] + 1;
> + if (result == -ENOSPC && nr_io_queues == 1)
> + nr_io_queues = 1;
Setting nr_io_queues to 1 when it already is set to 1 doesn't really do
anything. Is this for clarification ?
> + else
> + nr_io_queues = irq_sets[0] + irq_sets[1] + 1;
>
> result = pci_alloc_irq_vectors_affinity(pdev, nr_io_queues,
> nr_io_queues,
>
> --
> Jens Axboe
>
next reply other threads:[~2018-11-15 18:28 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-15 18:28 Guenter Roeck [this message]
2018-11-15 18:38 ` [PATCH] nvme: utilize two queue maps, one for reads and one for writes Jens Axboe
2018-11-15 18:38 ` Jens Axboe
-- strict thread matches above, loose matches on Subject: below --
2018-11-15 22:46 Guenter Roeck
2018-11-15 23:03 ` Jens Axboe
2018-11-15 23:03 ` Jens Axboe
2018-11-15 19:11 Guenter Roeck
2018-11-15 19:29 ` Jens Axboe
2018-11-15 19:29 ` Jens Axboe
2018-11-15 19:38 ` Guenter Roeck
2018-11-15 19:38 ` Guenter Roeck
2018-11-15 19:40 ` Jens Axboe
2018-11-15 19:40 ` Jens Axboe
2018-11-15 19:43 ` Jens Axboe
2018-11-15 19:43 ` Jens Axboe
2018-11-15 22:06 ` Guenter Roeck
2018-11-15 22:06 ` Guenter Roeck
2018-11-15 22:12 ` Jens Axboe
2018-11-15 22:12 ` Jens Axboe
2018-11-15 19:36 ` Guenter Roeck
2018-11-15 19:36 ` Guenter Roeck
2018-11-15 19:39 ` Jens Axboe
2018-11-15 19:39 ` Jens Axboe
2018-11-14 0:41 Guenter Roeck
2018-11-14 0:51 ` Jens Axboe
2018-11-14 0:51 ` Jens Axboe
2018-11-14 4:52 ` Guenter Roeck
2018-11-14 4:52 ` Guenter Roeck
2018-11-14 17:12 ` Jens Axboe
2018-11-14 17:12 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181115182833.GA15729@roeck-us.net \
--to=linux@roeck-us.net \
--cc=axboe@kernel.dk \
--cc=keith.busch@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=sagi@grimberg.me \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.