From: Keith Busch <keith.busch@intel.com>
To: Linux NVMe <linux-nvme@lists.infradead.org>,
Linux Block <linux-block@vger.kernel.org>
Cc: Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>,
Jianchao Wang <jianchao.w.wang@oracle.com>,
Ming Lei <ming.lei@redhat.com>, Jens Axboe <axboe@kernel.dk>,
Keith Busch <keith.busch@intel.com>
Subject: [PATCH 3/3] nvme-pci: Separate IO and admin queue IRQ vectors
Date: Tue, 27 Mar 2018 09:39:08 -0600 [thread overview]
Message-ID: <20180327153908.3732-3-keith.busch@intel.com> (raw)
In-Reply-To: <20180327153908.3732-1-keith.busch@intel.com>
The admin and first IO queues shared the first irq vector, which has an
affinity mask including cpu0. If a system allows cpu0 to be offlined,
the admin queue may not be usable if no other CPUs in the affinity mask
are online. This is a problem since unlike IO queues, there is only
one admin queue that always needs to be usable.
To fix, this patch allocates one pre_vector for the admin queue that
is assigned all CPUs, so will always be accessible. The IO queues are
assigned the remaining managed vectors.
In case a controller has only one interrupt vector available, the admin
and IO queues will share the pre_vector with all CPUs assigned.
Cc: Jianchao Wang <jianchao.w.wang@oracle.com>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
---
v1 -> v2:
Update to use new blk-mq API.
Removed unnecessary braces, inline functions, and temp variables.
Amended author (this has evolved significantly from the original).
drivers/nvme/host/pci.c | 23 +++++++++++++++++------
1 file changed, 17 insertions(+), 6 deletions(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index cc47fbe32ea5..50c8eaf51d92 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -84,6 +84,7 @@ struct nvme_dev {
struct dma_pool *prp_small_pool;
unsigned online_queues;
unsigned max_qid;
+ unsigned int num_vecs;
int q_depth;
u32 db_stride;
void __iomem *bar;
@@ -414,7 +415,8 @@ static int nvme_pci_map_queues(struct blk_mq_tag_set *set)
{
struct nvme_dev *dev = set->driver_data;
- return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev), 0);
+ return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev),
+ dev->num_vecs > 1);
}
/**
@@ -1455,7 +1457,11 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
nvmeq->sq_cmds_io = dev->cmb + offset;
}
- nvmeq->cq_vector = qid - 1;
+ /*
+ * A queue's vector matches the queue identifier unless the controller
+ * has only one vector available.
+ */
+ nvmeq->cq_vector = dev->num_vecs == 1 ? 0 : qid;
result = adapter_alloc_cq(dev, qid, nvmeq);
if (result < 0)
goto release_vector;
@@ -1909,6 +1915,10 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
int result, nr_io_queues;
unsigned long size;
+ struct irq_affinity affd = {
+ .pre_vectors = 1
+ };
+
nr_io_queues = num_present_cpus();
result = nvme_set_queue_count(&dev->ctrl, &nr_io_queues);
if (result < 0)
@@ -1944,11 +1954,12 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
* setting up the full range we need.
*/
pci_free_irq_vectors(pdev);
- nr_io_queues = pci_alloc_irq_vectors(pdev, 1, nr_io_queues,
- PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY);
- if (nr_io_queues <= 0)
+ result = pci_alloc_irq_vectors_affinity(pdev, 1, nr_io_queues + 1,
+ PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
+ if (result <= 0)
return -EIO;
- dev->max_qid = nr_io_queues;
+ dev->num_vecs = result;
+ dev->max_qid = max(result - 1, 1);
/*
* Should investigate if there's a performance win from allocating
--
2.14.3
WARNING: multiple messages have this Message-ID (diff)
From: keith.busch@intel.com (Keith Busch)
Subject: [PATCH 3/3] nvme-pci: Separate IO and admin queue IRQ vectors
Date: Tue, 27 Mar 2018 09:39:08 -0600 [thread overview]
Message-ID: <20180327153908.3732-3-keith.busch@intel.com> (raw)
In-Reply-To: <20180327153908.3732-1-keith.busch@intel.com>
The admin and first IO queues shared the first irq vector, which has an
affinity mask including cpu0. If a system allows cpu0 to be offlined,
the admin queue may not be usable if no other CPUs in the affinity mask
are online. This is a problem since unlike IO queues, there is only
one admin queue that always needs to be usable.
To fix, this patch allocates one pre_vector for the admin queue that
is assigned all CPUs, so will always be accessible. The IO queues are
assigned the remaining managed vectors.
In case a controller has only one interrupt vector available, the admin
and IO queues will share the pre_vector with all CPUs assigned.
Cc: Jianchao Wang <jianchao.w.wang at oracle.com>
Cc: Ming Lei <ming.lei at redhat.com>
Signed-off-by: Keith Busch <keith.busch at intel.com>
---
v1 -> v2:
Update to use new blk-mq API.
Removed unnecessary braces, inline functions, and temp variables.
Amended author (this has evolved significantly from the original).
drivers/nvme/host/pci.c | 23 +++++++++++++++++------
1 file changed, 17 insertions(+), 6 deletions(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index cc47fbe32ea5..50c8eaf51d92 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -84,6 +84,7 @@ struct nvme_dev {
struct dma_pool *prp_small_pool;
unsigned online_queues;
unsigned max_qid;
+ unsigned int num_vecs;
int q_depth;
u32 db_stride;
void __iomem *bar;
@@ -414,7 +415,8 @@ static int nvme_pci_map_queues(struct blk_mq_tag_set *set)
{
struct nvme_dev *dev = set->driver_data;
- return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev), 0);
+ return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev),
+ dev->num_vecs > 1);
}
/**
@@ -1455,7 +1457,11 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
nvmeq->sq_cmds_io = dev->cmb + offset;
}
- nvmeq->cq_vector = qid - 1;
+ /*
+ * A queue's vector matches the queue identifier unless the controller
+ * has only one vector available.
+ */
+ nvmeq->cq_vector = dev->num_vecs == 1 ? 0 : qid;
result = adapter_alloc_cq(dev, qid, nvmeq);
if (result < 0)
goto release_vector;
@@ -1909,6 +1915,10 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
int result, nr_io_queues;
unsigned long size;
+ struct irq_affinity affd = {
+ .pre_vectors = 1
+ };
+
nr_io_queues = num_present_cpus();
result = nvme_set_queue_count(&dev->ctrl, &nr_io_queues);
if (result < 0)
@@ -1944,11 +1954,12 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
* setting up the full range we need.
*/
pci_free_irq_vectors(pdev);
- nr_io_queues = pci_alloc_irq_vectors(pdev, 1, nr_io_queues,
- PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY);
- if (nr_io_queues <= 0)
+ result = pci_alloc_irq_vectors_affinity(pdev, 1, nr_io_queues + 1,
+ PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
+ if (result <= 0)
return -EIO;
- dev->max_qid = nr_io_queues;
+ dev->num_vecs = result;
+ dev->max_qid = max(result - 1, 1);
/*
* Should investigate if there's a performance win from allocating
--
2.14.3
next prev parent reply other threads:[~2018-03-27 15:39 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-03-27 15:39 [PATCH 1/3] blk-mq: Allow PCI vector offset for mapping queues Keith Busch
2018-03-27 15:39 ` Keith Busch
2018-03-27 15:39 ` [PATCH 2/3] nvme-pci: Remove unused queue parameter Keith Busch
2018-03-27 15:39 ` Keith Busch
2018-03-28 1:27 ` Ming Lei
2018-03-28 1:27 ` Ming Lei
2018-03-27 15:39 ` Keith Busch [this message]
2018-03-27 15:39 ` [PATCH 3/3] nvme-pci: Separate IO and admin queue IRQ vectors Keith Busch
2018-03-28 2:08 ` Ming Lei
2018-03-28 2:08 ` Ming Lei
2018-03-28 7:32 ` Christoph Hellwig
2018-03-28 7:32 ` Christoph Hellwig
2018-03-28 20:38 ` Keith Busch
2018-03-28 20:38 ` Keith Busch
2018-03-28 1:26 ` [PATCH 1/3] blk-mq: Allow PCI vector offset for mapping queues Ming Lei
2018-03-28 1:26 ` Ming Lei
2018-03-28 3:24 ` Jens Axboe
2018-03-28 3:24 ` Jens Axboe
2018-03-28 14:48 ` Don Brace
2018-03-28 14:48 ` Don Brace
-- strict thread matches above, loose matches on Subject: below --
2018-03-23 22:19 Keith Busch
2018-03-23 22:19 ` [PATCH 3/3] nvme-pci: Separate IO and admin queue IRQ vectors Keith Busch
2018-03-23 22:19 ` Keith Busch
2018-03-27 14:20 ` Christoph Hellwig
2018-03-27 14:20 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180327153908.3732-3-keith.busch@intel.com \
--to=keith.busch@intel.com \
--cc=axboe@kernel.dk \
--cc=hch@lst.de \
--cc=jianchao.w.wang@oracle.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=ming.lei@redhat.com \
--cc=sagi@grimberg.me \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.