From: Keith Busch <keith.busch@intel.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <helgaas@kernel.org>, Jens Axboe <axboe@kernel.dk>,
	Sagi Grimberg <sagi@grimberg.me>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nvme@lists.infradead.org, Ming Lei <ming.lei@redhat.com>,
	linux-block@vger.kernel.org, Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH V3 1/5] genirq/affinity: don't mark 'affd' as const
Date: Wed, 13 Feb 2019 15:37:11 -0700
Message-ID: <20190213223711.GC8027@localhost.localdomain>
In-Reply-To: <alpine.DEB.2.21.1902132232560.1659@nanos.tec.linutronix.de>

On Wed, Feb 13, 2019 at 10:41:55PM +0100, Thomas Gleixner wrote:
> Btw, while I have your attention. There popped up an issue recently related
> to that affinity logic.
> 
> The current implementation fails when:
>
> 	/*
> 	 * If there aren't any vectors left after applying the pre/post
> 	 * vectors don't bother with assigning affinity.
> 	 */
> 	if (nvecs == affd->pre_vectors + affd->post_vectors)
> 		return NULL;
> 
> Now the discussion arose that in that case the affinity sets are not
> allocated and filled in for the pre/post vectors, but somehow the
> underlying device still works and later on triggers the warning in the
> blk-mq code because the MSI entries do not have affinity information
> attached.
>
> Sure, we could make that work, but there are several issues:
> 
>     1) irq_create_affinity_masks() has another reason to return NULL:
>        memory allocation fails.
> 
>     2) Does it make sense at all?
> 
> Right now the PCI allocator ignores the NULL return and proceeds without
> setting any affinities. As a consequence nothing is managed and everything
> happens to work.
> 
> But that it happens to work is more by chance than by design, and the warning
> is bogus if this is an expected mode of operation.
> 
> We should address these points in some way.

Ah, yes, that's a mistake in the nvme driver. It is assuming IO queues are
always on managed interrupts, but that's not true when only 1 vector
could be allocated. This should be an appropriate fix to the warning:

---
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 022ea1ee63f8..f2ccebe1c926 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -506,7 +506,7 @@ static int nvme_pci_map_queues(struct blk_mq_tag_set *set)
 		 * affinity), so use the regular blk-mq cpu mapping
 		 */
 		map->queue_offset = qoff;
-		if (i != HCTX_TYPE_POLL)
+		if (i != HCTX_TYPE_POLL && dev->num_vecs > 1)
 			blk_mq_pci_map_queues(map, to_pci_dev(dev->dev), offset);
 		else
 			blk_mq_map_queues(map);
--
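
For anyone following along, here is a rough standalone sketch of the two
conditions involved. It is not kernel code: the enum and function names are
made up for illustration, and it assumes one pre-vector is set aside (for the
admin queue), so that a single allocated vector leaves nothing to spread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not the kernel's own definitions. */
enum queue_type { QUEUE_DEFAULT, QUEUE_READ, QUEUE_POLL };
enum mapping { MAP_PCI_AFFINITY, MAP_GENERIC };

/*
 * Models the early return quoted above: masks are only built when
 * vectors remain after the pre/post vectors are set aside.
 */
static bool affinity_masks_created(unsigned int nvecs, unsigned int pre,
				   unsigned int post)
{
	return nvecs != pre + post;
}

/*
 * Models the patched condition: poll queues and the single-vector
 * case both fall back to the generic CPU mapping.
 */
static enum mapping pick_mapping(enum queue_type type, unsigned int num_vecs)
{
	if (type != QUEUE_POLL && num_vecs > 1)
		return MAP_PCI_AFFINITY;
	return MAP_GENERIC;
}

/* Small self-check covering the cases discussed in this thread. */
static void run_examples(void)
{
	/* One vector, one pre-vector: no masks, so use the generic map. */
	assert(!affinity_masks_created(1, 1, 0));
	assert(pick_mapping(QUEUE_DEFAULT, 1) == MAP_GENERIC);

	/* Several vectors: masks exist, so the affinity map is safe. */
	assert(affinity_masks_created(8, 1, 0));
	assert(pick_mapping(QUEUE_DEFAULT, 8) == MAP_PCI_AFFINITY);
}
```

The point of the sketch is only that the num_vecs > 1 test lines up with the
case where the masks are not created, so the PCI affinity mapping is never
attempted without affinity information to read.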
