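From: Keith Busch <keith.busch@intel.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Ming Lei <ming.lei@redhat.com>,
	linux-nvme@lists.infradead.org, Christoph Hellwig <hch@lst.de>,
	Jens Axboe <axboe@fb.com>,
	linux-pci@vger.kernel.org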
Subject: [PATCH V2 1/3] PCI/MSI: preference to returning -ENOSPC from pci_alloc_irq_vectors_affinity
Date: Wed, 2 Jan 2019 15:46:39 -0700
Message-ID: <20190102224637.GA9795@localhost.localdomain>
In-Reply-To: <20190102210202.GC126384@google.com>

On Wed, Jan 02, 2019 at 03:02:02PM -0600, Bjorn Helgaas wrote:
> Keith said:
> > The min/max vecs doesn't work correctly when using the irq_affinity
> > nr_sets because rebalancing the set counts is driver specific. To
> > get around that, drivers using nr_sets have to set min and max to
> > the same value and handle the "reduce and try again".
> 
> Sorry I saw that, but didn't follow it at first.  After a little
> archaeology, I see that 6da4b3ab9a6e ("genirq/affinity: Add support
> for allocating interrupt sets") added nr_sets and some validation
> tests (if affd.nr_sets, min_vecs == max_vecs) for using it in the API.
> 
> That's sort of a wart on the API, but I don't know if we should live
> with it or try to clean it up somehow.

Yeah, that interface is a bit awkward. I was thinking it would be nice to
thread a driver callback through PCI so the driver can redistribute the
sets as needed, and let PCI handle the retries as before. I am testing
with the following, and it seems to work, but I'm getting some unexpected
warnings from blk-mq when I have nvme use it. Still investigating that,
but just throwing this out for early feedback.

---
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 7a1c8a09efa5..e33abb167c19 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1035,13 +1035,6 @@ static int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
 	if (maxvec < minvec)
 		return -ERANGE;
 
-	/*
-	 * If the caller is passing in sets, we can't support a range of
-	 * vectors. The caller needs to handle that.
-	 */
-	if (affd && affd->nr_sets && minvec != maxvec)
-		return -EINVAL;
-
 	if (WARN_ON_ONCE(dev->msi_enabled))
 		return -EINVAL;
 
@@ -1093,13 +1086,6 @@ static int __pci_enable_msix_range(struct pci_dev *dev,
 	if (maxvec < minvec)
 		return -ERANGE;
 
-	/*
-	 * If the caller is passing in sets, we can't support a range of
-	 * supported vectors. The caller needs to handle that.
-	 */
-	if (affd && affd->nr_sets && minvec != maxvec)
-		return -EINVAL;
-
 	if (WARN_ON_ONCE(dev->msix_enabled))
 		return -EINVAL;
 
@@ -1110,6 +1096,9 @@ static int __pci_enable_msix_range(struct pci_dev *dev,
 				return -ENOSPC;
 		}
 
+		if (nvec != maxvec && affd && affd->recalc_sets)
+			affd->recalc_sets(affd, nvec);
+
 		rc = __pci_enable_msix(dev, entries, nvec, affd);
 		if (rc == 0)
 			return nvec;
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index c672f34235e7..326c9bd05f62 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -249,12 +249,16 @@ struct irq_affinity_notify {
  *			the MSI(-X) vector space
  * @nr_sets:		Length of passed in *sets array
  * @sets:		Number of affinitized sets
+ * @recalc_sets:	Recalculate sets when requested allocation failed
+ * @priv:		Driver private data
  */
 struct irq_affinity {
 	int	pre_vectors;
 	int	post_vectors;
 	int	nr_sets;
 	int	*sets;
+	void	(*recalc_sets)(struct irq_affinity *, unsigned int);
+	void	*priv;
 };
 
 /**
--
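
To make the callback idea concrete, here is a minimal userspace sketch of
how a driver might use it. It is not kernel code and not what nvme does:
struct demo_queues, demo_recalc_sets() and demo_enable_range() are
hypothetical names, the two-set split (default queues plus read queues) is
an assumed policy, and hw_limit merely stands in for the vector count that
__pci_enable_msix() would report back. Only the two new struct fields and
the "nvec != maxvec" check are taken from the patch above.

```c
#include <assert.h>

/* Userspace stand-in for the kernel struct, including the two new fields. */
struct irq_affinity {
	int	pre_vectors;
	int	post_vectors;
	int	nr_sets;
	int	*sets;
	void	(*recalc_sets)(struct irq_affinity *, unsigned int);
	void	*priv;
};

/* Hypothetical driver state, reached through affd->priv. */
struct demo_queues {
	unsigned int want_read;	/* read queues the driver would like */
};

/*
 * Hypothetical driver callback: split the vectors left after the
 * pre/post vectors into sets[0] (default) and sets[1] (read).
 */
static void demo_recalc_sets(struct irq_affinity *affd, unsigned int nvec)
{
	struct demo_queues *q = affd->priv;
	unsigned int reserved = affd->pre_vectors + affd->post_vectors;
	unsigned int avail = nvec > reserved ? nvec - reserved : 0;
	unsigned int nr_read = q->want_read;

	assert(affd->nr_sets == 2);

	/* Keep at least one vector for the default set when possible. */
	if (avail <= 1)
		nr_read = 0;
	else if (nr_read > avail - 1)
		nr_read = avail - 1;

	affd->sets[0] = avail - nr_read;
	affd->sets[1] = nr_read;
}

/*
 * Simplified model of the retry loop in __pci_enable_msix_range().
 * Returns the vector count, or -1 where the kernel returns -ENOSPC.
 */
static int demo_enable_range(struct irq_affinity *affd, int minvec,
			     int maxvec, int hw_limit)
{
	int nvec = maxvec;

	for (;;) {
		/* Same condition as the hunk added to msi.c. */
		if (nvec != maxvec && affd && affd->recalc_sets)
			affd->recalc_sets(affd, nvec);

		if (nvec <= hw_limit)	/* models rc == 0 */
			return nvec;

		nvec = hw_limit;	/* models rc > 0: retry with rc */
		if (nvec < minvec)
			return -1;
	}
}
```

In this model the callback is skipped on the first pass, so the driver has
to fill in sets[] for maxvec before the call. For example, with
pre_vectors = 1, want_read = 4 and sets = {4, 4} for maxvec = 9, a device
that only grants 6 vectors ends up with sets = {1, 4}.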

