public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH V3 00/10] vfio/pci: Support dynamic allocation of MSI-X interrupts
@ 2023-04-18 17:29 Reinette Chatre
  2023-04-18 17:29 ` [PATCH V3 01/10] vfio/pci: Consolidate irq cleanup on MSI/MSI-X disable Reinette Chatre
                   ` (9 more replies)
  0 siblings, 10 replies; 23+ messages in thread
From: Reinette Chatre @ 2023-04-18 17:29 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian,
	alex.williamson
  Cc: tglx, darwi, kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu,
	tom.zanussi, reinette.chatre, linux-kernel

Changes since V2:
- V2: https://lore.kernel.org/lkml/cover.1680038771.git.reinette.chatre@intel.com/
- During testing of V2 "kernel test robot" reported issues resulting from
  include/linux/pci.h missing a stub for pci_msix_can_alloc_dyn() when
  CONFIG_PCI_MSI=n. A separate fix was sent to address this. The fix can
  be found in the kernel (since v6.3-rc7) as
  commit 195d8e5da3ac ("PCI/MSI: Provide missing stub for pci_msix_can_alloc_dyn()")
- Biggest change is the transition to "active contexts" for both MSI and MSI-X.
  Interrupt contexts have always been allocated when the interrupts are
  allocated while they are only used while interrupts are
  enabled. In this series interrupt contexts are made dynamic, while doing
  so their allocation is moved to match how they are used: allocated when
  interrupts are enabled. Whether a Linux interrupt number exists determines
  whether an interrupt can be enabled.
  Previous policy (up to V2) that an allocated interrupt has an interrupt
  context no longer applies. Instead, an interrupt context has a
  handler/trigger, aka "active contexts". (Alex)
- Re-ordered patches in support of "active contexts".
- Only free interrupts on MSI-X teardown and otherwise use the
  allocated interrupts as a cache. (Alex)
- Using unsigned int for the vector broke the unwind loop within
  vfio_msi_set_block(). (Alex)
- Introduce new "has_dyn_msix" property of virtual device instead of
  querying support every time. (Alex)
- Some smaller changes, please refer to individual patches.

Changes since RFC V1:
- RFC V1: https://lore.kernel.org/lkml/cover.1678911529.git.reinette.chatre@intel.com/
- Improved changelogs.
- Simplify interface so that vfio_irq_ctx_alloc_single() returns pointer to
  allocated context. (Alex)
- Remove vfio_irq_ctx_range_allocated() and associated attempts to maintain
  invalid error path behavior. (Alex and Kevin)
- Add pointer to interrupt context as function parameter to
  vfio_irq_ctx_free(). (Alex)
- Ensure variables are initialized. (Dan Carpenter)
- Only support dynamic allocation if device supports it. (Alex)

Qemu allocates interrupts incrementally at the time the guest unmasks an
interrupt, for example each time a Linux guest runs request_irq().

Dynamic allocation of MSI-X interrupts was not possible until v6.2 [1].
This prompted Qemu to, when allocating a new interrupt, first release all
previously allocated interrupts (including disable of MSI-X) followed
by re-allocation of all interrupts that includes the new interrupt.
Please see [2] for a detailed discussion about this issue.

Releasing and re-allocating interrupts may be acceptable if all
interrupts are unmasked during device initialization. If unmasking of
interrupts occur during runtime this may result in lost interrupts.
For example, consider an accelerator device with multiple work queues,
each work queue having a dedicated interrupt. A work queue can be
enabled at any time with its associated interrupt unmasked while other
work queues are already active. Having all interrupts released and MSI-X
disabled to enable the new work queue will impact active work queues.

This series builds on the recent interrupt sub-system core changes
that added support for dynamic MSI-X allocation after initial MSI-X
enabling.

Add support for dynamic MSI-X allocation to vfio-pci. A flag
indicating lack of support for dynamic allocation already exist:
VFIO_IRQ_INFO_NORESIZE and has always been set for MSI and MSI-X. With
support for dynamic MSI-X the flag is cleared for MSI-X when supported,
enabling Qemu to modify its behavior.

Any feedback is appreciated

Reinette

[1] commit 34026364df8e ("PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X")
[2] https://lore.kernel.org/kvm/MWHPR11MB188603D0D809C1079F5817DC8C099@MWHPR11MB1886.namprd11.prod.outlook.com/#t

Reinette Chatre (10):
  vfio/pci: Consolidate irq cleanup on MSI/MSI-X disable
  vfio/pci: Remove negative check on unsigned vector
  vfio/pci: Prepare for dynamic interrupt context storage
  vfio/pci: Move to single error path
  vfio/pci: Use xarray for interrupt context storage
  vfio/pci: Remove interrupt context counter
  vfio/pci: Update stale comment
  vfio/pci: Probe and store ability to support dynamic MSI-X
  vfio/pci: Support dynamic MSI-X
  vfio/pci: Clear VFIO_IRQ_INFO_NORESIZE for MSI-X

 drivers/vfio/pci/vfio_pci_core.c  |  10 +-
 drivers/vfio/pci/vfio_pci_intrs.c | 320 +++++++++++++++++++++---------
 include/linux/vfio_pci_core.h     |   4 +-
 include/uapi/linux/vfio.h         |   3 +
 4 files changed, 234 insertions(+), 103 deletions(-)


base-commit: 6a8f57ae2eb07ab39a6f0ccad60c760743051026
-- 
2.34.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH V3 01/10] vfio/pci: Consolidate irq cleanup on MSI/MSI-X disable
  2023-04-18 17:29 [PATCH V3 00/10] vfio/pci: Support dynamic allocation of MSI-X interrupts Reinette Chatre
@ 2023-04-18 17:29 ` Reinette Chatre
  2023-04-18 17:29 ` [PATCH V3 02/10] vfio/pci: Remove negative check on unsigned vector Reinette Chatre
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Reinette Chatre @ 2023-04-18 17:29 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian,
	alex.williamson
  Cc: tglx, darwi, kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu,
	tom.zanussi, reinette.chatre, linux-kernel

vfio_msi_disable() releases all previously allocated state
associated with each interrupt before disabling MSI/MSI-X.

vfio_msi_disable() iterates twice over the interrupt state:
first directly with a for loop to do virqfd cleanup, followed
by another for loop within vfio_msi_set_block() that removes
the interrupt handler and its associated state using
vfio_msi_set_vector_signal().

Simplify interrupt cleanup by iterating over allocated interrupts
once.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Improve accuracy of changelog.

 drivers/vfio/pci/vfio_pci_intrs.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index bffb0741518b..6a9c6a143cc3 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -426,10 +426,9 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix)
 	for (i = 0; i < vdev->num_ctx; i++) {
 		vfio_virqfd_disable(&vdev->ctx[i].unmask);
 		vfio_virqfd_disable(&vdev->ctx[i].mask);
+		vfio_msi_set_vector_signal(vdev, i, -1, msix);
 	}
 
-	vfio_msi_set_block(vdev, 0, vdev->num_ctx, NULL, msix);
-
 	cmd = vfio_pci_memory_lock_and_enable(vdev);
 	pci_free_irq_vectors(pdev);
 	vfio_pci_memory_unlock_and_restore(vdev, cmd);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH V3 02/10] vfio/pci: Remove negative check on unsigned vector
  2023-04-18 17:29 [PATCH V3 00/10] vfio/pci: Support dynamic allocation of MSI-X interrupts Reinette Chatre
  2023-04-18 17:29 ` [PATCH V3 01/10] vfio/pci: Consolidate irq cleanup on MSI/MSI-X disable Reinette Chatre
@ 2023-04-18 17:29 ` Reinette Chatre
  2023-04-18 17:29 ` [PATCH V3 03/10] vfio/pci: Prepare for dynamic interrupt context storage Reinette Chatre
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Reinette Chatre @ 2023-04-18 17:29 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian,
	alex.williamson
  Cc: tglx, darwi, kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu,
	tom.zanussi, reinette.chatre, linux-kernel

User space provides the vector as an unsigned int that is checked
early for validity (vfio_set_irqs_validate_and_prepare()).

A later negative check of the provided vector is not necessary.

Remove the negative check and ensure the type used
for the vector is consistent as an unsigned int.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Rework unwind loop within vfio_msi_set_block() that required j
  to be an int. Rework results in both i and j used for vector, both
  moved to be unsigned int. (Alex)

 drivers/vfio/pci/vfio_pci_intrs.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 6a9c6a143cc3..258de57ef956 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -317,14 +317,14 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi
 }
 
 static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
-				      int vector, int fd, bool msix)
+				      unsigned int vector, int fd, bool msix)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	struct eventfd_ctx *trigger;
 	int irq, ret;
 	u16 cmd;
 
-	if (vector < 0 || vector >= vdev->num_ctx)
+	if (vector >= vdev->num_ctx)
 		return -EINVAL;
 
 	irq = pci_irq_vector(pdev, vector);
@@ -399,7 +399,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start,
 			      unsigned count, int32_t *fds, bool msix)
 {
-	int i, j, ret = 0;
+	unsigned int i, j;
+	int ret = 0;
 
 	if (start >= vdev->num_ctx || start + count > vdev->num_ctx)
 		return -EINVAL;
@@ -410,8 +411,8 @@ static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start,
 	}
 
 	if (ret) {
-		for (--j; j >= (int)start; j--)
-			vfio_msi_set_vector_signal(vdev, j, -1, msix);
+		for (i = start; i < j; i++)
+			vfio_msi_set_vector_signal(vdev, i, -1, msix);
 	}
 
 	return ret;
@@ -420,7 +421,7 @@ static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start,
 static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix)
 {
 	struct pci_dev *pdev = vdev->pdev;
-	int i;
+	unsigned int i;
 	u16 cmd;
 
 	for (i = 0; i < vdev->num_ctx; i++) {
@@ -542,7 +543,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev,
 				    unsigned index, unsigned start,
 				    unsigned count, uint32_t flags, void *data)
 {
-	int i;
+	unsigned int i;
 	bool msix = (index == VFIO_PCI_MSIX_IRQ_INDEX) ? true : false;
 
 	if (irq_is(vdev, index) && !count && (flags & VFIO_IRQ_SET_DATA_NONE)) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH V3 03/10] vfio/pci: Prepare for dynamic interrupt context storage
  2023-04-18 17:29 [PATCH V3 00/10] vfio/pci: Support dynamic allocation of MSI-X interrupts Reinette Chatre
  2023-04-18 17:29 ` [PATCH V3 01/10] vfio/pci: Consolidate irq cleanup on MSI/MSI-X disable Reinette Chatre
  2023-04-18 17:29 ` [PATCH V3 02/10] vfio/pci: Remove negative check on unsigned vector Reinette Chatre
@ 2023-04-18 17:29 ` Reinette Chatre
  2023-04-18 17:29 ` [PATCH V3 04/10] vfio/pci: Move to single error path Reinette Chatre
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Reinette Chatre @ 2023-04-18 17:29 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian,
	alex.williamson
  Cc: tglx, darwi, kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu,
	tom.zanussi, reinette.chatre, linux-kernel

Interrupt context storage is statically allocated at the time
interrupts are allocated. Following allocation, the interrupt
context is managed by directly accessing the elements of the
array using the vector as index.

It is possible to allocate additional MSI-X vectors after
MSI-X has been enabled. Dynamic storage of interrupt context
is needed to support adding new MSI-X vectors after initial
allocation.

Replace direct access of array elements with pointers to the
array elements. Doing so reduces impact of moving to a new data
structure. Move interactions with the array to helpers to
mostly contain changes needed to transition to a dynamic
data structure.

No functional change intended.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
No changes since V2.

Changes since RFC V1:
- Improve accuracy of changelog.

 drivers/vfio/pci/vfio_pci_intrs.c | 206 ++++++++++++++++++++----------
 1 file changed, 140 insertions(+), 66 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 258de57ef956..b664fbb6d2f2 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -48,6 +48,31 @@ static bool is_irq_none(struct vfio_pci_core_device *vdev)
 		 vdev->irq_type == VFIO_PCI_MSIX_IRQ_INDEX);
 }
 
+static
+struct vfio_pci_irq_ctx *vfio_irq_ctx_get(struct vfio_pci_core_device *vdev,
+					  unsigned long index)
+{
+	if (index >= vdev->num_ctx)
+		return NULL;
+	return &vdev->ctx[index];
+}
+
+static void vfio_irq_ctx_free_all(struct vfio_pci_core_device *vdev)
+{
+	kfree(vdev->ctx);
+}
+
+static int vfio_irq_ctx_alloc_num(struct vfio_pci_core_device *vdev,
+				  unsigned long num)
+{
+	vdev->ctx = kcalloc(num, sizeof(struct vfio_pci_irq_ctx),
+			    GFP_KERNEL_ACCOUNT);
+	if (!vdev->ctx)
+		return -ENOMEM;
+
+	return 0;
+}
+
 /*
  * INTx
  */
@@ -55,17 +80,28 @@ static void vfio_send_intx_eventfd(void *opaque, void *unused)
 {
 	struct vfio_pci_core_device *vdev = opaque;
 
-	if (likely(is_intx(vdev) && !vdev->virq_disabled))
-		eventfd_signal(vdev->ctx[0].trigger, 1);
+	if (likely(is_intx(vdev) && !vdev->virq_disabled)) {
+		struct vfio_pci_irq_ctx *ctx;
+
+		ctx = vfio_irq_ctx_get(vdev, 0);
+		if (!ctx)
+			return;
+		eventfd_signal(ctx->trigger, 1);
+	}
 }
 
 /* Returns true if the INTx vfio_pci_irq_ctx.masked value is changed. */
 bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *pdev = vdev->pdev;
+	struct vfio_pci_irq_ctx *ctx;
 	unsigned long flags;
 	bool masked_changed = false;
 
+	ctx = vfio_irq_ctx_get(vdev, 0);
+	if (!ctx)
+		return masked_changed;
+
 	spin_lock_irqsave(&vdev->irqlock, flags);
 
 	/*
@@ -77,7 +113,7 @@ bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev)
 	if (unlikely(!is_intx(vdev))) {
 		if (vdev->pci_2_3)
 			pci_intx(pdev, 0);
-	} else if (!vdev->ctx[0].masked) {
+	} else if (!ctx->masked) {
 		/*
 		 * Can't use check_and_mask here because we always want to
 		 * mask, not just when something is pending.
@@ -87,7 +123,7 @@ bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev)
 		else
 			disable_irq_nosync(pdev->irq);
 
-		vdev->ctx[0].masked = true;
+		ctx->masked = true;
 		masked_changed = true;
 	}
 
@@ -105,9 +141,14 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused)
 {
 	struct vfio_pci_core_device *vdev = opaque;
 	struct pci_dev *pdev = vdev->pdev;
+	struct vfio_pci_irq_ctx *ctx;
 	unsigned long flags;
 	int ret = 0;
 
+	ctx = vfio_irq_ctx_get(vdev, 0);
+	if (!ctx)
+		return ret;
+
 	spin_lock_irqsave(&vdev->irqlock, flags);
 
 	/*
@@ -117,7 +158,7 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused)
 	if (unlikely(!is_intx(vdev))) {
 		if (vdev->pci_2_3)
 			pci_intx(pdev, 1);
-	} else if (vdev->ctx[0].masked && !vdev->virq_disabled) {
+	} else if (ctx->masked && !vdev->virq_disabled) {
 		/*
 		 * A pending interrupt here would immediately trigger,
 		 * but we can avoid that overhead by just re-sending
@@ -129,7 +170,7 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused)
 		} else
 			enable_irq(pdev->irq);
 
-		vdev->ctx[0].masked = (ret > 0);
+		ctx->masked = (ret > 0);
 	}
 
 	spin_unlock_irqrestore(&vdev->irqlock, flags);
@@ -146,18 +187,23 @@ void vfio_pci_intx_unmask(struct vfio_pci_core_device *vdev)
 static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
 {
 	struct vfio_pci_core_device *vdev = dev_id;
+	struct vfio_pci_irq_ctx *ctx;
 	unsigned long flags;
 	int ret = IRQ_NONE;
 
+	ctx = vfio_irq_ctx_get(vdev, 0);
+	if (!ctx)
+		return ret;
+
 	spin_lock_irqsave(&vdev->irqlock, flags);
 
 	if (!vdev->pci_2_3) {
 		disable_irq_nosync(vdev->pdev->irq);
-		vdev->ctx[0].masked = true;
+		ctx->masked = true;
 		ret = IRQ_HANDLED;
-	} else if (!vdev->ctx[0].masked &&  /* may be shared */
+	} else if (!ctx->masked &&  /* may be shared */
 		   pci_check_and_mask_intx(vdev->pdev)) {
-		vdev->ctx[0].masked = true;
+		ctx->masked = true;
 		ret = IRQ_HANDLED;
 	}
 
@@ -171,15 +217,24 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
 
 static int vfio_intx_enable(struct vfio_pci_core_device *vdev)
 {
+	struct vfio_pci_irq_ctx *ctx;
+	int ret;
+
 	if (!is_irq_none(vdev))
 		return -EINVAL;
 
 	if (!vdev->pdev->irq)
 		return -ENODEV;
 
-	vdev->ctx = kzalloc(sizeof(struct vfio_pci_irq_ctx), GFP_KERNEL_ACCOUNT);
-	if (!vdev->ctx)
-		return -ENOMEM;
+	ret = vfio_irq_ctx_alloc_num(vdev, 1);
+	if (ret)
+		return ret;
+
+	ctx = vfio_irq_ctx_get(vdev, 0);
+	if (!ctx) {
+		vfio_irq_ctx_free_all(vdev);
+		return -EINVAL;
+	}
 
 	vdev->num_ctx = 1;
 
@@ -189,9 +244,9 @@ static int vfio_intx_enable(struct vfio_pci_core_device *vdev)
 	 * here, non-PCI-2.3 devices will have to wait until the
 	 * interrupt is enabled.
 	 */
-	vdev->ctx[0].masked = vdev->virq_disabled;
+	ctx->masked = vdev->virq_disabled;
 	if (vdev->pci_2_3)
-		pci_intx(vdev->pdev, !vdev->ctx[0].masked);
+		pci_intx(vdev->pdev, !ctx->masked);
 
 	vdev->irq_type = VFIO_PCI_INTX_IRQ_INDEX;
 
@@ -202,41 +257,46 @@ static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	unsigned long irqflags = IRQF_SHARED;
+	struct vfio_pci_irq_ctx *ctx;
 	struct eventfd_ctx *trigger;
 	unsigned long flags;
 	int ret;
 
-	if (vdev->ctx[0].trigger) {
+	ctx = vfio_irq_ctx_get(vdev, 0);
+	if (!ctx)
+		return -EINVAL;
+
+	if (ctx->trigger) {
 		free_irq(pdev->irq, vdev);
-		kfree(vdev->ctx[0].name);
-		eventfd_ctx_put(vdev->ctx[0].trigger);
-		vdev->ctx[0].trigger = NULL;
+		kfree(ctx->name);
+		eventfd_ctx_put(ctx->trigger);
+		ctx->trigger = NULL;
 	}
 
 	if (fd < 0) /* Disable only */
 		return 0;
 
-	vdev->ctx[0].name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-intx(%s)",
-				      pci_name(pdev));
-	if (!vdev->ctx[0].name)
+	ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-intx(%s)",
+			      pci_name(pdev));
+	if (!ctx->name)
 		return -ENOMEM;
 
 	trigger = eventfd_ctx_fdget(fd);
 	if (IS_ERR(trigger)) {
-		kfree(vdev->ctx[0].name);
+		kfree(ctx->name);
 		return PTR_ERR(trigger);
 	}
 
-	vdev->ctx[0].trigger = trigger;
+	ctx->trigger = trigger;
 
 	if (!vdev->pci_2_3)
 		irqflags = 0;
 
 	ret = request_irq(pdev->irq, vfio_intx_handler,
-			  irqflags, vdev->ctx[0].name, vdev);
+			  irqflags, ctx->name, vdev);
 	if (ret) {
-		vdev->ctx[0].trigger = NULL;
-		kfree(vdev->ctx[0].name);
+		ctx->trigger = NULL;
+		kfree(ctx->name);
 		eventfd_ctx_put(trigger);
 		return ret;
 	}
@@ -246,7 +306,7 @@ static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd)
 	 * disable_irq won't.
 	 */
 	spin_lock_irqsave(&vdev->irqlock, flags);
-	if (!vdev->pci_2_3 && vdev->ctx[0].masked)
+	if (!vdev->pci_2_3 && ctx->masked)
 		disable_irq_nosync(pdev->irq);
 	spin_unlock_irqrestore(&vdev->irqlock, flags);
 
@@ -255,12 +315,17 @@ static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd)
 
 static void vfio_intx_disable(struct vfio_pci_core_device *vdev)
 {
-	vfio_virqfd_disable(&vdev->ctx[0].unmask);
-	vfio_virqfd_disable(&vdev->ctx[0].mask);
+	struct vfio_pci_irq_ctx *ctx;
+
+	ctx = vfio_irq_ctx_get(vdev, 0);
+	if (ctx) {
+		vfio_virqfd_disable(&ctx->unmask);
+		vfio_virqfd_disable(&ctx->mask);
+	}
 	vfio_intx_set_signal(vdev, -1);
 	vdev->irq_type = VFIO_PCI_NUM_IRQS;
 	vdev->num_ctx = 0;
-	kfree(vdev->ctx);
+	vfio_irq_ctx_free_all(vdev);
 }
 
 /*
@@ -284,10 +349,9 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi
 	if (!is_irq_none(vdev))
 		return -EINVAL;
 
-	vdev->ctx = kcalloc(nvec, sizeof(struct vfio_pci_irq_ctx),
-			    GFP_KERNEL_ACCOUNT);
-	if (!vdev->ctx)
-		return -ENOMEM;
+	ret = vfio_irq_ctx_alloc_num(vdev, nvec);
+	if (ret)
+		return ret;
 
 	/* return the number of supported vectors if we can't get all: */
 	cmd = vfio_pci_memory_lock_and_enable(vdev);
@@ -296,7 +360,7 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi
 		if (ret > 0)
 			pci_free_irq_vectors(pdev);
 		vfio_pci_memory_unlock_and_restore(vdev, cmd);
-		kfree(vdev->ctx);
+		vfio_irq_ctx_free_all(vdev);
 		return ret;
 	}
 	vfio_pci_memory_unlock_and_restore(vdev, cmd);
@@ -320,6 +384,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 				      unsigned int vector, int fd, bool msix)
 {
 	struct pci_dev *pdev = vdev->pdev;
+	struct vfio_pci_irq_ctx *ctx;
 	struct eventfd_ctx *trigger;
 	int irq, ret;
 	u16 cmd;
@@ -327,33 +392,33 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 	if (vector >= vdev->num_ctx)
 		return -EINVAL;
 
+	ctx = vfio_irq_ctx_get(vdev, vector);
+	if (!ctx)
+		return -EINVAL;
 	irq = pci_irq_vector(pdev, vector);
 
-	if (vdev->ctx[vector].trigger) {
-		irq_bypass_unregister_producer(&vdev->ctx[vector].producer);
+	if (ctx->trigger) {
+		irq_bypass_unregister_producer(&ctx->producer);
 
 		cmd = vfio_pci_memory_lock_and_enable(vdev);
-		free_irq(irq, vdev->ctx[vector].trigger);
+		free_irq(irq, ctx->trigger);
 		vfio_pci_memory_unlock_and_restore(vdev, cmd);
-
-		kfree(vdev->ctx[vector].name);
-		eventfd_ctx_put(vdev->ctx[vector].trigger);
-		vdev->ctx[vector].trigger = NULL;
+		kfree(ctx->name);
+		eventfd_ctx_put(ctx->trigger);
+		ctx->trigger = NULL;
 	}
 
 	if (fd < 0)
 		return 0;
 
-	vdev->ctx[vector].name = kasprintf(GFP_KERNEL_ACCOUNT,
-					   "vfio-msi%s[%d](%s)",
-					   msix ? "x" : "", vector,
-					   pci_name(pdev));
-	if (!vdev->ctx[vector].name)
+	ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-msi%s[%d](%s)",
+			      msix ? "x" : "", vector, pci_name(pdev));
+	if (!ctx->name)
 		return -ENOMEM;
 
 	trigger = eventfd_ctx_fdget(fd);
 	if (IS_ERR(trigger)) {
-		kfree(vdev->ctx[vector].name);
+		kfree(ctx->name);
 		return PTR_ERR(trigger);
 	}
 
@@ -372,26 +437,25 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 		pci_write_msi_msg(irq, &msg);
 	}
 
-	ret = request_irq(irq, vfio_msihandler, 0,
-			  vdev->ctx[vector].name, trigger);
+	ret = request_irq(irq, vfio_msihandler, 0, ctx->name, trigger);
 	vfio_pci_memory_unlock_and_restore(vdev, cmd);
 	if (ret) {
-		kfree(vdev->ctx[vector].name);
+		kfree(ctx->name);
 		eventfd_ctx_put(trigger);
 		return ret;
 	}
 
-	vdev->ctx[vector].producer.token = trigger;
-	vdev->ctx[vector].producer.irq = irq;
-	ret = irq_bypass_register_producer(&vdev->ctx[vector].producer);
+	ctx->producer.token = trigger;
+	ctx->producer.irq = irq;
+	ret = irq_bypass_register_producer(&ctx->producer);
 	if (unlikely(ret)) {
 		dev_info(&pdev->dev,
 		"irq bypass producer (token %p) registration fails: %d\n",
-		vdev->ctx[vector].producer.token, ret);
+		ctx->producer.token, ret);
 
-		vdev->ctx[vector].producer.token = NULL;
+		ctx->producer.token = NULL;
 	}
-	vdev->ctx[vector].trigger = trigger;
+	ctx->trigger = trigger;
 
 	return 0;
 }
@@ -421,13 +485,17 @@ static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start,
 static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix)
 {
 	struct pci_dev *pdev = vdev->pdev;
+	struct vfio_pci_irq_ctx *ctx;
 	unsigned int i;
 	u16 cmd;
 
 	for (i = 0; i < vdev->num_ctx; i++) {
-		vfio_virqfd_disable(&vdev->ctx[i].unmask);
-		vfio_virqfd_disable(&vdev->ctx[i].mask);
-		vfio_msi_set_vector_signal(vdev, i, -1, msix);
+		ctx = vfio_irq_ctx_get(vdev, i);
+		if (ctx) {
+			vfio_virqfd_disable(&ctx->unmask);
+			vfio_virqfd_disable(&ctx->mask);
+			vfio_msi_set_vector_signal(vdev, i, -1, msix);
+		}
 	}
 
 	cmd = vfio_pci_memory_lock_and_enable(vdev);
@@ -443,7 +511,7 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix)
 
 	vdev->irq_type = VFIO_PCI_NUM_IRQS;
 	vdev->num_ctx = 0;
-	kfree(vdev->ctx);
+	vfio_irq_ctx_free_all(vdev);
 }
 
 /*
@@ -463,14 +531,18 @@ static int vfio_pci_set_intx_unmask(struct vfio_pci_core_device *vdev,
 		if (unmask)
 			vfio_pci_intx_unmask(vdev);
 	} else if (flags & VFIO_IRQ_SET_DATA_EVENTFD) {
+		struct vfio_pci_irq_ctx *ctx = vfio_irq_ctx_get(vdev, 0);
 		int32_t fd = *(int32_t *)data;
+
+		if (!ctx)
+			return -EINVAL;
 		if (fd >= 0)
 			return vfio_virqfd_enable((void *) vdev,
 						  vfio_pci_intx_unmask_handler,
 						  vfio_send_intx_eventfd, NULL,
-						  &vdev->ctx[0].unmask, fd);
+						  &ctx->unmask, fd);
 
-		vfio_virqfd_disable(&vdev->ctx[0].unmask);
+		vfio_virqfd_disable(&ctx->unmask);
 	}
 
 	return 0;
@@ -543,6 +615,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev,
 				    unsigned index, unsigned start,
 				    unsigned count, uint32_t flags, void *data)
 {
+	struct vfio_pci_irq_ctx *ctx;
 	unsigned int i;
 	bool msix = (index == VFIO_PCI_MSIX_IRQ_INDEX) ? true : false;
 
@@ -577,14 +650,15 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev,
 		return -EINVAL;
 
 	for (i = start; i < start + count; i++) {
-		if (!vdev->ctx[i].trigger)
+		ctx = vfio_irq_ctx_get(vdev, i);
+		if (!ctx || !ctx->trigger)
 			continue;
 		if (flags & VFIO_IRQ_SET_DATA_NONE) {
-			eventfd_signal(vdev->ctx[i].trigger, 1);
+			eventfd_signal(ctx->trigger, 1);
 		} else if (flags & VFIO_IRQ_SET_DATA_BOOL) {
 			uint8_t *bools = data;
 			if (bools[i - start])
-				eventfd_signal(vdev->ctx[i].trigger, 1);
+				eventfd_signal(ctx->trigger, 1);
 		}
 	}
 	return 0;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH V3 04/10] vfio/pci: Move to single error path
  2023-04-18 17:29 [PATCH V3 00/10] vfio/pci: Support dynamic allocation of MSI-X interrupts Reinette Chatre
                   ` (2 preceding siblings ...)
  2023-04-18 17:29 ` [PATCH V3 03/10] vfio/pci: Prepare for dynamic interrupt context storage Reinette Chatre
@ 2023-04-18 17:29 ` Reinette Chatre
  2023-04-18 17:29 ` [PATCH V3 05/10] vfio/pci: Use xarray for interrupt context storage Reinette Chatre
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Reinette Chatre @ 2023-04-18 17:29 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian,
	alex.williamson
  Cc: tglx, darwi, kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu,
	tom.zanussi, reinette.chatre, linux-kernel

Enabling and disabling of an interrupt involves several steps
that can fail. Cleanup after failure is done when the error
is encountered, resulting in some repetitive code.

Support for dynamic contexts will introduce more steps during
interrupt enabling and disabling.

Transition to centralized exit path in preparation for dynamic
contexts to eliminate duplicate error handling code.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Move patch to earlier in series in support of the change to
  dynamic context management.
- Do not add the "ctx->name = NULL" in error path. It is not done
  in baseline and will not be needed when transitioning to
  dynamic context management.
- Update changelog to not make this change specific to dynamic
  MSI-X.

Changes since RFC V1:
- Improve changelog.

 drivers/vfio/pci/vfio_pci_intrs.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index b664fbb6d2f2..9e17e59a4d60 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -418,8 +418,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 
 	trigger = eventfd_ctx_fdget(fd);
 	if (IS_ERR(trigger)) {
-		kfree(ctx->name);
-		return PTR_ERR(trigger);
+		ret = PTR_ERR(trigger);
+		goto out_free_name;
 	}
 
 	/*
@@ -439,11 +439,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 
 	ret = request_irq(irq, vfio_msihandler, 0, ctx->name, trigger);
 	vfio_pci_memory_unlock_and_restore(vdev, cmd);
-	if (ret) {
-		kfree(ctx->name);
-		eventfd_ctx_put(trigger);
-		return ret;
-	}
+	if (ret)
+		goto out_put_eventfd_ctx;
 
 	ctx->producer.token = trigger;
 	ctx->producer.irq = irq;
@@ -458,6 +455,12 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 	ctx->trigger = trigger;
 
 	return 0;
+
+out_put_eventfd_ctx:
+	eventfd_ctx_put(trigger);
+out_free_name:
+	kfree(ctx->name);
+	return ret;
 }
 
 static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH V3 05/10] vfio/pci: Use xarray for interrupt context storage
  2023-04-18 17:29 [PATCH V3 00/10] vfio/pci: Support dynamic allocation of MSI-X interrupts Reinette Chatre
                   ` (3 preceding siblings ...)
  2023-04-18 17:29 ` [PATCH V3 04/10] vfio/pci: Move to single error path Reinette Chatre
@ 2023-04-18 17:29 ` Reinette Chatre
  2023-04-18 17:29 ` [PATCH V3 06/10] vfio/pci: Remove interrupt context counter Reinette Chatre
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Reinette Chatre @ 2023-04-18 17:29 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian,
	alex.williamson
  Cc: tglx, darwi, kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu,
	tom.zanussi, reinette.chatre, linux-kernel

Interrupt context is statically allocated at the time interrupts
are allocated. Following allocation, the context is managed by
directly accessing the elements of the array using the vector
as index. The storage is released when interrupts are disabled.

It is possible to dynamically allocate a single MSI-X interrupt
after MSI-X is enabled. A dynamic storage for interrupt context
is needed to support this. Replace the interrupt context array with an
xarray (similar to what the core uses as store for MSI descriptors)
that can support the dynamic expansion while maintaining the
custom that uses the vector as index.

With a dynamic storage it is no longer required to pre-allocate
interrupt contexts at the time the interrupts are allocated.
MSI and MSI-X interrupt contexts are only used when interrupts are
enabled. Their allocation can thus be delayed until interrupt enabling.
Only enabled interrupts will have associated interrupt contexts.
Whether an interrupt has been allocated (a Linux irq number exists
for it) becomes the criteria for whether an interrupt can be enabled.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/lkml/20230404122444.59e36a99.alex.williamson@redhat.com/
---
Changes since V2:
- Only allocate contexts as they are used, or "active". (Alex)
- Move vfio_irq_ctx_free() from later patch to prevent open-coding
  the same within vfio_irq_ctx_free_all(). This evolved into
  vfio_irq_ctx_free() used for dynamic context allocation and
  vfio_irq_ctx_free_all() removed because of it. (Alex)
- With vfio_irq_ctx_alloc_num() removed, rename vfio_irq_ctx_alloc_single()
  to vfio_irq_ctx_alloc().

Changes since RFC V1:
- Let vfio_irq_ctx_alloc_single() return pointer to allocated
  context. (Alex)
- Transition INTx allocation to simplified vfio_irq_ctx_alloc_single().
- Improve accuracy of changelog.

 drivers/vfio/pci/vfio_pci_core.c  |  1 +
 drivers/vfio/pci/vfio_pci_intrs.c | 91 ++++++++++++++++---------------
 include/linux/vfio_pci_core.h     |  2 +-
 3 files changed, 48 insertions(+), 46 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index a5ab416cf476..ae0e161c7fc9 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -2102,6 +2102,7 @@ int vfio_pci_core_init_dev(struct vfio_device *core_vdev)
 	INIT_LIST_HEAD(&vdev->vma_list);
 	INIT_LIST_HEAD(&vdev->sriov_pfs_item);
 	init_rwsem(&vdev->memory_lock);
+	xa_init(&vdev->ctx);
 
 	return 0;
 }
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 9e17e59a4d60..117cd384b3ad 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -52,25 +52,33 @@ static
 struct vfio_pci_irq_ctx *vfio_irq_ctx_get(struct vfio_pci_core_device *vdev,
 					  unsigned long index)
 {
-	if (index >= vdev->num_ctx)
-		return NULL;
-	return &vdev->ctx[index];
+	return xa_load(&vdev->ctx, index);
 }
 
-static void vfio_irq_ctx_free_all(struct vfio_pci_core_device *vdev)
+static void vfio_irq_ctx_free(struct vfio_pci_core_device *vdev,
+			      struct vfio_pci_irq_ctx *ctx, unsigned long index)
 {
-	kfree(vdev->ctx);
+	xa_erase(&vdev->ctx, index);
+	kfree(ctx);
 }
 
-static int vfio_irq_ctx_alloc_num(struct vfio_pci_core_device *vdev,
-				  unsigned long num)
+static struct vfio_pci_irq_ctx *
+vfio_irq_ctx_alloc(struct vfio_pci_core_device *vdev, unsigned long index)
 {
-	vdev->ctx = kcalloc(num, sizeof(struct vfio_pci_irq_ctx),
-			    GFP_KERNEL_ACCOUNT);
-	if (!vdev->ctx)
-		return -ENOMEM;
+	struct vfio_pci_irq_ctx *ctx;
+	int ret;
 
-	return 0;
+	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL_ACCOUNT);
+	if (!ctx)
+		return NULL;
+
+	ret = xa_insert(&vdev->ctx, index, ctx, GFP_KERNEL_ACCOUNT);
+	if (ret) {
+		kfree(ctx);
+		return NULL;
+	}
+
+	return ctx;
 }
 
 /*
@@ -218,7 +226,6 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
 static int vfio_intx_enable(struct vfio_pci_core_device *vdev)
 {
 	struct vfio_pci_irq_ctx *ctx;
-	int ret;
 
 	if (!is_irq_none(vdev))
 		return -EINVAL;
@@ -226,15 +233,9 @@ static int vfio_intx_enable(struct vfio_pci_core_device *vdev)
 	if (!vdev->pdev->irq)
 		return -ENODEV;
 
-	ret = vfio_irq_ctx_alloc_num(vdev, 1);
-	if (ret)
-		return ret;
-
-	ctx = vfio_irq_ctx_get(vdev, 0);
-	if (!ctx) {
-		vfio_irq_ctx_free_all(vdev);
-		return -EINVAL;
-	}
+	ctx = vfio_irq_ctx_alloc(vdev, 0);
+	if (!ctx)
+		return -ENOMEM;
 
 	vdev->num_ctx = 1;
 
@@ -325,7 +326,7 @@ static void vfio_intx_disable(struct vfio_pci_core_device *vdev)
 	vfio_intx_set_signal(vdev, -1);
 	vdev->irq_type = VFIO_PCI_NUM_IRQS;
 	vdev->num_ctx = 0;
-	vfio_irq_ctx_free_all(vdev);
+	vfio_irq_ctx_free(vdev, ctx, 0);
 }
 
 /*
@@ -349,10 +350,6 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi
 	if (!is_irq_none(vdev))
 		return -EINVAL;
 
-	ret = vfio_irq_ctx_alloc_num(vdev, nvec);
-	if (ret)
-		return ret;
-
 	/* return the number of supported vectors if we can't get all: */
 	cmd = vfio_pci_memory_lock_and_enable(vdev);
 	ret = pci_alloc_irq_vectors(pdev, 1, nvec, flag);
@@ -360,7 +357,6 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi
 		if (ret > 0)
 			pci_free_irq_vectors(pdev);
 		vfio_pci_memory_unlock_and_restore(vdev, cmd);
-		vfio_irq_ctx_free_all(vdev);
 		return ret;
 	}
 	vfio_pci_memory_unlock_and_restore(vdev, cmd);
@@ -392,12 +388,13 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 	if (vector >= vdev->num_ctx)
 		return -EINVAL;
 
-	ctx = vfio_irq_ctx_get(vdev, vector);
-	if (!ctx)
-		return -EINVAL;
 	irq = pci_irq_vector(pdev, vector);
+	if (irq < 0)
+		return -EINVAL;
 
-	if (ctx->trigger) {
+	ctx = vfio_irq_ctx_get(vdev, vector);
+
+	if (ctx) {
 		irq_bypass_unregister_producer(&ctx->producer);
 
 		cmd = vfio_pci_memory_lock_and_enable(vdev);
@@ -405,16 +402,22 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 		vfio_pci_memory_unlock_and_restore(vdev, cmd);
 		kfree(ctx->name);
 		eventfd_ctx_put(ctx->trigger);
-		ctx->trigger = NULL;
+		vfio_irq_ctx_free(vdev, ctx, vector);
 	}
 
 	if (fd < 0)
 		return 0;
 
+	ctx = vfio_irq_ctx_alloc(vdev, vector);
+	if (!ctx)
+		return -ENOMEM;
+
 	ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-msi%s[%d](%s)",
 			      msix ? "x" : "", vector, pci_name(pdev));
-	if (!ctx->name)
-		return -ENOMEM;
+	if (!ctx->name) {
+		ret = -ENOMEM;
+		goto out_free_ctx;
+	}
 
 	trigger = eventfd_ctx_fdget(fd);
 	if (IS_ERR(trigger)) {
@@ -460,6 +463,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 	eventfd_ctx_put(trigger);
 out_free_name:
 	kfree(ctx->name);
+out_free_ctx:
+	vfio_irq_ctx_free(vdev, ctx, vector);
 	return ret;
 }
 
@@ -489,16 +494,13 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	struct vfio_pci_irq_ctx *ctx;
-	unsigned int i;
+	unsigned long i;
 	u16 cmd;
 
-	for (i = 0; i < vdev->num_ctx; i++) {
-		ctx = vfio_irq_ctx_get(vdev, i);
-		if (ctx) {
-			vfio_virqfd_disable(&ctx->unmask);
-			vfio_virqfd_disable(&ctx->mask);
-			vfio_msi_set_vector_signal(vdev, i, -1, msix);
-		}
+	xa_for_each(&vdev->ctx, i, ctx) {
+		vfio_virqfd_disable(&ctx->unmask);
+		vfio_virqfd_disable(&ctx->mask);
+		vfio_msi_set_vector_signal(vdev, i, -1, msix);
 	}
 
 	cmd = vfio_pci_memory_lock_and_enable(vdev);
@@ -514,7 +516,6 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix)
 
 	vdev->irq_type = VFIO_PCI_NUM_IRQS;
 	vdev->num_ctx = 0;
-	vfio_irq_ctx_free_all(vdev);
 }
 
 /*
@@ -654,7 +655,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev,
 
 	for (i = start; i < start + count; i++) {
 		ctx = vfio_irq_ctx_get(vdev, i);
-		if (!ctx || !ctx->trigger)
+		if (!ctx)
 			continue;
 		if (flags & VFIO_IRQ_SET_DATA_NONE) {
 			eventfd_signal(ctx->trigger, 1);
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 367fd79226a3..61d7873a3973 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -59,7 +59,7 @@ struct vfio_pci_core_device {
 	struct perm_bits	*msi_perm;
 	spinlock_t		irqlock;
 	struct mutex		igate;
-	struct vfio_pci_irq_ctx	*ctx;
+	struct xarray		ctx;
 	int			num_ctx;
 	int			irq_type;
 	int			num_regions;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH V3 06/10] vfio/pci: Remove interrupt context counter
  2023-04-18 17:29 [PATCH V3 00/10] vfio/pci: Support dynamic allocation of MSI-X interrupts Reinette Chatre
                   ` (4 preceding siblings ...)
  2023-04-18 17:29 ` [PATCH V3 05/10] vfio/pci: Use xarray for interrupt context storage Reinette Chatre
@ 2023-04-18 17:29 ` Reinette Chatre
  2023-04-18 17:29 ` [PATCH V3 07/10] vfio/pci: Update stale comment Reinette Chatre
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Reinette Chatre @ 2023-04-18 17:29 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian,
	alex.williamson
  Cc: tglx, darwi, kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu,
	tom.zanussi, reinette.chatre, linux-kernel

struct vfio_pci_core_device::num_ctx counts how many interrupt
contexts have been allocated. When all interrupt contexts are
allocated simultaneously num_ctx provides the upper bound of all
vectors that can be used as indices into the interrupt context
array.

With the upcoming support for dynamic MSI-X the number of
interrupt contexts does not necessarily span the range of allocated
interrupts. Consequently, num_ctx is no longer a trusted upper bound
for valid indices.

Stop using num_ctx to determine if a provided vector is valid. Use
the existence of allocated interrupt.

This changes behavior on the error path when user space provides
an invalid vector range. Behavior changes from early exit without
any modifications to possible modifications to valid vectors within
the invalid range. This is acceptable considering that an invalid
range is not a valid scenario, see link to discussion.

The checks that ensure that user space provides a range of vectors
that is valid for the device are untouched.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/lkml/20230316155646.07ae266f.alex.williamson@redhat.com/
---
Changes since V2:
- Update changelog to reflect change in policy that existence of
  allocated interrupt is validity check, not existence of context
  (which is now dynamically allocated).

Changes since RFC V1:
- Remove vfio_irq_ctx_range_allocated(). (Alex and Kevin).

 drivers/vfio/pci/vfio_pci_intrs.c | 13 +------------
 include/linux/vfio_pci_core.h     |  1 -
 2 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 117cd384b3ad..5e3de004f4cb 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -237,8 +237,6 @@ static int vfio_intx_enable(struct vfio_pci_core_device *vdev)
 	if (!ctx)
 		return -ENOMEM;
 
-	vdev->num_ctx = 1;
-
 	/*
 	 * If the virtual interrupt is masked, restore it.  Devices
 	 * supporting DisINTx can be masked at the hardware level
@@ -325,7 +323,6 @@ static void vfio_intx_disable(struct vfio_pci_core_device *vdev)
 	}
 	vfio_intx_set_signal(vdev, -1);
 	vdev->irq_type = VFIO_PCI_NUM_IRQS;
-	vdev->num_ctx = 0;
 	vfio_irq_ctx_free(vdev, ctx, 0);
 }
 
@@ -361,7 +358,6 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi
 	}
 	vfio_pci_memory_unlock_and_restore(vdev, cmd);
 
-	vdev->num_ctx = nvec;
 	vdev->irq_type = msix ? VFIO_PCI_MSIX_IRQ_INDEX :
 				VFIO_PCI_MSI_IRQ_INDEX;
 
@@ -385,9 +381,6 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 	int irq, ret;
 	u16 cmd;
 
-	if (vector >= vdev->num_ctx)
-		return -EINVAL;
-
 	irq = pci_irq_vector(pdev, vector);
 	if (irq < 0)
 		return -EINVAL;
@@ -474,9 +467,6 @@ static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start,
 	unsigned int i, j;
 	int ret = 0;
 
-	if (start >= vdev->num_ctx || start + count > vdev->num_ctx)
-		return -EINVAL;
-
 	for (i = 0, j = start; i < count && !ret; i++, j++) {
 		int fd = fds ? fds[i] : -1;
 		ret = vfio_msi_set_vector_signal(vdev, j, fd, msix);
@@ -515,7 +505,6 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix)
 		pci_intx(pdev, 0);
 
 	vdev->irq_type = VFIO_PCI_NUM_IRQS;
-	vdev->num_ctx = 0;
 }
 
 /*
@@ -650,7 +639,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev,
 		return ret;
 	}
 
-	if (!irq_is(vdev, index) || start + count > vdev->num_ctx)
+	if (!irq_is(vdev, index))
 		return -EINVAL;
 
 	for (i = start; i < start + count; i++) {
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 61d7873a3973..148fd1ae6c1c 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -60,7 +60,6 @@ struct vfio_pci_core_device {
 	spinlock_t		irqlock;
 	struct mutex		igate;
 	struct xarray		ctx;
-	int			num_ctx;
 	int			irq_type;
 	int			num_regions;
 	struct vfio_pci_region	*region;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH V3 07/10] vfio/pci: Update stale comment
  2023-04-18 17:29 [PATCH V3 00/10] vfio/pci: Support dynamic allocation of MSI-X interrupts Reinette Chatre
                   ` (5 preceding siblings ...)
  2023-04-18 17:29 ` [PATCH V3 06/10] vfio/pci: Remove interrupt context counter Reinette Chatre
@ 2023-04-18 17:29 ` Reinette Chatre
  2023-04-18 17:29 ` [PATCH V3 08/10] vfio/pci: Probe and store ability to support dynamic MSI-X Reinette Chatre
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Reinette Chatre @ 2023-04-18 17:29 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian,
	alex.williamson
  Cc: tglx, darwi, kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu,
	tom.zanussi, reinette.chatre, linux-kernel

In preparation for surrounding code change it is helpful to
ensure that existing comments are accurate.

Remove inaccurate comment about direct access and update
the rest of the comment to reflect the purpose of writing
the cached MSI message to the device.

Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/lkml/20230330164050.0069e2a5.alex.williamson@redhat.com/
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- New patch.

 drivers/vfio/pci/vfio_pci_intrs.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 5e3de004f4cb..bdda7f46c2be 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -419,11 +419,9 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 	}
 
 	/*
-	 * The MSIx vector table resides in device memory which may be cleared
-	 * via backdoor resets. We don't allow direct access to the vector
-	 * table so even if a userspace driver attempts to save/restore around
-	 * such a reset it would be unsuccessful. To avoid this, restore the
-	 * cached value of the message prior to enabling.
+	 * If the vector was previously allocated, refresh the on-device
+	 * message data before enabling in case it had been cleared or
+	 * corrupted since writing.
 	 */
 	cmd = vfio_pci_memory_lock_and_enable(vdev);
 	if (msix) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH V3 08/10] vfio/pci: Probe and store ability to support dynamic MSI-X
  2023-04-18 17:29 [PATCH V3 00/10] vfio/pci: Support dynamic allocation of MSI-X interrupts Reinette Chatre
                   ` (6 preceding siblings ...)
  2023-04-18 17:29 ` [PATCH V3 07/10] vfio/pci: Update stale comment Reinette Chatre
@ 2023-04-18 17:29 ` Reinette Chatre
  2023-04-18 22:38   ` Alex Williamson
  2023-04-18 17:29 ` [PATCH V3 09/10] vfio/pci: Support " Reinette Chatre
  2023-04-18 17:29 ` [PATCH V3 10/10] vfio/pci: Clear VFIO_IRQ_INFO_NORESIZE for MSI-X Reinette Chatre
  9 siblings, 1 reply; 23+ messages in thread
From: Reinette Chatre @ 2023-04-18 17:29 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian,
	alex.williamson
  Cc: tglx, darwi, kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu,
	tom.zanussi, reinette.chatre, linux-kernel

Not all MSI-X devices support dynamic MSI-X allocation. Whether
a device supports dynamic MSI-X should be queried using
pci_msix_can_alloc_dyn().

Instead of scattering code with pci_msix_can_alloc_dyn(),
probe this ability once and store it as a property of the
virtual device.

Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- New patch. (Alex)

 drivers/vfio/pci/vfio_pci_core.c | 5 ++++-
 include/linux/vfio_pci_core.h    | 1 +
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index ae0e161c7fc9..a3635a8e54c8 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -530,8 +530,11 @@ int vfio_pci_core_enable(struct vfio_pci_core_device *vdev)
 		vdev->msix_bar = table & PCI_MSIX_TABLE_BIR;
 		vdev->msix_offset = table & PCI_MSIX_TABLE_OFFSET;
 		vdev->msix_size = ((flags & PCI_MSIX_FLAGS_QSIZE) + 1) * 16;
-	} else
+		vdev->has_dyn_msix = pci_msix_can_alloc_dyn(pdev);
+	} else {
 		vdev->msix_bar = 0xFF;
+		vdev->has_dyn_msix = false;
+	}
 
 	if (!vfio_vga_disabled() && vfio_pci_is_vga(pdev))
 		vdev->has_vga = true;
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 148fd1ae6c1c..4f070f2d6fde 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -67,6 +67,7 @@ struct vfio_pci_core_device {
 	u8			msix_bar;
 	u16			msix_size;
 	u32			msix_offset;
+	bool			has_dyn_msix;
 	u32			rbar[7];
 	bool			pci_2_3;
 	bool			virq_disabled;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH V3 09/10] vfio/pci: Support dynamic MSI-X
  2023-04-18 17:29 [PATCH V3 00/10] vfio/pci: Support dynamic allocation of MSI-X interrupts Reinette Chatre
                   ` (7 preceding siblings ...)
  2023-04-18 17:29 ` [PATCH V3 08/10] vfio/pci: Probe and store ability to support dynamic MSI-X Reinette Chatre
@ 2023-04-18 17:29 ` Reinette Chatre
  2023-04-18 22:38   ` Alex Williamson
  2023-04-18 17:29 ` [PATCH V3 10/10] vfio/pci: Clear VFIO_IRQ_INFO_NORESIZE for MSI-X Reinette Chatre
  9 siblings, 1 reply; 23+ messages in thread
From: Reinette Chatre @ 2023-04-18 17:29 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian,
	alex.williamson
  Cc: tglx, darwi, kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu,
	tom.zanussi, reinette.chatre, linux-kernel

Recently introduced pci_msix_alloc_irq_at() and pci_msix_free_irq()
enables an individual MSI-X interrupt to be allocated and freed after
MSI-X enabling.

Use dynamic MSI-X (if supported by the device) to allocate an interrupt
after MSI-X is enabled. An MSI-X interrupt is dynamically allocated at
the time a valid eventfd is assigned. This is different behavior from
a range provided during MSI-X enabling where interrupts are allocated
for the entire range whether a valid eventfd is provided for each
interrupt or not.

Do not dynamically free interrupts, leave that to when MSI-X is
disabled.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/lkml/20230403211841.0e206b67.alex.williamson@redhat.com/
---
The get_cached_msi_msg()/pci_write_msi_msg() behavior is kept
similar to the scenario when MSI-X is enabled with triggers
provided for new interrupts. get_cached_msi_msg()/pci_write_msi_msg()
follows for interrupts recently allocated with pci_msix_alloc_irq_at()
just like get_cached_msi_msg()/pci_write_msi_msg() is done for
interrupts recently allocated with pci_alloc_irq_vectors().

Changes since V2:
- Move vfio_irq_ctx_free() to earlier in series to support
  earlier usage. (Alex)
- Use consistent terms in changelog: MSI-x changed to MSI-X.
- Make dynamic interrupt context creation generic across all
  MSI/MSI-X interrupts. This resulted in code moving to earlier
  in series as part of xarray introduction patch. (Alex)
- Remove the local allow_dyn_alloc and direct calling of
  pci_msix_can_alloc_dyn(), use the new vdev->has_dyn_msix
  introduced earlier instead. (Alex)
- Stop tracking new allocations (remove "new_ctx"). (Alex)
- Introduce new wrapper that returns Linux interrupt number or
  dynamically allocate a new interrupt. Wrapper can be used for
  all interrupt cases. (Alex)
- Only free dynamic MSI-X interrupts on MSI-X teardown. (Alex)

Changes since RFC V1:
- Add pointer to interrupt context as function parameter to
  vfio_irq_ctx_free(). (Alex)
- Initialize new_ctx to false. (Dan Carpenter)
- Only support dynamic allocation if device supports it. (Alex)

 drivers/vfio/pci/vfio_pci_intrs.c | 73 +++++++++++++++++++++++++++----
 1 file changed, 65 insertions(+), 8 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index bdda7f46c2be..c1a3e224c867 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -372,27 +372,74 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi
 	return 0;
 }
 
+/*
+ * Return Linux IRQ number of an MSI or MSI-X device interrupt vector.
+ * If a Linux IRQ number is not available then a new interrupt will be
+ * allocated if dynamic MSI-X is supported.
+ */
+static int vfio_msi_alloc_irq(struct vfio_pci_core_device *vdev,
+			      unsigned int vector, bool msix)
+{
+	struct pci_dev *pdev = vdev->pdev;
+	struct msi_map map;
+	int irq;
+	u16 cmd;
+
+	irq = pci_irq_vector(pdev, vector);
+	if (irq > 0 || !msix || !vdev->has_dyn_msix)
+		return irq;
+
+	cmd = vfio_pci_memory_lock_and_enable(vdev);
+	map = pci_msix_alloc_irq_at(pdev, vector, NULL);
+	vfio_pci_memory_unlock_and_restore(vdev, cmd);
+
+	return map.index < 0 ? map.index : map.virq;
+}
+
+/*
+ * Free interrupt if it can be re-allocated dynamically (while MSI-X
+ * is enabled).
+ */
+static void vfio_msi_free_irq(struct vfio_pci_core_device *vdev,
+			      unsigned int vector, bool msix)
+{
+	struct pci_dev *pdev = vdev->pdev;
+	struct msi_map map;
+	int irq;
+	u16 cmd;
+
+	if (!msix || !vdev->has_dyn_msix)
+		return;
+
+	irq = pci_irq_vector(pdev, vector);
+	map = (struct msi_map) { .index = vector, .virq = irq };
+
+	if (WARN_ON(irq < 0))
+		return;
+
+	cmd = vfio_pci_memory_lock_and_enable(vdev);
+	pci_msix_free_irq(pdev, map);
+	vfio_pci_memory_unlock_and_restore(vdev, cmd);
+}
+
 static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 				      unsigned int vector, int fd, bool msix)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	struct vfio_pci_irq_ctx *ctx;
 	struct eventfd_ctx *trigger;
-	int irq, ret;
+	int irq = -EINVAL, ret;
 	u16 cmd;
 
-	irq = pci_irq_vector(pdev, vector);
-	if (irq < 0)
-		return -EINVAL;
-
 	ctx = vfio_irq_ctx_get(vdev, vector);
 
 	if (ctx) {
 		irq_bypass_unregister_producer(&ctx->producer);
-
+		irq = pci_irq_vector(pdev, vector);
 		cmd = vfio_pci_memory_lock_and_enable(vdev);
 		free_irq(irq, ctx->trigger);
 		vfio_pci_memory_unlock_and_restore(vdev, cmd);
+		/* Interrupt stays allocated, will be freed at MSI-X disable. */
 		kfree(ctx->name);
 		eventfd_ctx_put(ctx->trigger);
 		vfio_irq_ctx_free(vdev, ctx, vector);
@@ -401,9 +448,17 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 	if (fd < 0)
 		return 0;
 
+	if (irq == -EINVAL) {
+		irq = vfio_msi_alloc_irq(vdev, vector, msix);
+		if (irq < 0)
+			return irq;
+	}
+
 	ctx = vfio_irq_ctx_alloc(vdev, vector);
-	if (!ctx)
-		return -ENOMEM;
+	if (!ctx) {
+		ret = -ENOMEM;
+		goto out_free_irq;
+	}
 
 	ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-msi%s[%d](%s)",
 			      msix ? "x" : "", vector, pci_name(pdev));
@@ -456,6 +511,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 	kfree(ctx->name);
 out_free_ctx:
 	vfio_irq_ctx_free(vdev, ctx, vector);
+out_free_irq:
+	vfio_msi_free_irq(vdev, vector, msix);
 	return ret;
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH V3 10/10] vfio/pci: Clear VFIO_IRQ_INFO_NORESIZE for MSI-X
  2023-04-18 17:29 [PATCH V3 00/10] vfio/pci: Support dynamic allocation of MSI-X interrupts Reinette Chatre
                   ` (8 preceding siblings ...)
  2023-04-18 17:29 ` [PATCH V3 09/10] vfio/pci: Support " Reinette Chatre
@ 2023-04-18 17:29 ` Reinette Chatre
  2023-04-18 22:38   ` Alex Williamson
  9 siblings, 1 reply; 23+ messages in thread
From: Reinette Chatre @ 2023-04-18 17:29 UTC (permalink / raw)
  To: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian,
	alex.williamson
  Cc: tglx, darwi, kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu,
	tom.zanussi, reinette.chatre, linux-kernel

Dynamic MSI-X is supported. Clear VFIO_IRQ_INFO_NORESIZE
to provide guidance to user space.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Use new vdev->has_dyn_msix property instead of calling
  pci_msix_can_alloc_dyn() directly. (Alex)

Changes since RFC V1:
- Only advertise VFIO_IRQ_INFO_NORESIZE for MSI-X devices that
  can actually support dynamic allocation. (Alex)

 drivers/vfio/pci/vfio_pci_core.c | 4 +++-
 include/uapi/linux/vfio.h        | 3 +++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index a3635a8e54c8..4050ad3388c2 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1114,7 +1114,9 @@ static int vfio_pci_ioctl_get_irq_info(struct vfio_pci_core_device *vdev,
 	if (info.index == VFIO_PCI_INTX_IRQ_INDEX)
 		info.flags |=
 			(VFIO_IRQ_INFO_MASKABLE | VFIO_IRQ_INFO_AUTOMASKED);
-	else
+	else if ((info.index != VFIO_PCI_MSIX_IRQ_INDEX) ||
+		 (info.index == VFIO_PCI_MSIX_IRQ_INDEX &&
+		  !vdev->has_dyn_msix))
 		info.flags |= VFIO_IRQ_INFO_NORESIZE;
 
 	return copy_to_user(arg, &info, minsz) ? -EFAULT : 0;
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 0552e8dcf0cb..1a36134cae5c 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -511,6 +511,9 @@ struct vfio_region_info_cap_nvlink2_lnkspd {
  * then add and unmask vectors, it's up to userspace to make the decision
  * whether to allocate the maximum supported number of vectors or tear
  * down setup and incrementally increase the vectors as each is enabled.
+ * Absence of the NORESIZE flag indicates that vectors can be enabled
+ * and disabled dynamically without impacting other vectors within the
+ * index.
  */
 struct vfio_irq_info {
 	__u32	argsz;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH V3 08/10] vfio/pci: Probe and store ability to support dynamic MSI-X
  2023-04-18 17:29 ` [PATCH V3 08/10] vfio/pci: Probe and store ability to support dynamic MSI-X Reinette Chatre
@ 2023-04-18 22:38   ` Alex Williamson
  2023-04-19 18:11     ` Reinette Chatre
  0 siblings, 1 reply; 23+ messages in thread
From: Alex Williamson @ 2023-04-18 22:38 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, tglx, darwi,
	kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	linux-kernel

On Tue, 18 Apr 2023 10:29:19 -0700
Reinette Chatre <reinette.chatre@intel.com> wrote:

> Not all MSI-X devices support dynamic MSI-X allocation. Whether
> a device supports dynamic MSI-X should be queried using
> pci_msix_can_alloc_dyn().
> 
> Instead of scattering code with pci_msix_can_alloc_dyn(),
> probe this ability once and store it as a property of the
> virtual device.
> 
> Suggested-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> Changes since V2:
> - New patch. (Alex)
> 
>  drivers/vfio/pci/vfio_pci_core.c | 5 ++++-
>  include/linux/vfio_pci_core.h    | 1 +
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index ae0e161c7fc9..a3635a8e54c8 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -530,8 +530,11 @@ int vfio_pci_core_enable(struct vfio_pci_core_device *vdev)
>  		vdev->msix_bar = table & PCI_MSIX_TABLE_BIR;
>  		vdev->msix_offset = table & PCI_MSIX_TABLE_OFFSET;
>  		vdev->msix_size = ((flags & PCI_MSIX_FLAGS_QSIZE) + 1) * 16;
> -	} else
> +		vdev->has_dyn_msix = pci_msix_can_alloc_dyn(pdev);
> +	} else {
>  		vdev->msix_bar = 0xFF;
> +		vdev->has_dyn_msix = false;
> +	}
>  
>  	if (!vfio_vga_disabled() && vfio_pci_is_vga(pdev))
>  		vdev->has_vga = true;
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index 148fd1ae6c1c..4f070f2d6fde 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -67,6 +67,7 @@ struct vfio_pci_core_device {
>  	u8			msix_bar;
>  	u16			msix_size;
>  	u32			msix_offset;
> +	bool			has_dyn_msix;
>  	u32			rbar[7];
>  	bool			pci_2_3;
>  	bool			virq_disabled;

Nit, the whole data structure probably needs to be sorted with pahole,
but creating a hole here for locality to other msix fields should
probably be secondary to keeping the structure well packed, which
suggests including this new field among the bools below.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH V3 09/10] vfio/pci: Support dynamic MSI-X
  2023-04-18 17:29 ` [PATCH V3 09/10] vfio/pci: Support " Reinette Chatre
@ 2023-04-18 22:38   ` Alex Williamson
  2023-04-19 18:13     ` Reinette Chatre
  0 siblings, 1 reply; 23+ messages in thread
From: Alex Williamson @ 2023-04-18 22:38 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, tglx, darwi,
	kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	linux-kernel

On Tue, 18 Apr 2023 10:29:20 -0700
Reinette Chatre <reinette.chatre@intel.com> wrote:

> Recently introduced pci_msix_alloc_irq_at() and pci_msix_free_irq()
> enables an individual MSI-X interrupt to be allocated and freed after
> MSI-X enabling.
> 
> Use dynamic MSI-X (if supported by the device) to allocate an interrupt
> after MSI-X is enabled. An MSI-X interrupt is dynamically allocated at
> the time a valid eventfd is assigned. This is different behavior from
> a range provided during MSI-X enabling where interrupts are allocated
> for the entire range whether a valid eventfd is provided for each
> interrupt or not.
> 
> Do not dynamically free interrupts, leave that to when MSI-X is
> disabled.

But we do, sometimes, even if essentially only on the error path.  Is
that worthwhile?  It seems like we could entirely remove
vfio_msi_free_irq() and rely only on pci_free_irq_vectors() on MSI/X
teardown.

I'd probably also add a comment in the commit log about the theory
behind not dynamically freeing irqs, ie. latency, reliability, and
whatever else we used to justify it.  Thanks,

Alex

> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> Link: https://lore.kernel.org/lkml/20230403211841.0e206b67.alex.williamson@redhat.com/
> ---
> The get_cached_msi_msg()/pci_write_msi_msg() behavior is kept
> similar to the scenario when MSI-X is enabled with triggers
> provided for new interrupts. get_cached_msi_msg()/pci_write_msi_msg()
> follows for interrupts recently allocated with pci_msix_alloc_irq_at()
> just like get_cached_msi_msg()/pci_write_msi_msg() is done for
> interrupts recently allocated with pci_alloc_irq_vectors().
> 
> Changes since V2:
> - Move vfio_irq_ctx_free() to earlier in series to support
>   earlier usage. (Alex)
> - Use consistent terms in changelog: MSI-x changed to MSI-X.
> - Make dynamic interrupt context creation generic across all
>   MSI/MSI-X interrupts. This resulted in code moving to earlier
>   in series as part of xarray introduction patch. (Alex)
> - Remove the local allow_dyn_alloc and direct calling of
>   pci_msix_can_alloc_dyn(), use the new vdev->has_dyn_msix
>   introduced earlier instead. (Alex)
> - Stop tracking new allocations (remove "new_ctx"). (Alex)
> - Introduce new wrapper that returns Linux interrupt number or
>   dynamically allocate a new interrupt. Wrapper can be used for
>   all interrupt cases. (Alex)
> - Only free dynamic MSI-X interrupts on MSI-X teardown. (Alex)
> 
> Changes since RFC V1:
> - Add pointer to interrupt context as function parameter to
>   vfio_irq_ctx_free(). (Alex)
> - Initialize new_ctx to false. (Dan Carpenter)
> - Only support dynamic allocation if device supports it. (Alex)
> 
>  drivers/vfio/pci/vfio_pci_intrs.c | 73 +++++++++++++++++++++++++++----
>  1 file changed, 65 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
> index bdda7f46c2be..c1a3e224c867 100644
> --- a/drivers/vfio/pci/vfio_pci_intrs.c
> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
> @@ -372,27 +372,74 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi
>  	return 0;
>  }
>  
> +/*
> + * Return Linux IRQ number of an MSI or MSI-X device interrupt vector.
> + * If a Linux IRQ number is not available then a new interrupt will be
> + * allocated if dynamic MSI-X is supported.
> + */
> +static int vfio_msi_alloc_irq(struct vfio_pci_core_device *vdev,
> +			      unsigned int vector, bool msix)
> +{
> +	struct pci_dev *pdev = vdev->pdev;
> +	struct msi_map map;
> +	int irq;
> +	u16 cmd;
> +
> +	irq = pci_irq_vector(pdev, vector);
> +	if (irq > 0 || !msix || !vdev->has_dyn_msix)
> +		return irq;
> +
> +	cmd = vfio_pci_memory_lock_and_enable(vdev);
> +	map = pci_msix_alloc_irq_at(pdev, vector, NULL);
> +	vfio_pci_memory_unlock_and_restore(vdev, cmd);
> +
> +	return map.index < 0 ? map.index : map.virq;
> +}
> +
> +/*
> + * Free interrupt if it can be re-allocated dynamically (while MSI-X
> + * is enabled).
> + */
> +static void vfio_msi_free_irq(struct vfio_pci_core_device *vdev,
> +			      unsigned int vector, bool msix)
> +{
> +	struct pci_dev *pdev = vdev->pdev;
> +	struct msi_map map;
> +	int irq;
> +	u16 cmd;
> +
> +	if (!msix || !vdev->has_dyn_msix)
> +		return;
> +
> +	irq = pci_irq_vector(pdev, vector);
> +	map = (struct msi_map) { .index = vector, .virq = irq };
> +
> +	if (WARN_ON(irq < 0))
> +		return;
> +
> +	cmd = vfio_pci_memory_lock_and_enable(vdev);
> +	pci_msix_free_irq(pdev, map);
> +	vfio_pci_memory_unlock_and_restore(vdev, cmd);
> +}
> +
>  static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
>  				      unsigned int vector, int fd, bool msix)
>  {
>  	struct pci_dev *pdev = vdev->pdev;
>  	struct vfio_pci_irq_ctx *ctx;
>  	struct eventfd_ctx *trigger;
> -	int irq, ret;
> +	int irq = -EINVAL, ret;
>  	u16 cmd;
>  
> -	irq = pci_irq_vector(pdev, vector);
> -	if (irq < 0)
> -		return -EINVAL;
> -
>  	ctx = vfio_irq_ctx_get(vdev, vector);
>  
>  	if (ctx) {
>  		irq_bypass_unregister_producer(&ctx->producer);
> -
> +		irq = pci_irq_vector(pdev, vector);
>  		cmd = vfio_pci_memory_lock_and_enable(vdev);
>  		free_irq(irq, ctx->trigger);
>  		vfio_pci_memory_unlock_and_restore(vdev, cmd);
> +		/* Interrupt stays allocated, will be freed at MSI-X disable. */
>  		kfree(ctx->name);
>  		eventfd_ctx_put(ctx->trigger);
>  		vfio_irq_ctx_free(vdev, ctx, vector);
> @@ -401,9 +448,17 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
>  	if (fd < 0)
>  		return 0;
>  
> +	if (irq == -EINVAL) {
> +		irq = vfio_msi_alloc_irq(vdev, vector, msix);
> +		if (irq < 0)
> +			return irq;
> +	}
> +
>  	ctx = vfio_irq_ctx_alloc(vdev, vector);
> -	if (!ctx)
> -		return -ENOMEM;
> +	if (!ctx) {
> +		ret = -ENOMEM;
> +		goto out_free_irq;
> +	}
>  
>  	ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-msi%s[%d](%s)",
>  			      msix ? "x" : "", vector, pci_name(pdev));
> @@ -456,6 +511,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
>  	kfree(ctx->name);
>  out_free_ctx:
>  	vfio_irq_ctx_free(vdev, ctx, vector);
> +out_free_irq:
> +	vfio_msi_free_irq(vdev, vector, msix);
>  	return ret;
>  }
>  


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH V3 10/10] vfio/pci: Clear VFIO_IRQ_INFO_NORESIZE for MSI-X
  2023-04-18 17:29 ` [PATCH V3 10/10] vfio/pci: Clear VFIO_IRQ_INFO_NORESIZE for MSI-X Reinette Chatre
@ 2023-04-18 22:38   ` Alex Williamson
  2023-04-19 18:13     ` Reinette Chatre
  0 siblings, 1 reply; 23+ messages in thread
From: Alex Williamson @ 2023-04-18 22:38 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, tglx, darwi,
	kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	linux-kernel

On Tue, 18 Apr 2023 10:29:21 -0700
Reinette Chatre <reinette.chatre@intel.com> wrote:

> Dynamic MSI-X is supported. Clear VFIO_IRQ_INFO_NORESIZE
> to provide guidance to user space.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> Changes since V2:
> - Use new vdev->has_dyn_msix property instead of calling
>   pci_msix_can_alloc_dyn() directly. (Alex)
> 
> Changes since RFC V1:
> - Only advertise VFIO_IRQ_INFO_NORESIZE for MSI-X devices that
>   can actually support dynamic allocation. (Alex)
> 
>  drivers/vfio/pci/vfio_pci_core.c | 4 +++-
>  include/uapi/linux/vfio.h        | 3 +++
>  2 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index a3635a8e54c8..4050ad3388c2 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1114,7 +1114,9 @@ static int vfio_pci_ioctl_get_irq_info(struct vfio_pci_core_device *vdev,
>  	if (info.index == VFIO_PCI_INTX_IRQ_INDEX)
>  		info.flags |=
>  			(VFIO_IRQ_INFO_MASKABLE | VFIO_IRQ_INFO_AUTOMASKED);
> -	else
> +	else if ((info.index != VFIO_PCI_MSIX_IRQ_INDEX) ||
> +		 (info.index == VFIO_PCI_MSIX_IRQ_INDEX &&
> +		  !vdev->has_dyn_msix))

Isn't this the same as:

	(info.index != VFIO_PCI_MSIX_IRQ_INDEX || !vdev->has_dyn_msix)

Thanks,
Alex

>  		info.flags |= VFIO_IRQ_INFO_NORESIZE;
>  
>  	return copy_to_user(arg, &info, minsz) ? -EFAULT : 0;
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 0552e8dcf0cb..1a36134cae5c 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -511,6 +511,9 @@ struct vfio_region_info_cap_nvlink2_lnkspd {
>   * then add and unmask vectors, it's up to userspace to make the decision
>   * whether to allocate the maximum supported number of vectors or tear
>   * down setup and incrementally increase the vectors as each is enabled.
> + * Absence of the NORESIZE flag indicates that vectors can be enabled
> + * and disabled dynamically without impacting other vectors within the
> + * index.
>   */
>  struct vfio_irq_info {
>  	__u32	argsz;


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH V3 08/10] vfio/pci: Probe and store ability to support dynamic MSI-X
  2023-04-18 22:38   ` Alex Williamson
@ 2023-04-19 18:11     ` Reinette Chatre
  2023-04-24 17:43       ` Jason Gunthorpe
  0 siblings, 1 reply; 23+ messages in thread
From: Reinette Chatre @ 2023-04-19 18:11 UTC (permalink / raw)
  To: Alex Williamson
  Cc: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, tglx, darwi,
	kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	linux-kernel

Hi Alex,

On 4/18/2023 3:38 PM, Alex Williamson wrote:
> On Tue, 18 Apr 2023 10:29:19 -0700
> Reinette Chatre <reinette.chatre@intel.com> wrote:
> 

...

>> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
>> index 148fd1ae6c1c..4f070f2d6fde 100644
>> --- a/include/linux/vfio_pci_core.h
>> +++ b/include/linux/vfio_pci_core.h
>> @@ -67,6 +67,7 @@ struct vfio_pci_core_device {
>>  	u8			msix_bar;
>>  	u16			msix_size;
>>  	u32			msix_offset;
>> +	bool			has_dyn_msix;
>>  	u32			rbar[7];
>>  	bool			pci_2_3;
>>  	bool			virq_disabled;
> 
> Nit, the whole data structure probably needs to be sorted with pahole,
> but creating a hole here for locality to other msix fields should
> probably be secondary to keeping the structure well packed, which
> suggests including this new field among the bools below.  Thanks,

Thanks for catching this. Moving it one field lower as shown in the
delta patch below seems to improve the layout:

diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 4f070f2d6fde..d730d78754a2 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -67,8 +67,8 @@ struct vfio_pci_core_device {
 	u8			msix_bar;
 	u16			msix_size;
 	u32			msix_offset;
-	bool			has_dyn_msix;
 	u32			rbar[7];
+	bool			has_dyn_msix;
 	bool			pci_2_3;
 	bool			virq_disabled;
 	bool			reset_works;


Combined with the other changes to this struct (new struct xarray
for the context, and removing int num_ctx) the bools are no longer
together on a single cache line. Placing has_dyn_msix as shown above
keeps it on the same cache line as the other msix_* fields.

After this change the layout of this struct appears to be improved.
Before this patch series (v6.3-rc7):
        /* size: 2496, cachelines: 39, members: 46 */
        /* sum members: 2485, holes: 4, sum holes: 11 */
        /* paddings: 2, sum paddings: 11 */
        /* forced alignments: 1 */

After this patch series (v6.3-rc7 + V3 + delta patch):
        /* size: 2568, cachelines: 41, members: 46 */
        /* sum members: 2562, holes: 2, sum holes: 6 */
        /* paddings: 2, sum paddings: 11 */
        /* forced alignments: 1 */
        /* last cacheline: 8 bytes */

Reinette


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH V3 09/10] vfio/pci: Support dynamic MSI-X
  2023-04-18 22:38   ` Alex Williamson
@ 2023-04-19 18:13     ` Reinette Chatre
  2023-04-19 21:38       ` Alex Williamson
  0 siblings, 1 reply; 23+ messages in thread
From: Reinette Chatre @ 2023-04-19 18:13 UTC (permalink / raw)
  To: Alex Williamson
  Cc: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, tglx, darwi,
	kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	linux-kernel

Hi Alex,

On 4/18/2023 3:38 PM, Alex Williamson wrote:
> On Tue, 18 Apr 2023 10:29:20 -0700
> Reinette Chatre <reinette.chatre@intel.com> wrote:
> 
>> Recently introduced pci_msix_alloc_irq_at() and pci_msix_free_irq()
>> enables an individual MSI-X interrupt to be allocated and freed after
>> MSI-X enabling.
>>
>> Use dynamic MSI-X (if supported by the device) to allocate an interrupt
>> after MSI-X is enabled. An MSI-X interrupt is dynamically allocated at
>> the time a valid eventfd is assigned. This is different behavior from
>> a range provided during MSI-X enabling where interrupts are allocated
>> for the entire range whether a valid eventfd is provided for each
>> interrupt or not.
>>
>> Do not dynamically free interrupts, leave that to when MSI-X is
>> disabled.
> 
> But we do, sometimes, even if essentially only on the error path.  Is
> that worthwhile?  It seems like we could entirely remove
> vfio_msi_free_irq() and rely only on pci_free_irq_vectors() on MSI/X
> teardown.

Yes, it is only on the error path where dynamic MSI-X interrupts are
removed. I do not know how to determine if it is worthwhile. On the
kernel side failure seems unlikely since it would mean memory cannot
be allocated or request_irq() failed. In these cases it may not be
worthwhile since user space may try again and having the interrupt
already allocated would be helpful. The remaining error seems to be
if user space provided an invalid eventfd. An allocation in response
to wrong user input is a concern to me. Should we consider
buggy/malicious users? I am uncertain here so would defer to your
guidance.

> I'd probably also add a comment in the commit log about the theory
> behind not dynamically freeing irqs, ie. latency, reliability, and
> whatever else we used to justify it.  Thanks,

Sure. How about something like below to replace the final sentence of
the changelog:

"When a guest disables an interrupt, user space (Qemu) does not
disable the interrupt but instead assigns it a different trigger. A
common flow is thus for the VFIO_DEVICE_SET_IRQS ioctl() to be 
used to assign a new eventfd to an already enabled interrupt. Freeing
and re-allocating an interrupt in this scenario would add unnecessary
latency as well as uncertainty since the re-allocation may fail. Do
not dynamically free interrupts when an interrupt is disabled, instead
support a subsequent re-enable to draw from the initial allocation when
possible. Leave freeing of interrupts to when MSI-X is disabled."

Reinette


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH V3 10/10] vfio/pci: Clear VFIO_IRQ_INFO_NORESIZE for MSI-X
  2023-04-18 22:38   ` Alex Williamson
@ 2023-04-19 18:13     ` Reinette Chatre
  0 siblings, 0 replies; 23+ messages in thread
From: Reinette Chatre @ 2023-04-19 18:13 UTC (permalink / raw)
  To: Alex Williamson
  Cc: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, tglx, darwi,
	kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	linux-kernel

Hi Alex,

On 4/18/2023 3:38 PM, Alex Williamson wrote:
> On Tue, 18 Apr 2023 10:29:21 -0700
> Reinette Chatre <reinette.chatre@intel.com> wrote:
> 

...

>> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
>> index a3635a8e54c8..4050ad3388c2 100644
>> --- a/drivers/vfio/pci/vfio_pci_core.c
>> +++ b/drivers/vfio/pci/vfio_pci_core.c
>> @@ -1114,7 +1114,9 @@ static int vfio_pci_ioctl_get_irq_info(struct vfio_pci_core_device *vdev,
>>  	if (info.index == VFIO_PCI_INTX_IRQ_INDEX)
>>  		info.flags |=
>>  			(VFIO_IRQ_INFO_MASKABLE | VFIO_IRQ_INFO_AUTOMASKED);
>> -	else
>> +	else if ((info.index != VFIO_PCI_MSIX_IRQ_INDEX) ||
>> +		 (info.index == VFIO_PCI_MSIX_IRQ_INDEX &&
>> +		  !vdev->has_dyn_msix))
> 
> Isn't this the same as:
> 
> 	(info.index != VFIO_PCI_MSIX_IRQ_INDEX || !vdev->has_dyn_msix)
> 

Yes, it is. Will fix. Thank you very much.

Reinette

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH V3 09/10] vfio/pci: Support dynamic MSI-X
  2023-04-19 18:13     ` Reinette Chatre
@ 2023-04-19 21:38       ` Alex Williamson
  2023-04-19 22:03         ` Reinette Chatre
  0 siblings, 1 reply; 23+ messages in thread
From: Alex Williamson @ 2023-04-19 21:38 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, tglx, darwi,
	kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	linux-kernel

On Wed, 19 Apr 2023 11:13:29 -0700
Reinette Chatre <reinette.chatre@intel.com> wrote:

> Hi Alex,
> 
> On 4/18/2023 3:38 PM, Alex Williamson wrote:
> > On Tue, 18 Apr 2023 10:29:20 -0700
> > Reinette Chatre <reinette.chatre@intel.com> wrote:
> >   
> >> Recently introduced pci_msix_alloc_irq_at() and pci_msix_free_irq()
> >> enables an individual MSI-X interrupt to be allocated and freed after
> >> MSI-X enabling.
> >>
> >> Use dynamic MSI-X (if supported by the device) to allocate an interrupt
> >> after MSI-X is enabled. An MSI-X interrupt is dynamically allocated at
> >> the time a valid eventfd is assigned. This is different behavior from
> >> a range provided during MSI-X enabling where interrupts are allocated
> >> for the entire range whether a valid eventfd is provided for each
> >> interrupt or not.
> >>
> >> Do not dynamically free interrupts, leave that to when MSI-X is
> >> disabled.  
> > 
> > But we do, sometimes, even if essentially only on the error path.  Is
> > that worthwhile?  It seems like we could entirely remove
> > vfio_msi_free_irq() and rely only on pci_free_irq_vectors() on MSI/X
> > teardown.  
> 
> Yes, it is only on the error path where dynamic MSI-X interrupts are
> removed. I do not know how to determine if it is worthwhile. On the
> kernel side failure seems unlikely since it would mean memory cannot
> be allocated or request_irq() failed. In these cases it may not be
> worthwhile since user space may try again and having the interrupt
> already allocated would be helpful. The remaining error seems to be
> if user space provided an invalid eventfd. An allocation in response
> to wrong user input is a concern to me. Should we consider
> buggy/malicious users? I am uncertain here so would defer to your
> guidance.

I don't really see that a malicious user can exploit anything here,
their irq allocation is bound by the device support and they're
entitled to make use of the full vector set of the device by virtue of
having ownership of the device.  All the MSI-X allocated irqs are freed
when the interrupt mode is changed or the device is released regardless.

The end result is also no different than if the user had not configured
the vector when enabling MSI-X or configured it and later de-configured
with a -1 eventfd.  The irq is allocated but not attached to a ctx.
We're intentionally using this as a cache.

Also, as implemented here in v3, we might be freeing from the original
allocation rather than a new, dynamically allocated irq.

My thinking is that if we think there's a benefit to caching any
allocated irqs, we should do so consistently.  We don't currently know
if the irq was allocated now or previously.  Tracking that would add
complexity for little benefit.  The user can get to the same end result
of an allocated, unused irq in numerous way, the state itself is not
erroneous, and is actually in support of caching irq allocations.
Removing the de-allocation of a single vector further simplifies the
code as there exists only one path where irqs are freed, ie.
pci_free_irq_vectors().

So I'd lean towards removing vfio_msi_free_irq().
 
> > I'd probably also add a comment in the commit log about the theory
> > behind not dynamically freeing irqs, ie. latency, reliability, and
> > whatever else we used to justify it.  Thanks,  
> 
> Sure. How about something like below to replace the final sentence of
> the changelog:
> 
> "When a guest disables an interrupt, user space (Qemu) does not
> disable the interrupt but instead assigns it a different trigger. A
> common flow is thus for the VFIO_DEVICE_SET_IRQS ioctl() to be 
> used to assign a new eventfd to an already enabled interrupt. Freeing
> and re-allocating an interrupt in this scenario would add unnecessary
> latency as well as uncertainty since the re-allocation may fail. Do
> not dynamically free interrupts when an interrupt is disabled, instead
> support a subsequent re-enable to draw from the initial allocation when
> possible. Leave freeing of interrupts to when MSI-X is disabled."

There are other means besides caching irqs that could achieve the above,
for instance if a trigger is simply swapped from one eventfd to another,
that all happens within vfio_msi_set_vector_signal() where we could
hold the irq for the transition.

I think I might justify it as:

	The PCI-MSIX API requires that some number of irqs are
	allocated for an initial set of vectors when enabling MSI-X on
	the device.  When dynamic MSIX allocation is not supported, the
	vector table, and thus the allocated irq set can only be resized
	by disabling and re-enabling MSIX with a different range.  In
	that case the irq allocation is essentially a cache for
	configuring vectors within the previously allocated vector
	range.  When dynamic MSIX allocation is supported, the API
	still requires some initial set of irqs to be allocated, but
	also supports allocating and freeing specific irq vectors both
	within and beyond the initially allocated range.

	For consistency between modes, as well as to reduce latency and
	improve reliability of allocations, and also simplicity, this
	implementation only releases irqs via pci_free_irq_vectors()
	when either the interrupt mode changes or the device is
	released.

Does that cover the key points for someone that might want to revisit
this decision later?  Thanks,

Alex


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH V3 09/10] vfio/pci: Support dynamic MSI-X
  2023-04-19 21:38       ` Alex Williamson
@ 2023-04-19 22:03         ` Reinette Chatre
  0 siblings, 0 replies; 23+ messages in thread
From: Reinette Chatre @ 2023-04-19 22:03 UTC (permalink / raw)
  To: Alex Williamson
  Cc: jgg, yishaih, shameerali.kolothum.thodi, kevin.tian, tglx, darwi,
	kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu, tom.zanussi,
	linux-kernel

Hi Alex,

On 4/19/2023 2:38 PM, Alex Williamson wrote:
> On Wed, 19 Apr 2023 11:13:29 -0700
> Reinette Chatre <reinette.chatre@intel.com> wrote:
>> On 4/18/2023 3:38 PM, Alex Williamson wrote:
>>> On Tue, 18 Apr 2023 10:29:20 -0700
>>> Reinette Chatre <reinette.chatre@intel.com> wrote:
>>>   
>>>> Recently introduced pci_msix_alloc_irq_at() and pci_msix_free_irq()
>>>> enables an individual MSI-X interrupt to be allocated and freed after
>>>> MSI-X enabling.
>>>>
>>>> Use dynamic MSI-X (if supported by the device) to allocate an interrupt
>>>> after MSI-X is enabled. An MSI-X interrupt is dynamically allocated at
>>>> the time a valid eventfd is assigned. This is different behavior from
>>>> a range provided during MSI-X enabling where interrupts are allocated
>>>> for the entire range whether a valid eventfd is provided for each
>>>> interrupt or not.
>>>>
>>>> Do not dynamically free interrupts, leave that to when MSI-X is
>>>> disabled.  
>>>
>>> But we do, sometimes, even if essentially only on the error path.  Is
>>> that worthwhile?  It seems like we could entirely remove
>>> vfio_msi_free_irq() and rely only on pci_free_irq_vectors() on MSI/X
>>> teardown.  
>>
>> Yes, it is only on the error path where dynamic MSI-X interrupts are
>> removed. I do not know how to determine if it is worthwhile. On the
>> kernel side failure seems unlikely since it would mean memory cannot
>> be allocated or request_irq() failed. In these cases it may not be
>> worthwhile since user space may try again and having the interrupt
>> already allocated would be helpful. The remaining error seems to be
>> if user space provided an invalid eventfd. An allocation in response
>> to wrong user input is a concern to me. Should we consider
>> buggy/malicious users? I am uncertain here so would defer to your
>> guidance.
> 
> I don't really see that a malicious user can exploit anything here,
> their irq allocation is bound by the device support and they're
> entitled to make use of the full vector set of the device by virtue of
> having ownership of the device.  All the MSI-X allocated irqs are freed
> when the interrupt mode is changed or the device is released regardless.
> 
> The end result is also no different than if the user had not configured
> the vector when enabling MSI-X or configured it and later de-configured
> with a -1 eventfd.  The irq is allocated but not attached to a ctx.
> We're intentionally using this as a cache.
> 
> Also, as implemented here in v3, we might be freeing from the original
> allocation rather than a new, dynamically allocated irq.

Great point.

> 
> My thinking is that if we think there's a benefit to caching any
> allocated irqs, we should do so consistently.  We don't currently know
> if the irq was allocated now or previously.  Tracking that would add
> complexity for little benefit.  The user can get to the same end result
> of an allocated, unused irq in numerous way, the state itself is not
> erroneous, and is actually in support of caching irq allocations.
> Removing the de-allocation of a single vector further simplifies the
> code as there exists only one path where irqs are freed, ie.
> pci_free_irq_vectors().
> 
> So I'd lean towards removing vfio_msi_free_irq().

Thank you for your detailed analysis. I understand and agree.
I will remove vfio_msi_free_irq().

>  
>>> I'd probably also add a comment in the commit log about the theory
>>> behind not dynamically freeing irqs, ie. latency, reliability, and
>>> whatever else we used to justify it.  Thanks,  
>>
>> Sure. How about something like below to replace the final sentence of
>> the changelog:
>>
>> "When a guest disables an interrupt, user space (Qemu) does not
>> disable the interrupt but instead assigns it a different trigger. A
>> common flow is thus for the VFIO_DEVICE_SET_IRQS ioctl() to be 
>> used to assign a new eventfd to an already enabled interrupt. Freeing
>> and re-allocating an interrupt in this scenario would add unnecessary
>> latency as well as uncertainty since the re-allocation may fail. Do
>> not dynamically free interrupts when an interrupt is disabled, instead
>> support a subsequent re-enable to draw from the initial allocation when
>> possible. Leave freeing of interrupts to when MSI-X is disabled."
> 
> There are other means besides caching irqs that could achieve the above,
> for instance if a trigger is simply swapped from one eventfd to another,
> that all happens within vfio_msi_set_vector_signal() where we could
> hold the irq for the transition.
> 
> I think I might justify it as:
> 
> 	The PCI-MSIX API requires that some number of irqs are
> 	allocated for an initial set of vectors when enabling MSI-X on
> 	the device.  When dynamic MSIX allocation is not supported, the
> 	vector table, and thus the allocated irq set can only be resized
> 	by disabling and re-enabling MSIX with a different range.  In
> 	that case the irq allocation is essentially a cache for
> 	configuring vectors within the previously allocated vector
> 	range.  When dynamic MSIX allocation is supported, the API
> 	still requires some initial set of irqs to be allocated, but
> 	also supports allocating and freeing specific irq vectors both
> 	within and beyond the initially allocated range.
> 
> 	For consistency between modes, as well as to reduce latency and
> 	improve reliability of allocations, and also simplicity, this
> 	implementation only releases irqs via pci_free_irq_vectors()
> 	when either the interrupt mode changes or the device is
> 	released.
> 
> Does that cover the key points for someone that might want to revisit
> this decision later?  Thanks,

It does so clearly, yes. Thank you so much for taking the time to
write this. I will include it in the changelog.

Reinette



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH V3 08/10] vfio/pci: Probe and store ability to support dynamic MSI-X
  2023-04-19 18:11     ` Reinette Chatre
@ 2023-04-24 17:43       ` Jason Gunthorpe
  2023-04-24 23:52         ` Reinette Chatre
  0 siblings, 1 reply; 23+ messages in thread
From: Jason Gunthorpe @ 2023-04-24 17:43 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Alex Williamson, yishaih, shameerali.kolothum.thodi, kevin.tian,
	tglx, darwi, kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu,
	tom.zanussi, linux-kernel

On Wed, Apr 19, 2023 at 11:11:48AM -0700, Reinette Chatre wrote:
> Hi Alex,
> 
> On 4/18/2023 3:38 PM, Alex Williamson wrote:
> > On Tue, 18 Apr 2023 10:29:19 -0700
> > Reinette Chatre <reinette.chatre@intel.com> wrote:
> > 
> 
> ...
> 
> >> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> >> index 148fd1ae6c1c..4f070f2d6fde 100644
> >> --- a/include/linux/vfio_pci_core.h
> >> +++ b/include/linux/vfio_pci_core.h
> >> @@ -67,6 +67,7 @@ struct vfio_pci_core_device {
> >>  	u8			msix_bar;
> >>  	u16			msix_size;
> >>  	u32			msix_offset;
> >> +	bool			has_dyn_msix;
> >>  	u32			rbar[7];
> >>  	bool			pci_2_3;
> >>  	bool			virq_disabled;
> > 
> > Nit, the whole data structure probably needs to be sorted with pahole,
> > but creating a hole here for locality to other msix fields should
> > probably be secondary to keeping the structure well packed, which
> > suggests including this new field among the bools below.  Thanks,
> 
> Thanks for catching this. Moving it one field lower as shown in the
> delta patch below seems to improve the layout:
> 
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index 4f070f2d6fde..d730d78754a2 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -67,8 +67,8 @@ struct vfio_pci_core_device {
>  	u8			msix_bar;
>  	u16			msix_size;
>  	u32			msix_offset;
> -	bool			has_dyn_msix;
>  	u32			rbar[7];
> +	bool			has_dyn_msix;
>  	bool			pci_2_3;
>  	bool			virq_disabled;
>  	bool			reset_works;

Also, Linus on record as strongly disliking these lists of bools

If they don't need read_once/etc stuff then use a list of bitfields

bool abc:1;
bool xyz:1;

Jason

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH V3 08/10] vfio/pci: Probe and store ability to support dynamic MSI-X
  2023-04-24 17:43       ` Jason Gunthorpe
@ 2023-04-24 23:52         ` Reinette Chatre
  2023-04-25 14:51           ` Jason Gunthorpe
  0 siblings, 1 reply; 23+ messages in thread
From: Reinette Chatre @ 2023-04-24 23:52 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, yishaih, shameerali.kolothum.thodi, kevin.tian,
	tglx, darwi, kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu,
	tom.zanussi, linux-kernel

Hi Jason,

On 4/24/2023 10:43 AM, Jason Gunthorpe wrote:
> On Wed, Apr 19, 2023 at 11:11:48AM -0700, Reinette Chatre wrote:
>> On 4/18/2023 3:38 PM, Alex Williamson wrote:
>>> On Tue, 18 Apr 2023 10:29:19 -0700
>>> Reinette Chatre <reinette.chatre@intel.com> wrote:

...

>> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
>> index 4f070f2d6fde..d730d78754a2 100644
>> --- a/include/linux/vfio_pci_core.h
>> +++ b/include/linux/vfio_pci_core.h
>> @@ -67,8 +67,8 @@ struct vfio_pci_core_device {
>>  	u8			msix_bar;
>>  	u16			msix_size;
>>  	u32			msix_offset;
>> -	bool			has_dyn_msix;
>>  	u32			rbar[7];
>> +	bool			has_dyn_msix;
>>  	bool			pci_2_3;
>>  	bool			virq_disabled;
>>  	bool			reset_works;
> 
> Also, Linus on record as strongly disliking these lists of bools

This looks like an example:
https://lkml.org/lkml/2017/11/21/384

> 
> If they don't need read_once/etc stuff then use a list of bitfields

I do not see any direct usage of read_once in the driver, but it is not
clear to me what falls under the "etc" umbrella. Do you consider all
the bools in struct vfio_pci_core_device to be candidates for 
transition?

> 
> bool abc:1;
> bool xyz:1;
> 

I think a base type of unsigned int since it appears to be the custom
and (if I understand correctly) was preferred at the time Linus wrote
the message I found.

Looking ahead there seems be be a bigger task here. A quick search
revealed a few other instances of vfio using "bool" in a struct. It
does not all qualify for your "lists of bools" comment, but they
may need a closer look because of the "please don't use "bool" in
structures at all" comment made by Linus in the email I found.

vfio_device::iommufd_attached
vfio_container::noiommu
vfio_platform_irq::masked
vfio_platform_device::reset_required
vfio_iommu::v2
vfio_iommu::nesting
vfio_iommu::dirty_page_tracking
vfio_dma::iommu_mapped
vfio_dma::lock_cap
vfio_dma::vaddr_invalid
vfio_iommu_group::pinned_page_dirty_scope
tce_container::enabled
tce_container::v2
tce_container::def_window_pending

Reinette

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH V3 08/10] vfio/pci: Probe and store ability to support dynamic MSI-X
  2023-04-24 23:52         ` Reinette Chatre
@ 2023-04-25 14:51           ` Jason Gunthorpe
  2023-04-25 16:52             ` Reinette Chatre
  0 siblings, 1 reply; 23+ messages in thread
From: Jason Gunthorpe @ 2023-04-25 14:51 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Alex Williamson, yishaih, shameerali.kolothum.thodi, kevin.tian,
	tglx, darwi, kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu,
	tom.zanussi, linux-kernel

On Mon, Apr 24, 2023 at 04:52:08PM -0700, Reinette Chatre wrote:
> Hi Jason,
> 
> On 4/24/2023 10:43 AM, Jason Gunthorpe wrote:
> > On Wed, Apr 19, 2023 at 11:11:48AM -0700, Reinette Chatre wrote:
> >> On 4/18/2023 3:38 PM, Alex Williamson wrote:
> >>> On Tue, 18 Apr 2023 10:29:19 -0700
> >>> Reinette Chatre <reinette.chatre@intel.com> wrote:
> 
> ...
> 
> >> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> >> index 4f070f2d6fde..d730d78754a2 100644
> >> --- a/include/linux/vfio_pci_core.h
> >> +++ b/include/linux/vfio_pci_core.h
> >> @@ -67,8 +67,8 @@ struct vfio_pci_core_device {
> >>  	u8			msix_bar;
> >>  	u16			msix_size;
> >>  	u32			msix_offset;
> >> -	bool			has_dyn_msix;
> >>  	u32			rbar[7];
> >> +	bool			has_dyn_msix;
> >>  	bool			pci_2_3;
> >>  	bool			virq_disabled;
> >>  	bool			reset_works;
> > 
> > Also, Linus on record as strongly disliking these lists of bools
> 
> This looks like an example:
> https://lkml.org/lkml/2017/11/21/384
> 
> > 
> > If they don't need read_once/etc stuff then use a list of bitfields
> 
> I do not see any direct usage of read_once in the driver, but it is not
> clear to me what falls under the "etc" umbrella.

Anything that might assume atomicity, smp_store_release, set_bit, and others

>  Do you consider all the bools in struct vfio_pci_core_device to be
> candidates for transition?

Yes, group them ito into a bitfield.

> I think a base type of unsigned int since it appears to be the custom
> and (if I understand correctly) was preferred at the time Linus wrote
> the message I found.

It doesn't matter a lot, using "bool" means the compiler adds extra
code to ensure "foo = 4" stores true, and the underyling size is not
well defined (but we don't care here)
 
> Looking ahead there seems be be a bigger task here. A quick search
> revealed a few other instances of vfio using "bool" in a struct. It
> does not all qualify for your "lists of bools" comment, but they
> may need a closer look because of the "please don't use "bool" in
> structures at all" comment made by Linus in the email I found.

IMHO bool is helpful for clarity, it says it is a flag. In these cases
we won't gain anything by using u8 instead

Lists of bools however start to get a little silly when we use maybe 4
bytes per bool (though x86-64 is using 1 byte in structs)

Jason

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH V3 08/10] vfio/pci: Probe and store ability to support dynamic MSI-X
  2023-04-25 14:51           ` Jason Gunthorpe
@ 2023-04-25 16:52             ` Reinette Chatre
  0 siblings, 0 replies; 23+ messages in thread
From: Reinette Chatre @ 2023-04-25 16:52 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, yishaih, shameerali.kolothum.thodi, kevin.tian,
	tglx, darwi, kvm, dave.jiang, jing2.liu, ashok.raj, fenghua.yu,
	tom.zanussi, linux-kernel

Hi Jason,

On 4/25/2023 7:51 AM, Jason Gunthorpe wrote:
> On Mon, Apr 24, 2023 at 04:52:08PM -0700, Reinette Chatre wrote:
>> On 4/24/2023 10:43 AM, Jason Gunthorpe wrote:
>>> On Wed, Apr 19, 2023 at 11:11:48AM -0700, Reinette Chatre wrote:
>>>> On 4/18/2023 3:38 PM, Alex Williamson wrote:
>>>>> On Tue, 18 Apr 2023 10:29:19 -0700
>>>>> Reinette Chatre <reinette.chatre@intel.com> wrote:
>>
>> ...
>>
>>>> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
>>>> index 4f070f2d6fde..d730d78754a2 100644
>>>> --- a/include/linux/vfio_pci_core.h
>>>> +++ b/include/linux/vfio_pci_core.h
>>>> @@ -67,8 +67,8 @@ struct vfio_pci_core_device {
>>>>  	u8			msix_bar;
>>>>  	u16			msix_size;
>>>>  	u32			msix_offset;
>>>> -	bool			has_dyn_msix;
>>>>  	u32			rbar[7];
>>>> +	bool			has_dyn_msix;
>>>>  	bool			pci_2_3;
>>>>  	bool			virq_disabled;
>>>>  	bool			reset_works;
>>>
>>> Also, Linus on record as strongly disliking these lists of bools
>>
>> This looks like an example:
>> https://lkml.org/lkml/2017/11/21/384
>>
>>>
>>> If they don't need read_once/etc stuff then use a list of bitfields
>>
>> I do not see any direct usage of read_once in the driver, but it is not
>> clear to me what falls under the "etc" umbrella.
> 
> Anything that might assume atomicity, smp_store_release, set_bit, and others
> 
>>  Do you consider all the bools in struct vfio_pci_core_device to be
>> candidates for transition?
> 
> Yes, group them ito into a bitfield.

Will do.

> 
>> I think a base type of unsigned int since it appears to be the custom
>> and (if I understand correctly) was preferred at the time Linus wrote
>> the message I found.
> 
> It doesn't matter a lot, using "bool" means the compiler adds extra
> code to ensure "foo = 4" stores true, and the underyling size is not
> well defined (but we don't care here)

Looking further outside that email thread I do see using base type of bool
is common. I will use that. Doing so also reduces the churn of this
transition since only the data structure changes, not the code. 
>> Looking ahead there seems be be a bigger task here. A quick search
>> revealed a few other instances of vfio using "bool" in a struct. It
>> does not all qualify for your "lists of bools" comment, but they
>> may need a closer look because of the "please don't use "bool" in
>> structures at all" comment made by Linus in the email I found.
> 
> IMHO bool is helpful for clarity, it says it is a flag. In these cases
> we won't gain anything by using u8 instead
> 
> Lists of bools however start to get a little silly when we use maybe 4
> bytes per bool (though x86-64 is using 1 byte in structs)
> 

Thank you very much for catching this and providing guidance. I plan to
include this change to struct vfio_pci_core_device as a prep
patch within this series.

Reinette


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2023-04-25 16:52 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-04-18 17:29 [PATCH V3 00/10] vfio/pci: Support dynamic allocation of MSI-X interrupts Reinette Chatre
2023-04-18 17:29 ` [PATCH V3 01/10] vfio/pci: Consolidate irq cleanup on MSI/MSI-X disable Reinette Chatre
2023-04-18 17:29 ` [PATCH V3 02/10] vfio/pci: Remove negative check on unsigned vector Reinette Chatre
2023-04-18 17:29 ` [PATCH V3 03/10] vfio/pci: Prepare for dynamic interrupt context storage Reinette Chatre
2023-04-18 17:29 ` [PATCH V3 04/10] vfio/pci: Move to single error path Reinette Chatre
2023-04-18 17:29 ` [PATCH V3 05/10] vfio/pci: Use xarray for interrupt context storage Reinette Chatre
2023-04-18 17:29 ` [PATCH V3 06/10] vfio/pci: Remove interrupt context counter Reinette Chatre
2023-04-18 17:29 ` [PATCH V3 07/10] vfio/pci: Update stale comment Reinette Chatre
2023-04-18 17:29 ` [PATCH V3 08/10] vfio/pci: Probe and store ability to support dynamic MSI-X Reinette Chatre
2023-04-18 22:38   ` Alex Williamson
2023-04-19 18:11     ` Reinette Chatre
2023-04-24 17:43       ` Jason Gunthorpe
2023-04-24 23:52         ` Reinette Chatre
2023-04-25 14:51           ` Jason Gunthorpe
2023-04-25 16:52             ` Reinette Chatre
2023-04-18 17:29 ` [PATCH V3 09/10] vfio/pci: Support " Reinette Chatre
2023-04-18 22:38   ` Alex Williamson
2023-04-19 18:13     ` Reinette Chatre
2023-04-19 21:38       ` Alex Williamson
2023-04-19 22:03         ` Reinette Chatre
2023-04-18 17:29 ` [PATCH V3 10/10] vfio/pci: Clear VFIO_IRQ_INFO_NORESIZE for MSI-X Reinette Chatre
2023-04-18 22:38   ` Alex Williamson
2023-04-19 18:13     ` Reinette Chatre

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox