Linux IOMMU Development
* [PATCH v3 0/2] iommu/amd: Fix GAM IRTEs affinity and GALog restart
@ 2023-04-19 20:11 Joao Martins
  2023-04-19 20:11 ` [PATCH v3 1/2] iommu/amd: Don't block updates to GATag if guest mode is on Joao Martins
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Joao Martins @ 2023-04-19 20:11 UTC
  To: iommu
  Cc: Joerg Roedel, Suravee Suthikulpanit, Vasant Hegde, Will Deacon,
	Robin Murphy, Alejandro Jimenez, kvm, Joao Martins

Hey,

This small series fixes a couple of bugs:

Patch 1) Fix affinity changes to already-in-guest-mode IRTEs which would
         otherwise be nops.

Patch 2) Handle the GALog overflow condition by restarting the log,
         similar to what we do for the event log.

Comments appreciated.

Thanks,
	Joao

Changes since v2[2]:
- Fix commit message spelling issues (Alexey, patch 1)
- Consolidate the modified check into one line (Sean, patch 1)
- Add Rb in patch 2 (Vasant Hegde)

Changes since v1[1]:
- Adjust commit message in first patch (Suravee)
- Add Rb in the first patch (Suravee)
- Add new patch 2 for handling GALog overflows

[0] https://lore.kernel.org/linux-iommu/b39d505c-8d2b-d90b-f52d-ceabde8225cf@oracle.com/
[1] https://lore.kernel.org/linux-iommu/20230208131938.39898-1-joao.m.martins@oracle.com/
[2] https://lore.kernel.org/linux-iommu/20230316200219.42673-1-joao.m.martins@oracle.com/

Joao Martins (2):
  iommu/amd: Don't block updates to GATag if guest mode is on
  iommu/amd: Handle GALog overflows

 drivers/iommu/amd/amd_iommu.h |  1 +
 drivers/iommu/amd/init.c      | 24 ++++++++++++++++++++++++
 drivers/iommu/amd/iommu.c     | 12 +++++++++---
 3 files changed, 34 insertions(+), 3 deletions(-)

-- 
2.17.2



* [PATCH v3 1/2] iommu/amd: Don't block updates to GATag if guest mode is on
  2023-04-19 20:11 [PATCH v3 0/2] iommu/amd: Fix GAM IRTEs affinity and GALog restart Joao Martins
@ 2023-04-19 20:11 ` Joao Martins
  2023-04-19 20:11 ` [PATCH v3 2/2] iommu/amd: Handle GALog overflows Joao Martins
  2023-05-22 15:16 ` [PATCH v3 0/2] iommu/amd: Fix GAM IRTEs affinity and GALog restart Joerg Roedel
  2 siblings, 0 replies; 4+ messages in thread
From: Joao Martins @ 2023-04-19 20:11 UTC
  To: iommu
  Cc: Joerg Roedel, Suravee Suthikulpanit, Vasant Hegde, Will Deacon,
	Robin Murphy, Alejandro Jimenez, kvm, Joao Martins

On KVM GSI routing table updates, especially for guests that have vIOMMUs
with interrupt remapping enabled (to boot setups with >255 vCPUs without
relying on KVM_FEATURE_MSI_EXT_DEST_ID), a VMM may update the backing VF
MSIs with a new vCPU affinity.

On AMD with AVIC enabled, the new vCPU affinity info is updated via:
	avic_pi_update_irte()
		irq_set_vcpu_affinity()
			amd_ir_set_vcpu_affinity()
				amd_iommu_{de}activate_guest_mode()

Here the IRTE[GATag] is updated with the new vCPU affinity. The GATag
contains the VM ID and vCPU ID, and is used by the IOMMU hardware to signal
KVM (via the GALog) when an interrupt cannot be delivered because the vCPU
is in a blocking state.

The issue is that amd_iommu_activate_guest_mode() essentially only
changes IRTE fields on transitions from non-guest-mode to guest-mode,
and otherwise returns *with no changes to the IRTE* for interrupts that
are already configured in guest mode. To the guest this means that the VF
interrupts remain affined to the vCPU they were first configured on, and
the guest will be unable to issue VF interrupts and will receive messages
like these from spurious interrupts (e.g. from waking the wrong vCPU in
GALog):

[  167.759472] __common_interrupt: 3.34 No irq handler for vector
[  230.680927] mlx5_core 0000:00:02.0: mlx5_cmd_eq_recover:247:(pid 3122): Recovered 1 EQEs on cmd_eq
[  230.681799] mlx5_core 0000:00:02.0: wait_func_handle_exec_timeout:1113:(pid 3122): cmd[0]: CREATE_CQ(0x400) recovered after timeout
[  230.683266] __common_interrupt: 3.34 No irq handler for vector

Given that amd_ir_set_vcpu_affinity() uses amd_iommu_activate_guest_mode()
underneath, this essentially means that vCPU affinity changes of IRTEs
are nops. Fix it by dropping the check for guest-mode in
amd_iommu_activate_guest_mode(). The same applies to
amd_iommu_deactivate_guest_mode(), although there, even if the IRTE doesn't
change the underlying DestID on the host, the VFIO IRQ handler will still be
able to poke at the right guest vCPU.
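
To make the failure mode concrete, here is a minimal userspace sketch of
the early-return behaviour described above. It is not the driver code:
struct model_irte and the model_* helpers are hypothetical stand-ins
invented for illustration, and the GATag values are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for the kernel's struct irte_ga. */
struct model_irte {
	bool     guest_mode;	/* entry is already in guest (vAPIC) mode */
	uint32_t ga_tag;	/* encodes VM ID and vCPU ID */
};

/* Old behaviour: bail out when the entry is already in guest mode. */
static int model_activate_old(struct model_irte *entry, uint32_t new_tag)
{
	if (!entry || entry->guest_mode)
		return 0;	/* GATag is silently left stale */

	entry->guest_mode = true;
	entry->ga_tag = new_tag;
	return 0;
}

/* New behaviour: only a missing entry short-circuits the update. */
static int model_activate_new(struct model_irte *entry, uint32_t new_tag)
{
	if (!entry)
		return 0;

	entry->guest_mode = true;
	entry->ga_tag = new_tag;
	return 0;
}

/* Returns true when a second affinity update actually lands in the entry. */
static bool model_retarget_sticks(int (*activate)(struct model_irte *, uint32_t))
{
	struct model_irte e = { false, 0 };

	assert(activate != NULL);
	activate(&e, 0x0101);	/* first configuration: enters guest mode */
	activate(&e, 0x0102);	/* affinity change while in guest mode */
	return e.ga_tag == 0x0102;
}
```

With these helpers, model_retarget_sticks(model_activate_old) reports that
the second update is lost, while model_retarget_sticks(model_activate_new)
reports that it is applied.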

Fixes: b9c6ff94e43a ("iommu/amd: Re-factor guest virtual APIC (de-)activation code")
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/iommu.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 5a505ba5467e..fbe77ee2d26c 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3484,8 +3484,7 @@ int amd_iommu_activate_guest_mode(void *data)
 	struct irte_ga *entry = (struct irte_ga *) ir_data->entry;
 	u64 valid;
 
-	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) ||
-	    !entry || entry->lo.fields_vapic.guest_mode)
+	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) || !entry)
 		return 0;
 
 	valid = entry->lo.fields_vapic.valid;
-- 
2.17.2



* [PATCH v3 2/2] iommu/amd: Handle GALog overflows
  2023-04-19 20:11 [PATCH v3 0/2] iommu/amd: Fix GAM IRTEs affinity and GALog restart Joao Martins
  2023-04-19 20:11 ` [PATCH v3 1/2] iommu/amd: Don't block updates to GATag if guest mode is on Joao Martins
@ 2023-04-19 20:11 ` Joao Martins
  2023-05-22 15:16 ` [PATCH v3 0/2] iommu/amd: Fix GAM IRTEs affinity and GALog restart Joerg Roedel
  2 siblings, 0 replies; 4+ messages in thread
From: Joao Martins @ 2023-04-19 20:11 UTC
  To: iommu
  Cc: Joerg Roedel, Suravee Suthikulpanit, Vasant Hegde, Will Deacon,
	Robin Murphy, Alejandro Jimenez, kvm, Joao Martins

GALog exists to propagate interrupts into all vCPUs in the system when
interrupts are marked as non-running (e.g. when vCPUs aren't running). A
GALog overflow happens when there is no space left in the log to record the
GATag of the interrupt. So when the GALOverflow condition happens, the
GALog queue is processed and the GALog is restarted, as the IOMMU
manual indicates in section "2.7.4 Guest Virtual APIC Log Restart
Procedure":

| * Wait until MMIO Offset 2020h[GALogRun]=0b so that all request
|   entries are completed as circumstances allow. GALogRun must be 0b to
|   modify the guest virtual APIC log registers safely.
| * Write MMIO Offset 0018h[GALogEn]=0b.
| * As necessary, change the following values (e.g., to relocate or
|   resize the guest virtual APIC event log):
|   - the Guest Virtual APIC Log Base Address Register
|      [MMIO Offset 00E0h],
|   - the Guest Virtual APIC Log Head Pointer Register
|      [MMIO Offset 2040h][GALogHead], and
|   - the Guest Virtual APIC Log Tail Pointer Register
|      [MMIO Offset 2048h][GALogTail].
| * Write MMIO Offset 2020h[GALOverflow] = 1b to clear the bit (W1C).
| * Write MMIO Offset 0018h[GALogEn] = 1b, and either set
|   MMIO Offset 0018h[GAIntEn] to enable the GA log interrupt or clear
|   the bit to disable it.
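
The quoted procedure can be sketched against a simulated register pair,
mainly to show the ordering and the write-1-to-clear step. This is a
userspace model under stated assumptions, not driver code: struct
sim_iommu, the sim_* helpers and the SIM_* bit positions are illustrative
names and values, and the model does not emulate the hardware setting
GALogRun again once the log is re-enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit positions; consult the driver headers for the real masks. */
#define SIM_STATUS_GALOG_RUN		(1u << 8)
#define SIM_STATUS_GALOG_OVERFLOW	(1u << 9)
#define SIM_CTRL_GALOG_EN		(1u << 28)
#define SIM_CTRL_GAINT_EN		(1u << 29)

struct sim_iommu {
	uint32_t status;	/* models MMIO Offset 2020h */
	uint32_t control;	/* models the low 32 bits of MMIO Offset 0018h */
};

/* The overflow bit is write-1-to-clear: writing 1 clears it, 0 is ignored. */
static void sim_write_status(struct sim_iommu *iommu, uint32_t val)
{
	iommu->status &= ~(val & SIM_STATUS_GALOG_OVERFLOW);
}

/* Returns true if the log was restarted, false if GALogRun was still set. */
static bool sim_restart_ga_log(struct sim_iommu *iommu)
{
	assert(iommu != NULL);

	/* GALogRun must read 0b before the log registers may be touched. */
	if (iommu->status & SIM_STATUS_GALOG_RUN)
		return false;

	/* Disable the log and its interrupt. */
	iommu->control &= ~(SIM_CTRL_GALOG_EN | SIM_CTRL_GAINT_EN);

	/* Clear GALOverflow (W1C); base/head/tail are left unchanged. */
	sim_write_status(iommu, SIM_STATUS_GALOG_OVERFLOW);

	/* Re-enable the interrupt and the log. */
	iommu->control |= SIM_CTRL_GAINT_EN | SIM_CTRL_GALOG_EN;
	return true;
}
```

Calling sim_restart_ga_log() on a model whose status has only the overflow
bit set clears that bit and leaves both enable bits set; with the run bit
still set, the call returns false and changes nothing.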

Failing to handle the GALog overflow means that none of the VFs (in any
guest) will work with IOMMU AVIC, forcing the user to power cycle the
host. When handling the event, resume the GALog without resizing it,
much like what is done in the event log overflow handler. The
[MMIO Offset 2020h][GALOverflow] bit might be set in the status register
without the [MMIO Offset 2020h][GAInt] bit, so when deciding to poll
for GA events (to clear space in the GALog), also check the overflow
bit.
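
The status-bit decision described above can be written as a small pure
function. This is only a sketch: struct ex_ga_actions, ex_ga_actions_for()
and the EX_* bit positions are names and values assumed for illustration,
not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions; consult the driver headers for the real masks. */
#define EX_STATUS_GALOG_OVERFLOW	(1u << 9)
#define EX_STATUS_GALOG_INT		(1u << 10)

static_assert(EX_STATUS_GALOG_OVERFLOW != EX_STATUS_GALOG_INT,
	      "the two status bits must be distinct");

/* What the interrupt thread should do for a given status snapshot. */
struct ex_ga_actions {
	bool poll_log;		/* drain pending GA log entries */
	bool restart_log;	/* run the restart procedure afterwards */
};

static struct ex_ga_actions ex_ga_actions_for(uint32_t status)
{
	struct ex_ga_actions a;

	/* Overflow may be set without GAInt, so either bit triggers a poll. */
	a.poll_log = (status & (EX_STATUS_GALOG_INT |
				EX_STATUS_GALOG_OVERFLOW)) != 0;

	/* Only an overflow requires restarting the log. */
	a.restart_log = (status & EX_STATUS_GALOG_OVERFLOW) != 0;
	return a;
}
```

Polling before restarting matters here: draining the log first frees the
space whose absence caused the overflow.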

[suravee: Check for GAOverflow without GAInt, toggle CONTROL_GAINT_EN]
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu.h |  1 +
 drivers/iommu/amd/init.c      | 24 ++++++++++++++++++++++++
 drivers/iommu/amd/iommu.c     |  9 ++++++++-
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index c160a332ce33..24c7e6c6c0de 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -15,6 +15,7 @@ extern irqreturn_t amd_iommu_int_thread(int irq, void *data);
 extern irqreturn_t amd_iommu_int_handler(int irq, void *data);
 extern void amd_iommu_apply_erratum_63(struct amd_iommu *iommu, u16 devid);
 extern void amd_iommu_restart_event_logging(struct amd_iommu *iommu);
+extern void amd_iommu_restart_ga_log(struct amd_iommu *iommu);
 extern int amd_iommu_init_devices(void);
 extern void amd_iommu_uninit_devices(void);
 extern void amd_iommu_init_notifier(void);
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 19a46b9f7357..fd487c33b28a 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -751,6 +751,30 @@ void amd_iommu_restart_event_logging(struct amd_iommu *iommu)
 	iommu_feature_enable(iommu, CONTROL_EVT_LOG_EN);
 }
 
+/*
+ * This function restarts GA logging in case the IOMMU experienced
+ * a GA log overflow.
+ */
+void amd_iommu_restart_ga_log(struct amd_iommu *iommu)
+{
+	u32 status;
+
+	status = readl(iommu->mmio_base + MMIO_STATUS_OFFSET);
+	if (status & MMIO_STATUS_GALOG_RUN_MASK)
+		return;
+
+	pr_info_ratelimited("IOMMU GA Log restarting\n");
+
+	iommu_feature_disable(iommu, CONTROL_GALOG_EN);
+	iommu_feature_disable(iommu, CONTROL_GAINT_EN);
+
+	writel(MMIO_STATUS_GALOG_OVERFLOW_MASK,
+	       iommu->mmio_base + MMIO_STATUS_OFFSET);
+
+	iommu_feature_enable(iommu, CONTROL_GAINT_EN);
+	iommu_feature_enable(iommu, CONTROL_GALOG_EN);
+}
+
 /*
  * This function resets the command buffer if the IOMMU stopped fetching
  * commands from it.
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index fbe77ee2d26c..b6f52f5529eb 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -845,6 +845,7 @@ amd_iommu_set_pci_msi_domain(struct device *dev, struct amd_iommu *iommu) { }
 	(MMIO_STATUS_EVT_OVERFLOW_INT_MASK | \
 	 MMIO_STATUS_EVT_INT_MASK | \
 	 MMIO_STATUS_PPR_INT_MASK | \
+	 MMIO_STATUS_GALOG_OVERFLOW_MASK | \
 	 MMIO_STATUS_GALOG_INT_MASK)
 
 irqreturn_t amd_iommu_int_thread(int irq, void *data)
@@ -868,10 +869,16 @@ irqreturn_t amd_iommu_int_thread(int irq, void *data)
 		}
 
 #ifdef CONFIG_IRQ_REMAP
-		if (status & MMIO_STATUS_GALOG_INT_MASK) {
+		if (status & (MMIO_STATUS_GALOG_INT_MASK |
+			      MMIO_STATUS_GALOG_OVERFLOW_MASK)) {
 			pr_devel("Processing IOMMU GA Log\n");
 			iommu_poll_ga_log(iommu);
 		}
+
+		if (status & MMIO_STATUS_GALOG_OVERFLOW_MASK) {
+			pr_info_ratelimited("IOMMU GA Log overflow\n");
+			amd_iommu_restart_ga_log(iommu);
+		}
 #endif
 
 		if (status & MMIO_STATUS_EVT_OVERFLOW_INT_MASK) {
-- 
2.17.2



* Re: [PATCH v3 0/2] iommu/amd: Fix GAM IRTEs affinity and GALog restart
  2023-04-19 20:11 [PATCH v3 0/2] iommu/amd: Fix GAM IRTEs affinity and GALog restart Joao Martins
  2023-04-19 20:11 ` [PATCH v3 1/2] iommu/amd: Don't block updates to GATag if guest mode is on Joao Martins
  2023-04-19 20:11 ` [PATCH v3 2/2] iommu/amd: Handle GALog overflows Joao Martins
@ 2023-05-22 15:16 ` Joerg Roedel
  2 siblings, 0 replies; 4+ messages in thread
From: Joerg Roedel @ 2023-05-22 15:16 UTC
  To: Joao Martins
  Cc: iommu, Suravee Suthikulpanit, Vasant Hegde, Will Deacon,
	Robin Murphy, Alejandro Jimenez, kvm

On Wed, Apr 19, 2023 at 09:11:52PM +0100, Joao Martins wrote:
> Joao Martins (2):
>   iommu/amd: Don't block updates to GATag if guest mode is on
>   iommu/amd: Handle GALog overflows
> 
>  drivers/iommu/amd/amd_iommu.h |  1 +
>  drivers/iommu/amd/init.c      | 24 ++++++++++++++++++++++++
>  drivers/iommu/amd/iommu.c     | 12 +++++++++---
>  3 files changed, 34 insertions(+), 3 deletions(-)

Applied for 6.4, thanks.

