From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Joao Martins <joao.m.martins@oracle.com>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	Vasant Hegde <vasant.hegde@amd.com>,
	Joerg Roedel <jroedel@suse.de>, Sasha Levin <sashal@kernel.org>,
	joro@8bytes.org, will@kernel.org, iommu@lists.linux.dev
Subject: [PATCH AUTOSEL 6.3 02/19] iommu/amd: Handle GALog overflows
Date: Thu, 15 Jun 2023 07:37:02 -0400
Message-ID: <20230615113719.648862-2-sashal@kernel.org>
In-Reply-To: <20230615113719.648862-1-sashal@kernel.org>

From: Joao Martins <joao.m.martins@oracle.com>

[ Upstream commit af47b0a24058e56e983881993752f88288ca6511 ]

GALog exists to propagate interrupts into all vCPUs in the system when
interrupts are marked as non-running (e.g. when vCPUs aren't running). A
GALog overflow happens when there's no space in the log to record the
GATag of the interrupt. So when the GALOverflow condition happens, the
GALog queue is processed and the GALog is restarted, as the IOMMU
manual indicates in section "2.7.4 Guest Virtual APIC Log Restart
Procedure":

| * Wait until MMIO Offset 2020h[GALogRun]=0b so that all request
|   entries are completed as circumstances allow. GALogRun must be 0b to
|   modify the guest virtual APIC log registers safely.
| * Write MMIO Offset 0018h[GALogEn]=0b.
| * As necessary, change the following values (e.g., to relocate or
|   resize the guest virtual APIC event log):
|   - the Guest Virtual APIC Log Base Address Register
|      [MMIO Offset 00E0h],
|   - the Guest Virtual APIC Log Head Pointer Register
|      [MMIO Offset 2040h][GALogHead], and
|   - the Guest Virtual APIC Log Tail Pointer Register
|      [MMIO Offset 2048h][GALogTail].
| * Write MMIO Offset 2020h[GALOverflow] = 1b to clear the bit (W1C).
| * Write MMIO Offset 0018h[GALogEn] = 1b, and either set
|   MMIO Offset 0018h[GAIntEn] to enable the GA log interrupt or clear
|   the bit to disable it.
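
The restart steps quoted above can be sketched in user space against a
simulated register pair. This is only an illustration, not the driver
code: struct sim_iommu, the sim_* helpers and the SIM_* bit positions are
hypothetical names and values chosen for the sketch, and the model only
captures the GALogRun check, the enable toggling and the W1C behaviour of
the overflow bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen for this sketch only. */
#define SIM_STATUS_GALOG_RUN      (1u << 0)
#define SIM_STATUS_GALOG_OVERFLOW (1u << 1)
#define SIM_CTRL_GALOG_EN         (1u << 0)
#define SIM_CTRL_GAINT_EN         (1u << 1)

/* Stand-in for the status and control MMIO registers. */
struct sim_iommu {
	uint32_t status;
	uint32_t control;
};

/* Model a status write where the overflow bit is write-1-to-clear. */
static void sim_status_write(struct sim_iommu *io, uint32_t val)
{
	io->status &= ~(val & SIM_STATUS_GALOG_OVERFLOW);
}

/*
 * Follow the quoted procedure without relocating or resizing the log.
 * Returns false, changing nothing, if the log is still running.
 */
static bool sim_restart_ga_log(struct sim_iommu *io)
{
	if (io->status & SIM_STATUS_GALOG_RUN)
		return false;

	/* Disable logging and its interrupt before touching the status. */
	io->control &= ~SIM_CTRL_GALOG_EN;
	io->control &= ~SIM_CTRL_GAINT_EN;

	/* Clear the overflow condition by writing 1 to the W1C bit. */
	sim_status_write(io, SIM_STATUS_GALOG_OVERFLOW);

	/* Re-enable the interrupt and the log. */
	io->control |= SIM_CTRL_GAINT_EN;
	io->control |= SIM_CTRL_GALOG_EN;
	return true;
}
```

In this model, a call with the overflow bit set and the run bit clear
leaves the overflow bit cleared and both enable bits set, while a call
with the run bit still set leaves the registers untouched.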

Failing to handle the GALog overflow means that none of the VFs (in any
guest) will work with IOMMU AVIC, forcing the user to power cycle the
host. When handling the event, the GALog is resumed without resizing,
much like how the event log overflow is handled. The
[MMIO Offset 2020h][GALOverflow] bit might be set in the status register
without the [MMIO Offset 2020h][GAInt] bit, so when deciding to poll
for GA events (to clear space in the GALog), also check the overflow
bit.
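
The decision described above (poll on either bit, restart only on
overflow) can be sketched as a pure function over a status value. This is
an illustrative model, not the interrupt thread itself: sim_ga_actions(),
the SIM_ACT_* flags and the SIM_STATUS_* bit positions are hypothetical
names and values used only here.

```c
#include <stdint.h>

/* Hypothetical bit positions, chosen for this sketch only. */
#define SIM_STATUS_GALOG_OVERFLOW (1u << 1)
#define SIM_STATUS_GALOG_INT      (1u << 2)

/* Actions the handler would take for a given status value. */
#define SIM_ACT_POLL    (1u << 0)
#define SIM_ACT_RESTART (1u << 1)

static unsigned int sim_ga_actions(uint32_t status)
{
	unsigned int act = 0;

	/* Drain the log on a GA interrupt or on overflow alone. */
	if (status & (SIM_STATUS_GALOG_INT | SIM_STATUS_GALOG_OVERFLOW))
		act |= SIM_ACT_POLL;

	/* Restart the log only when the overflow bit is set. */
	if (status & SIM_STATUS_GALOG_OVERFLOW)
		act |= SIM_ACT_RESTART;

	return act;
}
```

With overflow set and the interrupt bit clear, the function still
requests a poll followed by a restart, which is the case the original
check on the interrupt bit alone would have missed.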

[suravee: Check for GALOverflow without GAInt, toggle CONTROL_GAINT_EN]

Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20230419201154.83880-3-joao.m.martins@oracle.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/iommu/amd/amd_iommu.h |  1 +
 drivers/iommu/amd/init.c      | 24 ++++++++++++++++++++++++
 drivers/iommu/amd/iommu.c     |  9 ++++++++-
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index c160a332ce339..24c7e6c6c0de9 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -15,6 +15,7 @@ extern irqreturn_t amd_iommu_int_thread(int irq, void *data);
 extern irqreturn_t amd_iommu_int_handler(int irq, void *data);
 extern void amd_iommu_apply_erratum_63(struct amd_iommu *iommu, u16 devid);
 extern void amd_iommu_restart_event_logging(struct amd_iommu *iommu);
+extern void amd_iommu_restart_ga_log(struct amd_iommu *iommu);
 extern int amd_iommu_init_devices(void);
 extern void amd_iommu_uninit_devices(void);
 extern void amd_iommu_init_notifier(void);
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 19a46b9f73574..fd487c33b28aa 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -751,6 +751,30 @@ void amd_iommu_restart_event_logging(struct amd_iommu *iommu)
 	iommu_feature_enable(iommu, CONTROL_EVT_LOG_EN);
 }
 
+/*
+ * This function restarts GA logging in case the IOMMU experienced
+ * a GA log overflow.
+ */
+void amd_iommu_restart_ga_log(struct amd_iommu *iommu)
+{
+	u32 status;
+
+	status = readl(iommu->mmio_base + MMIO_STATUS_OFFSET);
+	if (status & MMIO_STATUS_GALOG_RUN_MASK)
+		return;
+
+	pr_info_ratelimited("IOMMU GA Log restarting\n");
+
+	iommu_feature_disable(iommu, CONTROL_GALOG_EN);
+	iommu_feature_disable(iommu, CONTROL_GAINT_EN);
+
+	writel(MMIO_STATUS_GALOG_OVERFLOW_MASK,
+	       iommu->mmio_base + MMIO_STATUS_OFFSET);
+
+	iommu_feature_enable(iommu, CONTROL_GAINT_EN);
+	iommu_feature_enable(iommu, CONTROL_GALOG_EN);
+}
+
 /*
  * This function resets the command buffer if the IOMMU stopped fetching
  * commands from it.
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 167da5b1a5e31..3f2355c377630 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -845,6 +845,7 @@ amd_iommu_set_pci_msi_domain(struct device *dev, struct amd_iommu *iommu) { }
 	(MMIO_STATUS_EVT_OVERFLOW_INT_MASK | \
 	 MMIO_STATUS_EVT_INT_MASK | \
 	 MMIO_STATUS_PPR_INT_MASK | \
+	 MMIO_STATUS_GALOG_OVERFLOW_MASK | \
 	 MMIO_STATUS_GALOG_INT_MASK)
 
 irqreturn_t amd_iommu_int_thread(int irq, void *data)
@@ -868,10 +869,16 @@ irqreturn_t amd_iommu_int_thread(int irq, void *data)
 		}
 
 #ifdef CONFIG_IRQ_REMAP
-		if (status & MMIO_STATUS_GALOG_INT_MASK) {
+		if (status & (MMIO_STATUS_GALOG_INT_MASK |
+			      MMIO_STATUS_GALOG_OVERFLOW_MASK)) {
 			pr_devel("Processing IOMMU GA Log\n");
 			iommu_poll_ga_log(iommu);
 		}
+
+		if (status & MMIO_STATUS_GALOG_OVERFLOW_MASK) {
+			pr_info_ratelimited("IOMMU GA Log overflow\n");
+			amd_iommu_restart_ga_log(iommu);
+		}
 #endif
 
 		if (status & MMIO_STATUS_EVT_OVERFLOW_INT_MASK) {
-- 
2.39.2

