From: Lennert Buytenhek <buytenh@wantstofly.org>
To: David Woodhouse <dwmw2@infradead.org>,
	Lu Baolu <baolu.lu@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>, Will Deacon <will@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH,RFC] iommu/vt-d: Convert dmar_fault IRQ to a threaded IRQ
Date: Tue, 25 Oct 2022 11:08:13 +0300
Message-ID: <Y1eZbXKdJDoS8loC@wantstofly.org>

Under a high enough I/O page fault load, the dmar_fault hardirq handler
can end up starving other tasks that want to run on the CPU that the
IRQ is being routed to.  On an i7-6700 CPU this seems to happen at
around 2.5 million I/O page faults per second, and at a fraction of
that rate on some of the lower-end CPUs that we use.

An I/O page fault rate of 2.5 million per second may seem like a very
high number, but when we get an I/O page fault for every cache line
touched by a DMA operation, this rate can be the result of a confused
PCIe device DMAing to RAM at 2.5 million * 64 bytes = 160 MB/s, which
is not an unusual DMA rate.  And, in fact, when we do see PCIe devices
getting confused like this, this sort of I/O page fault rate is not
uncommon.

A peripheral device continuously triggering I/O page faults by DMAing
to RAM at 160 MB/s is inarguably a bug, either in the kernel driver for
the device or in the firmware for the device, and should be fixed
there, but it's the sort of bug that iommu/vt-d could be handling
better than it currently does, and there is a fairly simple way to
achieve that.

This patch changes the dmar_fault IRQ handler to be a threaded IRQ
handler.  This is a pretty minimal code change, and comes with the
advantage that Intel IOMMU I/O page fault handling work is now subject
to RT throttling, which allows it to be kept under control using the
sched_rt_period_us / sched_rt_runtime_us parameters.

iommu/amd already uses a threaded IRQ handler for its I/O page fault
reporting, and so it already has this advantage.

When IRQ remapping is enabled, iommu/vt-d will try to set up its
dmar_fault IRQ handler from start_kernel() -> x86_late_time_init()
-> apic_intr_mode_init() -> apic_bsp_setup() ->
irq_remap_enable_fault_handling() -> enable_drhd_fault_handling(),
which happens before kthreadd is started, and trying to set up a
threaded IRQ handler this early on will oops.  However, there
doesn't seem to be a reason why iommu/vt-d needs to set up its fault
reporting IRQ handler this early, and if we remove the IRQ setup code
from enable_drhd_fault_handling(), the IRQ will be registered instead
from pci_iommu_init() -> intel_iommu_init() -> init_dmars(), which
seems to work just fine.

Suggested-by: Scarlett Gourley <scarlett@arista.com>
Suggested-by: James Sewart <jamessewart@arista.com>
Suggested-by: Jack O'Sullivan <jack@arista.com>
Signed-off-by: Lennert Buytenhek <buytenh@arista.com>
---
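Note (not part of the commit message): to get a feel for how much CPU
time the threaded handler could consume once it is subject to RT
throttling, the two sysctls mentioned above can be turned into a
budget.  The following is only a rough userspace sketch.  The helper
names read_long_from_file() and rt_budget_permille() are hypothetical,
and it assumes the sysctls are exposed under /proc/sys/kernel/ and
that a negative sched_rt_runtime_us means throttling is disabled.  The
budget is shared by all realtime tasks, not reserved for the
dmar_fault thread.

```c
#include <stdio.h>

/*
 * Read a single integer from a file such as
 * /proc/sys/kernel/sched_rt_runtime_us.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
static int read_long_from_file(const char *path, long *out)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;

	ok = (fscanf(f, "%ld", out) == 1);
	fclose(f);

	return ok ? 0 : -1;
}

/*
 * Share of each period that realtime tasks may run for, in permille
 * (parts per thousand), rounded down.
 *
 * Assumption: a negative runtime means throttling is disabled, which
 * is reported here as the full 1000 permille.  Returns -1 for a
 * non-positive period.
 */
static long rt_budget_permille(long period_us, long runtime_us)
{
	if (period_us <= 0)
		return -1;
	if (runtime_us < 0 || runtime_us >= period_us)
		return 1000;

	return (long)((long long)runtime_us * 1000 / period_us);
}
```

A caller would read sched_rt_period_us and sched_rt_runtime_us with
read_long_from_file() and pass both values to rt_budget_permille().
With a period of 1000000 us and a runtime of 950000 us, the result is
950, i.e. realtime tasks, including the fault handling thread, can
use at most 95% of each period on that CPU.
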
 drivers/iommu/intel/dmar.c | 27 ++-------------------------
 1 file changed, 2 insertions(+), 25 deletions(-)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 5a8f780e7ffd..d0871fe9d04d 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -2043,7 +2043,8 @@ int dmar_set_interrupt(struct intel_iommu *iommu)
 		return -EINVAL;
 	}
 
-	ret = request_irq(irq, dmar_fault, IRQF_NO_THREAD, iommu->name, iommu);
+	ret = request_threaded_irq(irq, NULL, dmar_fault, IRQF_ONESHOT,
+				   iommu->name, iommu);
 	if (ret)
 		pr_err("Can't request irq\n");
 	return ret;
@@ -2051,30 +2052,6 @@ int dmar_set_interrupt(struct intel_iommu *iommu)
 
 int __init enable_drhd_fault_handling(void)
 {
-	struct dmar_drhd_unit *drhd;
-	struct intel_iommu *iommu;
-
-	/*
-	 * Enable fault control interrupt.
-	 */
-	for_each_iommu(iommu, drhd) {
-		u32 fault_status;
-		int ret = dmar_set_interrupt(iommu);
-
-		if (ret) {
-			pr_err("DRHD %Lx: failed to enable fault, interrupt, ret %d\n",
-			       (unsigned long long)drhd->reg_base_addr, ret);
-			return -1;
-		}
-
-		/*
-		 * Clear any previous faults.
-		 */
-		dmar_fault(iommu->irq, iommu);
-		fault_status = readl(iommu->reg + DMAR_FSTS_REG);
-		writel(fault_status, iommu->reg + DMAR_FSTS_REG);
-	}
-
 	return 0;
 }
 
-- 
2.37.3
