From: Baolu Lu <baolu.lu@linux.intel.com>
To: Lennert Buytenhek <buytenh@wantstofly.org>,
David Woodhouse <dwmw2@infradead.org>
Cc: baolu.lu@linux.intel.com, Joerg Roedel <joro@8bytes.org>,
Will Deacon <will@kernel.org>,
Robin Murphy <robin.murphy@arm.com>,
iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH,RFC] iommu/vt-d: Convert dmar_fault IRQ to a threaded IRQ
Date: Wed, 26 Oct 2022 10:10:29 +0800
Message-ID: <028e2c63-939b-af31-88b9-b479b41ce67c@linux.intel.com>
In-Reply-To: <Y1eZbXKdJDoS8loC@wantstofly.org>
On 10/25/22 4:08 PM, Lennert Buytenhek wrote:
> Under a high enough I/O page fault load, the dmar_fault hardirq handler
> can end up starving other tasks that wanted to run on the CPU that the
> IRQ is being routed to. On an i7-6700 CPU this seems to happen at
> around 2.5 million I/O page faults per second, and at a fraction of
> that rate on some of the lower-end CPUs that we use.
>
> An I/O page fault rate of 2.5 million per second may seem like a very
> high number, but when we get an I/O page fault for every cache line
> touched by a DMA operation, this I/O page fault rate can be the result
> of a confused PCIe device DMAing to RAM at 2.5 * 64 = 160 MB/sec, which
> is not an unlikely rate to be DMAing things to RAM at. And, in fact,
> when we do see PCIe devices getting confused like this, this sort of
> I/O page fault rate is not uncommon.
>
> A peripheral device continuously DMAing to RAM at 160 MB/s is
> inarguably a bug, either in the kernel driver for the device or in the
> firmware for the device, and should be fixed there, but it's the sort
> of bug that iommu/vt-d could be handling better than it currently does,
> and there is a fairly simple way to achieve that.
>
> This patch changes the dmar_fault IRQ handler to be a threaded IRQ
> handler. This is a pretty minimal code change, and comes with the
> advantage that Intel IOMMU I/O page fault handling work is now subject
> to RT throttling, which allows it to be kept under control using the
> sched_rt_period_us / sched_rt_runtime_us parameters.
Thanks for the patch! I like it, but also have some concerns.
If you look at the commit history, you will find that the opposite
change took place 10+ years ago.
commit 477694e71113fd0694b6bb0bcc2d006b8ac62691
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Tue Jul 19 16:25:42 2011 +0200

    x86, iommu: Mark DMAR IRQ as non-threaded

    Mark this lowlevel IRQ handler as non-threaded. This prevents a boot
    crash when "threadirqs" is on the kernel commandline. Also the
    interrupt handler is handling hardware critical events which should
    not be delayed into a thread.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: stable@kernel.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
I am not sure whether the "boot crash" mentioned above was caused by
trying to set up a threaded IRQ handler before kthreadd is started.
>
> iommu/amd already uses a threaded IRQ handler for its I/O page fault
> reporting, and so it already has this advantage.
>
> When IRQ remapping is enabled, iommu/vt-d will try to set up its
> dmar_fault IRQ handler from start_kernel() -> x86_late_time_init()
> -> apic_intr_mode_init() -> apic_bsp_setup() ->
> irq_remap_enable_fault_handling() -> enable_drhd_fault_handling(),
> which happens before kthreadd is started, and trying to set up a
> threaded IRQ handler this early on will oops. However, there
> doesn't seem to be a reason why iommu/vt-d needs to set up its fault
> reporting IRQ handler this early, and if we remove the IRQ setup code
> from enable_drhd_fault_handling(), the IRQ will be registered instead
> from pci_iommu_init() -> intel_iommu_init() -> init_dmars(), which
> seems to work just fine.
At present, we cannot do so, because VT-d interrupt remapping and DMA
remapping can be enabled independently. In other words, interrupt
remapping may be enabled while DMA remapping is not.
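
A toy userspace sketch of this concern, not kernel code. The struct and
function names are hypothetical. The model assumes, per the call chains
quoted above, that the fault IRQ is currently requested from
enable_drhd_fault_handling() when interrupt remapping is enabled and from
init_dmars() when DMA remapping is enabled, and that the RFC leaves only
the init_dmars() path.

```c
#include <stdbool.h>

/* Hypothetical model of the two independently selectable VT-d features. */
struct boot_config {
	bool irq_remapping_enabled;	/* interrupt remapping is active */
	bool dma_remapping_enabled;	/* DMA remapping is active */
};

/*
 * Assumed current behaviour: either the early interrupt-remapping path
 * or the later DMA-remapping path requests the fault IRQ.
 */
static bool fault_irq_registered_current(const struct boot_config *cfg)
{
	return cfg->irq_remapping_enabled || cfg->dma_remapping_enabled;
}

/*
 * Assumed behaviour with the RFC applied: the early path is emptied,
 * so only the DMA-remapping path requests the fault IRQ.
 */
static bool fault_irq_registered_with_rfc(const struct boot_config *cfg)
{
	return cfg->dma_remapping_enabled;
}

/* True when a configuration would lose fault reporting under the RFC. */
static bool fault_reporting_lost(const struct boot_config *cfg)
{
	return fault_irq_registered_current(cfg) &&
	       !fault_irq_registered_with_rfc(cfg);
}
```

Under these assumptions, fault_reporting_lost() is true only for the
configuration with interrupt remapping on and DMA remapping off, which is
the case described above.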
>
> Suggested-by: Scarlett Gourley <scarlett@arista.com>
> Suggested-by: James Sewart <jamessewart@arista.com>
> Suggested-by: Jack O'Sullivan <jack@arista.com>
> Signed-off-by: Lennert Buytenhek <buytenh@arista.com>
> ---
> drivers/iommu/intel/dmar.c | 27 ++-------------------------
> 1 file changed, 2 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
> index 5a8f780e7ffd..d0871fe9d04d 100644
> --- a/drivers/iommu/intel/dmar.c
> +++ b/drivers/iommu/intel/dmar.c
> @@ -2043,7 +2043,8 @@ int dmar_set_interrupt(struct intel_iommu *iommu)
>  		return -EINVAL;
>  	}
>
> -	ret = request_irq(irq, dmar_fault, IRQF_NO_THREAD, iommu->name, iommu);
> +	ret = request_threaded_irq(irq, NULL, dmar_fault, IRQF_ONESHOT,
> +				   iommu->name, iommu);
>  	if (ret)
>  		pr_err("Can't request irq\n");
>  	return ret;
> @@ -2051,30 +2052,6 @@ int dmar_set_interrupt(struct intel_iommu *iommu)
>
>  int __init enable_drhd_fault_handling(void)
>  {
> -	struct dmar_drhd_unit *drhd;
> -	struct intel_iommu *iommu;
> -
> -	/*
> -	 * Enable fault control interrupt.
> -	 */
> -	for_each_iommu(iommu, drhd) {
> -		u32 fault_status;
> -		int ret = dmar_set_interrupt(iommu);
> -
> -		if (ret) {
> -			pr_err("DRHD %Lx: failed to enable fault, interrupt, ret %d\n",
> -			       (unsigned long long)drhd->reg_base_addr, ret);
> -			return -1;
> -		}
> -
> -		/*
> -		 * Clear any previous faults.
> -		 */
> -		dmar_fault(iommu->irq, iommu);
> -		fault_status = readl(iommu->reg + DMAR_FSTS_REG);
> -		writel(fault_status, iommu->reg + DMAR_FSTS_REG);
> -	}
> -
>  	return 0;
>  }
>
Best regards,
baolu