From: Jacob Pan <jacob.jun.pan@linux.intel.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>,
LKML <linux-kernel@vger.kernel.org>, X86 Kernel <x86@kernel.org>,
iommu@lists.linux.dev, Lu Baolu <baolu.lu@linux.intel.com>,
kvm@vger.kernel.org, Dave Hansen <dave.hansen@intel.com>,
Joerg Roedel <joro@8bytes.org>, "H. Peter Anvin" <hpa@zytor.com>,
Borislav Petkov <bp@alien8.de>, Ingo Molnar <mingo@redhat.com>,
Raj Ashok <ashok.raj@intel.com>,
"Tian, Kevin" <kevin.tian@intel.com>,
maz@kernel.org, seanjc@google.com,
Robin Murphy <robin.murphy@arm.com>,
jacob.jun.pan@linux.intel.com
Subject: Re: [PATCH RFC 09/13] x86/irq: Install posted MSI notification handler
Date: Fri, 8 Dec 2023 12:02:36 -0800
Message-ID: <20231208120236.0f3b287d@jacob-builder>
In-Reply-To: <87zfyksyge.ffs@tglx>
Hi Thomas,
On Fri, 08 Dec 2023 12:52:49 +0100, Thomas Gleixner <tglx@linutronix.de>
wrote:
> On Thu, Dec 07 2023 at 20:46, Jacob Pan wrote:
> > On Wed, 06 Dec 2023 20:50:24 +0100, Thomas Gleixner <tglx@linutronix.de>
> > wrote:
> >> I don't understand what the whole copy business is about. It's
> >> absolutely not required.
> >
> > My thinking is the following:
> > The PIR cache line is contended between CPU and IOMMU, where CPU can
> > access PIR much faster. Nevertheless, when IOMMU does atomic swap of the
> > PID (PIR included), L1 cache gets evicted. Subsequent CPU read or xchg
> > will deal with invalid cold cache.
> >
> > By making a copy of PIR as quickly as possible and clearing PIR with
> > xchg, we minimized the chance that IOMMU does atomic swap in the middle.
> > Therefore, having less L1D misses.
> >
> > In the code above, it does read, xchg, and call_irq_handler() in a loop
> > to handle the 4 64bit PIR bits at a time. IOMMU has a greater chance to
> > do atomic xchg on the PIR cache line while doing call_irq_handler().
> > Therefore, it causes more L1D misses.
>
> That makes sense and if we go there it wants to be documented.
Will do. How about this explanation:
"
The posted interrupt descriptor (PID) fits in a single cache line that is
frequently accessed by both the CPU and the IOMMU.
During posted MSI processing, the CPU needs to do 64-bit reads and xchg
operations to check and clear the posted interrupt request (PIR), a 256-bit
field within the PID. On the other side, the IOMMU does atomic swaps of the
entire PID cache line when posting interrupts. The CPU can access the cache
line much faster than the IOMMU.
The cache line states after each operation are as follows:

CPU             IOMMU                   PID cache line state
-------------------------------------------------------------
read64                                  exclusive
lock xchg64                             modified
                post/atomic swap        invalid
-------------------------------------------------------------
Note that the PID cache line is evicted after each IOMMU interrupt posting.
The posted MSI demuxing loop is written to optimize cache performance
based on two considerations around the PID cache line:
1. Reduce L1 data cache misses by avoiding contention with the IOMMU's
interrupt posting/atomic swap: a copy of the PIR is used to dispatch
interrupt handlers.
2. Keep the cache line state as consistent as possible. E.g. when making
a copy and clearing the PIR (assuming non-zero bits are present in all
four 64-bit words of the PIR), do:
read, read, read, read, xchg, xchg, xchg, xchg
instead of:
read, xchg, read, xchg, read, xchg, read, xchg
"
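To make the ordering above concrete, here is a minimal userspace sketch of the copy-then-dispatch idea using C11 atomics. It is not the code from the patch: struct pir_sketch, pir_snapshot_and_dispatch() and record_vector() are hypothetical names for illustration only. The sketch also assumes that words which read as zero are not xchg'ed, and that the xchg result replaces the earlier read so bits posted in between are not lost.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define PIR_WORDS 4	/* 256-bit PIR viewed as four 64-bit words */

/* Hypothetical stand-in for the PIR field of a PID, one cache line. */
struct pir_sketch {
	_Alignas(64) _Atomic uint64_t word[PIR_WORDS];
};

/* Hypothetical handler hook; real code would run the vector's handler. */
typedef void (*vector_handler_t)(unsigned int vector);

/*
 * Copy and clear the PIR in two batches, then dispatch from the copy.
 * Returns true if any bit was pending.
 */
static bool pir_snapshot_and_dispatch(struct pir_sketch *pir,
				      vector_handler_t handle)
{
	uint64_t copy[PIR_WORDS];
	uint64_t pending = 0;
	unsigned int i, bit;

	/* Batch 1: read, read, read, read */
	for (i = 0; i < PIR_WORDS; i++) {
		copy[i] = atomic_load_explicit(&pir->word[i],
					       memory_order_relaxed);
		pending |= copy[i];
	}
	if (!pending)
		return false;

	/* Batch 2: xchg, xchg, xchg, xchg (assumption: skip zero words) */
	for (i = 0; i < PIR_WORDS; i++) {
		if (copy[i])
			copy[i] = atomic_exchange(&pir->word[i], 0);
	}

	/* Dispatch from the private copy; the shared line is not touched. */
	for (i = 0; i < PIR_WORDS; i++) {
		for (bit = 0; bit < 64; bit++) {
			if (copy[i] & (UINT64_C(1) << bit))
				handle(i * 64 + bit);
		}
	}
	return true;
}

/* Example handler for experiments: records vectors in dispatch order. */
static unsigned int seen[PIR_WORDS * 64];
static unsigned int seen_count;

static void record_vector(unsigned int vector)
{
	assert(seen_count < PIR_WORDS * 64);
	seen[seen_count++] = vector;
}
```

Calling pir_snapshot_and_dispatch(&pir, record_vector) after setting a few bits in pir.word[] dispatches each set bit once and leaves the words cleared. The handlers run only after the last xchg, which is the window where, per the explanation above, an IOMMU post would otherwise hit the line between a read and its xchg.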
>
> > Without PIR copy:
> >
> > DMA memfill bandwidth: 4.944 Gbps
> > Performance counter stats for './run_intr.sh 512 30':
> >
> >   77,313,298,506  L1-dcache-loads                                           (79.98%)
> >        8,279,458  L1-dcache-load-misses  # 0.01% of all L1-dcache accesses  (80.03%)
> >   41,654,221,245  L1-dcache-stores                                          (80.01%)
> >           10,476  LLC-load-misses        # 0.31% of all LL-cache accesses   (79.99%)
> >        3,332,748  LLC-loads                                                 (80.00%)
> >
> > 30.212055434 seconds time elapsed
> >
> > 0.002149000 seconds user
> > 30.183292000 seconds sys
> >
> >
> > With PIR copy:
> > DMA memfill bandwidth: 5.029 Gbps
> > Performance counter stats for './run_intr.sh 512 30':
> >
> >   78,327,247,423  L1-dcache-loads                                           (80.01%)
> >        7,762,311  L1-dcache-load-misses  # 0.01% of all L1-dcache accesses  (80.01%)
> >   42,203,221,466  L1-dcache-stores                                          (79.99%)
> >           23,691  LLC-load-misses        # 0.67% of all LL-cache accesses   (80.01%)
> >        3,561,890  LLC-loads                                                 (80.00%)
> >
> > 30.201065706 seconds time elapsed
> >
> > 0.005950000 seconds user
> > 30.167885000 seconds sys
>
> Interesting, though I'm not really convinced that this DMA memfill
> microbenchmark resembles real work loads.
>
It is just a tool for quick experiments, not a realistic workload, though
I am adding various knobs to make it more useful, e.g. an adjustable
interrupt rate and delays in the idxd hardirq handler.
> Did you test with something realistic, e.g. storage or networking, too?
>
Not yet for this particular code. I am working on testing with FIO on
Samsung Gen5 NVMe disks and am getting help from the people who have the
setup.
Thanks,
Jacob