From: Jacob Pan <jacob.jun.pan@linux.intel.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: LKML <linux-kernel@vger.kernel.org>, X86 Kernel <x86@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
iommu@lists.linux.dev, Thomas Gleixner <tglx@linutronix.de>,
Lu Baolu <baolu.lu@linux.intel.com>,
kvm@vger.kernel.org, Dave Hansen <dave.hansen@intel.com>,
Joerg Roedel <joro@8bytes.org>, "H. Peter Anvin" <hpa@zytor.com>,
Borislav Petkov <bp@alien8.de>, Ingo Molnar <mingo@redhat.com>,
Paul Luse <paul.e.luse@intel.com>,
Dan Williams <dan.j.williams@intel.com>,
Raj Ashok <ashok.raj@intel.com>,
"Tian, Kevin" <kevin.tian@intel.com>,
maz@kernel.org, seanjc@google.com,
Robin Murphy <robin.murphy@arm.com>,
jacob.jun.pan@linux.intel.com
Subject: Re: [PATCH 00/15] Coalesced Interrupt Delivery with posted MSI
Date: Fri, 9 Feb 2024 09:43:07 -0800
Message-ID: <20240209094307.4e7eacd0@jacob-builder>
In-Reply-To: <051cf099-9ecf-4f5a-a3ac-ee2d63a62fa6@kernel.dk>
Hi Jens,
On Thu, 8 Feb 2024 08:34:55 -0700, Jens Axboe <axboe@kernel.dk> wrote:
> Hi Jacob,
>
> I gave this a quick spin, using 4 gen2 optane drives. Basic test, just
> IOPS bound on the drive, and using 1 thread per drive for IO. Random
> reads, using io_uring.
>
> For reference, using polled IO:
>
> IOPS=20.36M, BW=9.94GiB/s, IOS/call=31/31
> IOPS=20.36M, BW=9.94GiB/s, IOS/call=31/31
> IOPS=20.37M, BW=9.95GiB/s, IOS/call=31/31
>
> which is about 5.1M/drive, which is what they can deliver.
>
> Before your patches, I see:
>
> IOPS=14.37M, BW=7.02GiB/s, IOS/call=32/32
> IOPS=14.38M, BW=7.02GiB/s, IOS/call=32/31
> IOPS=14.38M, BW=7.02GiB/s, IOS/call=32/31
> IOPS=14.37M, BW=7.02GiB/s, IOS/call=32/32
>
> at 2.82M ints/sec. With the patches, I see:
>
> IOPS=14.73M, BW=7.19GiB/s, IOS/call=32/31
> IOPS=14.90M, BW=7.27GiB/s, IOS/call=32/31
> IOPS=14.90M, BW=7.27GiB/s, IOS/call=31/32
>
> at 2.34M ints/sec. So a nice reduction in interrupt rate, though not
> quite at the extent I expected. Booted with 'posted_msi' and I do see
> posted interrupts increasing in the PMN in /proc/interrupts,
>
The ints/sec reduction is not as high as I expected either, especially
at this high rate. That means there is not enough coalescing going on to
get the full performance benefit.
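For a rough sense of scale, the figures quoted above work out to about a 17% drop in interrupt rate, and to roughly 5.1 completions per interrupt before the patches versus 6.4 with them. A minimal C sketch of that arithmetic follows; the function names are hypothetical and chosen only for illustration, and the inputs are the numbers from the quoted test (in millions per second).

```c
#include <assert.h>
#include <stdio.h>

/* Fractional reduction going from 'before' to 'after' (0.17 means 17%). */
double reduction(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before;
}

/* Average number of I/O completions serviced per interrupt. */
double ios_per_int(double iops, double ints_per_sec)
{
	assert(ints_per_sec > 0.0);
	return iops / ints_per_sec;
}

/* Print a summary for the figures quoted in this thread (hypothetical helper). */
void report(void)
{
	printf("ints/sec reduction: %.1f%%\n", 100.0 * reduction(2.82, 2.34));
	printf("IOs per interrupt:  %.2f -> %.2f\n",
	       ios_per_int(14.37, 2.82), ios_per_int(14.90, 2.34));
}
```

Calling `report()` from a small `main()` prints both ratios. The IOPS figures are the steady-state values from the quoted runs, so the per-interrupt ratios are approximate.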
The opportunity for IRQ coalescing also depends on how long the
driver's hardirq handler executes. The posted MSI demux loop does
not wait for more MSIs to arrive before exiting the pending IRQ polling
loop, so if the hardirq handler finishes very quickly, it may not coalesce
as much. Perhaps we need to find more "useful" work to do to maximize the
window for coalescing.
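To illustrate why handler duration matters, here is a toy userspace model of such a demux loop. It is a sketch under stated assumptions, not the code from the patch series: interrupts arrive at a fixed interval, every handler costs the same time, and the loop re-checks for newly posted interrupts at most `max_loops` times. The names `simulate`, `sim_result` and `max_loops` are hypothetical.

```c
#include <assert.h>

struct sim_result {
	unsigned long notifications;	/* times the notification handler ran */
	unsigned long handled;		/* device interrupts serviced in total */
};

/*
 * Toy model: n interrupts arrive every gap_ns, each device handler costs
 * handler_ns. The loop drains whatever is already posted, re-checks up to
 * max_loops times, and exits as soon as nothing is pending. It never
 * waits for future arrivals.
 */
struct sim_result simulate(unsigned long n, unsigned long gap_ns,
			   unsigned long handler_ns, unsigned int max_loops)
{
	struct sim_result r = { 0, 0 };
	unsigned long now = 0;
	unsigned long next = 0;	/* index of the next unhandled interrupt */

	assert(gap_ns > 0 && max_loops > 0);

	while (next < n) {
		unsigned long arrival = next * gap_ns;
		unsigned int loop;

		if (now < arrival)
			now = arrival;	/* idle until the next MSI is posted */
		r.notifications++;

		for (loop = 0; loop < max_loops; loop++) {
			/* Snapshot of everything posted up to 'now'. */
			unsigned long posted = now / gap_ns + 1;

			if (posted > n)
				posted = n;
			if (posted == next)
				break;	/* nothing pending: exit, do not wait */

			while (next < posted) {
				now += handler_ns;	/* run one device handler */
				next++;
				r.handled++;
			}
		}
	}
	return r;
}
```

In this model, `simulate(1000, 1000, 100, 4)` takes one notification per interrupt because each handler finishes well before the next arrival, while `simulate(1000, 1000, 1500, 4)` services several interrupts per notification because new ones are posted while earlier handlers are still running.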
I am not familiar with the Optane driver and need to look into how its
hardirq handler works. In terms of storage IO, I have only tested NVMe
gen5, where I saw a 30-50% ints/sec reduction even at a lower IRQ rate
(200k/sec).
> Probably want to fold this one in:
>
> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> index 8e09d40ea928..a289282f1cf9 100644
> --- a/arch/x86/kernel/irq.c
> +++ b/arch/x86/kernel/irq.c
> @@ -393,7 +393,7 @@ void intel_posted_msi_init(void)
> * instead of:
> * read, xchg, read, xchg, read, xchg, read, xchg
> */
> -static __always_inline inline bool handle_pending_pir(u64 *pir, struct pt_regs *regs)
> +static __always_inline bool handle_pending_pir(u64 *pir, struct pt_regs *regs)
> {
> int i, vec = FIRST_EXTERNAL_VECTOR;
> unsigned long pir_copy[4];
>
Good catch! Will do.
Thanks,
Jacob