From: Jacob Pan <jacob.jun.pan@linux.intel.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>,
LKML <linux-kernel@vger.kernel.org>, X86 Kernel <x86@kernel.org>,
iommu@lists.linux.dev, Lu Baolu <baolu.lu@linux.intel.com>,
kvm@vger.kernel.org, Dave Hansen <dave.hansen@intel.com>,
Joerg Roedel <joro@8bytes.org>, "H. Peter Anvin" <hpa@zytor.com>,
Borislav Petkov <bp@alien8.de>, Ingo Molnar <mingo@redhat.com>,
Raj Ashok <ashok.raj@intel.com>,
"Tian, Kevin" <kevin.tian@intel.com>,
maz@kernel.org, seanjc@google.com,
Robin Murphy <robin.murphy@arm.com>,
jacob.jun.pan@linux.intel.com
Subject: Re: [PATCH RFC 09/13] x86/irq: Install posted MSI notification handler
Date: Fri, 26 Jan 2024 15:32:00 -0800 [thread overview]
Message-ID: <20240126153200.720883db@jacob-builder> (raw)
In-Reply-To: <87zfyksyge.ffs@tglx>
Hi Thomas,
On Fri, 08 Dec 2023 12:52:49 +0100, Thomas Gleixner <tglx@linutronix.de>
wrote:
> > Without PIR copy:
> >
> > DMA memfill bandwidth: 4.944 Gbps
> > Performance counter stats for './run_intr.sh 512 30':
> >
> > 77,313,298,506 L1-dcache-loads
> > (79.98%) 8,279,458 L1-dcache-load-misses #
> > 0.01% of all L1-dcache accesses (80.03%) 41,654,221,245
> > L1-dcache-stores (80.01%)
> > 10,476 LLC-load-misses # 0.31% of all LL-cache
> > accesses (79.99%) 3,332,748 LLC-loads
> > (80.00%) 30.212055434 seconds time elapsed
> >
> > 0.002149000 seconds user
> > 30.183292000 seconds sys
> >
> >
> > With PIR copy:
> > DMA memfill bandwidth: 5.029 Gbps
> > Performance counter stats for './run_intr.sh 512 30':
> >
> > 78,327,247,423 L1-dcache-loads
> > (80.01%) 7,762,311 L1-dcache-load-misses #
> > 0.01% of all L1-dcache accesses (80.01%) 42,203,221,466
> > L1-dcache-stores (79.99%)
> > 23,691 LLC-load-misses # 0.67% of all LL-cache
> > accesses (80.01%) 3,561,890 LLC-loads
> > (80.00%)
> >
> > 30.201065706 seconds time elapsed
> >
> > 0.005950000 seconds user
> > 30.167885000 seconds sys
>
> Interesting, though I'm not really convinced that this DMA memfill
> microbenchmark resembles real work loads.
>
> Did you test with something realistic, e.g. storage or networking, too?
I have done the following FIO test on NVME drives and not seeing any
meaningful differences in IOPS between the two implementations.
Here is my setup and results on 4 NVME drives connected to a x16 PCIe slot:
+-[0000:62]-
| +-01.0-[63]----00.0 Samsung Electronics Co Ltd NVMe SSD Controller PM174X
| +-03.0-[64]----00.0 Samsung Electronics Co Ltd NVMe SSD Controller PM174X
| +-05.0-[65]----00.0 Samsung Electronics Co Ltd NVMe SSD Controller PM174X
| \-07.0-[66]----00.0 Samsung Electronics Co Ltd NVMe SSD Controller PM174X
libaio, no PIR_COPY
======================================
fio-3.35
Starting 512 processes
Jobs: 512 (f=512): [r(512)][100.0%][r=32.2GiB/s][r=8445k IOPS][eta 00m:00s]
disk_nvme6n1_thread_1: (groupid=0, jobs=512): err= 0: pid=31559: Mon Jan 8 21:49:22 2024
read: IOPS=8419k, BW=32.1GiB/s (34.5GB/s)(964GiB/30006msec)
slat (nsec): min=1325, max=115807k, avg=42368.34, stdev=1517031.57
clat (usec): min=2, max=499085, avg=15139.97, stdev=25682.25
lat (usec): min=68, max=499089, avg=15182.33, stdev=25709.81
clat percentiles (usec):
| 1.00th=[ 734], 5.00th=[ 783], 10.00th=[ 816], 20.00th=[ 857],
| 30.00th=[ 906], 40.00th=[ 971], 50.00th=[ 1074], 60.00th=[ 1369],
| 70.00th=[ 13042], 80.00th=[ 19792], 90.00th=[ 76022], 95.00th=[ 76022],
| 99.00th=[ 77071], 99.50th=[ 81265], 99.90th=[ 85459], 99.95th=[ 91751],
| 99.99th=[200279]
bw ( MiB/s): min=18109, max=51859, per=100.00%, avg=32965.98, stdev=16.88, samples=14839
iops : min=4633413, max=13281470, avg=8439278.47, stdev=4324.70, samples=14839
lat (usec) : 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%, 100=0.01%
lat (usec) : 250=0.01%, 500=0.01%, 750=1.84%, 1000=41.96%
lat (msec) : 2=18.37%, 4=0.20%, 10=3.88%, 20=13.95%, 50=5.42%
lat (msec) : 100=14.33%, 250=0.02%, 500=0.01%
cpu : usr=1.16%, sys=3.54%, ctx=4932752, majf=0, minf=192764
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued rwts: total=252616589,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=256
Run status group 0 (all jobs):
READ: bw=32.1GiB/s (34.5GB/s), 32.1GiB/s-32.1GiB/s (34.5GB/s-34.5GB/s), io=964GiB (1035GB), run=30006-30006msec
Disk stats (read/write):
nvme6n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=96.31%
nvme5n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=97.15%
nvme4n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=98.06%
nvme3n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=98.94%
Performance counter stats for 'system wide':
22,985,903,515 L1-dcache-load-misses (42.86%)
22,989,992,126 L1-dcache-load-misses (57.14%)
751,228,710,993 L1-dcache-stores (57.14%)
465,033,820 LLC-load-misses # 18.27% of all LL-cache accesses (57.15%)
2,545,570,669 LLC-loads (57.14%)
1,058,582,881 LLC-stores (28.57%)
326,135,823 LLC-store-misses (28.57%)
32.045718194 seconds time elapsed
-------------------------------------------
libaio with PIR_COPY
-------------------------------------------
fio-3.35
Starting 512 processes
Jobs: 512 (f=512): [r(512)][100.0%][r=32.2GiB/s][r=8445k IOPS][eta 00m:00s]
disk_nvme6n1_thread_1: (groupid=0, jobs=512): err= 0: pid=5103: Mon Jan 8 23:12:12 2024
read: IOPS=8420k, BW=32.1GiB/s (34.5GB/s)(964GiB/30011msec)
slat (nsec): min=1339, max=97021k, avg=42447.84, stdev=1442726.09
clat (usec): min=2, max=369410, avg=14820.01, stdev=24112.59
lat (usec): min=69, max=369412, avg=14862.46, stdev=24139.33
clat percentiles (usec):
| 1.00th=[ 717], 5.00th=[ 783], 10.00th=[ 824], 20.00th=[ 873],
| 30.00th=[ 930], 40.00th=[ 1012], 50.00th=[ 1172], 60.00th=[ 8094],
| 70.00th=[ 14222], 80.00th=[ 18744], 90.00th=[ 76022], 95.00th=[ 76022],
| 99.00th=[ 76022], 99.50th=[ 78119], 99.90th=[ 81265], 99.95th=[ 81265],
| 99.99th=[135267]
bw ( MiB/s): min=19552, max=62819, per=100.00%, avg=33774.56, stdev=31.02, samples=14540
iops : min=5005807, max=16089892, avg=8646500.17, stdev=7944.42, samples=14540
lat (usec) : 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%, 100=0.01%
lat (usec) : 250=0.01%, 500=0.01%, 750=2.50%, 1000=36.41%
lat (msec) : 2=17.39%, 4=0.27%, 10=5.83%, 20=18.94%, 50=5.59%
lat (msec) : 100=13.06%, 250=0.01%, 500=0.01%
cpu : usr=1.20%, sys=3.74%, ctx=6758326, majf=0, minf=193128
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued rwts: total=252677827,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=256
Run status group 0 (all jobs):
READ: bw=32.1GiB/s (34.5GB/s), 32.1GiB/s-32.1GiB/s (34.5GB/s-34.5GB/s), io=964GiB (1035GB), run=30011-30011msec
Disk stats (read/write):
nvme6n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=96.36%
nvme5n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=97.18%
nvme4n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=98.08%
nvme3n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=98.96%
Performance counter stats for 'system wide':
24,762,800,042 L1-dcache-load-misses (42.86%)
24,764,415,765 L1-dcache-load-misses (57.14%)
756,096,467,595 L1-dcache-stores (57.14%)
483,611,270 LLC-load-misses # 16.21% of all LL-cache accesses (57.14%)
2,982,610,898 LLC-loads (57.14%)
1,283,077,818 LLC-stores (28.57%)
313,253,711 LLC-store-misses (28.57%)
32.059810215 seconds time elapsed
Thanks,
Jacob
next prev parent reply other threads:[~2024-01-26 23:26 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-12 4:16 [PATCH RFC 00/13] Coalesced Interrupt Delivery with posted MSI Jacob Pan
2023-11-12 4:16 ` [PATCH RFC 01/13] x86: Move posted interrupt descriptor out of vmx code Jacob Pan
2023-12-06 16:33 ` Thomas Gleixner
2023-12-08 4:54 ` Jacob Pan
2023-12-08 9:31 ` Thomas Gleixner
2023-12-08 23:21 ` Jacob Pan
2023-12-09 0:28 ` Jacob Pan
2023-11-12 4:16 ` [PATCH RFC 02/13] x86: Add a Kconfig option for posted MSI Jacob Pan
2023-12-06 16:35 ` Thomas Gleixner
2023-12-09 21:24 ` Jacob Pan
2023-11-12 4:16 ` [PATCH RFC 03/13] x86: Reserved a per CPU IDT vector for posted MSIs Jacob Pan
2023-12-06 16:47 ` Thomas Gleixner
2023-12-09 21:53 ` Jacob Pan
2023-11-12 4:16 ` [PATCH RFC 04/13] iommu/vt-d: Add helper and flag to check/disable posted MSI Jacob Pan
2023-12-06 16:49 ` Thomas Gleixner
2023-11-12 4:16 ` [PATCH RFC 05/13] x86/irq: Set up per host CPU posted interrupt descriptors Jacob Pan
2023-11-12 4:16 ` [PATCH RFC 06/13] x86/irq: Unionize PID.PIR for 64bit access w/o casting Jacob Pan
2023-12-06 16:51 ` Thomas Gleixner
2023-11-12 4:16 ` [PATCH RFC 07/13] x86/irq: Add helpers for checking Intel PID Jacob Pan
2023-12-06 19:02 ` Thomas Gleixner
2024-01-26 23:31 ` Jacob Pan
2023-11-12 4:16 ` [PATCH RFC 08/13] x86/irq: Factor out calling ISR from common_interrupt Jacob Pan
2023-11-12 4:16 ` [PATCH RFC 09/13] x86/irq: Install posted MSI notification handler Jacob Pan
2023-11-15 12:42 ` Peter Zijlstra
2023-11-15 20:05 ` Jacob Pan
2023-11-15 12:56 ` Peter Zijlstra
2023-11-15 20:04 ` Jacob Pan
2023-11-15 20:25 ` Peter Zijlstra
2023-12-06 19:50 ` Thomas Gleixner
2023-12-08 4:46 ` Jacob Pan
2023-12-08 11:52 ` Thomas Gleixner
2023-12-08 20:02 ` Jacob Pan
2024-01-26 23:32 ` Jacob Pan [this message]
2023-12-06 19:14 ` Thomas Gleixner
2023-11-12 4:16 ` [PATCH RFC 10/13] x86/irq: Handle potential lost IRQ during migration and CPU offline Jacob Pan
2023-12-06 20:09 ` Thomas Gleixner
2023-11-12 4:16 ` [PATCH RFC 11/13] iommu/vt-d: Add an irq_chip for posted MSIs Jacob Pan
2023-12-06 20:15 ` Thomas Gleixner
2024-01-26 23:31 ` Jacob Pan
2023-12-06 20:44 ` Thomas Gleixner
2023-12-13 3:42 ` Jacob Pan
2023-11-12 4:16 ` [PATCH RFC 12/13] iommu/vt-d: Add a helper to retrieve PID address Jacob Pan
2023-12-06 20:19 ` Thomas Gleixner
2024-01-26 23:30 ` Jacob Pan
2024-02-13 8:21 ` Thomas Gleixner
2024-02-13 19:31 ` Jacob Pan
2023-11-12 4:16 ` [PATCH RFC 13/13] iommu/vt-d: Enable posted mode for device MSIs Jacob Pan
2023-12-06 20:26 ` Thomas Gleixner
2023-12-13 22:00 ` Jacob Pan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240126153200.720883db@jacob-builder \
--to=jacob.jun.pan@linux.intel.com \
--cc=ashok.raj@intel.com \
--cc=baolu.lu@linux.intel.com \
--cc=bp@alien8.de \
--cc=dave.hansen@intel.com \
--cc=hpa@zytor.com \
--cc=iommu@lists.linux.dev \
--cc=joro@8bytes.org \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=maz@kernel.org \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=robin.murphy@arm.com \
--cc=seanjc@google.com \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.