From: Alexander Shumakovitch <shurik@jhu.edu>
To: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: "linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>
Subject: Re: Read speed for a PCIe NVMe SSD is ridiculously slow on a multi-socket machine.
Date: Fri, 31 Mar 2023 07:53:30 +0000
Message-ID: <ZCaReYKRomOGsM9q@hornet>
In-Reply-To: <67d3d0c7-afbb-98a2-4ce8-4d93dbb82663@opensource.wdc.com>
[-- Attachment #1: Type: text/plain, Size: 4222 bytes --]
Thanks a lot, Damien. This was very helpful indeed. As you suggested, I've
run a few fio tests with the libaio and io_uring engines at QD=32 and with
different numbers of jobs. The results were mostly consistent between the two
engines, except for random reads in cached mode. With libaio there was
virtually no difference between the nodes, and the bandwidth increased
steadily with the number of jobs, which made sense to me after your
explanations.
But for io_uring, node #0 got progressively faster as the number of jobs
increased, while the other three got slower; see the summary tables below.
Does this make sense to you? I understand that the libaio engine might ignore
the iodepth setting in cached (buffered) mode, but a smaller effective QD
should make things slower, not faster, shouldn't it? For reference, I also
attach the complete fio outputs for a few boundary cases.
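For concreteness, the jobs were launched roughly as shown below. This is a
reconstruction from the parameters above rather than my exact job file, so
the filename, runtime and node-pinning details are only illustrative:

    # one of the 16-job io_uring runs, pinned to node 1 (cached runs drop --direct=1)
    numactl --cpunodebind=1 --membind=1 \
        fio --name=nvme0 --filename=/dev/nvme0n1 --rw=randread --bs=4k \
            --ioengine=io_uring --iodepth=32 --numjobs=16 --direct=1 \
            --runtime=30 --time_based --group_reporting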
The main thing I'm still concerned about is that not all Linux subsystems
may be fully NUMA-aware on this machine. As I wrote, it has a buggy BIOS
that doesn't report the NUMA configuration to the OS. I populate the
numa_node values under /sys/devices/pci0000:* myself after each boot, but
this might not be enough.
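To be explicit, "populating numa_node myself" boils down to something like
the loop below after every boot; the root-complex address and node number
here are placeholders, not my actual mapping:

    # hypothetical example: assign all devices under one root complex to node 1
    for f in /sys/devices/pci0000:80/*/numa_node; do
        echo 1 > "$f"
    done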
Thank you,
--- Alex.
Benchmarks for random reads: bs = 4k, iodepth = 32 (in MB/s):
     ||   libaio engine, cached mode  ||  io_uring engine, cached mode |
jobs || CPU#0 | CPU#1 | CPU#2 | CPU#3 || CPU#0 | CPU#1 | CPU#2 | CPU#3 |
------------------------------------------------------------------------
   1 ||  47.5 |  46.2 |  46.0 |  46.5 ||   330 |   285 |   281 |   252 |
   2 ||  94.2 |  91.8 |  90.9 |  91.8 ||   571 |   189 |   186 |   203 |
   4 ||   180 |   176 |   175 |   176 ||  1108 |   184 |   191 |   219 |
   8 ||   331 |   322 |   319 |   322 ||  1142 |   170 |   174 |   177 |
  16 ||   585 |   554 |   545 |   552 ||  1353 |   175 |   173 |   180 |
------------------------------------------------------------------------
     ||   libaio engine, direct mode  ||  io_uring engine, direct mode |
jobs || CPU#0 | CPU#1 | CPU#2 | CPU#3 || CPU#0 | CPU#1 | CPU#2 | CPU#3 |
------------------------------------------------------------------------
   1 ||   544 |   520 |   477 |   519 ||   506 |   558 |   532 |   476 |
   2 ||  1034 |   928 |   943 |   996 ||  1028 |   938 |  1023 |  1004 |
   4 ||  1139 |  1138 |  1138 |  1139 ||  1138 |  1138 |  1138 |  1138 |
   8 ||  1140 |  1141 |  1141 |  1141 ||  1142 |  1142 |  1141 |  1141 |
  16 ||  1141 |  1135 |  1112 |  1136 ||  1141 |  1130 |  1133 |  1135 |
------------------------------------------------------------------------
Benchmarks for sequential reads: bs = 256k, iodepth = 32, numjobs = 1 (in MB/s):
|   libaio engine, cached mode  ||  io_uring engine, cached mode |
| CPU#0 | CPU#1 | CPU#2 | CPU#3 || CPU#0 | CPU#1 | CPU#2 | CPU#3 |
------------------------------------------------------------------
|  1411 |   160 |   159 |   163 ||  1355 |   160 |   159 |   163 |
------------------------------------------------------------------
|   libaio engine, direct mode  ||  io_uring engine, direct mode |
| CPU#0 | CPU#1 | CPU#2 | CPU#3 || CPU#0 | CPU#1 | CPU#2 | CPU#3 |
------------------------------------------------------------------
|  3627 |  2160 |  1637 |  2184 ||  3627 |  2076 |  1756 |  2167 |
------------------------------------------------------------------
On Sat, Mar 25, 2023 at 10:52:02AM +0900, Damien Le Moal wrote:
> For fast block devices, the overhead of the page management and memory
> copies done when using the page cache is very visible. Nothing that can be
> done about that. Any application, fio included, will most of the time show
> slower performance because of that overhead. Not always true though (e.g.
> sequential read with read-ahead should be just fine), but at the very least
> you will see a higher CPU load.
>
> dd and hdparm will also exercise the drive at QD=1, far from ideal when
> trying to measure the maximum throughput of a device, unless one uses
> very large IO sizes.
>
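(On the QD=1 point: for completeness, a single large-block direct read of the
kind you describe would be something like the line below, with the 1M block
size and count being just examples.)

    dd if=/dev/nvme0n1 of=/dev/null bs=1M count=4096 iflag=direct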
[-- Attachment #2: fio-io_uring-iodepth_32-numjobs_16_cached-node0.txt --]
[-- Type: text/plain, Size: 1895 bytes --]
nvme0: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=32
...
fio-3.25
Starting 16 processes
nvme0: (groupid=0, jobs=16): err= 0: pid=15807: Sat Mar 25 21:16:24 2023
read: IOPS=330k, BW=1290MiB/s (1353MB/s)(37.8GiB/30003msec)
slat (nsec): min=1938, max=225749, avg=14434.23, stdev=6505.65
clat (usec): min=4, max=5997, avg=1533.09, stdev=566.96
lat (usec): min=7, max=6022, avg=1547.96, stdev=568.55
clat percentiles (usec):
| 1.00th=[ 196], 5.00th=[ 269], 10.00th=[ 783], 20.00th=[ 1139],
| 30.00th=[ 1352], 40.00th=[ 1483], 50.00th=[ 1582], 60.00th=[ 1680],
| 70.00th=[ 1795], 80.00th=[ 1942], 90.00th=[ 2180], 95.00th=[ 2409],
| 99.00th=[ 2900], 99.50th=[ 3130], 99.90th=[ 3589], 99.95th=[ 3785],
| 99.99th=[ 4228]
bw ( MiB/s): min= 1077, max= 2530, per=100.00%, avg=1291.45, stdev=38.12, samples=960
iops : min=275929, max=647897, avg=330606.98, stdev=9759.31, samples=960
lat (usec) : 10=0.01%, 20=0.01%, 50=0.01%, 100=0.01%, 250=4.43%
lat (usec) : 500=2.36%, 750=2.75%, 1000=5.40%
lat (msec) : 2=68.41%, 4=16.63%, 10=0.02%
cpu : usr=10.78%, sys=38.14%, ctx=4668658, majf=0, minf=932
IO depths : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
issued rwts: total=9907939,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
READ: bw=1290MiB/s (1353MB/s), 1290MiB/s-1290MiB/s (1353MB/s-1353MB/s), io=37.8GiB (40.6GB), run=30003-30003msec
Disk stats (read/write):
nvme0n1: ios=9758021/0, merge=0/0, ticks=16205206/0, in_queue=16205206, util=99.88%
[-- Attachment #3: fio-io_uring-iodepth_32-numjobs_16_cached-node1.txt --]
[-- Type: text/plain, Size: 1848 bytes --]
nvme0: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=32
...
fio-3.25
Starting 16 processes
nvme0: (groupid=0, jobs=16): err= 0: pid=15898: Sat Mar 25 21:17:02 2023
read: IOPS=42.7k, BW=167MiB/s (175MB/s)(5008MiB/30014msec)
slat (usec): min=7, max=395, avg=30.62, stdev=13.25
clat (usec): min=170, max=27446, avg=11952.06, stdev=2938.27
lat (usec): min=200, max=27462, avg=11983.34, stdev=2937.03
clat percentiles (usec):
| 1.00th=[ 2311], 5.00th=[ 6194], 10.00th=[ 7635], 20.00th=[10028],
| 30.00th=[11469], 40.00th=[12387], 50.00th=[12911], 60.00th=[13304],
| 70.00th=[13304], 80.00th=[13304], 90.00th=[14222], 95.00th=[15926],
| 99.00th=[18744], 99.50th=[19530], 99.90th=[21890], 99.95th=[22938],
| 99.99th=[25297]
bw ( KiB/s): min=149937, max=201662, per=100.00%, avg=170916.13, stdev=605.68, samples=960
iops : min=37484, max=50414, avg=42728.37, stdev=151.41, samples=960
lat (usec) : 250=0.01%, 500=0.01%, 750=0.02%, 1000=0.05%
lat (msec) : 2=0.73%, 4=1.25%, 10=17.50%, 20=80.14%, 50=0.35%
cpu : usr=2.90%, sys=11.16%, ctx=632800, majf=0, minf=931
IO depths : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
issued rwts: total=1281560,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
READ: bw=167MiB/s (175MB/s), 167MiB/s-167MiB/s (175MB/s-175MB/s), io=5008MiB (5251MB), run=30014-30014msec
Disk stats (read/write):
nvme0n1: ios=1495933/0, merge=0/0, ticks=17792295/0, in_queue=17792295, util=99.91%
[-- Attachment #4: fio-io_uring-iodepth_32-numjobs_1_cached-node0.txt --]
[-- Type: text/plain, Size: 1782 bytes --]
nvme0: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=32
fio-3.25
Starting 1 process
nvme0: (groupid=0, jobs=1): err= 0: pid=13491: Sat Mar 25 20:56:30 2023
read: IOPS=80.5k, BW=314MiB/s (330MB/s)(9431MiB/30001msec)
slat (usec): min=5, max=184, avg=10.63, stdev= 5.08
clat (usec): min=114, max=1024, avg=385.48, stdev=31.45
lat (usec): min=125, max=1031, avg=396.33, stdev=31.59
clat percentiles (usec):
| 1.00th=[ 330], 5.00th=[ 343], 10.00th=[ 351], 20.00th=[ 363],
| 30.00th=[ 371], 40.00th=[ 375], 50.00th=[ 383], 60.00th=[ 388],
| 70.00th=[ 396], 80.00th=[ 408], 90.00th=[ 424], 95.00th=[ 441],
| 99.00th=[ 490], 99.50th=[ 510], 99.90th=[ 570], 99.95th=[ 611],
| 99.99th=[ 701]
bw ( KiB/s): min=317616, max=325771, per=100.00%, avg=322393.75, stdev=1897.32, samples=60
iops : min=79404, max=81440, avg=80598.30, stdev=474.27, samples=60
lat (usec) : 250=0.01%, 500=99.32%, 750=0.67%, 1000=0.01%
lat (msec) : 2=0.01%
cpu : usr=16.53%, sys=83.37%, ctx=2371, majf=0, minf=58
IO depths : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
issued rwts: total=2414344,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
READ: bw=314MiB/s (330MB/s), 314MiB/s-314MiB/s (330MB/s-330MB/s), io=9431MiB (9889MB), run=30001-30001msec
Disk stats (read/write):
nvme0n1: ios=2783897/0, merge=0/0, ticks=227688/0, in_queue=227688, util=99.87%
[-- Attachment #5: fio-io_uring-iodepth_32-numjobs_1_cached-node1.txt --]
[-- Type: text/plain, Size: 1787 bytes --]
nvme0: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=32
fio-3.25
Starting 1 process
nvme0: (groupid=0, jobs=1): err= 0: pid=13559: Sat Mar 25 20:57:08 2023
read: IOPS=69.7k, BW=272MiB/s (285MB/s)(8166MiB/30001msec)
slat (usec): min=5, max=184, avg=11.64, stdev= 5.99
clat (usec): min=122, max=1266, avg=445.95, stdev=80.88
lat (usec): min=145, max=1278, avg=457.85, stdev=80.70
clat percentiles (usec):
| 1.00th=[ 359], 5.00th=[ 379], 10.00th=[ 388], 20.00th=[ 400],
| 30.00th=[ 408], 40.00th=[ 416], 50.00th=[ 420], 60.00th=[ 433],
| 70.00th=[ 445], 80.00th=[ 465], 90.00th=[ 523], 95.00th=[ 685],
| 99.00th=[ 717], 99.50th=[ 750], 99.90th=[ 857], 99.95th=[ 906],
| 99.99th=[ 1012]
bw ( KiB/s): min=244000, max=289154, per=100.00%, avg=279096.80, stdev=10540.54, samples=59
iops : min=61000, max=72286, avg=69773.97, stdev=2635.18, samples=59
lat (usec) : 250=0.02%, 500=86.92%, 750=12.60%, 1000=0.45%
lat (msec) : 2=0.01%
cpu : usr=14.47%, sys=83.01%, ctx=110446, majf=0, minf=58
IO depths : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
issued rwts: total=2090351,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
READ: bw=272MiB/s (285MB/s), 272MiB/s-272MiB/s (285MB/s-285MB/s), io=8166MiB (8562MB), run=30001-30001msec
Disk stats (read/write):
nvme0n1: ios=2417164/0, merge=0/0, ticks=324253/0, in_queue=324253, util=99.86%
[-- Attachment #6: fio-libaio-iodepth_32-numjobs_16_cached-node0.txt --]
[-- Type: text/plain, Size: 1876 bytes --]
nvme0: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
...
fio-3.25
Starting 16 processes
nvme0: (groupid=0, jobs=16): err= 0: pid=7354: Sat Mar 25 02:39:08 2023
read: IOPS=143k, BW=558MiB/s (585MB/s)(16.3GiB/30002msec)
slat (usec): min=2, max=494, avg=107.63, stdev=38.76
clat (usec): min=3, max=5129, avg=3474.62, stdev=718.16
lat (usec): min=89, max=5239, avg=3582.75, stdev=740.04
clat percentiles (usec):
| 1.00th=[ 139], 5.00th=[ 3195], 10.00th=[ 3326], 20.00th=[ 3425],
| 30.00th=[ 3490], 40.00th=[ 3556], 50.00th=[ 3589], 60.00th=[ 3654],
| 70.00th=[ 3720], 80.00th=[ 3785], 90.00th=[ 3884], 95.00th=[ 3949],
| 99.00th=[ 4146], 99.50th=[ 4228], 99.90th=[ 4359], 99.95th=[ 4424],
| 99.99th=[ 4555]
bw ( KiB/s): min=537617, max=1295837, per=100.00%, avg=572538.79, stdev=30952.47, samples=945
iops : min=134400, max=323956, avg=143130.77, stdev=7738.13, samples=945
lat (usec) : 4=0.01%, 10=0.01%, 20=0.01%, 100=0.01%, 250=4.15%
lat (usec) : 500=0.01%, 750=0.01%, 1000=0.01%
lat (msec) : 2=0.01%, 4=92.29%, 10=3.57%
cpu : usr=5.04%, sys=14.48%, ctx=4107578, majf=0, minf=932
IO depths : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
issued rwts: total=4284741,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
READ: bw=558MiB/s (585MB/s), 558MiB/s-558MiB/s (585MB/s-585MB/s), io=16.3GiB (17.6GB), run=30002-30002msec
Disk stats (read/write):
nvme0n1: ios=4661495/0, merge=0/0, ticks=444782/0, in_queue=444782, util=99.84%
[-- Attachment #7: fio-libaio-iodepth_32-numjobs_16_cached-node1.txt --]
[-- Type: text/plain, Size: 1870 bytes --]
nvme0: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
...
fio-3.25
Starting 16 processes
nvme0: (groupid=0, jobs=16): err= 0: pid=7271: Sat Mar 25 02:36:41 2023
read: IOPS=135k, BW=528MiB/s (554MB/s)(15.5GiB/30001msec)
slat (usec): min=2, max=668, avg=113.97, stdev=48.67
clat (usec): min=4, max=10041, avg=3670.72, stdev=868.33
lat (usec): min=90, max=10163, avg=3785.19, stdev=892.89
clat percentiles (usec):
| 1.00th=[ 165], 5.00th=[ 3294], 10.00th=[ 3425], 20.00th=[ 3523],
| 30.00th=[ 3621], 40.00th=[ 3654], 50.00th=[ 3720], 60.00th=[ 3785],
| 70.00th=[ 3851], 80.00th=[ 3982], 90.00th=[ 4178], 95.00th=[ 4555],
| 99.00th=[ 5669], 99.50th=[ 6325], 99.90th=[ 7767], 99.95th=[ 8455],
| 99.99th=[ 9372]
bw ( KiB/s): min=490230, max=1224562, per=100.00%, avg=541736.83, stdev=28522.06, samples=944
iops : min=122554, max=306131, avg=135429.10, stdev=7130.52, samples=944
lat (usec) : 10=0.01%, 100=0.01%, 250=4.39%, 500=0.01%, 750=0.01%
lat (usec) : 1000=0.01%
lat (msec) : 2=0.01%, 4=77.74%, 10=17.87%, 20=0.01%
cpu : usr=4.63%, sys=15.84%, ctx=3878084, majf=0, minf=935
IO depths : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
issued rwts: total=4055689,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
READ: bw=528MiB/s (554MB/s), 528MiB/s-528MiB/s (554MB/s-554MB/s), io=15.5GiB (16.6GB), run=30001-30001msec
Disk stats (read/write):
nvme0n1: ios=4409225/0, merge=0/0, ticks=443138/0, in_queue=443138, util=99.84%
[-- Attachment #8: fio-libaio-iodepth_32-numjobs_1_cached-node0.txt --]
[-- Type: text/plain, Size: 1828 bytes --]
nvme0: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
fio-3.25
Starting 1 process
nvme0: (groupid=0, jobs=1): err= 0: pid=12162: Sat Mar 25 20:01:31 2023
read: IOPS=11.6k, BW=45.3MiB/s (47.5MB/s)(1360MiB/30001msec)
slat (usec): min=62, max=233, avg=83.01, stdev=10.20
clat (usec): min=4, max=3633, avg=2672.83, stdev=83.73
lat (usec): min=96, max=3717, avg=2756.19, stdev=84.99
clat percentiles (usec):
| 1.00th=[ 2540], 5.00th=[ 2573], 10.00th=[ 2606], 20.00th=[ 2606],
| 30.00th=[ 2638], 40.00th=[ 2638], 50.00th=[ 2671], 60.00th=[ 2671],
| 70.00th=[ 2704], 80.00th=[ 2704], 90.00th=[ 2737], 95.00th=[ 2802],
| 99.00th=[ 2999], 99.50th=[ 3097], 99.90th=[ 3425], 99.95th=[ 3458],
| 99.99th=[ 3589]
bw ( KiB/s): min=44961, max=46781, per=100.00%, avg=46463.59, stdev=377.86, samples=59
iops : min=11240, max=11695, avg=11615.75, stdev=94.47, samples=59
lat (usec) : 10=0.01%, 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%
lat (usec) : 1000=0.01%
lat (msec) : 2=0.01%, 4=100.00%
cpu : usr=4.90%, sys=14.71%, ctx=348133, majf=0, minf=58
IO depths : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
issued rwts: total=348120,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
READ: bw=45.3MiB/s (47.5MB/s), 45.3MiB/s-45.3MiB/s (47.5MB/s-47.5MB/s), io=1360MiB (1426MB), run=30001-30001msec
Disk stats (read/write):
nvme0n1: ios=405265/0, merge=0/0, ticks=28620/0, in_queue=28620, util=99.83%
[-- Attachment #9: fio-libaio-iodepth_32-numjobs_1_cached-node1.txt --]
[-- Type: text/plain, Size: 1801 bytes --]
nvme0: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
fio-3.25
Starting 1 process
nvme0: (groupid=0, jobs=1): err= 0: pid=12230: Sat Mar 25 20:02:07 2023
read: IOPS=11.3k, BW=44.1MiB/s (46.2MB/s)(1322MiB/30001msec)
slat (usec): min=64, max=478, avg=85.49, stdev=10.13
clat (usec): min=5, max=3794, avg=2750.26, stdev=81.49
lat (usec): min=109, max=3871, avg=2836.09, stdev=82.66
clat percentiles (usec):
| 1.00th=[ 2638], 5.00th=[ 2671], 10.00th=[ 2671], 20.00th=[ 2704],
| 30.00th=[ 2704], 40.00th=[ 2737], 50.00th=[ 2737], 60.00th=[ 2737],
| 70.00th=[ 2769], 80.00th=[ 2802], 90.00th=[ 2835], 95.00th=[ 2900],
| 99.00th=[ 3064], 99.50th=[ 3130], 99.90th=[ 3359], 99.95th=[ 3458],
| 99.99th=[ 3621]
bw ( KiB/s): min=44793, max=45592, per=100.00%, avg=45152.73, stdev=156.54, samples=59
iops : min=11198, max=11398, avg=11287.93, stdev=39.13, samples=59
lat (usec) : 10=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
lat (msec) : 2=0.01%, 4=100.00%
cpu : usr=4.75%, sys=15.33%, ctx=338322, majf=0, minf=58
IO depths : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
issued rwts: total=338317,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
READ: bw=44.1MiB/s (46.2MB/s), 44.1MiB/s-44.1MiB/s (46.2MB/s-46.2MB/s), io=1322MiB (1386MB), run=30001-30001msec
Disk stats (read/write):
nvme0n1: ios=394292/0, merge=0/0, ticks=28596/0, in_queue=28596, util=99.83%