public inbox for linux-xfs@vger.kernel.org
* [jlayton:mgtime] [xfs]  4edee232ed:  fio.write_iops -34.9% regression
@ 2024-06-14  6:24 kernel test robot
  2024-06-20 20:16 ` Jeff Layton
  0 siblings, 1 reply; 3+ messages in thread
From: kernel test robot @ 2024-06-14  6:24 UTC (permalink / raw)
  To: Jeff Layton
  Cc: oe-lkp, lkp, linux-xfs, ying.huang, feng.tang, fengwei.yin,
	oliver.sang



Hello,

kernel test robot noticed a -34.9% regression of fio.write_iops on:


commit: 4edee232ed5d0abb9f24af7af55e3a9aa271f993 ("xfs: switch to multigrain timestamps")
https://git.kernel.org/cgit/linux/kernel/git/jlayton/linux.git mgtime

testcase: fio-basic
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	runtime: 300s
	disk: 1HDD
	fs: xfs
	nr_task: 1
	test_size: 128G
	rw: write
	bs: 4k
	ioengine: falloc
	cpufreq_governor: performance
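
For reference, a fio job file matching these parameters would presumably look like the sketch below; it is reconstructed from the parameter list above and the "global options" / "job options" fields echoed in the fio JSON output, so treat it as an approximation rather than the exact job the harness generates:

```ini
; Approximate reconstruction of the LKP-generated fio job.
; size/filesize 137438953472 = the 128G test_size above.
[global]
bs=4k
ioengine=falloc
iodepth=32
size=137438953472
direct=0
runtime=300
nrfiles=1
fallocate=posix

[task_0]
rw=write
directory=/fs/sdb1
numjobs=1
```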




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202406141453.7a44f956-oliver.sang@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240614/202406141453.7a44f956-oliver.sang@intel.com

=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase:
  4k/gcc-13/performance/1HDD/xfs/falloc/x86_64-rhel-8.3/1/debian-12-x86_64-20240206.cgz/300s/write/lkp-icl-2sp9/128G/fio-basic

commit: 
  61651220e0 ("fs: have setattr_copy handle multigrain timestamps appropriately")
  4edee232ed ("xfs: switch to multigrain timestamps")

61651220e0b91087 4edee232ed5d0abb9f24af7af55 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.97 ±  3%     -30.7%       0.67 ±  2%  iostat.cpu.user
 2.996e+09           +51.5%   4.54e+09        cpuidle..time
    222280 ±  4%     +44.7%     321595 ±  4%  cpuidle..usage
      0.01 ±  5%      -0.0        0.01 ±  6%  mpstat.cpu.all.irq%
      0.97 ±  3%      -0.3        0.66 ±  2%  mpstat.cpu.all.usr%
     88.86           +27.3%     113.13        uptime.boot
      5387           +28.4%       6916        uptime.idle
      2.98 ±  3%     -10.9%       2.65 ±  2%  vmstat.procs.r
      3475 ± 10%     -18.6%       2830 ±  6%  vmstat.system.cs
      4.65 ± 43%      -2.7        1.97 ±143%  perf-profile.calltrace.cycles-pp._free_event.perf_event_release_kernel.perf_release.__fput.task_work_run
      4.65 ± 43%      -2.7        1.97 ±143%  perf-profile.children.cycles-pp._free_event
      3.33 ± 76%      -2.4        0.90 ±141%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      3.33 ± 76%      -2.4        0.90 ±141%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
    769.93            +9.4%     842.10        proc-vmstat.nr_active_anon
      3936            +2.1%       4020        proc-vmstat.nr_shmem
    769.93            +9.4%     842.10        proc-vmstat.nr_zone_active_anon
    269328           +20.8%     325325 ± 11%  proc-vmstat.numa_hit
    203054 ±  2%     +27.6%     259008 ± 14%  proc-vmstat.numa_local
    297923           +16.3%     346459        proc-vmstat.pgalloc_normal
    181868 ±  2%     +30.2%     236868        proc-vmstat.pgfault
    173268 ±  3%     +27.2%     220312        proc-vmstat.pgfree
      9141 ±  7%     +23.5%      11288 ±  4%  proc-vmstat.pgreuse
      0.02 ± 26%      +0.1        0.10 ±  6%  fio.latency_10us%
     99.87            -8.4       91.43        fio.latency_2us%
      0.11 ± 20%      +8.4        8.47        fio.latency_4us%
     46.16           +53.3%      70.78        fio.time.elapsed_time
     46.16           +53.3%      70.78        fio.time.elapsed_time.max
     35.68           +66.7%      59.50        fio.time.system_time
      4940           +52.6%       7538        fio.time.voluntary_context_switches
      2857           -34.9%       1859        fio.write_bw_MBps
      1176           +64.4%       1933        fio.write_clat_90%_ns
      1200           +83.1%       2197        fio.write_clat_95%_ns
      1528           +46.6%       2240        fio.write_clat_99%_ns
      1167           +62.2%       1893        fio.write_clat_mean_ns
    731537           -34.9%     476002        fio.write_iops
      0.06 ±  6%     -25.5%       0.04 ±  5%  perf-stat.i.MPKI
      0.91 ±  3%      -0.2        0.67 ±  3%  perf-stat.i.branch-miss-rate%
  27659069 ±  3%     -28.0%   19920836 ±  4%  perf-stat.i.branch-misses
    822504 ±  5%     -25.2%     615111 ±  6%  perf-stat.i.cache-misses
   7527159 ±  6%     -26.9%    5499750 ±  3%  perf-stat.i.cache-references
      3394 ± 11%     -18.8%       2756 ±  7%  perf-stat.i.context-switches
      0.46 ±  2%     -13.0%       0.40        perf-stat.i.cpi
 5.727e+09 ±  2%     -12.3%   5.02e+09        perf-stat.i.cpu-cycles
     74.31            -3.0%      72.05        perf-stat.i.cpu-migrations
      2.31           +13.1%       2.61        perf-stat.i.ipc
      2905 ±  2%      -7.2%       2695 ±  2%  perf-stat.i.minor-faults
      2905 ±  2%      -7.2%       2695 ±  2%  perf-stat.i.page-faults
      0.07 ±  6%     -25.7%       0.05 ±  5%  perf-stat.overall.MPKI
      1.18 ±  3%      -0.3        0.87 ±  2%  perf-stat.overall.branch-miss-rate%
      0.48 ±  2%     -12.9%       0.42        perf-stat.overall.cpi
      6992 ±  6%     +17.1%       8190 ±  5%  perf-stat.overall.cycles-between-cache-misses
      2.09 ±  2%     +14.7%       2.40        perf-stat.overall.ipc
     16640           +53.3%      25504        perf-stat.overall.path-length
  27090197 ±  3%     -27.4%   19666246 ±  4%  perf-stat.ps.branch-misses
    805963 ±  5%     -24.6%     607413 ±  6%  perf-stat.ps.cache-misses
   7402971 ±  6%     -26.4%    5446622 ±  3%  perf-stat.ps.cache-references
      3329 ± 11%     -18.2%       2723 ±  7%  perf-stat.ps.context-switches
 5.616e+09 ±  2%     -11.7%  4.956e+09        perf-stat.ps.cpu-cycles
      2843 ±  2%      -6.5%       2657 ±  2%  perf-stat.ps.minor-faults
      2843 ±  2%      -6.5%       2657 ±  2%  perf-stat.ps.page-faults
 5.584e+11           +53.3%  8.558e+11        perf-stat.total.instructions




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [jlayton:mgtime] [xfs]  4edee232ed:  fio.write_iops -34.9% regression
  2024-06-14  6:24 [jlayton:mgtime] [xfs] 4edee232ed: fio.write_iops -34.9% regression kernel test robot
@ 2024-06-20 20:16 ` Jeff Layton
  2024-06-21  2:47   ` Oliver Sang
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff Layton @ 2024-06-20 20:16 UTC (permalink / raw)
  To: kernel test robot
  Cc: oe-lkp, lkp, linux-xfs, ying.huang, feng.tang, fengwei.yin

On Fri, 2024-06-14 at 14:24 +0800, kernel test robot wrote:
> 
> 
> Hello,
> 
> kernel test robot noticed a -34.9% regression of fio.write_iops on:
> 
> 
> commit: 4edee232ed5d0abb9f24af7af55e3a9aa271f993 ("xfs: switch to multigrain timestamps")
> https://git.kernel.org/cgit/linux/kernel/git/jlayton/linux.git mgtime
> 
> testcase: fio-basic
> test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> parameters:
> 
> 	runtime: 300s
> 	disk: 1HDD
> 	fs: xfs
> 	nr_task: 1
> 	test_size: 128G
> 	rw: write
> 	bs: 4k
> 	ioengine: falloc
> 	cpufreq_governor: performance
> 
> 
> 
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> > Reported-by: kernel test robot <oliver.sang@intel.com>
> > Closes: https://lore.kernel.org/oe-lkp/202406141453.7a44f956-oliver.sang@intel.com
> 
> 
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> 
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20240614/202406141453.7a44f956-oliver.sang@intel.com
> 
> =========================================================================================
> bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase:
>   4k/gcc-13/performance/1HDD/xfs/falloc/x86_64-rhel-8.3/1/debian-12-x86_64-20240206.cgz/300s/write/lkp-icl-2sp9/128G/fio-basic
> 
> commit: 
>   61651220e0 ("fs: have setattr_copy handle multigrain timestamps appropriately")
>   4edee232ed ("xfs: switch to multigrain timestamps")
> 
> 61651220e0b91087 4edee232ed5d0abb9f24af7af55 
> ---------------- --------------------------- 
>          %stddev     %change         %stddev
>              \          |                \  
>       0.97 ±  3%     -30.7%       0.67 ±  2%  iostat.cpu.user
>  2.996e+09           +51.5%   4.54e+09        cpuidle..time
>     222280 ±  4%     +44.7%     321595 ±  4%  cpuidle..usage
>       0.01 ±  5%      -0.0        0.01 ±  6%  mpstat.cpu.all.irq%
>       0.97 ±  3%      -0.3        0.66 ±  2%  mpstat.cpu.all.usr%
>      88.86           +27.3%     113.13        uptime.boot
>       5387           +28.4%       6916        uptime.idle
>       2.98 ±  3%     -10.9%       2.65 ±  2%  vmstat.procs.r
>       3475 ± 10%     -18.6%       2830 ±  6%  vmstat.system.cs
>       4.65 ± 43%      -2.7        1.97 ±143%  perf-profile.calltrace.cycles-pp._free_event.perf_event_release_kernel.perf_release.__fput.task_work_run
>       4.65 ± 43%      -2.7        1.97 ±143%  perf-profile.children.cycles-pp._free_event
>       3.33 ± 76%      -2.4        0.90 ±141%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
>       3.33 ± 76%      -2.4        0.90 ±141%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
>     769.93            +9.4%     842.10        proc-vmstat.nr_active_anon
>       3936            +2.1%       4020        proc-vmstat.nr_shmem
>     769.93            +9.4%     842.10        proc-vmstat.nr_zone_active_anon
>     269328           +20.8%     325325 ± 11%  proc-vmstat.numa_hit
>     203054 ±  2%     +27.6%     259008 ± 14%  proc-vmstat.numa_local
>     297923           +16.3%     346459        proc-vmstat.pgalloc_normal
>     181868 ±  2%     +30.2%     236868        proc-vmstat.pgfault
>     173268 ±  3%     +27.2%     220312        proc-vmstat.pgfree
>       9141 ±  7%     +23.5%      11288 ±  4%  proc-vmstat.pgreuse
>       0.02 ± 26%      +0.1        0.10 ±  6%  fio.latency_10us%
>      99.87            -8.4       91.43        fio.latency_2us%
>       0.11 ± 20%      +8.4        8.47        fio.latency_4us%
>      46.16           +53.3%      70.78        fio.time.elapsed_time
>      46.16           +53.3%      70.78        fio.time.elapsed_time.max
>      35.68           +66.7%      59.50        fio.time.system_time
>       4940           +52.6%       7538        fio.time.voluntary_context_switches
>       2857           -34.9%       1859        fio.write_bw_MBps
>       1176           +64.4%       1933        fio.write_clat_90%_ns
>       1200           +83.1%       2197        fio.write_clat_95%_ns
>       1528           +46.6%       2240        fio.write_clat_99%_ns
>       1167           +62.2%       1893        fio.write_clat_mean_ns
>     731537           -34.9%     476002        fio.write_iops

I've been trying for several days to reproduce this, and have been
unable so far. Is this the same value as "write.iops" in the json
output? That's been my assumption, but I wanted to check that first.

That said, I'm only getting ~500k iops at best in this test with the
rig I have, so it's possible I need something faster to show it.


>       0.06 ±  6%     -25.5%       0.04 ±  5%  perf-stat.i.MPKI
>       0.91 ±  3%      -0.2        0.67 ±  3%  perf-stat.i.branch-miss-rate%
>   27659069 ±  3%     -28.0%   19920836 ±  4%  perf-stat.i.branch-misses
>     822504 ±  5%     -25.2%     615111 ±  6%  perf-stat.i.cache-misses
>    7527159 ±  6%     -26.9%    5499750 ±  3%  perf-stat.i.cache-references
>       3394 ± 11%     -18.8%       2756 ±  7%  perf-stat.i.context-switches
>       0.46 ±  2%     -13.0%       0.40        perf-stat.i.cpi
>  5.727e+09 ±  2%     -12.3%   5.02e+09        perf-stat.i.cpu-cycles
>      74.31            -3.0%      72.05        perf-stat.i.cpu-migrations
>       2.31           +13.1%       2.61        perf-stat.i.ipc
>       2905 ±  2%      -7.2%       2695 ±  2%  perf-stat.i.minor-faults
>       2905 ±  2%      -7.2%       2695 ±  2%  perf-stat.i.page-faults
>       0.07 ±  6%     -25.7%       0.05 ±  5%  perf-stat.overall.MPKI
>       1.18 ±  3%      -0.3        0.87 ±  2%  perf-stat.overall.branch-miss-rate%
>       0.48 ±  2%     -12.9%       0.42        perf-stat.overall.cpi
>       6992 ±  6%     +17.1%       8190 ±  5%  perf-stat.overall.cycles-between-cache-misses
>       2.09 ±  2%     +14.7%       2.40        perf-stat.overall.ipc
>      16640           +53.3%      25504        perf-stat.overall.path-length
>   27090197 ±  3%     -27.4%   19666246 ±  4%  perf-stat.ps.branch-misses
>     805963 ±  5%     -24.6%     607413 ±  6%  perf-stat.ps.cache-misses
>    7402971 ±  6%     -26.4%    5446622 ±  3%  perf-stat.ps.cache-references
>       3329 ± 11%     -18.2%       2723 ±  7%  perf-stat.ps.context-switches
>  5.616e+09 ±  2%     -11.7%  4.956e+09        perf-stat.ps.cpu-cycles
>       2843 ±  2%      -6.5%       2657 ±  2%  perf-stat.ps.minor-faults
>       2843 ±  2%      -6.5%       2657 ±  2%  perf-stat.ps.page-faults
>  5.584e+11           +53.3%  8.558e+11        perf-stat.total.instructions
> 
> 
> 
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
> 
> 

Thanks!
-- 
Jeff Layton <jlayton@kernel.org>


* Re: [jlayton:mgtime] [xfs]  4edee232ed:  fio.write_iops -34.9% regression
  2024-06-20 20:16 ` Jeff Layton
@ 2024-06-21  2:47   ` Oliver Sang
  0 siblings, 0 replies; 3+ messages in thread
From: Oliver Sang @ 2024-06-21  2:47 UTC (permalink / raw)
  To: Jeff Layton
  Cc: oe-lkp, lkp, linux-xfs, ying.huang, feng.tang, fengwei.yin,
	oliver.sang

[-- Attachment #1: Type: text/plain, Size: 7819 bytes --]

Hi, Jeff Layton,

On Thu, Jun 20, 2024 at 04:16:30PM -0400, Jeff Layton wrote:
> On Fri, 2024-06-14 at 14:24 +0800, kernel test robot wrote:
> > 
> > 
> > Hello,
> > 
> > kernel test robot noticed a -34.9% regression of fio.write_iops on:
> > 
> > 
> > commit: 4edee232ed5d0abb9f24af7af55e3a9aa271f993 ("xfs: switch to multigrain timestamps")
> > https://git.kernel.org/cgit/linux/kernel/git/jlayton/linux.git mgtime
> > 
> > testcase: fio-basic
> > test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> > parameters:
> > 
> > 	runtime: 300s
> > 	disk: 1HDD
> > 	fs: xfs
> > 	nr_task: 1
> > 	test_size: 128G
> > 	rw: write
> > 	bs: 4k
> > 	ioengine: falloc
> > 	cpufreq_governor: performance
> > 
> > 
> > 
> > 
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > > Reported-by: kernel test robot <oliver.sang@intel.com>
> > > Closes: https://lore.kernel.org/oe-lkp/202406141453.7a44f956-oliver.sang@intel.com
> > 
> > 
> > Details are as below:
> > -------------------------------------------------------------------------------------------------->
> > 
> > 
> > The kernel config and materials to reproduce are available at:
> > https://download.01.org/0day-ci/archive/20240614/202406141453.7a44f956-oliver.sang@intel.com
> > 
> > =========================================================================================
> > bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase:
> >   4k/gcc-13/performance/1HDD/xfs/falloc/x86_64-rhel-8.3/1/debian-12-x86_64-20240206.cgz/300s/write/lkp-icl-2sp9/128G/fio-basic
> > 
> > commit: 
> >   61651220e0 ("fs: have setattr_copy handle multigrain timestamps appropriately")
> >   4edee232ed ("xfs: switch to multigrain timestamps")
> > 
> > 61651220e0b91087 4edee232ed5d0abb9f24af7af55 
> > ---------------- --------------------------- 
> >          %stddev     %change         %stddev
> >              \          |                \  
> >       0.97 ±  3%     -30.7%       0.67 ±  2%  iostat.cpu.user
> >  2.996e+09           +51.5%   4.54e+09        cpuidle..time
> >     222280 ±  4%     +44.7%     321595 ±  4%  cpuidle..usage
> >       0.01 ±  5%      -0.0        0.01 ±  6%  mpstat.cpu.all.irq%
> >       0.97 ±  3%      -0.3        0.66 ±  2%  mpstat.cpu.all.usr%
> >      88.86           +27.3%     113.13        uptime.boot
> >       5387           +28.4%       6916        uptime.idle
> >       2.98 ±  3%     -10.9%       2.65 ±  2%  vmstat.procs.r
> >       3475 ± 10%     -18.6%       2830 ±  6%  vmstat.system.cs
> >       4.65 ± 43%      -2.7        1.97 ±143%  perf-profile.calltrace.cycles-pp._free_event.perf_event_release_kernel.perf_release.__fput.task_work_run
> >       4.65 ± 43%      -2.7        1.97 ±143%  perf-profile.children.cycles-pp._free_event
> >       3.33 ± 76%      -2.4        0.90 ±141%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
> >       3.33 ± 76%      -2.4        0.90 ±141%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
> >     769.93            +9.4%     842.10        proc-vmstat.nr_active_anon
> >       3936            +2.1%       4020        proc-vmstat.nr_shmem
> >     769.93            +9.4%     842.10        proc-vmstat.nr_zone_active_anon
> >     269328           +20.8%     325325 ± 11%  proc-vmstat.numa_hit
> >     203054 ±  2%     +27.6%     259008 ± 14%  proc-vmstat.numa_local
> >     297923           +16.3%     346459        proc-vmstat.pgalloc_normal
> >     181868 ±  2%     +30.2%     236868        proc-vmstat.pgfault
> >     173268 ±  3%     +27.2%     220312        proc-vmstat.pgfree
> >       9141 ±  7%     +23.5%      11288 ±  4%  proc-vmstat.pgreuse
> >       0.02 ± 26%      +0.1        0.10 ±  6%  fio.latency_10us%
> >      99.87            -8.4       91.43        fio.latency_2us%
> >       0.11 ± 20%      +8.4        8.47        fio.latency_4us%
> >      46.16           +53.3%      70.78        fio.time.elapsed_time
> >      46.16           +53.3%      70.78        fio.time.elapsed_time.max
> >      35.68           +66.7%      59.50        fio.time.system_time
> >       4940           +52.6%       7538        fio.time.voluntary_context_switches
> >       2857           -34.9%       1859        fio.write_bw_MBps
> >       1176           +64.4%       1933        fio.write_clat_90%_ns
> >       1200           +83.1%       2197        fio.write_clat_95%_ns
> >       1528           +46.6%       2240        fio.write_clat_99%_ns
> >       1167           +62.2%       1893        fio.write_clat_mean_ns
> >     731537           -34.9%     476002        fio.write_iops
> 
> I've been trying for several days to reproduce this, and have been
> unable so far. Is this the same value as "write.iops" in the json
> output? That's been my assumption, but I wanted to check that first.

Right, and then we calculate the average from 6 runs.

I attached the fio.output from one run of 4edee232ed as an example; within it:

      "write" : {
        "io_bytes" : 137438953472,
        "io_kbytes" : 134217728,
        "bw_bytes" : 1950180255,
        "bw" : 1904472,
        "iops" : 476118.226321,     <-----
        "runtime" : 70475,
        "total_ios" : 33554432,
        "short_ios" : 0,
        "drop_ios" : 0,
...
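In case it helps, extracting that field and averaging it across runs can be sketched in a few lines of Python; the first run value below is the (rounded) iops from the attached fio.output, and the other two are illustrative stand-ins, not the actual six-run data:

```python
import json
from statistics import mean

def write_iops(fio_json_text: str) -> float:
    """Extract jobs[0].write.iops from one `fio --output-format=json` run."""
    return json.loads(fio_json_text)["jobs"][0]["write"]["iops"]

# Stand-ins for per-run JSON outputs; only the first value comes from
# the attached fio.output (476118.226321, rounded here).
runs = [json.dumps({"jobs": [{"write": {"iops": v}}]})
        for v in (476118.2, 475900.0, 476300.5)]

avg = mean(write_iops(r) for r in runs)
print(round(avg, 1))  # → 476106.2
```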


> 
> That said, I'm only getting ~500k iops at best in this test with the
> rig I have, so it's possible I need something faster to show it.
> 
> 
> >       0.06 ±  6%     -25.5%       0.04 ±  5%  perf-stat.i.MPKI
> >       0.91 ±  3%      -0.2        0.67 ±  3%  perf-stat.i.branch-miss-rate%
> >   27659069 ±  3%     -28.0%   19920836 ±  4%  perf-stat.i.branch-misses
> >     822504 ±  5%     -25.2%     615111 ±  6%  perf-stat.i.cache-misses
> >    7527159 ±  6%     -26.9%    5499750 ±  3%  perf-stat.i.cache-references
> >       3394 ± 11%     -18.8%       2756 ±  7%  perf-stat.i.context-switches
> >       0.46 ±  2%     -13.0%       0.40        perf-stat.i.cpi
> >  5.727e+09 ±  2%     -12.3%   5.02e+09        perf-stat.i.cpu-cycles
> >      74.31            -3.0%      72.05        perf-stat.i.cpu-migrations
> >       2.31           +13.1%       2.61        perf-stat.i.ipc
> >       2905 ±  2%      -7.2%       2695 ±  2%  perf-stat.i.minor-faults
> >       2905 ±  2%      -7.2%       2695 ±  2%  perf-stat.i.page-faults
> >       0.07 ±  6%     -25.7%       0.05 ±  5%  perf-stat.overall.MPKI
> >       1.18 ±  3%      -0.3        0.87 ±  2%  perf-stat.overall.branch-miss-rate%
> >       0.48 ±  2%     -12.9%       0.42        perf-stat.overall.cpi
> >       6992 ±  6%     +17.1%       8190 ±  5%  perf-stat.overall.cycles-between-cache-misses
> >       2.09 ±  2%     +14.7%       2.40        perf-stat.overall.ipc
> >      16640           +53.3%      25504        perf-stat.overall.path-length
> >   27090197 ±  3%     -27.4%   19666246 ±  4%  perf-stat.ps.branch-misses
> >     805963 ±  5%     -24.6%     607413 ±  6%  perf-stat.ps.cache-misses
> >    7402971 ±  6%     -26.4%    5446622 ±  3%  perf-stat.ps.cache-references
> >       3329 ± 11%     -18.2%       2723 ±  7%  perf-stat.ps.context-switches
> >  5.616e+09 ±  2%     -11.7%  4.956e+09        perf-stat.ps.cpu-cycles
> >       2843 ±  2%      -6.5%       2657 ±  2%  perf-stat.ps.minor-faults
> >       2843 ±  2%      -6.5%       2657 ±  2%  perf-stat.ps.page-faults
> >  5.584e+11           +53.3%  8.558e+11        perf-stat.total.instructions
> > 
> > 
> > 
> > 
> > Disclaimer:
> > Results have been estimated based on internal Intel analysis and are provided
> > for informational purposes only. Any difference in system hardware or software
> > design or configuration may affect actual performance.
> > 
> > 
> 
> Thanks!
> -- 
> Jeff Layton <jlayton@kernel.org>

[-- Attachment #2: fio.output --]
[-- Type: text/plain, Size: 6864 bytes --]

note: both iodepth >= 1 and synchronous I/O engine are selected, queue depth will be capped at 1
{
  "fio version" : "fio-3.33",
  "timestamp" : 1718334968,
  "timestamp_ms" : 1718334968213,
  "time" : "Fri Jun 14 03:16:08 2024",
  "global options" : {
    "bs" : "4k",
    "ioengine" : "falloc",
    "iodepth" : "32",
    "size" : "137438953472",
    "direct" : "0",
    "runtime" : "300",
    "file_service_type" : "roundrobin",
    "random_distribution" : "random",
    "pre_read" : "0",
    "nrfiles" : "1",
    "filesize" : "137438953472",
    "invalidate" : "1",
    "fallocate" : "posix",
    "io_size" : "137438953472"
  },
  "jobs" : [
    {
      "jobname" : "task_0",
      "groupid" : 0,
      "error" : 0,
      "eta" : 0,
      "elapsed" : 71,
      "job options" : {
        "rw" : "write",
        "directory" : "/fs/sdb1",
        "numjobs" : "1"
      },
      "read" : {
        "io_bytes" : 0,
        "io_kbytes" : 0,
        "bw_bytes" : 0,
        "bw" : 0,
        "iops" : 0.000000,
        "runtime" : 0,
        "total_ios" : 0,
        "short_ios" : 0,
        "drop_ios" : 0,
        "slat_ns" : {
          "min" : 0,
          "max" : 0,
          "mean" : 0.000000,
          "stddev" : 0.000000,
          "N" : 0
        },
        "clat_ns" : {
          "min" : 0,
          "max" : 0,
          "mean" : 0.000000,
          "stddev" : 0.000000,
          "N" : 0
        },
        "lat_ns" : {
          "min" : 0,
          "max" : 0,
          "mean" : 0.000000,
          "stddev" : 0.000000,
          "N" : 0
        },
        "bw_min" : 0,
        "bw_max" : 0,
        "bw_agg" : 0.000000,
        "bw_mean" : 0.000000,
        "bw_dev" : 0.000000,
        "bw_samples" : 0,
        "iops_min" : 0,
        "iops_max" : 0,
        "iops_mean" : 0.000000,
        "iops_stddev" : 0.000000,
        "iops_samples" : 0
      },
      "write" : {
        "io_bytes" : 137438953472,
        "io_kbytes" : 134217728,
        "bw_bytes" : 1950180255,
        "bw" : 1904472,
        "iops" : 476118.226321,
        "runtime" : 70475,
        "total_ios" : 33554432,
        "short_ios" : 0,
        "drop_ios" : 0,
        "slat_ns" : {
          "min" : 0,
          "max" : 0,
          "mean" : 0.000000,
          "stddev" : 0.000000,
          "N" : 0
        },
        "clat_ns" : {
          "min" : 1778,
          "max" : 71068,
          "mean" : 1894.879822,
          "stddev" : 149.956185,
          "N" : 33554432,
          "percentile" : {
            "1.000000" : 1816,
            "5.000000" : 1816,
            "10.000000" : 1832,
            "20.000000" : 1832,
            "30.000000" : 1848,
            "40.000000" : 1848,
            "50.000000" : 1864,
            "60.000000" : 1864,
            "70.000000" : 1880,
            "80.000000" : 1896,
            "90.000000" : 1944,
            "95.000000" : 2192,
            "99.000000" : 2256,
            "99.500000" : 2288,
            "99.900000" : 4080,
            "99.950000" : 4512,
            "99.990000" : 5408
          }
        },
        "lat_ns" : {
          "min" : 1812,
          "max" : 71102,
          "mean" : 1929.203544,
          "stddev" : 150.831850,
          "N" : 33554432
        },
        "bw_min" : 1823696,
        "bw_max" : 1925208,
        "bw_agg" : 100.000000,
        "bw_mean" : 1904624.685714,
        "bw_dev" : 19260.493657,
        "bw_samples" : 140,
        "iops_min" : 455924,
        "iops_max" : 481302,
        "iops_mean" : 476156.171429,
        "iops_stddev" : 4815.123414,
        "iops_samples" : 140
      },
      "trim" : {
        "io_bytes" : 0,
        "io_kbytes" : 0,
        "bw_bytes" : 0,
        "bw" : 0,
        "iops" : 0.000000,
        "runtime" : 0,
        "total_ios" : 0,
        "short_ios" : 0,
        "drop_ios" : 0,
        "slat_ns" : {
          "min" : 0,
          "max" : 0,
          "mean" : 0.000000,
          "stddev" : 0.000000,
          "N" : 0
        },
        "clat_ns" : {
          "min" : 0,
          "max" : 0,
          "mean" : 0.000000,
          "stddev" : 0.000000,
          "N" : 0
        },
        "lat_ns" : {
          "min" : 0,
          "max" : 0,
          "mean" : 0.000000,
          "stddev" : 0.000000,
          "N" : 0
        },
        "bw_min" : 0,
        "bw_max" : 0,
        "bw_agg" : 0.000000,
        "bw_mean" : 0.000000,
        "bw_dev" : 0.000000,
        "bw_samples" : 0,
        "iops_min" : 0,
        "iops_max" : 0,
        "iops_mean" : 0.000000,
        "iops_stddev" : 0.000000,
        "iops_samples" : 0
      },
      "sync" : {
        "total_ios" : 0,
        "lat_ns" : {
          "min" : 0,
          "max" : 0,
          "mean" : 0.000000,
          "stddev" : 0.000000,
          "N" : 0
        }
      },
      "job_runtime" : 70474,
      "usr_cpu" : 15.249596,
      "sys_cpu" : 84.560263,
      "ctx" : 177,
      "majf" : 0,
      "minf" : 11,
      "iodepth_level" : {
        "1" : 100.000000,
        "2" : 0.000000,
        "4" : 0.000000,
        "8" : 0.000000,
        "16" : 0.000000,
        "32" : 0.000000,
        ">=64" : 0.000000
      },
      "iodepth_submit" : {
        "0" : 0.000000,
        "4" : 100.000000,
        "8" : 0.000000,
        "16" : 0.000000,
        "32" : 0.000000,
        "64" : 0.000000,
        ">=64" : 0.000000
      },
      "iodepth_complete" : {
        "0" : 0.000000,
        "4" : 100.000000,
        "8" : 0.000000,
        "16" : 0.000000,
        "32" : 0.000000,
        "64" : 0.000000,
        ">=64" : 0.000000
      },
      "latency_ns" : {
        "2" : 0.000000,
        "4" : 0.000000,
        "10" : 0.000000,
        "20" : 0.000000,
        "50" : 0.000000,
        "100" : 0.000000,
        "250" : 0.000000,
        "500" : 0.000000,
        "750" : 0.000000,
        "1000" : 0.000000
      },
      "latency_us" : {
        "2" : 91.456613,
        "4" : 8.435845,
        "10" : 0.107190,
        "20" : 0.010000,
        "50" : 0.010000,
        "100" : 0.010000,
        "250" : 0.000000,
        "500" : 0.000000,
        "750" : 0.000000,
        "1000" : 0.000000
      },
      "latency_ms" : {
        "2" : 0.000000,
        "4" : 0.000000,
        "10" : 0.000000,
        "20" : 0.000000,
        "50" : 0.000000,
        "100" : 0.000000,
        "250" : 0.000000,
        "500" : 0.000000,
        "750" : 0.000000,
        "1000" : 0.000000,
        "2000" : 0.000000,
        ">=2000" : 0.000000
      },
      "latency_depth" : 32,
      "latency_target" : 0,
      "latency_percentile" : 100.000000,
      "latency_window" : 0
    }
  ],
  "disk_util" : [
    {
      "name" : "sdb",
      "read_ios" : 0,
      "write_ios" : 18,
      "read_merges" : 0,
      "write_merges" : 6,
      "read_ticks" : 0,
      "write_ticks" : 5662,
      "in_queue" : 6085,
      "util" : 1.103687
    }
  ]
}

