public inbox for linux-btrfs@vger.kernel.org
From: Filipe Manana <fdmanana@kernel.org>
To: kernel test robot <oliver.sang@intel.com>
Cc: Filipe Manana <fdmanana@suse.com>,
	oe-lkp@lists.linux.dev, lkp@intel.com,
	David Sterba <dsterba@suse.com>,
	Josef Bacik <josef@toxicpanda.com>,
	linux-btrfs@vger.kernel.org, ying.huang@intel.com,
	feng.tang@intel.com, fengwei.yin@intel.com
Subject: Re: [kdave-btrfs-devel:dev/guilherme/temp-fsid-v4] [btrfs] 6c9131ed0d: stress-ng.sync-file.ops_per_sec -44.2% regression
Date: Tue, 26 Sep 2023 20:01:35 +0100
Message-ID: <ZRMqjzDP/G+MKL5R@debian0.Home>
In-Reply-To: <202309261552.a03eeb4c-oliver.sang@intel.com>

On Tue, Sep 26, 2023 at 03:34:59PM +0800, kernel test robot wrote:
> 
> 
> Hello,
> 
> kernel test robot noticed a -44.2% regression of stress-ng.sync-file.ops_per_sec on:
> 
> 
> commit: 6c9131ed0d644324adeeaccd2feeef8d04950b2d ("btrfs: always reserve space for delayed refs when starting transaction")
> https://github.com/kdave/btrfs-devel.git dev/guilherme/temp-fsid-v4

David, can you remove this patch from misc-next/for-next in the meantime?

Starting to reserve space in advance for delayed refs is causing the slowdown,
and I can reproduce it with the stress-ng test reported below.

By not refilling the delayed refs block reserve I can recover about 60% of the
lost performance, but in extreme scenarios that increases the chance of
exhausting the global reserve and hitting a dead-end -ENOSPC while committing
a transaction. That failure has happened, though rarely, both upstream and on
SLE kernels.
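
As a rough illustration of what recovering about 60% of the lost performance
would mean, the sketch below applies that fraction to the ops_per_sec figures
from the robot's report quoted further down. This is an assumption made only
for illustration: the 60% was observed on a different machine, so the result
is not a measurement, and the helper names are hypothetical.

```python
# Figures taken from the robot's report (stress-ng.sync-file.ops_per_sec).
BASELINE_OPS = 344.72  # parent commit d80879b7d6
PATCHED_OPS = 192.43   # commit 6c9131ed0d


def regression_pct(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def recovered_ops(before: float, after: float, fraction: float) -> float:
    """Throughput if `fraction` of the lost throughput were recovered."""
    return after + (before - after) * fraction


# Assumption: the ~60% recovery scales linearly onto the robot's numbers.
estimate = recovered_ops(BASELINE_OPS, PATCHED_OPS, 0.60)

print(f"reported regression: {regression_pct(BASELINE_OPS, PATCHED_OPS):.1f}%")
print(f"estimated ops/sec without the refill: {estimate:.2f}")
print(f"estimated remaining regression: {regression_pct(BASELINE_OPS, estimate):.1f}%")
```

Under that assumption a noticeable gap to the baseline would remain even
without the refill, which is why dropping the refill alone does not settle it.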

At the moment I don't see how to keep both the upfront reservation of space for
delayed refs and the refill of the delayed refs reserve without a performance
impact on a test like this stress-ng one.

Thanks.


> 
> testcase: stress-ng
> test machine: 36 threads 1 sockets Intel(R) Core(TM) i9-9980XE CPU @ 3.00GHz (Skylake) with 32G memory
> parameters:
> 
> 	nr_threads: 10%
> 	disk: 1SSD
> 	testtime: 60s
> 	fs: btrfs
> 	class: filesystem
> 	test: sync-file
> 	cpufreq_governor: performance
> 
> 
> In addition to that, the commit also has significant impact on the following tests:
> 
> +------------------+-------------------------------------------------------------------------------------------+
> | testcase: change | fio-basic: fio.write_iops -11.2% regression                                               |
> | test machine     | 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory |
> | test parameters  | bs=4k                                                                                     |
> |                  | cpufreq_governor=performance                                                              |
> |                  | disk=1HDD                                                                                 |
> |                  | fs=btrfs                                                                                  |
> |                  | ioengine=ftruncate                                                                        |
> |                  | nr_task=100%                                                                              |
> |                  | runtime=300s                                                                              |
> |                  | rw=write                                                                                  |
> |                  | test_size=128G                                                                            |
> +------------------+-------------------------------------------------------------------------------------------+
> 
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add the following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202309261552.a03eeb4c-oliver.sang@intel.com
> 
> 
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> 
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20230926/202309261552.a03eeb4c-oliver.sang@intel.com
> 
> =========================================================================================
> class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
>   filesystem/gcc-12/performance/1SSD/btrfs/x86_64-rhel-8.3/10%/debian-11.1-x86_64-20220510.cgz/lkp-skl-d08/sync-file/stress-ng/60s
> 
> commit: 
>   d80879b7d6 ("btrfs: stop doing excessive space reservation for csum deletion")
>   6c9131ed0d ("btrfs: always reserve space for delayed refs when starting transaction")
> 
> d80879b7d6aff432 6c9131ed0d644324adeeaccd2fe 
> ---------------- --------------------------- 
>          %stddev     %change         %stddev
>              \          |                \  
>      10499 ± 10%     -18.6%       8544 ±  3%  meminfo.Active(file)
>      91.01            +1.6%      92.44        iostat.cpu.idle
>       0.13 ±  4%    +742.3%       1.12 ± 12%  iostat.cpu.iowait
>       4.19           -16.5%       3.49        iostat.cpu.system
>       4.67           -36.9%       2.94 ±  4%  iostat.cpu.user
>     135390 ±  3%     +46.3%     198023 ±  3%  vmstat.io.bo
>       3.00           -33.3%       2.00        vmstat.procs.r
>      36293           +17.3%      42567 ±  2%  vmstat.system.cs
>      52794            +2.8%      54279        vmstat.system.in
>       0.13 ±  4%      +1.0        1.15 ± 12%  mpstat.cpu.all.iowait%
>       0.72 ±  2%      +0.3        0.99 ±  4%  mpstat.cpu.all.irq%
>       0.03 ±  7%      +0.0        0.04 ±  7%  mpstat.cpu.all.soft%
>       3.50            -1.0        2.51 ±  2%  mpstat.cpu.all.sys%
>       4.78            -1.8        3.00 ±  4%  mpstat.cpu.all.usr%
>      20683           -44.2%      11546 ±  5%  stress-ng.sync-file.ops
>     344.72           -44.2%     192.43 ±  5%  stress-ng.sync-file.ops_per_sec
>      71365 ±  3%    +502.5%     429994 ±  8%  stress-ng.time.file_system_outputs
>     273.50           -42.5%     157.33 ±  4%  stress-ng.time.percent_of_cpu_this_job_got
>      72.33           -41.6%      42.21 ±  4%  stress-ng.time.system_time
>      97.83           -43.0%      55.78 ±  4%  stress-ng.time.user_time
>      54130 ±  2%     +45.3%      78669 ±  6%  stress-ng.time.voluntary_context_switches
>     359.83           -31.2%     247.50 ±  2%  turbostat.Avg_MHz
>       9.56            -2.4        7.20        turbostat.Busy%
>       3764            -8.7%       3436        turbostat.Bzy_MHz
>    1083169           +21.5%    1316512 ±  2%  turbostat.C1E
>       4.48            +2.8        7.24 ±  6%  turbostat.C1E%
>       0.11           +30.3%       0.14 ±  3%  turbostat.IPC
>      45335 ±  8%    +122.2%     100745 ±  8%  turbostat.POLL
>       0.02            +0.0        0.06 ± 11%  turbostat.POLL%
>      84.17           -12.5%      73.64        turbostat.PkgWatt
>      11555           -47.0%       6125 ± 33%  sched_debug.cfs_rq:/.avg_vruntime.avg
>     105327 ±  6%     -52.8%      49736 ± 38%  sched_debug.cfs_rq:/.avg_vruntime.max
>      22835 ±  3%     -55.8%      10104 ± 38%  sched_debug.cfs_rq:/.avg_vruntime.stddev
>      11555           -47.0%       6125 ± 33%  sched_debug.cfs_rq:/.min_vruntime.avg
>     105327 ±  6%     -52.8%      49736 ± 38%  sched_debug.cfs_rq:/.min_vruntime.max
>      22836 ±  3%     -55.8%      10104 ± 38%  sched_debug.cfs_rq:/.min_vruntime.stddev
>      79.98 ± 15%     -28.8%      56.92 ± 16%  sched_debug.cfs_rq:/.util_est_enqueued.avg
>     832.17           -21.1%     656.67 ± 13%  sched_debug.cfs_rq:/.util_est_enqueued.max
>     215.60 ±  5%     -29.5%     151.99 ± 12%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
>     222796 ± 14%     -46.0%     120393 ± 45%  sched_debug.cpu.nr_switches.max
>      43389 ±  9%     -40.3%      25896 ± 50%  sched_debug.cpu.nr_switches.stddev
>       2624 ± 10%     -18.7%       2133 ±  3%  proc-vmstat.nr_active_file
>    2228941 ±  3%     +43.9%    3208058 ±  3%  proc-vmstat.nr_dirtied
>      10236            +2.7%      10513        proc-vmstat.nr_mapped
>    2228950 ±  3%     +43.9%    3208079 ±  3%  proc-vmstat.nr_written
>       2624 ± 10%     -18.7%       2133 ±  3%  proc-vmstat.nr_zone_active_file
>    1791181 ±  2%     +30.5%    2337777 ±  4%  proc-vmstat.numa_hit
>    1788181 ±  2%     +31.0%    2342800 ±  4%  proc-vmstat.numa_local
>     442749 ±  2%     +53.9%     681290 ± 10%  proc-vmstat.pgactivate
>    1814457 ±  2%     +30.6%    2369210 ±  4%  proc-vmstat.pgalloc_normal
>     209992            +2.7%     215559        proc-vmstat.pgfault
>    1767669 ±  2%     +31.2%    2319090 ±  4%  proc-vmstat.pgfree
>    8933836 ±  3%     +45.9%   13035524 ±  3%  proc-vmstat.pgpgout
>       0.00 ± 58%    +236.4%       0.01 ± 74%  perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.barrier_all_devices.write_all_supers.btrfs_commit_transaction
>       0.00 ± 44%    +400.0%       0.01 ± 16%  perf-sched.sch_delay.avg.ms.btrfs_commit_transaction.flush_space.btrfs_async_reclaim_metadata_space.process_one_work
>       0.00 ± 50%    +160.0%       0.01 ± 36%  perf-sched.sch_delay.avg.ms.btrfs_start_ordered_extent.btrfs_wait_ordered_range.__btrfs_wait_cache_io.btrfs_start_dirty_block_groups
>       0.00           +58.3%       0.00 ± 11%  perf-sched.sch_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
>       0.00          +116.7%       0.00 ± 17%  perf-sched.sch_delay.avg.ms.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
>       0.00 ±156%    +353.8%       0.01 ±  3%  perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.__btrfs_tree_lock
>       0.00 ± 83%    +188.9%       0.00 ± 31%  perf-sched.sch_delay.avg.ms.schedule_timeout.io_schedule_timeout.__wait_for_common.barrier_all_devices
>       0.00           +44.4%       0.00 ± 17%  perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
>       0.00 ±  9%     +34.8%       0.01 ±  7%  perf-sched.sch_delay.avg.ms.wait_current_trans.start_transaction.btrfs_replace_file_extents.insert_prealloc_file_extent
>       0.00 ± 12%     +50.0%       0.01 ± 20%  perf-sched.sch_delay.avg.ms.wait_current_trans.start_transaction.btrfs_truncate.btrfs_setsize.isra
>       0.01 ± 64%    +259.2%       0.03 ± 49%  perf-sched.sch_delay.max.ms.__cond_resched.__wait_for_common.barrier_all_devices.write_all_supers.btrfs_commit_transaction
>       0.01 ± 38%    +102.6%       0.01 ±  8%  perf-sched.sch_delay.max.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
>       0.00 ± 10%    +542.9%       0.03 ± 63%  perf-sched.sch_delay.max.ms.btrfs_commit_transaction.flush_space.btrfs_async_reclaim_metadata_space.process_one_work
>       0.01 ± 51%    +256.6%       0.05 ± 42%  perf-sched.sch_delay.max.ms.btrfs_start_ordered_extent.btrfs_wait_ordered_range.__btrfs_wait_cache_io.btrfs_start_dirty_block_groups
>       0.02 ± 39%    +244.2%       0.06 ±  9%  perf-sched.sch_delay.max.ms.btrfs_start_ordered_extent.btrfs_wait_ordered_range.__btrfs_wait_cache_io.btrfs_write_dirty_block_groups
>       0.01 ± 36%     +66.7%       0.02 ± 33%  perf-sched.sch_delay.max.ms.handle_reserve_ticket.__reserve_bytes.btrfs_reserve_metadata_bytes.btrfs_delayed_refs_rsv_refill
>       0.03 ± 27%     +95.0%       0.06 ± 22%  perf-sched.sch_delay.max.ms.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
>       0.01 ± 10%    +268.8%       0.05 ± 13%  perf-sched.sch_delay.max.ms.io_schedule.folio_wait_bit_common.write_all_supers.btrfs_commit_transaction
>       0.01 ± 10%     +66.1%       0.02 ± 19%  perf-sched.sch_delay.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
>       0.00 ±156%   +1407.7%       0.03 ± 25%  perf-sched.sch_delay.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.__btrfs_tree_lock
>       0.02 ± 16%    +190.0%       0.05 ± 15%  perf-sched.sch_delay.max.ms.schedule_timeout.io_schedule_timeout.__wait_for_common.barrier_all_devices
>       0.01 ± 24%    +120.6%       0.01 ±  6%  perf-sched.sch_delay.max.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
>       0.01 ± 13%    +112.7%       0.02 ± 18%  perf-sched.sch_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
>       0.01 ± 43%    +233.3%       0.04 ± 12%  perf-sched.sch_delay.max.ms.wait_current_trans.start_transaction.btrfs_replace_file_extents.insert_prealloc_file_extent
>       0.01 ± 57%    +189.2%       0.03 ± 28%  perf-sched.sch_delay.max.ms.wait_current_trans.start_transaction.btrfs_truncate.btrfs_setsize.isra
>       2.55 ± 31%     -35.1%       1.66 ± 42%  perf-sched.sch_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
>      91749           +16.3%     106682 ±  6%  perf-sched.total_wait_and_delay.count.ms
>       0.11 ±  7%    +673.4%       0.88 ± 33%  perf-sched.wait_and_delay.avg.ms.handle_reserve_ticket.__reserve_bytes.btrfs_reserve_metadata_bytes.start_transaction
>       0.07 ±  7%     +56.4%       0.11 ± 37%  perf-sched.wait_and_delay.avg.ms.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
>     304.58            -8.8%     277.73 ±  4%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
>       6.63 ± 14%     -29.9%       4.65 ± 15%  perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
>     563.39 ±  5%     -22.5%     436.67 ±  6%  perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
>       2886 ± 13%    +245.8%       9982 ± 23%  perf-sched.wait_and_delay.count.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
>     622.83 ± 10%     +72.0%       1071 ± 13%  perf-sched.wait_and_delay.count.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
>     316.33 ±  6%     +48.7%     470.33 ±  7%  perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
>       6577 ±  8%     +14.1%       7504 ± 10%  perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
>       0.12 ± 36%    +346.4%       0.55 ± 42%  perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
>       0.11 ±  6%    +703.4%       0.87 ± 33%  perf-sched.wait_time.avg.ms.handle_reserve_ticket.__reserve_bytes.btrfs_reserve_metadata_bytes.start_transaction
>       0.07 ±  7%     +55.7%       0.11 ± 37%  perf-sched.wait_time.avg.ms.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
>     304.58            -8.8%     277.73 ±  4%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
>       0.01 ± 62%   +1181.8%       0.19 ±192%  perf-sched.wait_time.avg.ms.schedule_timeout.io_schedule_timeout.__wait_for_common.barrier_all_devices
>       6.63 ± 14%     -29.9%       4.64 ± 15%  perf-sched.wait_time.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
>     562.83 ±  6%     -22.4%     436.66 ±  6%  perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
>       0.23 ± 45%    +222.6%       0.74 ± 36%  perf-sched.wait_time.avg.ms.wait_current_trans.start_transaction.btrfs_replace_file_extents.insert_prealloc_file_extent
>       0.10 ± 29%   +1271.7%       1.32 ± 26%  perf-sched.wait_time.avg.ms.wait_current_trans.start_transaction.btrfs_sync_file.__x64_sys_fdatasync
>       0.17 ±121%    +755.1%       1.42 ± 43%  perf-sched.wait_time.avg.ms.wait_current_trans.start_transaction.btrfs_truncate.btrfs_setsize.isra
>       1.52 ± 18%    +820.9%      14.02 ±127%  perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
>       1.19 ± 69%    +608.9%       8.46 ± 80%  perf-sched.wait_time.max.ms.io_schedule.folio_wait_bit_common.write_all_supers.btrfs_commit_transaction
>       0.15 ±203%   +4798.4%       7.11 ± 64%  perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.__btrfs_tree_lock
>       0.07 ±  5%  +40103.4%      27.27 ±158%  perf-sched.wait_time.max.ms.schedule_timeout.io_schedule_timeout.__wait_for_common.barrier_all_devices
>       4.89 ±125%    +179.1%      13.66 ± 40%  perf-sched.wait_time.max.ms.wait_current_trans.start_transaction.btrfs_replace_file_extents.insert_prealloc_file_extent
>       0.30 ± 35%   +2013.1%       6.40 ± 41%  perf-sched.wait_time.max.ms.wait_current_trans.start_transaction.btrfs_sync_file.__x64_sys_fdatasync
>       4.37 ±124%    +172.7%      11.91 ± 51%  perf-sched.wait_time.max.ms.wait_current_trans.start_transaction.btrfs_truncate.btrfs_setsize.isra
>       0.11 ± 10%     +33.0%       0.14 ±  3%  perf-stat.i.MPKI
>  8.058e+08           -15.9%  6.778e+08        perf-stat.i.branch-instructions
>       1.74            +0.1        1.82        perf-stat.i.branch-miss-rate%
>   17570571            -7.7%   16220035 ±  2%  perf-stat.i.branch-misses
>       6.91 ±  4%      -0.7        6.19 ±  4%  perf-stat.i.cache-miss-rate%
>     707872 ±  3%     +11.4%     788836        perf-stat.i.cache-misses
>    6021967 ±  2%     +37.0%    8247385 ±  2%  perf-stat.i.cache-references
>      37785           +17.4%      44340        perf-stat.i.context-switches
>       3.13           -17.2%       2.59        perf-stat.i.cpi
>  1.257e+10           -32.6%   8.47e+09 ±  3%  perf-stat.i.cpu-cycles
>      68.81 ±  2%     +23.9%      85.27 ±  3%  perf-stat.i.cpu-migrations
>      88843 ±  4%     -52.0%      42678 ±  8%  perf-stat.i.cycles-between-cache-misses
>       0.45            -0.1        0.32 ±  3%  perf-stat.i.dTLB-load-miss-rate%
>    4584350           -43.1%    2608740 ±  4%  perf-stat.i.dTLB-load-misses
>  1.105e+09           -16.0%  9.279e+08        perf-stat.i.dTLB-loads
>  6.732e+08           -22.3%  5.232e+08 ±  2%  perf-stat.i.dTLB-stores
>      78.39            -9.3       69.11        perf-stat.i.iTLB-load-miss-rate%
>    4906270 ±  2%     -45.2%    2687083 ±  5%  perf-stat.i.iTLB-load-misses
>    1221451 ±  4%     -10.3%    1095174 ±  2%  perf-stat.i.iTLB-loads
>  4.415e+09           -16.6%   3.68e+09        perf-stat.i.instructions
>       0.36           +19.4%       0.43        perf-stat.i.ipc
>       0.35           -32.6%       0.24 ±  3%  perf-stat.i.metric.GHz
>     205.37 ±  2%     +29.2%     265.27 ±  2%  perf-stat.i.metric.K/sec
>      71.75           -17.6%      59.11        perf-stat.i.metric.M/sec
>       1910            +4.1%       1989        perf-stat.i.minor-faults
>      65257 ±  6%     +52.0%      99201        perf-stat.i.node-stores
>       1910            +4.1%       1989        perf-stat.i.page-faults
>       0.16 ±  2%     +33.7%       0.21        perf-stat.overall.MPKI
>       2.18            +0.2        2.39        perf-stat.overall.branch-miss-rate%
>      11.75 ±  3%      -2.2        9.57 ±  3%  perf-stat.overall.cache-miss-rate%
>       2.85           -19.2%       2.30        perf-stat.overall.cpi
>      17782 ±  3%     -39.6%      10745 ±  2%  perf-stat.overall.cycles-between-cache-misses
>       0.41            -0.1        0.28 ±  3%  perf-stat.overall.dTLB-load-miss-rate%
>       0.00 ±  3%      +0.0        0.00 ±  7%  perf-stat.overall.dTLB-store-miss-rate%
>      80.06            -9.1       71.01        perf-stat.overall.iTLB-load-miss-rate%
>     900.18 ±  2%     +52.4%       1372 ±  4%  perf-stat.overall.instructions-per-iTLB-miss
>       0.35           +23.7%       0.43        perf-stat.overall.ipc
>   7.93e+08           -15.9%  6.671e+08        perf-stat.ps.branch-instructions
>   17290247            -7.6%   15967777 ±  2%  perf-stat.ps.branch-misses
>     696216 ±  3%     +11.4%     775874        perf-stat.ps.cache-misses
>    5925900 ±  2%     +37.0%    8117167 ±  2%  perf-stat.ps.cache-references
>      37185           +17.3%      43634        perf-stat.ps.context-switches
>  1.237e+10           -32.6%  8.337e+09 ±  3%  perf-stat.ps.cpu-cycles
>      67.72 ±  2%     +23.9%      83.91 ±  3%  perf-stat.ps.cpu-migrations
>    4512549           -43.1%    2567904 ±  4%  perf-stat.ps.dTLB-load-misses
>  1.088e+09           -16.0%  9.133e+08        perf-stat.ps.dTLB-loads
>  6.626e+08           -22.3%  5.149e+08 ±  2%  perf-stat.ps.dTLB-stores
>    4829299 ±  2%     -45.2%    2645024 ±  5%  perf-stat.ps.iTLB-load-misses
>    1202224 ±  4%     -10.3%    1077818 ±  2%  perf-stat.ps.iTLB-loads
>  4.345e+09           -16.6%  3.622e+09        perf-stat.ps.instructions
>       1878            +4.2%       1956        perf-stat.ps.minor-faults
>      64207 ±  6%     +52.0%      97622        perf-stat.ps.node-stores
>       1878            +4.2%       1957        perf-stat.ps.page-faults
>  2.763e+11           -16.8%  2.298e+11        perf-stat.total.instructions
>      66.48 ±  4%     -10.1       56.36        perf-profile.calltrace.cycles-pp.sync_file_range
>      36.12 ±  4%      -5.5       30.61        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.sync_file_range
>      25.64 ±  4%      -3.9       21.72 ±  2%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.sync_file_range
>      18.86 ±  5%      -2.9       15.91        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.sync_file_range
>      16.66 ±  5%      -2.7       13.94 ±  2%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.sync_file_range
>       9.10 ±  5%      -1.2        7.89 ±  2%  perf-profile.calltrace.cycles-pp.__entry_text_start.sync_file_range
>       7.30 ±  4%      -0.9        6.36 ±  3%  perf-profile.calltrace.cycles-pp.__x64_sys_sync_file_range.do_syscall_64.entry_SYSCALL_64_after_hwframe.sync_file_range
>       3.20 ±  5%      -0.5        2.73 ±  4%  perf-profile.calltrace.cycles-pp.sync_file_range.__x64_sys_sync_file_range.do_syscall_64.entry_SYSCALL_64_after_hwframe.sync_file_range
>       2.64 ±  6%      -0.3        2.32 ±  5%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.sync_file_range
>       1.07 ±  8%      -0.2        0.91 ±  5%  perf-profile.calltrace.cycles-pp.file_fdatawait_range.sync_file_range.__x64_sys_sync_file_range.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       0.96 ±  9%      -0.1        0.83 ±  6%  perf-profile.calltrace.cycles-pp.__filemap_fdatawait_range.file_fdatawait_range.sync_file_range.__x64_sys_sync_file_range.do_syscall_64
>       0.75 ±  7%      -0.1        0.65 ±  7%  perf-profile.calltrace.cycles-pp.__filemap_fdatawait_range.file_fdatawait_range.__x64_sys_sync_file_range.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       0.57 ±  6%      +0.2        0.81 ±  9%  perf-profile.calltrace.cycles-pp.clock_nanosleep
>       0.62 ± 10%      +0.5        1.13 ±  9%  perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt
>       0.62 ±  9%      +0.5        1.14 ±  9%  perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt
>       0.68 ±  9%      +0.6        1.27 ±  9%  perf-profile.calltrace.cycles-pp.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt
>       0.00            +0.6        0.62 ± 16%  perf-profile.calltrace.cycles-pp.__btrfs_write_out_cache.btrfs_write_out_cache.btrfs_write_dirty_block_groups.commit_cowonly_roots.btrfs_commit_transaction
>       0.00            +0.6        0.62 ± 16%  perf-profile.calltrace.cycles-pp.btrfs_write_out_cache.btrfs_write_dirty_block_groups.commit_cowonly_roots.btrfs_commit_transaction.flush_space
>       0.38 ± 70%      +0.6        1.01 ± 10%  perf-profile.calltrace.cycles-pp.scheduler_tick.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues
>       0.00            +0.7        0.66 ± 12%  perf-profile.calltrace.cycles-pp.__extent_writepage.extent_write_cache_pages.extent_writepages.do_writepages.filemap_fdatawrite_wbc
>       0.00            +0.7        0.70 ± 16%  perf-profile.calltrace.cycles-pp.truncate_pagecache.btrfs_truncate_free_space_cache.cache_save_setup.btrfs_start_dirty_block_groups.btrfs_commit_transaction
>       0.00            +0.7        0.70 ± 16%  perf-profile.calltrace.cycles-pp.truncate_inode_pages_range.truncate_pagecache.btrfs_truncate_free_space_cache.cache_save_setup.btrfs_start_dirty_block_groups
>       0.00            +0.8        0.76 ±  8%  perf-profile.calltrace.cycles-pp.perf_adjust_freq_unthr_context.perf_event_task_tick.scheduler_tick.update_process_times.tick_sched_handle
>       0.00            +0.8        0.77 ± 17%  perf-profile.calltrace.cycles-pp.btrfs_truncate_free_space_cache.cache_save_setup.btrfs_start_dirty_block_groups.btrfs_commit_transaction.flush_space
>       1.01 ±  8%      +0.8        1.79 ±  9%  perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
>       0.00            +0.8        0.78 ±  9%  perf-profile.calltrace.cycles-pp.perf_event_task_tick.scheduler_tick.update_process_times.tick_sched_handle.tick_sched_timer
>       0.00            +0.8        0.82 ± 14%  perf-profile.calltrace.cycles-pp.btrfs_write_dirty_block_groups.commit_cowonly_roots.btrfs_commit_transaction.flush_space.btrfs_async_reclaim_metadata_space
>       0.00            +0.8        0.82 ± 13%  perf-profile.calltrace.cycles-pp.extent_writepages.do_writepages.filemap_fdatawrite_wbc.__filemap_fdatawrite_range.btrfs_fdatawrite_range
>       0.00            +0.8        0.82 ± 13%  perf-profile.calltrace.cycles-pp.extent_write_cache_pages.extent_writepages.do_writepages.filemap_fdatawrite_wbc.__filemap_fdatawrite_range
>       0.00            +0.8        0.82 ± 13%  perf-profile.calltrace.cycles-pp.__filemap_fdatawrite_range.btrfs_fdatawrite_range.__btrfs_write_out_cache.btrfs_write_out_cache.btrfs_start_dirty_block_groups
>       0.00            +0.8        0.82 ± 13%  perf-profile.calltrace.cycles-pp.filemap_fdatawrite_wbc.__filemap_fdatawrite_range.btrfs_fdatawrite_range.__btrfs_write_out_cache.btrfs_write_out_cache
>       0.00            +0.8        0.82 ± 13%  perf-profile.calltrace.cycles-pp.do_writepages.filemap_fdatawrite_wbc.__filemap_fdatawrite_range.btrfs_fdatawrite_range.__btrfs_write_out_cache
>       0.00            +0.8        0.82 ± 13%  perf-profile.calltrace.cycles-pp.btrfs_fdatawrite_range.__btrfs_write_out_cache.btrfs_write_out_cache.btrfs_start_dirty_block_groups.btrfs_commit_transaction
>       0.00            +0.8        0.82 ±  5%  perf-profile.calltrace.cycles-pp.poll_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
>       0.00            +0.8        0.83 ± 15%  perf-profile.calltrace.cycles-pp.submit_eb_page.btree_write_cache_pages.do_writepages.filemap_fdatawrite_wbc.__filemap_fdatawrite_range
>       0.00            +0.8        0.84 ± 16%  perf-profile.calltrace.cycles-pp.do_writepages.filemap_fdatawrite_wbc.__filemap_fdatawrite_range.btrfs_write_marked_extents.btrfs_write_and_wait_transaction
>       0.00            +0.8        0.84 ± 16%  perf-profile.calltrace.cycles-pp.btree_write_cache_pages.do_writepages.filemap_fdatawrite_wbc.__filemap_fdatawrite_range.btrfs_write_marked_extents
>       0.00            +0.8        0.84 ± 16%  perf-profile.calltrace.cycles-pp.__filemap_fdatawrite_range.btrfs_write_marked_extents.btrfs_write_and_wait_transaction.btrfs_commit_transaction.flush_space
>       0.00            +0.8        0.84 ± 16%  perf-profile.calltrace.cycles-pp.filemap_fdatawrite_wbc.__filemap_fdatawrite_range.btrfs_write_marked_extents.btrfs_write_and_wait_transaction.btrfs_commit_transaction
>       0.19 ±142%      +0.8        1.04 ± 14%  perf-profile.calltrace.cycles-pp.cache_save_setup.btrfs_start_dirty_block_groups.btrfs_commit_transaction.flush_space.btrfs_async_reclaim_metadata_space
>       0.00            +0.8        0.85 ± 16%  perf-profile.calltrace.cycles-pp.btrfs_write_marked_extents.btrfs_write_and_wait_transaction.btrfs_commit_transaction.flush_space.btrfs_async_reclaim_metadata_space
>       0.00            +0.9        0.90 ± 16%  perf-profile.calltrace.cycles-pp.btrfs_write_and_wait_transaction.btrfs_commit_transaction.flush_space.btrfs_async_reclaim_metadata_space.process_one_work
>       1.28 ±  8%      +1.0        2.27 ±  8%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
>       1.29 ±  8%      +1.0        2.30 ±  8%  perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
>       0.79 ± 46%      +1.2        1.98 ± 15%  perf-profile.calltrace.cycles-pp.__btrfs_write_out_cache.btrfs_write_out_cache.btrfs_start_dirty_block_groups.btrfs_commit_transaction.flush_space
>       0.80 ± 46%      +1.2        2.01 ± 15%  perf-profile.calltrace.cycles-pp.btrfs_write_out_cache.btrfs_start_dirty_block_groups.btrfs_commit_transaction.flush_space.btrfs_async_reclaim_metadata_space
>       0.00            +1.2        1.23 ± 14%  perf-profile.calltrace.cycles-pp.commit_cowonly_roots.btrfs_commit_transaction.flush_space.btrfs_async_reclaim_metadata_space.process_one_work
>       1.72 ± 11%      +1.2        2.96 ±  8%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
>       1.88 ± 11%      +1.4        3.24 ±  7%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
>       1.09 ± 37%      +1.5        2.60 ± 25%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
>       1.44 ± 23%      +2.0        3.45 ± 14%  perf-profile.calltrace.cycles-pp.btrfs_start_dirty_block_groups.btrfs_commit_transaction.flush_space.btrfs_async_reclaim_metadata_space.process_one_work
>       1.82 ± 23%      +4.0        5.82 ± 13%  perf-profile.calltrace.cycles-pp.btrfs_commit_transaction.flush_space.btrfs_async_reclaim_metadata_space.process_one_work.worker_thread
>       2.06 ± 24%      +4.3        6.34 ± 14%  perf-profile.calltrace.cycles-pp.flush_space.btrfs_async_reclaim_metadata_space.process_one_work.worker_thread.kthread
>       2.06 ± 24%      +4.3        6.36 ± 14%  perf-profile.calltrace.cycles-pp.btrfs_async_reclaim_metadata_space.process_one_work.worker_thread.kthread.ret_from_fork
>       2.10 ± 24%      +4.4        6.46 ± 14%  perf-profile.calltrace.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
>       2.12 ± 23%      +4.4        6.53 ± 14%  perf-profile.calltrace.cycles-pp.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
>       2.15 ± 23%      +4.4        6.57 ± 14%  perf-profile.calltrace.cycles-pp.ret_from_fork_asm
>       2.15 ± 23%      +4.4        6.57 ± 14%  perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
>       2.15 ± 23%      +4.4        6.57 ± 14%  perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
>      27.62 ± 11%      +4.5       32.16        perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
>      27.45 ± 11%      +4.9       32.33        perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
>      27.79 ± 11%      +5.2       32.98        perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
>      28.51 ± 10%      +5.4       33.86        perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
>      28.04 ± 11%      +5.4       33.44        perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
>      28.05 ± 11%      +5.4       33.47        perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
>      28.05 ± 11%      +5.4       33.47        perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
>      69.96 ±  4%     -10.6       59.40        perf-profile.children.cycles-pp.sync_file_range
>      37.42 ±  4%      -5.2       32.18        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
>      26.79 ±  4%      -3.7       23.14        perf-profile.children.cycles-pp.do_syscall_64
>      19.14 ±  5%      -2.9       16.20        perf-profile.children.cycles-pp.syscall_return_via_sysret
>      16.85 ±  5%      -2.7       14.15 ±  2%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
>       9.00 ±  5%      -1.2        7.83 ±  2%  perf-profile.children.cycles-pp.__entry_text_start
>       7.31 ±  4%      -0.9        6.38 ±  3%  perf-profile.children.cycles-pp.__x64_sys_sync_file_range
>       1.91 ±  7%      -0.3        1.64 ±  4%  perf-profile.children.cycles-pp.file_fdatawait_range
>       1.42 ±  6%      -0.2        1.17 ±  4%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
>       1.65 ±  5%      -0.2        1.45 ±  4%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
>       1.49 ±  4%      -0.2        1.32 ±  6%  perf-profile.children.cycles-pp.file_check_and_advance_wb_err
>       0.62 ±  5%      -0.1        0.52 ±  8%  perf-profile.children.cycles-pp.stress_sync_file
>       0.30 ± 11%      -0.1        0.25 ±  6%  perf-profile.children.cycles-pp.stress_mwc32
>       0.08 ± 21%      +0.0        0.12 ±  6%  perf-profile.children.cycles-pp.irqtime_account_irq
>       0.10 ± 12%      +0.0        0.14 ± 13%  perf-profile.children.cycles-pp.btrfs_truncate
>       0.12 ± 11%      +0.0        0.16 ± 13%  perf-profile.children.cycles-pp.do_sys_ftruncate
>       0.06 ± 47%      +0.0        0.10 ± 14%  perf-profile.children.cycles-pp.run_rebalance_domains
>       0.14 ± 12%      +0.0        0.18 ± 15%  perf-profile.children.cycles-pp.hrtimer_wakeup
>       0.09 ± 16%      +0.0        0.14 ±  9%  perf-profile.children.cycles-pp.enqueue_entity
>       0.02 ±141%      +0.0        0.06 ±  7%  perf-profile.children.cycles-pp.rcu_sched_clock_irq
>       0.11 ± 13%      +0.0        0.16 ± 12%  perf-profile.children.cycles-pp.do_truncate
>       0.11 ± 12%      +0.0        0.16 ± 11%  perf-profile.children.cycles-pp.btrfs_setattr
>       0.02 ±141%      +0.0        0.07 ± 15%  perf-profile.children.cycles-pp.tick_sched_do_timer
>       0.11 ± 12%      +0.0        0.16 ± 12%  perf-profile.children.cycles-pp.notify_change
>       0.05 ± 47%      +0.0        0.10 ± 14%  perf-profile.children.cycles-pp.update_blocked_averages
>       0.07 ± 14%      +0.0        0.12 ± 18%  perf-profile.children.cycles-pp.sched_clock
>       0.02 ±142%      +0.1        0.07 ±  5%  perf-profile.children.cycles-pp.rcu_core
>       0.06 ± 11%      +0.1        0.11 ± 27%  perf-profile.children.cycles-pp.tick_nohz_next_event
>       0.01 ±223%      +0.1        0.06 ± 13%  perf-profile.children.cycles-pp.btrfs_get_extent
>       0.00            +0.1        0.05 ±  8%  perf-profile.children.cycles-pp.rcu_pending
>       0.02 ±141%      +0.1        0.07 ± 27%  perf-profile.children.cycles-pp.hrtimer_next_event_without
>       0.12 ± 11%      +0.1        0.18 ± 12%  perf-profile.children.cycles-pp.ftruncate64
>       0.08 ± 19%      +0.1        0.13 ± 19%  perf-profile.children.cycles-pp.start_transaction
>       0.08 ± 18%      +0.1        0.13 ± 16%  perf-profile.children.cycles-pp.sched_clock_cpu
>       0.08 ± 11%      +0.1        0.14 ± 13%  perf-profile.children.cycles-pp.perf_pmu_nop_void
>       0.10 ± 16%      +0.1        0.16 ± 14%  perf-profile.children.cycles-pp.enqueue_task_fair
>       0.03 ±100%      +0.1        0.09 ± 22%  perf-profile.children.cycles-pp.folio_add_lru
>       0.00            +0.1        0.06 ± 15%  perf-profile.children.cycles-pp.extent_buffer_write_end_io
>       0.07 ± 12%      +0.1        0.13 ± 17%  perf-profile.children.cycles-pp.btrfs_truncate_inode_items
>       0.08 ± 16%      +0.1        0.14 ± 17%  perf-profile.children.cycles-pp.native_sched_clock
>       0.11 ± 13%      +0.1        0.17 ± 14%  perf-profile.children.cycles-pp.activate_task
>       0.11 ± 30%      +0.1        0.17 ± 20%  perf-profile.children.cycles-pp.rebalance_domains
>       0.00            +0.1        0.06 ± 28%  perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
>       0.01 ±223%      +0.1        0.07 ± 15%  perf-profile.children.cycles-pp.__alloc_pages
>       0.05 ± 72%      +0.1        0.12 ± 13%  perf-profile.children.cycles-pp.update_sg_lb_stats
>       0.01 ±223%      +0.1        0.08 ± 30%  perf-profile.children.cycles-pp.btrfs_work_helper
>       0.05 ± 77%      +0.1        0.12 ± 29%  perf-profile.children.cycles-pp.ktime_get_update_offsets_now
>       0.02 ±142%      +0.1        0.09 ± 12%  perf-profile.children.cycles-pp.folio_alloc
>       0.08 ± 18%      +0.1        0.15 ± 12%  perf-profile.children.cycles-pp.dequeue_entity
>       0.05 ± 46%      +0.1        0.12 ± 26%  perf-profile.children.cycles-pp.arch_scale_freq_tick
>       0.00            +0.1        0.07 ± 28%  perf-profile.children.cycles-pp.lock_delalloc_pages
>       0.00            +0.1        0.07 ± 18%  perf-profile.children.cycles-pp.nvme_queue_rq
>       0.01 ±223%      +0.1        0.08 ± 29%  perf-profile.children.cycles-pp.__mod_node_page_state
>       0.00            +0.1        0.07 ± 18%  perf-profile.children.cycles-pp.__blk_mq_issue_directly
>       0.01 ±223%      +0.1        0.08 ± 17%  perf-profile.children.cycles-pp.__mod_lruvec_state
>       0.00            +0.1        0.07 ± 20%  perf-profile.children.cycles-pp.blk_mq_try_issue_directly
>       0.12 ± 11%      +0.1        0.19 ± 11%  perf-profile.children.cycles-pp.ttwu_do_activate
>       0.09 ± 11%      +0.1        0.17 ± 17%  perf-profile.children.cycles-pp.dequeue_task_fair
>       0.05 ± 74%      +0.1        0.12 ± 15%  perf-profile.children.cycles-pp.run_delalloc_nocow
>       0.00            +0.1        0.07 ± 28%  perf-profile.children.cycles-pp.folio_wait_bit_common
>       0.05 ± 74%      +0.1        0.12 ± 15%  perf-profile.children.cycles-pp.btrfs_run_delalloc_range
>       0.09 ± 29%      +0.1        0.16 ± 12%  perf-profile.children.cycles-pp.find_busiest_group
>       0.00            +0.1        0.08 ± 29%  perf-profile.children.cycles-pp.folio_wait_writeback
>       0.01 ±223%      +0.1        0.09 ± 13%  perf-profile.children.cycles-pp.try_release_extent_mapping
>       0.02 ±141%      +0.1        0.09 ± 34%  perf-profile.children.cycles-pp.btrfs_lookup_inode
>       0.01 ±223%      +0.1        0.08 ± 13%  perf-profile.children.cycles-pp.xas_load
>       0.00            +0.1        0.08 ± 20%  perf-profile.children.cycles-pp.nvme_prep_rq
>       0.06 ± 51%      +0.1        0.14 ± 15%  perf-profile.children.cycles-pp.alloc_extent_state
>       0.01 ±223%      +0.1        0.09 ± 33%  perf-profile.children.cycles-pp.clear_state_bit
>       0.00            +0.1        0.08 ± 16%  perf-profile.children.cycles-pp.__wake_up_common
>       0.00            +0.1        0.08 ± 16%  perf-profile.children.cycles-pp.check_extent_data_item
>       0.00            +0.1        0.08 ± 22%  perf-profile.children.cycles-pp.folio_account_dirtied
>       0.12 ±  8%      +0.1        0.20 ± 16%  perf-profile.children.cycles-pp.schedule_idle
>       0.07 ± 53%      +0.1        0.16 ± 13%  perf-profile.children.cycles-pp.update_sd_lb_stats
>       0.00            +0.1        0.08 ±  8%  perf-profile.children.cycles-pp.newidle_balance
>       0.03 ±100%      +0.1        0.11 ± 16%  perf-profile.children.cycles-pp.lock_extent
>       0.00            +0.1        0.08 ± 26%  perf-profile.children.cycles-pp.filemap_get_entry
>       0.12 ± 13%      +0.1        0.20 ± 10%  perf-profile.children.cycles-pp.read_tsc
>       0.03 ±101%      +0.1        0.12 ± 15%  perf-profile.children.cycles-pp.find_lock_delalloc_range
>       0.02 ±141%      +0.1        0.10 ± 17%  perf-profile.children.cycles-pp.percpu_counter_add_batch
>       0.14 ± 18%      +0.1        0.23 ± 11%  perf-profile.children.cycles-pp.lapic_next_deadline
>       0.05 ± 47%      +0.1        0.14 ± 18%  perf-profile.children.cycles-pp.__set_extent_bit
>       0.00            +0.1        0.09 ± 17%  perf-profile.children.cycles-pp.write_one_eb
>       0.00            +0.1        0.09 ± 31%  perf-profile.children.cycles-pp.lookup_inline_extent_backref
>       0.00            +0.1        0.09 ± 31%  perf-profile.children.cycles-pp.lookup_extent_backref
>       0.00            +0.1        0.09 ± 16%  perf-profile.children.cycles-pp.memmove_extent_buffer
>       0.01 ±223%      +0.1        0.10 ± 23%  perf-profile.children.cycles-pp.clear_page_erms
>       0.12 ± 30%      +0.1        0.20 ± 12%  perf-profile.children.cycles-pp.load_balance
>       0.00            +0.1        0.09 ± 15%  perf-profile.children.cycles-pp.folio_mark_accessed
>       0.04 ±104%      +0.1        0.13 ± 19%  perf-profile.children.cycles-pp.delete_from_page_cache_batch
>       0.09 ± 10%      +0.1        0.18 ±  8%  perf-profile.children.cycles-pp.pick_next_task_fair
>       0.00            +0.1        0.09 ± 18%  perf-profile.children.cycles-pp.btrfs_get_64
>       0.01 ±223%      +0.1        0.10 ± 20%  perf-profile.children.cycles-pp.flush_smp_call_function_queue
>       0.27 ±  9%      +0.1        0.37 ± 14%  perf-profile.children.cycles-pp.hrtimer_nanosleep
>       0.25 ±  8%      +0.1        0.34 ± 15%  perf-profile.children.cycles-pp.do_nanosleep
>       0.10 ± 14%      +0.1        0.19 ± 13%  perf-profile.children.cycles-pp.kmem_cache_alloc
>       0.12 ± 14%      +0.1        0.22 ± 11%  perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
>       0.08 ± 35%      +0.1        0.18 ± 25%  perf-profile.children.cycles-pp.alloc_reserved_file_extent
>       0.06 ± 47%      +0.1        0.16 ± 12%  perf-profile.children.cycles-pp.__folio_mark_dirty
>       0.00            +0.1        0.10 ± 21%  perf-profile.children.cycles-pp.__flush_smp_call_function_queue
>       0.01 ±223%      +0.1        0.11 ± 33%  perf-profile.children.cycles-pp.xas_store
>       0.02 ±142%      +0.1        0.12 ± 22%  perf-profile.children.cycles-pp.btrfs_wait_ordered_range
>       0.04 ± 75%      +0.1        0.15 ± 15%  perf-profile.children.cycles-pp.folio_batch_move_lru
>       0.28 ±  9%      +0.1        0.38 ± 13%  perf-profile.children.cycles-pp.common_nsleep
>       0.03 ±101%      +0.1        0.14 ±  9%  perf-profile.children.cycles-pp.submit_one_bio
>       0.00            +0.1        0.10 ± 18%  perf-profile.children.cycles-pp.__memmove
>       0.07 ± 18%      +0.1        0.18 ± 18%  perf-profile.children.cycles-pp.__folio_batch_release
>       0.00            +0.1        0.11 ± 24%  perf-profile.children.cycles-pp.check_block_group_item
>       0.12 ± 24%      +0.1        0.22 ± 11%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
>       0.17 ± 14%      +0.1        0.28 ± 10%  perf-profile.children.cycles-pp.__intel_pmu_enable_all
>       0.08 ± 16%      +0.1        0.19 ± 17%  perf-profile.children.cycles-pp.release_pages
>       0.00            +0.1        0.11 ± 36%  perf-profile.children.cycles-pp.btrfs_alloc_tree_block
>       0.02 ±141%      +0.1        0.13 ± 21%  perf-profile.children.cycles-pp.filemap_fdatawait_range
>       0.07 ± 52%      +0.1        0.18 ± 33%  perf-profile.children.cycles-pp.__filemap_add_folio
>       0.00            +0.1        0.11 ± 23%  perf-profile.children.cycles-pp.write_all_supers
>       0.16 ± 13%      +0.1        0.28 ±  7%  perf-profile.children.cycles-pp.try_to_wake_up
>       0.03 ±102%      +0.1        0.14 ± 29%  perf-profile.children.cycles-pp.btrfs_update_inode_item
>       0.06 ± 51%      +0.1        0.17 ±  9%  perf-profile.children.cycles-pp.__mod_lruvec_page_state
>       0.07 ± 17%      +0.1        0.19 ± 16%  perf-profile.children.cycles-pp.folio_clear_dirty_for_io
>       0.29 ±  7%      +0.1        0.41 ± 13%  perf-profile.children.cycles-pp.__x64_sys_clock_nanosleep
>       0.18 ± 13%      +0.1        0.31 ±  9%  perf-profile.children.cycles-pp.clockevents_program_event
>       0.21 ± 20%      +0.1        0.33 ± 10%  perf-profile.children.cycles-pp.intel_idle_irq
>       0.08 ± 29%      +0.1        0.21 ±  6%  perf-profile.children.cycles-pp.submit_extent_page
>       0.00            +0.1        0.14 ± 11%  perf-profile.children.cycles-pp.read_extent_buffer
>       0.03 ±100%      +0.1        0.17 ± 22%  perf-profile.children.cycles-pp.__btrfs_wait_cache_io
>       0.03 ±102%      +0.1        0.17 ± 10%  perf-profile.children.cycles-pp.blk_mq_submit_bio
>       0.11 ± 28%      +0.1        0.26 ± 17%  perf-profile.children.cycles-pp.writepage_delalloc
>       0.03 ±102%      +0.1        0.18 ± 10%  perf-profile.children.cycles-pp.submit_bio_noacct_nocheck
>       0.25 ± 22%      +0.1        0.40 ± 10%  perf-profile.children.cycles-pp.__do_softirq
>       0.09 ± 22%      +0.1        0.24 ±  9%  perf-profile.children.cycles-pp.filemap_dirty_folio
>       0.10 ± 24%      +0.2        0.26 ± 13%  perf-profile.children.cycles-pp.btrfs_dirty_pages
>       0.12 ± 22%      +0.2        0.28 ± 15%  perf-profile.children.cycles-pp.__clear_extent_bit
>       0.12 ± 30%      +0.2        0.28 ± 25%  perf-profile.children.cycles-pp.filemap_add_folio
>       0.20 ± 31%      +0.2        0.36 ± 15%  perf-profile.children.cycles-pp.ktime_get
>       0.10 ± 24%      +0.2        0.26 ±  8%  perf-profile.children.cycles-pp.__folio_end_writeback
>       0.08 ± 22%      +0.2        0.24 ± 16%  perf-profile.children.cycles-pp.__folio_start_writeback
>       0.31 ± 21%      +0.2        0.48 ±  7%  perf-profile.children.cycles-pp.__irq_exit_rcu
>       0.02 ±142%      +0.2        0.20 ± 20%  perf-profile.children.cycles-pp.__write_extent_buffer
>       0.17 ± 10%      +0.2        0.35 ±  5%  perf-profile.children.cycles-pp._raw_spin_lock
>       0.10 ± 25%      +0.2        0.28 ± 17%  perf-profile.children.cycles-pp.btrfs_set_range_writeback
>       0.22 ±  8%      +0.2        0.42 ± 10%  perf-profile.children.cycles-pp.schedule
>       0.00            +0.2        0.22 ± 20%  perf-profile.children.cycles-pp.btrfs_get_32
>       0.01 ±223%      +0.2        0.24 ± 24%  perf-profile.children.cycles-pp.btrfs_cow_block
>       0.01 ±223%      +0.2        0.24 ± 24%  perf-profile.children.cycles-pp.__btrfs_cow_block
>       0.15 ± 23%      +0.2        0.38 ± 16%  perf-profile.children.cycles-pp.crc_pcl
>       0.13 ± 23%      +0.2        0.36 ±  7%  perf-profile.children.cycles-pp.folio_end_writeback
>       0.19 ± 27%      +0.2        0.42 ± 15%  perf-profile.children.cycles-pp.run_delayed_data_ref
>       0.26 ± 10%      +0.2        0.51 ±  8%  perf-profile.children.cycles-pp.menu_select
>       0.57 ±  6%      +0.2        0.82 ±  9%  perf-profile.children.cycles-pp.clock_nanosleep
>       0.16 ± 26%      +0.2        0.41 ± 18%  perf-profile.children.cycles-pp.btrfs_search_slot
>       0.18 ± 23%      +0.2        0.43 ± 20%  perf-profile.children.cycles-pp.btrfs_invalidate_folio
>       0.21 ± 30%      +0.2        0.46 ±  8%  perf-profile.children.cycles-pp.end_bio_extent_writepage
>       0.21 ± 30%      +0.3        0.46 ±  8%  perf-profile.children.cycles-pp.__btrfs_bio_end_io
>       0.19 ± 21%      +0.3        0.44 ± 20%  perf-profile.children.cycles-pp.truncate_cleanup_folio
>       0.04 ± 71%      +0.3        0.29 ± 20%  perf-profile.children.cycles-pp.btrfs_set_token_32
>       0.17 ± 24%      +0.3        0.43 ± 17%  perf-profile.children.cycles-pp.crc32c_pcl_intel_update
>       0.20 ± 24%      +0.3        0.48 ± 15%  perf-profile.children.cycles-pp.crc32c
>       0.00            +0.3        0.28 ± 15%  perf-profile.children.cycles-pp.alloc_reserved_tree_block
>       0.33 ±  6%      +0.3        0.61 ± 11%  perf-profile.children.cycles-pp.__schedule
>       0.03 ±106%      +0.3        0.33 ± 13%  perf-profile.children.cycles-pp.btrfs_get_token_32
>       0.22 ± 25%      +0.3        0.52 ± 15%  perf-profile.children.cycles-pp.io_ctl_set_crc
>       0.24 ± 28%      +0.3        0.58 ± 19%  perf-profile.children.cycles-pp.io_ctl_prepare_pages
>       0.22 ± 29%      +0.3        0.56 ± 19%  perf-profile.children.cycles-pp.__filemap_get_folio
>       0.22 ± 28%      +0.3        0.57 ±  9%  perf-profile.children.cycles-pp.blk_mq_end_request_batch
>       0.01 ±223%      +0.4        0.36 ± 16%  perf-profile.children.cycles-pp.check_leaf_item
>       0.00            +0.4        0.36 ± 24%  perf-profile.children.cycles-pp.run_delayed_tree_ref
>       0.22 ± 28%      +0.4        0.58 ± 19%  perf-profile.children.cycles-pp.pagecache_get_page
>       0.26 ± 20%      +0.4        0.62 ± 10%  perf-profile.children.cycles-pp.__extent_writepage_io
>       0.12 ± 25%      +0.4        0.49 ± 14%  perf-profile.children.cycles-pp.btrfs_insert_empty_items
>       0.09 ± 28%      +0.4        0.46 ± 15%  perf-profile.children.cycles-pp.setup_items_for_insert
>       0.08 ± 19%      +0.4        0.46 ± 17%  perf-profile.children.cycles-pp.btrfs_del_items
>       0.50 ±  9%      +0.4        0.88 ±  8%  perf-profile.children.cycles-pp.perf_adjust_freq_unthr_context
>       0.51 ±  9%      +0.4        0.89 ±  9%  perf-profile.children.cycles-pp.perf_event_task_tick
>       0.23 ± 30%      +0.4        0.62 ±  9%  perf-profile.children.cycles-pp.__handle_irq_event_percpu
>       0.23 ± 30%      +0.4        0.62 ±  9%  perf-profile.children.cycles-pp.nvme_irq
>       0.23 ± 30%      +0.4        0.63 ± 10%  perf-profile.children.cycles-pp.handle_irq_event
>       0.23 ± 30%      +0.4        0.64 ± 10%  perf-profile.children.cycles-pp.handle_edge_irq
>       0.23 ± 30%      +0.4        0.64 ± 10%  perf-profile.children.cycles-pp.__common_interrupt
>       0.24 ± 29%      +0.4        0.66 ± 11%  perf-profile.children.cycles-pp.common_interrupt
>       0.24 ± 29%      +0.4        0.66 ± 10%  perf-profile.children.cycles-pp.asm_common_interrupt
>       0.32 ± 22%      +0.4        0.75 ± 17%  perf-profile.children.cycles-pp.truncate_pagecache
>       0.31 ± 22%      +0.4        0.75 ± 17%  perf-profile.children.cycles-pp.truncate_inode_pages_range
>       0.12 ± 23%      +0.5        0.59 ± 17%  perf-profile.children.cycles-pp.__btrfs_free_extent
>       0.34 ± 22%      +0.5        0.82 ± 16%  perf-profile.children.cycles-pp.btrfs_truncate_free_space_cache
>       0.68 ±  9%      +0.5        1.18 ±  8%  perf-profile.children.cycles-pp.scheduler_tick
>       0.39 ± 25%      +0.5        0.91 ± 11%  perf-profile.children.cycles-pp.__extent_writepage
>       0.78 ±  9%      +0.5        1.33 ±  7%  perf-profile.children.cycles-pp.update_process_times
>       0.79 ±  8%      +0.6        1.34 ±  8%  perf-profile.children.cycles-pp.tick_sched_handle
>       0.26 ± 31%      +0.6        0.82 ± 14%  perf-profile.children.cycles-pp.btrfs_write_dirty_block_groups
>       0.51 ± 23%      +0.6        1.11 ± 14%  perf-profile.children.cycles-pp.cache_save_setup
>       0.87 ±  7%      +0.6        1.50 ±  7%  perf-profile.children.cycles-pp.tick_sched_timer
>       0.02 ±144%      +0.6        0.65 ± 17%  perf-profile.children.cycles-pp.__btrfs_check_leaf
>       0.02 ±144%      +0.6        0.65 ± 17%  perf-profile.children.cycles-pp.btrfs_check_leaf
>       0.47 ± 25%      +0.6        1.11 ± 13%  perf-profile.children.cycles-pp.extent_writepages
>       0.47 ± 25%      +0.6        1.11 ± 13%  perf-profile.children.cycles-pp.extent_write_cache_pages
>       0.47 ± 24%      +0.6        1.12 ± 13%  perf-profile.children.cycles-pp.btrfs_fdatawrite_range
>       0.03 ±105%      +0.7        0.71 ± 16%  perf-profile.children.cycles-pp.btree_csum_one_bio
>       0.15 ± 38%      +0.7        0.86 ±  9%  perf-profile.children.cycles-pp.poll_idle
>       0.10 ± 30%      +0.8        0.87 ± 15%  perf-profile.children.cycles-pp.btrfs_submit_chunk
>       0.10 ± 30%      +0.8        0.87 ± 15%  perf-profile.children.cycles-pp.btrfs_submit_bio
>       0.04 ± 74%      +0.8        0.83 ± 15%  perf-profile.children.cycles-pp.submit_eb_page
>       0.05 ± 73%      +0.8        0.84 ± 16%  perf-profile.children.cycles-pp.btree_write_cache_pages
>       0.05 ± 73%      +0.8        0.85 ± 16%  perf-profile.children.cycles-pp.btrfs_write_marked_extents
>       1.26 ±  8%      +0.8        2.09 ±  8%  perf-profile.children.cycles-pp.__hrtimer_run_queues
>       0.24 ± 27%      +0.8        1.09 ± 16%  perf-profile.children.cycles-pp.btrfs_run_delayed_refs_for_head
>       0.06 ± 49%      +0.8        0.90 ± 16%  perf-profile.children.cycles-pp.btrfs_write_and_wait_transaction
>       0.28 ± 27%      +0.9        1.14 ± 16%  perf-profile.children.cycles-pp.__btrfs_run_delayed_refs
>       0.28 ± 27%      +0.9        1.14 ± 17%  perf-profile.children.cycles-pp.btrfs_run_delayed_refs
>       0.30 ± 28%      +0.9        1.23 ± 14%  perf-profile.children.cycles-pp.commit_cowonly_roots
>       1.55 ±  9%      +1.1        2.62 ±  8%  perf-profile.children.cycles-pp.hrtimer_interrupt
>       1.57 ±  9%      +1.1        2.64 ±  8%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
>       2.06 ± 11%      +1.3        3.38 ±  8%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
>       1.14 ± 14%      +1.4        2.50 ±  9%  perf-profile.children.cycles-pp.__filemap_fdatawrite_range
>       0.75 ± 22%      +1.4        2.13 ± 12%  perf-profile.children.cycles-pp.filemap_fdatawrite_wbc
>       0.52 ± 24%      +1.4        1.96 ± 14%  perf-profile.children.cycles-pp.do_writepages
>       2.33 ± 11%      +1.5        3.78 ±  7%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
>       1.11 ± 37%      +1.5        2.62 ± 25%  perf-profile.children.cycles-pp.intel_idle
>       1.09 ± 24%      +1.5        2.63 ± 14%  perf-profile.children.cycles-pp.__btrfs_write_out_cache
>       1.09 ± 24%      +1.5        2.64 ± 14%  perf-profile.children.cycles-pp.btrfs_write_out_cache
>       1.44 ± 23%      +2.0        3.45 ± 14%  perf-profile.children.cycles-pp.btrfs_start_dirty_block_groups
>       1.82 ± 23%      +4.0        5.82 ± 13%  perf-profile.children.cycles-pp.btrfs_commit_transaction
>       2.06 ± 24%      +4.3        6.35 ± 14%  perf-profile.children.cycles-pp.flush_space
>       2.06 ± 24%      +4.3        6.36 ± 14%  perf-profile.children.cycles-pp.btrfs_async_reclaim_metadata_space
>       2.10 ± 24%      +4.4        6.46 ± 14%  perf-profile.children.cycles-pp.process_one_work
>       2.12 ± 23%      +4.4        6.53 ± 14%  perf-profile.children.cycles-pp.worker_thread
>       2.16 ± 23%      +4.4        6.57 ± 14%  perf-profile.children.cycles-pp.ret_from_fork_asm
>       2.16 ± 23%      +4.4        6.57 ± 14%  perf-profile.children.cycles-pp.ret_from_fork
>       2.15 ± 23%      +4.4        6.57 ± 14%  perf-profile.children.cycles-pp.kthread
>      27.90 ± 11%      +4.8       32.69        perf-profile.children.cycles-pp.cpuidle_enter_state
>      27.90 ± 11%      +4.8       32.70        perf-profile.children.cycles-pp.cpuidle_enter
>      28.26 ± 11%      +5.1       33.39        perf-profile.children.cycles-pp.cpuidle_idle_call
>      28.51 ± 10%      +5.4       33.86        perf-profile.children.cycles-pp.do_idle
>      28.51 ± 10%      +5.4       33.86        perf-profile.children.cycles-pp.secondary_startup_64_no_verify
>      28.51 ± 10%      +5.4       33.86        perf-profile.children.cycles-pp.cpu_startup_entry
>      28.05 ± 11%      +5.4       33.47        perf-profile.children.cycles-pp.start_secondary
>      19.11 ±  5%      -2.9       16.17        perf-profile.self.cycles-pp.syscall_return_via_sysret
>      16.30 ±  5%      -2.6       13.68 ±  2%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
>      10.84 ±  5%      -1.6        9.21        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
>       7.86 ±  5%      -1.1        6.80 ±  2%  perf-profile.self.cycles-pp.__entry_text_start
>       2.12 ±  6%      -0.3        1.82 ±  6%  perf-profile.self.cycles-pp.sync_file_range
>       1.23 ±  6%      -0.2        1.02 ±  4%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
>       1.42 ±  4%      -0.2        1.25 ±  5%  perf-profile.self.cycles-pp.file_check_and_advance_wb_err
>       1.06 ±  6%      -0.1        0.91 ±  5%  perf-profile.self.cycles-pp.__x64_sys_sync_file_range
>       1.05 ±  5%      -0.1        0.92 ±  6%  perf-profile.self.cycles-pp.__filemap_fdatawait_range
>       0.65 ±  7%      -0.1        0.56 ±  8%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
>       0.08 ± 10%      +0.1        0.13 ± 16%  perf-profile.self.cycles-pp.perf_pmu_nop_void
>       0.01 ±223%      +0.1        0.07 ± 14%  perf-profile.self.cycles-pp.__hrtimer_run_queues
>       0.01 ±223%      +0.1        0.07 ± 18%  perf-profile.self.cycles-pp.__schedule
>       0.00            +0.1        0.06 ± 15%  perf-profile.self.cycles-pp.__mod_lruvec_page_state
>       0.02 ±142%      +0.1        0.08 ± 15%  perf-profile.self.cycles-pp.update_sg_lb_stats
>       0.08 ± 17%      +0.1        0.14 ± 19%  perf-profile.self.cycles-pp.native_sched_clock
>       0.02 ±141%      +0.1        0.08 ± 20%  perf-profile.self.cycles-pp.folio_clear_dirty_for_io
>       0.05 ± 46%      +0.1        0.12 ± 26%  perf-profile.self.cycles-pp.arch_scale_freq_tick
>       0.01 ±223%      +0.1        0.08 ± 21%  perf-profile.self.cycles-pp.cpuidle_idle_call
>       0.00            +0.1        0.08 ± 27%  perf-profile.self.cycles-pp.__mod_node_page_state
>       0.02 ±141%      +0.1        0.09 ± 24%  perf-profile.self.cycles-pp.crc32c
>       0.00            +0.1        0.08 ± 23%  perf-profile.self.cycles-pp.__folio_end_writeback
>       0.06 ± 21%      +0.1        0.14 ± 11%  perf-profile.self.cycles-pp.kmem_cache_alloc
>       0.00            +0.1        0.08 ± 22%  perf-profile.self.cycles-pp.btrfs_del_items
>       0.01 ±223%      +0.1        0.09 ± 22%  perf-profile.self.cycles-pp.release_pages
>       0.01 ±223%      +0.1        0.09 ± 14%  perf-profile.self.cycles-pp.percpu_counter_add_batch
>       0.14 ± 18%      +0.1        0.23 ± 11%  perf-profile.self.cycles-pp.lapic_next_deadline
>       0.11 ± 16%      +0.1        0.20 ± 11%  perf-profile.self.cycles-pp.read_tsc
>       0.00            +0.1        0.09 ± 28%  perf-profile.self.cycles-pp.__write_extent_buffer
>       0.00            +0.1        0.09 ± 24%  perf-profile.self.cycles-pp.btrfs_get_64
>       0.00            +0.1        0.09 ± 31%  perf-profile.self.cycles-pp.__btrfs_check_leaf
>       0.00            +0.1        0.09 ± 20%  perf-profile.self.cycles-pp.clear_page_erms
>       0.00            +0.1        0.10 ± 30%  perf-profile.self.cycles-pp.setup_items_for_insert
>       0.00            +0.1        0.10 ± 20%  perf-profile.self.cycles-pp.__memmove
>       0.11 ± 26%      +0.1        0.22 ± 12%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
>       0.17 ± 14%      +0.1        0.28 ± 10%  perf-profile.self.cycles-pp.__intel_pmu_enable_all
>       0.12 ± 12%      +0.1        0.24 ±  9%  perf-profile.self.cycles-pp.menu_select
>       0.00            +0.1        0.13 ± 13%  perf-profile.self.cycles-pp.read_extent_buffer
>       0.16 ± 10%      +0.2        0.33 ±  4%  perf-profile.self.cycles-pp._raw_spin_lock
>       0.22 ± 18%      +0.2        0.42 ±  3%  perf-profile.self.cycles-pp.cpuidle_enter_state
>       0.28 ± 10%      +0.2        0.48 ± 10%  perf-profile.self.cycles-pp.perf_adjust_freq_unthr_context
>       0.00            +0.2        0.21 ± 21%  perf-profile.self.cycles-pp.btrfs_get_32
>       0.15 ± 25%      +0.2        0.37 ± 18%  perf-profile.self.cycles-pp.crc_pcl
>       0.02 ±142%      +0.3        0.27 ± 20%  perf-profile.self.cycles-pp.btrfs_set_token_32
>       0.02 ±146%      +0.3        0.30 ± 12%  perf-profile.self.cycles-pp.btrfs_get_token_32
>       0.14 ± 38%      +0.7        0.84 ±  8%  perf-profile.self.cycles-pp.poll_idle
>       1.11 ± 37%      +1.5        2.62 ± 25%  perf-profile.self.cycles-pp.intel_idle
> 
> 
> ***************************************************************************************************
> lkp-icl-2sp9: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> =========================================================================================
> bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase:
>   4k/gcc-12/performance/1HDD/btrfs/ftruncate/x86_64-rhel-8.3/100%/debian-11.1-x86_64-20220510.cgz/300s/write/lkp-icl-2sp9/128G/fio-basic
> 
> commit: 
>   d80879b7d6 ("btrfs: stop doing excessive space reservation for csum deletion")
>   6c9131ed0d ("btrfs: always reserve space for delayed refs when starting transaction")
> 
> d80879b7d6aff432 6c9131ed0d644324adeeaccd2fe 
> ---------------- --------------------------- 
>          %stddev     %change         %stddev
>              \          |                \  
>      15.04 ± 12%     -40.5%       8.94 ± 10%  iostat.cpu.idle
>      84.23 ±  2%      +7.4%      90.44        iostat.cpu.system
>      13923 ±  7%     +31.9%      18370 ±  3%  sched_debug.cpu.avg_idle.min
>       4.15 ± 22%     +30.7%       5.42 ±  6%  sched_debug.cpu.clock.stddev
>      15545 ± 26%    +242.2%      53197 ±  9%  meminfo.Active(anon)
>     740596 ±  2%     -15.8%     623416 ±  3%  meminfo.Dirty
>      58563 ±  8%     +64.9%      96555 ±  5%  meminfo.Shmem
>      13.16 ± 15%      -6.0        7.14 ± 13%  mpstat.cpu.all.idle%
>       0.13 ± 27%      -0.1        0.07 ± 12%  mpstat.cpu.all.iowait%
>       0.02            -0.0        0.02 ±  5%  mpstat.cpu.all.soft%
>       0.58 ±  3%      -0.1        0.53 ±  2%  mpstat.cpu.all.usr%
>      13.52 ± 17%      -6.0        7.56 ± 12%  turbostat.C1%
>      13.02 ± 17%     -44.1%       7.28 ± 12%  turbostat.CPU%c1
>      60.83 ±  2%      +9.6%      66.67 ±  3%  turbostat.PkgTmp
>     337.21            +1.3%     341.42        turbostat.PkgWatt
>      14.50 ± 11%     -42.5%       8.33 ± 13%  vmstat.cpu.id
>      31509 ±  7%     +21.3%      38218 ±  2%  vmstat.io.bo
>     217220 ± 17%     -44.1%     121500 ± 17%  vmstat.system.cs
>     170878 ± 11%     -27.9%     123239 ±  8%  vmstat.system.in
>     370058           -15.7%     312135 ±  3%  numa-meminfo.node0.Dirty
>     932599 ± 38%    +136.7%    2207627 ± 53%  numa-meminfo.node0.FilePages
>     261078 ±138%    +486.3%    1530767 ± 78%  numa-meminfo.node0.Unevictable
>       6689 ± 92%    +329.1%      28703 ± 57%  numa-meminfo.node1.Active(anon)
>      26989 ±115%    +110.9%      56922 ± 55%  numa-meminfo.node1.AnonHugePages
>     369834 ±  2%     -15.7%     311910 ±  3%  numa-meminfo.node1.Dirty
>      92513           -15.6%      78039 ±  3%  numa-vmstat.node0.nr_dirty
>     233224 ± 38%    +136.8%     552294 ± 53%  numa-vmstat.node0.nr_file_pages
>      65269 ±138%    +486.3%     382691 ± 78%  numa-vmstat.node0.nr_unevictable
>     166993 ±  9%     +40.9%     235292 ±  4%  numa-vmstat.node0.nr_written
>      65269 ±138%    +486.3%     382691 ± 78%  numa-vmstat.node0.nr_zone_unevictable
>      92521           -15.6%      78049 ±  3%  numa-vmstat.node0.nr_zone_write_pending
>       1775 ± 90%    +307.7%       7236 ± 54%  numa-vmstat.node1.nr_active_anon
>      92461 ±  2%     -15.6%      78020 ±  3%  numa-vmstat.node1.nr_dirty
>     167011 ±  9%     +40.6%     234771 ±  4%  numa-vmstat.node1.nr_written
>       1775 ± 90%    +307.7%       7236 ± 54%  numa-vmstat.node1.nr_zone_active_anon
>      92467 ±  2%     -15.6%      78027 ±  3%  numa-vmstat.node1.nr_zone_write_pending
>       0.94 ± 18%      -0.5        0.46 ± 26%  fio.latency_100us%
>       0.15 ± 16%      -0.0        0.10 ± 12%  fio.latency_10us%
>      93.34            +2.9       96.23        fio.latency_250us%
>      85.21           +12.6%      95.98        fio.time.elapsed_time
>      85.21           +12.6%      95.98        fio.time.elapsed_time.max
>      47779 ±  2%     +30.7%      62438 ±  2%  fio.time.involuntary_context_switches
>       5565 ±  2%      +6.9%       5947        fio.time.percent_of_cpu_this_job_got
>       4720           +20.5%       5686        fio.time.system_time
>       1544           -11.2%       1370        fio.write_bw_MBps
>     391850 ±  4%     -19.7%     314709 ±  4%  fio.write_clat_99%_us
>     160205           +13.1%     181223        fio.write_clat_mean_us
>     395291           -11.2%     350824        fio.write_iops
>       4058 ± 24%    +224.3%      13163 ± 10%  proc-vmstat.nr_active_anon
>     185226 ±  2%     -15.8%     156031 ±  3%  proc-vmstat.nr_dirty
>      14662 ±  8%     +63.9%      24028 ±  5%  proc-vmstat.nr_shmem
>     335451 ±  9%     +39.2%     466844 ±  4%  proc-vmstat.nr_written
>       4058 ± 24%    +224.3%      13163 ± 10%  proc-vmstat.nr_zone_active_anon
>     185238 ±  2%     -15.8%     156047 ±  3%  proc-vmstat.nr_zone_write_pending
>    1731801            +0.9%    1747722        proc-vmstat.numa_hit
>    1665401            +1.0%    1681395        proc-vmstat.numa_local
>     393334            +5.6%     415532        proc-vmstat.pgfault
>     349710            +5.5%     368993        proc-vmstat.pgfree
>    2681920 ±  9%     +39.2%    3733194 ±  4%  proc-vmstat.pgpgout
>      13786 ±  2%      +8.0%      14893 ±  2%  proc-vmstat.pgreuse
>     816128 ±  5%      +9.2%     891136 ±  3%  proc-vmstat.unevictable_pgs_scanned
>       7.33 ±118%      -5.5        1.85 ±223%  perf-profile.calltrace.cycles-pp._compound_head.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
>       3.86 ± 73%      -1.1        2.78 ±141%  perf-profile.calltrace.cycles-pp.put_files_struct.do_exit.do_group_exit.get_signal.arch_do_signal_or_restart
>       3.86 ± 73%      -1.1        2.78 ±141%  perf-profile.calltrace.cycles-pp.filp_close.put_files_struct.do_exit.do_group_exit.get_signal
>       3.86 ± 73%      +0.5        4.36 ±149%  perf-profile.calltrace.cycles-pp.__fput.task_work_run.do_exit.do_group_exit.get_signal
>       3.86 ± 73%      +0.5        4.36 ±149%  perf-profile.calltrace.cycles-pp.task_work_run.do_exit.do_group_exit.get_signal.arch_do_signal_or_restart
>       3.86 ± 73%      +0.5        4.36 ±149%  perf-profile.calltrace.cycles-pp.perf_release.__fput.task_work_run.do_exit.do_group_exit
>       3.86 ± 73%      +0.5        4.36 ±149%  perf-profile.calltrace.cycles-pp.perf_event_release_kernel.perf_release.__fput.task_work_run.do_exit
>       6.14 ±113%      -4.3        1.85 ±223%  perf-profile.children.cycles-pp._compound_head
>       3.86 ± 73%      -2.5        1.39 ±223%  perf-profile.children.cycles-pp.tlb_finish_mmu
>       3.86 ± 73%      -1.1        2.78 ±141%  perf-profile.children.cycles-pp.put_files_struct
>       3.86 ± 73%      -1.1        2.78 ±141%  perf-profile.children.cycles-pp.filp_close
>       3.86 ± 73%      +0.5        4.36 ±149%  perf-profile.children.cycles-pp.__fput
>       3.86 ± 73%      +0.5        4.36 ±149%  perf-profile.children.cycles-pp.task_work_run
>       3.86 ± 73%      +0.5        4.36 ±149%  perf-profile.children.cycles-pp.perf_release
>       3.86 ± 73%      +0.5        4.36 ±149%  perf-profile.children.cycles-pp.perf_event_release_kernel
>       4.95 ±121%      -3.1        1.85 ±223%  perf-profile.self.cycles-pp._compound_head
>       3.55 ±104%      -1.4        2.18 ±149%  perf-profile.self.cycles-pp._raw_spin_lock
>       0.03 ± 73%     -96.9%       0.00 ± 57%  perf-sched.sch_delay.avg.ms.__cond_resched.btrfs_alloc_path.btrfs_drop_extents.maybe_insert_hole.btrfs_cont_expand
>       0.00 ±152%    +433.3%       0.00 ± 41%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc.alloc_extent_state.__set_extent_bit.lock_extent
>       0.05 ±  8%     +21.9%       0.06 ±  5%  perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
>       3.25 ± 28%     -99.6%       0.01 ± 28%  perf-sched.sch_delay.max.ms.__cond_resched.btrfs_alloc_path.btrfs_drop_extents.maybe_insert_hole.btrfs_cont_expand
>       0.90 ± 65%     -81.0%       0.17 ± 96%  perf-sched.wait_and_delay.avg.ms.__cond_resched.btrfs_alloc_path.btrfs_drop_extents.maybe_insert_hole.btrfs_cont_expand
>       0.36 ± 41%     -55.4%       0.16 ± 49%  perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.__btrfs_tree_read_lock
>     335.83 ± 25%     -83.9%      54.00 ± 73%  perf-sched.wait_and_delay.count.__cond_resched.btrfs_alloc_path.btrfs_drop_extents.maybe_insert_hole.btrfs_cont_expand
>     625.17 ± 14%     +87.2%       1170 ±  6%  perf-sched.wait_and_delay.count.__cond_resched.kmem_cache_alloc.alloc_extent_state.__set_extent_bit.set_extent_bit
>     674.33 ± 14%     +77.6%       1197 ±  5%  perf-sched.wait_and_delay.count.__cond_resched.kmem_cache_alloc.start_transaction.btrfs_dirty_inode.btrfs_setattr
>     232.33 ± 23%     -76.9%      53.67 ± 73%  perf-sched.wait_and_delay.count.__cond_resched.mutex_lock.btrfs_delayed_update_inode.btrfs_update_inode.btrfs_dirty_inode
>     375.00 ± 24%     -76.8%      87.17 ± 19%  perf-sched.wait_and_delay.count.__cond_resched.mutex_lock.btrfs_delayed_update_inode.btrfs_update_inode.btrfs_setsize
>     889.33 ± 38%     +48.6%       1321 ±  4%  perf-sched.wait_and_delay.count.io_schedule.rq_qos_wait.wbt_wait.__rq_qos_throttle
>       1318 ± 25%     +31.6%       1735 ±  3%  perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
>       6.20 ± 35%     -70.6%       1.83 ± 93%  perf-sched.wait_and_delay.max.ms.__cond_resched.mutex_lock.btrfs_delayed_update_inode.btrfs_update_inode.btrfs_dirty_inode
>       7.65 ±  6%     -47.8%       4.00 ± 43%  perf-sched.wait_and_delay.max.ms.__cond_resched.mutex_lock.btrfs_delayed_update_inode.btrfs_update_inode.btrfs_setsize
>     657.21 ± 88%     -76.7%     153.27 ± 27%  perf-sched.wait_and_delay.max.ms.io_schedule.rq_qos_wait.wbt_wait.__rq_qos_throttle
>       0.86 ± 65%     -64.1%       0.31 ± 53%  perf-sched.wait_time.avg.ms.__cond_resched.btrfs_alloc_path.btrfs_drop_extents.maybe_insert_hole.btrfs_cont_expand
>       0.28 ± 93%     -89.1%       0.03 ±145%  perf-sched.wait_time.avg.ms.__cond_resched.down_read.__btrfs_tree_read_lock.btrfs_search_slot.btrfs_next_old_leaf
>       0.01 ±136%   +5431.2%       0.74 ±118%  perf-sched.wait_time.avg.ms.__cond_resched.down_write.do_truncate.do_sys_ftruncate.do_syscall_64
>       0.36 ± 42%     -56.5%       0.15 ± 50%  perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.__btrfs_tree_read_lock
>       0.51 ± 23%     -95.1%       0.02 ±169%  perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.__btrfs_tree_lock
>       0.95 ±145%     -90.1%       0.09 ±158%  perf-sched.wait_time.max.ms.__cond_resched.down_read.__btrfs_tree_read_lock.btrfs_search_slot.btrfs_next_old_leaf
>       5.34 ± 25%     -56.9%       2.30 ± 74%  perf-sched.wait_time.max.ms.__cond_resched.down_write.__btrfs_tree_lock.btrfs_search_slot.btrfs_insert_empty_items
>       0.02 ±123%   +5763.7%       1.00 ±134%  perf-sched.wait_time.max.ms.__cond_resched.down_write.do_truncate.do_sys_ftruncate.do_syscall_64
>       5.92 ± 21%     +31.5%       7.78 ± 11%  perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc.start_transaction.btrfs_dirty_inode.btrfs_setattr
>       7.50 ±  6%     -46.7%       4.00 ± 43%  perf-sched.wait_time.max.ms.__cond_resched.mutex_lock.btrfs_delayed_update_inode.btrfs_update_inode.btrfs_setsize
>     656.70 ± 87%     -77.0%     151.28 ± 28%  perf-sched.wait_time.max.ms.io_schedule.rq_qos_wait.wbt_wait.__rq_qos_throttle
>       2.69 ± 32%     -98.2%       0.05 ±147%  perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.__btrfs_tree_lock
>       3.16 ±  3%     -11.0%       2.81 ±  2%  perf-stat.i.MPKI
>       0.41 ±  4%      -0.1        0.34 ±  4%  perf-stat.i.branch-miss-rate%
>   23832019 ±  2%     -14.2%   20447460 ±  4%  perf-stat.i.branch-misses
>      39.99            -1.0       39.00        perf-stat.i.cache-miss-rate%
>   91853184 ±  3%     -12.0%   80862154        perf-stat.i.cache-misses
>  2.304e+08 ±  2%      -9.3%  2.091e+08        perf-stat.i.cache-references
>     229178 ± 17%     -44.5%     127108 ± 17%  perf-stat.i.context-switches
>       6.84            +9.1%       7.46        perf-stat.i.cpi
>  2.028e+11 ±  2%      +6.3%  2.156e+11        perf-stat.i.cpu-cycles
>       2455 ±  9%     -26.0%       1816 ±  7%  perf-stat.i.cpu-migrations
>       2190 ±  4%     +22.4%       2679 ±  2%  perf-stat.i.cycles-between-cache-misses
>   2.41e+09            -9.6%  2.178e+09        perf-stat.i.dTLB-stores
>       0.16 ±  2%      -9.2%       0.14        perf-stat.i.ipc
>       3.17 ±  2%      +6.3%       3.37        perf-stat.i.metric.GHz
>     596.33           -13.7%     514.82        perf-stat.i.metric.K/sec
>     254.22            -2.5%     247.79        perf-stat.i.metric.M/sec
>       3354            -4.9%       3190        perf-stat.i.minor-faults
>   20345926           -12.1%   17889584        perf-stat.i.node-load-misses
>     670929 ±  3%      -8.4%     614724 ±  4%  perf-stat.i.node-loads
>      78.76            +1.8       80.53        perf-stat.i.node-store-miss-rate%
>   13307645           -13.5%   11505149 ±  2%  perf-stat.i.node-store-misses
>    3521965 ±  2%     -20.2%    2810680 ±  2%  perf-stat.i.node-stores
>       3354            -4.9%       3191        perf-stat.i.page-faults
>       3.14 ±  3%     -10.5%       2.81 ±  2%  perf-stat.overall.MPKI
>       0.38 ±  3%      -0.1        0.33 ±  3%  perf-stat.overall.branch-miss-rate%
>      39.88            -1.2       38.69        perf-stat.overall.cache-miss-rate%
>       6.94            +8.0%       7.50        perf-stat.overall.cpi
>       2212 ±  3%     +20.6%       2669 ±  2%  perf-stat.overall.cycles-between-cache-misses
>       0.00 ±  5%      +0.0        0.00 ±  4%  perf-stat.overall.dTLB-store-miss-rate%
>       0.14            -7.4%       0.13        perf-stat.overall.ipc
>      79.10            +1.3       80.38        perf-stat.overall.node-store-miss-rate%
>      74346           +10.8%      82392        perf-stat.overall.path-length
>   23499855 ±  2%     -14.1%   20194868 ±  3%  perf-stat.ps.branch-misses
>   90797306 ±  3%     -11.9%   80007324        perf-stat.ps.cache-misses
>  2.277e+08 ±  2%      -9.2%  2.068e+08        perf-stat.ps.cache-references
>     223550 ± 16%     -44.4%     124400 ± 17%  perf-stat.ps.context-switches
>  2.007e+11 ±  2%      +6.4%  2.135e+11        perf-stat.ps.cpu-cycles
>       2398 ±  9%     -25.8%       1779 ±  7%  perf-stat.ps.cpu-migrations
>  2.381e+09            -9.5%  2.154e+09        perf-stat.ps.dTLB-stores
>       3317            -4.7%       3162        perf-stat.ps.minor-faults
>   20103614           -12.0%   17697322        perf-stat.ps.node-load-misses
>     663246 ±  3%      -8.3%     608333 ±  4%  perf-stat.ps.node-loads
>   13156513           -13.5%   11384162 ±  2%  perf-stat.ps.node-store-misses
>    3475107 ±  2%     -20.1%    2778120 ±  2%  perf-stat.ps.node-stores
>       3317            -4.7%       3162        perf-stat.ps.page-faults
>  2.495e+12           +10.8%  2.765e+12        perf-stat.total.instructions
> 
> 
> 
> 
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
> 
> 
> -- 
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
> 

Thread overview: 4+ messages
2023-09-26  7:34 [kdave-btrfs-devel:dev/guilherme/temp-fsid-v4] [btrfs] 6c9131ed0d: stress-ng.sync-file.ops_per_sec -44.2% regression kernel test robot
2023-09-26 19:01 ` Filipe Manana [this message]
2023-09-26 19:08   ` Josef Bacik
2023-09-27 17:47     ` David Sterba
