* [linus:master] [btrfs] f9a48549a1: fio.write_iops 6.5% regression
@ 2026-04-29 7:40 kernel test robot
From: kernel test robot @ 2026-04-29 7:40 UTC (permalink / raw)
To: Leo Martins
Cc: oe-lkp, lkp, linux-kernel, David Sterba, Filipe Manana,
Sun YangKai, Boris Burkov, linux-btrfs, oliver.sang
Hello,
kernel test robot noticed a 6.5% regression of fio.write_iops on:
commit: f9a48549a15aa369d42cebc08a6a72b71a53d547 ("btrfs: inhibit extent buffer writeback to prevent COW amplification")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[still regression on linus/master 27d128c1cff64c3b8012cc56dd5a1391bb4f1821]
[still regression on linux-next/master 7080e32d3f09d8688c4a87d81bdcc71f7f606b16]
testcase: fio-basic
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:
runtime: 300s
disk: 1HDD
fs: btrfs
nr_task: 1
test_size: 128G
rw: randwrite
bs: 4k
ioengine: vsync
cpufreq_governor: performance
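For readers who want to approximate this workload outside the LKP harness, the parameters above map roughly onto a fio job file. The sketch below only renders such a file as text. The helper name, the job name and the target directory are hypothetical (the report does not state the mount point), and the mapping of test_size to size= and nr_task to numjobs= is an assumption about how the harness drives fio, not something taken from the lkp-tests job definition.

```python
# Hypothetical helper: renders the report's parameters as a fio job file.
# The option names (rw, bs, ioengine, size, runtime, numjobs, directory)
# are standard fio job options; the values come from the parameter list.
PARAMS = {
    "rw": "randwrite",
    "bs": "4k",
    "ioengine": "vsync",
    "size": "128G",    # assumed mapping of test_size
    "runtime": "300",  # seconds (300s in the report)
    "numjobs": "1",    # assumed mapping of nr_task
}


def render_fio_job(params, directory="/mnt/btrfs-test", name="randwrite-4k"):
    """Return fio job file text. `directory` is a placeholder path."""
    lines = [f"[{name}]", f"directory={directory}"]
    lines += [f"{key}={value}" for key, value in params.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_fio_job(PARAMS), end="")
```

The output can be saved to a file and passed to fio on a btrfs filesystem backed by a single HDD. The exact reproduction materials are the ones linked in the report, not this sketch.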
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202604291540.72917ba4-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260429/202604291540.72917ba4-lkp@intel.com
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase:
4k/gcc-14/performance/1HDD/btrfs/vsync/x86_64-rhel-9.4/1/debian-13-x86_64-20250902.cgz/300s/randwrite/lkp-icl-2sp9/128G/fio-basic
commit:
cab4c8b594 ("btrfs: extract the max compression chunk size into a macro")
f9a48549a1 ("btrfs: inhibit extent buffer writeback to prevent COW amplification")
cab4c8b594e23649  f9a48549a15aa369d42cebc08a6
----------------  ---------------------------
value ± %stddev  %change  value ± %stddev  metric
0.89 +0.9 1.81 ± 13% fio.latency_1000us%
0.03 ± 9% +0.0 0.05 ± 14% fio.latency_100ms%
0.11 ± 4% +0.0 0.12 ± 8% fio.latency_10ms%
0.41 +0.0 0.43 fio.latency_20ms%
0.01 +0.0 0.02 ± 15% fio.latency_250ms%
0.32 ± 3% +0.1 0.41 ± 5% fio.latency_2ms%
38.01 -1.8 36.25 ± 2% fio.latency_500us%
3854021 -6.4% 3605989 fio.time.file_system_outputs
568393 -4.2% 544573 fio.time.voluntary_context_switches
481752 -6.4% 450748 fio.workload
6.27 -6.5% 5.86 fio.write_bw_MBps
453290 +2.3% 463530 fio.write_clat_95%_ns
1452714 +27.1% 1845930 ± 12% fio.write_clat_99%_ns
622412 +6.9% 665559 fio.write_clat_mean_ns
9945919 +2.6% 10208918 fio.write_clat_stddev
1605 -6.5% 1501 fio.write_iops
1.66 +7.0% 1.77 iostat.cpu.iowait
0.99 +7.2% 1.06 turbostat.IPC
1.66 +0.1 1.78 mpstat.cpu.all.iowait%
0.01 ± 9% +0.0 0.01 ± 9% mpstat.cpu.all.soft%
0.15 ± 2% +0.0 0.16 mpstat.cpu.all.sys%
11925 +43.4% 17095 ± 7% vmstat.io.bo
1.06 +7.4% 1.13 vmstat.procs.b
11590 -2.7% 11278 vmstat.system.cs
1.24 ± 4% -0.1 1.15 ± 2% perf-stat.i.branch-miss-rate%
494667 ± 4% +13.7% 562455 ± 6% perf-stat.i.cache-misses
3724781 ± 2% +6.8% 3976385 perf-stat.i.cache-references
11645 -2.7% 11330 perf-stat.i.context-switches
1.03 -8.8% 0.94 perf-stat.i.cpi
1.06 +11.7% 1.18 ± 2% perf-stat.i.ipc
3.98 ± 5% -0.5 3.53 ± 3% perf-stat.overall.branch-miss-rate%
1.07 -6.9% 0.99 perf-stat.overall.cpi
1739 ± 2% -10.3% 1561 ± 6% perf-stat.overall.cycles-between-cache-misses
0.94 +7.5% 1.01 perf-stat.overall.ipc
504122 ± 4% +17.0% 589787 ± 2% perf-stat.overall.path-length
493093 ± 4% +13.7% 560650 ± 6% perf-stat.ps.cache-misses
3715678 ± 2% +6.7% 3966204 perf-stat.ps.cache-references
11607 -2.7% 11293 perf-stat.ps.context-switches
0.68 ± 20% -0.4 0.30 ±100% perf-profile.calltrace.cycles-pp.__schedule.schedule.io_schedule.folio_wait_bit_common.folio_wait_writeback
0.70 ± 19% -0.3 0.40 ± 71% perf-profile.calltrace.cycles-pp.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range.filemap_fdatawait_range
1.18 ± 15% +0.4 1.57 ± 11% perf-profile.calltrace.cycles-pp.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.pv_native_safe_halt.acpi_safe_halt
1.12 ± 14% +0.4 1.51 ± 12% perf-profile.calltrace.cycles-pp.handle_softirqs.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.pv_native_safe_halt
0.45 ± 72% +0.5 0.96 ± 12% perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.30 ±100% +0.5 0.84 ± 11% perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.__irq_exit_rcu.sysvec_apic_timer_interrupt
0.00 +0.6 0.63 ± 7% perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs.__irq_exit_rcu
0.00 +0.9 0.90 ± 17% perf-profile.calltrace.cycles-pp.xas_find.xa_find_after.btrfs_uninhibit_all_eb_writeback.__btrfs_end_transaction.btrfs_finish_one_ordered
0.00 +1.0 0.96 ± 16% perf-profile.calltrace.cycles-pp.xa_find_after.btrfs_uninhibit_all_eb_writeback.__btrfs_end_transaction.btrfs_finish_one_ordered.btrfs_work_helper
0.00 +2.1 2.09 ± 18% perf-profile.calltrace.cycles-pp.btrfs_uninhibit_all_eb_writeback.__btrfs_end_transaction.btrfs_finish_one_ordered.btrfs_work_helper.process_one_work
0.00 +2.2 2.16 ± 18% perf-profile.calltrace.cycles-pp.__btrfs_end_transaction.btrfs_finish_one_ordered.btrfs_work_helper.process_one_work.worker_thread
0.34 ± 79% -0.3 0.06 ± 73% perf-profile.children.cycles-pp.memcpy_extent_buffer
0.46 ± 12% -0.1 0.33 ± 15% perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
0.16 ± 40% -0.1 0.04 ±107% perf-profile.children.cycles-pp.lock_extent_buffer_for_io
0.47 ± 16% -0.1 0.38 ± 15% perf-profile.children.cycles-pp.update_load_avg
0.13 ± 23% -0.1 0.06 ± 62% perf-profile.children.cycles-pp.menu_reflect
0.04 ± 72% +0.1 0.10 ± 9% perf-profile.children.cycles-pp.__mem_cgroup_charge
0.04 ± 73% +0.1 0.12 ± 29% perf-profile.children.cycles-pp.scsi_dma_map
0.10 ± 41% +0.1 0.18 ± 34% perf-profile.children.cycles-pp.lookup_extent_backref
0.02 ±141% +0.1 0.10 ± 30% perf-profile.children.cycles-pp.__dma_map_sg_attrs
0.02 ±141% +0.1 0.10 ± 29% perf-profile.children.cycles-pp.dma_map_sg_attrs
0.04 ± 77% +0.1 0.13 ± 33% perf-profile.children.cycles-pp.___perf_sw_event
0.22 ± 16% +0.1 0.31 ± 19% perf-profile.children.cycles-pp.__call_rcu_common
0.14 ± 30% +0.1 0.24 ± 24% perf-profile.children.cycles-pp.lookup_inline_extent_backref
0.07 ± 55% +0.1 0.19 ± 18% perf-profile.children.cycles-pp.__refill_objects_node
0.14 ± 54% +0.1 0.26 ± 19% perf-profile.children.cycles-pp.__memcg_slab_free_hook
0.20 ± 26% +0.1 0.34 ± 16% perf-profile.children.cycles-pp.__pcs_replace_empty_main
0.38 ± 18% +0.2 0.56 ± 11% perf-profile.children.cycles-pp.xas_alloc
0.04 ±110% +0.2 0.25 ± 22% perf-profile.children.cycles-pp.xa_find
0.09 ± 46% +0.2 0.32 ± 97% perf-profile.children.cycles-pp.__pcs_replace_full_main
0.46 ± 26% +0.3 0.78 ± 11% perf-profile.children.cycles-pp.xas_create
0.04 ±108% +0.4 0.44 ± 23% perf-profile.children.cycles-pp.__xa_store
0.66 ± 26% +0.4 1.05 ± 10% perf-profile.children.cycles-pp.rcu_core
0.52 ± 29% +0.4 0.92 ± 10% perf-profile.children.cycles-pp.rcu_do_batch
0.04 ±108% +0.4 0.49 ± 23% perf-profile.children.cycles-pp.xa_store
0.00 +0.7 0.69 ± 23% perf-profile.children.cycles-pp.xas_free_nodes
0.00 +0.7 0.72 ± 25% perf-profile.children.cycles-pp.xa_destroy
0.00 +0.8 0.76 ± 20% perf-profile.children.cycles-pp.btrfs_inhibit_eb_writeback
0.00 +1.0 0.96 ± 17% perf-profile.children.cycles-pp.xa_find_after
0.50 ± 7% +1.1 1.62 ± 15% perf-profile.children.cycles-pp.xas_find
0.00 +2.1 2.11 ± 18% perf-profile.children.cycles-pp.btrfs_uninhibit_all_eb_writeback
0.04 ± 77% +2.1 2.17 ± 18% perf-profile.children.cycles-pp.__btrfs_end_transaction
0.12 ± 23% -0.1 0.04 ±105% perf-profile.self.cycles-pp.finish_task_switch
0.01 ±223% +0.1 0.10 ± 35% perf-profile.self.cycles-pp.xa_load
0.00 +0.1 0.09 ± 33% perf-profile.self.cycles-pp.btrfs_uninhibit_all_eb_writeback
0.04 ± 77% +0.1 0.14 ± 34% perf-profile.self.cycles-pp.__call_rcu_common
0.06 ± 58% +0.1 0.18 ± 19% perf-profile.self.cycles-pp.__refill_objects_node
0.05 ±104% +0.1 0.20 ± 22% perf-profile.self.cycles-pp.__memcg_slab_free_hook
0.06 ±100% +0.2 0.21 ± 23% perf-profile.self.cycles-pp.xas_create
0.45 ± 25% +0.2 0.65 ± 17% perf-profile.self.cycles-pp.kmem_cache_free
0.00 +0.6 0.56 ± 19% perf-profile.self.cycles-pp.xas_free_nodes
0.01 ±223% +1.1 1.08 ± 15% perf-profile.self.cycles-pp.xas_find
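As a reading aid for the table above, the %change column can be recomputed from the two value columns. The sketch below does this for a few rows copied from the table. The function name and the row selection are illustrative only. Because the table prints rounded values, the recomputed figures should only be expected to match to about one decimal place.

```python
def pct_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


# (metric, value on cab4c8b594, value on f9a48549a1, %change as printed)
ROWS = [
    ("fio.write_iops", 1605, 1501, -6.5),
    ("fio.write_bw_MBps", 6.27, 5.86, -6.5),
    ("fio.workload", 481752, 450748, -6.4),
    ("fio.write_clat_99%_ns", 1452714, 1845930, 27.1),
    ("vmstat.io.bo", 11925, 17095, 43.4),
]

for name, base, new, reported in ROWS:
    computed = pct_change(base, new)
    print(f"{name:24s} computed {computed:+6.1f}%  reported {reported:+6.1f}%")

# Metrics whose names end in "%" (latency buckets, iowait%, perf-profile
# cycles-pp) appear to be reported as absolute percentage-point
# differences rather than relative changes.
print(f"fio.latency_500us%       {36.25 - 38.01:+.1f} points (reported -1.8)")
```

Read this way, the roughly 6.5% drop in write IOPS and bandwidth comes alongside a 43.4% increase in vmstat.io.bo (blocks written out) for the same workload.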
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki