public inbox for linux-kernel@vger.kernel.org
* [linus:master] [btrfs]  f9a48549a1:  fio.write_iops 6.5% regression
From: kernel test robot @ 2026-04-29  7:40 UTC
  To: Leo Martins
  Cc: oe-lkp, lkp, linux-kernel, David Sterba, Filipe Manana,
	Sun YangKai, Boris Burkov, linux-btrfs, oliver.sang



Hello,

kernel test robot noticed a 6.5% regression of fio.write_iops on:


commit: f9a48549a15aa369d42cebc08a6a72b71a53d547 ("btrfs: inhibit extent buffer writeback to prevent COW amplification")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[still regression on linus/master      27d128c1cff64c3b8012cc56dd5a1391bb4f1821]
[still regression on linux-next/master 7080e32d3f09d8688c4a87d81bdcc71f7f606b16]

testcase: fio-basic
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	runtime: 300s
	disk: 1HDD
	fs: btrfs
	nr_task: 1
	test_size: 128G
	rw: randwrite
	bs: 4k
	ioengine: vsync
	cpufreq_governor: performance
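
For readers who want to approximate this workload by hand, the parameters
above map roughly onto a fio command line. The sketch below only builds and
prints that command. The job name and the mount point /mnt/btrfs-test are
hypothetical, and the exact job file that lkp-tests generates may differ, so
treat this as an approximation rather than the robot's actual invocation.

```python
import shlex


def fio_argv(directory):
    """Build a fio command line approximating the parameters listed above.

    `directory` is a hypothetical mount point of the btrfs filesystem
    under test; the job name is likewise made up for illustration.
    """
    return [
        "fio",
        "--name=randwrite-4k",       # hypothetical job name
        f"--directory={directory}",  # assumed btrfs mount point
        "--rw=randwrite",            # rw: randwrite
        "--bs=4k",                   # bs: 4k
        "--ioengine=vsync",          # ioengine: vsync
        "--numjobs=1",               # nr_task: 1
        "--size=128G",               # test_size: 128G
        "--runtime=300",             # runtime: 300s
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(fio_argv("/mnt/btrfs-test")))
```

The cpufreq governor and the single-HDD setup are properties of the test
machine and are not covered by this command.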



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202604291540.72917ba4-lkp@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260429/202604291540.72917ba4-lkp@intel.com

=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase:
  4k/gcc-14/performance/1HDD/btrfs/vsync/x86_64-rhel-9.4/1/debian-13-x86_64-20250902.cgz/300s/randwrite/lkp-icl-2sp9/128G/fio-basic

commit: 
  cab4c8b594 ("btrfs: extract the max compression chunk size into a macro")
  f9a48549a1 ("btrfs: inhibit extent buffer writeback to prevent COW amplification")

cab4c8b594e23649 f9a48549a15aa369d42cebc08a6 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.89            +0.9        1.81 ± 13%  fio.latency_1000us%
      0.03 ±  9%      +0.0        0.05 ± 14%  fio.latency_100ms%
      0.11 ±  4%      +0.0        0.12 ±  8%  fio.latency_10ms%
      0.41            +0.0        0.43        fio.latency_20ms%
      0.01            +0.0        0.02 ± 15%  fio.latency_250ms%
      0.32 ±  3%      +0.1        0.41 ±  5%  fio.latency_2ms%
     38.01            -1.8       36.25 ±  2%  fio.latency_500us%
   3854021            -6.4%    3605989        fio.time.file_system_outputs
    568393            -4.2%     544573        fio.time.voluntary_context_switches
    481752            -6.4%     450748        fio.workload
      6.27            -6.5%       5.86        fio.write_bw_MBps
    453290            +2.3%     463530        fio.write_clat_95%_ns
   1452714           +27.1%    1845930 ± 12%  fio.write_clat_99%_ns
    622412            +6.9%     665559        fio.write_clat_mean_ns
   9945919            +2.6%   10208918        fio.write_clat_stddev
      1605            -6.5%       1501        fio.write_iops
      1.66            +7.0%       1.77        iostat.cpu.iowait
      0.99            +7.2%       1.06        turbostat.IPC
      1.66            +0.1        1.78        mpstat.cpu.all.iowait%
      0.01 ±  9%      +0.0        0.01 ±  9%  mpstat.cpu.all.soft%
      0.15 ±  2%      +0.0        0.16        mpstat.cpu.all.sys%
     11925           +43.4%      17095 ±  7%  vmstat.io.bo
      1.06            +7.4%       1.13        vmstat.procs.b
     11590            -2.7%      11278        vmstat.system.cs
      1.24 ±  4%      -0.1        1.15 ±  2%  perf-stat.i.branch-miss-rate%
    494667 ±  4%     +13.7%     562455 ±  6%  perf-stat.i.cache-misses
   3724781 ±  2%      +6.8%    3976385        perf-stat.i.cache-references
     11645            -2.7%      11330        perf-stat.i.context-switches
      1.03            -8.8%       0.94        perf-stat.i.cpi
      1.06           +11.7%       1.18 ±  2%  perf-stat.i.ipc
      3.98 ±  5%      -0.5        3.53 ±  3%  perf-stat.overall.branch-miss-rate%
      1.07            -6.9%       0.99        perf-stat.overall.cpi
      1739 ±  2%     -10.3%       1561 ±  6%  perf-stat.overall.cycles-between-cache-misses
      0.94            +7.5%       1.01        perf-stat.overall.ipc
    504122 ±  4%     +17.0%     589787 ±  2%  perf-stat.overall.path-length
    493093 ±  4%     +13.7%     560650 ±  6%  perf-stat.ps.cache-misses
   3715678 ±  2%      +6.7%    3966204        perf-stat.ps.cache-references
     11607            -2.7%      11293        perf-stat.ps.context-switches
      0.68 ± 20%      -0.4        0.30 ±100%  perf-profile.calltrace.cycles-pp.__schedule.schedule.io_schedule.folio_wait_bit_common.folio_wait_writeback
      0.70 ± 19%      -0.3        0.40 ± 71%  perf-profile.calltrace.cycles-pp.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range.filemap_fdatawait_range
      1.18 ± 15%      +0.4        1.57 ± 11%  perf-profile.calltrace.cycles-pp.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.pv_native_safe_halt.acpi_safe_halt
      1.12 ± 14%      +0.4        1.51 ± 12%  perf-profile.calltrace.cycles-pp.handle_softirqs.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.pv_native_safe_halt
      0.45 ± 72%      +0.5        0.96 ± 12%  perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
      0.30 ±100%      +0.5        0.84 ± 11%  perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.__irq_exit_rcu.sysvec_apic_timer_interrupt
      0.00            +0.6        0.63 ±  7%  perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs.__irq_exit_rcu
      0.00            +0.9        0.90 ± 17%  perf-profile.calltrace.cycles-pp.xas_find.xa_find_after.btrfs_uninhibit_all_eb_writeback.__btrfs_end_transaction.btrfs_finish_one_ordered
      0.00            +1.0        0.96 ± 16%  perf-profile.calltrace.cycles-pp.xa_find_after.btrfs_uninhibit_all_eb_writeback.__btrfs_end_transaction.btrfs_finish_one_ordered.btrfs_work_helper
      0.00            +2.1        2.09 ± 18%  perf-profile.calltrace.cycles-pp.btrfs_uninhibit_all_eb_writeback.__btrfs_end_transaction.btrfs_finish_one_ordered.btrfs_work_helper.process_one_work
      0.00            +2.2        2.16 ± 18%  perf-profile.calltrace.cycles-pp.__btrfs_end_transaction.btrfs_finish_one_ordered.btrfs_work_helper.process_one_work.worker_thread
      0.34 ± 79%      -0.3        0.06 ± 73%  perf-profile.children.cycles-pp.memcpy_extent_buffer
      0.46 ± 12%      -0.1        0.33 ± 15%  perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
      0.16 ± 40%      -0.1        0.04 ±107%  perf-profile.children.cycles-pp.lock_extent_buffer_for_io
      0.47 ± 16%      -0.1        0.38 ± 15%  perf-profile.children.cycles-pp.update_load_avg
      0.13 ± 23%      -0.1        0.06 ± 62%  perf-profile.children.cycles-pp.menu_reflect
      0.04 ± 72%      +0.1        0.10 ±  9%  perf-profile.children.cycles-pp.__mem_cgroup_charge
      0.04 ± 73%      +0.1        0.12 ± 29%  perf-profile.children.cycles-pp.scsi_dma_map
      0.10 ± 41%      +0.1        0.18 ± 34%  perf-profile.children.cycles-pp.lookup_extent_backref
      0.02 ±141%      +0.1        0.10 ± 30%  perf-profile.children.cycles-pp.__dma_map_sg_attrs
      0.02 ±141%      +0.1        0.10 ± 29%  perf-profile.children.cycles-pp.dma_map_sg_attrs
      0.04 ± 77%      +0.1        0.13 ± 33%  perf-profile.children.cycles-pp.___perf_sw_event
      0.22 ± 16%      +0.1        0.31 ± 19%  perf-profile.children.cycles-pp.__call_rcu_common
      0.14 ± 30%      +0.1        0.24 ± 24%  perf-profile.children.cycles-pp.lookup_inline_extent_backref
      0.07 ± 55%      +0.1        0.19 ± 18%  perf-profile.children.cycles-pp.__refill_objects_node
      0.14 ± 54%      +0.1        0.26 ± 19%  perf-profile.children.cycles-pp.__memcg_slab_free_hook
      0.20 ± 26%      +0.1        0.34 ± 16%  perf-profile.children.cycles-pp.__pcs_replace_empty_main
      0.38 ± 18%      +0.2        0.56 ± 11%  perf-profile.children.cycles-pp.xas_alloc
      0.04 ±110%      +0.2        0.25 ± 22%  perf-profile.children.cycles-pp.xa_find
      0.09 ± 46%      +0.2        0.32 ± 97%  perf-profile.children.cycles-pp.__pcs_replace_full_main
      0.46 ± 26%      +0.3        0.78 ± 11%  perf-profile.children.cycles-pp.xas_create
      0.04 ±108%      +0.4        0.44 ± 23%  perf-profile.children.cycles-pp.__xa_store
      0.66 ± 26%      +0.4        1.05 ± 10%  perf-profile.children.cycles-pp.rcu_core
      0.52 ± 29%      +0.4        0.92 ± 10%  perf-profile.children.cycles-pp.rcu_do_batch
      0.04 ±108%      +0.4        0.49 ± 23%  perf-profile.children.cycles-pp.xa_store
      0.00            +0.7        0.69 ± 23%  perf-profile.children.cycles-pp.xas_free_nodes
      0.00            +0.7        0.72 ± 25%  perf-profile.children.cycles-pp.xa_destroy
      0.00            +0.8        0.76 ± 20%  perf-profile.children.cycles-pp.btrfs_inhibit_eb_writeback
      0.00            +1.0        0.96 ± 17%  perf-profile.children.cycles-pp.xa_find_after
      0.50 ±  7%      +1.1        1.62 ± 15%  perf-profile.children.cycles-pp.xas_find
      0.00            +2.1        2.11 ± 18%  perf-profile.children.cycles-pp.btrfs_uninhibit_all_eb_writeback
      0.04 ± 77%      +2.1        2.17 ± 18%  perf-profile.children.cycles-pp.__btrfs_end_transaction
      0.12 ± 23%      -0.1        0.04 ±105%  perf-profile.self.cycles-pp.finish_task_switch
      0.01 ±223%      +0.1        0.10 ± 35%  perf-profile.self.cycles-pp.xa_load
      0.00            +0.1        0.09 ± 33%  perf-profile.self.cycles-pp.btrfs_uninhibit_all_eb_writeback
      0.04 ± 77%      +0.1        0.14 ± 34%  perf-profile.self.cycles-pp.__call_rcu_common
      0.06 ± 58%      +0.1        0.18 ± 19%  perf-profile.self.cycles-pp.__refill_objects_node
      0.05 ±104%      +0.1        0.20 ± 22%  perf-profile.self.cycles-pp.__memcg_slab_free_hook
      0.06 ±100%      +0.2        0.21 ± 23%  perf-profile.self.cycles-pp.xas_create
      0.45 ± 25%      +0.2        0.65 ± 17%  perf-profile.self.cycles-pp.kmem_cache_free
      0.00            +0.6        0.56 ± 19%  perf-profile.self.cycles-pp.xas_free_nodes
      0.01 ±223%      +1.1        1.08 ± 15%  perf-profile.self.cycles-pp.xas_find
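
As a quick sanity check on the table above, the %change column for absolute
metrics can be recomputed from the two mean values, and the reported
bandwidth can be cross-checked against IOPS times block size. This is a
minimal sketch, not part of the robot's tooling. It assumes the values shown
are per-commit means and that fio.write_bw_MBps is expressed in MiB/s, which
is what makes the numbers line up. Rows whose change is shown without a "%"
sign (the percentage metrics) are plain differences and are not handled here.

```python
def pct_change(base, new):
    """Relative change from the parent commit to the tested commit, in percent."""
    return (new - base) / base * 100.0


def iops_to_mib_per_s(iops, block_size_bytes):
    """Bandwidth implied by an IOPS figure at a fixed block size, in MiB/s."""
    return iops * block_size_bytes / (1024 * 1024)


if __name__ == "__main__":
    # (metric, value on cab4c8b594, value on f9a48549a1), copied from the table.
    rows = [
        ("fio.write_iops", 1605, 1501),
        ("fio.write_bw_MBps", 6.27, 5.86),
        ("fio.workload", 481752, 450748),
        ("vmstat.io.bo", 11925, 17095),
    ]
    for name, base, new in rows:
        print(f"{name:24s} {pct_change(base, new):+.1f}%")

    # bs is 4k, so each I/O moves 4096 bytes.
    for iops in (1605, 1501):
        print(f"{iops} IOPS at 4k = {iops_to_mib_per_s(iops, 4096):.2f} MiB/s")
```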




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

