* [linus:master] [btrfs] f9a48549a1: fio.write_iops 6.5% regression
@ 2026-04-29  7:40 kernel test robot
From: kernel test robot @ 2026-04-29  7:40 UTC (permalink / raw)
  To: Leo Martins
  Cc: oe-lkp, lkp, linux-kernel, David Sterba, Filipe Manana,
	Sun YangKai, Boris Burkov, linux-btrfs, oliver.sang



Hello,

kernel test robot noticed a 6.5% regression of fio.write_iops on:


commit: f9a48549a15aa369d42cebc08a6a72b71a53d547 ("btrfs: inhibit extent buffer writeback to prevent COW amplification")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[still regression on linus/master      27d128c1cff64c3b8012cc56dd5a1391bb4f1821]
[still regression on linux-next/master 7080e32d3f09d8688c4a87d81bdcc71f7f606b16]

testcase: fio-basic
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	runtime: 300s
	disk: 1HDD
	fs: btrfs
	nr_task: 1
	test_size: 128G
	rw: randwrite
	bs: 4k
	ioengine: vsync
	cpufreq_governor: performance
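
For readers who want to approximate this workload outside the LKP harness, the parameters above can be mapped onto a fio command line. This is only a sketch: the mapping of `nr_task` to `--numjobs` and `test_size` to `--size`, the use of `--time_based`, the job name, and the mount point `/mnt/btrfs-test` are assumptions, not taken from the lkp-tests job definition. The materials linked in this report remain the authoritative way to reproduce.

```python
import shlex

# Parameters copied from the report above.
params = {
    "runtime": "300s",
    "nr_task": 1,
    "test_size": "128G",
    "rw": "randwrite",
    "bs": "4k",
    "ioengine": "vsync",
}


def fio_argv(p, directory):
    """Build a fio argument list from LKP-style parameters.

    The option mapping is an assumption made for illustration; it is not
    derived from the lkp-tests scripts.
    """
    return [
        "fio",
        "--name=randwrite-4k",          # hypothetical job name
        f"--directory={directory}",     # hypothetical btrfs mount point
        f"--rw={p['rw']}",
        f"--bs={p['bs']}",
        f"--ioengine={p['ioengine']}",
        f"--numjobs={p['nr_task']}",    # assumed: nr_task -> numjobs
        f"--size={p['test_size']}",     # assumed: test_size -> size
        f"--runtime={p['runtime']}",
        "--time_based",                 # assumed: run for the full runtime
    ]


# Print the command instead of running it, so nothing touches a disk.
print(shlex.join(fio_argv(params, "/mnt/btrfs-test")))
```

The command is printed rather than executed because it would write 128G to the target directory.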



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202604291540.72917ba4-lkp@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260429/202604291540.72917ba4-lkp@intel.com

=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase:
  4k/gcc-14/performance/1HDD/btrfs/vsync/x86_64-rhel-9.4/1/debian-13-x86_64-20250902.cgz/300s/randwrite/lkp-icl-2sp9/128G/fio-basic

commit: 
  cab4c8b594 ("btrfs: extract the max compression chunk size into a macro")
  f9a48549a1 ("btrfs: inhibit extent buffer writeback to prevent COW amplification")

cab4c8b594e23649 f9a48549a15aa369d42cebc08a6 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.89            +0.9        1.81 ± 13%  fio.latency_1000us%
      0.03 ±  9%      +0.0        0.05 ± 14%  fio.latency_100ms%
      0.11 ±  4%      +0.0        0.12 ±  8%  fio.latency_10ms%
      0.41            +0.0        0.43        fio.latency_20ms%
      0.01            +0.0        0.02 ± 15%  fio.latency_250ms%
      0.32 ±  3%      +0.1        0.41 ±  5%  fio.latency_2ms%
     38.01            -1.8       36.25 ±  2%  fio.latency_500us%
   3854021            -6.4%    3605989        fio.time.file_system_outputs
    568393            -4.2%     544573        fio.time.voluntary_context_switches
    481752            -6.4%     450748        fio.workload
      6.27            -6.5%       5.86        fio.write_bw_MBps
    453290            +2.3%     463530        fio.write_clat_95%_ns
   1452714           +27.1%    1845930 ± 12%  fio.write_clat_99%_ns
    622412            +6.9%     665559        fio.write_clat_mean_ns
   9945919            +2.6%   10208918        fio.write_clat_stddev
      1605            -6.5%       1501        fio.write_iops
      1.66            +7.0%       1.77        iostat.cpu.iowait
      0.99            +7.2%       1.06        turbostat.IPC
      1.66            +0.1        1.78        mpstat.cpu.all.iowait%
      0.01 ±  9%      +0.0        0.01 ±  9%  mpstat.cpu.all.soft%
      0.15 ±  2%      +0.0        0.16        mpstat.cpu.all.sys%
     11925           +43.4%      17095 ±  7%  vmstat.io.bo
      1.06            +7.4%       1.13        vmstat.procs.b
     11590            -2.7%      11278        vmstat.system.cs
      1.24 ±  4%      -0.1        1.15 ±  2%  perf-stat.i.branch-miss-rate%
    494667 ±  4%     +13.7%     562455 ±  6%  perf-stat.i.cache-misses
   3724781 ±  2%      +6.8%    3976385        perf-stat.i.cache-references
     11645            -2.7%      11330        perf-stat.i.context-switches
      1.03            -8.8%       0.94        perf-stat.i.cpi
      1.06           +11.7%       1.18 ±  2%  perf-stat.i.ipc
      3.98 ±  5%      -0.5        3.53 ±  3%  perf-stat.overall.branch-miss-rate%
      1.07            -6.9%       0.99        perf-stat.overall.cpi
      1739 ±  2%     -10.3%       1561 ±  6%  perf-stat.overall.cycles-between-cache-misses
      0.94            +7.5%       1.01        perf-stat.overall.ipc
    504122 ±  4%     +17.0%     589787 ±  2%  perf-stat.overall.path-length
    493093 ±  4%     +13.7%     560650 ±  6%  perf-stat.ps.cache-misses
   3715678 ±  2%      +6.7%    3966204        perf-stat.ps.cache-references
     11607            -2.7%      11293        perf-stat.ps.context-switches
      0.68 ± 20%      -0.4        0.30 ±100%  perf-profile.calltrace.cycles-pp.__schedule.schedule.io_schedule.folio_wait_bit_common.folio_wait_writeback
      0.70 ± 19%      -0.3        0.40 ± 71%  perf-profile.calltrace.cycles-pp.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range.filemap_fdatawait_range
      1.18 ± 15%      +0.4        1.57 ± 11%  perf-profile.calltrace.cycles-pp.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.pv_native_safe_halt.acpi_safe_halt
      1.12 ± 14%      +0.4        1.51 ± 12%  perf-profile.calltrace.cycles-pp.handle_softirqs.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.pv_native_safe_halt
      0.45 ± 72%      +0.5        0.96 ± 12%  perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
      0.30 ±100%      +0.5        0.84 ± 11%  perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.__irq_exit_rcu.sysvec_apic_timer_interrupt
      0.00            +0.6        0.63 ±  7%  perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs.__irq_exit_rcu
      0.00            +0.9        0.90 ± 17%  perf-profile.calltrace.cycles-pp.xas_find.xa_find_after.btrfs_uninhibit_all_eb_writeback.__btrfs_end_transaction.btrfs_finish_one_ordered
      0.00            +1.0        0.96 ± 16%  perf-profile.calltrace.cycles-pp.xa_find_after.btrfs_uninhibit_all_eb_writeback.__btrfs_end_transaction.btrfs_finish_one_ordered.btrfs_work_helper
      0.00            +2.1        2.09 ± 18%  perf-profile.calltrace.cycles-pp.btrfs_uninhibit_all_eb_writeback.__btrfs_end_transaction.btrfs_finish_one_ordered.btrfs_work_helper.process_one_work
      0.00            +2.2        2.16 ± 18%  perf-profile.calltrace.cycles-pp.__btrfs_end_transaction.btrfs_finish_one_ordered.btrfs_work_helper.process_one_work.worker_thread
      0.34 ± 79%      -0.3        0.06 ± 73%  perf-profile.children.cycles-pp.memcpy_extent_buffer
      0.46 ± 12%      -0.1        0.33 ± 15%  perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
      0.16 ± 40%      -0.1        0.04 ±107%  perf-profile.children.cycles-pp.lock_extent_buffer_for_io
      0.47 ± 16%      -0.1        0.38 ± 15%  perf-profile.children.cycles-pp.update_load_avg
      0.13 ± 23%      -0.1        0.06 ± 62%  perf-profile.children.cycles-pp.menu_reflect
      0.04 ± 72%      +0.1        0.10 ±  9%  perf-profile.children.cycles-pp.__mem_cgroup_charge
      0.04 ± 73%      +0.1        0.12 ± 29%  perf-profile.children.cycles-pp.scsi_dma_map
      0.10 ± 41%      +0.1        0.18 ± 34%  perf-profile.children.cycles-pp.lookup_extent_backref
      0.02 ±141%      +0.1        0.10 ± 30%  perf-profile.children.cycles-pp.__dma_map_sg_attrs
      0.02 ±141%      +0.1        0.10 ± 29%  perf-profile.children.cycles-pp.dma_map_sg_attrs
      0.04 ± 77%      +0.1        0.13 ± 33%  perf-profile.children.cycles-pp.___perf_sw_event
      0.22 ± 16%      +0.1        0.31 ± 19%  perf-profile.children.cycles-pp.__call_rcu_common
      0.14 ± 30%      +0.1        0.24 ± 24%  perf-profile.children.cycles-pp.lookup_inline_extent_backref
      0.07 ± 55%      +0.1        0.19 ± 18%  perf-profile.children.cycles-pp.__refill_objects_node
      0.14 ± 54%      +0.1        0.26 ± 19%  perf-profile.children.cycles-pp.__memcg_slab_free_hook
      0.20 ± 26%      +0.1        0.34 ± 16%  perf-profile.children.cycles-pp.__pcs_replace_empty_main
      0.38 ± 18%      +0.2        0.56 ± 11%  perf-profile.children.cycles-pp.xas_alloc
      0.04 ±110%      +0.2        0.25 ± 22%  perf-profile.children.cycles-pp.xa_find
      0.09 ± 46%      +0.2        0.32 ± 97%  perf-profile.children.cycles-pp.__pcs_replace_full_main
      0.46 ± 26%      +0.3        0.78 ± 11%  perf-profile.children.cycles-pp.xas_create
      0.04 ±108%      +0.4        0.44 ± 23%  perf-profile.children.cycles-pp.__xa_store
      0.66 ± 26%      +0.4        1.05 ± 10%  perf-profile.children.cycles-pp.rcu_core
      0.52 ± 29%      +0.4        0.92 ± 10%  perf-profile.children.cycles-pp.rcu_do_batch
      0.04 ±108%      +0.4        0.49 ± 23%  perf-profile.children.cycles-pp.xa_store
      0.00            +0.7        0.69 ± 23%  perf-profile.children.cycles-pp.xas_free_nodes
      0.00            +0.7        0.72 ± 25%  perf-profile.children.cycles-pp.xa_destroy
      0.00            +0.8        0.76 ± 20%  perf-profile.children.cycles-pp.btrfs_inhibit_eb_writeback
      0.00            +1.0        0.96 ± 17%  perf-profile.children.cycles-pp.xa_find_after
      0.50 ±  7%      +1.1        1.62 ± 15%  perf-profile.children.cycles-pp.xas_find
      0.00            +2.1        2.11 ± 18%  perf-profile.children.cycles-pp.btrfs_uninhibit_all_eb_writeback
      0.04 ± 77%      +2.1        2.17 ± 18%  perf-profile.children.cycles-pp.__btrfs_end_transaction
      0.12 ± 23%      -0.1        0.04 ±105%  perf-profile.self.cycles-pp.finish_task_switch
      0.01 ±223%      +0.1        0.10 ± 35%  perf-profile.self.cycles-pp.xa_load
      0.00            +0.1        0.09 ± 33%  perf-profile.self.cycles-pp.btrfs_uninhibit_all_eb_writeback
      0.04 ± 77%      +0.1        0.14 ± 34%  perf-profile.self.cycles-pp.__call_rcu_common
      0.06 ± 58%      +0.1        0.18 ± 19%  perf-profile.self.cycles-pp.__refill_objects_node
      0.05 ±104%      +0.1        0.20 ± 22%  perf-profile.self.cycles-pp.__memcg_slab_free_hook
      0.06 ±100%      +0.2        0.21 ± 23%  perf-profile.self.cycles-pp.xas_create
      0.45 ± 25%      +0.2        0.65 ± 17%  perf-profile.self.cycles-pp.kmem_cache_free
      0.00            +0.6        0.56 ± 19%  perf-profile.self.cycles-pp.xas_free_nodes
      0.01 ±223%      +1.1        1.08 ± 15%  perf-profile.self.cycles-pp.xas_find
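
As a reading aid for the table above: rows whose middle column ends in `%` show a relative change between the two commits, while rows whose metric name ends in `%` (for example `fio.latency_500us%`) show an absolute difference in percentage points. A minimal sketch that recomputes a few rows from the printed values follows; the helper names are hypothetical, and because the table prints rounded means, recomputed figures could differ in the last digit for other rows.

```python
def pct_change(base, patched):
    """Relative change of the patched value versus the base, in percent."""
    return (patched - base) / base * 100.0


def point_change(base, patched):
    """Absolute difference, used for metrics that are already percentages."""
    return patched - base


# (base, patched) pairs copied from the comparison table.
relative_rows = {
    "fio.write_iops": (1605, 1501),
    "fio.write_bw_MBps": (6.27, 5.86),
    "fio.write_clat_99%_ns": (1452714, 1845930),
    "vmstat.io.bo": (11925, 17095),
}
point_rows = {
    "fio.latency_500us%": (38.01, 36.25),
}

for name, (base, patched) in relative_rows.items():
    print(f"{name:24s} {pct_change(base, patched):+6.1f}%")
for name, (base, patched) in point_rows.items():
    print(f"{name:24s} {point_change(base, patched):+6.1f}")
```

With the values shown, the relative rows recompute to -6.5%, -6.5%, +27.1% and +43.4%, and the latency bucket to -1.8, matching the table.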




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

