* [linux-next:master] [btrfs]  551e510a97:  fio.write_iops 12.4% improvement
@ 2026-05-29  9:04 kernel test robot
  2026-05-29 10:39 ` Mark Harmstone
  0 siblings, 1 reply; 3+ messages in thread
From: kernel test robot @ 2026-05-29  9:04 UTC
  To: Mark Harmstone; +Cc: oe-lkp, lkp, David Sterba, linux-btrfs, oliver.sang



Hello,

kernel test robot noticed a 12.4% improvement of fio.write_iops on:


commit: 551e510a97a487218d5f22d61d1a3388ef1171ac ("btrfs: don't force DIO writes to be serialized")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master


testcase: fio-basic
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	runtime: 300s
	disk: 1HDD
	fs: btrfs
	nr_task: 1
	test_size: 128G
	rw: randwrite
	bs: 4k
	ioengine: falloc
	cpufreq_governor: performance


In addition to that, the commit also has significant impact on the following tests:

+------------------+---------------------------------------------+
| testcase: change | fio-basic: fio.write_iops 10.9% improvement |
| test parameters  | bs=4k                                       |
|                  | cpufreq_governor=performance                |
|                  | disk=1HDD                                   |
|                  | fs=btrfs                                    |
|                  | ioengine=falloc                             |
|                  | nr_task=1                                   |
|                  | runtime=300s                                |
|                  | rw=write                                    |
|                  | test_size=128G                              |
+------------------+---------------------------------------------+




Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260529/202605291631.659bf248-lkp@intel.com

=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase:
  4k/gcc-14/performance/1HDD/btrfs/falloc/x86_64-rhel-9.4/1/debian-13-x86_64-20250902.cgz/300s/randwrite/lkp-icl-2sp9/128G/fio-basic

commit: 
  964f569c14 ("btrfs: limit size of bios submitted from writeback")
  551e510a97 ("btrfs: don't force DIO writes to be serialized")

964f569c14d7778c 551e510a97a487218d5f22d61d1 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.04 ±  5%      -0.0        0.04 ±  5%  fio.latency_4us%
      5315           +12.4%       5976        fio.write_bw_MBps
    510.67           -14.2%     438.00        fio.write_clat_90%_ns
    515.00           -14.2%     442.00        fio.write_clat_95%_ns
    522.67           -13.8%     450.67        fio.write_clat_99%_ns
    496.01           -14.6%     423.39        fio.write_clat_mean_ns
   1360660           +12.4%    1530010        fio.write_iops
  1.67e+09           -10.0%  1.503e+09        cpuidle..time
      1.48            +2.5%       1.51        iostat.cpu.system
     42426            -6.0%      39867 ±  2%  meminfo.AnonHugePages
    165506 ±  3%      -6.8%     154330        turbostat.IRQ
      2392            +4.8%       2506        vmstat.system.cs
      1945            +4.5%       2034        perf-stat.i.context-switches
      1874            +4.1%       1949        perf-stat.ps.context-switches
   1489031 ±110%    +162.9%    3915040 ±  3%  numa-meminfo.node0.FilePages
     40746 ± 46%     +72.2%      70177 ±  5%  numa-meminfo.node0.KReclaimable
     29851 ±122%    +178.8%      83225        numa-meminfo.node0.Mapped
     40746 ± 46%     +72.2%      70177 ±  5%  numa-meminfo.node0.SReclaimable
    126444 ± 22%     +29.1%     163254 ±  8%  numa-meminfo.node0.Slab
   1485610 ±110%    +163.2%    3910117 ±  3%  numa-meminfo.node0.Unevictable
   2626454 ± 62%     -92.4%     200365 ± 69%  numa-meminfo.node1.FilePages
     56220 ± 65%     -95.9%       2321 ± 67%  numa-meminfo.node1.Mapped
   2622239 ± 62%     -92.5%     197733 ± 70%  numa-meminfo.node1.Unevictable
    372258 ±110%    +162.9%     978760 ±  3%  numa-vmstat.node0.nr_file_pages
      7462 ±122%    +178.8%      20806        numa-vmstat.node0.nr_mapped
     10186 ± 46%     +72.2%      17544 ±  5%  numa-vmstat.node0.nr_slab_reclaimable
    371402 ±110%    +163.2%     977529 ±  3%  numa-vmstat.node0.nr_unevictable
    371402 ±110%    +163.2%     977529 ±  3%  numa-vmstat.node0.nr_zone_unevictable
    656614 ± 62%     -92.4%      50091 ± 69%  numa-vmstat.node1.nr_file_pages
     14054 ± 65%     -95.9%     580.52 ± 67%  numa-vmstat.node1.nr_mapped
    655559 ± 62%     -92.5%      49433 ± 70%  numa-vmstat.node1.nr_unevictable
    655559 ± 62%     -92.5%      49433 ± 70%  numa-vmstat.node1.nr_zone_unevictable
     80.52 ± 25%     -58.3       22.22 ±147%  perf-profile.calltrace.cycles-pp.__mmput.exit_mm.do_exit.do_group_exit.get_signal
     80.52 ± 25%     -58.3       22.22 ±147%  perf-profile.calltrace.cycles-pp.exit_mm.do_exit.do_group_exit.get_signal.arch_do_signal_or_restart
     80.52 ± 25%     -56.6       23.89 ±144%  perf-profile.calltrace.cycles-pp.exit_mmap.__mmput.exit_mm.do_exit.do_group_exit
     68.85 ± 49%     -50.0       18.89 ±163%  perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.exit_mmap.__mmput.exit_mm
     68.85 ± 49%     -50.0       18.89 ±163%  perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.exit_mmap.__mmput
     68.57 ± 49%     -49.7       18.89 ±163%  perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.exit_mmap
     68.85 ± 49%     -48.3       20.56 ±153%  perf-profile.calltrace.cycles-pp.unmap_vmas.exit_mmap.__mmput.exit_mm.do_exit
     70.24 ± 51%     -48.0       22.22 ±147%  perf-profile.calltrace.cycles-pp.arch_do_signal_or_restart.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
     70.24 ± 51%     -48.0       22.22 ±147%  perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.get_signal.arch_do_signal_or_restart.exit_to_user_mode_loop
     70.24 ± 51%     -48.0       22.22 ±147%  perf-profile.calltrace.cycles-pp.do_group_exit.get_signal.arch_do_signal_or_restart.exit_to_user_mode_loop.do_syscall_64
     70.24 ± 51%     -48.0       22.22 ±147%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
     70.24 ± 51%     -48.0       22.22 ±147%  perf-profile.calltrace.cycles-pp.get_signal.arch_do_signal_or_restart.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
     71.35 ± 48%     -45.8       25.56 ±142%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
     71.35 ± 48%     -45.8       25.56 ±142%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
     35.48 ± 64%     -24.4       11.11 ±223%  perf-profile.calltrace.cycles-pp.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
     22.82 ± 62%     -18.4        4.44 ±147%  perf-profile.calltrace.cycles-pp.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range
     23.10 ± 64%     -15.9        7.22 ±169%  perf-profile.calltrace.cycles-pp.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range
     23.10 ± 64%     -15.9        7.22 ±169%  perf-profile.calltrace.cycles-pp.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
     12.38 ±118%     -12.4        0.00        perf-profile.calltrace.cycles-pp.lruvec_stat_mod_folio.folio_remove_rmap_ptes.zap_present_ptes.zap_pte_range.zap_pmd_range
     15.99 ±131%     -10.4        5.56 ±223%  perf-profile.calltrace.cycles-pp.folio_remove_rmap_ptes.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range
     12.38 ±143%      -9.6        2.78 ±223%  perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
     12.38 ±143%      -9.6        2.78 ±223%  perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
     12.38 ±143%      -9.6        2.78 ±223%  perf-profile.calltrace.cycles-pp.ret_from_fork_asm
     11.71 ± 98%      -7.3        4.44 ±147%  perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range
     83.17 ± 23%     -59.3       23.89 ±144%  perf-profile.children.cycles-pp.__mmput
     83.17 ± 23%     -59.3       23.89 ±144%  perf-profile.children.cycles-pp.exit_mmap
     80.79 ± 24%     -58.6       22.22 ±147%  perf-profile.children.cycles-pp.arch_do_signal_or_restart
     80.79 ± 24%     -58.6       22.22 ±147%  perf-profile.children.cycles-pp.get_signal
     80.79 ± 24%     -56.9       23.89 ±144%  perf-profile.children.cycles-pp.do_exit
     80.79 ± 24%     -56.9       23.89 ±144%  perf-profile.children.cycles-pp.do_group_exit
     80.52 ± 25%     -56.6       23.89 ±144%  perf-profile.children.cycles-pp.exit_mm
     75.95 ± 40%     -50.4       25.56 ±142%  perf-profile.children.cycles-pp.do_syscall_64
     75.95 ± 40%     -50.4       25.56 ±142%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     68.85 ± 49%     -50.0       18.89 ±163%  perf-profile.children.cycles-pp.unmap_page_range
     68.85 ± 49%     -50.0       18.89 ±163%  perf-profile.children.cycles-pp.zap_pmd_range
     68.85 ± 49%     -50.0       18.89 ±163%  perf-profile.children.cycles-pp.zap_pte_range
     68.85 ± 49%     -48.3       20.56 ±153%  perf-profile.children.cycles-pp.unmap_vmas
     70.24 ± 51%     -48.0       22.22 ±147%  perf-profile.children.cycles-pp.exit_to_user_mode_loop
     35.20 ± 66%     -26.9        8.33 ±223%  perf-profile.children.cycles-pp.zap_present_ptes
     23.10 ± 64%     -15.9        7.22 ±169%  perf-profile.children.cycles-pp.__tlb_batch_free_encoded_pages
     23.10 ± 64%     -15.9        7.22 ±169%  perf-profile.children.cycles-pp.free_pages_and_swap_cache
     23.10 ± 64%     -15.9        7.22 ±169%  perf-profile.children.cycles-pp.tlb_flush_mmu
     19.33 ±102%     -13.8        5.56 ±223%  perf-profile.children.cycles-pp.folio_remove_rmap_ptes
     12.38 ±143%      -9.6        2.78 ±223%  perf-profile.children.cycles-pp.kthread
     12.38 ±143%      -9.6        2.78 ±223%  perf-profile.children.cycles-pp.ret_from_fork
     12.38 ±143%      -9.6        2.78 ±223%  perf-profile.children.cycles-pp.ret_from_fork_asm
      9.05 ±102%      -9.0        0.00        perf-profile.children.cycles-pp.lruvec_stat_mod_folio
     11.71 ± 98%      -7.3        4.44 ±147%  perf-profile.children.cycles-pp.folios_put_refs
     15.60 ± 97%     -15.6        0.00        perf-profile.self.cycles-pp.zap_present_ptes
     13.61 ±107%      -8.1        5.56 ±223%  perf-profile.self.cycles-pp.folio_remove_rmap_ptes
     10.28 ± 94%      -6.9        3.33 ±223%  perf-profile.self.cycles-pp.zap_pte_range
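
For reading the %change column: judging by the figures above, the fio, meminfo and vmstat rows appear to report the relative change against the base commit, while the perf-profile rows (which are already percentages) appear to report the absolute difference. A small sketch that reproduces two of the figures; the function names are made up for illustration and are not part of lkp-tests.

```shell
#!/bin/sh
# Relative change in percent between the base and the patched value.
rel_change() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%+.1f%%\n", (b - a) / a * 100 }'
}

# Absolute difference, which is how the perf-profile rows appear to be reported.
abs_change() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%+.1f\n", b - a }'
}

rel_change 1360660 1530010   # fio.write_iops, randwrite case
abs_change 80.52 22.22       # __mmput calltrace share
```

Note that most perf-profile rows carry a stddev well above 100%, so those differences are probably noise rather than an effect of the commit.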


***************************************************************************************************

=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase:
  4k/gcc-14/performance/1HDD/btrfs/falloc/x86_64-rhel-9.4/1/debian-13-x86_64-20250902.cgz/300s/write/lkp-icl-2sp9/128G/fio-basic

commit: 
  964f569c14 ("btrfs: limit size of bios submitted from writeback")
  551e510a97 ("btrfs: don't force DIO writes to be serialized")

964f569c14d7778c 551e510a97a487218d5f22d61d1 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.04 ±  4%      -0.0        0.04 ±  4%  fio.latency_4us%
      6156           +10.9%       6825        fio.write_bw_MBps
    459.33           -13.5%     397.33        fio.write_clat_90%_ns
    462.00           -13.7%     398.67 ±  2%  fio.write_clat_95%_ns
    466.67           -13.6%     403.33 ±  2%  fio.write_clat_99%_ns
    454.12           -14.1%     389.87        fio.write_clat_mean_ns
   1575955           +10.9%    1747309        fio.write_iops
    556943 ± 91%     -62.8%     207326 ±184%  sched_debug.cfs_rq:/.load.max
    152355 ±  2%      -7.7%     140620 ±  3%  turbostat.IRQ
 1.472e+09 ±  2%      -9.2%  1.337e+09        cpuidle..time
     99291 ±  3%      -7.8%      91524 ±  4%  cpuidle..usage
      9.05 ±102%      -9.0        0.00        perf-profile.calltrace.cycles-pp.free_pgtables.exit_mmap.__mmput.exit_mm.do_exit
      9.05 ±102%      -7.4        1.67 ±223%  perf-profile.children.cycles-pp.free_pgtables





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* Re: [linux-next:master] [btrfs] 551e510a97: fio.write_iops 12.4% improvement
  2026-05-29  9:04 [linux-next:master] [btrfs] 551e510a97: fio.write_iops 12.4% improvement kernel test robot
@ 2026-05-29 10:39 ` Mark Harmstone
  2026-06-01  1:52   ` Oliver Sang
  0 siblings, 1 reply; 3+ messages in thread
From: Mark Harmstone @ 2026-05-29 10:39 UTC
  To: kernel test robot; +Cc: oe-lkp, lkp, David Sterba, linux-btrfs

Thanks for this Oliver, mind if I offer some constructive criticism?

1) Your fio script has both "buffered=0" and "direct=0", which are
contradictory. It looks like in this case fio treats it as "direct=1",
but this probably should be explicit.

2) Recent versions of btrfs fall back to buffered I/O unless nodatasum is
set. You probably want to add:

	mkdir /fs/sda1/nocow
	chattr +C /fs/sda1/nocow

and then change the directory to be /fs/sda1/nocow.

3) My experience is that for FS performance testing, the only way to get
useful figures is by PCI passthrough of an NVMe drive. Otherwise the VM
overhead is enough to skew the figures.
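
A minimal sketch of points 1 and 2, assuming the /fs/sda1 mount point from above. The job name, the job-file location and the psync engine are illustrative assumptions and are not taken from the lkp job (the report itself used ioengine=falloc). It writes a fio job that sets direct=1 explicitly and points at the nocow directory, then prints the privileged steps instead of running them.

```shell
#!/bin/sh
# Sketch only: MNT comes from the thread; everything else is an assumption.
MNT="${MNT:-/fs/sda1}"
NOCOW_DIR="$MNT/nocow"
JOB_FILE="${JOB_FILE:-${TMPDIR:-/tmp}/dio-randwrite.fio}"

# Write a fio job that requests direct I/O explicitly, rather than relying on
# how fio resolves conflicting buffered=/direct= settings.
write_job() {
    cat > "$1" <<EOF
[dio-randwrite]
directory=$NOCOW_DIR
rw=randwrite
bs=4k
size=128G
runtime=300
ioengine=psync
direct=1
EOF
}

write_job "$JOB_FILE"

# The remaining steps need root and a mounted btrfs filesystem, so they are
# printed here rather than executed.
echo "mkdir -p $NOCOW_DIR"
echo "chattr +C $NOCOW_DIR"
echo "fio $JOB_FILE"
```

Setting +C on the directory before any files exist matters, since new files inherit the attribute from it.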

Mark




* Re: [linux-next:master] [btrfs] 551e510a97: fio.write_iops 12.4% improvement
  2026-05-29 10:39 ` Mark Harmstone
@ 2026-06-01  1:52   ` Oliver Sang
  0 siblings, 0 replies; 3+ messages in thread
From: Oliver Sang @ 2026-06-01  1:52 UTC
  To: Mark Harmstone; +Cc: oe-lkp, lkp, David Sterba, linux-btrfs, oliver.sang

Hi Mark,

On Fri, May 29, 2026 at 11:39:39AM +0100, Mark Harmstone wrote:
> Thanks for this Oliver, mind if I offer some constructive criticism?
> 
> 1) Your fio script has both "buffered=0" and "direct=0", which are
> contradictory. It looks like in this case fio treats it as "direct=1",
> but this probably should be explicit.
> 
> 2) Recent versions of btrfs fall back to buffered I/O unless nodatasum is
> set. You probably want to add:
> 
> 	mkdir /fs/sda1/nocow
> 	chattr +C /fs/sda1/nocow
> 
> and then change the directory to be /fs/sda1/nocow.
> 
> 3) My experience is that for FS performance testing, the only way to get
> useful figures is by PCI passthrough of an NVMe drive. Otherwise the VM
> overhead is enough to skew the figures.

Thanks a lot for educating us on these points! We will investigate further
how to improve our tests. Thanks again!

> 
> Mark
> 

