From: kernel test robot <oliver.sang@intel.com>
To: Mark Harmstone <mark@harmstone.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
David Sterba <dsterba@suse.com>, <linux-btrfs@vger.kernel.org>,
<oliver.sang@intel.com>
Subject: [linux-next:master] [btrfs] 551e510a97: fio.write_iops 12.4% improvement
Date: Fri, 29 May 2026 17:04:00 +0800
Message-ID: <202605291631.659bf248-lkp@intel.com>
Hello,
kernel test robot noticed a 12.4% improvement of fio.write_iops on:
commit: 551e510a97a487218d5f22d61d1a3388ef1171ac ("btrfs: don't force DIO writes to be serialized")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
testcase: fio-basic
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	runtime: 300s
	disk: 1HDD
	fs: btrfs
	nr_task: 1
	test_size: 128G
	rw: randwrite
	bs: 4k
	ioengine: falloc
	cpufreq_governor: performance

In addition, the commit also has a significant impact on the following test:
+------------------+---------------------------------------------+
| testcase: change | fio-basic: fio.write_iops 10.9% improvement |
| test parameters  | bs=4k                                       |
|                  | cpufreq_governor=performance                |
|                  | disk=1HDD                                   |
|                  | fs=btrfs                                    |
|                  | ioengine=falloc                             |
|                  | nr_task=1                                   |
|                  | runtime=300s                                |
|                  | rw=write                                    |
|                  | test_size=128G                              |
+------------------+---------------------------------------------+
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260529/202605291631.659bf248-lkp@intel.com
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase:
4k/gcc-14/performance/1HDD/btrfs/falloc/x86_64-rhel-9.4/1/debian-13-x86_64-20250902.cgz/300s/randwrite/lkp-icl-2sp9/128G/fio-basic
commit:
964f569c14 ("btrfs: limit size of bios submitted from writeback")
551e510a97 ("btrfs: don't force DIO writes to be serialized")
964f569c14d7778c 551e510a97a487218d5f22d61d1
---------------- ---------------------------
         %stddev    %change          %stddev
             \          |                \
      0.04 ±  5%      -0.0        0.04 ±  5%  fio.latency_4us%
      5315                +12.4%       5976        fio.write_bw_MBps
    510.67                -14.2%     438.00        fio.write_clat_90%_ns
    515.00                -14.2%     442.00        fio.write_clat_95%_ns
    522.67                -13.8%     450.67        fio.write_clat_99%_ns
    496.01                -14.6%     423.39        fio.write_clat_mean_ns
   1360660                +12.4%    1530010        fio.write_iops
  1.67e+09                -10.0%  1.503e+09        cpuidle..time
      1.48                 +2.5%       1.51        iostat.cpu.system
     42426                 -6.0%      39867 ±  2%  meminfo.AnonHugePages
    165506 ±  3%      -6.8%     154330        turbostat.IRQ
      2392                 +4.8%       2506        vmstat.system.cs
      1945                 +4.5%       2034        perf-stat.i.context-switches
      1874                 +4.1%       1949        perf-stat.ps.context-switches
   1489031 ±110%    +162.9%    3915040 ±  3%  numa-meminfo.node0.FilePages
     40746 ± 46%     +72.2%      70177 ±  5%  numa-meminfo.node0.KReclaimable
     29851 ±122%    +178.8%      83225        numa-meminfo.node0.Mapped
     40746 ± 46%     +72.2%      70177 ±  5%  numa-meminfo.node0.SReclaimable
    126444 ± 22%     +29.1%     163254 ±  8%  numa-meminfo.node0.Slab
   1485610 ±110%    +163.2%    3910117 ±  3%  numa-meminfo.node0.Unevictable
   2626454 ± 62%     -92.4%     200365 ± 69%  numa-meminfo.node1.FilePages
     56220 ± 65%     -95.9%       2321 ± 67%  numa-meminfo.node1.Mapped
   2622239 ± 62%     -92.5%     197733 ± 70%  numa-meminfo.node1.Unevictable
    372258 ±110%    +162.9%     978760 ±  3%  numa-vmstat.node0.nr_file_pages
      7462 ±122%    +178.8%      20806        numa-vmstat.node0.nr_mapped
     10186 ± 46%     +72.2%      17544 ±  5%  numa-vmstat.node0.nr_slab_reclaimable
    371402 ±110%    +163.2%     977529 ±  3%  numa-vmstat.node0.nr_unevictable
    371402 ±110%    +163.2%     977529 ±  3%  numa-vmstat.node0.nr_zone_unevictable
    656614 ± 62%     -92.4%      50091 ± 69%  numa-vmstat.node1.nr_file_pages
     14054 ± 65%     -95.9%     580.52 ± 67%  numa-vmstat.node1.nr_mapped
    655559 ± 62%     -92.5%      49433 ± 70%  numa-vmstat.node1.nr_unevictable
    655559 ± 62%     -92.5%      49433 ± 70%  numa-vmstat.node1.nr_zone_unevictable
     80.52 ± 25%     -58.3       22.22 ±147%  perf-profile.calltrace.cycles-pp.__mmput.exit_mm.do_exit.do_group_exit.get_signal
     80.52 ± 25%     -58.3       22.22 ±147%  perf-profile.calltrace.cycles-pp.exit_mm.do_exit.do_group_exit.get_signal.arch_do_signal_or_restart
     80.52 ± 25%     -56.6       23.89 ±144%  perf-profile.calltrace.cycles-pp.exit_mmap.__mmput.exit_mm.do_exit.do_group_exit
     68.85 ± 49%     -50.0       18.89 ±163%  perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.exit_mmap.__mmput.exit_mm
     68.85 ± 49%     -50.0       18.89 ±163%  perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.exit_mmap.__mmput
     68.57 ± 49%     -49.7       18.89 ±163%  perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.exit_mmap
     68.85 ± 49%     -48.3       20.56 ±153%  perf-profile.calltrace.cycles-pp.unmap_vmas.exit_mmap.__mmput.exit_mm.do_exit
     70.24 ± 51%     -48.0       22.22 ±147%  perf-profile.calltrace.cycles-pp.arch_do_signal_or_restart.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
     70.24 ± 51%     -48.0       22.22 ±147%  perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.get_signal.arch_do_signal_or_restart.exit_to_user_mode_loop
     70.24 ± 51%     -48.0       22.22 ±147%  perf-profile.calltrace.cycles-pp.do_group_exit.get_signal.arch_do_signal_or_restart.exit_to_user_mode_loop.do_syscall_64
     70.24 ± 51%     -48.0       22.22 ±147%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
     70.24 ± 51%     -48.0       22.22 ±147%  perf-profile.calltrace.cycles-pp.get_signal.arch_do_signal_or_restart.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
     71.35 ± 48%     -45.8       25.56 ±142%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
     71.35 ± 48%     -45.8       25.56 ±142%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
     35.48 ± 64%     -24.4       11.11 ±223%  perf-profile.calltrace.cycles-pp.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
     22.82 ± 62%     -18.4        4.44 ±147%  perf-profile.calltrace.cycles-pp.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range
     23.10 ± 64%     -15.9        7.22 ±169%  perf-profile.calltrace.cycles-pp.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range
     23.10 ± 64%     -15.9        7.22 ±169%  perf-profile.calltrace.cycles-pp.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
     12.38 ±118%     -12.4        0.00        perf-profile.calltrace.cycles-pp.lruvec_stat_mod_folio.folio_remove_rmap_ptes.zap_present_ptes.zap_pte_range.zap_pmd_range
     15.99 ±131%     -10.4        5.56 ±223%  perf-profile.calltrace.cycles-pp.folio_remove_rmap_ptes.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range
     12.38 ±143%      -9.6        2.78 ±223%  perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
     12.38 ±143%      -9.6        2.78 ±223%  perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
     12.38 ±143%      -9.6        2.78 ±223%  perf-profile.calltrace.cycles-pp.ret_from_fork_asm
     11.71 ± 98%      -7.3        4.44 ±147%  perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range
     83.17 ± 23%     -59.3       23.89 ±144%  perf-profile.children.cycles-pp.__mmput
     83.17 ± 23%     -59.3       23.89 ±144%  perf-profile.children.cycles-pp.exit_mmap
     80.79 ± 24%     -58.6       22.22 ±147%  perf-profile.children.cycles-pp.arch_do_signal_or_restart
     80.79 ± 24%     -58.6       22.22 ±147%  perf-profile.children.cycles-pp.get_signal
     80.79 ± 24%     -56.9       23.89 ±144%  perf-profile.children.cycles-pp.do_exit
     80.79 ± 24%     -56.9       23.89 ±144%  perf-profile.children.cycles-pp.do_group_exit
     80.52 ± 25%     -56.6       23.89 ±144%  perf-profile.children.cycles-pp.exit_mm
     75.95 ± 40%     -50.4       25.56 ±142%  perf-profile.children.cycles-pp.do_syscall_64
     75.95 ± 40%     -50.4       25.56 ±142%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     68.85 ± 49%     -50.0       18.89 ±163%  perf-profile.children.cycles-pp.unmap_page_range
     68.85 ± 49%     -50.0       18.89 ±163%  perf-profile.children.cycles-pp.zap_pmd_range
     68.85 ± 49%     -50.0       18.89 ±163%  perf-profile.children.cycles-pp.zap_pte_range
     68.85 ± 49%     -48.3       20.56 ±153%  perf-profile.children.cycles-pp.unmap_vmas
     70.24 ± 51%     -48.0       22.22 ±147%  perf-profile.children.cycles-pp.exit_to_user_mode_loop
     35.20 ± 66%     -26.9        8.33 ±223%  perf-profile.children.cycles-pp.zap_present_ptes
     23.10 ± 64%     -15.9        7.22 ±169%  perf-profile.children.cycles-pp.__tlb_batch_free_encoded_pages
     23.10 ± 64%     -15.9        7.22 ±169%  perf-profile.children.cycles-pp.free_pages_and_swap_cache
     23.10 ± 64%     -15.9        7.22 ±169%  perf-profile.children.cycles-pp.tlb_flush_mmu
     19.33 ±102%     -13.8        5.56 ±223%  perf-profile.children.cycles-pp.folio_remove_rmap_ptes
     12.38 ±143%      -9.6        2.78 ±223%  perf-profile.children.cycles-pp.kthread
     12.38 ±143%      -9.6        2.78 ±223%  perf-profile.children.cycles-pp.ret_from_fork
     12.38 ±143%      -9.6        2.78 ±223%  perf-profile.children.cycles-pp.ret_from_fork_asm
      9.05 ±102%      -9.0        0.00        perf-profile.children.cycles-pp.lruvec_stat_mod_folio
     11.71 ± 98%      -7.3        4.44 ±147%  perf-profile.children.cycles-pp.folios_put_refs
     15.60 ± 97%     -15.6        0.00        perf-profile.self.cycles-pp.zap_present_ptes
     13.61 ±107%      -8.1        5.56 ±223%  perf-profile.self.cycles-pp.folio_remove_rmap_ptes
     10.28 ± 94%      -6.9        3.33 ±223%  perf-profile.self.cycles-pp.zap_pte_range
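
As a reading aid for the table above: judging from the numbers themselves, the
%change column for the fio rows is the relative difference between the two
means, while rows whose values are already percentages (the perf-profile rows
and fio.latency_4us%) carry an absolute difference in percentage points. The
Python sketch below recomputes a few rows under that assumption. The helper
names pct_change and point_delta are hypothetical and are not part of
lkp-tests.

```python
def pct_change(base: float, patched: float) -> float:
    """Relative change of the patched mean versus the base mean, in percent."""
    return (patched - base) / base * 100.0


def point_delta(base: float, patched: float) -> float:
    """Absolute difference, for metrics that are already percentages."""
    return patched - base


# (base, patched) means copied from the randwrite table above.
fio_rows = {
    "fio.write_bw_MBps": (5315, 5976),
    "fio.write_iops": (1360660, 1530010),
    "fio.write_clat_mean_ns": (496.01, 423.39),
}

for name, (base, patched) in fio_rows.items():
    print(f"{name}: {pct_change(base, patched):+.1f}%")

# A perf-profile row: the table shows -58.3 with no percent sign.
print(f"calltrace __mmput: {point_delta(80.52, 22.22):+.1f}")
```

The recomputed values agree with the reported +12.4%, +12.4%, -14.6% and
-58.3 after rounding to one decimal place.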
***************************************************************************************************
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase:
4k/gcc-14/performance/1HDD/btrfs/falloc/x86_64-rhel-9.4/1/debian-13-x86_64-20250902.cgz/300s/write/lkp-icl-2sp9/128G/fio-basic
commit:
964f569c14 ("btrfs: limit size of bios submitted from writeback")
551e510a97 ("btrfs: don't force DIO writes to be serialized")
964f569c14d7778c 551e510a97a487218d5f22d61d1
---------------- ---------------------------
         %stddev    %change          %stddev
             \          |                \
      0.04 ±  4%      -0.0        0.04 ±  4%  fio.latency_4us%
      6156                +10.9%       6825        fio.write_bw_MBps
    459.33                -13.5%     397.33        fio.write_clat_90%_ns
    462.00                -13.7%     398.67 ±  2%  fio.write_clat_95%_ns
    466.67                -13.6%     403.33 ±  2%  fio.write_clat_99%_ns
    454.12                -14.1%     389.87        fio.write_clat_mean_ns
   1575955                +10.9%    1747309        fio.write_iops
    556943 ± 91%     -62.8%     207326 ±184%  sched_debug.cfs_rq:/.load.max
    152355 ±  2%      -7.7%     140620 ±  3%  turbostat.IRQ
 1.472e+09 ±  2%      -9.2%  1.337e+09        cpuidle..time
     99291 ±  3%      -7.8%      91524 ±  4%  cpuidle..usage
      9.05 ±102%      -9.0        0.00        perf-profile.calltrace.cycles-pp.free_pgtables.exit_mmap.__mmput.exit_mm.do_exit
      9.05 ±102%      -7.4        1.67 ±223%  perf-profile.children.cycles-pp.free_pgtables
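
The bandwidth and IOPS rows of the sequential-write table above can be
cross-checked against each other, since both derive from the same 4k writes.
The Python sketch below does that. It assumes the "MBps" column counts units of
2**20 bytes, which is inferred from the figures and not stated in the report;
the function name iops_to_mib_per_s is hypothetical.

```python
BLOCK_SIZE = 4 * 1024   # bs=4k from the test parameters
MIB = 1024 * 1024       # assumption: "MBps" means 2**20 bytes per second


def iops_to_mib_per_s(iops: float) -> float:
    """Bandwidth implied by an IOPS figure at a fixed block size."""
    return iops * BLOCK_SIZE / MIB


# (commit, fio.write_iops, fio.write_bw_MBps) from the rw=write table above.
rows = [
    ("964f569c14", 1575955, 6156),
    ("551e510a97", 1747309, 6825),
]

for commit, iops, reported in rows:
    computed = iops_to_mib_per_s(iops)
    print(f"{commit}: computed {computed:.1f}, reported {reported}")
```

Under that assumption the computed bandwidth lands within 1 of the reported
value for both commits, so the two rows describe the same improvement.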
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki