* [axboe-block:for-next] [block] 1122c0c1cc: aim7.jobs-per-min 22.6% improvement
@ 2024-06-25 2:28 kernel test robot
2024-06-25 8:57 ` Christoph Hellwig
0 siblings, 1 reply; 8+ messages in thread
From: kernel test robot @ 2024-06-25 2:28 UTC (permalink / raw)
To: Christoph Hellwig
Cc: oe-lkp, lkp, Jens Axboe, Ulf Hansson, Damien Le Moal,
Hannes Reinecke, linux-block, linux-um, drbd-dev, nbd,
linuxppc-dev, virtualization, xen-devel, linux-bcache, dm-devel,
linux-raid, linux-mmc, linux-mtd, nvdimm, linux-nvme, linux-scsi,
ying.huang, feng.tang, fengwei.yin, oliver.sang
Hello,
kernel test robot noticed a 22.6% improvement of aim7.jobs-per-min on:
commit: 1122c0c1cc71f740fa4d5f14f239194e06a1d5e7 ("block: move cache control settings out of queue->flags")
https://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git for-next
testcase: aim7
test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory
parameters:
disk: 4BRD_12G
md: RAID0
fs: xfs
test: sync_disk_rw
load: 300
cpufreq_governor: performance
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240625/202406250948.e0044f1d-oliver.sang@intel.com
=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/md/rootfs/tbox_group/test/testcase:
gcc-13/performance/4BRD_12G/xfs/x86_64-rhel-8.3/300/RAID0/debian-12-x86_64-20240206.cgz/lkp-csl-2sp3/sync_disk_rw/aim7
commit:
70905f8706 ("block: remove blk_flush_policy")
1122c0c1cc ("block: move cache control settings out of queue->flags")
70905f8706b62113 1122c0c1cc71f740fa4d5f14f23
---------------- ---------------------------
%stddev %change %stddev
\ | \
153.19 -13.3% 132.81 uptime.boot
2.8e+09 -11.9% 2.466e+09 cpuidle..time
21945319 ± 2% -40.4% 13076160 cpuidle..usage
29.31 +7.8% 31.58 ± 2% iostat.cpu.idle
69.87 -3.6% 67.35 iostat.cpu.system
0.04 ± 4% +0.0 0.08 ± 5% mpstat.cpu.all.iowait%
0.78 ± 2% +0.2 0.99 ± 2% mpstat.cpu.all.usr%
52860 ± 49% -78.2% 11536 ± 78% numa-numastat.node0.other_node
46804 ± 56% +88.4% 88190 ± 10% numa-numastat.node1.other_node
955871 ± 10% -43.3% 542216 ± 14% numa-meminfo.node1.Active
955871 ± 10% -43.3% 542216 ± 14% numa-meminfo.node1.Active(anon)
1015354 ± 10% -34.7% 662696 ± 13% numa-meminfo.node1.Shmem
6008 -14.3% 5146 ± 2% perf-c2c.DRAM.remote
7889 -12.4% 6908 ± 2% perf-c2c.HITM.local
3839 -16.5% 3203 ± 2% perf-c2c.HITM.remote
11728 -13.8% 10112 ± 2% perf-c2c.HITM.total
695109 +20.5% 837625 vmstat.io.bo
105.99 ± 7% -23.7% 80.83 ± 11% vmstat.procs.r
803244 -30.9% 555360 vmstat.system.cs
209736 -12.9% 182626 vmstat.system.in
1448 ± 89% +207.9% 4459 ± 6% numa-vmstat.node0.nr_page_table_pages
52860 ± 49% -78.2% 11536 ± 78% numa-vmstat.node0.numa_other
239214 ± 10% -43.6% 134883 ± 13% numa-vmstat.node1.nr_active_anon
254124 ± 10% -34.9% 165421 ± 13% numa-vmstat.node1.nr_shmem
239214 ± 10% -43.6% 134883 ± 13% numa-vmstat.node1.nr_zone_active_anon
46805 ± 56% +88.4% 88190 ± 10% numa-vmstat.node1.numa_other
17374 +22.6% 21299 aim7.jobs-per-min
103.64 -18.4% 84.58 aim7.time.elapsed_time
103.64 -18.4% 84.58 aim7.time.elapsed_time.max
4641240 -83.4% 770073 aim7.time.involuntary_context_switches
32705 -4.3% 31289 ± 2% aim7.time.minor_page_faults
6562 -3.1% 6359 aim7.time.percent_of_cpu_this_job_got
6775 -21.0% 5351 ± 2% aim7.time.system_time
49095202 -38.3% 30299361 aim7.time.voluntary_context_switches
1297567 -37.0% 817692 meminfo.Active
1297567 -37.0% 817692 meminfo.Active(anon)
97760 ± 5% -23.4% 74859 ± 20% meminfo.AnonHugePages
2390317 -15.3% 2024905 meminfo.Committed_AS
884407 +11.9% 989723 meminfo.Inactive
743152 ± 2% +14.8% 853331 meminfo.Inactive(anon)
159265 ± 8% +38.6% 220668 ± 3% meminfo.Mapped
1382079 -27.1% 1007445 meminfo.Shmem
324534 -37.2% 203663 ± 2% proc-vmstat.nr_active_anon
1165686 -8.2% 1070277 proc-vmstat.nr_file_pages
185928 ± 2% +14.9% 213697 proc-vmstat.nr_inactive_anon
35436 -2.9% 34420 proc-vmstat.nr_inactive_file
40463 ± 8% +38.2% 55918 ± 3% proc-vmstat.nr_mapped
345824 -27.3% 251424 proc-vmstat.nr_shmem
28871 -1.4% 28477 proc-vmstat.nr_slab_reclaimable
324534 -37.2% 203663 ± 2% proc-vmstat.nr_zone_active_anon
185928 ± 2% +14.9% 213697 proc-vmstat.nr_zone_inactive_anon
35436 -2.9% 34420 proc-vmstat.nr_zone_inactive_file
5120744 -2.4% 4996195 proc-vmstat.numa_hit
5020486 -2.5% 4896473 proc-vmstat.numa_local
207026 ± 10% +50.2% 310941 proc-vmstat.pgactivate
5196440 -2.7% 5057618 proc-vmstat.pgalloc_normal
763396 ± 6% -11.8% 673464 proc-vmstat.pgfault
74254490 -1.3% 73292473 proc-vmstat.pgpgout
11.25 ± 24% -60.0% 4.50 ± 29% sched_debug.cfs_rq:/.h_nr_running.max
1.59 ± 20% -42.7% 0.91 ± 13% sched_debug.cfs_rq:/.h_nr_running.stddev
968.29 ± 5% -13.2% 840.04 ± 5% sched_debug.cfs_rq:/.runnable_avg.avg
5533 ± 21% -47.1% 2925 ± 21% sched_debug.cfs_rq:/.runnable_avg.max
798.88 ± 13% -38.3% 492.63 ± 9% sched_debug.cfs_rq:/.runnable_avg.stddev
578.50 ± 5% -9.9% 521.30 ± 4% sched_debug.cfs_rq:/.util_avg.avg
3120 ± 20% -40.3% 1862 ± 19% sched_debug.cfs_rq:/.util_avg.max
479.36 ± 12% -30.4% 333.40 ± 8% sched_debug.cfs_rq:/.util_avg.stddev
4592 ± 24% -51.8% 2215 ± 31% sched_debug.cfs_rq:/.util_est.max
615.47 ± 21% -35.7% 395.64 ± 15% sched_debug.cfs_rq:/.util_est.stddev
11.33 ± 24% -58.8% 4.67 ± 26% sched_debug.cpu.nr_running.max
1.62 ± 20% -42.6% 0.93 ± 11% sched_debug.cpu.nr_running.stddev
224323 -28.2% 161088 sched_debug.cpu.nr_switches.avg
242363 ± 2% -27.9% 174695 ± 2% sched_debug.cpu.nr_switches.max
197870 ± 2% -27.6% 143186 sched_debug.cpu.nr_switches.min
7911 ± 19% -33.1% 5295 ± 10% sched_debug.cpu.nr_switches.stddev
1.23 -4.8% 1.17 perf-stat.i.MPKI
1.105e+10 +5.6% 1.167e+10 perf-stat.i.branch-instructions
1.20 ± 2% +0.1 1.29 ± 2% perf-stat.i.branch-miss-rate%
820863 -30.7% 569230 perf-stat.i.context-switches
3.79 -10.2% 3.41 perf-stat.i.cpi
2.176e+11 -3.2% 2.106e+11 perf-stat.i.cpu-cycles
212040 -27.8% 153137 perf-stat.i.cpu-migrations
5.416e+10 +6.8% 5.785e+10 perf-stat.i.instructions
0.32 +11.8% 0.36 perf-stat.i.ipc
0.05 ± 77% +233.9% 0.17 ± 50% perf-stat.i.major-faults
10.74 -30.2% 7.50 perf-stat.i.metric.K/sec
1.28 -4.3% 1.22 perf-stat.overall.MPKI
4.02 -9.4% 3.64 perf-stat.overall.cpi
3145 -5.3% 2979 perf-stat.overall.cycles-between-cache-misses
0.25 +10.3% 0.27 perf-stat.overall.ipc
1.094e+10 +5.4% 1.153e+10 perf-stat.ps.branch-instructions
812563 -30.8% 562343 perf-stat.ps.context-switches
2.156e+11 -3.4% 2.082e+11 perf-stat.ps.cpu-cycles
209965 -28.0% 151248 perf-stat.ps.cpu-migrations
5.365e+10 +6.6% 5.717e+10 perf-stat.ps.instructions
5.641e+12 -13.1% 4.905e+12 ± 2% perf-stat.total.instructions
14.88 ± 5% -14.9 0.00 perf-profile.calltrace.cycles-pp.blkdev_issue_flush.xfs_file_fsync.xfs_file_buffered_write.vfs_write.ksys_write
14.86 ± 5% -14.9 0.00 perf-profile.calltrace.cycles-pp.submit_bio_wait.blkdev_issue_flush.xfs_file_fsync.xfs_file_buffered_write.vfs_write
14.77 ± 5% -14.8 0.00 perf-profile.calltrace.cycles-pp.__submit_bio_noacct.submit_bio_wait.blkdev_issue_flush.xfs_file_fsync.xfs_file_buffered_write
14.76 ± 5% -14.8 0.00 perf-profile.calltrace.cycles-pp.__submit_bio.__submit_bio_noacct.submit_bio_wait.blkdev_issue_flush.xfs_file_fsync
14.74 ± 5% -14.7 0.00 perf-profile.calltrace.cycles-pp.md_handle_request.__submit_bio.__submit_bio_noacct.submit_bio_wait.blkdev_issue_flush
14.72 ± 5% -14.7 0.00 perf-profile.calltrace.cycles-pp.raid0_make_request.md_handle_request.__submit_bio.__submit_bio_noacct.submit_bio_wait
14.71 ± 5% -14.7 0.00 perf-profile.calltrace.cycles-pp.md_flush_request.raid0_make_request.md_handle_request.__submit_bio.__submit_bio_noacct
13.32 ± 5% -13.3 0.00 perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.md_flush_request.raid0_make_request.md_handle_request.__submit_bio
13.25 ± 5% -13.3 0.00 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.md_flush_request.raid0_make_request.md_handle_request
9.70 ± 3% -1.1 8.61 ± 3% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.common_startup_64
9.70 ± 3% -1.1 8.61 ± 3% perf-profile.calltrace.cycles-pp.start_secondary.common_startup_64
9.70 ± 3% -1.1 8.61 ± 3% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.common_startup_64
9.80 ± 3% -1.1 8.71 ± 3% perf-profile.calltrace.cycles-pp.common_startup_64
9.12 ± 3% -1.0 8.15 ± 3% perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.common_startup_64
8.95 ± 3% -0.9 8.01 ± 3% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
8.95 ± 3% -0.9 8.02 ± 3% perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
2.21 -0.4 1.78 ± 2% perf-profile.calltrace.cycles-pp.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
2.22 -0.4 1.79 ± 2% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
2.22 -0.4 1.79 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
2.22 -0.4 1.79 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork_asm
2.08 -0.4 1.68 ± 2% perf-profile.calltrace.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
3.09 -0.2 2.86 ± 2% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.remove_wait_queue.xlog_wait_on_iclog.xfs_log_force_seq
3.10 -0.2 2.87 ± 2% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.remove_wait_queue.xlog_wait_on_iclog.xfs_log_force_seq.xfs_file_fsync
3.10 -0.2 2.87 ± 2% perf-profile.calltrace.cycles-pp.remove_wait_queue.xlog_wait_on_iclog.xfs_log_force_seq.xfs_file_fsync.xfs_file_buffered_write
3.44 -0.2 3.23 ± 4% perf-profile.calltrace.cycles-pp.xlog_wait_on_iclog.xfs_log_force_seq.xfs_file_fsync.xfs_file_buffered_write.vfs_write
0.95 +0.1 1.04 perf-profile.calltrace.cycles-pp.mutex_spin_on_owner.__mutex_lock.__flush_workqueue.xlog_cil_push_now.xlog_cil_force_seq
0.57 +0.1 0.71 ± 2% perf-profile.calltrace.cycles-pp.iomap_file_buffered_write.xfs_file_buffered_write.vfs_write.ksys_write.do_syscall_64
0.58 ± 2% +0.3 0.84 ± 3% perf-profile.calltrace.cycles-pp.xfs_end_ioend.xfs_end_io.process_one_work.worker_thread.kthread
0.59 ± 2% +0.3 0.85 ± 2% perf-profile.calltrace.cycles-pp.xfs_end_io.process_one_work.worker_thread.kthread.ret_from_fork
0.90 ± 2% +0.4 1.27 ± 3% perf-profile.calltrace.cycles-pp.__submit_bio_noacct.iomap_submit_ioend.iomap_writepages.xfs_vm_writepages.do_writepages
0.88 ± 2% +0.4 1.26 ± 3% perf-profile.calltrace.cycles-pp.__submit_bio.__submit_bio_noacct.iomap_submit_ioend.iomap_writepages.xfs_vm_writepages
0.92 ± 3% +0.4 1.30 ± 3% perf-profile.calltrace.cycles-pp.iomap_submit_ioend.iomap_writepages.xfs_vm_writepages.do_writepages.filemap_fdatawrite_wbc
0.57 ± 3% +0.4 0.95 ± 6% perf-profile.calltrace.cycles-pp.xlog_cil_commit.__xfs_trans_commit.xfs_vn_update_time.kiocb_modified.xfs_file_write_checks
0.64 ± 3% +0.4 1.03 ± 6% perf-profile.calltrace.cycles-pp.__xfs_trans_commit.xfs_vn_update_time.kiocb_modified.xfs_file_write_checks.xfs_file_buffered_write
6.90 ± 2% +0.5 7.40 ± 3% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
0.92 ± 4% +0.5 1.43 ± 6% perf-profile.calltrace.cycles-pp.xfs_vn_update_time.kiocb_modified.xfs_file_write_checks.xfs_file_buffered_write.vfs_write
0.00 +0.5 0.52 perf-profile.calltrace.cycles-pp.complete.__flush_workqueue.xlog_cil_push_now.xlog_cil_force_seq.xfs_log_force_seq
0.94 ± 4% +0.5 1.46 ± 6% perf-profile.calltrace.cycles-pp.kiocb_modified.xfs_file_write_checks.xfs_file_buffered_write.vfs_write.ksys_write
0.96 ± 4% +0.5 1.48 ± 6% perf-profile.calltrace.cycles-pp.xfs_file_write_checks.xfs_file_buffered_write.vfs_write.ksys_write.do_syscall_64
0.00 +0.5 0.54 ± 2% perf-profile.calltrace.cycles-pp.xfs_iomap_write_unwritten.xfs_end_ioend.xfs_end_io.process_one_work.worker_thread
0.00 +0.5 0.55 ± 2% perf-profile.calltrace.cycles-pp.iomap_write_iter.iomap_file_buffered_write.xfs_file_buffered_write.vfs_write.ksys_write
0.00 +0.6 0.56 ± 10% perf-profile.calltrace.cycles-pp.__folio_start_writeback.iomap_writepage_map.iomap_writepages.xfs_vm_writepages.do_writepages
0.00 +0.6 0.57 ± 6% perf-profile.calltrace.cycles-pp.__folio_end_writeback.folio_end_writeback.iomap_finish_ioend.md_end_clone_io.__submit_bio
0.00 +0.6 0.58 ± 7% perf-profile.calltrace.cycles-pp.folio_end_writeback.iomap_finish_ioend.md_end_clone_io.__submit_bio.__submit_bio_noacct
0.00 +0.6 0.60 ± 6% perf-profile.calltrace.cycles-pp.iomap_finish_ioend.md_end_clone_io.__submit_bio.__submit_bio_noacct.iomap_submit_ioend
0.08 ±223% +0.6 0.72 ± 5% perf-profile.calltrace.cycles-pp.md_end_clone_io.__submit_bio.__submit_bio_noacct.iomap_submit_ioend.iomap_writepages
1.45 ± 4% +0.7 2.15 ± 4% perf-profile.calltrace.cycles-pp.iomap_writepages.xfs_vm_writepages.do_writepages.filemap_fdatawrite_wbc.__filemap_fdatawrite_range
1.46 ± 4% +0.7 2.16 ± 4% perf-profile.calltrace.cycles-pp.xfs_vm_writepages.do_writepages.filemap_fdatawrite_wbc.__filemap_fdatawrite_range.file_write_and_wait_range
1.48 ± 4% +0.7 2.18 ± 4% perf-profile.calltrace.cycles-pp.do_writepages.filemap_fdatawrite_wbc.__filemap_fdatawrite_range.file_write_and_wait_range.xfs_file_fsync
1.51 ± 4% +0.7 2.22 ± 4% perf-profile.calltrace.cycles-pp.filemap_fdatawrite_wbc.__filemap_fdatawrite_range.file_write_and_wait_range.xfs_file_fsync.xfs_file_buffered_write
1.51 ± 3% +0.7 2.23 ± 4% perf-profile.calltrace.cycles-pp.__filemap_fdatawrite_range.file_write_and_wait_range.xfs_file_fsync.xfs_file_buffered_write.vfs_write
0.00 +0.7 0.72 ± 7% perf-profile.calltrace.cycles-pp.iomap_writepage_map.iomap_writepages.xfs_vm_writepages.do_writepages.filemap_fdatawrite_wbc
1.60 ± 3% +0.8 2.36 ± 4% perf-profile.calltrace.cycles-pp.file_write_and_wait_range.xfs_file_fsync.xfs_file_buffered_write.vfs_write.ksys_write
85.48 +0.8 86.24 perf-profile.calltrace.cycles-pp.xfs_file_fsync.xfs_file_buffered_write.vfs_write.ksys_write.do_syscall_64
87.06 +1.4 88.49 perf-profile.calltrace.cycles-pp.xfs_file_buffered_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
87.18 +1.5 88.64 perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
87.36 +1.5 88.82 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
87.19 +1.5 88.65 perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
87.36 +1.5 88.82 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
87.62 +1.5 89.10 perf-profile.calltrace.cycles-pp.write
56.74 +13.7 70.42 perf-profile.calltrace.cycles-pp.osq_lock.__mutex_lock.__flush_workqueue.xlog_cil_push_now.xlog_cil_force_seq
57.89 +13.8 71.74 perf-profile.calltrace.cycles-pp.__mutex_lock.__flush_workqueue.xlog_cil_push_now.xlog_cil_force_seq.xfs_log_force_seq
60.36 +14.6 74.96 perf-profile.calltrace.cycles-pp.__flush_workqueue.xlog_cil_push_now.xlog_cil_force_seq.xfs_log_force_seq.xfs_file_fsync
61.48 +14.6 76.09 perf-profile.calltrace.cycles-pp.xlog_cil_push_now.xlog_cil_force_seq.xfs_log_force_seq.xfs_file_fsync.xfs_file_buffered_write
68.74 +14.8 83.60 perf-profile.calltrace.cycles-pp.xfs_log_force_seq.xfs_file_fsync.xfs_file_buffered_write.vfs_write.ksys_write
64.97 +15.1 80.03 perf-profile.calltrace.cycles-pp.xlog_cil_force_seq.xfs_log_force_seq.xfs_file_fsync.xfs_file_buffered_write.vfs_write
14.86 ± 5% -14.9 0.00 perf-profile.children.cycles-pp.submit_bio_wait
14.96 ± 5% -14.8 0.12 ± 4% perf-profile.children.cycles-pp.md_handle_request
14.94 ± 5% -14.8 0.11 ± 3% perf-profile.children.cycles-pp.raid0_make_request
14.83 ± 5% -14.8 0.00 perf-profile.children.cycles-pp.md_flush_request
14.88 ± 5% -14.8 0.06 ± 6% perf-profile.children.cycles-pp.blkdev_issue_flush
15.82 ± 5% -14.5 1.32 ± 3% perf-profile.children.cycles-pp.__submit_bio_noacct
15.81 ± 5% -14.5 1.31 ± 3% perf-profile.children.cycles-pp.__submit_bio
13.86 ± 5% -13.6 0.29 ± 3% perf-profile.children.cycles-pp._raw_spin_lock_irq
22.32 ± 3% -13.1 9.23 ± 4% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
1.96 ± 9% -1.5 0.49 ± 4% perf-profile.children.cycles-pp.intel_idle_irq
9.70 ± 3% -1.1 8.61 ± 3% perf-profile.children.cycles-pp.start_secondary
9.80 ± 3% -1.1 8.71 ± 3% perf-profile.children.cycles-pp.common_startup_64
9.80 ± 3% -1.1 8.71 ± 3% perf-profile.children.cycles-pp.cpu_startup_entry
9.79 ± 3% -1.1 8.71 ± 3% perf-profile.children.cycles-pp.do_idle
9.20 ± 3% -1.0 8.25 ± 3% perf-profile.children.cycles-pp.cpuidle_idle_call
9.04 ± 3% -0.9 8.11 ± 3% perf-profile.children.cycles-pp.cpuidle_enter
9.04 ± 3% -0.9 8.11 ± 3% perf-profile.children.cycles-pp.cpuidle_enter_state
2.21 -0.4 1.78 ± 2% perf-profile.children.cycles-pp.worker_thread
2.22 -0.4 1.79 ± 2% perf-profile.children.cycles-pp.kthread
2.22 -0.4 1.79 ± 2% perf-profile.children.cycles-pp.ret_from_fork
2.22 -0.4 1.79 ± 2% perf-profile.children.cycles-pp.ret_from_fork_asm
2.08 -0.4 1.68 ± 2% perf-profile.children.cycles-pp.process_one_work
0.57 -0.3 0.24 perf-profile.children.cycles-pp.__wake_up
0.63 -0.3 0.32 ± 2% perf-profile.children.cycles-pp.__wake_up_common
1.26 -0.3 0.99 perf-profile.children.cycles-pp.try_to_wake_up
3.56 ± 2% -0.2 3.34 ± 4% perf-profile.children.cycles-pp.xlog_wait_on_iclog
0.46 ± 2% -0.1 0.36 ± 2% perf-profile.children.cycles-pp.select_task_rq
0.86 ± 3% -0.1 0.75 ± 2% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.43 ± 2% -0.1 0.33 ± 2% perf-profile.children.cycles-pp.select_task_rq_fair
0.64 -0.1 0.55 ± 2% perf-profile.children.cycles-pp.ttwu_do_activate
0.71 ± 3% -0.1 0.62 ± 3% perf-profile.children.cycles-pp.activate_task
0.57 -0.1 0.48 perf-profile.children.cycles-pp.__flush_smp_call_function_queue
0.17 ± 2% -0.1 0.08 perf-profile.children.cycles-pp.xlog_state_release_iclog
0.48 -0.1 0.41 ± 2% perf-profile.children.cycles-pp.sched_ttwu_pending
0.61 ± 3% -0.1 0.54 ± 3% perf-profile.children.cycles-pp.enqueue_task_fair
0.28 ± 3% -0.1 0.21 ± 3% perf-profile.children.cycles-pp.select_idle_sibling
0.19 -0.1 0.13 ± 2% perf-profile.children.cycles-pp.schedule_idle
0.22 ± 3% -0.1 0.16 ± 4% perf-profile.children.cycles-pp.select_idle_cpu
0.47 ± 4% -0.1 0.41 ± 5% perf-profile.children.cycles-pp.update_load_avg
0.35 ± 2% -0.1 0.29 ± 2% perf-profile.children.cycles-pp.flush_smp_call_function_queue
0.42 ± 3% -0.1 0.37 ± 2% perf-profile.children.cycles-pp.enqueue_entity
0.11 ± 6% -0.1 0.06 ± 8% perf-profile.children.cycles-pp.finish_task_switch
0.18 ± 5% -0.0 0.13 ± 5% perf-profile.children.cycles-pp.available_idle_cpu
0.33 -0.0 0.28 perf-profile.children.cycles-pp.xlog_write
0.12 ± 3% -0.0 0.07 ± 5% perf-profile.children.cycles-pp.xlog_write_partial
0.30 ± 3% -0.0 0.25 ± 3% perf-profile.children.cycles-pp.asm_sysvec_call_function_single
0.12 ± 4% -0.0 0.07 ± 5% perf-profile.children.cycles-pp.xlog_write_get_more_iclog_space
0.37 ± 5% -0.0 0.32 ± 8% perf-profile.children.cycles-pp.dequeue_entity
0.08 -0.0 0.03 ± 70% perf-profile.children.cycles-pp.__cond_resched
0.46 -0.0 0.41 perf-profile.children.cycles-pp.xlog_cil_push_work
0.27 ± 3% -0.0 0.23 ± 3% perf-profile.children.cycles-pp.sysvec_call_function_single
0.08 ± 6% -0.0 0.04 ± 44% perf-profile.children.cycles-pp.select_idle_core
0.26 ± 2% -0.0 0.22 ± 3% perf-profile.children.cycles-pp.__sysvec_call_function_single
0.12 ± 3% -0.0 0.09 ± 5% perf-profile.children.cycles-pp.queue_work_on
0.14 ± 3% -0.0 0.12 ± 6% perf-profile.children.cycles-pp.prepare_task_switch
0.12 ± 3% -0.0 0.09 perf-profile.children.cycles-pp.ttwu_queue_wakelist
0.26 ± 5% -0.0 0.23 ± 6% perf-profile.children.cycles-pp.update_curr
0.12 -0.0 0.10 ± 5% perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
0.13 ± 3% -0.0 0.11 perf-profile.children.cycles-pp.wake_affine
0.08 ± 4% -0.0 0.06 ± 8% perf-profile.children.cycles-pp.set_next_entity
0.10 ± 5% -0.0 0.07 ± 6% perf-profile.children.cycles-pp.kick_pool
0.11 ± 4% -0.0 0.09 ± 4% perf-profile.children.cycles-pp.__queue_work
0.10 ± 3% -0.0 0.08 ± 4% perf-profile.children.cycles-pp.__switch_to_asm
0.10 ± 4% -0.0 0.08 ± 6% perf-profile.children.cycles-pp.switch_mm_irqs_off
0.07 -0.0 0.05 perf-profile.children.cycles-pp.__smp_call_single_queue
0.11 -0.0 0.09 perf-profile.children.cycles-pp.xlog_cil_set_ctx_write_state
0.10 -0.0 0.08 ± 4% perf-profile.children.cycles-pp.task_h_load
0.08 ± 4% -0.0 0.06 perf-profile.children.cycles-pp.sched_mm_cid_migrate_to
0.08 ± 4% -0.0 0.06 perf-profile.children.cycles-pp.set_task_cpu
0.07 ± 5% -0.0 0.05 perf-profile.children.cycles-pp.__switch_to
0.13 ± 4% -0.0 0.11 ± 3% perf-profile.children.cycles-pp.menu_select
0.13 ± 6% -0.0 0.11 ± 5% perf-profile.children.cycles-pp.reweight_entity
0.11 -0.0 0.09 ± 4% perf-profile.children.cycles-pp.xlog_cil_write_commit_record
0.06 ± 6% -0.0 0.05 perf-profile.children.cycles-pp.___perf_sw_event
0.08 ± 5% -0.0 0.07 ± 6% perf-profile.children.cycles-pp.avg_vruntime
0.06 -0.0 0.05 perf-profile.children.cycles-pp.perf_tp_event
0.06 -0.0 0.05 perf-profile.children.cycles-pp.place_entity
0.06 -0.0 0.05 perf-profile.children.cycles-pp.sched_clock
0.05 +0.0 0.06 perf-profile.children.cycles-pp.rep_movs_alternative
0.05 +0.0 0.06 ± 6% perf-profile.children.cycles-pp.kfree
0.06 +0.0 0.07 ± 5% perf-profile.children.cycles-pp.copy_page_from_iter_atomic
0.10 ± 3% +0.0 0.12 ± 4% perf-profile.children.cycles-pp.xfs_inode_item_format_data_fork
0.05 +0.0 0.06 ± 7% perf-profile.children.cycles-pp.xfs_trans_read_buf_map
0.06 +0.0 0.07 ± 6% perf-profile.children.cycles-pp.xfs_btree_lookup_get_block
0.07 ± 5% +0.0 0.08 ± 5% perf-profile.children.cycles-pp.filemap_get_entry
0.09 ± 5% +0.0 0.10 ± 3% perf-profile.children.cycles-pp.memcpy_orig
0.12 ± 3% +0.0 0.14 ± 3% perf-profile.children.cycles-pp.xlog_state_clean_iclog
0.07 ± 5% +0.0 0.08 ± 5% perf-profile.children.cycles-pp.filemap_dirty_folio
0.07 +0.0 0.09 ± 5% perf-profile.children.cycles-pp.iomap_set_range_uptodate
0.07 ± 5% +0.0 0.08 ± 5% perf-profile.children.cycles-pp.writeback_get_folio
0.07 +0.0 0.09 ± 5% perf-profile.children.cycles-pp.xfs_end_bio
0.06 ± 9% +0.0 0.07 ± 5% perf-profile.children.cycles-pp.io_schedule
0.10 +0.0 0.12 ± 3% perf-profile.children.cycles-pp.xfs_buffered_write_iomap_begin
0.09 +0.0 0.11 ± 6% perf-profile.children.cycles-pp.xfs_btree_lookup
0.10 ± 3% +0.0 0.12 ± 5% perf-profile.children.cycles-pp.writeback_iter
0.09 +0.0 0.11 perf-profile.children.cycles-pp.xfs_trans_committed_bulk
0.26 +0.0 0.28 perf-profile.children.cycles-pp.flush_workqueue_prep_pwqs
0.10 +0.0 0.12 ± 3% perf-profile.children.cycles-pp.__filemap_get_folio
0.07 ± 7% +0.0 0.09 ± 4% perf-profile.children.cycles-pp.folio_wait_bit_common
0.16 ± 3% +0.0 0.19 ± 3% perf-profile.children.cycles-pp.xfs_inode_item_format
0.08 ± 5% +0.0 0.11 perf-profile.children.cycles-pp.__filemap_fdatawait_range
0.07 ± 5% +0.0 0.09 ± 5% perf-profile.children.cycles-pp.wake_page_function
0.07 ± 7% +0.0 0.09 ± 4% perf-profile.children.cycles-pp.folio_wait_writeback
0.12 ± 4% +0.0 0.14 ± 2% perf-profile.children.cycles-pp.iomap_writepage_map_blocks
0.07 ± 6% +0.0 0.10 ± 5% perf-profile.children.cycles-pp.folio_wake_bit
0.13 ± 2% +0.0 0.16 ± 2% perf-profile.children.cycles-pp.llseek
0.03 ± 70% +0.0 0.06 perf-profile.children.cycles-pp.get_jiffies_update
0.12 ± 3% +0.0 0.15 ± 2% perf-profile.children.cycles-pp.iomap_iter
0.14 ± 5% +0.0 0.16 ± 3% perf-profile.children.cycles-pp.__mutex_unlock_slowpath
0.03 ± 70% +0.0 0.06 ± 6% perf-profile.children.cycles-pp.tmigr_requires_handle_remote
0.04 ± 44% +0.0 0.07 perf-profile.children.cycles-pp.__lruvec_stat_mod_folio
0.14 ± 2% +0.0 0.17 ± 4% perf-profile.children.cycles-pp.iomap_write_end
0.04 ± 45% +0.0 0.07 ± 6% perf-profile.children.cycles-pp.xfs_trans_alloc_inode
0.03 ± 70% +0.0 0.06 ± 7% perf-profile.children.cycles-pp.xfs_map_blocks
0.15 ± 3% +0.0 0.18 ± 2% perf-profile.children.cycles-pp.iomap_write_begin
0.11 ± 5% +0.0 0.14 ± 3% perf-profile.children.cycles-pp.wake_up_q
0.14 ± 3% +0.0 0.17 ± 3% perf-profile.children.cycles-pp.xlog_cil_committed
0.14 ± 3% +0.0 0.17 ± 2% perf-profile.children.cycles-pp.xlog_cil_process_committed
0.03 ± 70% +0.0 0.07 ± 8% perf-profile.children.cycles-pp.balance_dirty_pages_ratelimited_flags
0.22 +0.0 0.26 ± 2% perf-profile.children.cycles-pp.xlog_cil_insert_format_items
0.15 ± 2% +0.0 0.19 ± 5% perf-profile.children.cycles-pp.xfs_bmap_add_extent_unwritten_real
0.16 ± 2% +0.0 0.20 ± 5% perf-profile.children.cycles-pp.xfs_bmapi_convert_unwritten
0.02 ±141% +0.0 0.06 ± 13% perf-profile.children.cycles-pp.xlog_grant_push_threshold
0.28 ± 4% +0.0 0.32 ± 2% perf-profile.children.cycles-pp.update_process_times
0.15 +0.0 0.19 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
0.32 ± 3% +0.0 0.36 ± 3% perf-profile.children.cycles-pp.tick_nohz_handler
0.18 ± 2% +0.0 0.23 ± 4% perf-profile.children.cycles-pp.xfs_bmapi_write
0.27 ± 2% +0.0 0.32 perf-profile.children.cycles-pp.xlog_ioend_work
0.36 ± 4% +0.0 0.41 ± 3% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.26 ± 2% +0.0 0.31 perf-profile.children.cycles-pp.xlog_state_do_callback
0.26 ± 2% +0.0 0.31 perf-profile.children.cycles-pp.xlog_state_do_iclog_callbacks
0.00 +0.1 0.05 perf-profile.children.cycles-pp.xa_load
0.00 +0.1 0.05 perf-profile.children.cycles-pp.xfs_iext_lookup_extent
0.02 ±141% +0.1 0.07 ± 5% perf-profile.children.cycles-pp.up_write
0.31 ± 2% +0.1 0.38 ± 2% perf-profile.children.cycles-pp.xlog_cil_insert_items
0.41 ± 4% +0.1 0.47 ± 2% perf-profile.children.cycles-pp.hrtimer_interrupt
0.41 ± 3% +0.1 0.48 ± 3% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.13 ± 12% +0.1 0.20 ± 8% perf-profile.children.cycles-pp.xfs_log_ticket_ungrant
0.30 +0.1 0.38 ± 3% perf-profile.children.cycles-pp.copy_to_brd
0.56 ± 3% +0.1 0.64 ± 2% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
0.35 +0.1 0.43 ± 3% perf-profile.children.cycles-pp.brd_submit_bio
0.95 +0.1 1.04 perf-profile.children.cycles-pp.mutex_spin_on_owner
0.11 ± 11% +0.1 0.21 ± 12% perf-profile.children.cycles-pp.xlog_grant_add_space
0.44 +0.1 0.55 ± 2% perf-profile.children.cycles-pp.iomap_write_iter
0.19 ± 5% +0.1 0.30 ± 6% perf-profile.children.cycles-pp.iomap_finish_ioends
0.21 ± 11% +0.1 0.35 ± 12% perf-profile.children.cycles-pp.xfs_log_reserve
0.22 ± 11% +0.1 0.36 ± 11% perf-profile.children.cycles-pp.xfs_trans_reserve
0.40 ± 2% +0.1 0.54 ± 2% perf-profile.children.cycles-pp.xfs_iomap_write_unwritten
0.57 +0.1 0.71 ± 2% perf-profile.children.cycles-pp.iomap_file_buffered_write
0.25 ± 10% +0.1 0.39 ± 10% perf-profile.children.cycles-pp.xfs_trans_alloc
0.13 ± 11% +0.2 0.32 ± 16% perf-profile.children.cycles-pp.schedule_preempt_disabled
0.23 ± 13% +0.2 0.46 ± 12% perf-profile.children.cycles-pp.sb_mark_inode_writeback
0.25 ± 12% +0.2 0.50 ± 12% perf-profile.children.cycles-pp.sb_clear_inode_writeback
0.59 ± 2% +0.3 0.85 ± 2% perf-profile.children.cycles-pp.xfs_end_io
0.58 ± 2% +0.3 0.84 ± 3% perf-profile.children.cycles-pp.xfs_end_ioend
0.46 ± 6% +0.3 0.72 ± 6% perf-profile.children.cycles-pp.md_end_clone_io
0.30 ± 10% +0.3 0.57 ± 9% perf-profile.children.cycles-pp.__folio_start_writeback
0.11 ± 11% +0.3 0.38 ± 13% perf-profile.children.cycles-pp.rwsem_down_read_slowpath
0.43 ± 7% +0.3 0.72 ± 7% perf-profile.children.cycles-pp.iomap_writepage_map
0.16 ± 9% +0.3 0.46 ± 11% perf-profile.children.cycles-pp.down_read
0.44 ± 8% +0.3 0.76 ± 7% perf-profile.children.cycles-pp.__folio_end_writeback
0.52 ± 7% +0.4 0.88 ± 6% perf-profile.children.cycles-pp.folio_end_writeback
0.54 ± 7% +0.4 0.90 ± 6% perf-profile.children.cycles-pp.iomap_finish_ioend
0.92 ± 2% +0.4 1.30 ± 3% perf-profile.children.cycles-pp.iomap_submit_ioend
0.72 ± 3% +0.4 1.16 ± 5% perf-profile.children.cycles-pp.xlog_cil_commit
0.82 ± 3% +0.5 1.28 ± 5% perf-profile.children.cycles-pp.__xfs_trans_commit
0.92 ± 4% +0.5 1.43 ± 6% perf-profile.children.cycles-pp.xfs_vn_update_time
0.94 ± 4% +0.5 1.46 ± 6% perf-profile.children.cycles-pp.kiocb_modified
0.96 ± 4% +0.5 1.48 ± 6% perf-profile.children.cycles-pp.xfs_file_write_checks
6.96 ± 2% +0.5 7.49 ± 3% perf-profile.children.cycles-pp.intel_idle
1.45 ± 4% +0.7 2.15 ± 5% perf-profile.children.cycles-pp.iomap_writepages
1.46 ± 4% +0.7 2.16 ± 4% perf-profile.children.cycles-pp.xfs_vm_writepages
1.48 ± 4% +0.7 2.18 ± 4% perf-profile.children.cycles-pp.do_writepages
1.51 ± 4% +0.7 2.22 ± 4% perf-profile.children.cycles-pp.filemap_fdatawrite_wbc
1.51 ± 3% +0.7 2.23 ± 4% perf-profile.children.cycles-pp.__filemap_fdatawrite_range
1.61 ± 3% +0.8 2.36 ± 4% perf-profile.children.cycles-pp.file_write_and_wait_range
85.48 +0.8 86.24 perf-profile.children.cycles-pp.xfs_file_fsync
87.06 +1.4 88.49 perf-profile.children.cycles-pp.xfs_file_buffered_write
87.19 +1.5 88.65 perf-profile.children.cycles-pp.vfs_write
87.20 +1.5 88.66 perf-profile.children.cycles-pp.ksys_write
87.66 +1.5 89.14 perf-profile.children.cycles-pp.write
87.50 +1.5 88.98 perf-profile.children.cycles-pp.do_syscall_64
87.50 +1.5 88.99 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
56.76 +13.7 70.44 perf-profile.children.cycles-pp.osq_lock
57.89 +13.9 71.74 perf-profile.children.cycles-pp.__mutex_lock
60.36 +14.6 74.96 perf-profile.children.cycles-pp.__flush_workqueue
61.49 +14.6 76.10 perf-profile.children.cycles-pp.xlog_cil_push_now
68.74 +14.8 83.60 perf-profile.children.cycles-pp.xfs_log_force_seq
64.98 +15.1 80.03 perf-profile.children.cycles-pp.xlog_cil_force_seq
22.30 ± 3% -13.1 9.22 ± 4% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
1.91 ± 9% -1.4 0.46 ± 5% perf-profile.self.cycles-pp.intel_idle_irq
0.24 ± 2% -0.1 0.18 ± 4% perf-profile.self.cycles-pp._raw_spin_lock_irq
0.18 ± 4% -0.1 0.12 ± 6% perf-profile.self.cycles-pp.available_idle_cpu
0.37 ± 2% -0.0 0.32 ± 2% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.20 ± 3% -0.0 0.17 ± 4% perf-profile.self.cycles-pp.update_load_avg
0.14 ± 3% -0.0 0.11 ± 3% perf-profile.self.cycles-pp.__schedule
0.09 ± 4% -0.0 0.07 ± 8% perf-profile.self.cycles-pp.prepare_task_switch
0.10 -0.0 0.08 ± 4% perf-profile.self.cycles-pp.task_h_load
0.10 ± 5% -0.0 0.08 ± 6% perf-profile.self.cycles-pp.__switch_to_asm
0.08 ± 4% -0.0 0.06 perf-profile.self.cycles-pp.sched_mm_cid_migrate_to
0.07 ± 5% -0.0 0.05 ± 7% perf-profile.self.cycles-pp.menu_select
0.09 ± 5% -0.0 0.08 ± 6% perf-profile.self.cycles-pp.switch_mm_irqs_off
0.06 ± 7% -0.0 0.05 perf-profile.self.cycles-pp.__switch_to
0.07 ± 7% -0.0 0.05 ± 8% perf-profile.self.cycles-pp.enqueue_entity
0.10 ± 4% -0.0 0.09 ± 7% perf-profile.self.cycles-pp.update_curr
0.05 +0.0 0.06 perf-profile.self.cycles-pp.rep_movs_alternative
0.06 +0.0 0.07 ± 5% perf-profile.self.cycles-pp.xas_load
0.08 ± 4% +0.0 0.10 ± 5% perf-profile.self.cycles-pp.__flush_workqueue
0.07 +0.0 0.08 ± 5% perf-profile.self.cycles-pp.iomap_set_range_uptodate
0.08 ± 5% +0.0 0.10 ± 3% perf-profile.self.cycles-pp.memcpy_orig
0.05 ± 7% +0.0 0.07 ± 5% perf-profile.self.cycles-pp.down_read
0.08 ± 5% +0.0 0.11 ± 4% perf-profile.self.cycles-pp.__mutex_lock
0.09 ± 4% +0.0 0.12 ± 6% perf-profile.self.cycles-pp.xlog_cil_insert_items
0.03 ± 70% +0.0 0.06 perf-profile.self.cycles-pp.get_jiffies_update
0.02 ± 99% +0.0 0.06 ± 7% perf-profile.self.cycles-pp.__folio_end_writeback
0.15 +0.0 0.19 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
0.10 ± 12% +0.1 0.16 ± 9% perf-profile.self.cycles-pp.xfs_log_ticket_ungrant
0.00 +0.1 0.06 ± 11% perf-profile.self.cycles-pp.balance_dirty_pages_ratelimited_flags
0.30 ± 2% +0.1 0.37 ± 2% perf-profile.self.cycles-pp.copy_to_brd
0.95 +0.1 1.03 perf-profile.self.cycles-pp.mutex_spin_on_owner
0.11 ± 11% +0.1 0.20 ± 14% perf-profile.self.cycles-pp.xlog_grant_add_space
6.96 ± 2% +0.5 7.49 ± 3% perf-profile.self.cycles-pp.intel_idle
56.27 +13.5 69.81 perf-profile.self.cycles-pp.osq_lock
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [axboe-block:for-next] [block] 1122c0c1cc: aim7.jobs-per-min 22.6% improvement
2024-06-25 2:28 [axboe-block:for-next] [block] 1122c0c1cc: aim7.jobs-per-min 22.6% improvement kernel test robot
@ 2024-06-25 8:57 ` Christoph Hellwig
2024-06-26 2:10 ` Oliver Sang
0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2024-06-25 8:57 UTC (permalink / raw)
To: kernel test robot
Cc: Christoph Hellwig, oe-lkp, lkp, Jens Axboe, Ulf Hansson,
Damien Le Moal, Hannes Reinecke, linux-block, linux-um, drbd-dev,
nbd, linuxppc-dev, virtualization, xen-devel, linux-bcache,
dm-devel, linux-raid, linux-mmc, linux-mtd, nvdimm, linux-nvme,
linux-scsi, ying.huang, feng.tang, fengwei.yin
Hi Oliver,
can you test the patch below? It restores the previous behavior if
the device did not have a volatile write cache. I think at least
for raid0 and raid1 without bitmap the new behavior actually is correct
and better, but it will need fixes for other modes. If the underlying
devices did have a volatile write cache I'm a bit lost what the problem
was and this probably won't fix the issue.
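For context, a simplified sketch of why the flushes disappear (illustrative
only, not the exact upstream code): after 1122c0c1cc71 the write cache state
lives in queue_limits.features, and the md personalities rebuild their limits
via blk_set_stacking_limits(), which does not set BLK_FEAT_WRITE_CACHE.
Without that feature the submission path strips PREFLUSH/FUA, so fsync's
blkdev_issue_flush() never reaches md_flush_request() and the contended
flush lock seen in the profile above goes away:

#include <linux/blkdev.h>	/* bio, request_queue, BLK_FEAT_WRITE_CACHE */

/* illustrative only -- the real check sits in the bio submission path */
static void strip_flush_if_no_cache(struct block_device *bdev, struct bio *bio)
{
	struct request_queue *q = bdev_get_queue(bdev);

	/* no volatile cache advertised: drop the flush/FUA semantics */
	if ((bio->bi_opf & (REQ_PREFLUSH | REQ_FUA)) &&
	    !(q->limits.features & BLK_FEAT_WRITE_CACHE))
		bio->bi_opf &= ~(REQ_PREFLUSH | REQ_FUA);
}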
---
From 81c816827197f811e14add7a79220ed9eef6af02 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Tue, 25 Jun 2024 08:48:18 +0200
Subject: md: set md-specific flags for all queue limits
The md driver wants to enforce a number of flags for all devices, even
when not inheriting them from the underlying devices. To make sure these
flags survive the queue_limits_set calls that md uses to update the
queue limits without deriving them from the previous limits, add a new
md_init_stacking_limits helper that calls blk_set_stacking_limits and sets
these flags.
Fixes: 1122c0c1cc71 ("block: move cache control settings out of queue->flags")
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/md/md.c | 13 ++++++++-----
drivers/md/md.h | 1 +
drivers/md/raid0.c | 2 +-
drivers/md/raid1.c | 2 +-
drivers/md/raid10.c | 2 +-
drivers/md/raid5.c | 2 +-
6 files changed, 13 insertions(+), 9 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 69ea54aedd99a1..8368438e58e989 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5853,6 +5853,13 @@ static void mddev_delayed_delete(struct work_struct *ws)
kobject_put(&mddev->kobj);
}
+void md_init_stacking_limits(struct queue_limits *lim)
+{
+ blk_set_stacking_limits(lim);
+ lim->features = BLK_FEAT_WRITE_CACHE | BLK_FEAT_FUA |
+ BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT;
+}
+
struct mddev *md_alloc(dev_t dev, char *name)
{
/*
@@ -5871,10 +5878,6 @@ struct mddev *md_alloc(dev_t dev, char *name)
int shift;
int unit;
int error;
- struct queue_limits lim = {
- .features = BLK_FEAT_WRITE_CACHE | BLK_FEAT_FUA |
- BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT,
- };
/*
* Wait for any previous instance of this device to be completely
@@ -5914,7 +5917,7 @@ struct mddev *md_alloc(dev_t dev, char *name)
*/
mddev->hold_active = UNTIL_STOP;
- disk = blk_alloc_disk(&lim, NUMA_NO_NODE);
+ disk = blk_alloc_disk(NULL, NUMA_NO_NODE);
if (IS_ERR(disk)) {
error = PTR_ERR(disk);
goto out_free_mddev;
diff --git a/drivers/md/md.h b/drivers/md/md.h
index c4d7ebf9587d07..28cb4b0b6c1740 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -893,6 +893,7 @@ extern int strict_strtoul_scaled(const char *cp, unsigned long *res, int scale);
extern int mddev_init(struct mddev *mddev);
extern void mddev_destroy(struct mddev *mddev);
+void md_init_stacking_limits(struct queue_limits *lim);
struct mddev *md_alloc(dev_t dev, char *name);
void mddev_put(struct mddev *mddev);
extern int md_run(struct mddev *mddev);
diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index 62634e2a33bd0f..32d58752477847 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -379,7 +379,7 @@ static int raid0_set_limits(struct mddev *mddev)
struct queue_limits lim;
int err;
- blk_set_stacking_limits(&lim);
+ md_init_stacking_limits(&lim);
lim.max_hw_sectors = mddev->chunk_sectors;
lim.max_write_zeroes_sectors = mddev->chunk_sectors;
lim.io_min = mddev->chunk_sectors << 9;
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 1a0eba65b8a92b..04a0c2ca173245 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -3194,7 +3194,7 @@ static int raid1_set_limits(struct mddev *mddev)
struct queue_limits lim;
int err;
- blk_set_stacking_limits(&lim);
+ md_init_stacking_limits(&lim);
lim.max_write_zeroes_sectors = 0;
err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
if (err) {
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 3334aa803c8380..2a9c4ee982e023 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3974,7 +3974,7 @@ static int raid10_set_queue_limits(struct mddev *mddev)
struct queue_limits lim;
int err;
- blk_set_stacking_limits(&lim);
+ md_init_stacking_limits(&lim);
lim.max_write_zeroes_sectors = 0;
lim.io_min = mddev->chunk_sectors << 9;
lim.io_opt = lim.io_min * raid10_nr_stripes(conf);
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 0192a6323f09ba..10219205160bbf 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7708,7 +7708,7 @@ static int raid5_set_limits(struct mddev *mddev)
*/
stripe = roundup_pow_of_two(data_disks * (mddev->chunk_sectors << 9));
- blk_set_stacking_limits(&lim);
+ md_init_stacking_limits(&lim);
lim.io_min = mddev->chunk_sectors << 9;
lim.io_opt = lim.io_min * (conf->raid_disks - conf->max_degraded);
lim.features |= BLK_FEAT_RAID_PARTIAL_STRIPES_EXPENSIVE;
--
2.43.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [axboe-block:for-next] [block] 1122c0c1cc: aim7.jobs-per-min 22.6% improvement
2024-06-25 8:57 ` Christoph Hellwig
@ 2024-06-26 2:10 ` Oliver Sang
2024-06-26 3:39 ` Christoph Hellwig
0 siblings, 1 reply; 8+ messages in thread
From: Oliver Sang @ 2024-06-26 2:10 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Christoph Hellwig, oe-lkp, lkp, Jens Axboe, Ulf Hansson,
Damien Le Moal, Hannes Reinecke, linux-block, linux-um, drbd-dev,
nbd, linuxppc-dev, virtualization, xen-devel, linux-bcache,
dm-devel, linux-raid, linux-mmc, linux-mtd, nvdimm, linux-nvme,
linux-scsi, ying.huang, feng.tang, fengwei.yin, oliver.sang
hi, Christoph Hellwig,
On Tue, Jun 25, 2024 at 01:57:35AM -0700, Christoph Hellwig wrote:
> Hi Oliver,
>
> can you test the patch below? It restores the previous behavior if
> the device did not have a volatile write cache. I think at least
> for raid0 and raid1 without bitmap the new behavior actually is correct
> and better, but it will need fixes for other modes. If the underlying
> devices did have a volatile write cache I'm a bit lost what the problem
> was and this probably won't fix the issue.
I'm not sure I understand this test request. As the title says, we see a good
improvement of aim7 for 1122c0c1cc, and we didn't observe other issues for
this commit.
do you mean this improvement is not expected or exposes some problems instead?
then with the patch below, should the performance go back to the level of the parent of
1122c0c1cc?
sure! it's our great pleasure to test your patches. I noticed there are
[1]
https://lore.kernel.org/all/20240625110603.50885-2-hch@lst.de/
which includes "[PATCH 1/7] md: set md-specific flags for all queue limits"
[2]
https://lore.kernel.org/all/20240625145955.115252-2-hch@lst.de/
which includes "[PATCH 1/8] md: set md-specific flags for all queue limits"
which one do you suggest we test?
do we only need to apply the first patch "md: set md-specific flags for all queue limits"
upon 1122c0c1cc?
then is the expectation that the performance goes back to that of the parent of 1122c0c1cc?
thanks
>
> ---
> From 81c816827197f811e14add7a79220ed9eef6af02 Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <hch@lst.de>
> Date: Tue, 25 Jun 2024 08:48:18 +0200
> Subject: md: set md-specific flags for all queue limits
>
> The md driver wants to enforce a number of flags for all devices, even
> when not inheriting them from the underlying devices. To make sure these
> flags survive the queue_limits_set calls that md uses to update the
> queue limits without deriving them from the previous limits, add a new
> md_init_stacking_limits helper that calls blk_set_stacking_limits and sets
> these flags.
>
> Fixes: 1122c0c1cc71 ("block: move cache control settings out of queue->flags")
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> drivers/md/md.c | 13 ++++++++-----
> drivers/md/md.h | 1 +
> drivers/md/raid0.c | 2 +-
> drivers/md/raid1.c | 2 +-
> drivers/md/raid10.c | 2 +-
> drivers/md/raid5.c | 2 +-
> 6 files changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 69ea54aedd99a1..8368438e58e989 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -5853,6 +5853,13 @@ static void mddev_delayed_delete(struct work_struct *ws)
> kobject_put(&mddev->kobj);
> }
>
> +void md_init_stacking_limits(struct queue_limits *lim)
> +{
> + blk_set_stacking_limits(lim);
> + lim->features = BLK_FEAT_WRITE_CACHE | BLK_FEAT_FUA |
> + BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT;
> +}
> +
> struct mddev *md_alloc(dev_t dev, char *name)
> {
> /*
> @@ -5871,10 +5878,6 @@ struct mddev *md_alloc(dev_t dev, char *name)
> int shift;
> int unit;
> int error;
> - struct queue_limits lim = {
> - .features = BLK_FEAT_WRITE_CACHE | BLK_FEAT_FUA |
> - BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT,
> - };
>
> /*
> * Wait for any previous instance of this device to be completely
> @@ -5914,7 +5917,7 @@ struct mddev *md_alloc(dev_t dev, char *name)
> */
> mddev->hold_active = UNTIL_STOP;
>
> - disk = blk_alloc_disk(&lim, NUMA_NO_NODE);
> + disk = blk_alloc_disk(NULL, NUMA_NO_NODE);
> if (IS_ERR(disk)) {
> error = PTR_ERR(disk);
> goto out_free_mddev;
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index c4d7ebf9587d07..28cb4b0b6c1740 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -893,6 +893,7 @@ extern int strict_strtoul_scaled(const char *cp, unsigned long *res, int scale);
>
> extern int mddev_init(struct mddev *mddev);
> extern void mddev_destroy(struct mddev *mddev);
> +void md_init_stacking_limits(struct queue_limits *lim);
> struct mddev *md_alloc(dev_t dev, char *name);
> void mddev_put(struct mddev *mddev);
> extern int md_run(struct mddev *mddev);
> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> index 62634e2a33bd0f..32d58752477847 100644
> --- a/drivers/md/raid0.c
> +++ b/drivers/md/raid0.c
> @@ -379,7 +379,7 @@ static int raid0_set_limits(struct mddev *mddev)
> struct queue_limits lim;
> int err;
>
> - blk_set_stacking_limits(&lim);
> + md_init_stacking_limits(&lim);
> lim.max_hw_sectors = mddev->chunk_sectors;
> lim.max_write_zeroes_sectors = mddev->chunk_sectors;
> lim.io_min = mddev->chunk_sectors << 9;
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 1a0eba65b8a92b..04a0c2ca173245 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -3194,7 +3194,7 @@ static int raid1_set_limits(struct mddev *mddev)
> struct queue_limits lim;
> int err;
>
> - blk_set_stacking_limits(&lim);
> + md_init_stacking_limits(&lim);
> lim.max_write_zeroes_sectors = 0;
> err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
> if (err) {
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 3334aa803c8380..2a9c4ee982e023 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -3974,7 +3974,7 @@ static int raid10_set_queue_limits(struct mddev *mddev)
> struct queue_limits lim;
> int err;
>
> - blk_set_stacking_limits(&lim);
> + md_init_stacking_limits(&lim);
> lim.max_write_zeroes_sectors = 0;
> lim.io_min = mddev->chunk_sectors << 9;
> lim.io_opt = lim.io_min * raid10_nr_stripes(conf);
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 0192a6323f09ba..10219205160bbf 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -7708,7 +7708,7 @@ static int raid5_set_limits(struct mddev *mddev)
> */
> stripe = roundup_pow_of_two(data_disks * (mddev->chunk_sectors << 9));
>
> - blk_set_stacking_limits(&lim);
> + md_init_stacking_limits(&lim);
> lim.io_min = mddev->chunk_sectors << 9;
> lim.io_opt = lim.io_min * (conf->raid_disks - conf->max_degraded);
> lim.features |= BLK_FEAT_RAID_PARTIAL_STRIPES_EXPENSIVE;
> --
> 2.43.0
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [axboe-block:for-next] [block] 1122c0c1cc: aim7.jobs-per-min 22.6% improvement
2024-06-26 2:10 ` Oliver Sang
@ 2024-06-26 3:39 ` Christoph Hellwig
2024-06-27 2:35 ` Oliver Sang
0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2024-06-26 3:39 UTC (permalink / raw)
To: Oliver Sang
Cc: Christoph Hellwig, Christoph Hellwig, oe-lkp, lkp, Jens Axboe,
Ulf Hansson, Damien Le Moal, Hannes Reinecke, linux-block,
linux-um, drbd-dev, nbd, linuxppc-dev, virtualization, xen-devel,
linux-bcache, dm-devel, linux-raid, linux-mmc, linux-mtd, nvdimm,
linux-nvme, linux-scsi, ying.huang, feng.tang, fengwei.yin
On Wed, Jun 26, 2024 at 10:10:49AM +0800, Oliver Sang wrote:
> I'm not sure I understand this test request. As the title says, we see a good
> improvement of aim7 for 1122c0c1cc, and we didn't observe other issues for
> this commit.
The improvement suggests we are not sending cache flushes when we should
send them, or at least just handle them in md.
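(Purely as an illustration of the "handle them in md" option: the flush
fan-out could be short-circuited when no member device has a volatile write
cache, e.g. something along the lines of the untested sketch below, which
assumes the drivers/md/md.h and linux/blkdev.h helpers; it is not a patch
being proposed here.)

static bool md_any_rdev_has_write_cache(struct mddev *mddev)
{
	struct md_rdev *rdev;
	bool ret = false;

	rcu_read_lock();
	rdev_for_each_rcu(rdev, mddev) {
		/* bdev_write_cache() is only true for a volatile cache */
		if (rdev->bdev && bdev_write_cache(rdev->bdev)) {
			ret = true;
			break;
		}
	}
	rcu_read_unlock();
	return ret;
}

With brd members, as in this test, that would return false and
md_flush_request() could complete PREFLUSH bios immediately instead of
serializing them on the flush lock.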
> do you mean this improvement is not expected or exposes some problems instead?
> then with the patch below, should the performance go back to the level of the parent of
> 1122c0c1cc?
>
> sure! it's our great pleasure to test your patches. I noticed there are
> [1]
> https://lore.kernel.org/all/20240625110603.50885-2-hch@lst.de/
> which includes "[PATCH 1/7] md: set md-specific flags for all queue limits"
> [2]
> https://lore.kernel.org/all/20240625145955.115252-2-hch@lst.de/
> which includes "[PATCH 1/8] md: set md-specific flags for all queue limits"
>
> which one do you suggest we test?
> do we only need to apply the first patch "md: set md-specific flags for all queue limits"
> upon 1122c0c1cc?
> then is the expectation that the performance goes back to that of the parent of 1122c0c1cc?
Either just the patch in reply or the entire [2] series would be fine.
Thanks!
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [axboe-block:for-next] [block] 1122c0c1cc: aim7.jobs-per-min 22.6% improvement
2024-06-26 3:39 ` Christoph Hellwig
@ 2024-06-27 2:35 ` Oliver Sang
2024-06-27 4:54 ` Christoph Hellwig
0 siblings, 1 reply; 8+ messages in thread
From: Oliver Sang @ 2024-06-27 2:35 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Christoph Hellwig, oe-lkp, lkp, Jens Axboe, Ulf Hansson,
Damien Le Moal, Hannes Reinecke, linux-block, linux-um, drbd-dev,
nbd, linuxppc-dev, virtualization, xen-devel, linux-bcache,
dm-devel, linux-raid, linux-mmc, linux-mtd, nvdimm, linux-nvme,
linux-scsi, ying.huang, feng.tang, fengwei.yin, oliver.sang
hi, Christoph Hellwig,
On Tue, Jun 25, 2024 at 08:39:50PM -0700, Christoph Hellwig wrote:
> On Wed, Jun 26, 2024 at 10:10:49AM +0800, Oliver Sang wrote:
> > I'm not sure I understand this test request. As the title says, we see a good
> > improvement of aim7 for 1122c0c1cc, and we didn't observe other issues for
> > this commit.
>
> The improvement suggests we are not sending cache flushes when we should
> send them, or at least just handle them in md.
thanks for the explanation!
>
> > do you mean this improvement is not expected or exposes some problems instead?
> > then with the patch below, should the performance go back to the level of the parent of
> > 1122c0c1cc?
> >
> > sure! it's our great pleasure to test your patches. I noticed there are
> > [1]
> > https://lore.kernel.org/all/20240625110603.50885-2-hch@lst.de/
> > which includes "[PATCH 1/7] md: set md-specific flags for all queue limits"
> > [2]
> > https://lore.kernel.org/all/20240625145955.115252-2-hch@lst.de/
> > which includes "[PATCH 1/8] md: set md-specific flags for all queue limits"
> >
> > which one do you suggest we test?
> > do we only need to apply the first patch "md: set md-specific flags for all queue limits"
> > upon 1122c0c1cc?
> > then is the expectation that the performance goes back to that of the parent of 1122c0c1cc?
>
> Either just the patch in reply or the entire [2] series would be fine.
I failed to apply patch in your previous reply to 1122c0c1cc or current tip
of axboe-block/for-next:
c1440ed442a58 (axboe-block/for-next) Merge branch 'for-6.11/block' into for-next
but it's ok to apply upon next:
* 0fc4bfab2cd45 (tag: next-20240625) Add linux-next specific files for 20240625
I've already started the test with the patch applied this way.
Is the expectation that the patch should not introduce a performance change compared
to 0fc4bfab2cd45?
Or, if this way of applying it is not ok, please just give me guidance. Thanks!
>
> Thanks!
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [axboe-block:for-next] [block] 1122c0c1cc: aim7.jobs-per-min 22.6% improvement
2024-06-27 2:35 ` Oliver Sang
@ 2024-06-27 4:54 ` Christoph Hellwig
2024-07-01 8:22 ` Oliver Sang
0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2024-06-27 4:54 UTC (permalink / raw)
To: Oliver Sang
Cc: Christoph Hellwig, oe-lkp, lkp, Jens Axboe, Ulf Hansson,
Damien Le Moal, Hannes Reinecke, linux-block, linux-um, drbd-dev,
nbd, linuxppc-dev, virtualization, xen-devel, linux-bcache,
dm-devel, linux-raid, linux-mmc, linux-mtd, nvdimm, linux-nvme,
linux-scsi, ying.huang, feng.tang, fengwei.yin
On Thu, Jun 27, 2024 at 10:35:38AM +0800, Oliver Sang wrote:
>
> I failed to apply patch in your previous reply to 1122c0c1cc or current tip
> of axboe-block/for-next:
> c1440ed442a58 (axboe-block/for-next) Merge branch 'for-6.11/block' into for-next
That already includes it.
>
> but it's ok to apply upon next:
> * 0fc4bfab2cd45 (tag: next-20240625) Add linux-next specific files for 20240625
>
> I've already started the test with the patch applied this way.
> Is the expectation that the patch should not introduce a performance change compared
> to 0fc4bfab2cd45?
>
> Or, if this way of applying it is not ok, please just give me guidance. Thanks!
The expectation is that the latest block branch (and thus linux-next)
doesn't see this performance change.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [axboe-block:for-next] [block] 1122c0c1cc: aim7.jobs-per-min 22.6% improvement
2024-06-27 4:54 ` Christoph Hellwig
@ 2024-07-01 8:22 ` Oliver Sang
2024-07-02 7:32 ` Christoph Hellwig
0 siblings, 1 reply; 8+ messages in thread
From: Oliver Sang @ 2024-07-01 8:22 UTC (permalink / raw)
To: Christoph Hellwig
Cc: oe-lkp, lkp, Jens Axboe, Ulf Hansson, Damien Le Moal,
Hannes Reinecke, linux-block, linux-um, drbd-dev, nbd,
linuxppc-dev, virtualization, xen-devel, linux-bcache, dm-devel,
linux-raid, linux-mmc, linux-mtd, nvdimm, linux-nvme, linux-scsi,
ying.huang, feng.tang, fengwei.yin, oliver.sang
hi, Christoph Hellwig,
On Wed, Jun 26, 2024 at 09:54:05PM -0700, Christoph Hellwig wrote:
> On Thu, Jun 27, 2024 at 10:35:38AM +0800, Oliver Sang wrote:
> >
> > I failed to apply patch in your previous reply to 1122c0c1cc or current tip
> > of axboe-block/for-next:
> > c1440ed442a58 (axboe-block/for-next) Merge branch 'for-6.11/block' into for-next
>
> That already includes it.
For the patch in your previous reply [1],
the bot applied it automatically as:
* 5c683739f6c2f patch in [1]
* 0fc4bfab2cd45 (tag: next-20240625) Add linux-next specific files for 20240625
for patch set [2], the bot applied it as:
* 6490f979767736 block: move dma_pad_mask into queue_limits
* 278817f42e219b block: remove the fallback case in queue_dma_alignment
* 81afb19d619a04 block: remove disk_update_readahead
* 037d85402b8b83 block: conding style fixup for blk_queue_max_guaranteed_bio
* 4fe67425ae31a8 block: convert features and flags to __bitwise types
* e3c2d2ad4136f2 block: rename BLK_FLAG_MISALIGNED
* 33ead159243d1c block: correctly report cache type
* 6725109120e0ba md: set md-specific flags for all queue limits
* e6d130064a02f5 Merge branch 'for-6.11/block' into for-next
but both builds failed with the following errors:
- "ERROR: modpost: \"md_init_stacking_limits\" [drivers/md/raid456.ko] undefined!"
- "ERROR: modpost: \"md_init_stacking_limits\" [drivers/md/raid1.ko] undefined!"
- "ERROR: modpost: \"md_init_stacking_limits\" [drivers/md/raid0.ko] undefined!"
- "ERROR: modpost: \"md_init_stacking_limits\" [drivers/md/raid10.ko] undefined!"
Since you mentioned the axboe-block/for-next branch already includes the
patch set, I took a snapshot of the branch (as below) several days ago:
* bc512ae8cb934 (axboe-block/for-next) Merge branch 'for-6.11/block' into for-next <-----------
|\
| * 18048c1af7836 (axboe-block/for-6.11/block) loop: Fix a race between loop detach and loop open
| * 63db4a1f795a1 block: Delete blk_queue_flag_test_and_set()
* | e21d05740862c Merge branch 'for-6.11/block' into for-next
|\|
| * e269537e491da block: clean up the check in blkdev_iomap_begin()
* | 9c6e1f8702d51 Merge branch 'for-6.11/block' into for-next
|\|
| * 69b6517687a4b block: use the right type for stub rq_integrity_vec()
* | c1440ed442a58 Merge branch 'for-6.11/block' into for-next
|\|
| * e94b45d08b5d1 block: move dma_pad_mask into queue_limits <----------------
| * abfc9d810926d block: remove the fallback case in queue_dma_alignment
| * 73781b3b81e76 block: remove disk_update_readahead
| * 3302f6f090522 block: conding style fixup for blk_queue_max_guaranteed_bio
| * fcf865e357f80 block: convert features and flags to __bitwise types
| * ec9b1cf0b0ebf block: rename BLK_FEAT_MISALIGNED
| * 78887d004fb2b block: correctly report cache type
| * 573d5abf3df00 md: set md-specific flags for all queue limits <----------------
* | 72e9cd924fccc Merge branch 'for-6.11/block' into for-next
|\|
| * cf546dd289e0f block: change rq_integrity_vec to respect the iterator <-------------
From the results below, it seems the patch set doesn't introduce a performance improvement
but rather a regression now. Is this expected?
=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/md/rootfs/tbox_group/test/testcase:
gcc-13/performance/4BRD_12G/xfs/x86_64-rhel-8.3/300/RAID0/debian-12-x86_64-20240206.cgz/lkp-csl-2sp3/sync_disk_rw/aim7
cf546dd289e0f6d2 573d5abf3df00c879fbd25774e4 e94b45d08b5d1c230c0f59c3eed bc512ae8cb934ac31470bc825fa
---------------- --------------------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev %change %stddev
\ | \ | \ | \
21493 -19.6% 17278 -19.2% 17371 -19.7% 17264 aim7.jobs-per-min
[1] https://lore.kernel.org/all/ZnqGf49cvy6W-xWf@infradead.org/
[2] https://lore.kernel.org/all/20240625145955.115252-2-hch@lst.de/
>
> >
> > but it's ok to apply upon next:
> > * 0fc4bfab2cd45 (tag: next-20240625) Add linux-next specific files for 20240625
> >
> > I've already started the test with the patch applied this way.
> > Is the expectation that the patch should not introduce a performance change compared
> > to 0fc4bfab2cd45?
> >
> > Or, if this way of applying it is not ok, please just give me guidance. Thanks!
>
> The expectation is that the latest block branch (and thus linux-next)
> doesn't see this performance change.
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [axboe-block:for-next] [block] 1122c0c1cc: aim7.jobs-per-min 22.6% improvement
2024-07-01 8:22 ` Oliver Sang
@ 2024-07-02 7:32 ` Christoph Hellwig
0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2024-07-02 7:32 UTC (permalink / raw)
To: Oliver Sang
Cc: Christoph Hellwig, oe-lkp, lkp, Jens Axboe, Ulf Hansson,
Damien Le Moal, Hannes Reinecke, linux-block, linux-um, drbd-dev,
nbd, linuxppc-dev, virtualization, xen-devel, linux-bcache,
dm-devel, linux-raid, linux-mmc, linux-mtd, nvdimm, linux-nvme,
linux-scsi, ying.huang, feng.tang, fengwei.yin
On Mon, Jul 01, 2024 at 04:22:19PM +0800, Oliver Sang wrote:
> From the results below, it seems the patch set doesn't introduce a performance improvement
> but rather a regression now. Is this expected?
Not having the improvement at least alleviates my concerns about data
integrity. I'm still curious where it comes from as it isn't exactly
expected.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2024-07-02 7:32 UTC | newest]
Thread overview: 8+ messages
-- links below jump to the message on this page --
2024-06-25 2:28 [axboe-block:for-next] [block] 1122c0c1cc: aim7.jobs-per-min 22.6% improvement kernel test robot
2024-06-25 8:57 ` Christoph Hellwig
2024-06-26 2:10 ` Oliver Sang
2024-06-26 3:39 ` Christoph Hellwig
2024-06-27 2:35 ` Oliver Sang
2024-06-27 4:54 ` Christoph Hellwig
2024-07-01 8:22 ` Oliver Sang
2024-07-02 7:32 ` Christoph Hellwig