[cel:nfsd-testing] [nfsd] ab62726202: fsmark.app

public inbox for linux-nfs@vger.kernel.org
 help / color / mirror / Atom feed

* [cel:nfsd-testing] [nfsd]  ab62726202:  fsmark.app_overhead 186.4% regression
@ 2024-12-27  9:13 kernel test robot
  2024-12-28 17:32 ` Chuck Lever
  0 siblings, 1 reply; 4+ messages in thread
From: kernel test robot @ 2024-12-27  9:13 UTC (permalink / raw)
  To: NeilBrown; +Cc: oe-lkp, lkp, Chuck Lever, Jeff Layton, linux-nfs, oliver.sang



Hello,

kernel test robot noticed a 186.4% regression of fsmark.app_overhead on:
(but no diff for fsmark.files_per_sec as below (a))


commit: ab627262022ed8c6a68e619ed03a14e47acf2e39 ("nfsd: allocate new session-based DRC slots on demand.")
https://git.kernel.org/cgit/linux/kernel/git/cel/linux nfsd-testing

testcase: fsmark
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	iterations: 1x
	nr_threads: 32t
	disk: 1HDD
	fs: btrfs
	fs2: nfsv4
	filesize: 16MB
	test_size: 15G
	sync_method: NoSync
	nr_directories: 16d
	nr_files_per_directory: 256fpd
	cpufreq_governor: performance




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202412271641.cfba5666-lkp@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241227/202412271641.cfba5666-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/disk/filesize/fs2/fs/iterations/kconfig/nr_directories/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase:
  gcc-12/performance/1HDD/16MB/nfsv4/btrfs/1x/x86_64-rhel-9.4/16d/256fpd/32t/debian-12-x86_64-20240206.cgz/NoSync/lkp-icl-2sp7/15G/fsmark

commit: 
  ccd01c7601 ("nfsd: add session slot count to /proc/fs/nfsd/clients/*/info")
  ab62726202 ("nfsd: allocate new session-based DRC slots on demand.")

ccd01c76017847d1 ab627262022ed8c6a68e619ed03 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      5.48 ±  9%     +24.9%       6.85 ± 14%  sched_debug.cpu.nr_uninterruptible.stddev
     12489           +11.1%      13876        uptime.idle
 3.393e+08 ± 16%    +186.4%  9.717e+08 ±  9%  fsmark.app_overhead
      6.40            +0.0%       6.40        fsmark.files_per_sec     <-------- (a)
      6.00           +27.8%       7.67 ±  6%  fsmark.time.percent_of_cpu_this_job_got
     72.33           +15.8%      83.79        iostat.cpu.idle
     25.91 ±  3%     -44.3%      14.42 ± 11%  iostat.cpu.iowait
     72.08           +11.6       83.64        mpstat.cpu.all.idle%
     26.18 ±  3%     -11.6       14.58 ± 11%  mpstat.cpu.all.iowait%
    153772 ±  5%     +19.1%     183126 ±  8%  meminfo.DirectMap4k
    156099           +19.5%     186594        meminfo.Dirty
    467358           -12.9%     406910 ±  2%  meminfo.Writeback
     72.35           +15.8%      83.79        vmstat.cpu.id
     25.90 ±  3%     -44.3%      14.41 ± 11%  vmstat.cpu.wa
     17.61 ±  3%     -45.8%       9.55 ± 10%  vmstat.procs.b
      5909 ±  2%      -6.2%       5545        vmstat.system.in
      0.03 ± 56%     -73.4%       0.01 ±149%  perf-sched.sch_delay.avg.ms.btrfs_commit_transaction.btrfs_sync_file.nfsd_commit.nfsd4_commit
      0.05 ± 60%     -72.1%       0.02 ±165%  perf-sched.sch_delay.max.ms.btrfs_commit_transaction.btrfs_sync_file.nfsd_commit.nfsd4_commit
      0.07 ± 41%     +36.1%       0.10 ±  8%  perf-sched.sch_delay.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.btrfs_tree_read_lock_nested
     13.03 ±170%    +566.0%      86.78 ± 77%  perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.svc_tcp_sendto
    206.83 ± 14%     -31.5%     141.67 ±  6%  perf-sched.wait_and_delay.count.rpc_wait_bit_killable.__wait_on_bit.out_of_line_wait_on_bit.__rpc_execute
      0.30 ± 62%     -82.1%       0.05 ±110%  perf-sched.wait_time.avg.ms.handle_reserve_ticket.__reserve_bytes.btrfs_reserve_data_bytes.btrfs_check_data_free_space
      7.37 ±  4%     -15.8%       6.20 ±  4%  perf-stat.i.MPKI
     44.13 ±  2%      -2.9       41.25 ±  2%  perf-stat.i.cache-miss-rate%
    103.65 ±  2%     +17.9%     122.17 ±  8%  perf-stat.i.cpu-migrations
    627.67 ±  3%     +25.4%     787.18 ±  6%  perf-stat.i.cycles-between-cache-misses
      0.67            +3.7%       0.70        perf-stat.i.ipc
      1.35            +2.2%       1.38        perf-stat.overall.cpi
    373.39            +4.1%     388.79        perf-stat.overall.cycles-between-cache-misses
      0.74            -2.1%       0.73        perf-stat.overall.ipc
    102.89 ±  2%     +17.9%     121.32 ±  8%  perf-stat.ps.cpu-migrations
     39054           +19.0%      46460 ±  2%  proc-vmstat.nr_dirty
     15139            +2.2%      15476        proc-vmstat.nr_kernel_stack
     45710            +1.9%      46570        proc-vmstat.nr_slab_unreclaimable
    116900           -13.5%     101162        proc-vmstat.nr_writeback
     87038           -18.2%      71185 ±  2%  proc-vmstat.nr_zone_write_pending
   6949807            -3.8%    6688660        proc-vmstat.numa_hit
   6882153            -3.8%    6622312        proc-vmstat.numa_local
  13471776            -2.0%   13204489        proc-vmstat.pgalloc_normal
    584292            +3.8%     606391 ±  3%  proc-vmstat.pgfault
     25859            +9.8%      28392 ±  9%  proc-vmstat.pgreuse
      2.02 ±  8%      -0.3        1.71 ±  5%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      1.86 ±  8%      -0.3        1.58 ±  6%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
      3.42 ±  5%      -0.6        2.87 ±  5%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      2.96 ±  4%      -0.4        2.55 ±  5%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      0.35 ± 45%      -0.2        0.14 ± 71%  perf-profile.children.cycles-pp.khugepaged
      0.34 ± 46%      -0.2        0.14 ± 71%  perf-profile.children.cycles-pp.hpage_collapse_scan_pmd
      0.34 ± 46%      -0.2        0.14 ± 71%  perf-profile.children.cycles-pp.khugepaged_scan_mm_slot
      0.34 ± 47%      -0.2        0.14 ± 72%  perf-profile.children.cycles-pp.collapse_huge_page
      1.21 ± 10%      -0.2        1.01 ±  8%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.82 ±  9%      -0.1        0.68 ± 10%  perf-profile.children.cycles-pp.update_process_times
      0.41 ±  8%      -0.1        0.29 ± 22%  perf-profile.children.cycles-pp.btrfs_check_data_free_space
      0.21 ±  7%      -0.1        0.11 ± 73%  perf-profile.children.cycles-pp.copy_mc_enhanced_fast_string
      0.55 ± 11%      -0.1        0.46 ± 14%  perf-profile.children.cycles-pp.__set_extent_bit
      0.33 ±  9%      -0.1        0.28 ±  8%  perf-profile.children.cycles-pp.nfs_request_add_commit_list
      0.17 ±  9%      -0.0        0.13 ± 16%  perf-profile.children.cycles-pp.readn
      0.08 ± 13%      -0.0        0.06 ± 14%  perf-profile.children.cycles-pp.load_elf_interp
      1.00 ± 16%      +1.2        2.18 ± 53%  perf-profile.children.cycles-pp.folio_batch_move_lru
      0.21 ±  8%      -0.1        0.11 ± 73%  perf-profile.self.cycles-pp.copy_mc_enhanced_fast_string
      0.05 ± 49%      +0.1        0.15 ± 61%  perf-profile.self.cycles-pp.nfs_update_folio
      0.94 ±  5%      +0.2        1.11 ±  4%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
      0.25 ± 17%      +0.4        0.63 ± 61%  perf-profile.self.cycles-pp.nfs_page_async_flush




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [cel:nfsd-testing] [nfsd] ab62726202: fsmark.app_overhead 186.4% regression
  2024-12-27  9:13 [cel:nfsd-testing] [nfsd] ab62726202: fsmark.app_overhead 186.4% regression kernel test robot
@ 2024-12-28 17:32 ` Chuck Lever
  2024-12-28 18:17   ` Jeff Layton
  2024-12-31  2:50   ` Oliver Sang
  0 siblings, 2 replies; 4+ messages in thread
From: Chuck Lever @ 2024-12-28 17:32 UTC (permalink / raw)
  To: kernel test robot, NeilBrown; +Cc: oe-lkp, lkp, Jeff Layton, linux-nfs

On 12/27/24 4:13 AM, kernel test robot wrote:
> 
> 
> Hello,
> 
> kernel test robot noticed a 186.4% regression of fsmark.app_overhead on:
> (but no diff for fsmark.files_per_sec as below (a))

Hello Oliver -

I have questions about this test result.

Is this https://github.com/josefbacik/fs_mark ?

I don't understand what "app_overhead" is measuring. Is this "think
time"?

A more concerning regression might be:

        13.03 ±170%    +566.0%      86.78 ± 77%

perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.svc_tcp_sendto

But these metrics look like they improved:

         0.03 ± 56%     -73.4%       0.01 ±149%
perf-sched.sch_delay.avg.ms.btrfs_commit_transaction.btrfs_sync_file.nfsd_commit.nfsd4_commit
         0.05 ± 60%     -72.1%       0.02 ±165%
perf-sched.sch_delay.max.ms.btrfs_commit_transaction.btrfs_sync_file.nfsd_commit.nfsd4_commit

This is a quite mixed result, IMO -- I'm not convinced it's actionable.
Can someone help explain/analyze the metrics?


> commit: ab627262022ed8c6a68e619ed03a14e47acf2e39 ("nfsd: allocate new session-based DRC slots on demand.")
> https://git.kernel.org/cgit/linux/kernel/git/cel/linux nfsd-testing
> 
> testcase: fsmark
> config: x86_64-rhel-9.4
> compiler: gcc-12
> test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> parameters:
> 
> 	iterations: 1x
> 	nr_threads: 32t
> 	disk: 1HDD
> 	fs: btrfs
> 	fs2: nfsv4
> 	filesize: 16MB
> 	test_size: 15G
> 	sync_method: NoSync
> 	nr_directories: 16d
> 	nr_files_per_directory: 256fpd
> 	cpufreq_governor: performance
> 
> 
> 
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202412271641.cfba5666-lkp@intel.com
> 
> 
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> 
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20241227/202412271641.cfba5666-lkp@intel.com
> 
> =========================================================================================
> compiler/cpufreq_governor/disk/filesize/fs2/fs/iterations/kconfig/nr_directories/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase:
>    gcc-12/performance/1HDD/16MB/nfsv4/btrfs/1x/x86_64-rhel-9.4/16d/256fpd/32t/debian-12-x86_64-20240206.cgz/NoSync/lkp-icl-2sp7/15G/fsmark
> 
> commit:
>    ccd01c7601 ("nfsd: add session slot count to /proc/fs/nfsd/clients/*/info")
>    ab62726202 ("nfsd: allocate new session-based DRC slots on demand.")
> 
> ccd01c76017847d1 ab627262022ed8c6a68e619ed03
> ---------------- ---------------------------
>           %stddev     %change         %stddev
>               \          |                \
>        5.48 ±  9%     +24.9%       6.85 ± 14%  sched_debug.cpu.nr_uninterruptible.stddev
>       12489           +11.1%      13876        uptime.idle
>   3.393e+08 ± 16%    +186.4%  9.717e+08 ±  9%  fsmark.app_overhead
>        6.40            +0.0%       6.40        fsmark.files_per_sec     <-------- (a)
>        6.00           +27.8%       7.67 ±  6%  fsmark.time.percent_of_cpu_this_job_got
>       72.33           +15.8%      83.79        iostat.cpu.idle
>       25.91 ±  3%     -44.3%      14.42 ± 11%  iostat.cpu.iowait
>       72.08           +11.6       83.64        mpstat.cpu.all.idle%
>       26.18 ±  3%     -11.6       14.58 ± 11%  mpstat.cpu.all.iowait%
>      153772 ±  5%     +19.1%     183126 ±  8%  meminfo.DirectMap4k
>      156099           +19.5%     186594        meminfo.Dirty
>      467358           -12.9%     406910 ±  2%  meminfo.Writeback
>       72.35           +15.8%      83.79        vmstat.cpu.id
>       25.90 ±  3%     -44.3%      14.41 ± 11%  vmstat.cpu.wa
>       17.61 ±  3%     -45.8%       9.55 ± 10%  vmstat.procs.b
>        5909 ±  2%      -6.2%       5545        vmstat.system.in
>        0.03 ± 56%     -73.4%       0.01 ±149%  perf-sched.sch_delay.avg.ms.btrfs_commit_transaction.btrfs_sync_file.nfsd_commit.nfsd4_commit
>        0.05 ± 60%     -72.1%       0.02 ±165%  perf-sched.sch_delay.max.ms.btrfs_commit_transaction.btrfs_sync_file.nfsd_commit.nfsd4_commit
>        0.07 ± 41%     +36.1%       0.10 ±  8%  perf-sched.sch_delay.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.btrfs_tree_read_lock_nested
>       13.03 ±170%    +566.0%      86.78 ± 77%  perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.svc_tcp_sendto
>      206.83 ± 14%     -31.5%     141.67 ±  6%  perf-sched.wait_and_delay.count.rpc_wait_bit_killable.__wait_on_bit.out_of_line_wait_on_bit.__rpc_execute
>        0.30 ± 62%     -82.1%       0.05 ±110%  perf-sched.wait_time.avg.ms.handle_reserve_ticket.__reserve_bytes.btrfs_reserve_data_bytes.btrfs_check_data_free_space
>        7.37 ±  4%     -15.8%       6.20 ±  4%  perf-stat.i.MPKI
>       44.13 ±  2%      -2.9       41.25 ±  2%  perf-stat.i.cache-miss-rate%
>      103.65 ±  2%     +17.9%     122.17 ±  8%  perf-stat.i.cpu-migrations
>      627.67 ±  3%     +25.4%     787.18 ±  6%  perf-stat.i.cycles-between-cache-misses
>        0.67            +3.7%       0.70        perf-stat.i.ipc
>        1.35            +2.2%       1.38        perf-stat.overall.cpi
>      373.39            +4.1%     388.79        perf-stat.overall.cycles-between-cache-misses
>        0.74            -2.1%       0.73        perf-stat.overall.ipc
>      102.89 ±  2%     +17.9%     121.32 ±  8%  perf-stat.ps.cpu-migrations
>       39054           +19.0%      46460 ±  2%  proc-vmstat.nr_dirty
>       15139            +2.2%      15476        proc-vmstat.nr_kernel_stack
>       45710            +1.9%      46570        proc-vmstat.nr_slab_unreclaimable
>      116900           -13.5%     101162        proc-vmstat.nr_writeback
>       87038           -18.2%      71185 ±  2%  proc-vmstat.nr_zone_write_pending
>     6949807            -3.8%    6688660        proc-vmstat.numa_hit
>     6882153            -3.8%    6622312        proc-vmstat.numa_local
>    13471776            -2.0%   13204489        proc-vmstat.pgalloc_normal
>      584292            +3.8%     606391 ±  3%  proc-vmstat.pgfault
>       25859            +9.8%      28392 ±  9%  proc-vmstat.pgreuse
>        2.02 ±  8%      -0.3        1.71 ±  5%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
>        1.86 ±  8%      -0.3        1.58 ±  6%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
>        3.42 ±  5%      -0.6        2.87 ±  5%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
>        2.96 ±  4%      -0.4        2.55 ±  5%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
>        0.35 ± 45%      -0.2        0.14 ± 71%  perf-profile.children.cycles-pp.khugepaged
>        0.34 ± 46%      -0.2        0.14 ± 71%  perf-profile.children.cycles-pp.hpage_collapse_scan_pmd
>        0.34 ± 46%      -0.2        0.14 ± 71%  perf-profile.children.cycles-pp.khugepaged_scan_mm_slot
>        0.34 ± 47%      -0.2        0.14 ± 72%  perf-profile.children.cycles-pp.collapse_huge_page
>        1.21 ± 10%      -0.2        1.01 ±  8%  perf-profile.children.cycles-pp.__hrtimer_run_queues
>        0.82 ±  9%      -0.1        0.68 ± 10%  perf-profile.children.cycles-pp.update_process_times
>        0.41 ±  8%      -0.1        0.29 ± 22%  perf-profile.children.cycles-pp.btrfs_check_data_free_space
>        0.21 ±  7%      -0.1        0.11 ± 73%  perf-profile.children.cycles-pp.copy_mc_enhanced_fast_string
>        0.55 ± 11%      -0.1        0.46 ± 14%  perf-profile.children.cycles-pp.__set_extent_bit
>        0.33 ±  9%      -0.1        0.28 ±  8%  perf-profile.children.cycles-pp.nfs_request_add_commit_list
>        0.17 ±  9%      -0.0        0.13 ± 16%  perf-profile.children.cycles-pp.readn
>        0.08 ± 13%      -0.0        0.06 ± 14%  perf-profile.children.cycles-pp.load_elf_interp
>        1.00 ± 16%      +1.2        2.18 ± 53%  perf-profile.children.cycles-pp.folio_batch_move_lru
>        0.21 ±  8%      -0.1        0.11 ± 73%  perf-profile.self.cycles-pp.copy_mc_enhanced_fast_string
>        0.05 ± 49%      +0.1        0.15 ± 61%  perf-profile.self.cycles-pp.nfs_update_folio
>        0.94 ±  5%      +0.2        1.11 ±  4%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
>        0.25 ± 17%      +0.4        0.63 ± 61%  perf-profile.self.cycles-pp.nfs_page_async_flush
> 
> 
> 
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
> 
> 


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [cel:nfsd-testing] [nfsd] ab62726202: fsmark.app_overhead 186.4% regression
  2024-12-28 17:32 ` Chuck Lever
@ 2024-12-28 18:17   ` Jeff Layton
  2024-12-31  2:50   ` Oliver Sang
  1 sibling, 0 replies; 4+ messages in thread
From: Jeff Layton @ 2024-12-28 18:17 UTC (permalink / raw)
  To: Chuck Lever, kernel test robot, NeilBrown; +Cc: oe-lkp, lkp, linux-nfs

On Sat, 2024-12-28 at 12:32 -0500, Chuck Lever wrote:
> On 12/27/24 4:13 AM, kernel test robot wrote:
> > 
> > 
> > Hello,
> > 
> > kernel test robot noticed a 186.4% regression of fsmark.app_overhead on:
> > (but no diff for fsmark.files_per_sec as below (a))
> 
> Hello Oliver -
> 
> I have questions about this test result.
> 
> Is this https://github.com/josefbacik/fs_mark ?
> 
> I don't understand what "app_overhead" is measuring. Is this "think
> time"?
> 
> A more concerning regression might be:
> 
>         13.03 ±170%    +566.0%      86.78 ± 77%
> 
> perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.svc_tcp_sendto
> 
> But these metrics look like they improved:
> 
>          0.03 ± 56%     -73.4%       0.01 ±149%
> perf-sched.sch_delay.avg.ms.btrfs_commit_transaction.btrfs_sync_file.nfsd_commit.nfsd4_commit
>          0.05 ± 60%     -72.1%       0.02 ±165%
> perf-sched.sch_delay.max.ms.btrfs_commit_transaction.btrfs_sync_file.nfsd_commit.nfsd4_commit
> 
> This is a quite mixed result, IMO -- I'm not convinced it's actionable.
> Can someone help explain/analyze the metrics?
> 

I've spent some time staring at fs_mark [1] with the similar regression
reported due to the OPEN_XOR_DELEGATION patches.

It forks a bunch of processes and does a bunch of create/write/fsync,
etc. in a directory hierarchy, based on the command-line options.

"App overhead" in that program consists of this:

        total_file_ops =
            creat_usec + total_write_usec + fsync_usec + avg_sync_usec +
            close_usec;
        app_overhead_usec = loop_usecs - total_file_ops;

IOW, it measures the timing of various calls into the kernel and
tallies the time they took. It then subtracts the total of those from
the total time in the loop, and that's the "App overhead".

The problem I had was that the "App overhead" covers quite a bit of the
code (some of which does kernel syscalls too!). My effort at bracketing
where the increase came from didn't give me reliable results. I meant
to get back to it, but haven't had the time recently.

[1]: https://github.com/josefbacik/fs_mark

> 
> > commit: ab627262022ed8c6a68e619ed03a14e47acf2e39 ("nfsd: allocate new session-based DRC slots on demand.")
> > https://git.kernel.org/cgit/linux/kernel/git/cel/linux nfsd-testing
> > 
> > testcase: fsmark
> > config: x86_64-rhel-9.4
> > compiler: gcc-12
> > test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> > parameters:
> > 
> > 	iterations: 1x
> > 	nr_threads: 32t
> > 	disk: 1HDD
> > 	fs: btrfs
> > 	fs2: nfsv4
> > 	filesize: 16MB
> > 	test_size: 15G
> > 	sync_method: NoSync
> > 	nr_directories: 16d
> > 	nr_files_per_directory: 256fpd
> > 	cpufreq_governor: performance
> > 
> > 
> > 
> > 
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > > Reported-by: kernel test robot <oliver.sang@intel.com>
> > > Closes: https://lore.kernel.org/oe-lkp/202412271641.cfba5666-lkp@intel.com
> > 
> > 
> > Details are as below:
> > -------------------------------------------------------------------------------------------------->
> > 
> > 
> > The kernel config and materials to reproduce are available at:
> > https://download.01.org/0day-ci/archive/20241227/202412271641.cfba5666-lkp@intel.com
> > 
> > =========================================================================================
> > compiler/cpufreq_governor/disk/filesize/fs2/fs/iterations/kconfig/nr_directories/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase:
> >    gcc-12/performance/1HDD/16MB/nfsv4/btrfs/1x/x86_64-rhel-9.4/16d/256fpd/32t/debian-12-x86_64-20240206.cgz/NoSync/lkp-icl-2sp7/15G/fsmark
> > 
> > commit:
> >    ccd01c7601 ("nfsd: add session slot count to /proc/fs/nfsd/clients/*/info")
> >    ab62726202 ("nfsd: allocate new session-based DRC slots on demand.")
> > 
> > ccd01c76017847d1 ab627262022ed8c6a68e619ed03
> > ---------------- ---------------------------
> >           %stddev     %change         %stddev
> >               \          |                \
> >        5.48 ±  9%     +24.9%       6.85 ± 14%  sched_debug.cpu.nr_uninterruptible.stddev
> >       12489           +11.1%      13876        uptime.idle
> >   3.393e+08 ± 16%    +186.4%  9.717e+08 ±  9%  fsmark.app_overhead
> >        6.40            +0.0%       6.40        fsmark.files_per_sec     <-------- (a)
> >        6.00           +27.8%       7.67 ±  6%  fsmark.time.percent_of_cpu_this_job_got
> >       72.33           +15.8%      83.79        iostat.cpu.idle
> >       25.91 ±  3%     -44.3%      14.42 ± 11%  iostat.cpu.iowait
> >       72.08           +11.6       83.64        mpstat.cpu.all.idle%
> >       26.18 ±  3%     -11.6       14.58 ± 11%  mpstat.cpu.all.iowait%
> >      153772 ±  5%     +19.1%     183126 ±  8%  meminfo.DirectMap4k
> >      156099           +19.5%     186594        meminfo.Dirty
> >      467358           -12.9%     406910 ±  2%  meminfo.Writeback
> >       72.35           +15.8%      83.79        vmstat.cpu.id
> >       25.90 ±  3%     -44.3%      14.41 ± 11%  vmstat.cpu.wa
> >       17.61 ±  3%     -45.8%       9.55 ± 10%  vmstat.procs.b
> >        5909 ±  2%      -6.2%       5545        vmstat.system.in
> >        0.03 ± 56%     -73.4%       0.01 ±149%  perf-sched.sch_delay.avg.ms.btrfs_commit_transaction.btrfs_sync_file.nfsd_commit.nfsd4_commit
> >        0.05 ± 60%     -72.1%       0.02 ±165%  perf-sched.sch_delay.max.ms.btrfs_commit_transaction.btrfs_sync_file.nfsd_commit.nfsd4_commit
> >        0.07 ± 41%     +36.1%       0.10 ±  8%  perf-sched.sch_delay.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.btrfs_tree_read_lock_nested
> >       13.03 ±170%    +566.0%      86.78 ± 77%  perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.svc_tcp_sendto
> >      206.83 ± 14%     -31.5%     141.67 ±  6%  perf-sched.wait_and_delay.count.rpc_wait_bit_killable.__wait_on_bit.out_of_line_wait_on_bit.__rpc_execute
> >        0.30 ± 62%     -82.1%       0.05 ±110%  perf-sched.wait_time.avg.ms.handle_reserve_ticket.__reserve_bytes.btrfs_reserve_data_bytes.btrfs_check_data_free_space
> >        7.37 ±  4%     -15.8%       6.20 ±  4%  perf-stat.i.MPKI
> >       44.13 ±  2%      -2.9       41.25 ±  2%  perf-stat.i.cache-miss-rate%
> >      103.65 ±  2%     +17.9%     122.17 ±  8%  perf-stat.i.cpu-migrations
> >      627.67 ±  3%     +25.4%     787.18 ±  6%  perf-stat.i.cycles-between-cache-misses
> >        0.67            +3.7%       0.70        perf-stat.i.ipc
> >        1.35            +2.2%       1.38        perf-stat.overall.cpi
> >      373.39            +4.1%     388.79        perf-stat.overall.cycles-between-cache-misses
> >        0.74            -2.1%       0.73        perf-stat.overall.ipc
> >      102.89 ±  2%     +17.9%     121.32 ±  8%  perf-stat.ps.cpu-migrations
> >       39054           +19.0%      46460 ±  2%  proc-vmstat.nr_dirty
> >       15139            +2.2%      15476        proc-vmstat.nr_kernel_stack
> >       45710            +1.9%      46570        proc-vmstat.nr_slab_unreclaimable
> >      116900           -13.5%     101162        proc-vmstat.nr_writeback
> >       87038           -18.2%      71185 ±  2%  proc-vmstat.nr_zone_write_pending
> >     6949807            -3.8%    6688660        proc-vmstat.numa_hit
> >     6882153            -3.8%    6622312        proc-vmstat.numa_local
> >    13471776            -2.0%   13204489        proc-vmstat.pgalloc_normal
> >      584292            +3.8%     606391 ±  3%  proc-vmstat.pgfault
> >       25859            +9.8%      28392 ±  9%  proc-vmstat.pgreuse
> >        2.02 ±  8%      -0.3        1.71 ±  5%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
> >        1.86 ±  8%      -0.3        1.58 ±  6%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
> >        3.42 ±  5%      -0.6        2.87 ±  5%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
> >        2.96 ±  4%      -0.4        2.55 ±  5%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
> >        0.35 ± 45%      -0.2        0.14 ± 71%  perf-profile.children.cycles-pp.khugepaged
> >        0.34 ± 46%      -0.2        0.14 ± 71%  perf-profile.children.cycles-pp.hpage_collapse_scan_pmd
> >        0.34 ± 46%      -0.2        0.14 ± 71%  perf-profile.children.cycles-pp.khugepaged_scan_mm_slot
> >        0.34 ± 47%      -0.2        0.14 ± 72%  perf-profile.children.cycles-pp.collapse_huge_page
> >        1.21 ± 10%      -0.2        1.01 ±  8%  perf-profile.children.cycles-pp.__hrtimer_run_queues
> >        0.82 ±  9%      -0.1        0.68 ± 10%  perf-profile.children.cycles-pp.update_process_times
> >        0.41 ±  8%      -0.1        0.29 ± 22%  perf-profile.children.cycles-pp.btrfs_check_data_free_space
> >        0.21 ±  7%      -0.1        0.11 ± 73%  perf-profile.children.cycles-pp.copy_mc_enhanced_fast_string
> >        0.55 ± 11%      -0.1        0.46 ± 14%  perf-profile.children.cycles-pp.__set_extent_bit
> >        0.33 ±  9%      -0.1        0.28 ±  8%  perf-profile.children.cycles-pp.nfs_request_add_commit_list
> >        0.17 ±  9%      -0.0        0.13 ± 16%  perf-profile.children.cycles-pp.readn
> >        0.08 ± 13%      -0.0        0.06 ± 14%  perf-profile.children.cycles-pp.load_elf_interp
> >        1.00 ± 16%      +1.2        2.18 ± 53%  perf-profile.children.cycles-pp.folio_batch_move_lru
> >        0.21 ±  8%      -0.1        0.11 ± 73%  perf-profile.self.cycles-pp.copy_mc_enhanced_fast_string
> >        0.05 ± 49%      +0.1        0.15 ± 61%  perf-profile.self.cycles-pp.nfs_update_folio
> >        0.94 ±  5%      +0.2        1.11 ±  4%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
> >        0.25 ± 17%      +0.4        0.63 ± 61%  perf-profile.self.cycles-pp.nfs_page_async_flush
> > 
> > 
> > 
> > 
> > Disclaimer:
> > Results have been estimated based on internal Intel analysis and are provided
> > for informational purposes only. Any difference in system hardware or software
> > design or configuration may affect actual performance.
> > 
> > 
> 
> 

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [cel:nfsd-testing] [nfsd] ab62726202: fsmark.app_overhead 186.4% regression
  2024-12-28 17:32 ` Chuck Lever
  2024-12-28 18:17   ` Jeff Layton
@ 2024-12-31  2:50   ` Oliver Sang
  1 sibling, 0 replies; 4+ messages in thread
From: Oliver Sang @ 2024-12-31  2:50 UTC (permalink / raw)
  To: Chuck Lever; +Cc: NeilBrown, oe-lkp, lkp, Jeff Layton, linux-nfs, oliver.sang

hi, Chuck Lever,

On Sat, Dec 28, 2024 at 12:32:55PM -0500, Chuck Lever wrote:
> On 12/27/24 4:13 AM, kernel test robot wrote:
> > 
> > 
> > Hello,
> > 
> > kernel test robot noticed a 186.4% regression of fsmark.app_overhead on:
> > (but no diff for fsmark.files_per_sec as below (a))
> 
> Hello Oliver -
> 
> I have questions about this test result.
> 
> Is this https://github.com/josefbacik/fs_mark ?

yes, we use fsmark test suite from this url.

> 
> I don't understand what "app_overhead" is measuring. Is this "think
> time"?

sorry, we just follow the guidance to enable the test.

as in https://download.01.org/0day-ci/archive/20241227/202412271641.cfba5666-lkp@intel.com/repro-script
we run the test by:

fs_mark -d /nfs/sdb1/1 -d /nfs/sdb1/2 -d /nfs/sdb1/3 -d /nfs/sdb1/4 -d /nfs/sdb1/5 -d /nfs/sdb1/6 -d /nfs/sdb1/7 -d /nfs/sdb1/8 -d /nfs/sdb1/9 -d /nfs/sdb1/10 -d /nfs/sdb1/11 -d /nfs/sdb1/12 -d /nfs/sdb1/13 -d /nfs/sdb1/14 -d /nfs/sdb1/15 -d /nfs/sdb1/16 -d /nfs/sdb1/17 -d /nfs/sdb1/18 -d /nfs/sdb1/19 -d /nfs/sdb1/20 -d /nfs/sdb1/21 -d /nfs/sdb1/22 -d /nfs/sdb1/23 -d /nfs/sdb1/24 -d /nfs/sdb1/25 -d /nfs/sdb1/26 -d /nfs/sdb1/27 -d /nfs/sdb1/28 -d /nfs/sdb1/29 -d /nfs/sdb1/30 -d /nfs/sdb1/31 -d /nfs/sdb1/32 -D 16 -N 256 -n 30 -L 1 -S 0 -s 16777216



below is an example of output (there is oneline to explain App overhead (1)):

2024-12-27 00:12:33 fs_mark -d /nfs/sdb1/1 -d /nfs/sdb1/2 -d /nfs/sdb1/3 -d /nfs/sdb1/4 -d /nfs/sdb1/5 -d /nfs/sdb1/6 -d /nfs/sdb1/7 -d /nfs/sdb1/8 -d /nfs/sdb1/9 -d /nfs/sdb1/10 -d /nfs/sdb1/11 -d /nfs/sdb1/12 -d /nfs/sdb1/13 -d /nfs/sdb1/14 -d /nfs/sdb1/15 -d /nfs/sdb1/16 -d /nfs/sdb1/17 -d /nfs/sdb1/18 -d /nfs/sdb1/19 -d /nfs/sdb1/20 -d /nfs/sdb1/21 -d /nfs/sdb1/22 -d /nfs/sdb1/23 -d /nfs/sdb1/24 -d /nfs/sdb1/25 -d /nfs/sdb1/26 -d /nfs/sdb1/27 -d /nfs/sdb1/28 -d /nfs/sdb1/29 -d /nfs/sdb1/30 -d /nfs/sdb1/31 -d /nfs/sdb1/32 -D 16 -N 256 -n 30 -L 1 -S 0 -s 16777216

#  fs_mark  -d  /nfs/sdb1/1  -d  /nfs/sdb1/2  -d  /nfs/sdb1/3  -d  /nfs/sdb1/4  -d  /nfs/sdb1/5  -d  /nfs/sdb1/6  -d  /nfs/sdb1/7  -d  /nfs/sdb1/8  -d  /nfs/sdb1/9  -d  /nfs/sdb1/10  -d  /nfs/sdb1/11  -d  /nfs/sdb1/12  -d  /nfs/sdb1/13  -d  /nfs/sdb1/14  -d  /nfs/sdb1/15  -d  /nfs/sdb1/16  -d  /nfs/sdb1/17  -d  /nfs/sdb1/18  -d  /nfs/sdb1/19  -d  /nfs/sdb1/20  -d  /nfs/sdb1/21  -d  /nfs/sdb1/22  -d  /nfs/sdb1/23  -d  /nfs/sdb1/24  -d  /nfs/sdb1/25  -d  /nfs/sdb1/26  -d  /nfs/sdb1/27  -d  /nfs/sdb1/28  -d  /nfs/sdb1/29  -d  /nfs/sdb1/30  -d  /nfs/sdb1/31  -d  /nfs/sdb1/32  -D  16  -N  256  -n  30  -L  1  -S  0  -s  16777216
#       Version 3.3, 32 thread(s) starting at Fri Dec 27 00:12:33 2024
#       Sync method: NO SYNC: Test does not issue sync() or fsync() calls.
#       Directories:  Round Robin between directories across 16 subdirectories with 256 files per subdirectory.
#       File names: 40 bytes long, (16 initial bytes of time stamp with 24 random bytes at end of name)
#       Files info: size 16777216 bytes, written with an IO size of 16384 bytes per write
#       App overhead is time in microseconds spent in the test not doing file writing related system calls.   <---- (1)

FSUse%        Count         Size    Files/sec     App Overhead
     7          960     16777216          6.4        920331167
Average Files/sec:          6.0
p50 Files/sec: 6
p90 Files/sec: 6
p99 Files/sec: 6


> 
> A more concerning regression might be:
> 
>        13.03 ±170%    +566.0%      86.78 ± 77%
> 
> perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.svc_tcp_sendto
> 
> But these metrics look like they improved:
> 
>         0.03 ± 56%     -73.4%       0.01 ±149%
> perf-sched.sch_delay.avg.ms.btrfs_commit_transaction.btrfs_sync_file.nfsd_commit.nfsd4_commit
>         0.05 ± 60%     -72.1%       0.02 ±165%
> perf-sched.sch_delay.max.ms.btrfs_commit_transaction.btrfs_sync_file.nfsd_commit.nfsd4_commit
> 
> This is a quite mixed result, IMO -- I'm not convinced it's actionable.
> Can someone help explain/analyze the metrics?
> 
> 
> > commit: ab627262022ed8c6a68e619ed03a14e47acf2e39 ("nfsd: allocate new session-based DRC slots on demand.")
> > https://git.kernel.org/cgit/linux/kernel/git/cel/linux nfsd-testing
> > 
> > testcase: fsmark
> > config: x86_64-rhel-9.4
> > compiler: gcc-12
> > test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> > parameters:
> > 
> > 	iterations: 1x
> > 	nr_threads: 32t
> > 	disk: 1HDD
> > 	fs: btrfs
> > 	fs2: nfsv4
> > 	filesize: 16MB
> > 	test_size: 15G
> > 	sync_method: NoSync
> > 	nr_directories: 16d
> > 	nr_files_per_directory: 256fpd
> > 	cpufreq_governor: performance
> > 
> > 
> > 
> > 
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <oliver.sang@intel.com>
> > | Closes: https://lore.kernel.org/oe-lkp/202412271641.cfba5666-lkp@intel.com
> > 
> > 
> > Details are as below:
> > -------------------------------------------------------------------------------------------------->
> > 
> > 
> > The kernel config and materials to reproduce are available at:
> > https://download.01.org/0day-ci/archive/20241227/202412271641.cfba5666-lkp@intel.com
> > 
> > =========================================================================================
> > compiler/cpufreq_governor/disk/filesize/fs2/fs/iterations/kconfig/nr_directories/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase:
> >    gcc-12/performance/1HDD/16MB/nfsv4/btrfs/1x/x86_64-rhel-9.4/16d/256fpd/32t/debian-12-x86_64-20240206.cgz/NoSync/lkp-icl-2sp7/15G/fsmark
> > 
> > commit:
> >    ccd01c7601 ("nfsd: add session slot count to /proc/fs/nfsd/clients/*/info")
> >    ab62726202 ("nfsd: allocate new session-based DRC slots on demand.")
> > 
> > ccd01c76017847d1 ab627262022ed8c6a68e619ed03
> > ---------------- ---------------------------
> >           %stddev     %change         %stddev
> >               \          |                \
> >        5.48 ±  9%     +24.9%       6.85 ± 14%  sched_debug.cpu.nr_uninterruptible.stddev
> >       12489           +11.1%      13876        uptime.idle
> >   3.393e+08 ± 16%    +186.4%  9.717e+08 ±  9%  fsmark.app_overhead
> >        6.40            +0.0%       6.40        fsmark.files_per_sec     <-------- (a)
> >        6.00           +27.8%       7.67 ±  6%  fsmark.time.percent_of_cpu_this_job_got
> >       72.33           +15.8%      83.79        iostat.cpu.idle
> >       25.91 ±  3%     -44.3%      14.42 ± 11%  iostat.cpu.iowait
> >       72.08           +11.6       83.64        mpstat.cpu.all.idle%
> >       26.18 ±  3%     -11.6       14.58 ± 11%  mpstat.cpu.all.iowait%
> >      153772 ±  5%     +19.1%     183126 ±  8%  meminfo.DirectMap4k
> >      156099           +19.5%     186594        meminfo.Dirty
> >      467358           -12.9%     406910 ±  2%  meminfo.Writeback
> >       72.35           +15.8%      83.79        vmstat.cpu.id
> >       25.90 ±  3%     -44.3%      14.41 ± 11%  vmstat.cpu.wa
> >       17.61 ±  3%     -45.8%       9.55 ± 10%  vmstat.procs.b
> >        5909 ±  2%      -6.2%       5545        vmstat.system.in
> >        0.03 ± 56%     -73.4%       0.01 ±149%  perf-sched.sch_delay.avg.ms.btrfs_commit_transaction.btrfs_sync_file.nfsd_commit.nfsd4_commit
> >        0.05 ± 60%     -72.1%       0.02 ±165%  perf-sched.sch_delay.max.ms.btrfs_commit_transaction.btrfs_sync_file.nfsd_commit.nfsd4_commit
> >        0.07 ± 41%     +36.1%       0.10 ±  8%  perf-sched.sch_delay.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.btrfs_tree_read_lock_nested
> >       13.03 ±170%    +566.0%      86.78 ± 77%  perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.svc_tcp_sendto
> >      206.83 ± 14%     -31.5%     141.67 ±  6%  perf-sched.wait_and_delay.count.rpc_wait_bit_killable.__wait_on_bit.out_of_line_wait_on_bit.__rpc_execute
> >        0.30 ± 62%     -82.1%       0.05 ±110%  perf-sched.wait_time.avg.ms.handle_reserve_ticket.__reserve_bytes.btrfs_reserve_data_bytes.btrfs_check_data_free_space
> >        7.37 ±  4%     -15.8%       6.20 ±  4%  perf-stat.i.MPKI
> >       44.13 ±  2%      -2.9       41.25 ±  2%  perf-stat.i.cache-miss-rate%
> >      103.65 ±  2%     +17.9%     122.17 ±  8%  perf-stat.i.cpu-migrations
> >      627.67 ±  3%     +25.4%     787.18 ±  6%  perf-stat.i.cycles-between-cache-misses
> >        0.67            +3.7%       0.70        perf-stat.i.ipc
> >        1.35            +2.2%       1.38        perf-stat.overall.cpi
> >      373.39            +4.1%     388.79        perf-stat.overall.cycles-between-cache-misses
> >        0.74            -2.1%       0.73        perf-stat.overall.ipc
> >      102.89 ±  2%     +17.9%     121.32 ±  8%  perf-stat.ps.cpu-migrations
> >       39054           +19.0%      46460 ±  2%  proc-vmstat.nr_dirty
> >       15139            +2.2%      15476        proc-vmstat.nr_kernel_stack
> >       45710            +1.9%      46570        proc-vmstat.nr_slab_unreclaimable
> >      116900           -13.5%     101162        proc-vmstat.nr_writeback
> >       87038           -18.2%      71185 ±  2%  proc-vmstat.nr_zone_write_pending
> >     6949807            -3.8%    6688660        proc-vmstat.numa_hit
> >     6882153            -3.8%    6622312        proc-vmstat.numa_local
> >    13471776            -2.0%   13204489        proc-vmstat.pgalloc_normal
> >      584292            +3.8%     606391 ±  3%  proc-vmstat.pgfault
> >       25859            +9.8%      28392 ±  9%  proc-vmstat.pgreuse
> >        2.02 ±  8%      -0.3        1.71 ±  5%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
> >        1.86 ±  8%      -0.3        1.58 ±  6%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
> >        3.42 ±  5%      -0.6        2.87 ±  5%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
> >        2.96 ±  4%      -0.4        2.55 ±  5%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
> >        0.35 ± 45%      -0.2        0.14 ± 71%  perf-profile.children.cycles-pp.khugepaged
> >        0.34 ± 46%      -0.2        0.14 ± 71%  perf-profile.children.cycles-pp.hpage_collapse_scan_pmd
> >        0.34 ± 46%      -0.2        0.14 ± 71%  perf-profile.children.cycles-pp.khugepaged_scan_mm_slot
> >        0.34 ± 47%      -0.2        0.14 ± 72%  perf-profile.children.cycles-pp.collapse_huge_page
> >        1.21 ± 10%      -0.2        1.01 ±  8%  perf-profile.children.cycles-pp.__hrtimer_run_queues
> >        0.82 ±  9%      -0.1        0.68 ± 10%  perf-profile.children.cycles-pp.update_process_times
> >        0.41 ±  8%      -0.1        0.29 ± 22%  perf-profile.children.cycles-pp.btrfs_check_data_free_space
> >        0.21 ±  7%      -0.1        0.11 ± 73%  perf-profile.children.cycles-pp.copy_mc_enhanced_fast_string
> >        0.55 ± 11%      -0.1        0.46 ± 14%  perf-profile.children.cycles-pp.__set_extent_bit
> >        0.33 ±  9%      -0.1        0.28 ±  8%  perf-profile.children.cycles-pp.nfs_request_add_commit_list
> >        0.17 ±  9%      -0.0        0.13 ± 16%  perf-profile.children.cycles-pp.readn
> >        0.08 ± 13%      -0.0        0.06 ± 14%  perf-profile.children.cycles-pp.load_elf_interp
> >        1.00 ± 16%      +1.2        2.18 ± 53%  perf-profile.children.cycles-pp.folio_batch_move_lru
> >        0.21 ±  8%      -0.1        0.11 ± 73%  perf-profile.self.cycles-pp.copy_mc_enhanced_fast_string
> >        0.05 ± 49%      +0.1        0.15 ± 61%  perf-profile.self.cycles-pp.nfs_update_folio
> >        0.94 ±  5%      +0.2        1.11 ±  4%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
> >        0.25 ± 17%      +0.4        0.63 ± 61%  perf-profile.self.cycles-pp.nfs_page_async_flush
> > 
> > 
> > 
> > 
> > Disclaimer:
> > Results have been estimated based on internal Intel analysis and are provided
> > for informational purposes only. Any difference in system hardware or software
> > design or configuration may affect actual performance.
> > 
> > 
> 
> 
> -- 
> Chuck Lever

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2024-12-31  2:52 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-27  9:13 [cel:nfsd-testing] [nfsd] ab62726202: fsmark.app_overhead 186.4% regression kernel test robot
2024-12-28 17:32 ` Chuck Lever
2024-12-28 18:17   ` Jeff Layton
2024-12-31  2:50   ` Oliver Sang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox