Re: [cel:nfsd-testing] [nfsd] ab62726202: fsmark.app_overhead 186.4% regression

public inbox for linux-nfs@vger.kernel.org

From: Chuck Lever <chuck.lever@oracle.com>
To: kernel test robot <oliver.sang@intel.com>, NeilBrown <neilb@suse.de>
Cc: oe-lkp@lists.linux.dev, lkp@intel.com,
	Jeff Layton <jlayton@kernel.org>,
	linux-nfs@vger.kernel.org
Subject: Re: [cel:nfsd-testing] [nfsd] ab62726202: fsmark.app_overhead 186.4% regression
Date: Sat, 28 Dec 2024 12:32:55 -0500
Message-ID: <eafd5b52-694c-4abb-8c2d-84094def4751@oracle.com>
In-Reply-To: <202412271641.cfba5666-lkp@intel.com>

On 12/27/24 4:13 AM, kernel test robot wrote:
> 
> 
> Hello,
> 
> kernel test robot noticed a 186.4% regression of fsmark.app_overhead on:
> (but no diff for fsmark.files_per_sec as below (a))

Hello Oliver -

I have questions about this test result.

Is this https://github.com/josefbacik/fs_mark ?

I don't understand what "app_overhead" is measuring. Is this "think
time"?

A more concerning regression might be:

        13.03 ±170%    +566.0%      86.78 ± 77%  perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.svc_tcp_sendto

But these metrics look like they improved:

         0.03 ± 56%     -73.4%       0.01 ±149%  perf-sched.sch_delay.avg.ms.btrfs_commit_transaction.btrfs_sync_file.nfsd_commit.nfsd4_commit
         0.05 ± 60%     -72.1%       0.02 ±165%  perf-sched.sch_delay.max.ms.btrfs_commit_transaction.btrfs_sync_file.nfsd_commit.nfsd4_commit

This is quite a mixed result, IMO -- I'm not convinced it's actionable.
Can someone help explain/analyze the metrics?
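
One rough way to separate signal from noise in the rows cited above,
sketched in Python. It assumes the ± columns are the %stddev values that
the report's table header names, and it treats a change as suspect when
the "mean ± one stddev" ranges of the two kernels overlap. The overlap
rule and the helper names are illustrative assumptions, not the method
the kernel test robot uses to flag regressions.

```python
# Crude noise check: do the mean +/- stddev ranges of two runs overlap?
# The rule itself is an assumption for illustration, not LKP's method.

def interval(mean, stddev_pct):
    """Return (low, high) for mean +/- stddev, with stddev given in percent."""
    half = abs(mean) * stddev_pct / 100.0
    return mean - half, mean + half


def overlaps(a, b):
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# (base mean, base %stddev, new mean, new %stddev), copied from the report.
rows = {
    "fsmark.app_overhead": (3.393e08, 16, 9.717e08, 9),
    "wait_and_delay.avg.ms...svc_tcp_sendto": (13.03, 170, 86.78, 77),
    "sch_delay.avg.ms...nfsd4_commit": (0.03, 56, 0.01, 149),
}

for name, (base, base_sd, new, new_sd) in rows.items():
    noisy = overlaps(interval(base, base_sd), interval(new, new_sd))
    verdict = "within noise" if noisy else "outside noise"
    print(f"{name}: {verdict}")
```

By this crude test the fsmark.app_overhead shift stands clear of its
run-to-run variation, while the svc_tcp_sendto and nfsd4_commit rows do
not.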


> commit: ab627262022ed8c6a68e619ed03a14e47acf2e39 ("nfsd: allocate new session-based DRC slots on demand.")
> https://git.kernel.org/cgit/linux/kernel/git/cel/linux nfsd-testing
> 
> testcase: fsmark
> config: x86_64-rhel-9.4
> compiler: gcc-12
> test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> parameters:
> 
> 	iterations: 1x
> 	nr_threads: 32t
> 	disk: 1HDD
> 	fs: btrfs
> 	fs2: nfsv4
> 	filesize: 16MB
> 	test_size: 15G
> 	sync_method: NoSync
> 	nr_directories: 16d
> 	nr_files_per_directory: 256fpd
> 	cpufreq_governor: performance
> 
> 
> 
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add the following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202412271641.cfba5666-lkp@intel.com
> 
> 
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> 
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20241227/202412271641.cfba5666-lkp@intel.com
> 
> =========================================================================================
> compiler/cpufreq_governor/disk/filesize/fs2/fs/iterations/kconfig/nr_directories/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase:
>    gcc-12/performance/1HDD/16MB/nfsv4/btrfs/1x/x86_64-rhel-9.4/16d/256fpd/32t/debian-12-x86_64-20240206.cgz/NoSync/lkp-icl-2sp7/15G/fsmark
> 
> commit:
>    ccd01c7601 ("nfsd: add session slot count to /proc/fs/nfsd/clients/*/info")
>    ab62726202 ("nfsd: allocate new session-based DRC slots on demand.")
> 
> ccd01c76017847d1 ab627262022ed8c6a68e619ed03
> ---------------- ---------------------------
>           %stddev     %change         %stddev
>               \          |                \
>        5.48 ±  9%     +24.9%       6.85 ± 14%  sched_debug.cpu.nr_uninterruptible.stddev
>       12489           +11.1%      13876        uptime.idle
>   3.393e+08 ± 16%    +186.4%  9.717e+08 ±  9%  fsmark.app_overhead
>        6.40            +0.0%       6.40        fsmark.files_per_sec     <-------- (a)
>        6.00           +27.8%       7.67 ±  6%  fsmark.time.percent_of_cpu_this_job_got
>       72.33           +15.8%      83.79        iostat.cpu.idle
>       25.91 ±  3%     -44.3%      14.42 ± 11%  iostat.cpu.iowait
>       72.08           +11.6       83.64        mpstat.cpu.all.idle%
>       26.18 ±  3%     -11.6       14.58 ± 11%  mpstat.cpu.all.iowait%
>      153772 ±  5%     +19.1%     183126 ±  8%  meminfo.DirectMap4k
>      156099           +19.5%     186594        meminfo.Dirty
>      467358           -12.9%     406910 ±  2%  meminfo.Writeback
>       72.35           +15.8%      83.79        vmstat.cpu.id
>       25.90 ±  3%     -44.3%      14.41 ± 11%  vmstat.cpu.wa
>       17.61 ±  3%     -45.8%       9.55 ± 10%  vmstat.procs.b
>        5909 ±  2%      -6.2%       5545        vmstat.system.in
>        0.03 ± 56%     -73.4%       0.01 ±149%  perf-sched.sch_delay.avg.ms.btrfs_commit_transaction.btrfs_sync_file.nfsd_commit.nfsd4_commit
>        0.05 ± 60%     -72.1%       0.02 ±165%  perf-sched.sch_delay.max.ms.btrfs_commit_transaction.btrfs_sync_file.nfsd_commit.nfsd4_commit
>        0.07 ± 41%     +36.1%       0.10 ±  8%  perf-sched.sch_delay.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.btrfs_tree_read_lock_nested
>       13.03 ±170%    +566.0%      86.78 ± 77%  perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.svc_tcp_sendto
>      206.83 ± 14%     -31.5%     141.67 ±  6%  perf-sched.wait_and_delay.count.rpc_wait_bit_killable.__wait_on_bit.out_of_line_wait_on_bit.__rpc_execute
>        0.30 ± 62%     -82.1%       0.05 ±110%  perf-sched.wait_time.avg.ms.handle_reserve_ticket.__reserve_bytes.btrfs_reserve_data_bytes.btrfs_check_data_free_space
>        7.37 ±  4%     -15.8%       6.20 ±  4%  perf-stat.i.MPKI
>       44.13 ±  2%      -2.9       41.25 ±  2%  perf-stat.i.cache-miss-rate%
>      103.65 ±  2%     +17.9%     122.17 ±  8%  perf-stat.i.cpu-migrations
>      627.67 ±  3%     +25.4%     787.18 ±  6%  perf-stat.i.cycles-between-cache-misses
>        0.67            +3.7%       0.70        perf-stat.i.ipc
>        1.35            +2.2%       1.38        perf-stat.overall.cpi
>      373.39            +4.1%     388.79        perf-stat.overall.cycles-between-cache-misses
>        0.74            -2.1%       0.73        perf-stat.overall.ipc
>      102.89 ±  2%     +17.9%     121.32 ±  8%  perf-stat.ps.cpu-migrations
>       39054           +19.0%      46460 ±  2%  proc-vmstat.nr_dirty
>       15139            +2.2%      15476        proc-vmstat.nr_kernel_stack
>       45710            +1.9%      46570        proc-vmstat.nr_slab_unreclaimable
>      116900           -13.5%     101162        proc-vmstat.nr_writeback
>       87038           -18.2%      71185 ±  2%  proc-vmstat.nr_zone_write_pending
>     6949807            -3.8%    6688660        proc-vmstat.numa_hit
>     6882153            -3.8%    6622312        proc-vmstat.numa_local
>    13471776            -2.0%   13204489        proc-vmstat.pgalloc_normal
>      584292            +3.8%     606391 ±  3%  proc-vmstat.pgfault
>       25859            +9.8%      28392 ±  9%  proc-vmstat.pgreuse
>        2.02 ±  8%      -0.3        1.71 ±  5%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
>        1.86 ±  8%      -0.3        1.58 ±  6%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
>        3.42 ±  5%      -0.6        2.87 ±  5%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
>        2.96 ±  4%      -0.4        2.55 ±  5%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
>        0.35 ± 45%      -0.2        0.14 ± 71%  perf-profile.children.cycles-pp.khugepaged
>        0.34 ± 46%      -0.2        0.14 ± 71%  perf-profile.children.cycles-pp.hpage_collapse_scan_pmd
>        0.34 ± 46%      -0.2        0.14 ± 71%  perf-profile.children.cycles-pp.khugepaged_scan_mm_slot
>        0.34 ± 47%      -0.2        0.14 ± 72%  perf-profile.children.cycles-pp.collapse_huge_page
>        1.21 ± 10%      -0.2        1.01 ±  8%  perf-profile.children.cycles-pp.__hrtimer_run_queues
>        0.82 ±  9%      -0.1        0.68 ± 10%  perf-profile.children.cycles-pp.update_process_times
>        0.41 ±  8%      -0.1        0.29 ± 22%  perf-profile.children.cycles-pp.btrfs_check_data_free_space
>        0.21 ±  7%      -0.1        0.11 ± 73%  perf-profile.children.cycles-pp.copy_mc_enhanced_fast_string
>        0.55 ± 11%      -0.1        0.46 ± 14%  perf-profile.children.cycles-pp.__set_extent_bit
>        0.33 ±  9%      -0.1        0.28 ±  8%  perf-profile.children.cycles-pp.nfs_request_add_commit_list
>        0.17 ±  9%      -0.0        0.13 ± 16%  perf-profile.children.cycles-pp.readn
>        0.08 ± 13%      -0.0        0.06 ± 14%  perf-profile.children.cycles-pp.load_elf_interp
>        1.00 ± 16%      +1.2        2.18 ± 53%  perf-profile.children.cycles-pp.folio_batch_move_lru
>        0.21 ±  8%      -0.1        0.11 ± 73%  perf-profile.self.cycles-pp.copy_mc_enhanced_fast_string
>        0.05 ± 49%      +0.1        0.15 ± 61%  perf-profile.self.cycles-pp.nfs_update_folio
>        0.94 ±  5%      +0.2        1.11 ±  4%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
>        0.25 ± 17%      +0.4        0.63 ± 61%  perf-profile.self.cycles-pp.nfs_page_async_flush
> 
> 
> 
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
> 
> 
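
The %change column in the quoted table mixes two conventions, which a
short Python sketch makes concrete. Most rows give the relative change
from the base commit to the new one. Rows whose metric is already a
percentage (mpstat.cpu.all.idle%, perf-stat.i.cache-miss-rate%, the
perf-profile rows) instead give the plain difference in percentage
points and carry no % sign. The function names are illustrative, and
rounding to one decimal place is assumed from the printed values.

```python
# Reproduce the two kinds of "%change" entries seen in the quoted table.
# Function names and one-decimal rounding are assumptions for illustration.

def relative_change(base, new):
    """Relative change in percent, as in the rows printed with a % sign."""
    return round((new - base) / base * 100.0, 1)


def point_change(base, new):
    """Absolute difference, as in rows whose metric is itself a percentage."""
    return round(new - base, 1)


# Values copied from the report.
print(relative_change(3.393e08, 9.717e08))  # fsmark.app_overhead
print(relative_change(6.40, 6.40))          # fsmark.files_per_sec
print(relative_change(25.91, 14.42))        # iostat.cpu.iowait
print(point_change(72.08, 83.64))           # mpstat.cpu.all.idle%
print(point_change(44.13, 41.25))           # perf-stat.i.cache-miss-rate%
```

Read this way, iostat.cpu.iowait (-44.3%) and mpstat.cpu.all.iowait%
(-11.6) describe the same drop in two different units.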


-- 
Chuck Lever

