From: Chen Rong <rong.a.chen@intel.com>
To: Waiman Long <longman@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"IngoMolnar@shao2-debian" <mingo@redhat.com>,
	Will Deacon <will.deacon@arm.com>,
	Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org,
	"linux-alpha@vger.kernel.org"
	<linux-arm-kernel@lists.infradead.org>,
	linux-hexagon@vger.kernel.org, linux-ia64@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, linux-sh@vger.kernel.org,
	sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org,
	linux-arch@vger.kernel.org, x86@kernel.org,
	Arnd Bergmann <arnd@arndb.de>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Tim Chen <tim.c.chen@linux.intel.com>
Subject: Re: [PATCH-tip 00/22] locking/rwsem: Rework rwsem-xadd & enable new rwsem features
Date: Wed, 13 Feb 2019 17:19:36 +0800
Message-ID: <20190213091936.GA22829@shao2-debian>
In-Reply-To: <1549566446-27967-1-git-send-email-longman@redhat.com>

Hi all,

The kernel test robot reported a -64.1% regression in will-it-scale.per_thread_ops on IVB-desktop for v4.20-rc1.
The first bad commit is 9bc8039e715da3b53dbac89525323a9f2f69b7b5 ("mm: brk: downgrade mmap_sem to read when shrinking") by Yang Shi <yang.shi@linux.alibaba.com>
(https://lists.01.org/pipermail/lkp/2018-November/009335.html).

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-7/performance/x86_64-rhel-7.2/thread/100%/debian-x86_64-2018-04-03.cgz/lkp-ivb-d01/brk1/will-it-scale/0x20

commit: 
  85a06835f6 ("mm: mremap: downgrade mmap_sem to read when shrinking")
  9bc8039e71 ("mm: brk: downgrade mmap_sem to read when shrinking")

85a06835f6f1ba79 9bc8039e715da3b53dbac89525 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    196250 ±  8%     -64.1%      70494        will-it-scale.per_thread_ops
    127330 ± 19%     -98.0%       2525 ± 24%  will-it-scale.time.involuntary_context_switches
    727.50 ±  2%     -77.0%     167.25        will-it-scale.time.percent_of_cpu_this_job_got
      2141 ±  2%     -77.6%     479.12        will-it-scale.time.system_time
     50.48 ±  7%     -48.5%      25.98        will-it-scale.time.user_time
  34925294 ± 18%    +270.3%  1.293e+08 ±  4%  will-it-scale.time.voluntary_context_switches
   1570007 ±  8%     -64.1%     563958        will-it-scale.workload
      6435 ±  2%      -6.4%       6024        proc-vmstat.nr_shmem
      1298 ± 16%     -44.5%     721.00 ± 18%  proc-vmstat.pgactivate
      2341           +16.4%       2724        slabinfo.kmalloc-96.active_objs
      2341           +16.4%       2724        slabinfo.kmalloc-96.num_objs
      6346 ±150%     -87.8%     776.25 ±  9%  softirqs.NET_RX
    160107 ±  8%    +151.9%     403273        softirqs.SCHED
   1097999           -13.0%     955526        softirqs.TIMER
      5.50 ±  9%     -81.8%       1.00        vmstat.procs.r
    230700 ± 19%    +269.9%     853292 ±  4%  vmstat.system.cs
     26706 ±  3%     +15.7%      30910 ±  5%  vmstat.system.in
     11.24 ± 23%     +72.2       83.39        mpstat.cpu.idle%
      0.00 ±131%      +0.0        0.04 ± 99%  mpstat.cpu.iowait%
     86.32 ±  2%     -70.8       15.54        mpstat.cpu.sys%
      2.44 ±  7%      -1.4        1.04 ±  8%  mpstat.cpu.usr%
  20610709 ± 15%   +2376.0%  5.103e+08 ± 34%  cpuidle.C1.time
   3233399 ±  8%    +241.5%   11042785 ± 25%  cpuidle.C1.usage
  36172040 ±  6%    +931.3%   3.73e+08 ± 15%  cpuidle.C1E.time
    783605 ±  4%    +548.7%    5083041 ± 18%  cpuidle.C1E.usage
  28753819 ± 39%   +1054.5%  3.319e+08 ± 49%  cpuidle.C3.time
    283912 ± 25%    +688.4%    2238225 ± 34%  cpuidle.C3.usage
 1.507e+08 ± 47%    +292.3%  5.913e+08 ± 28%  cpuidle.C6.time
    339861 ± 37%    +549.7%    2208222 ± 24%  cpuidle.C6.usage
   2709719 ±  5%    +824.2%   25043444        cpuidle.POLL.time
  28602864 ± 18%    +173.7%   78276116 ± 10%  cpuidle.POLL.usage
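
For readers less familiar with this comparison format: the %change column appears to be the relative difference between the means of the two commits, while the mpstat.cpu.* rows (which are already percentages) seem to show an absolute difference in percentage points. The following is a minimal Python sketch that recomputes a few rows from the table above under that assumption. The helper names are illustrative and are not part of lkp-tests.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Absolute difference, for metrics that are already percentages."""
    return new - base


# Values copied from the table above (85a06835f6 vs 9bc8039e71).
per_thread_ops = pct_change(196250, 70494)
voluntary_cs = pct_change(34925294, 1.293e8)  # right column is rounded in the table
idle_points = point_delta(11.24, 83.39)

print(f"per_thread_ops: {per_thread_ops:+.1f}%")
print(f"voluntary_context_switches: {voluntary_cs:+.1f}%")
print(f"mpstat.cpu.idle%: {idle_points:+.1f} points")
```

Small differences from the printed table are expected where the table shows rounded values such as 1.293e+08.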


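We found that this patchset fixes the regression. Compared with the first bad commit 9bc8039e71, will-it-scale.per_thread_ops recovers from 71190 to 199232 (+180%). Both comparisons follow.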

tests: 1
testcase/path_params/tbox_group/run: will-it-scale/performance-thread-100%-brk1-ucode=0x20/lkp-ivb-d01

commit: 
  85a06835f6 ("mm: mremap: downgrade mmap_sem to read when shrinking")
  fb835fe7f0 ("locking/rwsem: Ensure an RT task will not spin on reader")

85a06835f6f1ba79  fb835fe7f0adbd7c2c074b98ec  
----------------  --------------------------  
         %stddev      change         %stddev
             \          |                \  
    120736 ± 22%        56%     188019 ±  6%  will-it-scale.time.involuntary_context_switches
      2126 ±  3%         4%       2215        will-it-scale.time.system_time
       722 ±  3%         4%        752        will-it-scale.time.percent_of_cpu_this_job_got
  36256485 ± 27%       -35%   23682989 ±  3%  will-it-scale.time.voluntary_context_switches
      3151 ±  9%        11%       3504        turbostat.Avg_MHz
    229285 ± 32%       -30%     160660 ±  3%  vmstat.system.cs
    120736 ± 22%        56%     188019 ±  6%  time.involuntary_context_switches
      2126 ±  3%         4%       2215        time.system_time
       722 ±  3%         4%        752        time.percent_of_cpu_this_job_got
  36256485 ± 27%       -35%   23682989 ±  3%  time.voluntary_context_switches
        23             643%        171 ±  3%  proc-vmstat.nr_zone_inactive_file
        23             643%        171 ±  3%  proc-vmstat.nr_inactive_file
      3664              12%       4121        proc-vmstat.nr_kernel_stack
      6392               6%       6785        proc-vmstat.nr_slab_unreclaimable
      9991                       10176        proc-vmstat.nr_slab_reclaimable
     63938                       62394        proc-vmstat.nr_zone_active_anon
     63938                       62394        proc-vmstat.nr_active_anon
    386388 ±  9%        -6%     362272        proc-vmstat.pgfree
    368296 ±  9%       -10%     333074        proc-vmstat.numa_hit
    368296 ±  9%       -10%     333074        proc-vmstat.numa_local
      5169 ± 13%       -28%       3745        proc-vmstat.nr_shmem
      1801 ± 21%       -83%        309        proc-vmstat.pgactivate
         0            1e+04      11441        latency_stats.avg.msleep.cpuinfo_open.proc_reg_open.do_dentry_open.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
     13165 ±222%     -1e+04          0        latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup_revalidate.lookup_fast.walk_component.link_path_walk.path_lookupat.filename_lookup
     22499 ±151%     -2e+04        657 ±  7%  latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.__lookup_slow.lookup_slow.walk_component.path_lookupat.filename_lookup
    117414 ±181%     -9e+04      24418 ± 44%  latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64
    666005 ±218%     -7e+05        198 ±141%  latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_access.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat.filename_lookup
   2600097 ±132%     -3e+06        572        latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat
  34391390 ±150%     -3e+07      21807 ±141%  latency_stats.avg.io_schedule.nfs_lock_and_join_requests.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
  34624774 ±149%     -3e+07      37668 ± 58%  latency_stats.avg.max
         0            1e+04      11441        latency_stats.max.msleep.cpuinfo_open.proc_reg_open.do_dentry_open.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
     22499 ±151%     -2e+04        657 ±  7%  latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.__lookup_slow.lookup_slow.walk_component.path_lookupat.filename_lookup
     37845 ±222%     -4e+04          0        latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup_revalidate.lookup_fast.walk_component.link_path_walk.path_lookupat.filename_lookup
     80096 ± 59%     -8e+04          0        latency_stats.max.call_rwsem_down_write_failed_killable.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
    177149 ±195%     -2e+05      24418 ± 44%  latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64
    689417 ±209%     -7e+05        200 ±141%  latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_access.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat.filename_lookup
  18679699 ±129%     -2e+07        656        latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat
  83587334 ±129%     -8e+07      43457 ±141%  latency_stats.max.io_schedule.nfs_lock_and_join_requests.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
  84867236 ±126%     -8e+07      59318 ± 86%  latency_stats.max.max
         0            1e+04      11441        latency_stats.sum.msleep.cpuinfo_open.proc_reg_open.do_dentry_open.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
     22499 ±151%     -2e+04        657 ±  7%  latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.__lookup_slow.lookup_slow.walk_component.path_lookupat.filename_lookup
     39431 ±222%     -4e+04          0        latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup_revalidate.lookup_fast.walk_component.link_path_walk.path_lookupat.filename_lookup
    216448 ±200%     -2e+05      24418 ± 44%  latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64
    691960 ±208%     -7e+05        397 ±141%  latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_access.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat.filename_lookup
  24239011 ±140%     -2e+07       4768 ± 10%  latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat
 1.771e+08 ±122%     -2e+08      43614 ±141%  latency_stats.sum.io_schedule.nfs_lock_and_join_requests.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
 1.939e+08 ± 36%     -2e+08          0        latency_stats.sum.call_rwsem_down_write_failed_killable.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
 2.943e+08 ± 51%     -2e+08   51929782        latency_stats.sum.max
    407463 ± 10%      -100%          0        perf-stat.total.page-faults
  74225651 ± 26%      -100%          0        perf-stat.total.context-switches
     55293 ± 25%      -100%          0        perf-stat.total.cpu-migrations
    407463 ± 10%      -100%          0        perf-stat.total.minor-faults
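
To post-process tables like the one above, each row can be split into its base value, optional stddev, optional change, new value, and metric name. The following is a rough Python sketch, assuming the column layout seen in this mail. The regular expression and the `parse_row` helper are hypothetical and are not taken from lkp-tests.

```python
import re
from typing import Optional

# A plain or scientific-notation number such as 120736, 49.95 or 1.771e+08.
NUM = r"\d+(?:\.\d+)?(?:e[+-]?\d+)?"

ROW = re.compile(
    rf"^\s*(?P<base>{NUM})(?:\s*±\s*(?P<base_sd>\d+)%)?"
    rf"(?:\s+(?P<change>[+-]?{NUM}%?))?"
    rf"\s+(?P<new>{NUM})(?:\s*±\s*(?P<new_sd>\d+)%)?"
    rf"\s+(?P<metric>\S+)\s*$"
)


def parse_row(line: str) -> Optional[dict]:
    """Return the fields of one comparison row, or None for non-data lines."""
    m = ROW.match(line)
    if m is None:
        return None
    row = m.groupdict()
    row["base"] = float(row["base"])
    row["new"] = float(row["new"])
    return row


sample = ("    120736 ± 22%        56%     188019 ±  6%  "
          "will-it-scale.time.involuntary_context_switches")
print(parse_row(sample))
```

Rows without a change column (where the difference is presumably too small to report) come back with `change` set to `None`.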


tests: 1
testcase/path_params/tbox_group/run: will-it-scale/performance-thread-100%-brk1-ucode=0x20/lkp-ivb-d01

commit: 
  9bc8039e71 ("mm: brk: downgrade mmap_sem to read when shrinking")
  fb835fe7f0 ("locking/rwsem: Ensure an RT task will not spin on reader")

9bc8039e715da3b5  fb835fe7f0adbd7c2c074b98ec  
----------------  --------------------------  
         %stddev      change         %stddev
             \          |                \  
      3500 ± 36%      5272%     188019 ±  6%  will-it-scale.time.involuntary_context_switches
       483             358%       2215        will-it-scale.time.system_time
       168             346%        752        will-it-scale.time.percent_of_cpu_this_job_got
     71190             180%     199232 ±  4%  will-it-scale.per_thread_ops
    569524             180%    1593862 ±  4%  will-it-scale.workload
     25.85              93%      49.95 ±  3%  will-it-scale.time.user_time
 1.314e+08 ±  3%       -82%   23682989 ±  3%  will-it-scale.time.voluntary_context_switches
     30501 ±  9%       -15%      25813 ±  4%  vmstat.system.in
    799593 ± 10%       -80%     160660 ±  3%  vmstat.system.cs
       887 ± 11%       295%       3504        turbostat.Avg_MHz
     23.60 ± 10%        68%      39.54        turbostat.CorWatt
     28.38 ±  8%        57%      44.43        turbostat.PkgWatt
      3500 ± 36%      5272%     188019 ±  6%  time.involuntary_context_switches
       483             358%       2215        time.system_time
       168             346%        752        time.percent_of_cpu_this_job_got
     25.85              93%      49.95 ±  3%  time.user_time
 1.314e+08 ±  3%       -82%   23682989 ±  3%  time.voluntary_context_switches
         0 ± 44%     46220%        386        proc-vmstat.nr_zone_active_file
         0 ± 44%     46220%        386        proc-vmstat.nr_active_file
        23             643%        171 ±  3%  proc-vmstat.nr_zone_inactive_file
        23             643%        171 ±  3%  proc-vmstat.nr_inactive_file
      3690              12%       4121        proc-vmstat.nr_kernel_stack
      6419               6%       6785        proc-vmstat.nr_slab_unreclaimable
      9961                       10176        proc-vmstat.nr_slab_reclaimable
    229251                      231278        proc-vmstat.nr_zone_unevictable
    229251                      231278        proc-vmstat.nr_unevictable
      1008                        1005        proc-vmstat.nr_page_table_pages
     63178                       62394        proc-vmstat.nr_zone_active_anon
     63178                       62394        proc-vmstat.nr_active_anon
    432061 ± 12%       -11%     385372        proc-vmstat.pgfault
    408099 ± 10%       -11%     362272        proc-vmstat.pgfree
    422206 ±  9%       -11%     373690        proc-vmstat.pgalloc_normal
    382357 ± 11%       -13%     333074        proc-vmstat.numa_hit
    382357 ± 11%       -13%     333074        proc-vmstat.numa_local
      4428 ± 17%       -15%       3745        proc-vmstat.nr_shmem
         0            1e+04      11441        latency_stats.avg.msleep.cpuinfo_open.proc_reg_open.do_dentry_open.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
     11180 ±168%     -1e+04        657 ±  7%  latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.__lookup_slow.lookup_slow.walk_component.path_lookupat.filename_lookup
     19239 ±223%     -2e+04          0        latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_get_acl.get_acl.posix_acl_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open
     63702 ±169%     -4e+04      24418 ± 44%  latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64
     77617 ±205%     -8e+04        510 ± 11%  latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
   3043762 ±124%     -3e+06        572        latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat
  11630441 ±139%     -1e+07      21807 ±141%  latency_stats.avg.io_schedule.nfs_lock_and_join_requests.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
  12242832 ±129%     -1e+07      37668 ± 58%  latency_stats.avg.max
         0            1e+04      11441        latency_stats.max.msleep.cpuinfo_open.proc_reg_open.do_dentry_open.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
     11180 ±168%     -1e+04        657 ±  7%  latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.__lookup_slow.lookup_slow.walk_component.path_lookupat.filename_lookup
     19239 ±223%     -2e+04          0        latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_get_acl.get_acl.posix_acl_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open
     29152 ± 11%     -3e+04          0        latency_stats.max.call_rwsem_down_write_failed_killable.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
     65909 ±164%     -4e+04      24418 ± 44%  latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64
     77617 ±205%     -8e+04        510 ± 11%  latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
  17301268 ±125%     -2e+07        656        latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat
  44248611 ±140%     -4e+07      43457 ±141%  latency_stats.max.io_schedule.nfs_lock_and_join_requests.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
  46380610 ±130%     -5e+07      59318 ± 86%  latency_stats.max.max
         0            1e+04      11441        latency_stats.sum.msleep.cpuinfo_open.proc_reg_open.do_dentry_open.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
     11180 ±168%     -1e+04        657 ±  7%  latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.__lookup_slow.lookup_slow.walk_component.path_lookupat.filename_lookup
     19239 ±223%     -2e+04          0        latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_get_acl.get_acl.posix_acl_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open
     74047 ±148%     -5e+04      24418 ± 44%  latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64
     77617 ±205%     -8e+04        510 ± 11%  latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
  26043088 ±130%     -3e+07       4768 ± 10%  latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat
  82480038 ±152%     -8e+07      43614 ±141%  latency_stats.sum.io_schedule.nfs_lock_and_join_requests.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
 1.771e+09           -2e+09   51929782        latency_stats.sum.max
 1.771e+09           -2e+09          0        latency_stats.sum.call_rwsem_down_write_failed_killable.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
    420016 ± 12%      -100%          0        perf-stat.total.page-faults
 2.648e+08 ±  3%      -100%          0        perf-stat.total.context-switches
     52212 ± 18%      -100%          0        perf-stat.total.cpu-migrations
    420016 ± 12%      -100%          0        perf-stat.total.minor-faults
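
As a quick sanity check on the recovery, the patched throughput can be compared with both the bad commit and the original parent commit. This is only a sketch: it mixes the parent's per_thread_ops mean from the first table with the values from the table above, which may come from different runs, so the second figure is indicative only.

```python
base_ops = 196250     # 85a06835f6, first table (reported with ± 8% stddev)
bad_ops = 71190       # 9bc8039e71, table above
patched_ops = 199232  # fb835fe7f0, table above (reported with ± 4% stddev)

# Improvement of the patchset over the first bad commit.
recovery_vs_bad = (patched_ops - bad_ops) / bad_ops * 100.0

# Remaining difference between the patchset and the pre-regression parent.
delta_vs_base = (patched_ops - base_ops) / base_ops * 100.0

print(f"patched vs bad commit:    {recovery_vs_bad:+.1f}%")
print(f"patched vs parent commit: {delta_vs_base:+.1f}%")
```

The second difference is well inside the reported stddev of either measurement, which is consistent with per_thread_ops not being listed in the parent-versus-patchset comparison.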

Best Regards,
Rong Chen

     71190             180%     199232 ±  4%  will-it-scale.per_thread_ops
    569524             180%    1593862 ±  4%  will-it-scale.workload
     25.85              93%      49.95 ±  3%  will-it-scale.time.user_time
 1.314e+08 ±  3%       -82%   23682989 ±  3%  will-it-scale.time.voluntary_context_switches
     30501 ±  9%       -15%      25813 ±  4%  vmstat.system.in
    799593 ± 10%       -80%     160660 ±  3%  vmstat.system.cs
       887 ± 11%       295%       3504        turbostat.Avg_MHz
     23.60 ± 10%        68%      39.54        turbostat.CorWatt
     28.38 ±  8%        57%      44.43        turbostat.PkgWatt
      3500 ± 36%      5272%     188019 ±  6%  time.involuntary_context_switches
       483             358%       2215        time.system_time
       168             346%        752        time.percent_of_cpu_this_job_got
     25.85              93%      49.95 ±  3%  time.user_time
 1.314e+08 ±  3%       -82%   23682989 ±  3%  time.voluntary_context_switches
         0 ± 44%     46220%        386        proc-vmstat.nr_zone_active_file
         0 ± 44%     46220%        386        proc-vmstat.nr_active_file
        23             643%        171 ±  3%  proc-vmstat.nr_zone_inactive_file
        23             643%        171 ±  3%  proc-vmstat.nr_inactive_file
      3690              12%       4121        proc-vmstat.nr_kernel_stack
      6419               6%       6785        proc-vmstat.nr_slab_unreclaimable
      9961                       10176        proc-vmstat.nr_slab_reclaimable
    229251                      231278        proc-vmstat.nr_zone_unevictable
    229251                      231278        proc-vmstat.nr_unevictable
      1008                        1005        proc-vmstat.nr_page_table_pages
     63178                       62394        proc-vmstat.nr_zone_active_anon
     63178                       62394        proc-vmstat.nr_active_anon
    432061 ± 12%       -11%     385372        proc-vmstat.pgfault
    408099 ± 10%       -11%     362272        proc-vmstat.pgfree
    422206 ±  9%       -11%     373690        proc-vmstat.pgalloc_normal
    382357 ± 11%       -13%     333074        proc-vmstat.numa_hit
    382357 ± 11%       -13%     333074        proc-vmstat.numa_local
      4428 ± 17%       -15%       3745        proc-vmstat.nr_shmem
         0            1e+04      11441        latency_stats.avg.msleep.cpuinfo_open.proc_reg_open.do_dentry_open.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
     11180 ±168%     -1e+04        657 ±  7%  latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.__lookup_slow.lookup_slow.walk_component.path_lookupat.filename_lookup
     19239 ±223%     -2e+04          0        latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_get_acl.get_acl.posix_acl_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open
     63702 ±169%     -4e+04      24418 ± 44%  latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64
     77617 ±205%     -8e+04        510 ± 11%  latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
   3043762 ±124%     -3e+06        572        latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat
  11630441 ±139%     -1e+07      21807 ±141%  latency_stats.avg.io_schedule.nfs_lock_and_join_requests.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
  12242832 ±129%     -1e+07      37668 ± 58%  latency_stats.avg.max
         0            1e+04      11441        latency_stats.max.msleep.cpuinfo_open.proc_reg_open.do_dentry_open.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
     11180 ±168%     -1e+04        657 ±  7%  latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.__lookup_slow.lookup_slow.walk_component.path_lookupat.filename_lookup
     19239 ±223%     -2e+04          0        latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_get_acl.get_acl.posix_acl_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open
     29152 ± 11%     -3e+04          0        latency_stats.max.call_rwsem_down_write_failed_killable.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
     65909 ±164%     -4e+04      24418 ± 44%  latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64
     77617 ±205%     -8e+04        510 ± 11%  latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
  17301268 ±125%     -2e+07        656        latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat
  44248611 ±140%     -4e+07      43457 ±141%  latency_stats.max.io_schedule.nfs_lock_and_join_requests.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
  46380610 ±130%     -5e+07      59318 ± 86%  latency_stats.max.max
         0            1e+04      11441        latency_stats.sum.msleep.cpuinfo_open.proc_reg_open.do_dentry_open.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
     11180 ±168%     -1e+04        657 ±  7%  latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.__lookup_slow.lookup_slow.walk_component.path_lookupat.filename_lookup
     19239 ±223%     -2e+04          0        latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_get_acl.get_acl.posix_acl_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open
     74047 ±148%     -5e+04      24418 ± 44%  latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64
     77617 ±205%     -8e+04        510 ± 11%  latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
  26043088 ±130%     -3e+07       4768 ± 10%  latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat
  82480038 ±152%     -8e+07      43614 ±141%  latency_stats.sum.io_schedule.nfs_lock_and_join_requests.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
 1.771e+09           -2e+09   51929782        latency_stats.sum.max
 1.771e+09           -2e+09          0        latency_stats.sum.call_rwsem_down_write_failed_killable.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
    420016 ± 12%      -100%          0        perf-stat.total.page-faults
 2.648e+08 ±  3%      -100%          0        perf-stat.total.context-switches
     52212 ± 18%      -100%          0        perf-stat.total.cpu-migrations
    420016 ± 12%      -100%          0        perf-stat.total.minor-faults
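
For readers checking the comparison above: the "change" column appears to be the relative difference between the two commits' mean values, expressed as a percentage of the left-hand (base) column. The sketch below recomputes it for three rows copied from the table. The function name pct_change is hypothetical and is not part of the LKP tooling. Because the report prints rounded means, recomputed values can differ from the printed ones by a point or two on some rows (for example system_time). Rows whose base is 0 show an absolute difference such as 1e+04 instead of a percentage, so the sketch returns None for them.

```python
from typing import Optional


def pct_change(base: float, new: float) -> Optional[float]:
    """Relative change from base to new, in percent.

    Returns None when base is 0: a percentage is undefined there, and
    the table above prints an absolute difference for such rows.
    """
    if base == 0:
        return None
    return (new - base) / base * 100.0


if __name__ == "__main__":
    # (metric, base mean, new mean), copied from the 9bc8039e71 vs
    # fb835fe7f0 comparison above.
    rows = [
        ("will-it-scale.per_thread_ops", 71190, 199232),
        ("will-it-scale.time.involuntary_context_switches", 3500, 188019),
        ("vmstat.system.cs", 799593, 160660),
    ]
    for name, base, new in rows:
        change = pct_change(base, new)
        print(f"{name}: {change:.0f}%")
```

Under this reading, per_thread_ops with the patchset applied (199232) is about 180% above the regressed commit (71190).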

Best Regards,
Rong Chen

Thread overview: 181+ messages
2019-02-07 19:07 [PATCH-tip 00/22] locking/rwsem: Rework rwsem-xadd & enable new rwsem features Waiman Long
2019-02-07 19:07 ` [PATCH-tip 01/22] locking/qspinlock_stat: Introduce a generic lockevent counting APIs Waiman Long
2019-02-07 19:07 ` [PATCH-tip 02/22] locking/lock_events: Make lock_events available for all archs & other locks Waiman Long
2019-02-07 19:07 ` [PATCH-tip 03/22] locking/rwsem: Relocate rwsem_down_read_failed() Waiman Long
2019-02-07 19:07 ` [PATCH-tip 04/22] locking/rwsem: Remove arch specific rwsem files Waiman Long
2019-02-07 19:36   ` Peter Zijlstra
2019-02-07 19:43     ` Waiman Long
2019-02-07 19:48     ` Peter Zijlstra
2019-02-07 19:07 ` [PATCH-tip 05/22] locking/rwsem: Move owner setting code from rwsem.c to rwsem.h Waiman Long
2019-02-07 19:07 ` [PATCH-tip 06/22] locking/rwsem: Rename kernel/locking/rwsem.h Waiman Long
2019-02-07 19:07 ` [PATCH-tip 07/22] locking/rwsem: Move rwsem internal function declarations to rwsem-xadd.h Waiman Long
2019-02-07 19:07 ` [PATCH-tip 08/22] locking/rwsem: Add debug check for __down_read*() Waiman Long
2019-02-07 19:07 ` [PATCH-tip 09/22] locking/rwsem: Enhance DEBUG_RWSEMS_WARN_ON() macro Waiman Long
2019-02-07 19:07 ` [PATCH-tip 10/22] locking/rwsem: Enable lock event counting Waiman Long
2019-02-07 19:07 ` [PATCH-tip 11/22] locking/rwsem: Implement a new locking scheme Waiman Long
2019-02-07 19:07 ` [PATCH-tip 12/22] locking/rwsem: Implement lock handoff to prevent lock starvation Waiman Long
2019-02-07 19:07 ` [PATCH-tip 13/22] locking/rwsem: Remove rwsem_wake() wakeup optimization Waiman Long
2019-02-07 19:07 ` [PATCH-tip 14/22] locking/rwsem: Add more rwsem owner access helpers Waiman Long
2019-02-07 19:07 ` [PATCH-tip 15/22] locking/rwsem: Merge owner into count on x86-64 Waiman Long
2019-02-07 19:45   ` Peter Zijlstra
2019-02-07 19:55     ` Waiman Long
2019-02-07 20:08   ` Peter Zijlstra
2019-02-07 20:54     ` Waiman Long
2019-02-08 14:19       ` Waiman Long
2019-02-07 19:07 ` [PATCH-tip 16/22] locking/rwsem: Remove redundant computation of writer lock word Waiman Long
2019-02-07 19:07 ` [PATCH-tip 17/22] locking/rwsem: Recheck owner if it is not on cpu Waiman Long
2019-02-07 19:07 ` [PATCH-tip 18/22] locking/rwsem: Make rwsem_spin_on_owner() return a tri-state value Waiman Long
2019-02-07 19:07 ` [PATCH-tip 19/22] locking/rwsem: Enable readers spinning on writer Waiman Long
2019-02-07 19:07 ` [PATCH-tip 20/22] locking/rwsem: Enable count-based spinning on reader Waiman Long
2019-02-07 19:07 ` [PATCH-tip 21/22] locking/rwsem: Wake up all readers in wait queue Waiman Long
2019-02-07 19:07 ` [PATCH-tip 22/22] locking/rwsem: Ensure an RT task will not spin on reader Waiman Long
2019-02-07 19:51 ` [PATCH-tip 00/22] locking/rwsem: Rework rwsem-xadd & enable new rwsem features Davidlohr Bueso
2019-02-07 20:00   ` Waiman Long
2019-02-11  7:38     ` Ingo Molnar
2019-02-08 19:50 ` Linus Torvalds
2019-02-08 20:31   ` Waiman Long
2019-02-09  0:03     ` Linus Torvalds
2019-02-14 13:23     ` Davidlohr Bueso
2019-02-14 15:22       ` Waiman Long
2019-02-13  9:19 ` Chen Rong [this message]
2019-02-13 19:56   ` Linus Torvalds
2019-04-10  8:15     ` huang ying
2019-04-10 16:08       ` Waiman Long
2019-04-12  0:49         ` huang ying