All of lore.kernel.org
 help / color / mirror / Atom feed
From: Feng Tang <feng.tang@intel.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: "Sang, Oliver" <oliver.sang@intel.com>,
	Shakeel Butt <shakeelb@google.com>,
	"oe-lkp@lists.linux.dev" <oe-lkp@lists.linux.dev>,
	lkp <lkp@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Marek Szyprowski <m.szyprowski@samsung.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-trace-kernel@vger.kernel.org" 
	<linux-trace-kernel@vger.kernel.org>,
	"Huang, Ying" <ying.huang@intel.com>,
	"zhengjun.xing@linux.intel.com" <zhengjun.xing@linux.intel.com>,
	"Yin, Fengwei" <fengwei.yin@intel.com>
Subject: Re: [linus:master] [mm]  f1a7941243:  unixbench.score -19.2% regression
Date: Tue, 31 Jan 2023 14:11:58 +0800	[thread overview]
Message-ID: <Y9ixLimDNWk9Rf6o@feng-clx> (raw)
In-Reply-To: <Y9dETROtv9Bld9TI@casper.infradead.org>

On Mon, Jan 30, 2023 at 12:15:09PM +0800, Matthew Wilcox wrote:
> On Mon, Jan 30, 2023 at 10:32:56AM +0800, kernel test robot wrote:
> > FYI, we noticed a -19.2% regression of unixbench.score due to commit:
> > 
> > commit: f1a7941243c102a44e8847e3b94ff4ff3ec56f25 ("mm: convert mm's rss stats into percpu_counter")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > 
> > in testcase: unixbench
> > on test machine: 128 threads 4 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
> > with following parameters:
> > 
> > 	runtime: 300s
> > 	nr_task: 30%
> > 	test: spawn
> > 	cpufreq_governor: performance
> 
> ...
> 
> > 9cd6ffa60256e931 f1a7941243c102a44e8847e3b94 
> > ---------------- --------------------------- 
> >          %stddev     %change         %stddev
> >              \          |                \  
> >      11110           -19.2%       8974        unixbench.score
> >    1090843           -12.2%     957314        unixbench.time.involuntary_context_switches
> >    4243909 ±  6%     -32.4%    2867136 ±  5%  unixbench.time.major_page_faults
> >      10547           -12.6%       9216        unixbench.time.maximum_resident_set_size
> >  9.913e+08           -19.6%  7.969e+08        unixbench.time.minor_page_faults
> >       5638           +19.1%       6714        unixbench.time.system_time
> >       5502           -20.7%       4363        unixbench.time.user_time
> 
> So we're spending a lot more time in the kernel and correspondingly less
> time in userspace.
> 
> >   67991885           -16.9%   56507507        unixbench.time.voluntary_context_switches
> >   46198768           -19.1%   37355723        unixbench.workload
> >  1.365e+08           -12.5%  1.195e+08 ±  7%  cpuidle..usage
> >    1220612 ±  4%     -38.0%     757009 ± 28%  meminfo.Active
> >    1220354 ±  4%     -38.0%     756754 ± 28%  meminfo.Active(anon)
> >       0.50 ±  2%      -0.1        0.45 ±  4%  mpstat.cpu.all.soft%
> >       1.73            -0.2        1.52 ±  2%  mpstat.cpu.all.usr%
> >     532266           -18.4%     434559        vmstat.system.cs
> >     495826           -12.2%     435455 ±  8%  vmstat.system.in
> >   1.36e+08           -13.2%   1.18e+08 ±  9%  turbostat.C1
> >      68.80            +0.8       69.60        turbostat.C1%
> >  1.663e+08           -12.1%  1.462e+08 ±  8%  turbostat.IRQ
> >      15.54 ± 20%     -49.0%       7.93 ± 24%  sched_debug.cfs_rq:/.runnable_avg.min
> >      13.26 ± 19%     -46.6%       7.08 ± 29%  sched_debug.cfs_rq:/.util_avg.min
> >      48.96 ±  8%     +51.5%      74.20 ± 13%  sched_debug.cfs_rq:/.util_est_enqueued.avg
> >     138.00 ±  5%     +28.9%     177.87 ±  7%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
> >     228060 ±  3%     +13.3%     258413 ±  4%  sched_debug.cpu.avg_idle.stddev
> >     432533 ±  5%     -16.4%     361517 ±  4%  sched_debug.cpu.nr_switches.min
> >  2.665e+08           -18.9%  2.162e+08        numa-numastat.node0.local_node
> >  2.666e+08           -18.9%  2.163e+08        numa-numastat.node0.numa_hit
> >  2.746e+08           -20.9%  2.172e+08        numa-numastat.node1.local_node
> >  2.747e+08           -20.9%  2.172e+08        numa-numastat.node1.numa_hit
> >  2.602e+08           -17.4%  2.149e+08        numa-numastat.node2.local_node
> >  2.603e+08           -17.4%  2.149e+08        numa-numastat.node2.numa_hit
> >  2.423e+08           -15.0%   2.06e+08        numa-numastat.node3.local_node
> >  2.424e+08           -15.0%  2.061e+08        numa-numastat.node3.numa_hit
> 
> So we're going off-node a lot more for ... something.
> 
> >  2.666e+08           -18.9%  2.163e+08        numa-vmstat.node0.numa_hit
> >  2.665e+08           -18.9%  2.162e+08        numa-vmstat.node0.numa_local
> >  2.747e+08           -20.9%  2.172e+08        numa-vmstat.node1.numa_hit
> >  2.746e+08           -20.9%  2.172e+08        numa-vmstat.node1.numa_local
> >  2.603e+08           -17.4%  2.149e+08        numa-vmstat.node2.numa_hit
> >  2.602e+08           -17.4%  2.149e+08        numa-vmstat.node2.numa_local
> >  2.424e+08           -15.0%  2.061e+08        numa-vmstat.node3.numa_hit
> >  2.423e+08           -15.0%   2.06e+08        numa-vmstat.node3.numa_local
> >     304947 ±  4%     -38.0%     189144 ± 28%  proc-vmstat.nr_active_anon
> 
> Umm.  Are we running vmstat a lot during this test?  The commit says:

Thanks for the hint.

Right. IIUC, 0Day normally will run 'vmstat 1' and collet info, and
also read /proc/vmstat and /sys/devices/system/node/nodexx/numstat
per second.

>     At the
>     moment the readers are either procfs interface, oom_killer and memory
>     reclaim which I think are not performance critical and should be ok with
>     slow read.  However I think we can make that change in a separate patch.
> 
> This would explain the increased cross-NUMA references (we're going to
> the other nodes to collect the stats), and the general slowdown.  But I
> don't think it reflects a real workload; it's reflecting that the
> monitoring of this workload that we're doing is now more accurate and
> more expensive.

We tried to disable the 3 types of 'vmstat/numastat' monitors
mentioned above. And the regression is still there with
similar rate around 19%. 

Thanks,
Feng


      parent reply	other threads:[~2023-01-31  6:14 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-30  2:32 [linus:master] [mm] f1a7941243: unixbench.score -19.2% regression kernel test robot
2023-01-30  4:15 ` Matthew Wilcox
2023-01-31  5:23   ` Shakeel Butt
2023-01-31  5:45     ` Matthew Wilcox
2023-01-31  5:57       ` Shakeel Butt
2023-01-31 18:26         ` Shakeel Butt
2023-02-27  6:35           ` Yin, Fengwei
2023-02-27 16:50             ` Shakeel Butt
2023-02-28  0:32               ` Yin, Fengwei
2023-01-31  6:11   ` Feng Tang [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y9ixLimDNWk9Rf6o@feng-clx \
    --to=feng.tang@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=fengwei.yin@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=lkp@intel.com \
    --cc=m.szyprowski@samsung.com \
    --cc=oe-lkp@lists.linux.dev \
    --cc=oliver.sang@intel.com \
    --cc=shakeelb@google.com \
    --cc=willy@infradead.org \
    --cc=ying.huang@intel.com \
    --cc=zhengjun.xing@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.