Re: [mm] 2d146aa3aa: vm-scalability.throughput -36.4% regression

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Johannes Weiner <hannes@cmpxchg.org>
To: lkp@lists.01.org
Subject: Re: [mm] 2d146aa3aa: vm-scalability.throughput -36.4% regression
Date: Wed, 11 Aug 2021 16:12:01 -0400	[thread overview]
Message-ID: <YRQvEbMSUqIkuMnk@cmpxchg.org> (raw)
In-Reply-To: <CAHk-=wiSHHSuSQsCCLOxQA+cbcvjmEeMsTCMWPT1sFVngd9-ig@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2927 bytes --]

On Tue, Aug 10, 2021 at 07:59:53PM -1000, Linus Torvalds wrote:
> On Tue, Aug 10, 2021 at 4:59 PM kernel test robot <oliver.sang@intel.com> wrote:
> >
> > FYI, we noticed a -36.4% regression of vm-scalability.throughput due to commit:
> > 2d146aa3aa84 ("mm: memcontrol: switch to rstat")
> 
> Hmm. I guess some cost is to be expected, but that's a big regression.
> 
> I'm not sure what the code ends up doing, and how relevant this test
> is, but Johannes - could you please take a look?
> 
> I can't make heads nor tails of the profile. The profile kind of points at this:
> 
> >       2.77 ą 12%     +27.4       30.19 ą  8%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
> >       2.86 ą 12%     +27.4       30.29 ą  8%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
> >       2.77 ą 12%     +27.4       30.21 ą  8%  perf-profile.children.cycles-pp.lock_page_lruvec_irqsave
> >       4.26 ą 10%     +28.1       32.32 ą  7%  perf-profile.children.cycles-pp.lru_cache_add
> >       4.15 ą 10%     +28.2       32.33 ą  7%  perf-profile.children.cycles-pp.__pagevec_lru_add
> 
> and that seems to be from the chain __do_fault -> shmem_fault ->
> shmem_getpage_gfp -> lru_cache_add -> __pagevec_lru_add ->
> lock_page_lruvec_irqsave -> _raw_spin_lock_irqsave ->
> native_queued_spin_lock_slowpath.
> 
> That shmem_fault codepath being hot may make sense for sokme VM
> scalability test. But it seems to make little sense when I look at the
> commit that it bisected to.
> 
> We had another report of this commit causing a much more reasonable
> small slowdown (-2.8%) back in May.
> 
> I'm not sure what's up with this new report. Johannes, does this make
> sense to you?
> 
> Is it perhaps another "unlucky cache line placement" thing? Or has the
> statistics changes just changed the behavior of the test?

I'm at a loss as well.

The patch only changes how we aggregate the cgroup's memory.stat file,
it doesn't influence reclaim/LRU operations.

The test itself isn't interacting with memory.stat either - IIRC it
doesn't even run inside a dedicated cgroup in this test
environment. The patch should actually reduce accounting overhead here
because we switched from batched percpu flushing during updates to
only flushing when the stats are *read* - which doesn't happen here.

That would leave cachelines. But the cachelines the patch touched are
in struct mem_cgroup, whereas the lock this profile points out is in a
separately allocated per-node structure. The cache footprint on the
percpu data this test is hammering also didn't increase; it actually
decreased a bit, but not sure where this could cause conflicts.

I'll try to reproduce this on a smaller setup. But I have to say, I've
seen a few of these bisection reports now that didn't seem to make any
sense, which is why I've started to take these with a grain of salt.

WARNING: multiple messages have this Message-ID (diff)

From: Johannes Weiner <hannes@cmpxchg.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "kernel test robot" <oliver.sang@intel.com>,
	"Roman Gushchin" <guro@fb.com>, "Michal Hocko" <mhocko@suse.com>,
	"Shakeel Butt" <shakeelb@google.com>,
	"Michal Koutný" <mkoutny@suse.com>,
	"Balbir Singh" <bsingharora@gmail.com>,
	"Tejun Heo" <tj@kernel.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	lkp@lists.01.org, "kernel test robot" <lkp@intel.com>,
	"Huang, Ying" <ying.huang@intel.com>,
	"Feng Tang" <feng.tang@intel.com>,
	"Zhengjun Xing" <zhengjun.xing@linux.intel.com>
Subject: Re: [mm] 2d146aa3aa: vm-scalability.throughput -36.4% regression
Date: Wed, 11 Aug 2021 16:12:01 -0400	[thread overview]
Message-ID: <YRQvEbMSUqIkuMnk@cmpxchg.org> (raw)
In-Reply-To: <CAHk-=wiSHHSuSQsCCLOxQA+cbcvjmEeMsTCMWPT1sFVngd9-ig@mail.gmail.com>

On Tue, Aug 10, 2021 at 07:59:53PM -1000, Linus Torvalds wrote:
> On Tue, Aug 10, 2021 at 4:59 PM kernel test robot <oliver.sang@intel.com> wrote:
> >
> > FYI, we noticed a -36.4% regression of vm-scalability.throughput due to commit:
> > 2d146aa3aa84 ("mm: memcontrol: switch to rstat")
> 
> Hmm. I guess some cost is to be expected, but that's a big regression.
> 
> I'm not sure what the code ends up doing, and how relevant this test
> is, but Johannes - could you please take a look?
> 
> I can't make heads nor tails of the profile. The profile kind of points at this:
> 
> >       2.77 ą 12%     +27.4       30.19 ą  8%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
> >       2.86 ą 12%     +27.4       30.29 ą  8%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
> >       2.77 ą 12%     +27.4       30.21 ą  8%  perf-profile.children.cycles-pp.lock_page_lruvec_irqsave
> >       4.26 ą 10%     +28.1       32.32 ą  7%  perf-profile.children.cycles-pp.lru_cache_add
> >       4.15 ą 10%     +28.2       32.33 ą  7%  perf-profile.children.cycles-pp.__pagevec_lru_add
> 
> and that seems to be from the chain __do_fault -> shmem_fault ->
> shmem_getpage_gfp -> lru_cache_add -> __pagevec_lru_add ->
> lock_page_lruvec_irqsave -> _raw_spin_lock_irqsave ->
> native_queued_spin_lock_slowpath.
> 
> That shmem_fault codepath being hot may make sense for sokme VM
> scalability test. But it seems to make little sense when I look at the
> commit that it bisected to.
> 
> We had another report of this commit causing a much more reasonable
> small slowdown (-2.8%) back in May.
> 
> I'm not sure what's up with this new report. Johannes, does this make
> sense to you?
> 
> Is it perhaps another "unlucky cache line placement" thing? Or has the
> statistics changes just changed the behavior of the test?

I'm at a loss as well.

The patch only changes how we aggregate the cgroup's memory.stat file,
it doesn't influence reclaim/LRU operations.

The test itself isn't interacting with memory.stat either - IIRC it
doesn't even run inside a dedicated cgroup in this test
environment. The patch should actually reduce accounting overhead here
because we switched from batched percpu flushing during updates to
only flushing when the stats are *read* - which doesn't happen here.

That would leave cachelines. But the cachelines the patch touched are
in struct mem_cgroup, whereas the lock this profile points out is in a
separately allocated per-node structure. The cache footprint on the
percpu data this test is hammering also didn't increase; it actually
decreased a bit, but not sure where this could cause conflicts.

I'll try to reproduce this on a smaller setup. But I have to say, I've
seen a few of these bisection reports now that didn't seem to make any
sense, which is why I've started to take these with a grain of salt.

next prev parent reply	other threads:[~2021-08-11 20:12 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-11  3:17 [mm] 2d146aa3aa: vm-scalability.throughput -36.4% regression kernel test robot
2021-08-11  3:17 ` kernel test robot
2021-08-11  5:59 ` Linus Torvalds
2021-08-11  5:59   ` Linus Torvalds
2021-08-11 20:12   ` Johannes Weiner [this message]
2021-08-11 20:12     ` Johannes Weiner
2021-08-12  3:19   ` Feng Tang
2021-08-12  3:19     ` Feng Tang
2021-08-16  3:28     ` Feng Tang
2021-08-16  3:28       ` Feng Tang
2021-08-16 21:41       ` Johannes Weiner
2021-08-16 21:41         ` Johannes Weiner
2021-08-17  2:45         ` Feng Tang
2021-08-17  2:45           ` Feng Tang
2021-08-17 16:47           ` Michal Koutný
2021-08-17 16:47             ` Michal Koutný
2021-08-17 17:10             ` Shakeel Butt
2021-08-17 17:10               ` Shakeel Butt
2021-08-18  2:30             ` Feng Tang
2021-08-18  2:30               ` Feng Tang
2021-08-30 14:51               ` Michal Koutný
2021-08-30 14:51                 ` Michal Koutný
2021-08-31  6:30                 ` Feng Tang
2021-08-31  6:30                   ` Feng Tang
2021-08-31  9:23                   ` Michal Koutný
2021-08-31  9:23                     ` Michal Koutný
2021-09-01  4:50                     ` Feng Tang
2021-09-01  4:50                       ` Feng Tang
2021-09-01 15:12                       ` Andi Kleen
2021-09-01 15:12                         ` Andi Kleen
2021-09-02  1:35                         ` Feng Tang
2021-09-02  1:35                           ` Feng Tang
2021-09-02  2:23                           ` Andi Kleen
2021-09-02  2:23                             ` Andi Kleen
2021-09-02  3:46                             ` Feng Tang
2021-09-02  3:46                               ` Feng Tang
2021-09-02 10:53                               ` Michal Koutný
2021-09-02 10:53                                 ` Michal Koutný
2021-09-02 13:39                                 ` Feng Tang
2021-09-02 13:39                                   ` Feng Tang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YRQvEbMSUqIkuMnk@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=lkp@lists.01.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.