All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Hansen <dave@sr71.net>
To: Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.com>, Hugh Dickins <hughd@google.com>,
	Tejun Heo <tj@kernel.org>,
	Vladimir Davydov <vdavydov@parallels.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>
Subject: regression caused by cgroups optimization in 3.17-rc2
Date: Tue, 02 Sep 2014 12:05:41 -0700	[thread overview]
Message-ID: <54061505.8020500@sr71.net> (raw)

I'm seeing a pretty large regression in 3.17-rc2 vs 3.16 coming from the
memory cgroups code.  This is on a kernel with cgroups enabled at
compile time, but not _used_ for anything.  See the green lines in the
graph:

	https://www.sr71.net/~dave/intel/regression-from-05b843012.png

The workload is a little parallel microbenchmark doing page faults:

> https://github.com/antonblanchard/will-it-scale/blob/master/tests/page_fault2.c

The hardware is an 8-socket Westmere box with 160 hardware threads.  For
some reason, this does not affect the version of the microbenchmark
which is doing completely anonymous page faults.

I bisected it down to this commit:

> commit 05b8430123359886ef6a4146fba384e30d771b3f
> Author: Johannes Weiner <hannes@cmpxchg.org>
> Date:   Wed Aug 6 16:05:59 2014 -0700
> 
>     mm: memcontrol: use root_mem_cgroup res_counter
>     
>     Due to an old optimization to keep expensive res_counter changes at a
>     minimum, the root_mem_cgroup res_counter is never charged; there is no
>     limit at that level anyway, and any statistics can be generated on
>     demand by summing up the counters of all other cgroups.
>     
>     However, with per-cpu charge caches, res_counter operations do not even
>     show up in profiles anymore, so this optimization is no longer
>     necessary.
>     
>     Remove it to simplify the code.

It does not revert cleanly because of the hunks below.  The code in
those hunks was removed, so I tried running without properly merging
them and it spews warnings because counter->usage is seen going negative.

So, it doesn't appear we can quickly revert this.

> --- mm/memcontrol.c
> +++ mm/memcontrol.c
> @@ -3943,7 +3947,7 @@
>          * replacement page, so leave it alone when phasing out the
>          * page that is unused after the migration.
>          */
> -       if (!end_migration)
> +       if (!end_migration && !mem_cgroup_is_root(memcg))
>                 mem_cgroup_do_uncharge(memcg, nr_pages, ctype);
>  
>         return memcg;
> @@ -4076,7 +4080,8 @@
>                  * We uncharge this because swap is freed.  This memcg can
>                  * be obsolete one. We avoid calling css_tryget_online().
>                  */
> -               res_counter_uncharge(&memcg->memsw, PAGE_SIZE);
> +               if (!mem_cgroup_is_root(memcg))
> +                       res_counter_uncharge(&memcg->memsw, PAGE_SIZE);
>                 mem_cgroup_swap_statistics(memcg, false);
>                 css_put(&memcg->css);
>         }

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Dave Hansen <dave@sr71.net>
To: Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.com>, Hugh Dickins <hughd@google.com>,
	Tejun Heo <tj@kernel.org>,
	Vladimir Davydov <vdavydov@parallels.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>
Subject: regression caused by cgroups optimization in 3.17-rc2
Date: Tue, 02 Sep 2014 12:05:41 -0700	[thread overview]
Message-ID: <54061505.8020500@sr71.net> (raw)

I'm seeing a pretty large regression in 3.17-rc2 vs 3.16 coming from the
memory cgroups code.  This is on a kernel with cgroups enabled at
compile time, but not _used_ for anything.  See the green lines in the
graph:

	https://www.sr71.net/~dave/intel/regression-from-05b843012.png

The workload is a little parallel microbenchmark doing page faults:

> https://github.com/antonblanchard/will-it-scale/blob/master/tests/page_fault2.c

The hardware is an 8-socket Westmere box with 160 hardware threads.  For
some reason, this does not affect the version of the microbenchmark
which is doing completely anonymous page faults.

I bisected it down to this commit:

> commit 05b8430123359886ef6a4146fba384e30d771b3f
> Author: Johannes Weiner <hannes@cmpxchg.org>
> Date:   Wed Aug 6 16:05:59 2014 -0700
> 
>     mm: memcontrol: use root_mem_cgroup res_counter
>     
>     Due to an old optimization to keep expensive res_counter changes at a
>     minimum, the root_mem_cgroup res_counter is never charged; there is no
>     limit at that level anyway, and any statistics can be generated on
>     demand by summing up the counters of all other cgroups.
>     
>     However, with per-cpu charge caches, res_counter operations do not even
>     show up in profiles anymore, so this optimization is no longer
>     necessary.
>     
>     Remove it to simplify the code.

It does not revert cleanly because of the hunks below.  The code in
those hunks was removed, so I tried running without properly merging
them and it spews warnings because counter->usage is seen going negative.

So, it doesn't appear we can quickly revert this.

> --- mm/memcontrol.c
> +++ mm/memcontrol.c
> @@ -3943,7 +3947,7 @@
>          * replacement page, so leave it alone when phasing out the
>          * page that is unused after the migration.
>          */
> -       if (!end_migration)
> +       if (!end_migration && !mem_cgroup_is_root(memcg))
>                 mem_cgroup_do_uncharge(memcg, nr_pages, ctype);
>  
>         return memcg;
> @@ -4076,7 +4080,8 @@
>                  * We uncharge this because swap is freed.  This memcg can
>                  * be obsolete one. We avoid calling css_tryget_online().
>                  */
> -               res_counter_uncharge(&memcg->memsw, PAGE_SIZE);
> +               if (!mem_cgroup_is_root(memcg))
> +                       res_counter_uncharge(&memcg->memsw, PAGE_SIZE);
>                 mem_cgroup_swap_statistics(memcg, false);
>                 css_put(&memcg->css);
>         }

             reply	other threads:[~2014-09-02 19:05 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-02 19:05 Dave Hansen [this message]
2014-09-02 19:05 ` regression caused by cgroups optimization in 3.17-rc2 Dave Hansen
2014-09-02 20:18 ` Dave Hansen
2014-09-02 20:57   ` Dave Hansen
2014-09-02 20:57     ` Dave Hansen
2014-09-04 14:27     ` Michal Hocko
2014-09-04 14:27       ` Michal Hocko
2014-09-04 20:27       ` Dave Hansen
2014-09-04 20:27         ` Dave Hansen
2014-09-04 22:53         ` Dave Hansen
2014-09-04 22:53           ` Dave Hansen
2014-09-05  9:28           ` Michal Hocko
2014-09-05  9:28             ` Michal Hocko
2014-09-05  9:25         ` Michal Hocko
2014-09-05  9:25           ` Michal Hocko
2014-09-05 14:47           ` Johannes Weiner
2014-09-05 14:47             ` Johannes Weiner
2014-09-05 15:39             ` Michal Hocko
2014-09-05 15:39               ` Michal Hocko
2014-09-10 16:29           ` Michal Hocko
2014-09-10 16:29             ` Michal Hocko
2014-09-10 16:57             ` Dave Hansen
2014-09-10 16:57               ` Dave Hansen
2014-09-10 17:05               ` Michal Hocko
2014-09-10 17:05                 ` Michal Hocko
2014-09-05 12:35         ` Johannes Weiner
2014-09-05 12:35           ` Johannes Weiner
2014-09-08 15:47           ` Dave Hansen
2014-09-08 15:47             ` Dave Hansen
2014-09-09 14:50             ` Johannes Weiner
2014-09-09 14:50               ` Johannes Weiner
2014-09-09 18:23               ` Dave Hansen
2014-09-09 18:23                 ` Dave Hansen
2014-09-02 22:18 ` Johannes Weiner
2014-09-02 22:18   ` Johannes Weiner
2014-09-02 22:36   ` Dave Hansen
2014-09-03  0:10     ` Johannes Weiner
2014-09-03  0:10       ` Johannes Weiner
2014-09-03  0:20       ` Linus Torvalds
2014-09-03  0:20         ` Linus Torvalds
2014-09-03  1:33         ` Johannes Weiner
2014-09-03  1:33           ` Johannes Weiner
2014-09-03  3:15           ` Dave Hansen
2014-09-03  3:15             ` Dave Hansen
2014-09-03  0:30       ` Dave Hansen
2014-09-03  0:30         ` Dave Hansen
2014-09-04 15:08         ` Johannes Weiner
2014-09-04 15:08           ` Johannes Weiner
2014-09-04 20:50           ` Dave Hansen
2014-09-04 20:50             ` Dave Hansen
2014-09-05  8:04           ` Michal Hocko
2014-09-05  8:04             ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54061505.8020500@sr71.net \
    --to=dave@sr71.net \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vdavydov@parallels.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.