From: Tejun Heo <tj@kernel.org>
To: mhocko@suse.cz, hannes@cmpxchg.org, bsingharora@gmail.com
Cc: cgroups@vger.kernel.org, linux-mm@kvack.org, lizefan@huawei.com
Subject: [PATCHSET] memcg: fix and reimplement iterator
Date: Mon, 3 Jun 2013 17:44:36 -0700
Message-ID: <1370306679-13129-1-git-send-email-tj@kernel.org>

mem_cgroup_iter() wraps around cgroup_next_descendant_pre() to provide
a pre-order walk of the memcg hierarchy. In addition to the normal walk,
it also implements shared iteration keyed by zone, node and priority so
that multiple reclaimers don't end up hitting the same nodes.
Reclaimers working on the same zone, node and priority will push the
same iterator forward.
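
To make the sharing concrete, here is a minimal userspace sketch of a cursor keyed by (node, zone, priority). It is not the memcg code: the array sizes, struct shared_iter and shared_iter_next() are hypothetical names invented for illustration, the hierarchy is flattened to indices in pre-order, and there is no locking, so it only models the single-threaded behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical key-space sizes, chosen only for this sketch. */
#define NR_NODES 2
#define NR_ZONES 3
#define NR_PRIOS 13

/* One cursor per (node, zone, priority) key. */
struct shared_iter {
	size_t pos;	/* index of the next group, in pre-order, to hand out */
};

static struct shared_iter iters[NR_NODES][NR_ZONES][NR_PRIOS];

/*
 * Hand out the next pre-order index for the given key and advance the
 * cursor shared by every caller using that key.  Returns nr_groups once
 * a full round over the hierarchy has been handed out and rewinds the
 * cursor so that the next call starts a new round.
 */
static size_t shared_iter_next(int node, int zone, int prio, size_t nr_groups)
{
	struct shared_iter *it;

	assert(node >= 0 && node < NR_NODES);
	assert(zone >= 0 && zone < NR_ZONES);
	assert(prio >= 0 && prio < NR_PRIOS);

	it = &iters[node][zone][prio];
	if (it->pos >= nr_groups) {
		it->pos = 0;		/* round complete, rewind */
		return nr_groups;
	}
	return it->pos++;
}
```

Two callers passing the same key split one walk between them, while a caller with a different key gets its own walk from the start. A real implementation additionally has to cope with concurrent callers and with groups being removed while they are cached, which is where the complexity discussed here comes from.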

Unfortunately, the way this is implemented is disturbingly
complicated. It ends up implementing a pretty unique synchronization
construct inside memcg, which is never a good sign for any subsystem.
While the implemented synchronization is overly elaborate and fragile,
the intention behind it is understandable: previously, the cgroup
iterator required each iteration to be contained inside a single RCU
read critical section, which ruled out any saner mechanism.
To work around the limitation, memcg developed this Rube Goldberg
machine to detect whether the cached last cgroup is still alive, which
of course was ever so subtly broken.

Now that cgroup iterations can survive being dropped out of an RCU read
critical section, this can be made a lot simpler. This patchset
contains the following three patches.

 0001-memcg-fix-subtle-memory-barrier-bug-in-mem_cgroup_it.patch
 0002-memcg-restructure-mem_cgroup_iter.patch
 0003-memcg-simplify-mem_cgroup_reclaim_iter.patch

0001 is a fix for a subtle memory barrier bug in the current
implementation. It should be applied to for-3.10-fixes and backported
through -stable. In general, memory barriers are bad ideas. Please
don't use them unless utterly necessary, and, if you do, please
add ample documentation explaining how they're paired and what they're
achieving. Documenting is often extremely helpful for the implementor
too because one ends up looking at and thinking about things a
lot more carefully.
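
As an illustration of what "documenting how they're paired" can look like, here is a small userspace analogy using C11 atomics. It is not the code from 0001 and does not use the kernel's barrier primitives; struct cache_slot and the slot_*() helpers are hypothetical names invented for this sketch. The point is only that each ordering operation carries a comment naming its partner and what the pair guarantees.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical cache slot: plain data plus a pointer that publishes it. */
struct cache_slot {
	int payload;			/* written before publication */
	_Atomic(int *) published;	/* NULL until payload is ready */
};

static void slot_init(struct cache_slot *s)
{
	s->payload = 0;
	atomic_init(&s->published, NULL);
}

static void slot_publish(struct cache_slot *s, int value)
{
	s->payload = value;
	/*
	 * RELEASE, paired with the ACQUIRE load in slot_read().
	 * Guarantees that a reader which observes the non-NULL pointer
	 * also observes the payload store above.
	 */
	atomic_store_explicit(&s->published, &s->payload,
			      memory_order_release);
}

/* Returns 1 and fills *out if a value has been published, else 0. */
static int slot_read(struct cache_slot *s, int *out)
{
	/*
	 * ACQUIRE, paired with the RELEASE store in slot_publish().
	 * The payload dereference below cannot be reordered before
	 * this load.
	 */
	int *p = atomic_load_explicit(&s->published, memory_order_acquire);

	if (!p)
		return 0;
	*out = *p;
	return 1;
}
```

If either comment cannot be written because there is no obvious partner on the other side, that is usually a sign the barrier is either unnecessary or the scheme is not understood well enough yet.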

0002 restructures mem_cgroup_iter() so that it's easier to follow and
change.

0003 reimplements the iterator sharing so that it's simpler and more
conventional. It depends on the new cgroup iterator updates.

This patchset is on top of cgroup/for-3.11[1], which contains the
iterator updates this patchset depends upon, and is available in the
following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-memcg-simpler-iter

Lightly tested. Proceed with caution.

 mm/memcontrol.c | 134 ++++++++++++++++++++++----------------------------------
 1 file changed, 54 insertions(+), 80 deletions(-)

Thanks.

--
tejun

[1] git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git for-3.11