From: Michal Hocko <mhocko@suse.cz>
To: Jeff Liu <jeff.liu@oracle.com>
Cc: linux-mm@kvack.org, Glauber Costa <glommer@parallels.com>,
handai.szj@taobao.com
Subject: Re: [PATCH v2 2/6] memcg: bypass swap accounting for the root memcg
Date: Wed, 30 Jan 2013 17:29:46 +0100 [thread overview]
Message-ID: <20130130162946.GA21253@dhcp22.suse.cz> (raw)
In-Reply-To: <510943D8.9000902@oracle.com>
On Thu 31-01-13 00:01:28, Jeff Liu wrote:
> On 01/29/2013 10:13 PM, Michal Hocko wrote:
> > On Mon 28-01-13 18:54:38, Jeff Liu wrote:
> >> Root memcg with swap cgroup is special since we only do tracking
> >> but can not set limits against it. In order to facilitate
> >> the implementation of the coming swap cgroup structures delay
> >> allocation mechanism, we can bypass the default swap statistics
> >> upon the root memcg and figure it out through the global stats
> >> instead as below:
> >>
> >> root_memcg_swap_stat: total_swap_pages - nr_swap_pages -
> >> used_swap_pages_of_all_memcgs
> >
> > How do you protect from races with swap{in,out}? Or they are
> > tolerable?
> To be honest, I previously have not taken race with swapin/out into
> consideration.
>
> Yes, this patch would cause a little error since it has to iterate
> each memcg which can introduce a bit overhead based on how many memcgs
> are configured.
>
> However, considering our current implementation of swap statistics, we
> do account when swap cache is uncharged, but it is possible that the
> swap slot is already allocated before that.
I am not sure I follow you here. I was merely interested in races while
there is a swapping activity while the value is calculated. The errors,
or let's say imprecision, shouldn't be big but it would be good to think
how big it can be and how it can be reduced (e.g. what if we start
accounting for root once there is another group existing - this would
solve the problem of delayed allocation and the imprecision as well).
> That is to say, there is a inconsistent window in swap accounting stats IMHO.
> As a figure shows to human, I think it can be tolerated to some extents. :)
> >
> >> memcg_total_swap_stats: root_memcg_swap_stat + other_memcg_swap_stats
> >
> > I am not sure I understand and if I do then it is not true:
> > root (swap = 10M, use_hierarchy = 0/1)
> > \
> > A (swap = 1M, use_hierarchy = 1)
> > \
> > B (swap = 2M)
> >
> > total for A is 3M regardless of what root has "accounted" while
> > total for root should be 10 for use_hierarchy = 0 and 13 for the
> > other
>
> I am not sure I catch your point, but I think the total for root
> should be 13 no matter use_hierarchy = 0 or 1, and the current patch
> is just doing that.
I do not see any reason to make root different wrt. other roots of
hierarchy. Anyway this is not important right now.
> Originally, for_each_mem_cgroup_tree(iter, memcg) does statistics by
> iterating all those children memcgs including the memcg itself. But
> now, as we don't account the root memcg swap statistics anymore(hence
> the stats is 0), we need to add the local swap stats of root memcg
> itself(10M) to the memcg_total_swap_stats. So actually we don't
> change the way of accounting memcg_total_swap_stats.
I guess you are talking about tatal_ numbers. And yes, your patch
doesn't change that.
> > case (this is btw. broken in the tree already now because
> > for_each_mem_cgroup_tree resp. mem_cgroup_iter doesn't honor
> > use_hierarchy for the root cgroup - this is a separate topic
> > though).
>
> Yes, I noticed that the for_each_mem_cgroup_tree() resp,
> mem_cgroup_iter() don't take the root->use_hierarchy into
> consideration, as it has the following logic:
> if (!root->use_hierarchy && root != root_mem_cgroup) {
> if (prev)
> return NULL;
> return root;
> }
>
> As i don't change the for_each_mem_cgroup_tree(), so it is in
> accordance with the original behavior.
True, and I was just mentioning that as I noticed this only during
the review. It wasn't meant to dispute your patch. Sorry if this wasn't
clear enough. We have that behavior for ages and nobody complained so it
is probably not worth fixing (especially when use_hierarchy is on the
way out very sloooooowly).
[...]
Thanks
--
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2013-01-30 16:30 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-01-28 10:54 [PATCH v2 0/6] memcg: disable swap cgroup allocation at swapon Jeff Liu
2013-01-28 10:54 ` [PATCH v2 1/6] memcg: refactor swap_cgroup_swapon() Jeff Liu
2013-01-29 9:15 ` Lord Glauber Costa of Sealand
2013-01-29 13:41 ` Michal Hocko
2013-01-28 10:54 ` [PATCH v2 2/6] memcg: bypass swap accounting for the root memcg Jeff Liu
2013-01-29 10:18 ` Lord Glauber Costa of Sealand
2013-01-31 6:18 ` Jeff Liu
2013-01-29 14:13 ` Michal Hocko
2013-01-30 16:01 ` Jeff Liu
2013-01-30 16:29 ` Michal Hocko [this message]
2013-01-31 4:00 ` Jeff Liu
2013-01-28 10:54 ` [PATCH v2 3/6] memcg: introduce memsw_accounting_users Jeff Liu
2013-01-29 9:46 ` Lord Glauber Costa of Sealand
2013-01-29 10:52 ` Jeff Liu
2013-01-29 14:26 ` Michal Hocko
2013-01-29 14:24 ` Michal Hocko
2013-01-29 15:16 ` Jeff Liu
2013-01-28 10:54 ` [PATCH v2 4/6] memcg: export nr_swap_files Jeff Liu
2013-01-29 9:47 ` Lord Glauber Costa of Sealand
2013-01-29 14:31 ` Michal Hocko
2013-01-29 15:17 ` Jeff Liu
2013-01-28 10:54 ` [PATCH v2 5/6] memcg: introduce swap_cgroup_init()/swap_cgroup_free() Jeff Liu
2013-01-29 9:57 ` Lord Glauber Costa of Sealand
2013-01-29 10:21 ` Jeff Liu
2013-01-29 14:56 ` Michal Hocko
2013-01-29 15:51 ` Jeff Liu
2013-01-29 16:09 ` Michal Hocko
2013-01-28 10:54 ` [PATCH v2 6/6] memcg: init/free swap cgroup strucutres upon create/free child memcg Jeff Liu
2013-01-29 9:59 ` Lord Glauber Costa of Sealand
2013-01-29 10:27 ` Jeff Liu
2013-01-29 15:11 ` Michal Hocko
2013-01-29 15:15 ` [PATCH v2 0/6] memcg: disable swap cgroup allocation at swapon Michal Hocko
2013-01-29 16:50 ` Jeff Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130130162946.GA21253@dhcp22.suse.cz \
--to=mhocko@suse.cz \
--cc=glommer@parallels.com \
--cc=handai.szj@taobao.com \
--cc=jeff.liu@oracle.com \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).