linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Jeff Liu <jeff.liu@oracle.com>
To: Michal Hocko <mhocko@suse.cz>
Cc: linux-mm@kvack.org, Glauber Costa <glommer@parallels.com>,
	handai.szj@taobao.com
Subject: Re: [PATCH v2 2/6] memcg: bypass swap accounting for the root memcg
Date: Thu, 31 Jan 2013 12:00:57 +0800	[thread overview]
Message-ID: <5109EC79.1050604@oracle.com> (raw)
In-Reply-To: <20130130162946.GA21253@dhcp22.suse.cz>

On 01/31/2013 12:29 AM, Michal Hocko wrote:
> On Thu 31-01-13 00:01:28, Jeff Liu wrote:
>> On 01/29/2013 10:13 PM, Michal Hocko wrote:
>>> On Mon 28-01-13 18:54:38, Jeff Liu wrote:
>>>> Root memcg with swap cgroup is special since we only do tracking
>>>> but can not set limits against it.  In order to facilitate
>>>> the implementation of the coming swap cgroup structures delay
>>>> allocation mechanism, we can bypass the default swap statistics
>>>> upon the root memcg and figure it out through the global stats
>>>> instead as below:
>>>>
>>>> root_memcg_swap_stat: total_swap_pages - nr_swap_pages -
>>>> used_swap_pages_of_all_memcgs
>>>
>>> How do you protect from races with swap{in,out}? Or they are
>>> tolerable?
> 
>> To be honest, I previously have not taken race with swapin/out into
>> consideration.
>>
>> Yes, this patch would cause a little error since it has to iterate
>> each memcg which can introduce a bit overhead based on how many memcgs
>> are configured.
>>
>> However, considering our current implementation of swap statistics, we
>> do account when swap cache is uncharged, but it is possible that the
>> swap slot is already allocated before that.
> 
> I am not sure I follow you here. I was merely interested in races while
> there is a swapping activity while the value is calculated. The errors,
> or let's say imprecision, shouldn't be big but it would be good to think
> how big it can be and how it can be reduced 
...
> (e.g. what if we start
> accounting for root once there is another group existing - this would
> solve the problem of delayed allocation and the imprecision as well).
At first, I also tried to account for root memcg swap in this sway, i.e.
return the root_memcg css_id from swap_cgroup_cmpxchg() &&
swap_cgroup_record() if there is no children memcg exists, however, it
caused the kernel panic with deadlock...
If return 0 from both functions in this case, no panic but it's
obviously that the root memcg swap accounting number is wrong, so that's
why I choose the current idea to bypass it...

I need some time to dig into the details anyway.
> 
>> That is to say, there is a inconsistent window in swap accounting stats IMHO.
>> As a figure shows to human, I think it can be tolerated to some extents. :)
>>>
>>>> memcg_total_swap_stats: root_memcg_swap_stat + other_memcg_swap_stats
>>>
>>> I am not sure I understand and if I do then it is not true:
>>> root (swap = 10M, use_hierarchy = 0/1)
>>>  \
>>>   A (swap = 1M, use_hierarchy = 1)
>>>    \
>>>     B (swap = 2M)
>>>
>>> total for A is 3M regardless of what root has "accounted" while
>>> total for root should be 10 for use_hierarchy = 0 and 13 for the
>>> other
>>
>> I am not sure I catch your point, but I think the total for root
>> should be 13 no matter use_hierarchy = 0 or 1, and the current patch
>> is just doing that.
> 
> I do not see any reason to make root different wrt. other roots of
> hierarchy. Anyway this is not important right now.
> 
>> Originally, for_each_mem_cgroup_tree(iter, memcg) does statistics by
>> iterating all those children memcgs including the memcg itself.  But
>> now, as we don't account the root memcg swap statistics anymore(hence
>> the stats is 0), we need to add the local swap stats of root memcg
>> itself(10M) to the memcg_total_swap_stats.  So actually we don't
>> change the way of accounting memcg_total_swap_stats.
> 
> I guess you are talking about tatal_ numbers. And yes, your patch
> doesn't change that.
Yes.

Thanks you!
-Jeff
>  
>>> case (this is btw. broken in the tree already now because
>>> for_each_mem_cgroup_tree resp. mem_cgroup_iter doesn't honor
>>> use_hierarchy for the root cgroup - this is a separate topic
>>> though).
>>
>> Yes, I noticed that the for_each_mem_cgroup_tree() resp,
>> mem_cgroup_iter() don't take the root->use_hierarchy into
>> consideration, as it has the following logic:
>> if (!root->use_hierarchy && root != root_mem_cgroup) {
>>  	if (prev)
>> 		return NULL;
>> 	return root;
>> }
>>
>> As i don't change the for_each_mem_cgroup_tree(), so it is in
>> accordance with the original behavior.
> 
> True, and I was just mentioning that as I noticed this only during
> the review. It wasn't meant to dispute your patch. Sorry if this wasn't
> clear enough. We have that behavior for ages and nobody complained so it
> is probably not worth fixing (especially when use_hierarchy is on the
> way out very sloooooowly).
> 
> [...]
> 
> Thanks
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2013-01-31  4:01 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-01-28 10:54 [PATCH v2 0/6] memcg: disable swap cgroup allocation at swapon Jeff Liu
2013-01-28 10:54 ` [PATCH v2 1/6] memcg: refactor swap_cgroup_swapon() Jeff Liu
2013-01-29  9:15   ` Lord Glauber Costa of Sealand
2013-01-29 13:41   ` Michal Hocko
2013-01-28 10:54 ` [PATCH v2 2/6] memcg: bypass swap accounting for the root memcg Jeff Liu
2013-01-29 10:18   ` Lord Glauber Costa of Sealand
2013-01-31  6:18     ` Jeff Liu
2013-01-29 14:13   ` Michal Hocko
2013-01-30 16:01     ` Jeff Liu
2013-01-30 16:29       ` Michal Hocko
2013-01-31  4:00         ` Jeff Liu [this message]
2013-01-28 10:54 ` [PATCH v2 3/6] memcg: introduce memsw_accounting_users Jeff Liu
2013-01-29  9:46   ` Lord Glauber Costa of Sealand
2013-01-29 10:52     ` Jeff Liu
2013-01-29 14:26     ` Michal Hocko
2013-01-29 14:24   ` Michal Hocko
2013-01-29 15:16     ` Jeff Liu
2013-01-28 10:54 ` [PATCH v2 4/6] memcg: export nr_swap_files Jeff Liu
2013-01-29  9:47   ` Lord Glauber Costa of Sealand
2013-01-29 14:31   ` Michal Hocko
2013-01-29 15:17     ` Jeff Liu
2013-01-28 10:54 ` [PATCH v2 5/6] memcg: introduce swap_cgroup_init()/swap_cgroup_free() Jeff Liu
2013-01-29  9:57   ` Lord Glauber Costa of Sealand
2013-01-29 10:21     ` Jeff Liu
2013-01-29 14:56   ` Michal Hocko
2013-01-29 15:51     ` Jeff Liu
2013-01-29 16:09       ` Michal Hocko
2013-01-28 10:54 ` [PATCH v2 6/6] memcg: init/free swap cgroup strucutres upon create/free child memcg Jeff Liu
2013-01-29  9:59   ` Lord Glauber Costa of Sealand
2013-01-29 10:27     ` Jeff Liu
2013-01-29 15:11   ` Michal Hocko
2013-01-29 15:15 ` [PATCH v2 0/6] memcg: disable swap cgroup allocation at swapon Michal Hocko
2013-01-29 16:50   ` Jeff Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5109EC79.1050604@oracle.com \
    --to=jeff.liu@oracle.com \
    --cc=glommer@parallels.com \
    --cc=handai.szj@taobao.com \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).