All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff Liu <jeff.liu@oracle.com>
To: Michal Hocko <mhocko@suse.cz>
Cc: linux-mm@kvack.org, Glauber Costa <glommer@parallels.com>,
	handai.szj@taobao.com
Subject: Re: [PATCH v2 2/6] memcg: bypass swap accounting for the root memcg
Date: Thu, 31 Jan 2013 12:00:57 +0800	[thread overview]
Message-ID: <5109EC79.1050604@oracle.com> (raw)
In-Reply-To: <20130130162946.GA21253@dhcp22.suse.cz>

On 01/31/2013 12:29 AM, Michal Hocko wrote:
> On Thu 31-01-13 00:01:28, Jeff Liu wrote:
>> On 01/29/2013 10:13 PM, Michal Hocko wrote:
>>> On Mon 28-01-13 18:54:38, Jeff Liu wrote:
>>>> Root memcg with swap cgroup is special since we only do tracking
>>>> but can not set limits against it.  In order to facilitate
>>>> the implementation of the coming swap cgroup structures delay
>>>> allocation mechanism, we can bypass the default swap statistics
>>>> upon the root memcg and figure it out through the global stats
>>>> instead as below:
>>>>
>>>> root_memcg_swap_stat: total_swap_pages - nr_swap_pages -
>>>> used_swap_pages_of_all_memcgs
>>>
>>> How do you protect from races with swap{in,out}? Or they are
>>> tolerable?
> 
>> To be honest, I previously have not taken race with swapin/out into
>> consideration.
>>
>> Yes, this patch would cause a little error since it has to iterate
>> each memcg which can introduce a bit overhead based on how many memcgs
>> are configured.
>>
>> However, considering our current implementation of swap statistics, we
>> do account when swap cache is uncharged, but it is possible that the
>> swap slot is already allocated before that.
> 
> I am not sure I follow you here. I was merely interested in races while
> there is a swapping activity while the value is calculated. The errors,
> or let's say imprecision, shouldn't be big but it would be good to think
> how big it can be and how it can be reduced 
...
> (e.g. what if we start
> accounting for root once there is another group existing - this would
> solve the problem of delayed allocation and the imprecision as well).
At first, I also tried to account for root memcg swap in this sway, i.e.
return the root_memcg css_id from swap_cgroup_cmpxchg() &&
swap_cgroup_record() if there is no children memcg exists, however, it
caused the kernel panic with deadlock...
If return 0 from both functions in this case, no panic but it's
obviously that the root memcg swap accounting number is wrong, so that's
why I choose the current idea to bypass it...

I need some time to dig into the details anyway.
> 
>> That is to say, there is a inconsistent window in swap accounting stats IMHO.
>> As a figure shows to human, I think it can be tolerated to some extents. :)
>>>
>>>> memcg_total_swap_stats: root_memcg_swap_stat + other_memcg_swap_stats
>>>
>>> I am not sure I understand and if I do then it is not true:
>>> root (swap = 10M, use_hierarchy = 0/1)
>>>  \
>>>   A (swap = 1M, use_hierarchy = 1)
>>>    \
>>>     B (swap = 2M)
>>>
>>> total for A is 3M regardless of what root has "accounted" while
>>> total for root should be 10 for use_hierarchy = 0 and 13 for the
>>> other
>>
>> I am not sure I catch your point, but I think the total for root
>> should be 13 no matter use_hierarchy = 0 or 1, and the current patch
>> is just doing that.
> 
> I do not see any reason to make root different wrt. other roots of
> hierarchy. Anyway this is not important right now.
> 
>> Originally, for_each_mem_cgroup_tree(iter, memcg) does statistics by
>> iterating all those children memcgs including the memcg itself.  But
>> now, as we don't account the root memcg swap statistics anymore(hence
>> the stats is 0), we need to add the local swap stats of root memcg
>> itself(10M) to the memcg_total_swap_stats.  So actually we don't
>> change the way of accounting memcg_total_swap_stats.
> 
> I guess you are talking about tatal_ numbers. And yes, your patch
> doesn't change that.
Yes.

Thanks you!
-Jeff
>  
>>> case (this is btw. broken in the tree already now because
>>> for_each_mem_cgroup_tree resp. mem_cgroup_iter doesn't honor
>>> use_hierarchy for the root cgroup - this is a separate topic
>>> though).
>>
>> Yes, I noticed that the for_each_mem_cgroup_tree() resp,
>> mem_cgroup_iter() don't take the root->use_hierarchy into
>> consideration, as it has the following logic:
>> if (!root->use_hierarchy && root != root_mem_cgroup) {
>>  	if (prev)
>> 		return NULL;
>> 	return root;
>> }
>>
>> As i don't change the for_each_mem_cgroup_tree(), so it is in
>> accordance with the original behavior.
> 
> True, and I was just mentioning that as I noticed this only during
> the review. It wasn't meant to dispute your patch. Sorry if this wasn't
> clear enough. We have that behavior for ages and nobody complained so it
> is probably not worth fixing (especially when use_hierarchy is on the
> way out very sloooooowly).
> 
> [...]
> 
> Thanks
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2013-01-31  4:01 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-01-28 10:54 [PATCH v2 0/6] memcg: disable swap cgroup allocation at swapon Jeff Liu
2013-01-28 10:54 ` [PATCH v2 1/6] memcg: refactor swap_cgroup_swapon() Jeff Liu
     [not found]   ` <510658E6.9030108-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2013-01-29  9:15     ` Lord Glauber Costa of Sealand
2013-01-29  9:15       ` Lord Glauber Costa of Sealand
2013-01-29 13:41     ` Michal Hocko
2013-01-29 13:41       ` Michal Hocko
2013-01-28 10:54 ` [PATCH v2 2/6] memcg: bypass swap accounting for the root memcg Jeff Liu
2013-01-29 10:18   ` Lord Glauber Costa of Sealand
2013-01-31  6:18     ` Jeff Liu
2013-01-29 14:13   ` Michal Hocko
2013-01-30 16:01     ` Jeff Liu
2013-01-30 16:29       ` Michal Hocko
2013-01-31  4:00         ` Jeff Liu [this message]
2013-01-28 10:54 ` [PATCH v2 3/6] memcg: introduce memsw_accounting_users Jeff Liu
     [not found]   ` <510658F0.9050802-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2013-01-29  9:46     ` Lord Glauber Costa of Sealand
2013-01-29  9:46       ` Lord Glauber Costa of Sealand
2013-01-29 10:52       ` Jeff Liu
2013-01-29 14:26       ` Michal Hocko
2013-01-29 14:24     ` Michal Hocko
2013-01-29 14:24       ` Michal Hocko
2013-01-29 15:16       ` Jeff Liu
2013-01-28 10:54 ` [PATCH v2 4/6] memcg: export nr_swap_files Jeff Liu
     [not found]   ` <510658F6.4010504-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2013-01-29  9:47     ` Lord Glauber Costa of Sealand
2013-01-29  9:47       ` Lord Glauber Costa of Sealand
2013-01-29 14:31     ` Michal Hocko
2013-01-29 14:31       ` Michal Hocko
2013-01-29 15:17       ` Jeff Liu
2013-01-28 10:54 ` [PATCH v2 5/6] memcg: introduce swap_cgroup_init()/swap_cgroup_free() Jeff Liu
2013-01-29  9:57   ` Lord Glauber Costa of Sealand
2013-01-29 10:21     ` Jeff Liu
     [not found]   ` <510658F7.6050806-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2013-01-29 14:56     ` Michal Hocko
2013-01-29 14:56       ` Michal Hocko
2013-01-29 15:51       ` Jeff Liu
     [not found]         ` <5107F00E.7070302-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2013-01-29 16:09           ` Michal Hocko
2013-01-29 16:09             ` Michal Hocko
2013-01-28 10:54 ` [PATCH v2 6/6] memcg: init/free swap cgroup strucutres upon create/free child memcg Jeff Liu
     [not found]   ` <510658FC.50009-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2013-01-29  9:59     ` Lord Glauber Costa of Sealand
2013-01-29  9:59       ` Lord Glauber Costa of Sealand
2013-01-29 10:27       ` Jeff Liu
2013-01-29 15:11     ` Michal Hocko
2013-01-29 15:11       ` Michal Hocko
     [not found] ` <510658E3.1020306-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2013-01-29 15:15   ` [PATCH v2 0/6] memcg: disable swap cgroup allocation at swapon Michal Hocko
2013-01-29 15:15     ` Michal Hocko
2013-01-29 16:50     ` Jeff Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5109EC79.1050604@oracle.com \
    --to=jeff.liu@oracle.com \
    --cc=glommer@parallels.com \
    --cc=handai.szj@taobao.com \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.