linux-mm.kvack.org archive mirror
From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [RFC] Shared page accounting for memory cgroup
Date: Mon, 18 Jan 2010 13:56:44 +0530
Message-ID: <4B541B44.3090407@linux.vnet.ibm.com>
In-Reply-To: <20100118094920.151e1370.nishimura@mxp.nes.nec.co.jp>

On Monday 18 January 2010 06:19 AM, Daisuke Nishimura wrote:
> On Mon, 18 Jan 2010 01:00:44 +0530, Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>> On Fri, Jan 8, 2010 at 5:17 AM, KAMEZAWA Hiroyuki
>> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>>> On Thu, 7 Jan 2010 14:57:36 +0530
>>> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>>>
>>>> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2010-01-07 18:08:00]:
>>>>
>>>>> On Thu, 7 Jan 2010 17:48:14 +0900
>>>>> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>>>>>>>> "How pages are shared" doesn't show good hints. I don't hear such parameter
>>>>>>>> is used in production's resource monitoring software.
>>>>>>>>
>>>>>>>
>>>>>>> You mean "How many pages are shared" are not good hints, please see my
>>>>>>> justification above. With Virtualization (look at KSM for example),
>>>>>>> shared pages are going to be increasingly important part of the
>>>>>>> accounting.
>>>>>>>
>>>>>>
>>>>>> Considering KSM, your counting style is too bad.
>>>>>>
>>>>>> You should add
>>>>>>
>>>>>>  - MEM_CGROUP_STAT_SHARED_BY_KSM
>>>>>>  - MEM_CGROUP_STAT_FOR_TMPFS/SYSV_IPC_SHMEM
>>>>>>
>>>>
>>>> No.. I am just talking about shared memory being important and shared
>>>> accounting being useful, no counters for KSM in particular (in the
>>>> memcg context).
>>>>
>>> Think so ? The number of memcg-private pages is in interest in my point of view.
>>>
>>> Anyway, I don't change my opinion as "sum of rss" is not necessary to be calculated
>>> in the kernel.
>>> If you want to provide that in memcg, please add it to global VM as /proc/meminfo.
>>>
>>> IIUC, KSM/SHMEM has some official method in global VM.
>>>
>>
>> Kamezawa-San,
>>
>> I implemented the same in user space and I get really bad results, here is why
>>
>> 1. I need to hold and walk the tasks list in cgroups and extract RSS
>> through /proc (results in worse hold times for the fork() scenario you
>> mentioned)
>> 2. The data is highly inconsistent due to the higher margin of error
>> in accumulating data which is changing as we run. By the time we total
>> and look at the memcg data, the data is stale
>>
>> Would you be OK with the patch, if I renamed "shared_usage_in_bytes"
>> to "non_private_usage_in_bytes"?
>>
> I think the name is still ambiguous.
> 
> For example, if process A belongs to /cgroup/memory/01 and process B to /cgroup/memory/02,
> both processes have 10MB of anonymous pages and 10MB of file cache backed by the same pages,
> and all of the file cache is charged to 01.
> In this case, the value in 01 is 0MB (= 20MB - 20MB) and the value in 02 is 10MB (= 20MB - 10MB), right?
> 

Correct, file cache is almost always considered shared, so it has

1. non-private or shared usage of 10MB
2. 10 MB of file cache
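
To make the arithmetic in the example concrete, here is a minimal sketch in
Python. It assumes the value under discussion is the sum of each task's RSS
minus what is charged to the cgroup, as the quoted figures (20MB - 20MB and
20MB - 10MB) suggest. The function name and the MB units are illustrative
only and are not taken from the patch.

```python
def non_private_usage_mb(sum_of_task_rss_mb, charged_mb):
    """Sum of per-task RSS minus the memory charged to the cgroup (assumed formula)."""
    return sum_of_task_rss_mb - charged_mb

# Each process maps 10MB of anonymous pages and the same 10MB of file cache.
rss_a = 10 + 10
rss_b = 10 + 10

# All of the file cache is charged to 01; 02 is charged only for its anonymous pages.
charged_01 = 10 + 10
charged_02 = 10

usage_01 = non_private_usage_mb(rss_a, charged_01)
usage_02 = non_private_usage_mb(rss_b, charged_02)
print(usage_01, usage_02)  # 0 10
```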

> I don't think "non private usage" is appropriate to this value.
> Why don't you just show "sum_of_each_process_rss" ? I think it would be easier
> to understand for users.

Here is my concern

1. The time gap between reading the memcg stats and summing all RSS is
much larger in user space
2. Summing up all RSS without walking the tasks atomically can and
will lead to consistency issues. Stale data is acceptable as long as it
represents a consistent snapshot
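
For reference, the user space approach looks roughly like the sketch below.
It is only an illustration, not the code I tested: the function name and
parameters are hypothetical, and it assumes a file that lists one process ID
per line (with a per-thread "tasks" file, multi-threaded processes would be
counted once per thread). The second field of /proc/<pid>/statm is the
resident set size in pages.

```python
import os

def sum_rss_bytes(procs_file, proc_root="/proc", page_size=None):
    """Sum the resident set sizes of the processes listed in procs_file.

    Each statm file is read at a different instant, so the total is
    not an atomic snapshot of the group.
    """
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")
    with open(procs_file) as f:
        pids = [line.strip() for line in f if line.strip()]
    total_pages = 0
    for pid in pids:
        try:
            with open(os.path.join(proc_root, pid, "statm")) as statm:
                # Fields: size resident shared text lib data dt (in pages).
                total_pages += int(statm.read().split()[1])
        except (FileNotFoundError, ProcessLookupError):
            # The process exited between listing and reading: one source of skew.
            continue
    return total_pages * page_size
```

Nothing stops tasks from forking, exiting or faulting pages in while the
loop runs, which is where the inconsistency comes from.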

We need to differentiate between

1. Data snapshot (taken at a time, but valid at that point)
2. Data taken from different sources that does not form a uniform
snapshot, because the timestamp of each of the collected data
items is different
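
A toy illustration of the difference, with made-up numbers: two tasks whose
combined RSS is constant at 40MB while memory moves from one to the other.
Reading both at the same instant gives a valid (if later stale) total.
Reading each at a different instant gives a total that never existed.

```python
# rss_at[t][name] is the RSS in MB of each task at time step t (made-up numbers).
# Between t=0 and t=1, task "a" frees 10MB and task "b" allocates 10MB.
rss_at = [
    {"a": 30, "b": 10},
    {"a": 20, "b": 20},
]

def snapshot_total(t):
    """All tasks read at the same instant: stale later, but consistent."""
    return sum(rss_at[t].values())

def skewed_total(read_times):
    """Each task read at its own time step, as a non-atomic walk would do."""
    return sum(rss_at[t][name] for name, t in read_times.items())

true_totals = [snapshot_total(t) for t in range(len(rss_at))]
skewed = skewed_total({"a": 0, "b": 1})
print(true_totals, skewed)  # [40, 40] 50
```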


> But, hmm, I don't see any strong reason to do this in kernel, then :(

Please see my reason above for doing it in the kernel.

Balbir



Thread overview: 31+ messages
2009-12-29 18:27 [RFC] Shared page accounting for memory cgroup Balbir Singh
2010-01-03 23:51 ` KAMEZAWA Hiroyuki
2010-01-04  0:07   ` Balbir Singh
2010-01-04  0:35     ` KAMEZAWA Hiroyuki
2010-01-04  0:50       ` Balbir Singh
2010-01-06  4:02         ` KAMEZAWA Hiroyuki
2010-01-06  7:01           ` Balbir Singh
2010-01-06  7:12             ` KAMEZAWA Hiroyuki
2010-01-07  7:15               ` Balbir Singh
2010-01-07  7:36                 ` KAMEZAWA Hiroyuki
2010-01-07  8:34                   ` Balbir Singh
2010-01-07  8:48                     ` KAMEZAWA Hiroyuki
2010-01-07  9:08                       ` KAMEZAWA Hiroyuki
2010-01-07  9:27                         ` Balbir Singh
2010-01-07 23:47                           ` KAMEZAWA Hiroyuki
2010-01-17 19:30                             ` Balbir Singh
2010-01-18  0:05                               ` KAMEZAWA Hiroyuki
2010-01-18  0:22                                 ` KAMEZAWA Hiroyuki
2010-01-18  0:49                               ` Daisuke Nishimura
2010-01-18  8:26                                 ` Balbir Singh [this message]
2010-01-19  1:22                                   ` Daisuke Nishimura
2010-01-19  1:49                                     ` Balbir Singh
2010-01-19  2:34                                       ` Daisuke Nishimura
2010-01-19  3:52                                         ` Balbir Singh
2010-01-20  4:09                                           ` Daisuke Nishimura
2010-01-20  7:15                                             ` Daisuke Nishimura
2010-01-20  7:43                                               ` KAMEZAWA Hiroyuki
2010-01-20  8:18                                               ` Balbir Singh
2010-01-20  8:17                                             ` Balbir Singh
2010-01-21  1:04                                               ` Daisuke Nishimura
2010-01-21  1:30                                                 ` KAMEZAWA Hiroyuki
