From: Ying Han <yinghan@google.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: lsf@lists.linux-foundation.org, linux-mm@kvack.org,
"balbir@linux.vnet.ibm.com" <balbir@linux.vnet.ibm.com>,
Michal Hocko <mhocko@suse.cz>, Greg Thelen <gthelen@google.com>,
"minchan.kim@gmail.com" <minchan.kim@gmail.com>,
"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
walken@google.com
Subject: Re: [LSF][MM] rough agenda for memcg.
Date: Wed, 30 Mar 2011 23:03:36 -0700
Message-ID: <BANLkTi=A5nnQDZRXKAz-b3DzrCw57nFDBQ@mail.gmail.com>
In-Reply-To: <20110331110113.a01f7b8b.kamezawa.hiroyu@jp.fujitsu.com>
On Wed, Mar 30, 2011 at 7:01 PM, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>
> Hi,
>
> In this LSF/MM, we have some memcg topics in the 1st day.
>
> From schedule,
>
> 1. Memory cgroup: Where next? 1 hour (Balbir Singh/Kamezawa)
> 2. Memcg Dirty Limit and writeback 30 min (Greg Thelen)
> 3. Memcg LRU management 30 min (Ying Han, Michal Hocko)
> 4. Page cgroup on a diet (Johannes Weiner)
>
> That's 2.5 hours. This seems long... or short? ;)
>
> I'd like to sort out the topics before going. Please fix this up if I
> haven't caught everything.
>
> More on 1. later...
>
> The main topics on 2. Memcg Dirty Limit and writeback are...
>
> a) How to implement a per-memcg dirty-inode finding method (a list), and
> how flusher threads should handle memcg.
>
> b) How to interact with IO-less dirty page reclaim.
> IIUC, if memcg doesn't handle this correctly, OOM happens.
>
> Greg, do we need a shared session with the I/O guys?
> If so, is the current schedule O.K.?
>
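For a), here is a minimal sketch of the shape I have in mind. Everything
in it (dirty_lock, dirty_inodes, memcg_inode_dirtied(), the
inode->i_memcg_dirty field) is hypothetical, not existing kernel API:

    /*
     * Hypothetical per-memcg dirty inode tracking (sketch only).
     * Each memcg keeps a list of inodes that have dirty pages charged
     * to it; a flusher thread serving a memcg over its dirty limit
     * would walk this list instead of the per-bdi b_dirty list.
     */
    struct mem_cgroup {
            /* ... existing fields ... */
            spinlock_t dirty_lock;          /* protects dirty_inodes */
            struct list_head dirty_inodes;  /* inodes with dirty pages
                                               charged to this memcg */
    };

    /*
     * Called when an inode gets its first dirty page charged to
     * @memcg, e.g. from the account_page_dirtied() path.
     * i_memcg_dirty is a hypothetical new list_head in struct inode.
     */
    static void memcg_inode_dirtied(struct mem_cgroup *memcg,
                                    struct inode *inode)
    {
            spin_lock(&memcg->dirty_lock);
            if (list_empty(&inode->i_memcg_dirty))
                    list_add_tail(&inode->i_memcg_dirty,
                                  &memcg->dirty_inodes);
            spin_unlock(&memcg->dirty_lock);
    }

The open question is how this interacts with an inode whose pages are
charged to more than one memcg.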
> Main topics on 3. Memcg LRU management
>
> a) Isolation/Guarantee for memcg.
> Current memcg doesn't provide enough isolation when global reclaim
> runs, because it's designed not to affect global reclaim.
> But from the user's point of view that's nonsense; we should have some
> hints for isolating a set of memory, or implement a guarantee.
>
> One way to go is making the softlimit work better. To do this, we
> should know what the problem is now. I'm sorry I can't prepare data on
> this before LSF/MM.
>
I generated an example which shows the inefficiency of soft_limit reclaim;
so far it is based on code inspection. I am not sure if I can get real
data before LSF.
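For reference, the shape of the problem as I read it from
mm/memcontrol.c (simplified pseudo-C; the helper names and the loop cap
are mine, not the real kernel names):

    /*
     * Simplified shape of the current soft-limit reclaim path: global
     * reclaim repeatedly picks the single memcg with the largest
     * soft-limit excess from a per-zone RB-tree and pushes all of the
     * pressure onto it.
     */
    static unsigned long soft_limit_reclaim_shape(struct zone *zone,
                                                  gfp_t gfp_mask)
    {
            unsigned long reclaimed = 0;
            struct mem_cgroup_per_zone *mz;
            int loop = 0;

            while (loop++ < SOFT_LIMIT_MAX_LOOPS) {  /* hypothetical cap */
                    /* pick only the worst offender: max(usage - soft limit) */
                    mz = pick_largest_excess(zone);
                    if (!mz)
                            break;
                    reclaimed += hierarchical_reclaim(mz, zone, gfp_mask);
                    /*
                     * Re-key by the new excess: the same memcg can be
                     * picked again and again, while other memcgs over
                     * their soft limit see no pressure at all in this
                     * round.
                     */
                    reinsert_by_excess(zone, mz);
                    if (reclaimed)
                            break;
            }
            return reclaimed;
    }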
> Another way is implementing a guarantee. But this will require some
> interaction with the page allocator and the pgscan mechanism. This
> will be a big piece of work.
>
Not sure about this...
>
> b) Single LRU and per-memcg zone->lru_lock.
> I hear zone->lru_lock contention caused by memcg is a problem on Google
> servers.
> Okay, please show data. (I've never seen it.)
>
To clarify, the lock contention gets bad after the per-memcg background
reclaim patch. In the worst case we have #-of-cpus per-memcg kswapd
threads reclaiming from per-memcg LRUs, all competing for the
zone->lru_lock.
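The direction we have been looking at is to give each per-memcg per-zone
LRU its own lock. A rough sketch; the lru_lock member is the hypothetical
addition, lists[] and count[] mirror today's mem_cgroup_per_zone:

    struct mem_cgroup_per_zone {
            spinlock_t lru_lock;    /* NEW (hypothetical): per-memcg,
                                       per-zone lock, so N per-memcg
                                       kswapds stop serializing on the
                                       shared zone->lru_lock */
            struct list_head lists[NR_LRU_LISTS];  /* existing per-memcg
                                                      LRU lists */
            unsigned long count[NR_LRU_LISTS];     /* existing counters */
            /* ... */
    };

The hard part is every path that takes zone->lru_lock once and then
touches pages belonging to many different memcgs (global reclaim, lumpy
reclaim, etc.).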
--Ying
> Then, we need to discuss the pros and cons of the current design and
> consider how to improve it. I think Google and Michal have their own
> implementations.
>
> The current double-LRU design dates from the first inclusion of memcg
> into the kernel, but I don't know what discussion there was. Balbir,
> could you explain the reasoning behind this design? Then we can go
> ahead from there.
>
>
> Main topics on 4. Page cgroup on a diet are...
>
> a) page_cgroup is too big! We need a diet...
> I think Johannes has already removed the ->page pointer. OK, what's
> next to be removed?
>
> I guess the next candidate is ->lru, which is related to 3-b).
>
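For context, struct page_cgroup is currently (modulo config options)
roughly the following, i.e. about 40 bytes per 4KiB page on 64-bit:

    struct page_cgroup {
            unsigned long flags;            /*  8 bytes */
            struct mem_cgroup *mem_cgroup;  /*  8 bytes */
            struct page *page;              /*  8 bytes: what Johannes'
                                               series removes; it can be
                                               derived from the pfn */
            struct list_head lru;           /* 16 bytes: per-memcg LRU
                                               linkage, which can only go
                                               away if pages sit on a
                                               single LRU, hence the
                                               tie-in with 3-b) */
    };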
> Main topics on 1. Memory control groups: where next? are...
>
> To be honest, I've just been doing bug fixes these days, and the hot
> topics are covered above. I don't have concrete topics. What I can
> think of from recent linux-mm emails is...
>
> a) Kernel memory accounting.
> b) Need some work with Cleancache?
> c) Should we provide an auto memory cgroup for file caches?
> (Then we could implement a file-cache limit.)
> d) Do we have a problem with the current OOM-disable+notifier design?
> e) Should the ROOT cgroup have a limit/softlimit again?
> f) Should vm_overcommit_memory be supported with memcg?
> (I remember there was a trial, but I think it should be done in another
> cgroup, a vmemory cgroup.)
> ...
>
> I think:
> a) Discussing this is too early. There is no patch;
> I think we'll just waste time.
> b) Enable/disable cleancache per memcg, or some share/limit??
> But we can discuss this kind of thing after cleancache is in
> production use...
>
> c) AFAIK, some other OSes have this kind of feature, a box for the
> file cache. Because the file cache is an object shared between all
> cgroups, it's difficult to handle. It may be better to have an auto
> cgroup for file caches and add knobs for it to memcg.
>
> d) I think it works well.
>
> e) It seems Michal wants this for lazy users. Hmm, should we have a
> knob? It would be helpful if someone had performance numbers on the
> latest kernel with and without memcg (in the limitless case).
> IIUC, with THP enabled as 'always', the number of page faults drops
> dramatically and memcg's accounting cost goes down...
>
> f) I think someone mentioned this...
>
> Maybe c) and d) _can_ be topics, but they seem not very important.
>
> So, for this slot, I'd like to discuss
>
> I) Softlimit/Isolation (was 3-a) for 1 hour.
> If we have extra time, kernel memory accounting or file-cache handling
> would be good.
>
> II) Dirty page handling. (for 30min)
> Maybe we'll discuss the per-memcg inode queueing issue.
>
> III) Discussing the current and future design of the LRU. (for 30+ min)
>
> IV) Diet of page_cgroup (for 30-min)
> Maybe this can be combined with III.
>
> Thanks,
> -Kame
>