From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx131.postini.com [74.125.245.131])
    by kanga.kvack.org (Postfix) with SMTP id F13AB6B0029
    for ; Tue, 5 Feb 2013 07:35:19 -0500 (EST)
Date: Tue, 5 Feb 2013 13:35:15 +0100
From: Michal Hocko
Subject: [LSF/MM TOPIC] Few things I would like to discuss
Message-ID: <20130205123515.GA26229@dhcp22.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-linux-mm@kvack.org
List-ID:
To: lsf-pc@lists.linux-foundation.org
Cc: linux-mm@kvack.org

Hi,
I would like to discuss the following topics:
* memcg oom should be more sensitive to locked contexts because now
  it is possible that a task is sitting in mem_cgroup_handle_oom holding
  some other lock (e.g. i_mutex or mmap_sem) up the chain which might
  block another task from terminating on OOM so we basically end up in
  a deadlock. Almost all memcg charges happen from the page fault path
  where we can retry but one class of them happens from
  add_to_page_cache_locked and that is a bit more problematic.
* memcg doesn't use PF_MEMALLOC for the targeted reclaim code paths
  which asks for stack overflows (and we have already seen those -
  e.g. from the xfs pageout paths). The primary problem with using the
  flag is that there is no dirty pages throttling and writeback kicked
  out for memcg so if we didn't writeback from the reclaim the caller
  could be blocked forever. Memcg dirty accounting is slowly taking
  shape so we should start thinking about the writeback as well.
* While we are at the memcg dirty pages accounting
  (https://lkml.org/lkml/2012/12/25/95): it turned out that the locking
  is really nasty (https://lkml.org/lkml/2013/1/2/48). The locking
  should be reworked without incurring any penalty on the fast path.
  This sounds really challenging.
* I would really like to finally settle on something wrt. soft limit reclaim.
  I am pretty sure Ying would like to discuss this topic as well so I
  will not go into details about it. I will post what I have before the
  conference so that we can discuss her approach and what was the
  primary disagreement the last time. I can go into more details as a
  follow-up if people are interested of course.
* Finally I would like to collect feedback for the mm git tree.

--
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx134.postini.com [74.125.245.134])
    by kanga.kvack.org (Postfix) with SMTP id 78C706B0002
    for ; Tue, 5 Feb 2013 09:12:49 -0500 (EST)
Message-ID: <51111377.4030502@parallels.com>
Date: Tue, 5 Feb 2013 18:13:11 +0400
From: Glauber Costa
MIME-Version: 1.0
Subject: Re: [LSF/MM TOPIC] Few things I would like to discuss
References: <20130205123515.GA26229@dhcp22.suse.cz>
In-Reply-To: <20130205123515.GA26229@dhcp22.suse.cz>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Michal Hocko
Cc: lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org

On 02/05/2013 04:35 PM, Michal Hocko wrote:
> Hi,
> I would like to discuss the following topics:
> * memcg oom should be more sensitive to locked contexts because now
>   it is possible that a task is sitting in mem_cgroup_handle_oom holding
>   some other lock (e.g. i_mutex or mmap_sem) up the chain which might
>   block another task from terminating on OOM so we basically end up in
>   a deadlock. Almost all memcg charges happen from the page fault path
>   where we can retry but one class of them happens from
>   add_to_page_cache_locked and that is a bit more problematic.

This is not the case with kmemcg on. Those charges will usually happen
from the slab/slub grow_cache mechanism, or during fork.
This is not to invalidate your reasoning - since those are usually
tricky in terms of context as well, and would benefit just as much -
but to complete it.

> * I would really like to finally settle on something wrt. soft
>   limit reclaim. I am pretty sure Ying would like to discuss this topic
>   as well so I will not go into details about it. I will post what I
>   have before the conference so that we can discuss her approach and
>   what was the primary disagreement the last time. I can go into more
>   details as a follow-up if people are interested of course.

This interests me very much as well.

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx159.postini.com [74.125.245.159])
    by kanga.kvack.org (Postfix) with SMTP id 0FB4B6B0005
    for ; Tue, 12 Feb 2013 19:39:37 -0500 (EST)
Received: from m3.gw.fujitsu.co.jp (unknown [10.0.50.73])
    by fgwmail6.fujitsu.co.jp (Postfix) with ESMTP id C83F53EE0BC
    for ; Wed, 13 Feb 2013 09:39:35 +0900 (JST)
Received: from smail (m3 [127.0.0.1])
    by outgoing.m3.gw.fujitsu.co.jp (Postfix) with ESMTP id 70DBB45DEC0
    for ; Wed, 13 Feb 2013 09:39:35 +0900 (JST)
Received: from s3.gw.fujitsu.co.jp (s3.gw.fujitsu.co.jp [10.0.50.93])
    by m3.gw.fujitsu.co.jp (Postfix) with ESMTP id 39EED45DEBF
    for ; Wed, 13 Feb 2013 09:39:35 +0900 (JST)
Received: from s3.gw.fujitsu.co.jp (localhost.localdomain [127.0.0.1])
    by s3.gw.fujitsu.co.jp (Postfix) with ESMTP id 2AFF91DB8040
    for ; Wed, 13 Feb 2013 09:39:35 +0900 (JST)
Received: from m1001.s.css.fujitsu.com (m1001.s.css.fujitsu.com [10.240.81.139])
    by s3.gw.fujitsu.co.jp (Postfix) with ESMTP id D7E3C1DB803C
    for ; Wed, 13 Feb 2013 09:39:34 +0900 (JST)
Message-ID: <511AE0B5.4020502@jp.fujitsu.com>
Date: Wed, 13 Feb 2013 09:39:17 +0900
From: Kamezawa Hiroyuki
MIME-Version: 1.0
Subject: Re: [LSF/MM TOPIC] Few things I would like to discuss
References: <20130205123515.GA26229@dhcp22.suse.cz>
In-Reply-To: <20130205123515.GA26229@dhcp22.suse.cz>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Michal Hocko
Cc: lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org

(2013/02/05 21:35), Michal Hocko wrote:
> Hi,
> I would like to discuss the following topics:

I missed the deadline :(

> * memcg oom should be more sensitive to locked contexts because now
>   it is possible that a task is sitting in mem_cgroup_handle_oom holding
>   some other lock (e.g. i_mutex or mmap_sem) up the chain which might
>   block another task from terminating on OOM so we basically end up in
>   a deadlock. Almost all memcg charges happen from the page fault path
>   where we can retry but one class of them happens from
>   add_to_page_cache_locked and that is a bit more problematic.

Yes, this is a topic that should be discussed.

> * memcg doesn't use PF_MEMALLOC for the targeted reclaim code paths
>   which asks for stack overflows (and we have already seen those -
>   e.g. from the xfs pageout paths). The primary problem with using the
>   flag is that there is no dirty pages throttling and writeback kicked
>   out for memcg so if we didn't writeback from the reclaim the caller
>   could be blocked forever. Memcg dirty accounting is slowly taking
>   shape so we should start thinking about the writeback as well.

Sure.

> * While we are at the memcg dirty pages accounting
>   (https://lkml.org/lkml/2012/12/25/95): it turned out that the locking
>   is really nasty (https://lkml.org/lkml/2013/1/2/48). The locking
>   should be reworked without incurring any penalty on the fast path.
>   This sounds really challenging.

I'd like to fix the locking problem.

> * I would really like to finally settle on something wrt. soft
>   limit reclaim. I am pretty sure Ying would like to discuss this topic
>   as well so I will not go into details about it.
>   I will post what I have before the conference so that we can discuss
>   her approach and what was the primary disagreement the last time. I
>   can go into more details as a follow-up if people are interested of
>   course.
> * Finally I would like to collect feedback for the mm git tree.
>

Other points related to memcg are ...

+ kernel memory accounting + per-zone-per-memcg inode/dentry caching.
  Glauber tries to account inode/dentry in kmem controller. To do that,
  I think inode and dentry should be handled per zone, at first. IIUC,
  there is ongoing work but it is not merged yet.

+ overheads by memcg
  Mel explained memcg's big overheads at last year's MM summit. AFAIK,
  we have not made any progress with that. If someone has detailed data,
  please share again...

Thanks,
-Kame

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx159.postini.com [74.125.245.159])
    by kanga.kvack.org (Postfix) with SMTP id 1EEAB6B0007
    for ; Tue, 12 Feb 2013 22:53:44 -0500 (EST)
Message-ID: <1360727617.2544.6.camel@dabdike>
Subject: Re: [Lsf-pc] [LSF/MM TOPIC] Few things I would like to discuss
From: James Bottomley
Date: Wed, 13 Feb 2013 07:53:37 +0400
In-Reply-To: <511AE0B5.4020502@jp.fujitsu.com>
References: <20130205123515.GA26229@dhcp22.suse.cz>
    <511AE0B5.4020502@jp.fujitsu.com>
Content-Type: text/plain; charset="ISO-8859-15"
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Kamezawa Hiroyuki
Cc: Michal Hocko, linux-mm@kvack.org, lsf-pc@lists.linux-foundation.org

On Wed, 2013-02-13 at 09:39 +0900, Kamezawa Hiroyuki wrote:
> (2013/02/05 21:35), Michal Hocko wrote:
> > Hi,
> > I would like to discuss the following topics:
>
> I missed the deadline :(

I wouldn't call it a deadline, more a sort of guideline ...
we only call it a deadline because if we didn't, some of the
procrastinators out there wouldn't submit anything until the day before
the actual summit. We do do the first cut of invites and topics on 6
Feb, but we keep some slots open all the way up to the summit just in
case something important comes along.

James

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx118.postini.com [74.125.245.118])
    by kanga.kvack.org (Postfix) with SMTP id DFD4A6B0005
    for ; Wed, 13 Feb 2013 03:19:59 -0500 (EST)
Message-ID: <511B4CC8.9040309@parallels.com>
Date: Wed, 13 Feb 2013 12:20:24 +0400
From: Glauber Costa
MIME-Version: 1.0
Subject: Re: [LSF/MM TOPIC] Few things I would like to discuss
References: <20130205123515.GA26229@dhcp22.suse.cz>
    <511AE0B5.4020502@jp.fujitsu.com>
In-Reply-To: <511AE0B5.4020502@jp.fujitsu.com>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Kamezawa Hiroyuki
Cc: Michal Hocko, lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
    Johannes Weiner, Dave Chinner

>
> Other points related to memcg are ...
>
> + kernel memory accounting + per-zone-per-memcg inode/dentry caching.
>   Glauber tries to account inode/dentry in kmem controller. To do that,
>   I think inode and dentry should be handled per zone, at first. IIUC,
>   there is ongoing work but it is not merged yet.
>

Yes, I've already managed to post an initial version - comments
appreciated.

Actually, Johannes correctly pointed out to me once that memcg pressure
is never per-zone, so there is no reason for us to keep per-zone
information. The logic behind this is that if there is per-zone
pressure, it is always global pressure; memcg can only provide go/no-go
signals, and knows nothing about zones.
The only reason I am actually keeping per-zone information is to avoid
keeping the inodes/dentries in two lists. Without per-zone, we would
have to keep them in a nodeless memcg list, and then in a per-zone (it
is actually per-node) list, and then when global pressure kicks in,
follow the zone lists. This means an extra 16 bytes per object, which
adds up quickly to a large memory overhead.

> + overheads by memcg
>   Mel explained memcg's big overheads at last year's MM summit. AFAIK,
>   we have not made any progress with that. If someone has detailed data,
>   please share again...
>

I had a patch for that, but didn't manage to go back to it again. Jeff
Liu did some extra work to handle lazy swap enablement as well, that
would go all right with it. I can probably find the time to resuscitate
it before the summit. We could focus on what is still missing.