From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: Paul Menage <menage@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Pavel Emelianov <xemul@openvz.org>,
	Hugh Dickins <hugh@veritas.com>,
	Sudhir Kumar <skumar@linux.vnet.ibm.com>,
	YAMAMOTO Takashi <yamamoto@valinux.co.jp>,
	lizf@cn.fujitsu.com, linux-kernel@vger.kernel.org,
	taka@valinux.co.jp, linux-mm@kvack.org,
	David Rientjes <rientjes@google.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [RFC][0/3] Virtual address space control for cgroups (v2)
Date: Fri, 28 Mar 2008 23:43:32 +0530
Message-ID: <47ED354C.2040502@linux.vnet.ibm.com>
In-Reply-To: <6599ad830803280737lf6882bapd9707c02bf26ef12@mail.gmail.com>

Paul Menage wrote:
> On Thu, Mar 27, 2008 at 8:59 PM, Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>>  > Java (or at least, Sun's JRE) is an example of a common application
>>  > that does this. It creates a huge heap mapping at startup, and faults
>>  > it in as necessary.
>>  >
>>
>>  Isn't this controlled by the java -Xm options?
>>
> 
> Probably - that was just an example, and the behaviour of Java isn't
> exactly unreasonable. A different example would be an app that maps a
> massive database file, but only pages small amounts of it in at any
> one time.
> 
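To make the "large mapping, small footprint" pattern concrete, here is a minimal user-space sketch in C. It assumes Linux (MAP_NORESERVE is Linux-specific); the function name sparse_touch and the sizes used are hypothetical, chosen only for illustration. It reserves a large anonymous mapping and writes to only a handful of pages, so virtual address space usage is large while the memory actually faulted in stays small.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Reserve `len` bytes of address space, then touch only `npages` pages
 * spread evenly across it. Returns the number of pages touched, or -1
 * on error. (Hypothetical helper, for illustration only.)
 */
static long sparse_touch(size_t len, size_t npages)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || npages == 0 || len / (size_t)page < npages)
        return -1;

    /* The whole range counts against virtual address space right away. */
    unsigned char *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE,
                               -1, 0);
    if (base == MAP_FAILED)
        return -1;

    /* Physical pages are only allocated when a page is first written. */
    size_t stride = len / npages;
    for (size_t i = 0; i < npages; i++)
        base[i * stride] = 1;

    assert(base[0] == 1);
    return munmap(base, len) == 0 ? (long)npages : -1;
}
```

Calling sparse_touch((size_t)64 << 20, 4) maps 64 MiB but writes to only four pages. A virtual address space limit would charge the full 64 MiB, whereas a physical memory limit would see only the pages that were faulted in.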
>>  I understand, but
>>
>>  1. The system by default enforces overcommit on most distros, so why should we
>>  not have something similar, and equally flexible, for cgroups?
> 
> Right, I guess I should make it clear that I'm *not* arguing that we
> shouldn't have a virtual address space limit subsystem.
> 
> My main arguments in this and my previous email were to back up my
> assertion that there is a significant set of real-world cases where
> it doesn't help, and hence it should be a separate subsystem that can
> be turned on or off as desired.
> 
> It strikes me that when split into its own subsystem, this is going to
> be very simple - basically just a resource counter and some file
> handlers. We should probably have something like
> include/linux/rescounter_subsys_template.h, so you can do:
> 
> #define SUBSYS_NAME va
> #define SUBSYS_UNIT_SUFFIX in_bytes
> #include <linux/rescounter_subsys_template.h>
> 
> then all you have to add are the hooks to call the rescounter
> charge/uncharge functions and you're done. It would be nice to have a
> separate trivial subsystem like this for each of the rlimit types, not
> just virtual address space.
> 

OK, I'll consider doing a separate controller, once we get the mm->owner issue
sorted out.
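
As a rough illustration of the template idea quoted above, here is a user-space sketch in C of how token pasting could generate a per-subsystem counter from SUBSYS_NAME and SUBSYS_UNIT_SUFFIX. Everything in it is an assumption made for illustration: rescounter_subsys_template.h does not exist, struct res_counter here is a simplified stand-in rather than the kernel's resource counter, and the part marked as the template body would live in the included header instead of the same file.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a resource counter (not the kernel structure). */
struct res_counter {
    unsigned long long usage;  /* amount currently charged */
    unsigned long long limit;  /* maximum that may be charged */
};

/* Charge `val`; fail with -1 if that would exceed the limit. */
static int res_counter_charge(struct res_counter *rc, unsigned long long val)
{
    assert(rc != NULL);
    if (rc->usage > rc->limit || val > rc->limit - rc->usage)
        return -1;
    rc->usage += val;
    return 0;
}

/* Uncharge `val`, clamping at zero. */
static void res_counter_uncharge(struct res_counter *rc, unsigned long long val)
{
    assert(rc != NULL);
    rc->usage = val > rc->usage ? 0 : rc->usage - val;
}

/* What a subsystem author would write before including the template. */
#define SUBSYS_NAME va
#define SUBSYS_UNIT_SUFFIX in_bytes

/* ---- hypothetical template body starts here ---- */
#define PASTE2(a, b) a##_##b
#define PASTE(a, b) PASTE2(a, b)
#define STR2(x) #x
#define STR(x) STR2(x)

/* Expands to: struct va_cgroup */
struct PASTE(SUBSYS_NAME, cgroup) {
    struct res_counter res;
};

/* Expands to: va_limit_file = "va.limit_in_bytes" */
static const char *const PASTE(SUBSYS_NAME, limit_file) =
    STR(SUBSYS_NAME) ".limit_" STR(SUBSYS_UNIT_SUFFIX);

/* Expands to: va_charge(), the hook a caller would invoke on allocation. */
static int PASTE(SUBSYS_NAME, charge)(struct PASTE(SUBSYS_NAME, cgroup) *cg,
                                      unsigned long long val)
{
    return res_counter_charge(&cg->res, val);
}

/* Expands to: va_uncharge(), the hook a caller would invoke on release. */
static void PASTE(SUBSYS_NAME, uncharge)(struct PASTE(SUBSYS_NAME, cgroup) *cg,
                                         unsigned long long val)
{
    res_counter_uncharge(&cg->res, val);
}
/* ---- hypothetical template body ends here ---- */
```

With these definitions, a caller only has to invoke va_charge() and va_uncharge() at the points where address space is allocated and released. A different rlimit-style subsystem would change just the two #define lines.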

>>   And specifying
>>  > them manually requires either unusually clueful users (most of whom
>>  > have enough trouble figuring out how much physical memory they'll
>>  > need, and would just set very high virtual address space limits) or
>>  > sysadmins with way too much time on their hands ...
>>  >
>>
>>  It's a one-time thing for sysadmins to set up
>>
> 
> Sure, it's a one-time thing to set up *if* your cluster workload is
> completely static.
> 
>>  > As I said, I think focussing on ways to tell apps that they're running
>>  > low on physical memory would be much more productive.
>>  >
>>
>>  We intend to do that as well. We intend to have user space OOM notification.
> 
> We've been playing with a user-space OOM notification system at Google
> - it's on my TODO list to push it to mainline (as an independent
> subsystem, since either cpusets or the memory controller can be used
> to cause OOMs that are localized to a cgroup). What we have works
> pretty well but I think our interface is a bit too much of a kludge at
> this point.

It's good to know you have something generic working. I was planning to start
work on it later.

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL
