* memrlimit controller merge to mainline
@ 2008-07-25  8:14 Paul Menage
       [not found] ` <6599ad830807250114h7ab0fdb1u98c0968961647642-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Paul Menage @ 2008-07-25  8:14 UTC (permalink / raw)
  To: Balbir Singh
  Cc: hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org, Andrew Morton,
	Linux Containers

Hi Balbir,

Andrew included the memrlimit controller in his latest set of patches
to Linus for mainline.

Although the memrlimit controller basically works as intended, my
impression from the mini-summit on Tuesday is that our consensus is
that this still doesn't have concrete practical use-cases yet:

- avoiding swap over-use is better handled by the forthcoming swap controller

- applications that can usefully handle mmap() failing (returning
MAP_FAILED) don't really exist yet (and since the system as a whole
already allows address space overcommit limits, if it were
practical/useful to write such apps then presumably they would already
exist)
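
For reference, a minimal C sketch of what handling an mmap() failure under an address-space cap might look like. It uses the per-process RLIMIT_AS limit as a stand-in for a memrlimit cgroup limit (an assumption, since the cgroup interface itself is not shown in this thread), and the helper names limit_address_space() and try_anon_map() are hypothetical. Note that mmap() reports failure with MAP_FAILED rather than a NULL pointer.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Hypothetical helper: lower this process's soft address-space limit. */
static int limit_address_space(rlim_t max_bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_AS, &rl) != 0)
        return -1;
    rl.rlim_cur = max_bytes;    /* leave the hard limit untouched */
    return setrlimit(RLIMIT_AS, &rl);
}

/*
 * Hypothetical helper: try to map len bytes of anonymous memory.
 * Returns 0 on success, or the errno value (typically ENOMEM) on failure.
 */
static int try_anon_map(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)        /* failure is MAP_FAILED, not NULL */
        return errno;
    munmap(p, len);
    return 0;
}
```

An application that wanted to cope with the limit would have to check every mapping this way and have a fallback (smaller request, cache eviction, clean shutdown), which is the part that few programs appear to implement.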

So I think we'd be complicating some of the vm paths in mainline with
a feature that isn't likely to get a lot of real use.

What do you (and others on the containers list) think? Should we ask
Andrew/Linus to hold off on this for now? My preference would be to do
that until we have someone who can stand up with a concrete scenario
where they want to use this in a real environment.

Paul

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found] ` <6599ad830807250114h7ab0fdb1u98c0968961647642-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2008-07-25  8:25   ` Andrew Morton
       [not found]     ` <20080725012519.a5fed7d6.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
  2008-07-25  9:06   ` Hugh Dickins
  2008-07-25 12:30   ` Balbir Singh
  2 siblings, 1 reply; 35+ messages in thread
From: Andrew Morton @ 2008-07-25  8:25 UTC (permalink / raw)
  To: Paul Menage
  Cc: hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org, Linux Containers,
	Balbir Singh

On Fri, 25 Jul 2008 04:14:55 -0400 "Paul Menage" <menage-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:

> Hi Balbir,
> 
> Andrew included the memrlimit controller in his latest set of patches
> to Linus for mainline.

I've asked Linus to drop all 238 patches.  I'll be resending them minus
the offending memrlimit patches.

Did I mention that conferences suck?

* Re: memrlimit controller merge to mainline
       [not found] ` <6599ad830807250114h7ab0fdb1u98c0968961647642-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2008-07-25  8:25   ` Andrew Morton
@ 2008-07-25  9:06   ` Hugh Dickins
       [not found]     ` <Pine.LNX.4.64.0807251004570.31120-popGQ1T0qN76K7/ahGyk6A@public.gmane.org>
  2008-07-25 12:30   ` Balbir Singh
  2 siblings, 1 reply; 35+ messages in thread
From: Hugh Dickins @ 2008-07-25  9:06 UTC (permalink / raw)
  To: Paul Menage; +Cc: Andrew Morton, Linux Containers, Balbir Singh

On Fri, 25 Jul 2008, Paul Menage wrote:
> 
> So I think we'd be complicating some of the vm paths in mainline with
> a feature that isn't likely to get a lot of real use.
> 
> What do you (and others on the containers list) think? Should we ask
> Andrew/Linus to hold off on this for now? My preference would be to do
> that until we have someone who can stand up with a concrete scenario
> where they want to use this in a real environment.

I see Andrew has already acted, so it's now moot.  But I'd like to
say that I do agree with you and the conclusion to hold off for now.

I was a bit alarmed earlier to see those patches sailing on through;
but realized that I'd done very little to substantiate my "hatred of
the whole thing", and decided that I didn't feel strongly enough to
stand in the way now.  But I am glad you've stepped in, thank you.

(Different topic, but one day I ought to get around to saying again
how absurd I think a swap controller is; whereas a mem+swap controller
makes plenty of sense.  I think Rik and others said the same.)

By the way, here's a BUG I got from CONFIG_CGROUP_MEMRLIMIT_CTLR=y
but no use of it, when doing swapoff a week ago.  Not investigated
at all, I'm afraid, but at a guess it might come from memrlimit work
placing too much faith in the mm_users count - swapoff is only one
of several places which have to inc/dec mm_users for some reason.

BUG: unable to handle kernel paging request at 6b6b6b8b
IP: [<7817078f>] memrlimit_cgroup_uncharge_as+0x18/0x29
*pde = 00000000 
Oops: 0000 [#1] PREEMPT SMP 
last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
Modules linked in: acpi_cpufreq snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device thermal ac battery button

Pid: 22500, comm: swapoff Not tainted (2.6.26-rc8-mm1 #7)
EIP: 0060:[<7817078f>] EFLAGS: 00010206 CPU: 0
EIP is at memrlimit_cgroup_uncharge_as+0x18/0x29
EAX: 6b6b6b6b EBX: 7963215c ECX: 7c032000 EDX: 0025e000
ESI: 96902518 EDI: 9fbb1aa0 EBP: 7c033e9c ESP: 7c033e9c
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process swapoff (pid: 22500, ti=7c032000 task=907e2b70 task.ti=7c032000)
Stack: 7c033edc 78161323 9fbb1aa0 0000025e ffffff77 7c033ecc 96902518 00000000 
       ffffffff 7c033ec8 00000000 00000089 7963215c 9fbb1aa0 9fbb1b28 a272f040 
       7c033ef4 781226b1 9fbb1aa0 9fbb1aa0 790fa884 a272f0c8 7c033f80 78165ce3 
Call Trace:
 [<78161323>] ? exit_mmap+0xaf/0x133
 [<781226b1>] ? mmput+0x4c/0xba
 [<78165ce3>] ? try_to_unuse+0x20b/0x3f5
 [<78371534>] ? _spin_unlock+0x22/0x3c
 [<7816636a>] ? sys_swapoff+0x17b/0x37c
 [<78102d95>] ? sysenter_past_esp+0x6a/0xa5
 =======================
Code: 24 0c 00 00 8b 40 20 52 83 c0 0c 50 e8 ad a6 fd ff c9 c3 55 89 e5 8b 45 08 8b 55 0c 8b 80 30 02 00 00 c1 e2 0c 8b 80 24 0c 00 00 <8b> 40 20 52 83 c0 0c 50 e8 e6 a6 fd ff 58 5a c9 c3 55 89 e5 8b 
EIP: [<7817078f>] memrlimit_cgroup_uncharge_as+0x18/0x29 SS:ESP 0068:7c033e9c

Hugh

* Re: memrlimit controller merge to mainline
       [not found] ` <6599ad830807250114h7ab0fdb1u98c0968961647642-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2008-07-25  8:25   ` Andrew Morton
  2008-07-25  9:06   ` Hugh Dickins
@ 2008-07-25 12:30   ` Balbir Singh
       [not found]     ` <4889C77F.5090909-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
  2 siblings, 1 reply; 35+ messages in thread
From: Balbir Singh @ 2008-07-25 12:30 UTC (permalink / raw)
  To: Paul Menage
  Cc: hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org, Andrew Morton,
	Linux Containers

Paul Menage wrote:
> Hi Balbir,
> 
> Andrew included the memrlimit controller in his latest set of patches
> to Linus for mainline.
> 
> Although the memrlimit controller basically works as intended, my
> impression from the mini-summit on Tuesday is that our consensus is
> that this still doesn't have concrete practical use-cases yet:
> 
> - avoiding swap over-use is better handled by the forthcoming swap controller
> 
> - applications that can usefully handle mmap() failing (returning
> MAP_FAILED) don't really exist yet (and since the system as a whole
> already allows address space overcommit limits, if it were
> practical/useful to write such apps then presumably they would already
> exist)
> 

There are applications that can/need to handle overcommit, just that we are not
aware of them fully. Immediately after our meeting, I was pointed to
http://www.linuxfoundation.org/en/Carrier_Grade_Linux/Requirements_Alpha1#AVL.4.1_VM_Strict_Over-Commit

> So I think we'd be complicating some of the vm paths in mainline with
> a feature that isn't likely to get a lot of real use.
> 

I did disagree in the meeting and there is also the use case of the feature
forming the infrastructure for other rlimit controllers.

> What do you (and others on the containers list) think? Should we ask
> Andrew/Linus to hold off on this for now? My preference would be to do
> that until we have someone who can stand up with a concrete scenario
> where they want to use this in a real environment.

While we can argue about use cases, the feature needs more testing and I am OK
holding off/reverting the merge to make it more stable and that would give us
more time to argue on its usefulness. To say that overcommit handling is not
useful is wrong. Meanwhile, I'll go back and look at the bug report that Hugh
has posted and also look at building an mlock controller on top of memrlimits.

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

* Re: memrlimit controller merge to mainline
       [not found]     ` <20080725012519.a5fed7d6.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
@ 2008-07-25 12:56       ` Balbir Singh
  2008-07-25 12:57       ` Balbir Singh
  1 sibling, 0 replies; 35+ messages in thread
From: Balbir Singh @ 2008-07-25 12:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org, Paul Menage,
	Linux Containers

Andrew Morton wrote:
> On Fri, 25 Jul 2008 04:14:55 -0400 "Paul Menage" <menage-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
> 
>> Hi Balbir,
>>
>> Andrew included the memrlimit controller in his latest set of patches
>> to Linus for mainline.
> 
> I've asked Linus to drop all 238 patches.  I'll be resending them minus
> the offending memrlimit patches.
> 

Sorry for making your work harder.

> Did I mention that conferences suck?

Not yet, but we know now :)

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

* Re: memrlimit controller merge to mainline
       [not found]     ` <Pine.LNX.4.64.0807251004570.31120-popGQ1T0qN76K7/ahGyk6A@public.gmane.org>
@ 2008-07-25 13:32       ` Balbir Singh
       [not found]         ` <4889D5EE.4010601-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
  2008-07-25 14:06       ` Paul Menage
  2008-08-04 19:04       ` Balbir Singh
  2 siblings, 1 reply; 35+ messages in thread
From: Balbir Singh @ 2008-07-25 13:32 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Linux Containers, Paul Menage, Andrew Morton

Hugh Dickins wrote:
> On Fri, 25 Jul 2008, Paul Menage wrote:
>> So I think we'd be complicating some of the vm paths in mainline with
>> a feature that isn't likely to get a lot of real use.
>>
>> What do you (and others on the containers list) think? Should we ask
>> Andrew/Linus to hold off on this for now? My preference would be to do
>> that until we have someone who can stand up with a concrete scenario
>> where they want to use this in a real environment.
> 
> I see Andrew has already acted, so it's now moot.  But I'd like to
> say that I do agree with you and the conclusion to hold off for now.
> 
> I was a bit alarmed earlier to see those patches sailing on through;
> but realized that I'd done very little to substantiate my "hatred of
> the whole thing", and decided that I didn't feel strongly enough to
> stand in the way now.  But I am glad you've stepped in, thank you.
> 
> (Different topic, but one day I ought to get around to saying again
> how absurd I think a swap controller is; whereas a mem+swap controller
> makes plenty of sense.  I think Rik and others said the same.)
> 

We will have a memory+swap controller working together.

> By the way, here's a BUG I got from CONFIG_CGROUP_MEMRLIMIT_CTLR=y
> but no use of it, when doing swapoff a week ago.  Not investigated
> at all, I'm afraid, but at a guess it might come from memrlimit work
> placing too much faith in the mm_users count - swapoff is only one
> of several places which have to inc/dec mm_users for some reason.
> 

I'll try and reproduce the problem right away. I've been running some kernbench
on top of memrlimit (but not with a lot of stress or trying to swapoff the swap
device).

> BUG: unable to handle kernel paging request at 6b6b6b8b
> IP: [<7817078f>] memrlimit_cgroup_uncharge_as+0x18/0x29
> *pde = 00000000 
> Oops: 0000 [#1] PREEMPT SMP 
> last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
> Modules linked in: acpi_cpufreq snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device thermal ac battery button
> 
> Pid: 22500, comm: swapoff Not tainted (2.6.26-rc8-mm1 #7)
> EIP: 0060:[<7817078f>] EFLAGS: 00010206 CPU: 0
> EIP is at memrlimit_cgroup_uncharge_as+0x18/0x29
> EAX: 6b6b6b6b EBX: 7963215c ECX: 7c032000 EDX: 0025e000
> ESI: 96902518 EDI: 9fbb1aa0 EBP: 7c033e9c ESP: 7c033e9c
>  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> Process swapoff (pid: 22500, ti=7c032000 task=907e2b70 task.ti=7c032000)
> Stack: 7c033edc 78161323 9fbb1aa0 0000025e ffffff77 7c033ecc 96902518 00000000 
>        ffffffff 7c033ec8 00000000 00000089 7963215c 9fbb1aa0 9fbb1b28 a272f040 
>        7c033ef4 781226b1 9fbb1aa0 9fbb1aa0 790fa884 a272f0c8 7c033f80 78165ce3 
> Call Trace:
>  [<78161323>] ? exit_mmap+0xaf/0x133
>  [<781226b1>] ? mmput+0x4c/0xba
>  [<78165ce3>] ? try_to_unuse+0x20b/0x3f5
>  [<78371534>] ? _spin_unlock+0x22/0x3c
>  [<7816636a>] ? sys_swapoff+0x17b/0x37c
>  [<78102d95>] ? sysenter_past_esp+0x6a/0xa5
>  =======================
> Code: 24 0c 00 00 8b 40 20 52 83 c0 0c 50 e8 ad a6 fd ff c9 c3 55 89 e5 8b 45 08 8b 55 0c 8b 80 30 02 00 00 c1 e2 0c 8b 80 24 0c 00 00 <8b> 40 20 52 83 c0 0c 50 e8 e6 a6 fd ff 58 5a c9 c3 55 89 e5 8b 
> EIP: [<7817078f>] memrlimit_cgroup_uncharge_as+0x18/0x29 SS:ESP 0068:7c033e9c
> 
> Hugh

I'll try and recreate the problem and fix it. If memrlimit_cgroup_uncharge_as()
created the problem, it's most likely related to mm->owner not being correct and
we are dereferencing the wrong memory.

Thanks for the bug report, I'll look further.

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

* Re: memrlimit controller merge to mainline
       [not found]     ` <4889C77F.5090909-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2008-07-25 13:47       ` Joe MacDonald
  2008-07-25 14:11       ` Paul Menage
  1 sibling, 0 replies; 35+ messages in thread
From: Joe MacDonald @ 2008-07-25 13:47 UTC (permalink / raw)
  To: balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8
  Cc: hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org, Paul Menage,
	Linux Containers, Andrew Morton

2008/7/25 Balbir Singh <balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>:

> There are applications that can/need to handle overcommit, just that we are not
> aware of them fully. Immediately after our meeting, I was pointed to
> http://www.linuxfoundation.org/en/Carrier_Grade_Linux/Requirements_Alpha1#AVL.4.1_VM_Strict_Over-Commit

I need to get caught up on this thread, but I did promise Balbir at
the mini-summit that I would appear soon-ish with actual use-cases on
this from some of the CGL folks.  Specifically the case I was thinking
of, other than the CGL requirement for VM Strict Overcommit, was finer
grained rlimit accounting.  It started out in the Collaboration Summit
meeting in Austin as a discussion about the SCOPE gaps document and
CGOS-4.5 (curiously called Coarse Resource Enforcement, when it's
really trying to address per-thread limits).

The full document is here in PDF form:

http://www.scope-alliance.org/pr/SCOPE_CGOS_GAPS_PROFILE_v2.pdf

I'm suspecting now, though, that after re-reading the requirement from
SCOPE and the memrlimit discussion, they may in fact be disjoint sets
of functionality.

-J.

* Re: memrlimit controller merge to mainline
       [not found]     ` <Pine.LNX.4.64.0807251004570.31120-popGQ1T0qN76K7/ahGyk6A@public.gmane.org>
  2008-07-25 13:32       ` Balbir Singh
@ 2008-07-25 14:06       ` Paul Menage
       [not found]         ` <6599ad830807250706t23e483b5j18d683c0470d1d22-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2008-08-04 19:04       ` Balbir Singh
  2 siblings, 1 reply; 35+ messages in thread
From: Paul Menage @ 2008-07-25 14:06 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Andrew Morton, Linux Containers, Balbir Singh

On Fri, Jul 25, 2008 at 5:06 AM, Hugh Dickins <hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org> wrote:
>
> (Different topic, but one day I ought to get around to saying again
> how absurd I think a swap controller is; whereas a mem+swap controller
> makes plenty of sense.  I think Rik and others said the same.)

Agreed that a swap controller without a memory controller doesn't make
much sense, but a memory controller without a swap controller can make
sense on machines that don't intend to use swap.

So if they were separate controllers, we'd use the proposed cgroup
dependency features to make the swap controller depend on the memory
controller - in which case you'd only be able to mount the swap
controller on a hierarchy that also had the memory controller, and the
swap controller would be able to make use of the page ownership
information.

It's more of a modularity issue than a functionality issue, I think -
the swap controller and memory controller are tracking fundamentally
different things (space on disk versus pages in memory), and the only
dependency between them is the "memory controller" tracking the
ownership of a page and providing it to the "swap controller".

>
> By the way, here's a BUG I got from CONFIG_CGROUP_MEMRLIMIT_CTLR=y
> but no use of it, when doing swapoff a week ago.  Not investigated
> at all, I'm afraid, but at a guess it might come from memrlimit work
> placing too much faith in the mm_users count - swapoff is only one
> of several places which have to inc/dec mm_users for some reason.
>
> BUG: unable to handle kernel paging request at 6b6b6b8b

Possibly the mm->owner tracking breaks in that case, if the last user
exits while swapoff is occurring without relinquishing ownership?

That looks as though mm->owner points to a task that had been poisoned
after being freed. That could be awkward to fix :-(
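
The faulting address is consistent with that reading. A small sketch, assuming the slab free-poison byte 0x6b (POISON_FREE in include/linux/poison.h) and a hypothetical helper poison_offset(): the address 0x6b6b6b8b is the poisoned pointer 0x6b6b6b6b held in EAX plus 0x20, which matches the faulting instruction bytes <8b> 40 20 (a load from 0x20(%eax)).

```c
#include <stdint.h>

#define POISON_FREE 0x6bu   /* byte written over freed slab objects */

/*
 * Hypothetical helper: if fault_addr looks like "freed-and-poisoned
 * 32-bit pointer + small field offset", return that offset; otherwise
 * return -1. max_offset bounds what counts as a plausible field offset.
 */
static long poison_offset(uint32_t fault_addr, uint32_t max_offset)
{
    /* Replicate the poison byte into all four bytes: 0x6b6b6b6b. */
    const uint32_t poisoned_ptr = 0x01010101u * POISON_FREE;

    if (fault_addr < poisoned_ptr)
        return -1;
    if (fault_addr - poisoned_ptr > max_offset)
        return -1;
    return (long)(fault_addr - poisoned_ptr);
}
```

For the oops above, poison_offset(0x6b6b6b8b, 0x1000) gives 0x20, i.e. a field at offset 0x20 was read through a pointer loaded from already-freed memory.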

Paul

* Re: memrlimit controller merge to mainline
       [not found]     ` <4889C77F.5090909-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
  2008-07-25 13:47       ` Joe MacDonald
@ 2008-07-25 14:11       ` Paul Menage
       [not found]         ` <6599ad830807250711m4f34c447oc259b0af40f68da4-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 35+ messages in thread
From: Paul Menage @ 2008-07-25 14:11 UTC (permalink / raw)
  To: balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8
  Cc: hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org, Andrew Morton,
	Linux Containers

On Fri, Jul 25, 2008 at 8:30 AM, Balbir Singh <balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
>
> There are applications that can/need to handle overcommit, just that we are not
> aware of them fully. Immediately after our meeting, I was pointed to
> http://www.linuxfoundation.org/en/Carrier_Grade_Linux/Requirements_Alpha1#AVL.4.1_VM_Strict_Over-Commit

Thanks, that'll be interesting to take a look at.

>
>> So I think we'd be complicating some of the vm paths in mainline with
>> a feature that isn't likely to get a lot of real use.
>>
>
> I did disagree in the meeting

Yes, but (my impression of) the overall feeling in the meeting was
that it wasn't yet the right time to push it to mainline.

> and there is also the use case of the feature
> forming the infrastructure for other rlimit controllers.

Agreed, but that's something for the future.

Paul

* Re: memrlimit controller merge to mainline
       [not found]         ` <6599ad830807250711m4f34c447oc259b0af40f68da4-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2008-07-25 16:07           ` Balbir Singh
  0 siblings, 0 replies; 35+ messages in thread
From: Balbir Singh @ 2008-07-25 16:07 UTC (permalink / raw)
  To: Paul Menage
  Cc: hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org, Andrew Morton,
	Linux Containers

Paul Menage wrote:
> On Fri, Jul 25, 2008 at 8:30 AM, Balbir Singh <balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
>> There are applications that can/need to handle overcommit, just that we are not
>> aware of them fully. Immediately after our meeting, I was pointed to
>> http://www.linuxfoundation.org/en/Carrier_Grade_Linux/Requirements_Alpha1#AVL.4.1_VM_Strict_Over-Commit
> 
> Thanks, that'll be interesting to take a look at.
> 
>>> So I think we'd be complicating some of the vm paths in mainline with
>>> a feature that isn't likely to get a lot of real use.
>>>
>> I did disagree in the meeting
> 
> Yes, but (my impression of) the overall feeling in the meeting was
> that it wasn't yet the right time to push it to mainline.
> 

Yes! I need to test it more and I'll focus more on that front.

>> and there is also the use case of the feature
>> forming the infrastructure for other rlimit controllers.
> 
> Agreed, but that's something for the future.

I'll work on the mlock controller and post that as well.

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

* Re: memrlimit controller merge to mainline
       [not found]         ` <6599ad830807250706t23e483b5j18d683c0470d1d22-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2008-07-25 16:46           ` Hugh Dickins
       [not found]             ` <Pine.LNX.4.64.0807251715070.12089-popGQ1T0qN76K7/ahGyk6A@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Hugh Dickins @ 2008-07-25 16:46 UTC (permalink / raw)
  To: Paul Menage; +Cc: Andrew Morton, Rik van Riel, Linux Containers, Balbir Singh

On Fri, 25 Jul 2008, Paul Menage wrote:
> On Fri, Jul 25, 2008 at 5:06 AM, Hugh Dickins <hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org> wrote:
> >
> > (Different topic, but one day I ought to get around to saying again
> > how absurd I think a swap controller is; whereas a mem+swap controller
> > makes plenty of sense.  I think Rik and others said the same.)
> 
> Agreed that a swap controller without a memory controller doesn't make
> much sense, but a memory controller without a swap controller can make
> sense on machines that don't intend to use swap.

I agree that a memory controller without a swap controller can
make sense: I hope so, anyway, since that's what's in mainline.
Even if swap is used, memory is a more precious resource than swap,
and you were right to go about controlling memory first.

> 
> So if they were separate controllers, we'd use the proposed cgroup
> dependency features to make the swap controller depend on the memory
> controller - in which case you'd only be able to mount the swap
> controller on a hierarchy that also had the memory controller, and the
> swap controller would be able to make use of the page ownership
> information.
> 
> It's more of a modularity issue than a functionality issue, I think -
> the swap controller and memory controller are tracking fundamentally
> different things (space on disk versus pages in memory), and the only
> dependency between them is the "memory controller" tracking the
> ownership of a page and providing it to the "swap controller".

It sounds as if you're interpreting my "mem+swap controller" as a
mem controller and a swap controller and the swap controller makes
use of some of the mem controller infrastructure.

No, I'm trying to say something stronger than that.  I'm saying,
as I've said before, that I cannot imagine why anyone would want
to control swap itself - what they want to control is the total
of mem+swap.  Swap is a second-class citizen, nobody wants swap
if they can have mem, so why control it separately?

IIRC Rik expressed the same by pointing out that a cgroup at its
swap limit would then be forced to grow in mem (until it hits its
mem limit): so controlling the less precious resource would increase
pressure on the more precious resource.  (Actually, that probably
bears little relation to what he said - sorry, Rik!)  I don't recall
what answer he got, perhaps I'd be persuaded if I heard it again.

Hugh

* Re: memrlimit controller merge to mainline
       [not found]         ` <4889D5EE.4010601-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2008-07-25 17:38           ` Hugh Dickins
       [not found]             ` <Pine.LNX.4.64.0807251820440.20617-popGQ1T0qN76K7/ahGyk6A@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Hugh Dickins @ 2008-07-25 17:38 UTC (permalink / raw)
  To: Balbir Singh; +Cc: Linux Containers, Paul Menage, Andrew Morton

On Fri, 25 Jul 2008, Balbir Singh wrote:
> 
> I'll try and recreate the problem and fix it. If memrlimit_cgroup_uncharge_as()
> created the problem, it's most likely related to mm->owner not being correct and
> we are dereferencing the wrong memory.
> 
> Thanks for the bug report, I'll look further.

Good luck!  I have only seen it once, on a dual-core laptop; though
I don't remember to try swapoff while busy as often as I should (be
sure to alternate between a couple or more of swapareas, so you can
swap a new one on just before swapping an old one off, to be pretty
sure of success).

May be easier to find in the source: my suspicion is that a bad
mm_users assumption will come into it.  But I realize now that it
could be entirely unrelated to memrlimit, just that uncharge_as
was the one to get hit by bad refcounting elsewhere.

Oh, that reminds me, I never reported back on my res_counter warnings
at shutdown: never saw them again, once I added in the set of changes
you came up with shortly after that - thanks.

Hugh

* Re: memrlimit controller merge to mainline
       [not found]             ` <Pine.LNX.4.64.0807251820440.20617-popGQ1T0qN76K7/ahGyk6A@public.gmane.org>
@ 2008-07-25 19:08               ` Balbir Singh
  0 siblings, 0 replies; 35+ messages in thread
From: Balbir Singh @ 2008-07-25 19:08 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Linux Containers, Paul Menage, Andrew Morton

Hugh Dickins wrote:
> On Fri, 25 Jul 2008, Balbir Singh wrote:
>> I'll try and recreate the problem and fix it. If memrlimit_cgroup_uncharge_as()
>> created the problem, it's most likely related to mm->owner not being correct and
>> we are dereferencing the wrong memory.
>>
>> Thanks for the bug report, I'll look further.
> 
> Good luck!  I have only seen it once, on a dual-core laptop; though
> I don't remember to try swapoff while busy as often as I should (be
> sure to alternate between a couple or more of swapareas, so you can
> swap a new one on just before swapping an old one off, to be pretty
> sure of success).
> 

Thanks, that's very useful information. I would have never tried juggling swap
devices otherwise.

> May be easier to find in the source: my suspicion is that a bad
> mm_users assumption will come into it.  But I realize now that it
> could be entirely unrelated to memrlimit, just that uncharge_as
> was the one to get hit by bad refcounting elsewhere.
> 
> Oh, that reminds me, I never reported back on my res_counter warnings
> at shutdown: never saw them again, once I added in the set of changes
> you came up with shortly after that - thanks.
> 

I am glad those messages are gone, thanks for the bug report. I find bug fixing
more exciting than kernel development on most occasions.


-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

* Re: memrlimit controller merge to mainline
       [not found]             ` <Pine.LNX.4.64.0807251715070.12089-popGQ1T0qN76K7/ahGyk6A@public.gmane.org>
@ 2008-07-25 19:24               ` Paul Menage
       [not found]                 ` <6599ad830807251224g218e17faj5c8224ba398a51c8-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2008-07-25 19:28               ` Balbir Singh
  2008-07-29  6:01               ` KAMEZAWA Hiroyuki
  2 siblings, 1 reply; 35+ messages in thread
From: Paul Menage @ 2008-07-25 19:24 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Andrew Morton, Rik van Riel, Linux Containers, Balbir Singh

On Fri, Jul 25, 2008 at 12:46 PM, Hugh Dickins <hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org> wrote:
> No, I'm trying to say something stronger than that.  I'm saying,
> as I've said before, that I cannot imagine why anyone would want
> to control swap itself - what they want to control is the total
> of mem+swap.  Swap is a second-class citizen, nobody wants swap
> if they can have mem, so why control it separately?

Scheduling jobs onto machines is much more straightforward when they
request xGB of memory and yGB of swap rather than just (x+y)GB of
(memory+swap). We want to be able to guarantee to jobs that they will
be able to use xGB of real memory.
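
A toy admission check illustrates the difference. This is only a sketch with hypothetical types and function names (struct machine, struct job, fits_separate(), fits_combined()), not anything from the cgroup code: with separate limits the scheduler can reserve RAM and swap independently, while a combined limit only bounds the sum and so cannot promise how much of it will be RAM.

```c
#include <stdbool.h>

/* Free resources remaining on a machine, in GB (hypothetical model). */
struct machine {
    unsigned mem_gb;
    unsigned swap_gb;
};

/* What a job asks for, in GB (hypothetical model). */
struct job {
    unsigned mem_gb;
    unsigned swap_gb;
};

/* Separate limits: the job is admitted only if its RAM request fits in RAM. */
static bool fits_separate(const struct machine *free_res, const struct job *j)
{
    return j->mem_gb <= free_res->mem_gb &&
           j->swap_gb <= free_res->swap_gb;
}

/* Combined limit: only the total is checked, so RAM may be oversubscribed. */
static bool fits_combined(const struct machine *free_res, const struct job *j)
{
    return j->mem_gb + j->swap_gb <= free_res->mem_gb + free_res->swap_gb;
}
```

With 2GB of RAM and 8GB of swap free, a job asking for 4GB of memory and 2GB of swap passes the combined check but fails the separate one, so under a combined limit it could be placed on a machine that cannot give it its 4GB of real memory.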

Actually my preferred approach to swap controlling would be something like:

- allow malloc to support mmapping pages from a temporary file rather
than mmapping anonymous memory

- add an fcntl/ioctl that marks a file as skipping dirty background
write, i.e. only write out dirty pages from this file when we find
them in reclaim rather than generally in the background. This makes
them act more like anonymous pages in terms of write-out.

- don't allow these jobs to use real swap
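
The first item can be sketched in userspace C roughly as follows. This is an illustration under assumptions, not an existing malloc feature: the helper names file_backed_alloc() and file_backed_free() and the /tmp path are hypothetical, and the proposed fcntl/ioctl for skipping background writeback does not exist, so it is not shown.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: return len bytes of memory backed by an unlinked
 * temporary file instead of anonymous memory, or NULL on failure.
 * Dirty pages are written to the file rather than to swap.
 */
static void *file_backed_alloc(size_t len)
{
    char path[] = "/tmp/fbmemXXXXXX";   /* hypothetical location */
    int fd = mkstemp(path);
    void *p;

    if (fd < 0)
        return NULL;
    unlink(path);                       /* file disappears once unmapped */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                          /* the mapping keeps the file alive */
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: release memory obtained from file_backed_alloc(). */
static void file_backed_free(void *p, size_t len)
{
    if (p != NULL)
        munmap(p, len);
}
```

Because the pages belong to a regular file, their space would be accounted as disk usage and their write-out as I/O, which is what lets the scheme avoid tracking swap separately.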

That way you don't need to track swap space at all, assuming you're
tracking disk space usage, memory and I/O - the app gets to do its own
swapping. But I realise this might not work in a general environment.

Paul

* Re: memrlimit controller merge to mainline
       [not found]             ` <Pine.LNX.4.64.0807251715070.12089-popGQ1T0qN76K7/ahGyk6A@public.gmane.org>
  2008-07-25 19:24               ` Paul Menage
@ 2008-07-25 19:28               ` Balbir Singh
       [not found]                 ` <488A294B.4090609-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
  2008-07-29  6:01               ` KAMEZAWA Hiroyuki
  2 siblings, 1 reply; 35+ messages in thread
From: Balbir Singh @ 2008-07-25 19:28 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Linux Containers, Paul Menage, Andrew Morton, Rik van Riel

Hugh Dickins wrote:
> On Fri, 25 Jul 2008, Paul Menage wrote:
>> On Fri, Jul 25, 2008 at 5:06 AM, Hugh Dickins <hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org> wrote:
>>> (Different topic, but one day I ought to get around to saying again
>>> how absurd I think a swap controller is; whereas a mem+swap controller
>>> makes plenty of sense.  I think Rik and others said the same.)
>> Agreed that a swap controller without a memory controller doesn't make
>> much sense, but a memory controller without a swap controller can make
>> sense on machines that don't intend to use swap.
> 
> I agree that a memory controller without a swap controller can
> make sense: I hope so, anyway, since that's what's in mainline.
> Even if swap is used, memory is a more precious resource than swap,
> and you were right to go about controlling memory first.
> 

Yes, I agree.

>> So if they were separate controllers, we'd use the proposed cgroup
>> dependency features to make the swap controller depend on the memory
>> controller - in which case you'd only be able to mount the swap
>> controller on a hierarchy that also had the memory controller, and the
>> swap controller would be able to make use of the page ownership
>> information.
>>
>> It's more of a modularity issue than a functionality issue, I think -
>> the swap controller and memory controller are tracking fundamentally
>> different things (space on disk versus pages in memory), and the only
>> dependency between them is the "memory controller" tracking the
>> ownership of a page and providing it to the "swap controller".
> 
> It sounds as if you're interpreting my "mem+swap controller" as a
> mem controller and a swap controller and the swap controller makes
> use of some of the mem controller infrastructure.
> 
> No, I'm trying to say something stronger than that.  I'm saying,
> as I've said before, that I cannot imagine why anyone would want
> to control swap itself - what they want to control is the total
> of mem+swap.  Swap is a second-class citizen, nobody wants swap
> if they can have mem, so why control it separately?
> 
> IIRC Rik expressed the same by pointing out that a cgroup at its
> swap limit would then be forced to grow in mem (until it hits its
> mem limit): so controlling the less precious resource would increase
> pressure on the more precious resource.  (Actually, that probably
> bears little relation to what he said - sorry, Rik!)  I don't recall
> what answer he got, perhaps I'd be persuaded if I heard it again.
> 

I see what you're saying. When you look at Linux right now, we control swap
independent of memory, so I am not totally opposed to setting swap, instead of
swap+mem. I might not want to swap from a particular cgroup, in which case, I
set swap to 0 and risk OOMing, which might be an acceptable trade-off depending
on my setup. I could easily change this policy on demand and add swap if OOMing
was no longer OK.

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found]             ` <Pine.LNX.4.64.0807251715070.12089-popGQ1T0qN76K7/ahGyk6A@public.gmane.org>
  2008-07-25 19:24               ` Paul Menage
  2008-07-25 19:28               ` Balbir Singh
@ 2008-07-29  6:01               ` KAMEZAWA Hiroyuki
       [not found]                 ` <20080729150111.f879c989.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
  2 siblings, 1 reply; 35+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-07-29  6:01 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Rik van Riel, Containers,
	nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil@public.gmane.org,
	Linux-FOgKQjlUJ6BQetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Paul Menage,
	Andrew Morton, Balbir Singh

On Fri, 25 Jul 2008 17:46:45 +0100 (BST)
Hugh Dickins <hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org> wrote:

> IIRC Rik expressed the same by pointing out that a cgroup at its
> swap limit would then be forced to grow in mem (until it hits its
> mem limit): so controlling the less precious resource would increase
> pressure on the more precious resource.  (Actually, that probably
> bears little relation to what he said - sorry, Rik!)  I don't recall
> what answer he got, perhaps I'd be persuaded if I heard it again.
> 
Added Nishimura to CC.

IMHO, from the user's point of view, there is not much difference between
 - having 2 controls, a mem controller + a swap controller, and
 - a mem + swap controller.
Users will use whichever they like.

From the memory controller's point of view, handling mem+swap in the same
controller makes sense. Because the memory controller can check whether we
can use more swap or not, we can avoid hopeless scanning of Anon pages when
swap is short. (With split-lru, I think we can do this avoidance.)
 
Another-Topic?

On recent servers, memory is big and swap is (relatively) small.
And under the memory resource controller, the whole swap is easily occupied
by one group. I want to avoid that.

For users, swap is not precious because it's not fast.
But for memory reclaim, swap is a precious resource for paging out
anonymous/shmem/tmpfs memory. I think the usual system admin considers swap
some emergency spare of memory. I'd like to allow this "emergency spare" to
each cgroup.
(For example, swap is used even if vm.swappiness==0. This is to avoid the
 OOM killer in some situations; this behavior was added by Rik.)


== the following is another use case I explained to Rik on 23/May/08 ==

IIRC, someone explained his motivation for controlling swap at the OLS2007
BOF as follows. Consider the following system (with no swap controller):
Memory 4G, swap 1G, with 2 cgroups A and B.

state 1) swap is not used.
  A....memory limit to be 1G  no swap usage memory_usage=0M
  B....memory limit to be 1G  no swap usage memory_usage=0M

state 2) Run a big program on A.
  A....memory limit to be 1G and try to use 1.7G. uses 700MBytes of swap.
       memory_usage=1G swap_usage=700M
  B....memory_usage=0M

state 3) Some of the programs in A end.
  A....memory_usage=500M swap_usage=700M
  B....memory_usage=0M.

state 4) Run a big program on B.
  A...memory_usage=500M swap_usage=700M.
  B...memory_usage=1G   swap_usage=300M

Group B can only use 1.3G because of the unfair swap use by group A.
But users will wonder why A uses 700M of swap with 500M of free memory....
==
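
A minimal sketch of the arithmetic in the four states above, in Python.
The model is an assumption made for illustration, not kernel behaviour:
whatever a group wants beyond its memory limit goes to swap as long as
free swap remains, and swapped pages stay on swap until they are used
again. The name run() and the 2G demand for B are hypothetical.

```python
# Toy model of the example: sizes in MB, no swap controller.
SWAP_TOTAL = 1000   # 1G of swap shared by all groups
MEM_LIMIT = 1000    # per-group memory limit (1G)


def run(demand_mb, swap_free_mb, mem_limit_mb=MEM_LIMIT):
    """Return (memory_usage, swap_usage) for a group that wants demand_mb.

    Anything above the memory limit is pushed to swap, but only as far
    as free swap allows; the rest of the demand cannot be satisfied.
    """
    mem = min(demand_mb, mem_limit_mb)
    swap = min(demand_mb - mem, swap_free_mb)
    return mem, swap


# state 2: A tries to use 1.7G
a_mem, a_swap = run(1700, SWAP_TOTAL)
# state 3: some programs in A end; A's swapped pages stay on swap
a_mem = 500
# state 4: B wants a lot (2G here), but only the swap A left over is free
b_mem, b_swap = run(2000, SWAP_TOTAL - a_swap)

print("A:", a_mem, a_swap)
print("B:", b_mem, b_swap, "total", b_mem + b_swap)
```

Under these assumptions B ends up with 1G of memory plus the 300M of swap
that A did not take, which is the 1.3G mentioned above.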



Thanks,
-Kame

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found]                 ` <20080729150111.f879c989.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
@ 2008-07-30  0:16                   ` Hugh Dickins
       [not found]                     ` <Pine.LNX.4.64.0807300113200.14699-popGQ1T0qN76K7/ahGyk6A@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Hugh Dickins @ 2008-07-30  0:16 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Rik van Riel, Linux Containers,
	nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil@public.gmane.org,
	Paul Menage, Andrew Morton, Balbir Singh

On Tue, 29 Jul 2008, KAMEZAWA Hiroyuki wrote:
> On Fri, 25 Jul 2008 17:46:45 +0100 (BST)
> Hugh Dickins <hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org> wrote:
> 
> > IIRC Rik expressed the same by pointing out that a cgroup at its
> > swap limit would then be forced to grow in mem (until it hits its
> > mem limit): so controlling the less precious resource would increase
> > pressure on the more precious resource.  (Actually, that probably
> > bears little relation to what he said - sorry, Rik!)  I don't recall
> > what answer he got, perhaps I'd be persuaded if I heard it again.
> > 
> Added Nishimura to CC.
> 
> IMHO, from user point of view, both of
>  - having 2 controls as mem controller + swap controller
>  - mem + swap controller
> doesn't have much difference. The users will use as they like.

I'm not suggesting either one of those alternatives.

I'm suggesting we have a mem controller (the thing we already have)
and a mem+swap controller (which we don't yet have: a controller
for the total mem+swap of a cgroup); the mem+swap controller likely
making use of much that is in the mem controller, as Paul has said.

(Unfortunately I don't have a good name for this "mem+swap".)

I happen to believe that the mem+swap controller would actually be
a lot more useful than the current mem controller, and would expect
many to run with mem+swap controller enabled but mem controller
disabled or unlimited.  How much is mem and how much is swap being
left to global reclaim to decide, not imposed by any cgroup policy.

What I don't like the sound of at all is a swap controller.  Do you
think that a mem controller (limit 1G) and a mem+swap controller
(limit 2G) is equivalent to a mem controller (limit 1G) and a
swap controller (limit 1G)?  No: imagine memory pressure from
outside the cgroup - with the mem+swap controller it can push as
much as suits of the 2G out to swap; whereas with the swap controller,
once 1G is out, it has to stop pushing any more of that cgroup out.
I think that's absurd - but perhaps I just haven't looked, and
I've totally misinterpreted the talk of a swap controller.
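
The difference between the two configurations can be sketched as follows.
This is a toy model in Python, not kernel code; max_swapped_out() is a
hypothetical name, and the model assumes that a mem+swap limit never
blocks swap-out, because moving a page from memory to swap leaves the
mem+swap total unchanged.

```python
def max_swapped_out(usage_mb, swap_limit_mb=None):
    """Upper bound on how much of a cgroup outside pressure can push to swap.

    swap_limit_mb=None models a mem+swap controller: only the total is
    limited, so every page the cgroup holds may end up on swap.
    A number models a separate swap controller with that limit.
    """
    if swap_limit_mb is None:
        return usage_mb
    return min(usage_mb, swap_limit_mb)


usage = 2000  # the cgroup holds 2G in total (memory plus swap)

# mem controller (1G) + mem+swap controller (2G)
print(max_swapped_out(usage))
# mem controller (1G) + swap controller (1G)
print(max_swapped_out(usage, swap_limit_mb=1000))
```

In this model the first configuration lets all 2G go out to swap, while
the second stops at 1G.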

> 
> From memory controller's point of view, treating mem+swap by the same
> controller makes sense. Because memory controller can check whether we can use
> more swap or not, we can avoid hopeless-scanning of Anon at swap-shortage.
> (By split-lru, I think we can do this avoidance.)

That's a detail I'm not concerned with on this level.

>  
> Another-Topic?
> 
> In recent servers, memory is big, swap is (relatively) small.

You'll know much more about those common proportions than I do.
I'd wonder why such big memory servers have any swap at all:
to cope with VM management defects we should be fixing?

> And under memory resource controller, the whole swap is easily occupied
> by a group. I want to avoid it.

Why?  I presume because you're thinking it a precious resource.
I don't think its relative smallness makes it more precious.

> 
> For users, swap is not precious because it's not fast. 

Yes, and that's my view.

> But for memory reclaiming, swap is precious resource to page out
> anonymous/shmem/tmpfs memory.

I see that makes swap a useful resource, I don't see that it makes
it a precious resource.  We page out to it precisely because it's
less precious than the memory; both users and kernel would much
prefer to keep all the data in memory, but sometimes there isn't
enough memory so we go to swap.

There is just one way in which I see swap as precious, and that
is to get around some VM management stupidity.  If, for example,
on i386 there's a shortage of lowmem and lots of anonymous in lowmem
that we should shift to highmem, then I think it's still the case
that we have to do that balancing via writing out to and reading
in from swap, because nobody has actually hooked up page migration
to do that when appropriate?  But that's an argument for extending
page migration, not for needing a swap controller.

> I think usual system-admin considers swap as some emergency spare of memory.

Yes, I do too.

> I'd like to allow this "emergency spare" to each cgroup.

We do allow that emergency spare to each cgroup.  Perhaps you're
saying you want to divide it up in advance between the cgroups?
But why?  Sounds like a nice idea (reminds me of what Paul said
about using temporary files), but a solution to what problem?

> (For example, swap is used even if vm.swappiness==0. This is for avoiding
> OOM-Killer under some situation, this behavior is added by Rik.)

Sorry, I don't know what you're referring to there, but again,
suspect it's a detail we don't need to be concerned with here.

> 
> == following is another use case I explained to Rik at 23/May/08 ==
> 
> IIRC, someone explained his motivation for controlling swap at the OLS2007
> BOF as follows. Consider the following system (with no swap controller):
> Memory 4G, swap 1G, with 2 cgroups A and B.
> 
> state 1) swap is not used.
>   A....memory limit to be 1G  no swap usage memory_usage=0M
>   B....memory limit to be 1G  no swap usage memory_usage=0M
> 
> state 2) Run a big program on A.
>   A....memory limit to be 1G and try to use 1.7G. uses 700MBytes of swap.
>        memory_usage=1G swap_usage=700M
>   B....memory_usage=0M
> 
> state 3) Some of the programs in A end.
>   A....memory_usage=500M swap_usage=700M
>   B....memory_usage=0M.
> 
> state 4) Run a big program on B.
>   A...memory_usage=500M swap_usage=700M.
>   B...memory_usage=1G   swap_usage=300M

Right, thanks a lot for looking that out again, it's a good example
which helped to focus my mind.  But I don't think I'm learning from
it what you intended.

If you believe a swap controller would make that better, what limits
do you suggest?  If you assign A a swap limit of 700M or above, it
changes nothing; if you assign A a swap limit below 700M, it cannot
do all the work that it could do in the example.

The example tells me two things: one, that artificial limits can
indeed push you into awkward corners; two, that a mem+swap controller
makes more sense than a mem controller - give both A and B a mem+swap
limit of 2.5G, or 1.7G even, they'll run much better that way.
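
A rough sketch of state 4 under a mem+swap limit, in Python. It is only
an illustration under stated assumptions: run_memsw() is a hypothetical
name, there is no separate mem limit, the only other memory user is A,
and a group stays in memory whenever memory is free.

```python
MEM_TOTAL = 4000    # 4G of memory, in MB
SWAP_TOTAL = 1000   # 1G of swap


def run_memsw(demand_mb, memsw_limit_mb, mem_free_mb, swap_free_mb):
    """Return (memory_usage, swap_usage) under a single mem+swap limit.

    Only the total is capped; the group stays in memory while memory is
    free and spills to swap only when memory runs out.
    """
    total = min(demand_mb, memsw_limit_mb)
    mem = min(total, mem_free_mb)
    swap = min(total - mem, swap_free_mb)
    return mem, swap


# A as in state 3: 500M in memory, 700M still on swap
a_mem, a_swap = 500, 700

# B wants 2G and has a mem+swap limit of 1.7G
b_mem, b_swap = run_memsw(2000, 1700, MEM_TOTAL - a_mem, SWAP_TOTAL - a_swap)
print("B:", b_mem, b_swap)
```

With these numbers B gets its full 1.7G, all of it in memory, instead of
the 1.3G of the original example.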

(Three: we should have a way of migrating pages back from swap,
other than use or swapoff?  Certainly there are arguments for
swap prefetch, but I don't see this as one of them: let A's
pages stay on swap until A needs them in memory, why not?)

> 
> Group B can only use 1.3G because of unfair swap use of group A.

"unfair swap use"!  A is _disadvantaged_ by having its pages out
on swap, or will be disadvantaged if it ever needs them again.
The anomaly comes from imposing a low mem limit on B instead of
a more liberal mem+swap limit.

> But users think why A uses 700M of swap with 500M of free memory....

Because at this time A isn't actively using any of that 700M.

Hugh

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found]                 ` <6599ad830807251224g218e17faj5c8224ba398a51c8-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2008-07-30  0:31                   ` Hugh Dickins
       [not found]                     ` <Pine.LNX.4.64.0807300117210.14699-popGQ1T0qN76K7/ahGyk6A@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Hugh Dickins @ 2008-07-30  0:31 UTC (permalink / raw)
  To: Paul Menage; +Cc: Andrew Morton, Rik van Riel, Linux Containers, Balbir Singh

On Fri, 25 Jul 2008, Paul Menage wrote:
> On Fri, Jul 25, 2008 at 12:46 PM, Hugh Dickins <hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org> wrote:
> > No, I'm trying to say something stronger than that.  I'm saying,
> > as I've said before, that I cannot imagine why anyone would want
> > to control swap itself - what they want to control is the total
> > of mem+swap.  Swap is a second-class citizen, nobody wants swap
> > if they can have mem, so why control it separately?
> 
> Scheduling jobs on to machines is much more straightforward when they
> request xGB of memory and yGB of swap rather than just (x+y)GB of
> (memory+swap). We want to be able to guarantee to jobs that they will
> be able to use xGB of real memory.

I don't see that I'm denying you a way to guarantee that (though I've
been thinking more of the limits than the guarantees): I'm not saying
that you cannot have a mem controller, I'm saying that you can also
have a mem+swap controller; but that a swap-by-itself controller
makes no sense to me.

> Actually my preferred approach to swap controlling would be something like:
> 
> - allow malloc to support mmaping pages from a temporary file rather
> than mmapping anonymous memory

I think that works until you get to fork: shared files and
private/anonymous/swap behave differently from then on.

Hugh

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found]                     ` <Pine.LNX.4.64.0807300117210.14699-popGQ1T0qN76K7/ahGyk6A@public.gmane.org>
@ 2008-07-30  0:33                       ` Paul Menage
  0 siblings, 0 replies; 35+ messages in thread
From: Paul Menage @ 2008-07-30  0:33 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Andrew Morton, Rik van Riel, Linux Containers, Balbir Singh

On Tue, Jul 29, 2008 at 5:31 PM, Hugh Dickins <hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org> wrote:
>
> I don't see that I'm denying you a way to guarantee that (though I've
> been thinking more of the limits than the guarantees): I'm not saying
> that you cannot have a mem controller, I'm saying that you can also
> have a mem+swap controller; but that a swap-by-itself controller
> makes no sense to me.

OK, fair enough.

>
> I think that works until you get to fork: shared files and
> private/anonymous/swap behave differently from then on.
>

Good point. It works as long as you never do a plain fork() without
immediate execve() though.

Paul

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found]                 ` <488A294B.4090609-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2008-07-30  0:48                   ` Hugh Dickins
  0 siblings, 0 replies; 35+ messages in thread
From: Hugh Dickins @ 2008-07-30  0:48 UTC (permalink / raw)
  To: Balbir Singh; +Cc: Linux Containers, Paul Menage, Andrew Morton, Rik van Riel

On Fri, 25 Jul 2008, Balbir Singh wrote:
> 
> I see what you're saying. When you look at Linux right now, we control swap
> independent of memory, so I am not totally opposed to setting swap, instead of
> swap+mem. I might not want to swap from a particular cgroup, in which case, I
> set swap to 0 and risk OOMing, which might be an acceptable trade-off depending
> on my setup. I could easily change this policy on demand and add swap if OOMing
> was no longer OK.

It's taken me a while to understand your point.  I think you're
saying that with a swap controller, you can set the swap limit to 0
on a cgroup if you want to keep it entirely in memory, without setting
any mem limit upon it; whereas with my mem+swap controller, you'd have
to set a mem limit then an equal mem+swap limit to achieve the same
"never go to swap" effect, and maybe you don't want to set a mem limit.

Hmm, but an unreachably high mem limit, and equal mem+swap limit,
would achieve that effect.  Sorry, I don't think I have understood
(and even if the unreachably high limit didn't work, this seems more
about setting a don't-swap flag than imposing a swap limit).

Hugh

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found]                     ` <Pine.LNX.4.64.0807300113200.14699-popGQ1T0qN76K7/ahGyk6A@public.gmane.org>
@ 2008-07-30  1:17                       ` KAMEZAWA Hiroyuki
       [not found]                         ` <20080730101719.5eb18635.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-07-30  1:17 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Rik van Riel, Containers,
	nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil@public.gmane.org,
	Linux-FOgKQjlUJ6BQetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Paul Menage,
	Andrew Morton, Balbir Singh

On Wed, 30 Jul 2008 01:16:17 +0100 (BST)
Hugh Dickins <hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org> wrote:

> On Tue, 29 Jul 2008, KAMEZAWA Hiroyuki wrote:
> > On Fri, 25 Jul 2008 17:46:45 +0100 (BST)
> > Hugh Dickins <hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org> wrote:
> > 
> > > IIRC Rik expressed the same by pointing out that a cgroup at its
> > > swap limit would then be forced to grow in mem (until it hits its
> > > mem limit): so controlling the less precious resource would increase
> > > pressure on the more precious resource.  (Actually, that probably
> > > bears little relation to what he said - sorry, Rik!)  I don't recall
> > > what answer he got, perhaps I'd be persuaded if I heard it again.
> > > 
> > Added Nishimura to CC.
> > 
> > IMHO, from user point of view, both of
> >  - having 2 controls as mem controller + swap controller
> >  - mem + swap controller
> > doesn't have much difference. The users will use as they like.
> 
> I'm not suggesting either one of those alternatives.
> 
> I'm suggesting we have a mem controller (the thing we already have)
> and a mem+swap controller (which we don't yet have: a controller
> for the total mem+swap of a cgroup); the mem+swap controller likely
> making use of much that is in the mem controller, as Paul has said.
> 
Ah, so what "mem+swap controller" means is limiting mem+swap by 'a' single
limit? That's an option for me. From the view of global LRU management, it's
better. As long as we can avoid an accident where the swap is fully used by
some silly program, anything is OK with me.

How about you, Nishimura-san ?

The story I told is based on the assumption that there may not be enough
swap space relative to memory. We can ask customers to provision tons of
swap when memory is huge. BTW, what is the maximum swap size now?
Can we extend it if it's small?


<snip>
> > state 4) Run a big program on B.
> >   A...memory_usage=500M swap_usage=700M.
> >   B...memory_usage=1G   swap_usage=300M
> If you believe a swap controller would make that better, what limits
> do you suggest?  If you assign A a swap limit of 700M or above, it
> changes nothing; if you assign A a swap limit below 700M, it cannot
> do all the work that it could do in the example.

Of course: set A's swap_limit to 300M, bring its swap pages back into memory,
free the swap entries and keep A in memory (before B starts).

> > But users think why A uses 700M of swap with 500M of free memory....
> 
> Because at this time A isn't actively using any of that 700M.

That's a weakness of "do everything by automatic detection and an ideal
algorithm". It's just a result of the LRU algorithm, which is not always
what users think is ideal.


Thanks,
-Kame

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found]                         ` <20080730101719.5eb18635.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
@ 2008-07-30  2:16                           ` KAMEZAWA Hiroyuki
  2008-07-30  2:52                           ` KAMEZAWA Hiroyuki
  2008-07-30  4:23                           ` Daisuke Nishimura
  2 siblings, 0 replies; 35+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-07-30  2:16 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Rik van Riel, Containers,
	nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil@public.gmane.org,
	Andrew-FOgKQjlUJ6BQetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Linux-FOgKQjlUJ6BQetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Hugh Dickins,
	Paul Menage, Morton, Balbir Singh

On Wed, 30 Jul 2008 10:17:19 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:

> A story I talked is based on the assumption that there may be not enough swap
> space against memory. We can ask customers to equip tons of swap when
> memory is huge. BTW, what is the maximum swap size now ?
> Can we extend it if it's small ?
> 

I'm sorry for the noise. (Maybe we can use terabytes in 64-bit land.)

BTW, Red Hat's RHEL installation guide says

M=memory size
S=swap size

If M < 2G
   S = M * 2
else
   S = M + 2

as the recommended value (for RHEL5). And maybe other guides say the same
kind of thing....
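
The rule above, written out as a small Python function. This only
restates the formula as quoted in this mail (sizes in GB); it has not
been checked against the guide itself, and recommended_swap_gb() is a
hypothetical name.

```python
def recommended_swap_gb(mem_gb):
    """Swap size in GB for a machine with mem_gb of memory.

    Follows the rule quoted above: twice the memory below 2G,
    memory plus 2G otherwise.
    """
    if mem_gb < 2:
        return mem_gb * 2
    return mem_gb + 2


for mem in (1, 2, 4, 64):
    print(mem, "GB memory ->", recommended_swap_gb(mem), "GB swap")
```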

Maybe we can have enough swap space in the usual case. Hmm.

Thanks,
-Kame

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found]                         ` <20080730101719.5eb18635.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
  2008-07-30  2:16                           ` KAMEZAWA Hiroyuki
@ 2008-07-30  2:52                           ` KAMEZAWA Hiroyuki
       [not found]                             ` <20080730115226.3fec2540.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
  2008-07-30  4:23                           ` Daisuke Nishimura
  2 siblings, 1 reply; 35+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-07-30  2:52 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Rik van Riel, Containers,
	nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil@public.gmane.org,
	Andrew-FOgKQjlUJ6BQetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Linux-FOgKQjlUJ6BQetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Hugh Dickins,
	Paul Menage, Morton, Balbir Singh

On Wed, 30 Jul 2008 10:17:19 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:

> On Wed, 30 Jul 2008 01:16:17 +0100 (BST)
> Hugh Dickins <hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org> wrote:
> 
> > On Tue, 29 Jul 2008, KAMEZAWA Hiroyuki wrote:
> > > On Fri, 25 Jul 2008 17:46:45 +0100 (BST)
> > > Hugh Dickins <hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org> wrote:
> > > 
> > > > IIRC Rik expressed the same by pointing out that a cgroup at its
> > > > swap limit would then be forced to grow in mem (until it hits its
> > > > mem limit): so controlling the less precious resource would increase
> > > > pressure on the more precious resource.  (Actually, that probably
> > > > bears little relation to what he said - sorry, Rik!)  I don't recall
> > > > what answer he got, perhaps I'd be persuaded if I heard it again.
> > > > 
> > > Added Nishimura to CC.
> > > 
> > > IMHO, from user point of view, both of
> > >  - having 2 controls as mem controller + swap controller
> > >  - mem + swap controller
> > > doesn't have much difference. The users will use as they like.
> > 
> > I'm not suggesting either one of those alternatives.
> > 
> > I'm suggesting we have a mem controller (the thing we already have)
> > and a mem+swap controller (which we don't yet have: a controller
> > for the total mem+swap of a cgroup); the mem+swap controller likely
> > making use of much that is in the mem controller, as Paul has said.
> > 
> > Ah, what mem+swap controller means is limiting mem+swap by 'a' limit ?
> It's a choice for me. From view of global LRU management, it's better.
> If we can avoid an accident that the swap is fully used by some silly program,
> anything is ok to me.
> 
Hmm.

A mem+swap controller means that shrinking under the memory resource
controller (try_to_free_mem_cgroup_pages()) should drop only file caches.
(Because kicking pages out to swap will never change the usage.)

Right? Only the global LRU can cause swap-out.
Maybe I can add an optimization to do this. Hmm. I should see how OOM works
in some situations.

Thanks,
-Kame

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found]                             ` <20080730115226.3fec2540.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
@ 2008-07-30  3:11                               ` KAMEZAWA Hiroyuki
       [not found]                                 ` <20080730121115.b1e3a7be.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-07-30  3:11 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Rik van Riel, Containers,
	nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil@public.gmane.org,
	Andrew-FOgKQjlUJ6BQetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Linux-FOgKQjlUJ6BQetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Hugh Dickins,
	Paul Menage, Morton, Balbir Singh

On Wed, 30 Jul 2008 11:52:26 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
> mem+swap controller means a shrink to memory resource controller 
> (try_to_free_mem_cgroup_pages()) should drop only file caches.
> (Because kick-out-to-swap will never changes the usage.)
> 
> right ? only global-lru can make a swap.
> maybe I can add optimization to do this. Hmm. I should see how OOM works
> under some situation.
> 
(I'm sorry that I'm not a good writer of e-mail.)

A brief summary of the changes to the mem controller:

 - the mem+swap controller limits the sum of # of pages and swap_entries.
 - the mem+swap controller just drops file caches when it reaches its limit.
 - under the mem+swap controller, reclaiming Anon pages makes no sense.
   Then,
      - an LRU for Anon is not necessary.
      - an LRU for tmpfs/shmem is not necessary.
      Just showing the accounting is better.
 - we should look at try_to_free_mem_cgroup_pages() again to avoid too many
   OOMs. Maybe Retries=5 is too small because we never swap out ourselves.
   A problem like getting stuck in the ext3 journal can easily make
   file-cache reclaim difficult.
 - some changes to the documentation are needed.
 - Should we have an on/off switch for taking swap into account?
   Or should we implement the mem+swap controller under a different name
   than the "memory" controller?
   If swap is not accounted, we need to do swap-out in the memory reclaim
   path again.
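
A toy model of the accounting point in this summary, in Python. The
class and method names are hypothetical and this is not how the kernel
counter is implemented; it only shows why, once pages and swap entries
are charged to one counter, swapping Anon out does not lower the usage
while dropping file cache does.

```python
class MemSwapCounter:
    """Hypothetical counter that charges pages and swap entries together."""

    def __init__(self, limit):
        self.limit = limit
        self.anon = 0   # anonymous pages in memory
        self.file = 0   # file cache pages in memory
        self.swap = 0   # swap entries

    @property
    def usage(self):
        # Pages and swap entries count against the same limit.
        return self.anon + self.file + self.swap

    def swap_out(self, n):
        """Move up to n anon pages to swap; the total usage is unchanged."""
        n = min(n, self.anon)
        self.anon -= n
        self.swap += n
        return n

    def drop_file_cache(self, n):
        """Drop up to n file cache pages; this really lowers the usage."""
        n = min(n, self.file)
        self.file -= n
        return n


c = MemSwapCounter(limit=1000)
c.anon, c.file = 600, 400           # the group is at its limit

c.swap_out(300)
usage_after_swap_out = c.usage      # still at the limit

c.drop_file_cache(200)
usage_after_drop = c.usage          # now below the limit

print(usage_after_swap_out, usage_after_drop)
```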
   

Thanks,
-Kame

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found]                                 ` <20080730121115.b1e3a7be.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
@ 2008-07-30  4:14                                   ` KAMEZAWA Hiroyuki
       [not found]                                     ` <20080730131407.526d323b.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
  2008-07-30  5:40                                   ` KAMEZAWA Hiroyuki
  1 sibling, 1 reply; 35+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-07-30  4:14 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Rik van Riel, Containers,
	nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil@public.gmane.org,
	Andrew-FOgKQjlUJ6BQetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Linux-FOgKQjlUJ6BQetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Hugh Dickins,
	Paul Menage, Morton, Balbir Singh

On Wed, 30 Jul 2008 12:11:15 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:

> On Wed, 30 Jul 2008 11:52:26 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
> > mem+swap controller means a shrink to memory resource controller 
> > (try_to_free_mem_cgroup_pages()) should drop only file caches.
> > (Because kick-out-to-swap will never changes the usage.)
> > 
> > right ? only global-lru can make a swap.
> > maybe I can add optimization to do this. Hmm. I should see how OOM works
> > under some situation.
> > 
> (I'm sorry that I'm not a good writer of e-mail.)
> 
> A brief summary about changes to mem controller.
> 
>  - mem+swap controller which limits the # sum of pages and swap_entries.
>  - mem+swap controller just drops file caches when it reaches limit.
>  - under mem+swap controller, reclaiming Anon pages make no sense.
>    Then,
>       - LRU for Anon is not necessary.
>       - LRU for tmpfs/shmem is not necessary.
>       just showing account is better.
>  - we should see try_to_free_mem_cgroup() again to avoid too much OOM.
>    Maybe Retries=5 is too small because we never do swap under us.
>    a problem like stuck-in-ext3-journal can easily make file-cache reclaim
>    difficult.
>  - need some changes to documentation.
>  - Should we have on/off switch of taking swap into account ?
>    or should we implement mem+swap controller in different name than
>    "memory" controller ?
>    If swap is not accounted, we need to do swap-out in memory reclaiming path,
>    again.
>    
Then, the mem+swap controller finally means:
 - under the mem+swap controller, programs run without swap. Only the
   global LRU may swap pages out.
 - if swap-accounting mode is off, swap can be used without limit.

Hmm, that sounds a bit different from what I want. What do others think?

Thanks,
-Kame



> 
> Thanks,
> -Kame
> 
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found]                         ` <20080730101719.5eb18635.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
  2008-07-30  2:16                           ` KAMEZAWA Hiroyuki
  2008-07-30  2:52                           ` KAMEZAWA Hiroyuki
@ 2008-07-30  4:23                           ` Daisuke Nishimura
  2 siblings, 0 replies; 35+ messages in thread
From: Daisuke Nishimura @ 2008-07-30  4:23 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Rik van Riel, Containers,
	nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil,
	Linux-FOgKQjlUJ6BQetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Hugh Dickins,
	Paul Menage, Andrew Morton, Balbir Singh

On Wed, 30 Jul 2008 10:17:19 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
> On Wed, 30 Jul 2008 01:16:17 +0100 (BST)
> Hugh Dickins <hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org> wrote:
> 
> > On Tue, 29 Jul 2008, KAMEZAWA Hiroyuki wrote:
> > > On Fri, 25 Jul 2008 17:46:45 +0100 (BST)
> > > Hugh Dickins <hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org> wrote:
> > > 
> > > > IIRC Rik expressed the same by pointing out that a cgroup at its
> > > > swap limit would then be forced to grow in mem (until it hits its
> > > > mem limit): so controlling the less precious resource would increase
> > > > pressure on the more precious resource.  (Actually, that probably
> > > > bears little relation to what he said - sorry, Rik!)  I don't recall
> > > > what answer he got, perhaps I'd be persuaded if I heard it again.
> > > > 
> > > Added Nishimura to CC.
> > > 
> > > IMHO, from user point of view, both of
> > >  - having 2 controls as mem controller + swap controller
> > >  - mem + swap controller
> > > doesn't have much difference. The users will use as they like.
> > 
> > I'm not suggesting either one of those alternatives.
> > 
> > I'm suggesting we have a mem controller (the thing we already have)
> > and a mem+swap controller (which we don't yet have: a controller
> > for the total mem+swap of a cgroup); the mem+swap controller likely
> > making use of much that is in the mem controller, as Paul has said.
> > 
> Ah, what mem+swap controller means is limiting mem+swap by 'a' limit ?
> It's a choice for me. From view of global LRU management, it's better.

> If we can avoid an accident that the swap is fully used by some silly program,
> anything is ok to me.
> 
This was the intention of swap controller, and I agree that
anything would be ok if it can avoid these situations.

(snip)
> > > state 4) Run a big program on B.
> > >   A...memory_usage=500M swap_usage=700M.
> > >   B...memory_usage=1G   swap_usage=300M
> > If you believe a swap controller would make that better, what limits
> > do you suggest?  If you assign A a swap limit of 700M or above, it
> > changes nothing; if you assign A a swap limit below 700M, it cannot
> > do all the work that it could do in the example.
> 
> Of course, set A's swap_limit of 300M and get swap pages into memory and
> free swap entries and make A on memory. (before B starts.)
> 
I think so too.
That's why I said before that shrinking should be supported
in the swap controller too, so that users (middleware) can
decrease the swap usage by themselves.

> > > But users think why A uses 700M of swap with 500M of free memory....
> > 
> > Because at this time A isn't actively using any of that 700M.
> 
> That's a weakness of "do all by automatic detection and ideal algorithm".
> It's just a result of the LRU algorithm, which is not always what the users
> think ideal.
> 


Thanks,
Daisuke Nishimura.
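
[Editorial sketch, not part of the original mail.] The "shrinking" discussed above, lowering a group's swap limit below its current swap usage so that the excess is brought back into RAM, can be modelled as simple arithmetic. The function name and its parameters are hypothetical, and the 300M memory figure in the usage note is an assumed value; the 700M swap usage, 300M limit and 500M of free memory are taken from the example in the thread.

```python
def shrink_swap_limit(mem_usage_mb, swap_usage_mb, new_swap_limit_mb, free_mem_mb):
    """Return (mem_usage, swap_usage) in MB after lowering the swap limit.

    Toy model: any swap usage above the new limit has to be swapped back
    in, which needs the same amount of free RAM. Raises ValueError when
    there is not enough free RAM to hold the excess.
    """
    excess = max(0, swap_usage_mb - new_swap_limit_mb)
    if excess > free_mem_mb:
        raise ValueError(
            f"need {excess} MB of free RAM, only {free_mem_mb} MB available"
        )
    return mem_usage_mb + excess, swap_usage_mb - excess
```

For a group with an assumed 300M in memory and 700M in swap, `shrink_swap_limit(300, 700, 300, 500)` moves 400M back into RAM, which fits in the 500M that is free before B starts.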

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found]                                     ` <20080730131407.526d323b.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
@ 2008-07-30  4:58                                       ` Daisuke Nishimura
       [not found]                                         ` <20080730135803.a7750e21.nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Daisuke Nishimura @ 2008-07-30  4:58 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Rik van Riel, Containers,
	nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil, Hugh Dickins,
	Paul Menage, Andrew Morton, Balbir Singh

On Wed, 30 Jul 2008 13:14:07 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
> On Wed, 30 Jul 2008 12:11:15 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
> 
> > On Wed, 30 Jul 2008 11:52:26 +0900
> > KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
> > > mem+swap controller means a shrink to memory resource controller 
> > > (try_to_free_mem_cgroup_pages()) should drop only file caches.
> > > (Because kick-out-to-swap will never changes the usage.)
> > > 
> > > right ? only global-lru can make a swap.
> > > maybe I can add optimization to do this. Hmm. I should see how OOM works
> > > under some situation.
> > > 
I'm thinking of the mem+swap controller in a different way: as an add-on to
the mem controller, just like the current swap controller.
I mean adding "memory.(mem+swap)_limit".

> > (I'm sorry that I'm not a good writer of e-mail.)
> > 
> > A brief summary about changes to mem controller.
> > 
> >  - mem+swap controller which limits the # sum of pages and swap_entries.
> >  - mem+swap controller just drops file caches when it reaches limit.
> >  - under mem+swap controller, reclaiming Anon pages makes no sense.
> >    Then,
> >       - LRU for Anon is not necessary.
> >       - LRU for tmpfs/shmem is not necessary.
> >       just showing account is better.
> >  - we should see try_to_free_mem_cgroup() again to avoid too much OOM.
> >    Maybe Retries=5 is too small because we never do swap under us.
> >    a problem like stuck-in-ext3-journal can easily make file-cache reclaim
> >    difficult.
> >  - need some changes to documentation.
> >  - Should we have on/off switch of taking swap into account ?
> >    or should we implement mem+swap controller in different name than
> >    "memory" controller ?
> >    If swap is not accounted, we need to do swap-out in memory reclaiming path,
> >    again.
> >    
> Then, the mem+swap controller finally means
>  - under the mem+swap controller, a program works with no swap. Only the global LRU
>    may swap pages out.
>  - If swap-accounting mode is off, swap can be used without limit.
> 
> Hmm, that sounds a bit different from what I want. How about others?
> 

Thanks,
Daisuke Nishimura.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found]                                         ` <20080730135803.a7750e21.nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil@public.gmane.org>
@ 2008-07-30  5:11                                           ` KAMEZAWA Hiroyuki
       [not found]                                             ` <20080730141147.837446aa.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-07-30  5:11 UTC (permalink / raw)
  To: Daisuke Nishimura
  Cc: Balbir Singh, Rik van Riel,
	Containers, Hugh Dickins, Paul Menage, Andrew Morton

On Wed, 30 Jul 2008 13:58:03 +0900
Daisuke Nishimura <nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil@public.gmane.org> wrote:

> On Wed, 30 Jul 2008 13:14:07 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
> > On Wed, 30 Jul 2008 12:11:15 +0900
> > KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
> > 
> > > On Wed, 30 Jul 2008 11:52:26 +0900
> > > KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
> > > > mem+swap controller means a shrink to memory resource controller 
> > > > (try_to_free_mem_cgroup_pages()) should drop only file caches.
> > > > (Because kick-out-to-swap will never changes the usage.)
> > > > 
> > > > right ? only global-lru can make a swap.
> > > > maybe I can add optimization to do this. Hmm. I should see how OOM works
> > > > under some situation.
> > > > 
> I'm thinking of the mem+swap controller in a different way: as an add-on to
> the mem controller, just like the current swap controller.
> I mean adding "memory.(mem+swap)_limit".
> 
Hmm ? adding a control file other than
 - memory.limit_in_bytes
?

Thanks,
-Kame

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found]                                 ` <20080730121115.b1e3a7be.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
  2008-07-30  4:14                                   ` KAMEZAWA Hiroyuki
@ 2008-07-30  5:40                                   ` KAMEZAWA Hiroyuki
  1 sibling, 0 replies; 35+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-07-30  5:40 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Rik van Riel, Linux Containers,
	nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil@public.gmane.org,
	Hugh Dickins, Paul Menage, Andrew Morton, Balbir Singh

Sorry for many mails ;(

I think I misunderstood something...

Is the following right?

A brief summary of the changes to the memory controller:
 - memory.limit_in_bytes works as it does now.
 - a new parameter, memory.limit_in_bytes_includes_swap, will be added.
   + memory.limit_in_bytes_includes_swap controls the total amount of
     RAM + SWAP,
   + memory.limit_in_bytes <= memory.limit_in_bytes_includes_swap

As a result:
 - the memory controller works as it does now but doesn't use too much swap.
 - the global LRU is not affected by the controller's parameters.


Hmm, seems reasonable. A minor problem is how to handle 2 counts/limits.

BTW, does anyone have good names?
  (example) memory.memory_limits_in_bytes.  (for accounting memory)
            memory.total_limits_in_bytes.   (for accounting memory+swap)

Thanks,
-Kame


On Wed, 30 Jul 2008 12:11:15 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
> A brief summary about changes to mem controller.
> 
>  - mem+swap controller which limits the # sum of pages and swap_entries.
>  - mem+swap controller just drops file caches when it reaches limit.
>  - under mem+swap controller, reclaiming Anon pages makes no sense.
>    Then,
>       - LRU for Anon is not necessary.
>       - LRU for tmpfs/shmem is not necessary.
>       just showing account is better.
>  - we should see try_to_free_mem_cgroup() again to avoid too much OOM.
>    Maybe Retries=5 is too small because we never do swap under us.
>    a problem like stuck-in-ext3-journal can easily make file-cache reclaim
>    difficult.
>  - need some changes to documentation.
>  - Should we have on/off switch of taking swap into account ?
>    or should we implement mem+swap controller in different name than
>    "memory" controller ?
>    If swap is not accounted, we need to do swap-out in memory reclaiming path,
>    again.
>    
> 
> Thanks,
> -Kame
> 
> 
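
[Editorial sketch, not part of the original mail.] The two-limit scheme summarised at the top of this message (a memory limit plus a second limit on RAM + swap, with the first never larger than the second) can be modelled with two counters. The class and method names are hypothetical. Returning False stands in for the reclaim or OOM handling a real controller would perform, which is not modelled here.

```python
class TwoLimitCounter:
    """Toy model of a memory limit plus a combined memory+swap limit (pages)."""

    def __init__(self, mem_limit: int, total_limit: int):
        # Invariant from the summary: memory limit <= memory+swap limit.
        if mem_limit > total_limit:
            raise ValueError("memory limit must not exceed the memory+swap limit")
        self.mem_limit = mem_limit
        self.total_limit = total_limit
        self.mem = 0
        self.swap = 0

    def charge_mem(self, pages: int) -> bool:
        """Charge new pages to RAM if both limits allow it."""
        if self.mem + pages > self.mem_limit:
            return False
        if self.mem + self.swap + pages > self.total_limit:
            return False
        self.mem += pages
        return True

    def swap_out(self, pages: int) -> int:
        """Move pages from RAM to swap; the combined total stays the same."""
        pages = min(pages, self.mem)
        self.mem -= pages
        self.swap += pages
        return pages
```

With limits of 100 and 150 pages, a group can push at most 50 pages into swap and still refill RAM to its memory limit; after that, further charges fail against the combined limit, which is the "doesn't use too much swap" property described above.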

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found]                                             ` <20080730141147.837446aa.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
@ 2008-07-30  5:41                                               ` Daisuke Nishimura
  0 siblings, 0 replies; 35+ messages in thread
From: Daisuke Nishimura @ 2008-07-30  5:41 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Rik van Riel, Containers,
	nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil, Hugh Dickins,
	Paul Menage, Andrew Morton, Balbir Singh

On Wed, 30 Jul 2008 14:11:47 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
> On Wed, 30 Jul 2008 13:58:03 +0900
> Daisuke Nishimura <nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil@public.gmane.org> wrote:
> 
> > On Wed, 30 Jul 2008 13:14:07 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
> > > On Wed, 30 Jul 2008 12:11:15 +0900
> > > KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
> > > 
> > > > On Wed, 30 Jul 2008 11:52:26 +0900
> > > > KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
> > > > > mem+swap controller means a shrink to memory resource controller 
> > > > > (try_to_free_mem_cgroup_pages()) should drop only file caches.
> > > > > (Because kick-out-to-swap will never changes the usage.)
> > > > > 
> > > > > right ? only global-lru can make a swap.
> > > > > maybe I can add optimization to do this. Hmm. I should see how OOM works
> > > > > under some situation.
> > > > > 
> > I'm thinking of the mem+swap controller in a different way: as an add-on to
> > the mem controller, just like the current swap controller.
> > I mean adding "memory.(mem+swap)_limit".
> > 
> Hmm ? adding a control file other than
>  - memory.limit_in_bytes
> ?
> 
Yes.

I just thought:
- memory.rss_limit_in_bytes (same as current limit_in_bytes)
- memory.total_limit_in_bytes (rss + swap)


Thanks,
Daisuke Nishimura.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found]     ` <Pine.LNX.4.64.0807251004570.31120-popGQ1T0qN76K7/ahGyk6A@public.gmane.org>
  2008-07-25 13:32       ` Balbir Singh
  2008-07-25 14:06       ` Paul Menage
@ 2008-08-04 19:04       ` Balbir Singh
       [not found]         ` <489752AA.9060500-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
  2 siblings, 1 reply; 35+ messages in thread
From: Balbir Singh @ 2008-08-04 19:04 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Linux Containers, Paul Menage, Andrew Morton

Hugh Dickins wrote:
[snip]
> 
> BUG: unable to handle kernel paging request at 6b6b6b8b
> IP: [<7817078f>] memrlimit_cgroup_uncharge_as+0x18/0x29
> *pde = 00000000 
> Oops: 0000 [#1] PREEMPT SMP 
> last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
> Modules linked in: acpi_cpufreq snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device thermal ac battery button
> 
> Pid: 22500, comm: swapoff Not tainted (2.6.26-rc8-mm1 #7)
> EIP: 0060:[<7817078f>] EFLAGS: 00010206 CPU: 0
> EIP is at memrlimit_cgroup_uncharge_as+0x18/0x29
> EAX: 6b6b6b6b EBX: 7963215c ECX: 7c032000 EDX: 0025e000
> ESI: 96902518 EDI: 9fbb1aa0 EBP: 7c033e9c ESP: 7c033e9c
>  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> Process swapoff (pid: 22500, ti=7c032000 task=907e2b70 task.ti=7c032000)
> Stack: 7c033edc 78161323 9fbb1aa0 0000025e ffffff77 7c033ecc 96902518 00000000 
>        ffffffff 7c033ec8 00000000 00000089 7963215c 9fbb1aa0 9fbb1b28 a272f040 
>        7c033ef4 781226b1 9fbb1aa0 9fbb1aa0 790fa884 a272f0c8 7c033f80 78165ce3 
> Call Trace:
>  [<78161323>] ? exit_mmap+0xaf/0x133
>  [<781226b1>] ? mmput+0x4c/0xba
>  [<78165ce3>] ? try_to_unuse+0x20b/0x3f5
>  [<78371534>] ? _spin_unlock+0x22/0x3c
>  [<7816636a>] ? sys_swapoff+0x17b/0x37c
>  [<78102d95>] ? sysenter_past_esp+0x6a/0xa5
>  =======================
> Code: 24 0c 00 00 8b 40 20 52 83 c0 0c 50 e8 ad a6 fd ff c9 c3 55 89 e5 8b 45 08 8b 55 0c 8b 80 30 02 00 00 c1 e2 0c 8b 80 24 0c 00 00 <8b> 40 20 52 83 c0 0c 50 e8 e6 a6 fd ff 58 5a c9 c3 55 89 e5 8b 
> EIP: [<7817078f>] memrlimit_cgroup_uncharge_as+0x18/0x29 SS:ESP 0068:7c033e9c

Hi, Hugh,

I am unable to reproduce the problem, but I do have an initial hypothesis:

CPU0					CPU1
					try_to_unuse
task 1 starts exiting			look at mm = task1->mm
..					increment mm_users
task 1 exits
mm->owner needs to be updated, but
no new owner is found
(mm_users > 1, but no other task
has task->mm = task1->mm)
mm_update_next_owner() leaves

grace period
					user count drops, call mmput(mm)
task 1 freed
					dereferencing mm->owner fails



I do have a potential solution in mind, but I want to make sure my hypothesis is
correct.



-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL
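
[Editorial sketch, not part of the original mail.] The interleaving in the hypothesis above can be replayed step by step in a small user-space model. This is not kernel code: `Task`, `MM` and the simplified `mm_update_next_owner` are stand-ins, and a `freed` flag takes the place of the task_struct actually being released. It also assumes that the exiting task drops its own mm reference on the exit path.

```python
class Task:
    def __init__(self, name, mm=None):
        self.name = name
        self.mm = mm
        self.freed = False  # stands in for the task_struct being freed


class MM:
    def __init__(self):
        self.mm_users = 0
        self.owner = None


def mm_update_next_owner(mm, exiting, all_tasks):
    """Pick another live task using mm; leave mm.owner untouched if none exists."""
    for t in all_tasks:
        if t is not exiting and t.mm is mm and not t.freed:
            mm.owner = t
            return True
    return False  # mm.owner still points at the exiting task


def replay():
    mm = MM()
    task1 = Task("task1", mm)
    mm.owner = task1
    mm.mm_users = 1
    tasks = [task1]

    mm.mm_users += 1                                # CPU1: try_to_unuse pins the mm
    found = mm_update_next_owner(mm, task1, tasks)  # CPU0: task 1 exits, no successor
    task1.mm = None
    mm.mm_users -= 1                                # CPU0: exit path drops its reference
    task1.freed = True                              # CPU0: task 1 freed after grace period
    mm.mm_users -= 1                                # CPU1: mmput() drops the last reference
    last_user = mm.mm_users == 0
    # Teardown would now dereference mm.owner, which is a freed task.
    return found, last_user, mm.owner.freed
```

Calling `replay()` yields `(False, True, True)`: no new owner was found, the swapoff side held the last reference, and `mm.owner` points at a task that has already been freed when teardown runs.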

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found]         ` <489752AA.9060500-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2008-08-04 21:52           ` Hugh Dickins
       [not found]             ` <Pine.LNX.4.64.0808042226430.4300-popGQ1T0qN76K7/ahGyk6A@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Hugh Dickins @ 2008-08-04 21:52 UTC (permalink / raw)
  To: Balbir Singh; +Cc: Linux Containers, Paul Menage, Andrew Morton

On Tue, 5 Aug 2008, Balbir Singh wrote:
> Hugh Dickins wrote:
> [snip]
> > 
> > BUG: unable to handle kernel paging request at 6b6b6b8b
> > IP: [<7817078f>] memrlimit_cgroup_uncharge_as+0x18/0x29
> > Pid: 22500, comm: swapoff Not tainted (2.6.26-rc8-mm1 #7)
> >  [<78161323>] ? exit_mmap+0xaf/0x133
> >  [<781226b1>] ? mmput+0x4c/0xba
> >  [<78165ce3>] ? try_to_unuse+0x20b/0x3f5
> >  [<78371534>] ? _spin_unlock+0x22/0x3c
> >  [<7816636a>] ? sys_swapoff+0x17b/0x37c
> >  [<78102d95>] ? sysenter_past_esp+0x6a/0xa5
> 
> I am unable to reproduce the problem,

Me neither, I've spent many hours trying 2.6.27-rc1-mm1 and then
back to 2.6.26-rc8-mm1.  But I've been SO stupid: saw it originally
on one machine with SLAB_DEBUG=y, have been trying since mostly on
another with SLUB_DEBUG=y, but never thought to boot with
slub_debug=P,task_struct until now.

> but I do have an initial hypothesis
> 
> CPU0					CPU1
> 					try_to_unuse
> task 1 starts exiting			look at mm = task1->mm
> ..					increment mm_users
> task 1 exits
> mm->owner needs to be updated, but
> no new owner is found
> (mm_users > 1, but no other task
> has task->mm = task1->mm)
> mm_update_next_owner() leaves
> 
> grace period
> 					user count drops, call mmput(mm)
> task 1 freed
> 					dereferencing mm->owner fails

Yes, that looks right to me: seems obvious now.  I don't think your
careful alternation of CPU0/1 events at the end matters: the swapoff
CPU simply dereferences mm->owner after that task has gone.

(That's a shame, I'd always hoped that mm->owner->comm was going to
be good for use in mm messages, even when tearing down the mm.)

> I do have a potential solution in mind, but I want to make sure my
> hypothesis is correct.

It seems wrong that memrlimit_cgroup_uncharge_as should be called
after mm->owner may have been changed, even if it's to something safe.
But I forget the mm/task exit details, surely they're tricky.

By the way, is the ordering in mm_update_next_owner the best?
Would there be less movement if it searched amongst siblings before
it searched amongst children?  Ought it to make a first pass trying
to stay within the same cgroup?

Hugh
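
[Editorial sketch, not part of the original mail.] The two questions above, whether to search siblings before children and whether to make a first pass restricted to the same cgroup, describe a candidate ordering that can be written down as follows. This is only one reading of the suggestion. `Task`, `pick_next_owner` and the three candidate lists are hypothetical names, not the kernel's data structures.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Task:
    name: str
    mm: object   # the address space this task uses
    cgroup: str  # name of the cgroup the task belongs to


def pick_next_owner(exiting, siblings, children, others):
    """Choose a new mm owner, preferring tasks in the exiting task's cgroup.

    Pass 1 considers only tasks in the same cgroup; pass 2 accepts any
    cgroup. Within each pass, siblings are searched before children,
    then all remaining tasks.
    """
    groups = (siblings, children, others)
    for same_cgroup_only in (True, False):
        for group in groups:
            for t in group:
                if t.mm is not exiting.mm:
                    continue
                if same_cgroup_only and t.cgroup != exiting.cgroup:
                    continue
                return t
    return None  # nobody else uses this mm
```

The first pass keeps the charge in the same cgroup whenever possible, so ownership only moves to another cgroup when no task in the original one still uses the mm.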

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found]             ` <Pine.LNX.4.64.0808042226430.4300-popGQ1T0qN76K7/ahGyk6A@public.gmane.org>
@ 2008-08-05  4:53               ` Balbir Singh
  2008-08-10 17:04               ` Balbir Singh
  1 sibling, 0 replies; 35+ messages in thread
From: Balbir Singh @ 2008-08-05  4:53 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Linux Containers, Paul Menage, Andrew Morton

Hugh Dickins wrote:
> On Tue, 5 Aug 2008, Balbir Singh wrote:
>> Hugh Dickins wrote:
>> [snip]
>>> BUG: unable to handle kernel paging request at 6b6b6b8b
>>> IP: [<7817078f>] memrlimit_cgroup_uncharge_as+0x18/0x29
>>> Pid: 22500, comm: swapoff Not tainted (2.6.26-rc8-mm1 #7)
>>>  [<78161323>] ? exit_mmap+0xaf/0x133
>>>  [<781226b1>] ? mmput+0x4c/0xba
>>>  [<78165ce3>] ? try_to_unuse+0x20b/0x3f5
>>>  [<78371534>] ? _spin_unlock+0x22/0x3c
>>>  [<7816636a>] ? sys_swapoff+0x17b/0x37c
>>>  [<78102d95>] ? sysenter_past_esp+0x6a/0xa5
>> I am unable to reproduce the problem,
> 
> Me neither, I've spent many hours trying 2.6.27-rc1-mm1 and then
> back to 2.6.26-rc8-mm1.  But I've been SO stupid: saw it originally
> on one machine with SLAB_DEBUG=y, have been trying since mostly on
> another with SLUB_DEBUG=y, but never thought to boot with
> slub_debug=P,task_struct until now.
> 

Unfortunately, I've not tried on 32 bit and not at all with SLAB_DEBUG=y. I'll
give the latter a trial run and see what I get.

>> but I do have an initial hypothesis
>>
>> CPU0					CPU1
>> 					try_to_unuse
>> task 1 starts exiting		look at mm = task1->mm
>> ..					increment mm_users
>> task 1 exits
>> mm->owner needs to be updated, but
>> no new owner is found
>> (mm_users > 1, but no other task
>> has task->mm = task1->mm)
>> mm_update_next_owner() leaves
>>
>> grace period
>> 					user count drops, call mmput(mm)
>> task 1 freed
>> 					dereferencing mm->owner fails
> 
> Yes, that looks right to me: seems obvious now.  I don't think your
> careful alternation of CPU0/1 events at the end matters: the swapoff
> CPU simply dereferences mm->owner after that task has gone.
> 
> (That's a shame, I'd always hoped that mm->owner->comm was going to
> be good for use in mm messages, even when tearing down the mm.)
> 

The problem we have is that tasks are independent of mm_struct's (in some ways)
and are associated almost like a database associates two entities through keys.

>> I do have a potential solution in mind, but I want to make sure my
>> hypothesis is correct.
> 
> It seems wrong that memrlimit_cgroup_uncharge_as should be called
> after mm->owner may have been changed, even if it's to something safe.
> But I forget the mm/task exit details, surely they're tricky.
> 

The fix would be to uncharge when a new owner can no longer be found (I am yet
to code/test it though).

> By the way, is the ordering in mm_update_next_owner the best?
> Would there be less movement if it searched amongst siblings before
> it searched amongst children?  Ought it to make a first pass trying
> to stay within the same cgroup?

Yes, we need to make a first pass at keeping it in the same cgroup. You might be
right about the sibling optimization.

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL
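
[Editorial sketch, not part of the original mail.] One possible reading of the fix described above ("uncharge when a new owner can no longer be found") is sketched below as a user-space model. The mail says the fix had not yet been coded or tested, so this is an assumption about its shape and not the actual patch. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)
class Cgroup:
    as_charged: int = 0  # address-space pages charged to this group


@dataclass(eq=False)
class MM:
    total_vm: int
    owner: Optional["Task"] = None


@dataclass(eq=False)
class Task:
    cgroup: Cgroup
    mm: Optional[MM] = None


def exit_update_owner(mm: MM, exiting: Task, tasks: List[Task]) -> None:
    """Called while the exiting task is still valid."""
    for t in tasks:
        if t is not exiting and t.mm is mm:
            mm.owner = t  # hand ownership to another user of the mm
            return
    # No successor: uncharge now, while the owner can still be
    # dereferenced, and clear the pointer so teardown does not use it.
    if mm.owner is not None:
        mm.owner.cgroup.as_charged -= mm.total_vm
    mm.owner = None


def uncharge_at_teardown(mm: MM) -> None:
    """Called from the final mmput(); skips mms already uncharged at exit."""
    if mm.owner is None:
        return
    mm.owner.cgroup.as_charged -= mm.total_vm
```

In this model the late teardown path never touches a stale owner, because the only case where the owner could go away without a successor has already been uncharged and cleared.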

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: memrlimit controller merge to mainline
       [not found]             ` <Pine.LNX.4.64.0808042226430.4300-popGQ1T0qN76K7/ahGyk6A@public.gmane.org>
  2008-08-05  4:53               ` Balbir Singh
@ 2008-08-10 17:04               ` Balbir Singh
  1 sibling, 0 replies; 35+ messages in thread
From: Balbir Singh @ 2008-08-10 17:04 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Linux Containers, Paul Menage, Andrew Morton

Hugh Dickins wrote:
>> but I do have an initial hypothesis
>>
>> CPU0					CPU1
>> 					try_to_unuse
>> task 1 starts exiting		look at mm = task1->mm
>> ..					increment mm_users
>> task 1 exits
>> mm->owner needs to be updated, but
>> no new owner is found
>> (mm_users > 1, but no other task
>> has task->mm = task1->mm)
>> mm_update_next_owner() leaves
>>
>> grace period
>> 					user count drops, call mmput(mm)
>> task 1 freed
>> 					dereferencing mm->owner fails
> 
> Yes, that looks right to me: seems obvious now.  I don't think your
> careful alternation of CPU0/1 events at the end matters: the swapoff
> CPU simply dereferences mm->owner after that task has gone.
> 
> (That's a shame, I'd always hoped that mm->owner->comm was going to
> be good for use in mm messages, even when tearing down the mm.)
> 

Hi, Hugh,

I do have fixes for the problem above, but I've run into something strange.
When I create a new cgroup, set 500M as its limit, and run kernbench
under it, I see the following:

1. memrlimit determines that the limit is exceeded and fails the fork of the new process
2. The process that failed to fork encounters a page fault and faults in find_vma

I tried chasing the problem, but I am lost wondering how a page fault
(do_page_fault) can occur in a process that has not yet been created and is
going to fail with -ENOMEM. The interesting thing is that the oops occurs in
find_vma.

My trace so far
----------------

limit exceeded
Pid: 3695, comm: sh Not tainted 2.6.27-rc1-mm1 #12

Call Trace:
 [<ffffffff802b0473>] memrlimit_cgroup_charge_as+0x3a/0x3c
 [<ffffffff8023a82f>] dup_mm+0xea/0x410
 [<ffffffff8023b648>] copy_process+0xabe/0x12ef
 [<ffffffff8023c0df>] do_fork+0x114/0x2d2
 [<ffffffff8025b42c>] ? trace_hardirqs_on_caller+0xf9/0x124
 [<ffffffff8025b464>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff805bda1f>] ? _spin_unlock_irq+0x2b/0x30
 [<ffffffff805bd24e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8020bf4b>] ? system_call_fastpath+0x16/0x1b
 [<ffffffff8020a44a>] sys_clone+0x23/0x25
 [<ffffffff8020c2c7>] ptregscall_common+0x67/0xb0

putting mm ffff88003d931400 3695 sh
copy_mm, retval -12
copy_process returning -12
copy_process returned fffffffffffffff4 -12
fork failed -12
general protection fault: 0000 [1] copy_process returned ffff880037a11600 -13194
0462029312
SMP
last sysfs file: /sys/block/sda/size
CPU 2
Modules linked in: coretemp hwmon kvm_intel kvm rtc_cmos rtc_core rtc_lib mptsas
 mptscsih mptbase scsi_transport_sas uhci_hcd ohci_hcd ehci_hcd
Pid: 3695, comm: sh Not tainted 2.6.27-rc1-mm1 #12
RIP: 0010:[<ffffffff802954f8>]  [<ffffffff802954f8>] find_vma+0x2f/0x62
RSP: 0000:ffff88003544bee8  EFLAGS: 00010202
RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000000 RCX: ffff8800399e34d8
RDX: ffff8800399e34d8 RSI: 0000003a2729ad22 RDI: ffff88003e5c8500
RBP: ffff88003544bee8 R08: 0000000000000000 R09: 0000000000000000
R10: ffff88003e5c8568 R11: 0000000000000246 R12: 0000003a2729ad22
R13: 0000000000000014 R14: ffff88003544bf58 R15: ffff88003e8bac00
FS:  00002b3b978f3f50(0000) GS:ffff8800bfd954b0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003a2729ad22 CR3: 000000003549f000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process sh (pid: 3695, threadinfo ffff88003544a000, task ffff88003e8bac00)
Stack:  ffff88003544bf48 ffffffff805bfec0 00000000ffffffff 00000000008cae50
 ffff88003e5c8560 ffff88003e5c8500 0003000100000000 0000000000000000
 00007fff131e72c0 00000000ffffffff 00000000008cae50 0000000000000000
Call Trace:
 [<ffffffff805bfec0>] do_page_fault+0x36f/0x7ad
 [<ffffffff805bdd4d>] error_exit+0x0/0xa9


Code: 85 ff 48 89 e5 74 55 eb 05 48 89 ca eb 47 48 8b 47 10 48 85 c0 74 0c 48 39
 70 10 76 06 48 39 70 08 76 39 48 8b 47 08 31 d2 eb 1d <48> 39 70 e0 48 8d 48 d0
 76 0f 48 39 70 d8 76 ce 48 8b 40 10 48
RIP  [<ffffffff802954f8>] find_vma+0x2f/0x62
 RSP <ffff88003544bee8>

---[ end trace 89156336afdfaec3 ]---

I hope that I'll be able to think more clearly on Monday, but it's hard to say :)

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL
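
[Editorial sketch, not part of the original mail.] Two numbers in the traces in this thread can be cross-checked with a little arithmetic. The value `fffffffffffffff4` printed for copy_process is -12 as a signed 64-bit integer, matching the -ENOMEM mentioned above. The register values 6b6b6b6b... are the 0x6b byte that slab debugging writes over freed objects (`POISON_FREE` in the kernel sources), which points at a use-after-free. The helper names below are illustrative only.

```python
POISON_FREE = 0x6B  # byte pattern slab debugging writes into freed objects


def as_signed64(raw: int) -> int:
    """Interpret a 64-bit register value as a signed integer."""
    raw &= (1 << 64) - 1
    return raw - (1 << 64) if raw >= (1 << 63) else raw


def is_free_poison(value: int, width_bytes: int) -> bool:
    """True if every byte of the value is the freed-object poison byte."""
    return value == int.from_bytes(bytes([POISON_FREE]) * width_bytes, "big")


# Return value logged for the failed fork.
errno_value = as_signed64(0xFFFFFFFFFFFFFFF4)

# RAX in the find_vma oops, and EAX in the earlier 32-bit swapoff oops.
rax_poisoned = is_free_poison(0x6B6B6B6B6B6B6B6B, 8)
eax_poisoned = is_free_poison(0x6B6B6B6B, 4)

# In the 32-bit oops the faulting instruction reads from EAX + 0x20.
fault_address = 0x6B6B6B6B + 0x20
```

The last line reproduces the faulting address 6b6b6b8b reported in the earlier swapoff oops, which is consistent with a field being read at offset 0x20 from a pointer loaded out of freed memory.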

^ permalink raw reply	[flat|nested] 35+ messages in thread

