From: Balbir Singh <balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
To: Hugh Dickins <hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org>
Cc: Linux Containers
<containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>,
Paul Menage <menage-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Andrew Morton
<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Subject: Re: memrlimit controller merge to mainline
Date: Sun, 10 Aug 2008 22:34:54 +0530
Message-ID: <489F1FB6.9070503@linux.vnet.ibm.com>
In-Reply-To: <Pine.LNX.4.64.0808042226430.4300-popGQ1T0qN76K7/ahGyk6A@public.gmane.org>
Hugh Dickins wrote:
>> but I do have an initial hypothesis
>>
>> CPU0                                  CPU1
>>                                       try_to_unuse
>> task 1 starts exiting                 look at mm = task1->mm
>> ..                                    increment mm_users
>> task 1 exits
>> mm->owner needs to be updated, but
>> no new owner is found
>> (mm_users > 1, but no other task
>> has task->mm = task1->mm)
>> mm_update_next_owner() leaves
>>
>> grace period
>>                                       user count drops, call mmput(mm)
>> task 1 freed
>>                                       dereferencing mm->owner fails
>
> Yes, that looks right to me: seems obvious now. I don't think your
> careful alternation of CPU0/1 events at the end matters: the swapoff
> CPU simply dereferences mm->owner after that task has gone.
>
> (That's a shame, I'd always hoped that mm->owner->comm was going to
> be good for use in mm messages, even when tearing down the mm.)
>
Hi, Hugh,
I do have fixes for the problem above, but I've run into something strange.
When I create a new cgroup, set 500M as its limit and run kernbench under it,
I see the following:

1. memrlimit determines that the limit is exceeded and fails the fork of the
   new process
2. The process that failed to fork encounters a page fault and faults in
   find_vma

I tried chasing the problem, but I am lost wondering how a page fault
(do_page_fault) can occur in a process that has not yet been created and is
going to fail with -ENOMEM. The interesting thing is that the OOPS occurs in
find_vma.
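
As a reading aid for the trace that follows: copy_process() reports failure by
returning an error-encoded pointer (ERR_PTR() in the kernel), which is why the
"copy_process returned" lines show a pointer next to a signed number. The
Python sketch below is only an illustration, not kernel code. The helper names
to_signed, is_err and describe are made up for this example, and MAX_ERRNO
mirrors the bound that the kernel's IS_ERR() uses. It decodes the two
"copy_process returned" pointers that appear in the trace.

```python
import errno

MAX_ERRNO = 4095            # same bound the kernel's IS_ERR() uses
WORD_MASK = (1 << 64) - 1   # x86_64 pointers are 64 bits wide


def to_signed(value):
    """Interpret a 64-bit value as a two's-complement signed long."""
    value &= WORD_MASK
    return value - (1 << 64) if value >> 63 else value


def is_err(ptr):
    """True if ptr lies in the top MAX_ERRNO values, i.e. encodes -errno."""
    return (ptr & WORD_MASK) >= ((-MAX_ERRNO) & WORD_MASK)


def describe(ptr):
    """Name the errno an error pointer carries, or report a valid pointer."""
    if is_err(ptr):
        return errno.errorcode.get(-to_signed(ptr), "unknown errno")
    return "valid pointer"


failed = 0xfffffffffffffff4   # first "copy_process returned" value in the trace
later = 0xffff880037a11600    # second one, printed next to the GPF header

print(to_signed(failed), describe(failed))   # -12 ENOMEM
print(to_signed(later), describe(later))     # -131940462029312 valid pointer
```

Read this way, the first value is the failed fork (-ENOMEM), while the second
looks like an ordinary kernel pointer from a different, successful
copy_process() call whose output got interleaved with the oops header.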
My trace so far
----------------
limit exceeded
Pid: 3695, comm: sh Not tainted 2.6.27-rc1-mm1 #12
Call Trace:
[<ffffffff802b0473>] memrlimit_cgroup_charge_as+0x3a/0x3c
[<ffffffff8023a82f>] dup_mm+0xea/0x410
[<ffffffff8023b648>] copy_process+0xabe/0x12ef
[<ffffffff8023c0df>] do_fork+0x114/0x2d2
[<ffffffff8025b42c>] ? trace_hardirqs_on_caller+0xf9/0x124
[<ffffffff8025b464>] ? trace_hardirqs_on+0xd/0xf
[<ffffffff805bda1f>] ? _spin_unlock_irq+0x2b/0x30
[<ffffffff805bd24e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff8020bf4b>] ? system_call_fastpath+0x16/0x1b
[<ffffffff8020a44a>] sys_clone+0x23/0x25
[<ffffffff8020c2c7>] ptregscall_common+0x67/0xb0
putting mm ffff88003d931400 3695 sh
copy_mm, retval -12
copy_process returning -12
copy_process returned fffffffffffffff4 -12
fork failed -12
general protection fault: 0000 [1] copy_process returned ffff880037a11600 -131940462029312
SMP
last sysfs file: /sys/block/sda/size
CPU 2
Modules linked in: coretemp hwmon kvm_intel kvm rtc_cmos rtc_core rtc_lib mptsas
mptscsih mptbase scsi_transport_sas uhci_hcd ohci_hcd ehci_hcd
Pid: 3695, comm: sh Not tainted 2.6.27-rc1-mm1 #12
RIP: 0010:[<ffffffff802954f8>] [<ffffffff802954f8>] find_vma+0x2f/0x62
RSP: 0000:ffff88003544bee8 EFLAGS: 00010202
RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000000 RCX: ffff8800399e34d8
RDX: ffff8800399e34d8 RSI: 0000003a2729ad22 RDI: ffff88003e5c8500
RBP: ffff88003544bee8 R08: 0000000000000000 R09: 0000000000000000
R10: ffff88003e5c8568 R11: 0000000000000246 R12: 0000003a2729ad22
R13: 0000000000000014 R14: ffff88003544bf58 R15: ffff88003e8bac00
FS: 00002b3b978f3f50(0000) GS:ffff8800bfd954b0(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003a2729ad22 CR3: 000000003549f000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process sh (pid: 3695, threadinfo ffff88003544a000, task ffff88003e8bac00)
Stack: ffff88003544bf48 ffffffff805bfec0 00000000ffffffff 00000000008cae50
ffff88003e5c8560 ffff88003e5c8500 0003000100000000 0000000000000000
00007fff131e72c0 00000000ffffffff 00000000008cae50 0000000000000000
Call Trace:
[<ffffffff805bfec0>] do_page_fault+0x36f/0x7ad
[<ffffffff805bdd4d>] error_exit+0x0/0xa9
Code: 85 ff 48 89 e5 74 55 eb 05 48 89 ca eb 47 48 8b 47 10 48 85 c0 74 0c 48 39
70 10 76 06 48 39 70 08 76 39 48 8b 47 08 31 d2 eb 1d <48> 39 70 e0 48 8d 48 d0
76 0f 48 39 70 d8 76 ce 48 8b 40 10 48
RIP [<ffffffff802954f8>] find_vma+0x2f/0x62
RSP <ffff88003544bee8>
---[ end trace 89156336afdfaec3 ]---
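
One detail in the register dump may be worth checking: RAX holds
6b6b6b6b6b6b6b6b, and 0x6b is the POISON_FREE byte that slab debugging writes
into freed objects (include/linux/poison.h). The Python sketch below is a
hypothetical helper, not an existing tool; parse_registers, poison_name and
is_canonical are names invented for this example, and the canonical check
assumes 48-bit x86_64 virtual addresses. It scans part of the dump above for
registers filled with a slab poison byte.

```python
import re

# Poison bytes as defined in include/linux/poison.h
SLAB_POISON = {0x6b: "POISON_FREE", 0x5a: "POISON_INUSE"}

# Matches general-purpose registers such as "RAX: 6b6b6b6b6b6b6b6b"
REGISTER_RE = re.compile(r"\b(R[A-Z0-9]{2}): ([0-9a-f]{16})\b")


def parse_registers(dump):
    """Return {register name: value} for every register found in dump."""
    return {name: int(value, 16) for name, value in REGISTER_RE.findall(dump)}


def poison_name(value):
    """Name the slab poison pattern if all eight bytes are one poison byte."""
    data = value.to_bytes(8, "little")
    if len(set(data)) == 1:
        return SLAB_POISON.get(data[0])
    return None


def is_canonical(value):
    """True if bits 63..47 are all equal (assumes 48-bit virtual addresses)."""
    top = value >> 47
    return top == 0 or top == (1 << 17) - 1


DUMP = """\
RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000000 RCX: ffff8800399e34d8
RDX: ffff8800399e34d8 RSI: 0000003a2729ad22 RDI: ffff88003e5c8500
"""

for name, value in parse_registers(DUMP).items():
    poison = poison_name(value)
    if poison:
        print(f"{name} = {value:#018x}: {poison}, canonical={is_canonical(value)}")
```

If that reading is right, find_vma was walking memory that had already been
freed, and the non-canonical poison value would explain a general protection
fault being raised from inside do_page_fault rather than a second page fault.
That is a guess from the dump, not something I have confirmed.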
I hope that I'll be able to think more clearly on Monday, but it's hard to say :)
--
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL