* Re: [Bug 80881] New: Memory cgroup OOM leads to BUG: unable to handle kernel paging request at ffffffffffffffd8
From: Andrew Morton @ 2014-07-22 20:07 UTC
To: Johannes Weiner, Michal Hocko; +Cc: bugzilla-daemon, linux-mm, Paul Furtado
(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).
oops in mem_cgroup_oom_synchronize() after an oom.
On Tue, 22 Jul 2014 06:45:25 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=80881
>
> Bug ID: 80881
> Summary: Memory cgroup OOM leads to BUG: unable to handle
> kernel paging request at ffffffffffffffd8
> Product: Memory Management
> Version: 2.5
> Kernel Version: 3.16.0-rc5
> Hardware: x86-64
> OS: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Other
> Assignee: akpm@linux-foundation.org
> Reporter: paulfurtado91@gmail.com
> Regression: No
>
> Created attachment 143841
> --> https://bugzilla.kernel.org/attachment.cgi?id=143841&action=edit
> 3.16.0-rc5_console_output
>
> I was testing the stability of the memory cgroup OOM handler on kernel
> 3.16.0-rc5 by running hundreds of tasks in Apache Mesos which were using memory
> cgroups to limit their memory usage and were guaranteed to run out of memory
> (running a process which intentionally attempted to allocate more than the
> limit). After testing for a few days on several servers, I hit:
>
> [162006.001086] kernel tried to execute NX-protected page - exploit attempt?
> (uid: 0)
> [162006.001100] BUG: unable to handle kernel paging request at ffff8801d2ec7e90
>
> Note that this was running on a paravirtualized xen instance in EC2 running
> CentOS 6.5 and the kernel was version 3.16.0-rc5 compiled directly from the
> source archive on kernel.org. We're testing on many kernel versions and this is
> one of many failures, but the only one I've reproduced on 3.16.0-rc5 thus far.
> I also have at least one reproduction of this exact same error on kernel
> 3.12.24.
>
>
> The full log is attached, but here is the part I believe is relevant from the
> 3.16.0-rc5 error:
> [162005.262545] memory: usage 131072kB, limit 131072kB, failcnt 1314
> [162005.262550] memory+swap: usage 0kB, limit 18014398509481983kB, failcnt 0
> [162005.262554] kmem: usage 0kB, limit 18014398509481983kB, failcnt 0
> [162005.262558] Memory cgroup stats for
> /mesos/c206ce2a-9f11-4340-a3c9-c59b405690a7: cache:8KB rss:131064KB
> rss_huge:0KB mapped_file:0KB writeback:0KB inactive_anon:0KB
> active_anon:131064KB inactive_file:0KB active_file:0KB unevictable:0KB
> [162005.262581] [ pid ] uid tgid total_vm rss nr_ptes swapents
> oom_score_adj name
> [162005.262602] [ 3002] 0 3002 544153 22244 151 0
> 0 java7
> [162005.262609] [ 3061] 0 3061 424397 20423 88 0
> 0 java
> [162005.262615] Memory cgroup out of memory: Kill process 3002 (java7) score
> 662 or sacrifice child
> [162005.262623] Killed process 3002 (java7) total-vm:2176612kB,
> anon-rss:60400kB, file-rss:28576kB
> [162005.263453] general protection fault: 0000 [#1] SMP
> [162005.263463] Modules linked in: ipv6 dm_mod xen_netfront coretemp hwmon
> x86_pkg_temp_thermal crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel
> ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 microcode pcspkr ext4
> jbd2 mbcache raid0 xen_blkfront
> [162005.264060] CPU: 3 PID: 3062 Comm: java Not tainted 3.16.0-rc5 #1
> [162005.264060] task: ffff8801cfe8f170 ti: ffff8801d2ec4000 task.ti:
> ffff8801d2ec4000
> [162005.264060] RIP: e030:[<ffffffff811c0b80>] [<ffffffff811c0b80>]
> mem_cgroup_oom_synchronize+0x140/0x240
> [162005.264060] RSP: e02b:ffff8801d2ec7d48 EFLAGS: 00010283
> [162005.264060] RAX: 0000000000000001 RBX: ffff88009d633800 RCX:
> 000000000000000e
> [162005.264060] RDX: fffffffffffffffe RSI: ffff88009d630200 RDI:
> ffff88009d630200
> [162005.264060] RBP: ffff8801d2ec7da8 R08: 0000000000000012 R09:
> 00000000fffffffe
> [162005.264060] R10: 0000000000000000 R11: 0000000000000000 R12:
> ffff88009d633800
> [162005.264060] R13: ffff8801d2ec7d48 R14: dead000000100100 R15:
> ffff88009d633a30
> [162005.264060] FS: 00007f1748bb4700(0000) GS:ffff8801def80000(0000)
> knlGS:0000000000000000
> [162005.264060] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [162005.264060] CR2: 00007f4110300308 CR3: 00000000c05f7000 CR4:
> 0000000000002660
> [162005.264060] Stack:
> [162005.264060] ffff88009d633800 0000000000000000 ffff8801cfe8f170
> ffffffff811bae10
> [162005.264060] ffffffff81ca73f8 ffffffff81ca73f8 ffff8801d2ec7dc8
> 0000000000000006
> [162005.264060] 00000000e3b30000 00000000e3b30000 ffff8801d2ec7f58
> 0000000000000001
> [162005.264060] Call Trace:
> [162005.264060] [<ffffffff811bae10>] ? mem_cgroup_wait_acct_move+0x110/0x110
> [162005.264060] [<ffffffff81159628>] pagefault_out_of_memory+0x18/0x90
> [162005.264060] [<ffffffff8105cee9>] mm_fault_error+0xa9/0x1a0
> [162005.264060] [<ffffffff8105d488>] __do_page_fault+0x478/0x4c0
> [162005.264060] [<ffffffff81004f00>] ? xen_mc_flush+0xb0/0x1b0
> [162005.264060] [<ffffffff81003ab3>] ? xen_write_msr_safe+0xa3/0xd0
> [162005.264060] [<ffffffff81012a40>] ? __switch_to+0x2d0/0x600
> [162005.264060] [<ffffffff8109e273>] ? finish_task_switch+0x53/0xf0
> [162005.264060] [<ffffffff81643b0a>] ? __schedule+0x37a/0x6d0
> [162005.264060] [<ffffffff8105d5dc>] do_page_fault+0x2c/0x40
> [162005.264060] [<ffffffff81649858>] page_fault+0x28/0x30
> [162005.264060] Code: 44 00 00 48 89 df e8 40 ca ff ff 48 85 c0 49 89 c4 74 35
> 4c 8b b0 30 02 00 00 4c 8d b8 30 02 00 00 4d 39 fe 74 1b 0f 1f 44 00 00 <49> 8b
> 7e 10 be 01 00 00 00 e8 42 d2 04 00 4d 8b 36 4d 39 fe 75
> [162005.264060] RIP [<ffffffff811c0b80>]
> mem_cgroup_oom_synchronize+0x140/0x240
> [162005.264060] RSP <ffff8801d2ec7d48>
> [162005.458051] ---[ end trace 050b00c5503ce96a ]---
> [162006.001086] kernel tried to execute NX-protected page - exploit attempt?
> (uid: 0)
> [162006.001100] BUG: unable to handle kernel paging request at ffff8801d2ec7e90
> [162006.001108] IP: [<ffff8801d2ec7e90>] 0xffff8801d2ec7e90
> [162006.001115] PGD 1c12067 PUD 2133067 PMD 1dfd4c067 PTE 80100001d2ec7067
> [162006.001123] Oops: 0011 [#2] SMP
> [162006.001128] Modules linked in: ipv6 dm_mod xen_netfront coretemp hwmon
> x86_pkg_temp_thermal crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel
> ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 microcode pcspkr ext4
> jbd2 mbcache raid0 xen_blkfront
> [162006.001161] CPU: 3 PID: 30835 Comm: kworker/3:2 Tainted: G D
> 3.16.0-rc5 #1
> [162006.001172] Workqueue: cgroup_destroy css_killed_work_fn
> [162006.001178] task: ffff8800797fc090 ti: ffff8801d17b4000 task.ti:
> ffff8801d17b4000
> [162006.001184] RIP: e030:[<ffff8801d2ec7e90>] [<ffff8801d2ec7e90>]
> 0xffff8801d2ec7e90
> [162006.001192] RSP: e02b:ffff8801d17b7c90 EFLAGS: 00010082
> [162006.001197] RAX: ffff8801d2ec7d50 RBX: ffff8801d2ec7eb0 RCX:
> ffff88009d633800
> [162006.001203] RDX: 0000000000000000 RSI: 0000000000000003 RDI:
> ffff8801d2ec7d50
> [162006.001209] RBP: ffff8801d17b7cd8 R08: ffff88009d633800 R09:
> 0000000000000400
> [162006.001214] R10: dead000000200200 R11: dead000000100100 R12:
> 000000007e10d030
> [162006.001220] R13: ffffffff81ca73f8 R14: ffff88009d633800 R15:
> 0000000000000000
> [162006.001230] FS: 00007f9cf413b700(0000) GS:ffff8801def80000(0000)
> knlGS:0000000000000000
> [162006.001236] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [162006.001241] CR2: ffff8801d2ec7e90 CR3: 000000005ab6b000 CR4:
> 0000000000002660
> [162006.001247] Stack:
> [162006.001251] ffffffff810b51e9 dead000000200200 0000000300000000
> ffff8801d1667b40
> [162006.001259] ffffffff81ca73f0 0000000000000201 0000000000000003
> 0000000000000000
> [162006.001266] ffff88009d633800 ffff8801d17b7d18 ffffffff810b56a8
> ffff88009d633800
> [162006.001274] Call Trace:
> [162006.001281] [<ffffffff810b51e9>] ? __wake_up_common+0x59/0x90
> [162006.001288] [<ffffffff810b56a8>] __wake_up+0x48/0x70
> [162006.001297] [<ffffffff811b92dd>] memcg_oom_recover+0x3d/0x40
> [162006.001303] [<ffffffff811bea90>] mem_cgroup_reparent_charges+0x110/0x150
> [162006.001310] [<ffffffff811bec38>] mem_cgroup_css_offline+0x138/0x250
> [162006.001316] [<ffffffff810f79f9>] css_killed_work_fn+0x49/0xd0
> [162006.001324] [<ffffffff8108c91c>] process_one_work+0x17c/0x420
> [162006.001331] [<ffffffff8108dab3>] worker_thread+0x123/0x420
> [162006.001337] [<ffffffff8108d990>] ? maybe_create_worker+0x180/0x180
> [162006.001344] [<ffffffff8109369e>] kthread+0xce/0xf0
> [162006.001352] [<ffffffff810039fe>] ? xen_end_context_switch+0x1e/0x30
> [162006.001358] [<ffffffff810935d0>] ? kthread_freezable_should_stop+0x70/0x70
> [162006.001368] [<ffffffff816477fc>] ret_from_fork+0x7c/0xb0
> [162006.001374] [<ffffffff810935d0>] ? kthread_freezable_should_stop+0x70/0x70
> [162006.001379] Code: ff ff ff c8 60 c7 00 00 c9 ff ff c0 60 c7 00 00 c9 ff ff
> 00 d0 4f 38 45 7f 00 00 c0 e7 ba a9 00 88 ff ff c0 07 00 00 00 00 00 00 <00> 2a
> c7 00 00 c9 ff ff 60 4e 0a 81 ff ff ff ff ec d7 4f 38 45
> [162006.001426] RIP [<ffff8801d2ec7e90>] 0xffff8801d2ec7e90
> [162006.001433] RSP <ffff8801d17b7c90>
> [162006.001437] CR2: ffff8801d2ec7e90
> [162006.001441] ---[ end trace 050b00c5503ce96b ]---
> [162006.001505] BUG: unable to handle kernel paging request at ffffffffffffffd8
> [162006.001514] IP: [<ffffffff81092f80>] kthread_data+0x10/0x20
> [162006.001521] PGD 1c14067 PUD 1c16067 PMD 0
> [162006.001528] Oops: 0000 [#3] SMP
> [162006.001532] Modules linked in: ipv6 dm_mod xen_netfront coretemp hwmon
> x86_pkg_temp_thermal crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel
> ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 microcode pcspkr ext4
> jbd2 mbcache raid0 xen_blkfront
> [162006.001562] CPU: 3 PID: 30835 Comm: kworker/3:2 Tainted: G D
> 3.16.0-rc5 #1
> [162006.001581] task: ffff8800797fc090 ti: ffff8801d17b4000 task.ti:
> ffff8801d17b4000
> [162006.001587] RIP: e030:[<ffffffff81092f80>] [<ffffffff81092f80>]
> kthread_data+0x10/0x20
> [162006.001595] RSP: e02b:ffff8801d17b78d8 EFLAGS: 00010096
> [162006.001600] RAX: 0000000000000000 RBX: 0000000000000003 RCX:
> ffffffff81fc5160
> [162006.001605] RDX: ffff8800797fc090 RSI: 0000000000000003 RDI:
> ffff8800797fc090
> [162006.001611] RBP: ffff8801d17b78d8 R08: 0000000000000000 R09:
> dead000000200200
> [162006.001617] R10: 0000000000000000 R11: 0000000000000007 R12:
> 0000000000000003
> [162006.001623] R13: ffff8800797fc998 R14: 0000000000000001 R15:
> 0000000000000000
> [162006.001631] FS: 00007f9cf413b700(0000) GS:ffff8801def80000(0000)
> knlGS:0000000000000000
> [162006.001637] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [162006.001642] CR2: 0000000000000028 CR3: 000000005ab6b000 CR4:
> 0000000000002660
> [162006.001647] Stack:
> [162006.001650] ffff8801d17b78f8 ffffffff8108a2f5 ffff8801d17b78f8
> ffff8801def94380
> [162006.001658] ffff8801d17b7968 ffffffff81643ce2 ffff8800797fc090
> ffff8801d17b4010
> [162006.001665] 0000000000014380 0000000000014380 ffff8800797fc090
> ffffffff812b1232
> [162006.001673] Call Trace:
> [162006.001679] [<ffffffff8108a2f5>] wq_worker_sleeping+0x15/0xa0
> [162006.001685] [<ffffffff81643ce2>] __schedule+0x552/0x6d0
> [162006.001692] [<ffffffff812b1232>] ? put_io_context_active+0xd2/0x100
> [162006.001698] [<ffffffff81643ff9>] schedule+0x29/0x70
> [162006.001705] [<ffffffff81073ecd>] do_exit+0x2bd/0x470
> [162006.001711] [<ffffffff810174c9>] oops_end+0xa9/0xf0
> [162006.001718] [<ffffffff8105ca5e>] no_context+0x12e/0x200
> [162006.001724] [<ffffffff81006e4f>] ? pte_mfn_to_pfn+0x7f/0x110
> [162006.002056] [<ffffffff8105cc5d>] __bad_area_nosemaphore+0x12d/0x230
> [162006.002056] [<ffffffff81005449>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
> [162006.002056] [<ffffffff8105cd73>] bad_area_nosemaphore+0x13/0x20
> [162006.002056] [<ffffffff8105d342>] __do_page_fault+0x332/0x4c0
> [162006.002056] [<ffffffff81012885>] ? __switch_to+0x115/0x600
> [162006.002056] [<ffffffff8109e273>] ? finish_task_switch+0x53/0xf0
> [162006.002056] [<ffffffff81643b0a>] ? __schedule+0x37a/0x6d0
> [162006.002056] [<ffffffff8105d5dc>] do_page_fault+0x2c/0x40
> [162006.002056] [<ffffffff81649858>] page_fault+0x28/0x30
> [162006.002056] [<ffffffff810b51e9>] ? __wake_up_common+0x59/0x90
> [162006.002056] [<ffffffff810b56a8>] __wake_up+0x48/0x70
> [162006.002056] [<ffffffff811b92dd>] memcg_oom_recover+0x3d/0x40
> [162006.002056] [<ffffffff811bea90>] mem_cgroup_reparent_charges+0x110/0x150
> [162006.002056] [<ffffffff811bec38>] mem_cgroup_css_offline+0x138/0x250
> [162006.002056] [<ffffffff810f79f9>] css_killed_work_fn+0x49/0xd0
> [162006.002056] [<ffffffff8108c91c>] process_one_work+0x17c/0x420
> [162006.002056] [<ffffffff8108dab3>] worker_thread+0x123/0x420
> [162006.002056] [<ffffffff8108d990>] ? maybe_create_worker+0x180/0x180
> [162006.002056] [<ffffffff8109369e>] kthread+0xce/0xf0
> [162006.002056] [<ffffffff810039fe>] ? xen_end_context_switch+0x1e/0x30
> [162006.002056] [<ffffffff810935d0>] ? kthread_freezable_should_stop+0x70/0x70
> [162006.002056] [<ffffffff816477fc>] ret_from_fork+0x7c/0xb0
> [162006.002056] [<ffffffff810935d0>] ? kthread_freezable_should_stop+0x70/0x70
> [162006.002056] Code: b0 08 00 00 48 8b 40 c8 c9 48 c1 e8 02 83 e0 01 c3 66 2e
> 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 48 8b 87 b0 08 00 00 <48> 8b
> 40 d8 c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f
> [162006.002056] RIP [<ffffffff81092f80>] kthread_data+0x10/0x20
> [162006.002056] RSP <ffff8801d17b78d8>
> [162006.002056] CR2: ffffffffffffffd8
> [162006.002056] ---[ end trace 050b00c5503ce96c ]---
> [162006.002056] Fixing recursive fault but reboot is needed!
>
>
>
>
> And here is the similar output which was produced on 3.12.24:
> [118601.599452] memory: usage 131072kB, limit 131072kB, failcnt 130
> [118601.599458] memory+swap: usage 0kB, limit 18014398509481983kB, failcnt 0
> [118601.599462] kmem: usage 0kB, limit 18014398509481983kB, failcnt 0
> [118601.599466] Memory cgroup stats for
> /mesos/b9ef1fd7-e1e4-42d4-9760-caf41b13dcf9: cache:4KB rss:131068KB
> rss_huge:0KB mapped_file:0KB writeback:0KB inactive_anon:0KB
> active_anon:131068KB inactive_file:4KB active_file:0KB unevictable:0KB
> [118601.599490] [ pid ] uid tgid total_vm rss nr_ptes swapents
> oom_score_adj name
> [118601.599533] [27602] 0 27602 511383 19982 148 0
> 0 java7
> [118601.599541] [27734] 0 27734 47198 1433 50 0
> 0 sudo
> [118601.599548] [27747] 0 27747 424395 18630 88 0
> 0 java
> [118601.599554] Memory cgroup out of memory: Kill process 27602 (java7) score
> 595 or sacrifice child
> [118601.599564] Killed process 27734 (sudo) total-vm:188792kB, anon-rss:1548kB,
> file-rss:4184kB
> [118601.603075] general protection fault: 0000 [#1] SMP
> [118601.603084] Modules linked in: ipv6 dm_mod xen_netfront coretemp hwmon
> x86_pkg_temp_thermal crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel
> ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 microcode pcspkr ext4
> jbd2 mbcache raid0 xen_blkfront
> [118601.603116] CPU: 1 PID: 27748 Comm: java Not tainted 3.12.24 #1
> [118601.603122] task: ffff8800a5c3e940 ti: ffff8801d1b64000 task.ti:
> ffff8801d1b64000
> [118601.603128] RIP: e030:[<ffffffff811a73e0>] [<ffffffff811a73e0>]
> mem_cgroup_oom_synchronize+0x140/0x230
> [118601.604055] RSP: e02b:ffff8801d1b65d58 EFLAGS: 00010287
> [118601.604055] RAX: 0000000000000001 RBX: ffff880004742000 RCX:
> 0000000000000021
> [118601.604055] RDX: ffffffffffffffea RSI: ffff880004740200 RDI:
> ffff880004740200
> [118601.604055] RBP: ffff8801d1b65db8 R08: 000000000000002c R09:
> 0000000000000000
> [118601.604055] R10: 0000000000000001 R11: 0000000000000000 R12:
> ffff880004742000
> [118601.604055] R13: ffff8801d1b65d58 R14: dead000000100100 R15:
> ffff880004742210
> [118601.604055] FS: 00007f8bf500a700(0000) GS:ffff8801dee80000(0000)
> knlGS:0000000000000000
> [118601.604055] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [118601.604055] CR2: 00000000e3935000 CR3: 00000000ecf19000 CR4:
> 0000000000002660
> [118601.604055] Stack:
> [118601.604055] ffff880004742000 0000000000000000 ffff8800a5c3e940
> ffffffff811a22e0
> [118601.604055] ffffffff81c7e098 ffffffff81c7e098 ffff8801d1b65dd8
> 0000000000000006
> [118601.604055] 00000000000000a9 00000000e3935000 ffff8801d1b65f58
> 0000000000000001
> [118601.604055] Call Trace:
> [118601.604055] [<ffffffff811a22e0>] ? mem_cgroup_wait_acct_move+0x110/0x110
> [118601.604055] [<ffffffff81143e68>] pagefault_out_of_memory+0x18/0x90
> [118601.604055] [<ffffffff81057b19>] mm_fault_error+0xa9/0x1a0
> [118601.604055] [<ffffffff8160eb83>] __do_page_fault+0x4a3/0x4f0
> [118601.604055] [<ffffffff81003a03>] ? xen_write_msr_safe+0xa3/0xd0
> [118601.604055] [<ffffffff81012907>] ? __switch_to+0x1a7/0x500
> [118601.604055] [<ffffffff810996a3>] ? finish_task_switch+0x53/0xe0
> [118601.604055] [<ffffffff816088ca>] ? __schedule+0x3fa/0x710
> [118601.604055] [<ffffffff8160ebde>] do_page_fault+0xe/0x10
> [118601.604055] [<ffffffff8160b098>] page_fault+0x28/0x30
> [118601.604055] Code: 44 00 00 48 89 df e8 f0 d1 ff ff 48 85 c0 49 89 c4 74 35
> 4c 8b b0 10 02 00 00 4c 8d b8 10 02 00 00 4d 39 fe 74 1b 0f 1f 44 00 00 <49> 8b
> 7e 10 be 01 00 00 00 e8 12 15 05 00 4d 8b 36 4d 39 fe 75
> [118601.604055] RIP [<ffffffff811a73e0>]
> mem_cgroup_oom_synchronize+0x140/0x230
> [118601.604055] RSP <ffff8801d1b65d58>
> [118601.727935] ---[ end trace f02b14838d14e1af ]---
> [118601.902071] kernel tried to execute NX-protected page - exploit attempt?
> (uid: 0)
> [118601.902081] BUG: unable to handle kernel paging request at ffff8800051400c0
> [118601.902086] IP: [<ffff8800051400c0>] 0xffff8800051400c0
> [118601.902091] PGD 1c0d067 PUD 1c0e067 PMD 654a067 PTE 8010000005140067
> [118601.902097] Oops: 0011 [#2] SMP
> [118601.902100] Modules linked in: ipv6 dm_mod xen_netfront coretemp hwmon
> x86_pkg_temp_thermal crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel
> ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 microcode pcspkr ext4
> jbd2 mbcache raid0 xen_blkfront
> [118601.902120] CPU: 1 PID: 19577 Comm: kworker/1:2 Tainted: G D
> 3.12.24 #1
> [118601.902127] Workqueue: cgroup_destroy css_killed_work_fn
> [118601.902130] task: ffff8800a5d1a740 ti: ffff8801d0ac2000 task.ti:
> ffff8801d0ac2000
> [118601.902134] RIP: e030:[<ffff8800051400c0>] [<ffff8800051400c0>]
> 0xffff8800051400c0
> [118601.902139] RSP: e02b:ffff8801d0ac3ca0 EFLAGS: 00010096
> [118601.902141] RAX: ffff8801d1b65d60 RBX: ffff8800ecebebe8 RCX:
> ffff880004742000
> [118601.902145] RDX: 0000000000000000 RSI: 0000000000000003 RDI:
> ffff8801d1b65d60
> [118601.902148] RBP: ffff8801d0ac3ce8 R08: ffff880004742000 R09:
> 0000000000000400
> [118601.902152] R10: 0000000000007ff0 R11: 0000000000000000 R12:
> 00000000b004d000
> [118601.902155] R13: ffffffff81c7e098 R14: ffff880004742000 R15:
> 0000000000000000
> [118601.902162] FS: 00007f7336da1700(0000) GS:ffff8801dee80000(0000)
> knlGS:0000000000000000
> [118601.902166] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [118601.902169] CR2: ffff8800051400c0 CR3: 00000001d26e4000 CR4:
> 0000000000002660
> [118601.902173] Stack:
> [118601.902175] ffffffff81094969 dead000000200200 0000000300000000
> ffff8801d0ac3ce8
> [118601.902180] ffffffff81c7e090 0000000000000201 0000000000000003
> 0000000000000000
> [118601.902185] ffff880004742000 ffff8801d0ac3d28 ffffffff81096ad8
> ffff88004e733660
> [118601.902190] Call Trace:
> [118601.902197] [<ffffffff81094969>] ? __wake_up_common+0x59/0x90
> [118601.902201] [<ffffffff81096ad8>] __wake_up+0x48/0x70
> [118601.902207] [<ffffffff811a0f0d>] memcg_oom_recover+0x3d/0x40
> [118601.902211] [<ffffffff811a53b0>] mem_cgroup_reparent_charges+0x110/0x150
> [118601.902215] [<ffffffff811a55e8>] mem_cgroup_css_offline+0xb8/0x1b0
> [118601.902218] [<ffffffff810e5c32>] css_killed_work_fn+0x52/0xf0
> [118601.902223] [<ffffffff8108450c>] process_one_work+0x17c/0x420
> [118601.902226] [<ffffffff81085a43>] worker_thread+0x123/0x400
> [118601.902230] [<ffffffff81085920>] ? manage_workers+0x170/0x170
> [118601.902234] [<ffffffff8108b9ce>] kthread+0xce/0xe0
> [118601.902239] [<ffffffff8100394e>] ? xen_end_context_switch+0x1e/0x30
> [118601.902244] [<ffffffff8108b900>] ? kthread_freezable_should_stop+0x70/0x70
> [118601.902250] [<ffffffff816134bc>] ret_from_fork+0x7c/0xb0
> [118601.902254] [<ffffffff8108b900>] ? kthread_freezable_should_stop+0x70/0x70
> [118601.902257] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
> cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc <e0> 29
> 3d fd 00 88 ff ff 48 39 0e fd 00 88 ff ff c0 0c 2a fd 00
> [118601.902288] RIP [<ffff8800051400c0>] 0xffff8800051400c0
> [118601.902291] RSP <ffff8801d0ac3ca0>
> [118601.902293] CR2: ffff8800051400c0
> [118601.902296] ---[ end trace f02b14838d14e1b0 ]---
> [118601.902349] BUG: unable to handle kernel paging request at ffffffffffffffd8
> [118601.902353] IP: [<ffffffff8108b2a0>] kthread_data+0x10/0x20
> [118601.902358] PGD 1c0f067 PUD 1c11067 PMD 0
> [118601.902362] Oops: 0000 [#3] SMP
> [118601.902364] Modules linked in: ipv6 dm_mod xen_netfront coretemp hwmon
> x86_pkg_temp_thermal crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel
> ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 microcode pcspkr ext4
> jbd2 mbcache raid0 xen_blkfront
> [118601.902381] CPU: 1 PID: 19577 Comm: kworker/1:2 Tainted: G D
> 3.12.24 #1
> [118601.903052] task: ffff8800a5d1a740 ti: ffff8801d0ac2000 task.ti:
> ffff8801d0ac2000
> [118601.903052] RIP: e030:[<ffffffff8108b2a0>] [<ffffffff8108b2a0>]
> kthread_data+0x10/0x20
> [118601.903052] RSP: e02b:ffff8801d0ac38d8 EFLAGS: 00010096
> [118601.903052] RAX: 0000000000000000 RBX: 0000000000000001 RCX:
> ffffffff81f790a0
> [118601.903052] RDX: 0000000000000004 RSI: 0000000000000001 RDI:
> ffff8800a5d1a740
> [118601.903052] RBP: ffff8801d0ac38d8 R08: 0000000000000000 R09:
> dead000000200200
> [118601.903052] R10: 00000000da3336c3 R11: 0000000000000000 R12:
> 0000000000000001
> [118601.903052] R13: ffff8800a5d1ad48 R14: 0000000000000001 R15:
> 0000000000000011
> [118601.903052] FS: 00007f7336da1700(0000) GS:ffff8801dee80000(0000)
> knlGS:0000000000000000
> [118601.903052] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [118601.903052] CR2: 0000000000000028 CR3: 00000001d26e4000 CR4:
> 0000000000002660
> [118601.903052] Stack:
> [118601.903052] ffff8801d0ac38f8 ffffffff81082685 ffff8801d0ac38f8
> ffff8801dee94480
> [118601.903052] ffff8801d0ac3988 ffffffff81608a93 ffff8801d0ac3fd8
> 0000000000014480
> [118601.903052] ffff8801d0ac2010 0000000000014480 0000000000014480
> 0000000000014480
> [118601.903052] Call Trace:
> [118601.903052] [<ffffffff81082685>] wq_worker_sleeping+0x15/0xa0
> [118601.903052] [<ffffffff81608a93>] __schedule+0x5c3/0x710
> [118601.903052] [<ffffffff81298252>] ? put_io_context_active+0xc2/0xf0
>
> --
> You are receiving this mail because:
> You are the assignee for the bug.
* Re: [Bug 80881] New: Memory cgroup OOM leads to BUG: unable to handle kernel paging request at ffffffffffffffd8
From: Michal Hocko @ 2014-07-24 12:09 UTC
To: Andrew Morton; +Cc: Johannes Weiner, bugzilla-daemon, linux-mm, Paul Furtado
On Tue 22-07-14 13:07:41, Andrew Morton wrote:
[...]
> > The full log is attached, but here is the part I believe is relevant from the
> > 3.16.0-rc5 error:
> > [162005.262545] memory: usage 131072kB, limit 131072kB, failcnt 1314
> > [162005.262550] memory+swap: usage 0kB, limit 18014398509481983kB, failcnt 0
> > [162005.262554] kmem: usage 0kB, limit 18014398509481983kB, failcnt 0
> > [162005.262558] Memory cgroup stats for
> > /mesos/c206ce2a-9f11-4340-a3c9-c59b405690a7: cache:8KB rss:131064KB
> > rss_huge:0KB mapped_file:0KB writeback:0KB inactive_anon:0KB
> > active_anon:131064KB inactive_file:0KB active_file:0KB unevictable:0KB
> > [162005.262581] [ pid ] uid tgid total_vm rss nr_ptes swapents
> > oom_score_adj name
> > [162005.262602] [ 3002] 0 3002 544153 22244 151 0
> > 0 java7
> > [162005.262609] [ 3061] 0 3061 424397 20423 88 0
> > 0 java
> > [162005.262615] Memory cgroup out of memory: Kill process 3002 (java7) score
> > 662 or sacrifice child
> > [162005.262623] Killed process 3002 (java7) total-vm:2176612kB,
> > anon-rss:60400kB, file-rss:28576kB
Nothing unusual here.
[fixed up line wraps]
> [162005.263453] general protection fault: 0000 [#1] SMP
> [162005.263463] Modules linked in: ipv6 dm_mod xen_netfront coretemp hwmon x86_pkg_temp_thermal crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 microcode pcspkr ext4 jbd2 mbcache raid0 xen_blkfront
> [162005.264060] CPU: 3 PID: 3062 Comm: java Not tainted 3.16.0-rc5 #1
> [162005.264060] task: ffff8801cfe8f170 ti: ffff8801d2ec4000 task.ti: ffff8801d2ec4000
> [162005.264060] RIP: e030:[<ffffffff811c0b80>] [<ffffffff811c0b80>] mem_cgroup_oom_synchronize+0x140/0x240
> [162005.264060] RSP: e02b:ffff8801d2ec7d48 EFLAGS: 00010283
> [162005.264060] RAX: 0000000000000001 RBX: ffff88009d633800 RCX: 000000000000000e
> [162005.264060] RDX: fffffffffffffffe RSI: ffff88009d630200 RDI: ffff88009d630200
> [162005.264060] RBP: ffff8801d2ec7da8 R08: 0000000000000012 R09: 00000000fffffffe
> [162005.264060] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88009d633800
> [162005.264060] R13: ffff8801d2ec7d48 R14: dead000000100100 R15: ffff88009d633a30
> [162005.264060] FS: 00007f1748bb4700(0000) GS:ffff8801def80000(0000) knlGS:0000000000000000
> [162005.264060] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [162005.264060] CR2: 00007f4110300308 CR3: 00000000c05f7000 CR4: 0000000000002660
> [162005.264060] Stack:
> [162005.264060] ffff88009d633800 0000000000000000 ffff8801cfe8f170 ffffffff811bae10
> [162005.264060] ffffffff81ca73f8 ffffffff81ca73f8 ffff8801d2ec7dc8 0000000000000006
> [162005.264060] 00000000e3b30000 00000000e3b30000 ffff8801d2ec7f58 0000000000000001
> [162005.264060] Call Trace:
> [162005.264060] [<ffffffff811bae10>] ? mem_cgroup_wait_acct_move+0x110/0x110
> [162005.264060] [<ffffffff81159628>] pagefault_out_of_memory+0x18/0x90
> [162005.264060] [<ffffffff8105cee9>] mm_fault_error+0xa9/0x1a0
> [162005.264060] [<ffffffff8105d488>] __do_page_fault+0x478/0x4c0
> [162005.264060] [<ffffffff81004f00>] ? xen_mc_flush+0xb0/0x1b0
> [162005.264060] [<ffffffff81003ab3>] ? xen_write_msr_safe+0xa3/0xd0
> [162005.264060] [<ffffffff81012a40>] ? __switch_to+0x2d0/0x600
> [162005.264060] [<ffffffff8109e273>] ? finish_task_switch+0x53/0xf0
> [162005.264060] [<ffffffff81643b0a>] ? __schedule+0x37a/0x6d0
> [162005.264060] [<ffffffff8105d5dc>] do_page_fault+0x2c/0x40
> [162005.264060] [<ffffffff81649858>] page_fault+0x28/0x30
> [162005.264060] Code: 44 00 00 48 89 df e8 40 ca ff ff 48 85 c0 49 89 c4 74 35 4c 8b b0 30 02 00 00 4c 8d b8 30 02 00 00 4d 39 fe 74 1b 0f 1f 44 00 00 <49> 8b 7e 10 be 01 00 00 00 e8 42 d2 04 00 4d 8b 36 4d 39 fe 75
> [162005.264060] RIP [<ffffffff811c0b80>] mem_cgroup_oom_synchronize+0x140/0x240
> [162005.264060] RSP <ffff8801d2ec7d48>
> [162005.458051] ---[ end trace 050b00c5503ce96a ]---
This decodes to:
[162005.264060] Code: 44 00 00 48 89 df e8 40 ca ff ff 48 85 c0 49 89 c4 74 35 4c 8b b0 30 02 00 00 4c 8d b8 30 02 00 00 4d 39 fe 74 1b 0f 1f 44 00 00 <49> 8b 7e 10 be 01 00 00 00 e8 42 d2 04 00 4d 8b 36 4d 39 fe 75
All code
========
   0:   44 00 00                add    %r8b,(%rax)
   3:   48 89 df                mov    %rbx,%rdi
   6:   e8 40 ca ff ff          callq  0xffffffffffffca4b
   b:   48 85 c0                test   %rax,%rax
   e:   49 89 c4                mov    %rax,%r12
  11:   74 35                   je     0x48
  13:   4c 8b b0 30 02 00 00    mov    0x230(%rax),%r14
  1a:   4c 8d b8 30 02 00 00    lea    0x230(%rax),%r15
  21:   4d 39 fe                cmp    %r15,%r14
  24:   74 1b                   je     0x41
  26:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
  2b:*  49 8b 7e 10             mov    0x10(%r14),%rdi          <-- trapping instruction
  2f:   be 01 00 00 00          mov    $0x1,%esi
  34:   e8 42 d2 04 00          callq  0x4d27b
  39:   4d 8b 36                mov    (%r14),%r14
  3c:   4d 39 fe                cmp    %r15,%r14
  3f:   75                      .byte 0x75
R14 is dead000000100100, which is a poison value (LIST_POISON1, which
list_del() stores in ->next of a removed entry). If I am reading the
code correctly, this should be somewhere in mem_cgroup_oom_notify_cb,
where we stumble over an event which has been removed from the notify
chain.
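As a rough userspace illustration of that reading (not kernel code): the
sketch below re-creates a minimal list_head with list_del() poisoning and
a two-field entry standing in for mem_cgroup_eventfd_list. The struct
layout, the helper names and the LP64 assumption are mine; only the
poison constants and the 0x10 offset are taken from the oops above. It
computes the address a walker would load from if its current entry is
unlinked underneath it, and checks whether that address is canonical on
x86-64, which would explain why the first oops is a general protection
fault rather than a page fault.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel definitions (simplified, LP64 assumed). */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0xdead000000100100ULL)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0xdead000000200200ULL)

struct list_head { struct list_head *next, *prev; };

/* Hypothetical model of mem_cgroup_eventfd_list: list node, then payload. */
struct eventfd_list {
	struct list_head list;
	void *eventfd;
};

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add(struct list_head *n, struct list_head *h)
{
	n->next = h->next;
	n->prev = h;
	h->next->prev = n;
	h->next = n;
}

/* Unlink and poison, as the kernel's list_del() does. */
static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = LIST_POISON1;
	e->prev = LIST_POISON2;
}

/* x86-64 canonical check: bits 63..47 must all be equal. */
static int is_canonical(uint64_t addr)
{
	uint64_t top = addr >> 47;
	return top == 0 || top == 0x1ffff;
}

/*
 * A walker sits on an entry, the entry is unlinked (as a concurrent
 * unregister would do), and the walker then advances. Returns the address
 * it would read ->eventfd from next, without dereferencing it.
 */
static uint64_t walk_after_unlink(void)
{
	struct list_head head;
	struct eventfd_list ev = { .eventfd = NULL };
	struct list_head *cursor;

	list_init(&head);
	list_add(&ev.list, &head);

	cursor = head.next;              /* walker is on ev */
	assert(cursor == &ev.list);
	list_del(&ev.list);              /* entry removed underneath it */
	cursor = cursor->next;           /* walker advances: now the poison */

	return (uint64_t)(uintptr_t)cursor +
	       offsetof(struct eventfd_list, eventfd);
}
```

Calling walk_after_unlink() from a small main() and passing the result to
is_canonical() shows the load landing 0x10 bytes past the poison value,
matching "mov 0x10(%r14),%rdi" with R14 = dead000000100100.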
And indeed there is nothing to protect the oom_notify chain in the oom
path. {Un}registration is protected by memcg_oom_lock. That lock is also
used in mem_cgroup_oom_trylock, but it is taken only locally in that
function, so it is not held while the chain is walked. The issue seems
to have been introduced by fb2a6fc56be6 (mm: memcg: rework and document
OOM waiting and wakeup) in 3.12.
The simplest fix would be to take memcg_oom_lock inside
mem_cgroup_oom_notify_cb, but I cannot say I like it much. Another
approach would be to use RCU for mem_cgroup_eventfd_list deallocation
and {un}linking.
Let's go with the simpler route for now, though, as this is not a hot
path.
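To make the simpler route concrete, here is a userspace model of the
idea. It is only a sketch and not the patch that followed: a C11
atomic_flag spinlock stands in for memcg_oom_lock, a counter stands in
for eventfd_signal(), and every name below is hypothetical. The point it
shows is that the walker takes the same lock as {un}registration, so an
entry cannot be unlinked while the walk is standing on it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Hypothetical model of mem_cgroup_eventfd_list. */
struct event {
	struct list_head list;   /* must stay first: see the cast in the walk */
	int signalled;           /* stand-in for eventfd_signal() */
};

/* Stand-ins for memcg_oom_lock and memcg->oom_notify. */
static atomic_flag oom_lock = ATOMIC_FLAG_INIT;
static struct list_head oom_notify = { &oom_notify, &oom_notify };

static void oom_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&oom_lock, memory_order_acquire))
		;   /* spin */
}

static void oom_lock_release(void)
{
	atomic_flag_clear_explicit(&oom_lock, memory_order_release);
}

static void event_register(struct event *ev)
{
	oom_lock_acquire();
	ev->list.next = oom_notify.next;
	ev->list.prev = &oom_notify;
	oom_notify.next->prev = &ev->list;
	oom_notify.next = &ev->list;
	oom_lock_release();
}

static void event_unregister(struct event *ev)
{
	oom_lock_acquire();
	assert(ev->list.next && ev->list.prev);   /* must be registered */
	ev->list.prev->next = ev->list.next;
	ev->list.next->prev = ev->list.prev;
	ev->list.next = ev->list.prev = NULL;
	oom_lock_release();
}

/* The walk holds the same lock as {un}registration for its whole duration. */
static int oom_notify_all(void)
{
	int n = 0;

	oom_lock_acquire();
	for (struct list_head *p = oom_notify.next; p != &oom_notify; p = p->next) {
		((struct event *)p)->signalled++;
		n++;
	}
	oom_lock_release();
	return n;
}
```

The cost is that notification and {un}registration serialize on one lock,
which is the trade-off accepted above because this is not a hot path.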
---
* Re: [Bug 80881] New: Memory cgroup OOM leads to BUG: unable to handle kernel paging request at ffffffffffffffd8
From: Johannes Weiner @ 2014-07-24 12:34 UTC
To: Michal Hocko; +Cc: Andrew Morton, bugzilla-daemon, linux-mm, Paul Furtado
Hi Michal,
On Thu, Jul 24, 2014 at 02:09:59PM +0200, Michal Hocko wrote:
> On Tue 22-07-14 13:07:41, Andrew Morton wrote:
> [...]
> > > The full log is attached, but here is the part I believe is relevant from the
> > > 3.16.0-rc5 error:
> > > [162005.262545] memory: usage 131072kB, limit 131072kB, failcnt 1314
> > > [162005.262550] memory+swap: usage 0kB, limit 18014398509481983kB, failcnt 0
> > > [162005.262554] kmem: usage 0kB, limit 18014398509481983kB, failcnt 0
> > > [162005.262558] Memory cgroup stats for
> > > /mesos/c206ce2a-9f11-4340-a3c9-c59b405690a7: cache:8KB rss:131064KB
> > > rss_huge:0KB mapped_file:0KB writeback:0KB inactive_anon:0KB
> > > active_anon:131064KB inactive_file:0KB active_file:0KB unevictable:0KB
> > > [162005.262581] [ pid ] uid tgid total_vm rss nr_ptes swapents
> > > oom_score_adj name
> > > [162005.262602] [ 3002] 0 3002 544153 22244 151 0
> > > 0 java7
> > > [162005.262609] [ 3061] 0 3061 424397 20423 88 0
> > > 0 java
> > > [162005.262615] Memory cgroup out of memory: Kill process 3002 (java7) score
> > > 662 or sacrifice child
> > > [162005.262623] Killed process 3002 (java7) total-vm:2176612kB,
> > > anon-rss:60400kB, file-rss:28576kB
>
> Nothing unusual here.
>
> [fixed up line wraps]
> > [162005.263453] general protection fault: 0000 [#1] SMP
> > [162005.263463] Modules linked in: ipv6 dm_mod xen_netfront coretemp hwmon x86_pkg_temp_thermal crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 microcode pcspkr ext4 jbd2 mbcache raid0 xen_blkfront
> > [162005.264060] CPU: 3 PID: 3062 Comm: java Not tainted 3.16.0-rc5 #1
> > [162005.264060] task: ffff8801cfe8f170 ti: ffff8801d2ec4000 task.ti: ffff8801d2ec4000
> > [162005.264060] RIP: e030:[<ffffffff811c0b80>] [<ffffffff811c0b80>] mem_cgroup_oom_synchronize+0x140/0x240
> > [162005.264060] RSP: e02b:ffff8801d2ec7d48 EFLAGS: 00010283
> > [162005.264060] RAX: 0000000000000001 RBX: ffff88009d633800 RCX: 000000000000000e
> > [162005.264060] RDX: fffffffffffffffe RSI: ffff88009d630200 RDI: ffff88009d630200
> > [162005.264060] RBP: ffff8801d2ec7da8 R08: 0000000000000012 R09: 00000000fffffffe
> > [162005.264060] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88009d633800
> > [162005.264060] R13: ffff8801d2ec7d48 R14: dead000000100100 R15: ffff88009d633a30
> > [162005.264060] FS: 00007f1748bb4700(0000) GS:ffff8801def80000(0000) knlGS:0000000000000000
> > [162005.264060] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> > [162005.264060] CR2: 00007f4110300308 CR3: 00000000c05f7000 CR4: 0000000000002660
> > [162005.264060] Stack:
> > [162005.264060] ffff88009d633800 0000000000000000 ffff8801cfe8f170 ffffffff811bae10
> > [162005.264060] ffffffff81ca73f8 ffffffff81ca73f8 ffff8801d2ec7dc8 0000000000000006
> > [162005.264060] 00000000e3b30000 00000000e3b30000 ffff8801d2ec7f58 0000000000000001
> > [162005.264060] Call Trace:
> > [162005.264060] [<ffffffff811bae10>] ? mem_cgroup_wait_acct_move+0x110/0x110
> > [162005.264060] [<ffffffff81159628>] pagefault_out_of_memory+0x18/0x90
> > [162005.264060] [<ffffffff8105cee9>] mm_fault_error+0xa9/0x1a0
> > [162005.264060] [<ffffffff8105d488>] __do_page_fault+0x478/0x4c0
> > [162005.264060] [<ffffffff81004f00>] ? xen_mc_flush+0xb0/0x1b0
> > [162005.264060] [<ffffffff81003ab3>] ? xen_write_msr_safe+0xa3/0xd0
> > [162005.264060] [<ffffffff81012a40>] ? __switch_to+0x2d0/0x600
> > [162005.264060] [<ffffffff8109e273>] ? finish_task_switch+0x53/0xf0
> > [162005.264060] [<ffffffff81643b0a>] ? __schedule+0x37a/0x6d0
> > [162005.264060] [<ffffffff8105d5dc>] do_page_fault+0x2c/0x40
> > [162005.264060] [<ffffffff81649858>] page_fault+0x28/0x30
> > [162005.264060] Code: 44 00 00 48 89 df e8 40 ca ff ff 48 85 c0 49 89 c4 74 35 4c 8b b0 30 02 00 00 4c 8d b8 30 02 00 00 4d 39 fe 74 1b 0f 1f 44 00 00 <49> 8b 7e 10 be 01 00 00 00 e8 42 d2 04 00 4d 8b 36 4d 39 fe 75
> > [162005.264060] RIP [<ffffffff811c0b80>] mem_cgroup_oom_synchronize+0x140/0x240
> > [162005.264060] RSP <ffff8801d2ec7d48>
> > [162005.458051] ---[ end trace 050b00c5503ce96a ]---
>
> This decodes to:
> [162005.264060] Code: 44 00 00 48 89 df e8 40 ca ff ff 48 85 c0 49 89 c4 74 35 4c 8b b0 30 02 00 00 4c 8d b8 30 02 00 00 4d 39 fe 74 1b 0f 1f 44 00 00 <49> 8b 7e 10 be 01 00 00 00 e8 42 d2 04 00 4d 8b 36 4d 39 fe 75
> All code
> ========
> 0: 44 00 00 add %r8b,(%rax)
> 3: 48 89 df mov %rbx,%rdi
> 6: e8 40 ca ff ff callq 0xffffffffffffca4b
> b: 48 85 c0 test %rax,%rax
> e: 49 89 c4 mov %rax,%r12
> 11: 74 35 je 0x48
> 13: 4c 8b b0 30 02 00 00 mov 0x230(%rax),%r14
> 1a: 4c 8d b8 30 02 00 00 lea 0x230(%rax),%r15
> 21: 4d 39 fe cmp %r15,%r14
> 24: 74 1b je 0x41
> 26: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> 2b:* 49 8b 7e 10 mov 0x10(%r14),%rdi <-- trapping instruction
> 2f: be 01 00 00 00 mov $0x1,%esi
> 34: e8 42 d2 04 00 callq 0x4d27b
> 39: 4d 8b 36 mov (%r14),%r14
> 3c: 4d 39 fe cmp %r15,%r14
> 3f: 75 .byte 0x75
>
> R14 is dead000000100100, which is a list poison value (LIST_POISON1
> on x86_64, written to ->next by list_del()). If I am reading the code
> correctly, this is in mem_cgroup_oom_notify_cb, where we stumble over
> an event which has been removed from the notify chain.
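
To make the poison-value reasoning above concrete, here is a minimal userspace sketch in C. It is not the kernel source: `struct event` and `next_after_concurrent_del()` are hypothetical names, the list helpers are simplified stand-ins for the kernel's, and the poison constants assume the x86_64 values. It models a walker that is standing on an entry when another context unlinks it, and shows that the walker's next step loads LIST_POISON1, which is what R14 holds in the oops.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Poison values, assuming the x86_64 definitions (LP64 only). */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0xdead000000100100ULL)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0xdead000000200200ULL)

static void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink the entry, then poison its stale pointers, as list_del() does. */
static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->next = LIST_POISON1;
	entry->prev = LIST_POISON2;
}

/*
 * Hypothetical model of one registered event. On LP64 the pointer
 * after the 16-byte list_head sits at offset 0x10, matching the
 * 0x10(%r14) load in the trapping instruction.
 */
struct event {
	struct list_head list;
	void *eventfd;
};

/*
 * Model a walker that is standing on an entry when it gets unlinked,
 * and return the pointer the walker would follow next.
 */
static struct list_head *next_after_concurrent_del(void)
{
	struct list_head head;
	struct event ev = { .eventfd = NULL };
	struct list_head *cur;

	list_init(&head);
	list_add(&ev.list, &head);

	cur = head.next;	/* the walker has reached ev */
	list_del(&ev.list);	/* racing unregistration */
	return cur->next;	/* stale ->next, now poisoned */
}
```

Dereferencing that pointer in the kernel touches a non-canonical address, which is consistent with the report showing a general protection fault rather than an ordinary page fault.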
>
> And indeed there is nothing to protect the oom_notify chain in the oom
> path. {Un}registration is protected by memcg_oom_lock, which is also
> used in mem_cgroup_oom_trylock, but it is taken only locally in that
> function. The issue seems to have been introduced by fb2a6fc56be6 (mm:
> memcg: rework and document OOM waiting and wakeup) in 3.12.
>
> The simplest fix would be to take memcg_oom_lock inside
> mem_cgroup_oom_notify_cb, though I am not very fond of it. Another
> approach would be to use RCU for mem_cgroup_eventfd_list deallocation
> and {un}linking.
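
The first option (one lock covering both the notification walk and unregistration) can be sketched in userspace C as follows. This is only an illustration under stated assumptions, not the patch itself: a pthread mutex stands in for memcg_oom_lock, a singly linked list stands in for the oom_notify chain, a counter stands in for signalling the eventfd, and every identifier below is hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

struct event {
	struct event *next;
	int signalled;			/* stands in for signalling the eventfd */
};

struct group {
	pthread_mutex_t oom_lock;	/* plays the role of memcg_oom_lock */
	struct event *events;		/* plays the role of the oom_notify chain */
};

static void group_init(struct group *g)
{
	pthread_mutex_init(&g->oom_lock, NULL);
	g->events = NULL;
}

/* Link a new event under the lock. Returns NULL on allocation failure. */
static struct event *event_register(struct group *g)
{
	struct event *ev = calloc(1, sizeof(*ev));

	if (!ev)
		return NULL;
	pthread_mutex_lock(&g->oom_lock);
	ev->next = g->events;
	g->events = ev;
	pthread_mutex_unlock(&g->oom_lock);
	return ev;
}

/* Unlink under the lock; free only once no walker can still see the entry. */
static void event_unregister(struct group *g, struct event *ev)
{
	pthread_mutex_lock(&g->oom_lock);
	for (struct event **pp = &g->events; *pp; pp = &(*pp)->next) {
		if (*pp == ev) {
			*pp = ev->next;
			break;
		}
	}
	pthread_mutex_unlock(&g->oom_lock);
	free(ev);
}

/*
 * The walk takes the same lock, so no entry can be unlinked or freed
 * while it is being visited. Returns the number of events signalled.
 */
static int oom_notify(struct group *g)
{
	int n = 0;

	pthread_mutex_lock(&g->oom_lock);
	for (struct event *ev = g->events; ev; ev = ev->next) {
		ev->signalled++;
		n++;
	}
	pthread_mutex_unlock(&g->oom_lock);
	return n;
}
```

The cost is that every notification serializes against registration and unregistration, which is acceptable only because this is not a hot path. An RCU-based variant would avoid the lock on the read side at the price of deferring the free.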
Thanks a lot for looking into this. Your analysis makes sense to me.
Would it be better to move mem_cgroup_oom_notify() directly into the
trylock function while the memcg_oom_lock is still held?
> Let's go with the simpler route for now, though, as this is not a hot path.
> ---
> From 2c2642dbfb3f7d8c9f20f7793850426daa770078 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.cz>
> Date: Thu, 24 Jul 2014 14:00:39 +0200
> Subject: [PATCH] memcg: oom_notify use-after-free fix
>
> Paul Furtado has reported the following GPF:
> general protection fault: 0000 [#1] SMP
> Modules linked in: ipv6 dm_mod xen_netfront coretemp hwmon x86_pkg_temp_thermal crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 microcode pcspkr ext4 jbd2 mbcache raid0 xen_blkfront
> CPU: 3 PID: 3062 Comm: java Not tainted 3.16.0-rc5 #1
> task: ffff8801cfe8f170 ti: ffff8801d2ec4000 task.ti: ffff8801d2ec4000
> RIP: e030:[<ffffffff811c0b80>] [<ffffffff811c0b80>] mem_cgroup_oom_synchronize+0x140/0x240
> RSP: e02b:ffff8801d2ec7d48 EFLAGS: 00010283
> RAX: 0000000000000001 RBX: ffff88009d633800 RCX: 000000000000000e
> RDX: fffffffffffffffe RSI: ffff88009d630200 RDI: ffff88009d630200
> RBP: ffff8801d2ec7da8 R08: 0000000000000012 R09: 00000000fffffffe
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff88009d633800
> R13: ffff8801d2ec7d48 R14: dead000000100100 R15: ffff88009d633a30
> FS: 00007f1748bb4700(0000) GS:ffff8801def80000(0000) knlGS:0000000000000000
> CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00007f4110300308 CR3: 00000000c05f7000 CR4: 0000000000002660
> Stack:
> ffff88009d633800 0000000000000000 ffff8801cfe8f170 ffffffff811bae10
> ffffffff81ca73f8 ffffffff81ca73f8 ffff8801d2ec7dc8 0000000000000006
> 00000000e3b30000 00000000e3b30000 ffff8801d2ec7f58 0000000000000001
> Call Trace:
> [<ffffffff811bae10>] ? mem_cgroup_wait_acct_move+0x110/0x110
> [<ffffffff81159628>] pagefault_out_of_memory+0x18/0x90
> [<ffffffff8105cee9>] mm_fault_error+0xa9/0x1a0
> [<ffffffff8105d488>] __do_page_fault+0x478/0x4c0
> [<ffffffff81004f00>] ? xen_mc_flush+0xb0/0x1b0
> [<ffffffff81003ab3>] ? xen_write_msr_safe+0xa3/0xd0
> [<ffffffff81012a40>] ? __switch_to+0x2d0/0x600
> [<ffffffff8109e273>] ? finish_task_switch+0x53/0xf0
> [<ffffffff81643b0a>] ? __schedule+0x37a/0x6d0
> [<ffffffff8105d5dc>] do_page_fault+0x2c/0x40
> [<ffffffff81649858>] page_fault+0x28/0x30
> Code: 44 00 00 48 89 df e8 40 ca ff ff 48 85 c0 49 89 c4 74 35 4c 8b b0 30 02 00 00 4c 8d b8 30 02 00 00 4d 39 fe 74 1b 0f 1f 44 00 00 <49> 8b 7e 10 be 01 00 00 00 e8 42 d2 04 00 4d 8b 36 4d 39 fe 75
> RIP [<ffffffff811c0b80>] mem_cgroup_oom_synchronize+0x140/0x240
> RSP <ffff8801d2ec7d48>
> ---[ end trace 050b00c5503ce96a ]---
>
> fb2a6fc56be6 (mm: memcg: rework and document OOM waiting and wakeup) has
> moved mem_cgroup_oom_notify outside of memcg_oom_lock, assuming it is
> protected by the hierarchical OOM-lock. Although this is true for the
> notification part, the protection doesn't cover unregistration of an
> event, which can now happen in parallel, so mem_cgroup_oom_notify can
> see an already unlinked and/or freed mem_cgroup_eventfd_list.
>
> Fix this by also taking memcg_oom_lock in mem_cgroup_oom_notify.
>
> Reported-by: Paul Furtado <paulfurtado91@gmail.com>
> Fixes: fb2a6fc56be6 (mm: memcg: rework and document OOM waiting and wakeup)
> Cc: stable@vger.kernel.org # 3.12+
> Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [Bug 80881] New: Memory cgroup OOM leads to BUG: unable to handle kernel paging request at ffffffffffffffd8
2014-07-24 12:34 ` Johannes Weiner
@ 2014-07-24 13:15 ` Michal Hocko
2014-07-25 2:55 ` Paul Furtado
0 siblings, 1 reply; 5+ messages in thread
From: Michal Hocko @ 2014-07-24 13:15 UTC (permalink / raw)
To: Johannes Weiner; +Cc: Andrew Morton, bugzilla-daemon, linux-mm, Paul Furtado
On Thu 24-07-14 08:34:56, Johannes Weiner wrote:
[...]
> Would it be better to move mem_cgroup_oom_notify() directly into the
> trylock function while the memcg_oom_lock is still held?
I don't know. It sounds like mixing two things together. I would rather
keep them separate unless we have a good reason to do otherwise. Sharing
the same lock is mostly a coincidence; it is required so that the
registration code does not miss an event.
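
The point about registration not missing an event can be illustrated with a simplified userspace model in C. This is a sketch under loud assumptions, not the kernel code: it uses a pthread mutex, keeps a plain `under_oom` flag under that same mutex (the kernel tracks this state separately), and all names are hypothetical. The idea shown is that a listener registering while an OOM is already in progress gets signalled immediately, and that doing the check and the linking in one critical section leaves no window in which the event could be lost.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct event {
	struct event *next;
	int signalled;		/* stands in for signalling the eventfd */
};

struct group {
	pthread_mutex_t lock;	/* plays the role of memcg_oom_lock */
	bool under_oom;		/* simplification: kept under the same lock */
	struct event *events;
};

static void group_init(struct group *g)
{
	pthread_mutex_init(&g->lock, NULL);
	g->under_oom = false;
	g->events = NULL;
}

/* OOM side: mark the group and notify everyone already registered. */
static void oom_begin(struct group *g)
{
	pthread_mutex_lock(&g->lock);
	g->under_oom = true;
	for (struct event *ev = g->events; ev; ev = ev->next)
		ev->signalled++;
	pthread_mutex_unlock(&g->lock);
}

static void oom_end(struct group *g)
{
	pthread_mutex_lock(&g->lock);
	g->under_oom = false;
	pthread_mutex_unlock(&g->lock);
}

/*
 * Registration side: link the event and, if an OOM is already in
 * progress, signal it right away so a late listener still hears it.
 */
static void event_register(struct group *g, struct event *ev)
{
	pthread_mutex_lock(&g->lock);
	ev->signalled = 0;
	ev->next = g->events;
	g->events = ev;
	if (g->under_oom)
		ev->signalled++;
	pthread_mutex_unlock(&g->lock);
}
```

In this model a listener is signalled exactly once per OOM whether it registered before or during it, and not at all if it registered after the OOM ended.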
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Thanks!
--
Michal Hocko
SUSE Labs
* Re: [Bug 80881] New: Memory cgroup OOM leads to BUG: unable to handle kernel paging request at ffffffffffffffd8
2014-07-24 13:15 ` Michal Hocko
@ 2014-07-25 2:55 ` Paul Furtado
0 siblings, 0 replies; 5+ messages in thread
From: Paul Furtado @ 2014-07-25 2:55 UTC (permalink / raw)
To: Michal Hocko; +Cc: Johannes Weiner, Andrew Morton, bugzilla-daemon, linux-mm
Thanks for the fix and the quick turnaround time! I applied the patch
on top of 3.16.0-rc5 and we have 75 servers running anywhere from 1-20
OOMs in parallel now. They've been running for about 3 hours and no
issues yet, although it usually takes a few days to start reproducing
the oopses. I'll report back if we hit any issues. Thanks again!
Thread overview: 5+ messages
[not found] <bug-80881-27@https.bugzilla.kernel.org/>
2014-07-22 20:07 ` [Bug 80881] New: Memory cgroup OOM leads to BUG: unable to handle kernel paging request at ffffffffffffffd8 Andrew Morton
2014-07-24 12:09 ` Michal Hocko
2014-07-24 12:34 ` Johannes Weiner
2014-07-24 13:15 ` Michal Hocko
2014-07-25 2:55 ` Paul Furtado