diff for duplicates of <20170518132033.GA12219@castle>

diff --git a/a/1.txt b/N1/1.txt
index d760823..dc5ef32 100644
--- a/a/1.txt
+++ b/N1/1.txt
@@ -21,3 +21,108 @@ If we consider this approach, I've prepared a separate patch for this problem
 (stripped all oom reaper list stuff).
 
 Thanks!
+
+>From 317fad44a0fe79fb76e8e4fd6bd81c52ae1712e9 Mon Sep 17 00:00:00 2001
+From: Roman Gushchin <guro@fb.com>
+Date: Tue, 16 May 2017 21:19:56 +0100
+Subject: [PATCH] mm,oom: prevent OOM double kill from a pagefault handling
+ path
+
+During the debugging of some OOM-related stuff, I've noticed
+that sometimes OOM kills two processes instead of one.
+
+The problem can be easily reproduced on a vanilla kernel:
+
+[ 25.721494] allocate invoked oom-killer: gfp_mask=0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0
+[ 25.725658] allocate cpuset=/ mems_allowed=0
+[ 25.727033] CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181
+[ 25.729215] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
+[ 25.729598] Call Trace:
+[ 25.729598] dump_stack+0x63/0x82
+[ 25.729598] dump_header+0x97/0x21a
+[ 25.729598] ? do_try_to_free_pages+0x2d7/0x360
+[ 25.729598] ? security_capable_noaudit+0x45/0x60
+[ 25.729598] oom_kill_process+0x219/0x3e0
+[ 25.729598] out_of_memory+0x11d/0x480
+[ 25.729598] __alloc_pages_slowpath+0xc84/0xd40
+[ 25.729598] __alloc_pages_nodemask+0x245/0x260
+[ 25.729598] alloc_pages_vma+0xa2/0x270
+[ 25.729598] __handle_mm_fault+0xca9/0x10c0
+[ 25.729598] handle_mm_fault+0xf3/0x210
+[ 25.729598] __do_page_fault+0x240/0x4e0
+[ 25.729598] trace_do_page_fault+0x37/0xe0
+[ 25.729598] do_async_page_fault+0x19/0x70
+[ 25.729598] async_page_fault+0x28/0x30
+< cut >
+[ 25.810868] oom_reaper: reaped process 492 (allocate), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
+< cut >
+[ 25.817589] allocate invoked oom-killer: gfp_mask=0x0(), nodemask=(null), order=0, oom_score_adj=0
+[ 25.818821] allocate cpuset=/ mems_allowed=0
+[ 25.819259] CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181
+[ 25.819847] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
+[ 25.820549] Call Trace:
+[ 25.820733] dump_stack+0x63/0x82
+[ 25.820961] dump_header+0x97/0x21a
+[ 25.820961] ? security_capable_noaudit+0x45/0x60
+[ 25.820961] oom_kill_process+0x219/0x3e0
+[ 25.820961] out_of_memory+0x11d/0x480
+[ 25.820961] pagefault_out_of_memory+0x68/0x80
+[ 25.820961] mm_fault_error+0x8f/0x190
+[ 25.820961] ? handle_mm_fault+0xf3/0x210
+[ 25.820961] __do_page_fault+0x4b2/0x4e0
+[ 25.820961] trace_do_page_fault+0x37/0xe0
+[ 25.820961] do_async_page_fault+0x19/0x70
+[ 25.820961] async_page_fault+0x28/0x30
+< cut >
+[ 25.863078] Out of memory: Kill process 233 (firewalld) score 10 or sacrifice child
+[ 25.863634] Killed process 233 (firewalld) total-vm:246076kB, anon-rss:20956kB, file-rss:0kB, shmem-rss:0kB
+
+This actually happens if pagefault_out_of_memory() is called
+after the calling process has already been selected as an OOM victim
+and killed. There is a race with the oom reaper: if the process
+is reaped before it enters out_of_memory(), the MMF_OOM_SKIP
+flag is set, and out_of_memory() will not consider the process
+as a eligible victim. That means that another victim will be selected
+and killed.
+
+Tetsuo Handa has noticed, that this is a side effect of
+commit 9a67f6488eca926f ("mm: consolidate GFP_NOFAIL checks
+in the allocator slowpath").
+
+To avoid this, out_of_memory() shouldn't be called from
+pagefault_out_of_memory(), if current task already
+has been chosen as an oom victim.
+
+v2: dropped changes related to the oom_reaper synchronization,
+    as it looks like a separate and minor issue;
+    rebased on new mm;
+    renamed, updated commit message.
+
+Signed-off-by: Roman Gushchin <guro@fb.com>
+Cc: Michal Hocko <mhocko@suse.com>
+Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
+Cc: Johannes Weiner <hannes@cmpxchg.org>
+Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
+Cc: kernel-team@fb.com
+Cc: linux-mm@kvack.org
+Cc: linux-kernel@vger.kernel.org
+---
+ mm/oom_kill.c | 3 +++
+ 1 file changed, 3 insertions(+)
+
+diff --git a/mm/oom_kill.c b/mm/oom_kill.c
+index 04c9143..9c643a3 100644
+--- a/mm/oom_kill.c
++++ b/mm/oom_kill.c
+@@ -1068,6 +1068,9 @@ void pagefault_out_of_memory(void)
+ 	if (mem_cgroup_oom_synchronize(true))
+ 		return;
+ 
++	if (tsk_is_oom_victim(current))
++		return;
++
+ 	if (!mutex_trylock(&oom_lock))
+ 		return;
+ 	out_of_memory(&oc);
+-- 
+2.7.4
diff --git a/a/content_digest b/N1/content_digest
index bbec673..a5f9bbd 100644
--- a/a/content_digest
+++ b/N1/content_digest
@@ -9,11 +9,11 @@
  "Date\0Thu, 18 May 2017 14:20:33 +0100\0"
  "To\0Michal Hocko <mhocko@kernel.org>\0"
  "Cc\0Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>"
- hannes@cmpxchg.org
- vdavydov.dev@gmail.com
- kernel-team@fb.com
- linux-mm@kvack.org
- " linux-kernel@vger.kernel.org\0"
+ <hannes@cmpxchg.org>
+ <vdavydov.dev@gmail.com>
+ <kernel-team@fb.com>
+ <linux-mm@kvack.org>
+ " <linux-kernel@vger.kernel.org>\0"
  "\00:1\0"
  "b\0"
  "On Thu, May 18, 2017 at 11:00:39AM +0200, Michal Hocko wrote:\n"
@@ -38,6 +38,111 @@
  "If we consider this approach, I've prepared a separate patch for this problem\n"
  "(stripped all oom reaper list stuff).\n"
  "\n"
- Thanks!
+ "Thanks!\n"
+ "\n"
+ ">From 317fad44a0fe79fb76e8e4fd6bd81c52ae1712e9 Mon Sep 17 00:00:00 2001\n"
+ "From: Roman Gushchin <guro@fb.com>\n"
+ "Date: Tue, 16 May 2017 21:19:56 +0100\n"
+ "Subject: [PATCH] mm,oom: prevent OOM double kill from a pagefault handling\n"
+ " path\n"
+ "\n"
+ "During the debugging of some OOM-related stuff, I've noticed\n"
+ "that sometimes OOM kills two processes instead of one.\n"
+ "\n"
+ "The problem can be easily reproduced on a vanilla kernel:\n"
+ "\n"
+ "[ 25.721494] allocate invoked oom-killer: gfp_mask=0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0\n"
+ "[ 25.725658] allocate cpuset=/ mems_allowed=0\n"
+ "[ 25.727033] CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181\n"
+ "[ 25.729215] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014\n"
+ "[ 25.729598] Call Trace:\n"
+ "[ 25.729598] dump_stack+0x63/0x82\n"
+ "[ 25.729598] dump_header+0x97/0x21a\n"
+ "[ 25.729598] ? do_try_to_free_pages+0x2d7/0x360\n"
+ "[ 25.729598] ? security_capable_noaudit+0x45/0x60\n"
+ "[ 25.729598] oom_kill_process+0x219/0x3e0\n"
+ "[ 25.729598] out_of_memory+0x11d/0x480\n"
+ "[ 25.729598] __alloc_pages_slowpath+0xc84/0xd40\n"
+ "[ 25.729598] __alloc_pages_nodemask+0x245/0x260\n"
+ "[ 25.729598] alloc_pages_vma+0xa2/0x270\n"
+ "[ 25.729598] __handle_mm_fault+0xca9/0x10c0\n"
+ "[ 25.729598] handle_mm_fault+0xf3/0x210\n"
+ "[ 25.729598] __do_page_fault+0x240/0x4e0\n"
+ "[ 25.729598] trace_do_page_fault+0x37/0xe0\n"
+ "[ 25.729598] do_async_page_fault+0x19/0x70\n"
+ "[ 25.729598] async_page_fault+0x28/0x30\n"
+ "< cut >\n"
+ "[ 25.810868] oom_reaper: reaped process 492 (allocate), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB\n"
+ "< cut >\n"
+ "[ 25.817589] allocate invoked oom-killer: gfp_mask=0x0(), nodemask=(null), order=0, oom_score_adj=0\n"
+ "[ 25.818821] allocate cpuset=/ mems_allowed=0\n"
+ "[ 25.819259] CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181\n"
+ "[ 25.819847] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014\n"
+ "[ 25.820549] Call Trace:\n"
+ "[ 25.820733] dump_stack+0x63/0x82\n"
+ "[ 25.820961] dump_header+0x97/0x21a\n"
+ "[ 25.820961] ? security_capable_noaudit+0x45/0x60\n"
+ "[ 25.820961] oom_kill_process+0x219/0x3e0\n"
+ "[ 25.820961] out_of_memory+0x11d/0x480\n"
+ "[ 25.820961] pagefault_out_of_memory+0x68/0x80\n"
+ "[ 25.820961] mm_fault_error+0x8f/0x190\n"
+ "[ 25.820961] ? handle_mm_fault+0xf3/0x210\n"
+ "[ 25.820961] __do_page_fault+0x4b2/0x4e0\n"
+ "[ 25.820961] trace_do_page_fault+0x37/0xe0\n"
+ "[ 25.820961] do_async_page_fault+0x19/0x70\n"
+ "[ 25.820961] async_page_fault+0x28/0x30\n"
+ "< cut >\n"
+ "[ 25.863078] Out of memory: Kill process 233 (firewalld) score 10 or sacrifice child\n"
+ "[ 25.863634] Killed process 233 (firewalld) total-vm:246076kB, anon-rss:20956kB, file-rss:0kB, shmem-rss:0kB\n"
+ "\n"
+ "This actually happens if pagefault_out_of_memory() is called\n"
+ "after the calling process has already been selected as an OOM victim\n"
+ "and killed. There is a race with the oom reaper: if the process\n"
+ "is reaped before it enters out_of_memory(), the MMF_OOM_SKIP\n"
+ "flag is set, and out_of_memory() will not consider the process\n"
+ "as a eligible victim. That means that another victim will be selected\n"
+ "and killed.\n"
+ "\n"
+ "Tetsuo Handa has noticed, that this is a side effect of\n"
+ "commit 9a67f6488eca926f (\"mm: consolidate GFP_NOFAIL checks\n"
+ "in the allocator slowpath\").\n"
+ "\n"
+ "To avoid this, out_of_memory() shouldn't be called from\n"
+ "pagefault_out_of_memory(), if current task already\n"
+ "has been chosen as an oom victim.\n"
+ "\n"
+ "v2: dropped changes related to the oom_reaper synchronization,\n"
+ "    as it looks like a separate and minor issue;\n"
+ "    rebased on new mm;\n"
+ "    renamed, updated commit message.\n"
+ "\n"
+ "Signed-off-by: Roman Gushchin <guro@fb.com>\n"
+ "Cc: Michal Hocko <mhocko@suse.com>\n"
+ "Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>\n"
+ "Cc: Johannes Weiner <hannes@cmpxchg.org>\n"
+ "Cc: Vladimir Davydov <vdavydov.dev@gmail.com>\n"
+ "Cc: kernel-team@fb.com\n"
+ "Cc: linux-mm@kvack.org\n"
+ "Cc: linux-kernel@vger.kernel.org\n"
+ "---\n"
+ " mm/oom_kill.c | 3 +++\n"
+ " 1 file changed, 3 insertions(+)\n"
+ "\n"
+ "diff --git a/mm/oom_kill.c b/mm/oom_kill.c\n"
+ "index 04c9143..9c643a3 100644\n"
+ "--- a/mm/oom_kill.c\n"
+ "+++ b/mm/oom_kill.c\n"
+ "@@ -1068,6 +1068,9 @@ void pagefault_out_of_memory(void)\n"
+ " \tif (mem_cgroup_oom_synchronize(true))\n"
+ " \t\treturn;\n"
+ " \n"
+ "+\tif (tsk_is_oom_victim(current))\n"
+ "+\t\treturn;\n"
+ "+\n"
+ " \tif (!mutex_trylock(&oom_lock))\n"
+ " \t\treturn;\n"
+ " \tout_of_memory(&oc);\n"
+ "-- \n"
+ 2.7.4
 
-369b4f9788e1003f0ab13a9e782b456b7cd36c189fb9e12e7ff06583bebc7595
+71286ffb0939b64b3a19ec7c4fa5ec7108656a3fd206318125f60241a45dc0e4