From: Greg KH <gregkh@suse.de>
To: linux-kernel@vger.kernel.org, stable@kernel.org
Cc: stable-review@kernel.org, torvalds@linux-foundation.org,
akpm@linux-foundation.org, alan@lxorguk.ukuu.org.uk,
David Rientjes <rientjes@google.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Oleg Nesterov <oleg@redhat.com>, Hugh Dickins <hughd@google.com>,
Andrey Vagin <avagin@openvz.org>
Subject: [13/55] oom: prevent unnecessary oom kills or kernel panics
Date: Fri, 25 Mar 2011 16:47:06 -0700
Message-ID: <20110325234840.292017361@clark.kroah.org>
In-Reply-To: <20110325234853.GA21145@kroah.com>
2.6.37-stable review patch. If anyone has any objections, please let us know.
------------------
From: David Rientjes <rientjes@google.com>
commit 3a5dda7a17cf3706f79b86293f29db02d61e0d48 upstream.
This patch prevents unnecessary oom kills or kernel panics by reverting
two commits:
495789a5 (oom: make oom_score to per-process value)
cef1d352 (oom: multi threaded process coredump don't make deadlock)
First, 495789a5 (oom: make oom_score to per-process value) ignores the
fact that all threads in a thread group do not necessarily exit at the
same time.
It is imperative that select_bad_process() detect threads that are in the
exit path, specifically those with PF_EXITING set, to prevent needlessly
killing additional tasks. If a process is oom killed and the thread group
leader exits, select_bad_process() cannot detect the other threads that
are PF_EXITING by iterating over only processes. Thus, it currently
chooses another task unnecessarily for oom kill or panics the machine when
nothing else is eligible.
By iterating over threads instead, it is possible to detect threads that
are exiting and nominate them for oom kill so they get access to memory
reserves.
Second, cef1d352 (oom: multi threaded process coredump don't make
deadlock) erroneously avoids making the oom killer a no-op when an
eligible thread other than current is found to be exiting. We want to
detect this situation so that we may allow that exiting thread time to
exit and free its memory; if it is able to exit on its own, that should
free memory so current is no longer oom. If it is not able to exit on its
own, the oom killer will nominate it for oom kill which, in this case,
only means it will get access to memory reserves.
Without this change, it is easy for the oom killer to unnecessarily target
tasks when all threads of a victim don't exit before the thread group
leader or, in the worst case, panic the machine.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
mm/oom_kill.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -292,11 +292,11 @@ static struct task_struct *select_bad_pr
unsigned long totalpages, struct mem_cgroup *mem,
const nodemask_t *nodemask)
{
- struct task_struct *p;
+ struct task_struct *g, *p;
struct task_struct *chosen = NULL;
*ppoints = 0;
- for_each_process(p) {
+ do_each_thread(g, p) {
unsigned int points;
if (oom_unkillable_task(p, mem, nodemask))
@@ -324,7 +324,7 @@ static struct task_struct *select_bad_pr
* the process of exiting and releasing its resources.
* Otherwise we could get an easy OOM deadlock.
*/
- if (thread_group_empty(p) && (p->flags & PF_EXITING) && p->mm) {
+ if ((p->flags & PF_EXITING) && p->mm) {
if (p != current)
return ERR_PTR(-1UL);
@@ -337,7 +337,7 @@ static struct task_struct *select_bad_pr
chosen = p;
*ppoints = points;
}
- }
+ } while_each_thread(g, p);
return chosen;
}
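Illustration only, not part of the patch: the following user-space sketch
models why walking only thread group leaders can miss a thread that is
still exiting. The names struct model_task, old_scan_sees_exiting(),
new_scan_sees_exiting() and example_tasks are hypothetical and exist only
for this example; has_mm stands in for "p->mm != NULL" and group_size == 1
stands in for thread_group_empty(p). Scoring, locking and the
"p != current" handling are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag bit used by this model; treat the value as illustrative. */
#define PF_EXITING 0x4u

/* Hypothetical, heavily simplified stand-in for struct task_struct. */
struct model_task {
    unsigned int flags;     /* PF_EXITING set while the task is exiting */
    bool has_mm;            /* models p->mm != NULL */
    bool is_group_leader;   /* true for the thread group leader */
    size_t group_size;      /* number of threads in the thread group */
};

/*
 * Models the pre-patch check: only leaders are visited (as with
 * for_each_process()), and the group must be single-threaded
 * (as with thread_group_empty()).
 */
bool old_scan_sees_exiting(const struct model_task *t, size_t n)
{
    assert(t != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!t[i].is_group_leader)
            continue;   /* non-leader threads are never examined */
        if (t[i].group_size == 1 &&
            (t[i].flags & PF_EXITING) && t[i].has_mm)
            return true;
    }
    return false;
}

/*
 * Models the post-patch check: every thread is visited (as with
 * do_each_thread()/while_each_thread()), and only PF_EXITING and
 * the mm are tested.
 */
bool new_scan_sees_exiting(const struct model_task *t, size_t n)
{
    assert(t != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if ((t[i].flags & PF_EXITING) && t[i].has_mm)
            return true;
    }
    return false;
}

/* A victim whose leader has already released its mm, plus a bystander. */
const struct model_task example_tasks[] = {
    { PF_EXITING, false, true,  2 },  /* leader: exiting, mm already gone */
    { PF_EXITING, true,  false, 2 },  /* sibling: still exiting, holds mm */
    { 0,          true,  true,  1 },  /* unrelated healthy process */
};
const size_t example_task_count =
    sizeof(example_tasks) / sizeof(example_tasks[0]);
```

With example_tasks, the old-style scan reports no exiting task (the leader
fails the single-threaded test and the sibling is never visited), while the
new-style scan finds the sibling. In the real function that is the case
where select_bad_process() returns ERR_PTR(-1UL) for p != current instead of
choosing another victim.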