From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, David Rientjes <rientjes@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@suse.cz>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 3.10 43/79] mm, oom: base root bonus on current usage
Date: Tue, 11 Feb 2014 11:05:47 -0800
Message-ID: <20140211184722.191350208@linuxfoundation.org>
In-Reply-To: <20140211184720.928667275@linuxfoundation.org>
3.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Rientjes <rientjes@google.com>
commit 778c14affaf94a9e4953179d3e13a544ccce7707 upstream.
A 3% of system memory bonus is sometimes too excessive in comparison to
other processes.

With commit a63d83f427fb ("oom: badness heuristic rewrite"), the OOM
killer tries to avoid killing privileged tasks by subtracting 3% of
overall memory (system or cgroup) from their per-task consumption. But
as a result, all root tasks that consume less than 3% of overall memory
are considered equal, and so it only takes 33+ privileged tasks pushing
the system out of memory for the OOM killer to do something stupid and
kill dhclient or other root-owned processes. For example, on a 32G
machine it can't tell the difference between the 1M agetty and the 10G
fork bomb member.

The changelog describes this 3% boost as the equivalent to the global
overcommit limit being 3% higher for privileged tasks, but this is not
the same as discounting 3% of overall memory from _every privileged task
individually_ during OOM selection.

Replace the 3% of system memory bonus with a 3% of current memory usage
bonus.

By giving root tasks a bonus that is proportional to their actual size,
they remain comparable even when relatively small. In the example
above, the OOM killer will discount the 1M agetty's 256 badness points
down to 179, and the 10G fork bomb's 262144 points down to 183500 points
and make the right choice, instead of discounting both to 0 and killing
agetty because it's first in the task list.
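For illustration, the proportional discount can be modeled in user space. This is a minimal sketch, not the kernel function: new_root_points() is a hypothetical helper that applies only the arithmetic from the mm/oom_kill.c hunk in this patch, and the inputs are the page counts quoted above (256 pages is 1M and 262144 pages is 1G, assuming 4 KiB pages).

```c
#include <stdio.h>

/*
 * Hypothetical user-space model of the post-patch root bonus.
 * It mirrors only the arithmetic "points -= (points * 3) / 100"
 * and ignores oom_score_adj and everything else oom_badness() does.
 */
static long new_root_points(long points)
{
	/* Discount 3% of the task's own usage, using integer division. */
	points -= (points * 3) / 100;
	return points;
}

/* Print the discounted score for one sample task (name is illustrative). */
static void show_new_points(const char *name, long points)
{
	printf("%s: %ld -> %ld\n", name, points, new_root_points(points));
}
```

With the 3% factor from the hunk, show_new_points("agetty", 256) reports 249 and show_new_points("fork bomb", 262144) reports 254280, so the two tasks stay clearly ordered. The 179 and 183500 figures quoted above are what a discount of roughly 30% of the same inputs would give.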
Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
Documentation/filesystems/proc.txt | 4 ++--
mm/oom_kill.c | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -1372,8 +1372,8 @@ may allocate from based on an estimation
 For example, if a task is using all allowed memory, its badness score will be
 1000. If it is using half of its allowed memory, its score will be 500.
 
-There is an additional factor included in the badness score: root
-processes are given 3% extra memory over other tasks.
+There is an additional factor included in the badness score: the current memory
+and swap usage is discounted by 3% for root processes.
 
 The amount of "allowed" memory depends on the context in which the oom killer
 was called. If it is due to the memory assigned to the allocating task's cpuset
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -170,7 +170,7 @@ unsigned long oom_badness(struct task_st
 	 * implementation used by LSMs.
 	 */
 	if (has_capability_noaudit(p, CAP_SYS_ADMIN))
-		adj -= 30;
+		points -= (points * 3) / 100;
 
 	/* Normalize to oom_score_adj units */
 	adj *= totalpages / 1000;
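
The effect of the removed "adj -= 30" line can be sketched in user space as well. old_root_points() is a hypothetical model, not kernel code. It assumes a task with no oom_score_adj offset of its own, a 32G machine with 4 KiB pages (8388608 pages), and a floor of 1 on the final score so that an eligible task never scores 0.

```c
#include <stdio.h>

/*
 * Hypothetical user-space model of the pre-patch root bonus.
 * The -30 is 3% expressed in oom_score_adj units (1/1000 of total
 * memory); it is scaled to pages and then added to the task's points.
 */
static long old_root_points(long points, long totalpages)
{
	long adj = -30;

	/* Normalize to pages, as the context line in the hunk does. */
	adj *= totalpages / 1000;
	points += adj;

	/* Assumed floor: a non-positive result collapses to 1. */
	return points > 0 ? points : 1;
}

/* Print the pre-patch score for one sample task (name is illustrative). */
static void show_old_points(const char *name, long points, long totalpages)
{
	printf("%s: %ld -> %ld\n", name, points,
	       old_root_points(points, totalpages));
}
```

Under these assumptions the fixed discount is 251640 pages, just under 1G. A 256-page root task and a 25600-page root task both end up with a score of 1, which is the loss of ordering among small privileged tasks that the changelog describes.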