linux-mm.kvack.org archive mirror
* [patch 00/11] userspace out of memory handling
@ 2014-03-05  3:58 David Rientjes
  2014-03-05  3:58 ` [patch 01/11] fork: collapse copy_flags into copy_process David Rientjes
                   ` (12 more replies)
  0 siblings, 13 replies; 33+ messages in thread
From: David Rientjes @ 2014-03-05  3:58 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, KAMEZAWA Hiroyuki,
	Christoph Lameter, Pekka Enberg, Tejun Heo, Mel Gorman,
	Oleg Nesterov, Rik van Riel, Jianguo Wu, Tim Hockin, linux-kernel,
	linux-mm, cgroups, linux-doc

This patchset implements userspace out of memory handling.

It is based on v3.14-rc5.  Individual patches will apply cleanly or you
may pull the entire series from

	git://git.kernel.org/pub/scm/linux/kernel/git/rientjes/linux.git mm/oom

When the system or a memcg is oom, processes running on that system or
attached to that memcg cannot allocate memory.  It is impossible for a
process to reliably handle the oom condition from userspace.

First, consider only system oom conditions.  When memory is completely
depleted and nothing may be reclaimed, the kernel is forced to free some
memory; the only way it can do so is to kill a userspace process.  This
will happen instantaneously, and userspace can neither enforce its own
policy nor collect information.

On system oom, there may be a hierarchy of memcgs that represent user
jobs, for example.  Each job may have a priority independent of its
current memory usage.  There is no existing kernel interface to kill the
lowest priority job; with this patchset, userspace can kill the lowest
priority job or allow priorities to change based on whether the job is
using more memory than its pre-defined reservation.

Additionally, users may want to log the condition or debug applications
that are using too much memory.  They may wish to collect heap profiles,
or they may be able to free memory without killing a process by
throttling or ratelimiting.

Interactive users using X window environments may wish to have a dialogue
box appear to determine how to proceed -- it may even allow them shell
access to examine the state of the system while oom.

It's not sufficient to simply restrict all user processes to a subset of
memory and oom handling processes to the remainder via a memcg hierarchy:
kernel memory and other page allocations can easily deplete all memory
that is not charged to a user hierarchy of memory.

This patchset allows userspace to do all of these things by defining a
small memory reserve that is accessible only by processes that are
handling the notification.

Second, consider memcg oom conditions.  Processes need no special
knowledge of whether they are attached to the root memcg, where memcg
charging will always succeed, or a child memcg where charging will fail
when the limit has been reached.  This patchset allows processes handling
memcg oom conditions to overcharge the memcg by the amount of reserved
memory.  They need not create child memcgs with smaller limits and
attach the userspace oom handler only to the parent; such support would
not allow userspace to handle system oom conditions anyway.
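
The overcharge rule described above can be illustrated with a small
userspace model.  This is only a sketch of the accounting idea, not the
kernel's res_counter code: the struct memcg_model type and try_charge()
function are hypothetical names made up for this example, and it assumes
that only the process handling the oom notification may charge past the
limit, by at most the reserve.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one memcg's accounting; not a kernel structure. */
struct memcg_model {
	uint64_t usage;		/* bytes currently charged */
	uint64_t limit;		/* hard limit for ordinary processes */
	uint64_t oom_reserve;	/* extra bytes available to oom handlers */
};

/*
 * Try to charge 'bytes' to the memcg.  Ordinary processes fail once the
 * limit would be exceeded; an oom handler may exceed it by up to
 * oom_reserve (assumed semantics, taken from the cover letter's wording).
 */
static bool try_charge(struct memcg_model *cg, uint64_t bytes,
		       bool is_oom_handler)
{
	uint64_t max = cg->limit;

	if (is_oom_handler) {
		/* Saturate instead of wrapping if the limit is "unlimited". */
		if (max > UINT64_MAX - cg->oom_reserve)
			max = UINT64_MAX;
		else
			max += cg->oom_reserve;
	}

	/* Written to avoid overflow in usage + bytes. */
	if (bytes > max || cg->usage > max - bytes)
		return false;

	cg->usage += bytes;
	return true;
}
```

With a limit of 100 bytes and a reserve of 10, an ordinary process is
refused once usage reaches 100, while the handler can still charge up
to 10 more bytes.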

This patchset introduces a standard interface through memcg that allows
both of these conditions to be handled in the same clean way: users
write to memory.oom_reserve_in_bytes to define the reserve, and the
memcg of the process handling the oom condition may be overcharged by
this amount.  If used with the root memcg, this amount may be
allocated below the per-zone watermarks by root processes that
are handling such conditions (only root may write to
cgroup.event_control for the root memcg).
---
 Documentation/cgroups/memory.txt           |  46 ++++++++-
 Documentation/cgroups/resource_counter.txt |  12 +--
 Documentation/sysctl/vm.txt                |   5 +
 arch/m32r/mm/discontig.c                   |   1 +
 include/linux/memcontrol.h                 |  24 +++++
 include/linux/mempolicy.h                  |   3 +-
 include/linux/mmzone.h                     |   2 +
 include/linux/res_counter.h                |  16 ++--
 include/linux/sched.h                      |   2 +-
 kernel/fork.c                              |  13 +--
 kernel/res_counter.c                       |  42 ++++++---
 mm/memcontrol.c                            | 144 ++++++++++++++++++++++++++++-
 mm/mempolicy.c                             |  46 ++-------
 mm/oom_kill.c                              |   7 ++
 mm/page_alloc.c                            |  17 +++-
 mm/slab.c                                  |   8 +-
 mm/slub.c                                  |   2 +-
 17 files changed, 292 insertions(+), 98 deletions(-)



Thread overview: 33+ messages
2014-03-05  3:58 [patch 00/11] userspace out of memory handling David Rientjes
2014-03-05  3:58 ` [patch 01/11] fork: collapse copy_flags into copy_process David Rientjes
2014-03-05  3:58 ` [patch 02/11] mm, mempolicy: rename slab_node for clarity David Rientjes
2014-03-05  3:59 ` [patch 03/11] mm, mempolicy: remove per-process flag David Rientjes
2014-03-07 17:20   ` Andi Kleen
2014-03-07 20:48     ` Andrew Morton
2014-03-05  3:59 ` [patch 04/11] mm, memcg: add tunable for oom reserves David Rientjes
2014-03-05 21:17   ` Andrew Morton
2014-03-06  2:53     ` David Rientjes
2014-03-06 21:04   ` Tejun Heo
2014-03-05  3:59 ` [patch 05/11] res_counter: remove interface for locked charging and uncharging David Rientjes
2014-03-05  3:59 ` [patch 06/11] res_counter: add interface for maximum nofail charge David Rientjes
2014-03-05  3:59 ` [patch 07/11] mm, memcg: allow processes handling oom notifications to access reserves David Rientjes
2014-03-06 21:12   ` Tejun Heo
2014-03-05  3:59 ` [patch 08/11] mm, memcg: add memcg oom reserve documentation David Rientjes
2014-03-05  3:59 ` [patch 09/11] mm, page_alloc: allow system oom handlers to use memory reserves David Rientjes
2014-03-06 21:13   ` Tejun Heo
2014-03-05  3:59 ` [patch 10/11] mm, memcg: add memory.oom_control notification for system oom David Rientjes
2014-03-06 21:15   ` Tejun Heo
2014-03-05  3:59 ` [patch 11/11] mm, memcg: allow system oom killer to be disabled David Rientjes
2014-03-06 21:15   ` Tejun Heo
2014-03-05 21:17 ` [patch 00/11] userspace out of memory handling Andrew Morton
2014-03-06  2:52   ` David Rientjes
2014-03-11 12:03     ` Jianguo Wu
2014-03-06 20:49 ` Tejun Heo
2014-03-06 20:55   ` David Rientjes
2014-03-06 20:59     ` Tejun Heo
2014-03-06 21:08       ` David Rientjes
2014-03-06 21:11         ` Tejun Heo
2014-03-06 21:23           ` David Rientjes
2014-03-06 21:29             ` Tejun Heo
2014-03-06 21:33             ` Tejun Heo
2014-03-07 12:23               ` Michal Hocko
