public inbox for linux-mm@kvack.org
* [PATCH] Docs/mm: document the OOM killer
@ 2026-03-14 15:25 Kit Dallege
  2026-03-15 20:48 ` Lorenzo Stoakes (Oracle)
  0 siblings, 1 reply; 6+ messages in thread
From: Kit Dallege @ 2026-03-14 15:25 UTC
  To: akpm, david, corbet; +Cc: linux-mm, linux-doc, Kit Dallege

Fill in the oom.rst stub that was created in commit 481cc97349d6
("mm,doc: Add new documentation structure") as part of the structured
memory management documentation following Mel Gorman's book outline.

Cover the scoring heuristic, allocation constraints, OOM reaper,
process_mrelease syscall, and sysctl knobs.

Signed-off-by: Kit Dallege <xaum.io@gmail.com>
---
 Documentation/mm/oom.rst | 67 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)

diff --git a/Documentation/mm/oom.rst b/Documentation/mm/oom.rst
index 18e9e40c1ec1..2259f871a4a7 100644
--- a/Documentation/mm/oom.rst
+++ b/Documentation/mm/oom.rst
@@ -3,3 +3,70 @@
 ======================
 Out Of Memory Handling
 ======================
+
+When the kernel cannot satisfy a memory allocation after exhausting reclaim,
+compaction, and memory reserves, it invokes the OOM killer to terminate a
+process and free memory.  The implementation is in ``mm/oom_kill.c``.
+
+Victim Selection
+================
+
+The OOM killer scores every eligible process and kills the one with the
+highest score.  The score is the sum of the process's resident pages, swap
+entries, and page table pages.  This sum is then adjusted by the per-process
+``oom_score_adj`` tunable (range -1000 to 1000, default 0), which biases
+the score by ``oom_score_adj * totalpages / 1000``.  Setting
+``oom_score_adj`` to -1000 disables OOM killing for that process entirely.
+
+The ``totalpages`` baseline depends on the allocation constraint:
+
+- **Unconstrained**: all RAM plus swap.
+- **Cpuset**: memory on nodes in the current cpuset.
+- **Memory policy**: memory on nodes in the current mempolicy.
+- **Memory cgroup**: the cgroup's memory limit.
+
+Only processes that can use memory within the constraint are considered.
+Kernel threads and init are never eligible.
+
+OOM Reaper
+==========
+
+Sending SIGKILL does not immediately free memory — the victim must be
+scheduled, unwind its stack, and tear down its address space.  To speed
+this up, the OOM reaper kernel thread (available on MMU systems) proactively
+unmaps the victim's anonymous and private pages without waiting for the
+victim to exit.
+
+The reaper gives the victim a short window to exit naturally before
+intervening.  It walks the victim's VMAs in reverse and calls
+``unmap_page_range()`` to release physical pages.  Once reaping completes
+(or is no longer possible), the mm is marked ``MMF_OOM_SKIP`` so the OOM
+killer skips it in future invocations.
+
+Before reaping, the mm is marked ``MMF_UNSTABLE`` to signal page fault
+handlers that private mappings may have been zeroed and are no longer
+reliable.
+
+process_mrelease
+================
+
+The ``process_mrelease(pidfd, flags)`` system call lets userspace OOM
+managers (such as systemd-oomd or Android's lmkd) trigger the same reaping
+mechanism on a dying process without waiting for the kernel OOM killer.
+It operates on a process that is already exiting and performs the same
+address space teardown that the OOM reaper would.
+
+Sysctl Knobs
+============
+
+``vm.panic_on_oom``
+  0 (default): kill a process.  1: panic on unconstrained OOM only.
+  2: always panic.
+
+``vm.oom_kill_allocating_task``
+  When non-zero, kill the task that triggered the OOM rather than scanning
+  for the largest process.
+
+``vm.oom_dump_tasks``
+  When non-zero (default), dump a table of all eligible tasks and their
+  memory usage to the kernel log before killing.
-- 
2.53.0
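
The scoring rule described in the "Victim Selection" section of the patch
can be sketched as follows. This is only an illustration of the formula as
the patch text states it, not the kernel's code: the function names
oom_points() and pick_victim(), the dictionary layout, and the choice to
compute totalpages // 1000 before multiplying (to keep the arithmetic in
integers) are all assumptions made for this example.

```python
from typing import Dict, Optional, Tuple

OOM_SCORE_ADJ_MIN = -1000  # range stated in the patch text
OOM_SCORE_ADJ_MAX = 1000


def oom_points(rss_pages: int, swap_entries: int, pgtable_pages: int,
               oom_score_adj: int, totalpages: int) -> Optional[int]:
    """Return the adjusted score, or None if the task is exempt.

    Hypothetical helper: mirrors the description in the patch, where the
    base score is resident pages + swap entries + page table pages.
    """
    if not OOM_SCORE_ADJ_MIN <= oom_score_adj <= OOM_SCORE_ADJ_MAX:
        raise ValueError("oom_score_adj must be in [-1000, 1000]")
    if oom_score_adj == OOM_SCORE_ADJ_MIN:
        # -1000 disables OOM killing for the task entirely.
        return None
    base = rss_pages + swap_entries + pgtable_pages
    # Assumption: scale per-thousand first so the result stays an integer.
    return base + oom_score_adj * (totalpages // 1000)


def pick_victim(tasks: Dict[str, Tuple[int, int, int, int]],
                totalpages: int) -> Optional[str]:
    """Return the name of the highest-scoring eligible task, if any.

    tasks maps a name to (rss_pages, swap_entries, pgtable_pages, adj).
    """
    best_name, best_points = None, None
    for name, (rss, swap, pgtables, adj) in tasks.items():
        points = oom_points(rss, swap, pgtables, adj, totalpages)
        if points is None:
            continue  # exempt task, never selected
        if best_points is None or points > best_points:
            best_name, best_points = name, points
    return best_name
```

With totalpages = 1000000, a task using 50100 pages with oom_score_adj = -100
ends up at -49900, so a task using 21050 pages with the default adjustment of
0 is chosen ahead of it, even though its footprint is smaller.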




* Re: [PATCH] Docs/mm: document the OOM killer
  2026-03-14 15:25 [PATCH] Docs/mm: document the OOM killer Kit Dallege
@ 2026-03-15 20:48 ` Lorenzo Stoakes (Oracle)
  2026-03-16  7:32   ` Michal Hocko
  0 siblings, 1 reply; 6+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-15 20:48 UTC
  To: Kit Dallege; +Cc: akpm, david, corbet, linux-mm, linux-doc, Michal Hocko

NAK for being AI slop again, obviously.

Again, +cc the OOM maintainer you failed to bother to look up.

Reasons, as with the rest:
- Worthless documentation
- Everything about the patch screams 'zero effort, Claude did it all'
- Bad etiquette

As with all the rest it'd need to be totally rewritten and it's not worth the
maintainer time.

On Sat, Mar 14, 2026 at 04:25:18PM +0100, Kit Dallege wrote:
> Fill in the oom.rst stub that was created in commit 481cc97349d6
> ("mm,doc: Add new documentation structure") as part of the structured
> memory management documentation following Mel Gorman's book outline.

I mean the more I see it the more annoying it is.

>
> Cover the scoring heuristic, allocation constraints, OOM reaper,
> process_mrelease syscall, and sysctl knobs.

This sentence contains almost as much content as the patch.

>
> Signed-off-by: Kit Dallege <xaum.io@gmail.com>
> ---
>  Documentation/mm/oom.rst | 67 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 67 insertions(+)
>
> diff --git a/Documentation/mm/oom.rst b/Documentation/mm/oom.rst
> index 18e9e40c1ec1..2259f871a4a7 100644
> --- a/Documentation/mm/oom.rst
> +++ b/Documentation/mm/oom.rst
> @@ -3,3 +3,70 @@
>  ======================
>  Out Of Memory Handling
>  ======================
> +
> +When the kernel cannot satisfy a memory allocation after exhausting reclaim,
> +compaction, and memory reserves, it invokes the OOM killer to terminate a

I mean this is just actively wrong to start with.

> +process and free memory.  The implementation is in ``mm/oom_kill.c``.

Terminate a 'process', even what that means is tricky in kernel vs. userland...

> +
> +Victim Selection
> +================
> +
> +The OOM killer scores every eligible process and kills the one with the
> +highest score.  The score is the sum of the process's resident pages, swap
> +entries, and page table pages.  This sum is then adjusted by the per-process
> +``oom_score_adj`` tunable (range -1000 to 1000, default 0), which biases
> +the score by ``oom_score_adj * totalpages / 1000``.  Setting
> +``oom_score_adj`` to -1000 disables OOM killing for that process entirely.
> +
> +The ``totalpages`` baseline depends on the allocation constraint:
> +
> +- **Unconstrained**: all RAM plus swap.
> +- **Cpuset**: memory on nodes in the current cpuset.
> +- **Memory policy**: memory on nodes in the current mempolicy.
> +- **Memory cgroup**: the cgroup's memory limit.
> +
> +Only processes that can use memory within the constraint are considered.
> +Kernel threads and init are never eligible.
> +
> +OOM Reaper
> +==========
> +
> +Sending SIGKILL does not immediately free memory — the victim must be
> +scheduled, unwind its stack, and tear down its address space.  To speed
> +this up, the OOM reaper kernel thread (available on MMU systems) proactively
> +unmaps the victim's anonymous and private pages without waiting for the

Anonymous AND private eh?

> +victim to exit.

Actually there IS some waiting for a specific futex case :)) though maybe
removed now.

> +
> +The reaper gives the victim a short window to exit naturally before
> +intervening.  It walks the victim's VMAs in reverse and calls

Why in reverse? Moon walk?

I mean etc. etc. this is really not helpful.

> +``unmap_page_range()`` to release physical pages.  Once reaping completes
> +(or is no longer possible), the mm is marked ``MMF_OOM_SKIP`` so the OOM
> +killer skips it in future invocations.
> +
> +Before reaping, the mm is marked ``MMF_UNSTABLE`` to signal page fault
> +handlers that private mappings may have been zeroed and are no longer
> +reliable.
> +
> +process_mrelease
> +================
> +
> +The ``process_mrelease(pidfd, flags)`` system call lets userspace OOM
> +managers (such as systemd-oomd or Android's lmkd) trigger the same reaping
> +mechanism on a dying process without waiting for the kernel OOM killer.
> +It operates on a process that is already exiting and performs the same
> +address space teardown that the OOM reaper would.
> +
> +Sysctl Knobs
> +============
> +
> +``vm.panic_on_oom``
> +  0 (default): kill a process.  1: panic on unconstrained OOM only.
> +  2: always panic.
> +
> +``vm.oom_kill_allocating_task``
> +  When non-zero, kill the task that triggered the OOM rather than scanning
> +  for the largest process.
> +
> +``vm.oom_dump_tasks``
> +  When non-zero (default), dump a table of all eligible tasks and their
> +  memory usage to the kernel log before killing.
> --
> 2.53.0
>
>
>



* Re: [PATCH] Docs/mm: document the OOM killer
  2026-03-15 20:48 ` Lorenzo Stoakes (Oracle)
@ 2026-03-16  7:32   ` Michal Hocko
  2026-03-16 14:16     ` Lorenzo Stoakes (Oracle)
  0 siblings, 1 reply; 6+ messages in thread
From: Michal Hocko @ 2026-03-16  7:32 UTC
  To: Lorenzo Stoakes (Oracle)
  Cc: Kit Dallege, akpm, david, corbet, linux-mm, linux-doc

On Sun 15-03-26 20:48:22, Lorenzo Stoakes (Oracle) wrote:
> NAK for being AI slop again, obviously.
> 
> Again, +cc the OOM maintainer you failed to bother to look up.

Thanks!

> Reasons, as with the rest:
> - Worthless documentation
> - Everything about the patch screams 'zero effort, Claude did it all'
> - Bad etiquette
> 
> As with all the rest it'd need to be totally rewritten and it's not worth the
> maintainer time.
> 
> On Sat, Mar 14, 2026 at 04:25:18PM +0100, Kit Dallege wrote:
> > Fill in the oom.rst stub that was created in commit 481cc97349d6
> > ("mm,doc: Add new documentation structure") as part of the structured
> > memory management documentation following Mel Gorman's book outline.
> 
> I mean the more I see it the more annoying it is.
> 
> >
> > Cover the scoring heuristic, allocation constraints, OOM reaper,
> > process_mrelease syscall, and sysctl knobs.
> 
> This sentence contains almost as much content as the patch.

The real question is: who is the expected audience of this documentation?
Administrators, kernel developers?
Reading through this proposal, it doesn't really seem to fit either
well. For kernel developers who try to wrap their heads around the code
it barely scratches the surface. For admins it doesn't really explain
more than the existing documentation for tunables.

So if there is serious interest in making this useful, kernel-developer
oriented documentation, I am more than willing to help. The code is not
really easy to follow as it is scattered. There are many subtle
expectations spread out and it is quite easy to break a delicate balance
tuned over the years. So there is a big documentation gap I never got
around to filling.

-- 
Michal Hocko
SUSE Labs



* Re: [PATCH] Docs/mm: document the OOM killer
  2026-03-16  7:32   ` Michal Hocko
@ 2026-03-16 14:16     ` Lorenzo Stoakes (Oracle)
  2026-03-16 14:53       ` Michal Hocko
  2026-03-16 15:06       ` Jonathan Corbet
  0 siblings, 2 replies; 6+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-16 14:16 UTC
  To: Michal Hocko; +Cc: Kit Dallege, akpm, david, corbet, linux-mm, linux-doc

On Mon, Mar 16, 2026 at 08:32:31AM +0100, Michal Hocko wrote:
> On Sun 15-03-26 20:48:22, Lorenzo Stoakes (Oracle) wrote:
> > NAK for being AI slop again, obviously.
> >
> > Again, +cc the OOM maintainer you failed to bother to look up.
>
> Thanks!

No problem!

>
> > Reasons, as with the rest:
> > - Worthless documentation
> > - Everything about the patch screams 'zero effort, Claude did it all'
> > - Bad etiquette
> >
> > As with all the rest it'd need to be totally rewritten and it's not worth the
> > maintainer time.
> >
> > On Sat, Mar 14, 2026 at 04:25:18PM +0100, Kit Dallege wrote:
> > > Fill in the oom.rst stub that was created in commit 481cc97349d6
> > > ("mm,doc: Add new documentation structure") as part of the structured
> > > memory management documentation following Mel Gorman's book outline.
> >
> > I mean the more I see it the more annoying it is.
> >
> > >
> > > Cover the scoring heuristic, allocation constraints, OOM reaper,
> > > process_mrelease syscall, and sysctl knobs.
> >
> > This sentence contains almost as much content as the patch.
>
> The real question is: who is the expected audience of this documentation?
> Administrators, kernel developers?
> Reading through this proposal, it doesn't really seem to fit either
> well. For kernel developers who try to wrap their heads around the code
> it barely scratches the surface. For admins it doesn't really explain
> more than the existing documentation for tunables.
>
> So if there is serious interest in making this useful, kernel-developer
> oriented documentation, I am more than willing to help. The code is not
> really easy to follow as it is scattered. There are many subtle
> expectations spread out and it is quite easy to break a delicate balance
> tuned over the years. So there is a big documentation gap I never got
> around to filling.

I mean, we definitely could do with better documentation :) Obviously I
somewhat document it from a 'learning the code in depth' perspective in my
book, but that's tied to v6.0, effectively paywalled (sorry!) and not the
same as the kind of documentation we'd ideally like the kernel to expose,
which would be less specific I think but also up-to-date with newer kernels.

The point WRT this patch however is that really, it needs to come from
somebody who has some experience/understanding, and generating it via an
LLM is just not useful - any kernel developer with understanding could do
so.

Otherwise we end up with:

1. generated LLM documentation sent without understanding
2. maintainers have to essentially rewrite the whole documentation ourselves

And thus we essentially have the work dictated to us, but credited
elsewhere (not that credit matters all that much in the end, but it's the
principle of the thing).

So while we want documentation, we don't want _any_ documentation :P

Speaking about docs more broadly - as usual we're all so busy it's a bit of
a catch-22, though once we have something in place, iterating it won't be
so hard.

I wonder if some of us (I realise this sounds like self volunteering)
should just write up some bare bones and patch it in, then we can get the
iterating part of things moving?

>
> --
> Michal Hocko
> SUSE Labs

Cheers, Lorenzo



* Re: [PATCH] Docs/mm: document the OOM killer
  2026-03-16 14:16     ` Lorenzo Stoakes (Oracle)
@ 2026-03-16 14:53       ` Michal Hocko
  2026-03-16 15:06       ` Jonathan Corbet
  1 sibling, 0 replies; 6+ messages in thread
From: Michal Hocko @ 2026-03-16 14:53 UTC
  To: Lorenzo Stoakes (Oracle)
  Cc: Kit Dallege, akpm, david, corbet, linux-mm, linux-doc

On Mon 16-03-26 14:16:19, Lorenzo Stoakes (Oracle) wrote:
> On Mon, Mar 16, 2026 at 08:32:31AM +0100, Michal Hocko wrote:
> > On Sun 15-03-26 20:48:22, Lorenzo Stoakes (Oracle) wrote:
> > > NAK for being AI slop again, obviously.
> > >
> > > Again, +cc the OOM maintainer you failed to bother to look up.
> >
> > Thanks!
> 
> No problem!
> 
> >
> > > Reasons, as with the rest:
> > > - Worthless documentation
> > > - Everything about the patch screams 'zero effort, Claude did it all'
> > > - Bad etiquette
> > >
> > > As with all the rest it'd need to be totally rewritten and it's not worth the
> > > maintainer time.
> > >
> > > On Sat, Mar 14, 2026 at 04:25:18PM +0100, Kit Dallege wrote:
> > > > Fill in the oom.rst stub that was created in commit 481cc97349d6
> > > > ("mm,doc: Add new documentation structure") as part of the structured
> > > > memory management documentation following Mel Gorman's book outline.
> > >
> > > I mean the more I see it the more annoying it is.
> > >
> > > >
> > > > Cover the scoring heuristic, allocation constraints, OOM reaper,
> > > > process_mrelease syscall, and sysctl knobs.
> > >
> > > This sentence contains almost as much content as the patch.
> >
> > The real question is: who is the expected audience of this documentation?
> > Administrators, kernel developers?
> > Reading through this proposal, it doesn't really seem to fit either
> > well. For kernel developers who try to wrap their heads around the code
> > it barely scratches the surface. For admins it doesn't really explain
> > more than the existing documentation for tunables.
> >
> > So if there is serious interest in making this useful, kernel-developer
> > oriented documentation, I am more than willing to help. The code is not
> > really easy to follow as it is scattered. There are many subtle
> > expectations spread out and it is quite easy to break a delicate balance
> > tuned over the years. So there is a big documentation gap I never got
> > around to filling.
> 
> I mean, we definitely could do with better documentation :) Obviously I
> somewhat document it from a 'learning the code in depth' perspective in my
> book, but that's tied to v6.0, effectively paywalled (sorry!) and not the
> same as the kind of documentation we'd ideally like the kernel to expose,
> which would be less specific I think but also up-to-date with newer kernels.
> 
> The point WRT this patch however is that really, it needs to come from
> somebody who has some experience/understanding, and generating it via an
> LLM is just not useful - any kernel developer with understanding could do
> so.

I think we are struggling with capacity here. I am willing to help shape
an existing text but will struggle to find time to cook up that
text myself. I don't mind involving LLMs as long as the content is
properly reviewed and factually correct.

Wrt OOM, in my experience most people/developers struggle to understand
these areas:
- what is the purpose of the oom killer and its limitations
- different contexts oom handles
- when is the oom killer triggered
- oom killer in progress handling and locking
- forward progress guarantee (oom_reaper)
- coordination with task exit path
- memory reserves for oom victims

I bet there is some more but these are the most prominent ones.
-- 
Michal Hocko
SUSE Labs



* Re: [PATCH] Docs/mm: document the OOM killer
  2026-03-16 14:16     ` Lorenzo Stoakes (Oracle)
  2026-03-16 14:53       ` Michal Hocko
@ 2026-03-16 15:06       ` Jonathan Corbet
  1 sibling, 0 replies; 6+ messages in thread
From: Jonathan Corbet @ 2026-03-16 15:06 UTC
  To: Lorenzo Stoakes (Oracle), Michal Hocko
  Cc: Kit Dallege, akpm, david, linux-mm, linux-doc

"Lorenzo Stoakes (Oracle)" <ljs@kernel.org> writes:

> I wonder if some of us (I realise this sounds like self volunteering)
> should just write up some bare bones and patch it in, then we can get the
> iterating part of things moving?

That, of course, was the theory behind the addition of the skeleton
documentation that's there now :)

I have also thought about trying to fill it in once a bit of spare time
opens up.  Funny how that tends not to happen, but I still would like to
do that at some point.

Michal's question, though, is something that needs a good answer: who is
the audience for Documentation/mm/ ?  Some of the stuff in the patch
under discussion, if it were to reach an acceptable point, is probably
better placed in the admin guide.  OTOH, a manual firmly aimed at people
trying to understand the MM code itself makes sense to me.

Thanks,

jon


