public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: William Lee Irwin III <wli@holomorphy.com>
To: Andrew Morton <akpm@osdl.org>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [oom]: [0/4] fix OOM deadlock running OAST
Date: Wed, 23 Jun 2004 17:03:08 -0700	[thread overview]
Message-ID: <20040624000308.GK1552@holomorphy.com> (raw)
In-Reply-To: <20040623163819.013c8585.akpm@osdl.org>

William Lee Irwin III <wli@holomorphy.com> wrote:
>> It's unclear which zones must be checked for this to be of use. In
>> general, suppose one determines all possible zones from which a given
>> allocation may be satisfied (this still assumes out_of_memory() is
>> passed a gfp_mask, and then discovers ->all_unreclaimable. It is still
>> not possible to determine whether nr_swap_pages is relevant.

On Wed, Jun 23, 2004 at 04:38:19PM -0700, Andrew Morton wrote:
> I don't think it _is_ relevant.  Wev'e scanned the crap out of all the
> eligible zones, found nothing swappable outable or otherwise reclaimable. 
> That's as good a definition of oom as you're likely to get.  It takes care
> of mlocked user memory too.

The actual net effect of all this is blowing away if (nr_swap_pages > 0)
for __GFP_WIRED allocations. Removing those 2 lines (and the one line of
whitespace next to it) will pass the test I observed failure in. It's a
judgment call as to whether it's beneficial in general, as it does
insulate userspace somewhat from needing to wait for slow IO being the
ostensible cause of the allocation failure. RedHat vendor kernels have
removed the check entirely, which may be considered to be evidence of
the approach's viability.


William Lee Irwin III <wli@holomorphy.com> wrote:
>> There are larger semantic questions also. For instance, the runs that
>> deadlocked were with non-overcommit enabled (that is, I left the sysctl
>> /proc/sys/vm/overcommit_memory == 0 after boot). The intention was to
>> provide best effort unfailability to userspace allocations by detecting
>> insufficient resources up-front and returning MAP_FAILED from mmap() and
>> so on. There is a current hole in this, which is that pinned kernel
>> memory allocations aren't subtracted from the pool of memory considered
>> to be available to userspace.

On Wed, Jun 23, 2004 at 04:38:19PM -0700, Andrew Morton wrote:
> The overcommit logic doesn't need to care about pinned kernel allocations. 
> All it cares about is reclaimable and free pages.  Its calculation of what
> is reclaimable is a bit wonky because of ramfs/sysfs/ramdisk pages and
> metadata, and because of mlock.

Hmm. That sort of thing is what I had in mind by saying "pinned kernel
allocations". What did it sound like I had in mind?


On Wed, Jun 23, 2004 at 04:38:19PM -0700, Andrew Morton wrote:
> But for this problem, I do suspect that going oom if all the eligible zones
> are pegged out should suffice.  It is certainly accurate.
> It could be that the ->all_unreclaimable logic has rotted over the ages and
> may ned some attention - I haven't explicitly checked it in a year or more.
> Also there's some problem at present wherein the oomkiller takes 30 or
> more seconds to kick in and do its thing, which I need to look at.  This
> happens when there's no swap online.  It might also happen when swapspace
> is full - I didn't check.

The timeout semantics are crusty. 10 failures 1s or more apart must be
seen, but with no delay of more than 5s between failures. If the average
delay between attempts more than 1s from the last successful attempt
that incremented the counter was 3s, that would make for a 30s delay
before OOM killing. In the presence of e.g. delays waiting for IO and
so on preventing retries from happening rapidly, 30s is plausible. Also,
it's rather likely that the tasks targeted by the OOM killer are in
state TASK_UNINTERRUPTIBLE and so will never respond to the OOM kill
until whatever they were waiting for happens, which may require (you
guessed it) memory to occur.

I would prefer to migrate away from timeout-based semantics and eliminate
indefinite retry logic and out-of-context reclamation (which is the source
of reclamation failures having no context to which to return errors).
However, this also requires some method of recovering from or insulating
administrators from indefinite retry loops originating from userspace, and
better contexts to which to return failure than syslog daemons.


-- wli

  reply	other threads:[~2004-06-24  0:03 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-06-23 21:07 [oom]: [0/4] fix OOM deadlock running OAST William Lee Irwin III
2004-06-23 21:07 ` [oom]: [1/4] add __GFP_WIRED to pinned allocations William Lee Irwin III
2004-06-23 21:07   ` [oom]: [2/4] add nr_wired to page_state William Lee Irwin III
2004-06-23 21:07     ` [oom]: [3/4] track wired pages on a per-zone basis William Lee Irwin III
2004-06-23 21:07       ` [oom]: [4/4] check __GFP_WIRED in out_of_memory() William Lee Irwin III
2004-06-23 21:29       ` [oom]: [3/4] track wired pages on a per-zone basis William Lee Irwin III
2004-06-23 22:15       ` Andrew Morton
2004-06-23 22:28         ` William Lee Irwin III
2004-06-23 22:05   ` [oom]: [1/4] add __GFP_WIRED to pinned allocations Andrew Morton
2004-06-23 22:22     ` William Lee Irwin III
2004-06-23 22:36       ` Andrew Morton
2004-06-23 22:16 ` [oom]: [0/4] fix OOM deadlock running OAST Andrew Morton
2004-06-23 22:31   ` William Lee Irwin III
2004-06-23 22:37     ` Andrew Morton
2004-06-23 23:07       ` William Lee Irwin III
2004-06-23 23:38         ` Andrew Morton
2004-06-24  0:03           ` William Lee Irwin III [this message]
2004-06-24  0:18             ` Andrew Morton
2004-06-24  0:26               ` William Lee Irwin III
2004-06-24  0:32                 ` William Lee Irwin III
2004-06-24  1:07                   ` Andrew Morton
2004-06-24  1:24                     ` William Lee Irwin III
2004-06-24  1:52                       ` William Lee Irwin III
2004-06-24  2:01                         ` Andrew Morton
2004-06-24  2:15                           ` William Lee Irwin III
2004-06-24 14:16                 ` Marcelo Tosatti
2004-06-24 15:18                   ` William Lee Irwin III
2004-06-24 15:19                     ` William Lee Irwin III
2004-06-24 15:23                   ` William Lee Irwin III
2004-06-24 16:55                     ` Marcelo Tosatti
2004-06-25 15:18         ` Rik van Riel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20040624000308.GK1552@holomorphy.com \
    --to=wli@holomorphy.com \
    --cc=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox