From: Andrea Arcangeli <andrea@cpushare.com>
To: David Rientjes <rientjes@google.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 04 of 11] avoid selecting already killed tasks
Date: Thu, 3 Jan 2008 14:41:37 +0100
Message-ID: <20080103134137.GT30939@v2.random>
In-Reply-To: <alpine.DEB.0.9999.0801030134130.25018@chino.kir.corp.google.com>

On Thu, Jan 03, 2008 at 01:40:09AM -0800, David Rientjes wrote:
> On Thu, 3 Jan 2008, Andrea Arcangeli wrote:
> 
> > avoid selecting already killed tasks
> > 
> > If the killed task doesn't go away because it's waiting on some other
> > task that needs to allocate memory in order to release the i_sem or
> > some other lock, we must fall back to killing some other task so that
> > the originally selected and already oom-killed task can exit. But the
> > logic that kills the children first would deadlock if the already
> > oom-killed task was actually the first child of the newly oom-killed task.
> > 
> 
> The problem is that this can cause the parent or one of its children to be 
> unnecessarily killed.

Well, the mere fact that I'm skipping over the TIF_MEMDIE tasks to
prevent deadlocks allows for spurious oom killing again. Like you
said, we can later add a per-task timeout so we wait only X seconds for
a given TIF_MEMDIE task to quit before selecting another one.
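
To make the per-task timeout idea concrete, here is a minimal userspace sketch, not kernel code. The struct, its fields, the function name and the grace period are all hypothetical and only model "wait X seconds for a TIF_MEMDIE task before selecting another one":

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; NOT the kernel's task_struct. */
struct sim_victim {
	bool memdie;                /* models the TIF_MEMDIE flag */
	unsigned long memdie_since; /* assumed: time in seconds when the flag was set */
};

/*
 * Return true while at least one already-killed task is still inside its
 * grace period, i.e. the oom killer should keep waiting instead of
 * selecting another victim. Assumes now >= memdie_since for flagged tasks.
 */
static bool sim_should_wait(const struct sim_victim *v, size_t n,
			    unsigned long now, unsigned long grace)
{
	assert(v != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (v[i].memdie && now - v[i].memdie_since < grace)
			return true;
	}
	return false;
}
```

Once sim_should_wait() returns false, the caller would go on to pick a new victim while skipping the flagged tasks, which is exactly where the spurious kill can still happen.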

But unfortunately we have to ignore those TIF_MEMDIE tasks, or we
deadlock, whether we're in select_bad_process or in
oom_kill_process. Initially I didn't notice that oom_kill_process had
that problem, so I was still deadlocking even though select_bad_process
was selecting the parent that didn't have TIF_MEMDIE set (but the first
child already had it).
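
A rough userspace model of why the skip is needed in both places. This is only a sketch under simplifying assumptions: the struct, the sim_* names and the "largest rss wins" heuristic are made up for illustration and are not the real select_bad_process/oom_kill_process code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; NOT the kernel's task_struct. */
struct sim_task {
	int pid;
	int parent;        /* index of the parent in the array, -1 if none */
	unsigned long rss; /* assumed badness proxy: bigger is worse */
	bool memdie;       /* models the TIF_MEMDIE flag */
};

/* Models the selection step: pick the worst task that is not already killed. */
static int sim_select_bad_process(const struct sim_task *t, size_t n)
{
	int best = -1;

	assert(t != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (t[i].memdie)
			continue; /* already oom-killed, don't select it again */
		if (best < 0 || t[i].rss > t[best].rss)
			best = (int)i;
	}
	return best; /* -1 if every task is already marked */
}

/*
 * Models the kill step: try the children first, but skip a child that is
 * already marked, otherwise we would "kill" the same stuck child forever.
 * victim must be a valid index into t. Returns the index actually killed.
 */
static int sim_oom_kill_process(struct sim_task *t, size_t n, int victim)
{
	for (size_t i = 0; i < n; i++) {
		if (t[i].parent == victim && !t[i].memdie) {
			t[i].memdie = true;
			return (int)i;
		}
	}
	t[victim].memdie = true; /* no killable child left: kill the victim */
	return victim;
}
```

With a parent whose first child is already marked, the selection step returns the parent and the kill step moves on to the next unmarked child (or to the parent itself) instead of picking the stuck first child again.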

> Regardless of any OOM killer synchronization that we do, it is still
> possible for the OOM killer to return after killing a task and then
> another OOM situation be triggered on a subsequent allocation attempt
> before the killed task has exited.  It's still marked as TIF_MEMDIE, so
> your change will exempt it from being a target again and one of its
> siblings or, worse, its parent will be killed.

This is the risk of spurious oom killing, yes. You have to choose
between a deadlock and risking a spurious oom kill. Even when you
add your 60-second timeout in the task_struct between each new TIF_MEMDIE
bitflag set, you're still going to risk spurious oom killing...

The schedule_timeout in the oom killer and in the VM that I have in my
patchset, combined with your very limited zone-oom-lock functionality
(limited because it's gone by the time out_of_memory returns and it
currently can't take into account when the TIF_MEMDIE task actually
exited), in practice didn't generate spurious kills in my testing. It
may not be enough, but it's a start...

> You can't guarantee that this couldn't have been prevented given 
> sufficient time for the exiting task to die, so this change introduces the 
> possibility that tasks will unnecessarily be killed to alleviate the OOM 
> condition.

Not just to 'alleviate' the oom condition, but to prevent a system crash.
