From: Mel Gorman <mgorman@suse.de>
To: Minchan Kim <minchan@kernel.org>
Cc: David Rientjes <rientjes@google.com>,
Luigi Semenzato <semenzato@google.com>,
linux-mm@kvack.org, Dan Magenheimer <dan.magenheimer@oracle.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Sonny Rao <sonnyrao@google.com>
Subject: Re: zram OOM behavior
Date: Fri, 9 Nov 2012 09:50:24 +0000
Message-ID: <20121109095024.GI8218@suse.de>
In-Reply-To: <20121106101719.GA2005@barrios>

On Tue, Nov 06, 2012 at 07:17:20PM +0900, Minchan Kim wrote:
> On Tue, Nov 06, 2012 at 08:58:22AM +0000, Mel Gorman wrote:
> > On Tue, Nov 06, 2012 at 09:25:50AM +0900, Minchan Kim wrote:
> > > On Mon, Nov 05, 2012 at 02:46:14PM +0000, Mel Gorman wrote:
> > > > On Sat, Nov 03, 2012 at 07:36:31AM +0900, Minchan Kim wrote:
> > > > > > <SNIP>
> > > > > > In the first version it would never try to enter direct reclaim if a
> > > > > > fatal signal was pending but always claim that forward progress was
> > > > > > being made.
> > > > >
> > > > > Surely we need fix for preventing deadlock with OOM kill and that's why
> > > > > I have Cced you and this patch fixes it but my question is why we need
> > > > > such fatal signal checking trick.
> > > > >
> > > > > How about this?
> > > > >
> > > >
> > > > Both will work as expected but....
> > > >
> > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > > index 10090c8..881619e 100644
> > > > > --- a/mm/vmscan.c
> > > > > +++ b/mm/vmscan.c
> > > > > @@ -2306,13 +2306,6 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
> > > > >
> > > > > throttle_direct_reclaim(gfp_mask, zonelist, nodemask);
> > > > >
> > > > > - /*
> > > > > - * Do not enter reclaim if fatal signal is pending. 1 is returned so
> > > > > - * that the page allocator does not consider triggering OOM
> > > > > - */
> > > > > - if (fatal_signal_pending(current))
> > > > > - return 1;
> > > > > -
> > > > > trace_mm_vmscan_direct_reclaim_begin(order,
> > > > > sc.may_writepage,
> > > > > gfp_mask);
> > > > >
> > > > > In this case, after throttling, current will try to do direct reclaim and
> > > > > if he makes forward progress, he will get a memory and exit if he receive KILL signal.
> > > >
> > > > It may be completely unnecessary to reclaim memory if the process that was
> > > > throttled and killed just exits quickly. As the fatal signal is pending
> > > > it will be able to use the pfmemalloc reserves.
> > > >
> > > > > If he can't make forward progress with direct reclaim, he can ends up OOM path but
> > > > > out_of_memory checks signal check of current and allow to access reserved memory pool
> > > > > for quick exit and return without killing other victim selection.
> > > >
> > > > While this is true, what advantage is there to having a killed process
> > > > potentially reclaiming memory it does not need to?
> > >
> > > Killed process needs a memory for him to be terminated. I think it's not a good idea for him
> > > to use reserved memory pool unconditionally although he is throtlled and killed.
> > > Because reserved memory pool is very stricted resource for emergency so using reserved memory
> > > pool should be last resort after he fail to reclaim.
> > >
> >
> > Part of that reclaim can be the process reclaiming its own pages and
> > putting them in swap just so it can exit shortly afterwards. If it was
> > throttled in this path, it implies that swap-over-NFS is enabled where
>
> Could we make sure it's only the case for swap-over-NFS?

The PFMEMALLOC reserves being consumed to the point of throttling is only
expected in the case of swap-over-network -- check the pgscan_direct_throttle
counter to be sure. So it's already the case that this throttling logic and
its signal handling is mostly a swap-over-NFS thing. It is possible that
a badly behaving driver using GFP_ATOMIC to allocate long-lived buffers
could force a situation where a process gets throttled but I'm not aware
of a case where this happens today.

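
As a rough illustration of checking that counter, here is a userspace sketch
that looks up pgscan_direct_throttle in /proc/vmstat-style text. The helper
names vmstat_lookup() and vmstat_read() are made up for this example and the
16K buffer size is an assumption, not something taken from the kernel.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up one "name value" line in /proc/vmstat-style text.
 * Returns 0 and stores the value in *out on success, -1 if not found.
 * (Hypothetical helper, for illustration only.)
 */
static int vmstat_lookup(const char *text, const char *name,
			 unsigned long long *out)
{
	size_t len = strlen(name);
	const char *line = text;

	while (line && *line) {
		/* Match the whole counter name, followed by a space. */
		if (strncmp(line, name, len) == 0 && line[len] == ' ') {
			if (sscanf(line + len, "%llu", out) == 1)
				return 0;
			return -1;
		}
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline */
	}
	return -1;
}

/* Read a counter from a file such as /proc/vmstat; -1 on any error. */
static int vmstat_read(const char *path, const char *name,
		       unsigned long long *out)
{
	char buf[16384];	/* assumed to be large enough for the file */
	size_t n;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return vmstat_lookup(buf, name, out);
}
```

Calling vmstat_read("/proc/vmstat", "pgscan_direct_throttle", &v) before and
after a test run and comparing the two values would show whether any process
was throttled in between.
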
> I think it can happen if the system has very slow thumb card.
>

How? They shouldn't be stuck in throttling in this case. They should be
blocked on IO, congestion wait, dirty throttling etc.

> > such reclaim in fact might require the pfmemalloc reserves to be used to
> > allocate network buffers. It's potentially unnecessary work because the
>
> You mean we need pfmemalloc reserve to swap out anon pages by swap-over-NFS?

In very low-memory situations, yes. We can be at the min watermark but
still need to allocate a page for a network buffer to swap out the anon page.

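
A toy model of that situation, as a sketch only: alloc_allowed() is a
hypothetical name, and the real watermark check also accounts for allocation
order and lowmem reserves, which are ignored here.

```c
#include <stdbool.h>

/*
 * Simplified model: an ordinary allocation must leave the zone above
 * the min watermark, while a caller that is entitled to the reserves
 * may dip below it as long as any page is free at all.
 */
static bool alloc_allowed(unsigned long free_pages,
			  unsigned long min_wmark,
			  bool may_use_reserves)
{
	if (may_use_reserves)
		return free_pages > 0;
	return free_pages > min_wmark;
}
```

With free_pages equal to min_wmark, an ordinary request is refused but the
network-buffer allocation needed to write the anon page out still succeeds if
it is allowed to use the reserves.
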
> Yes. In this case, you're right. I would be better to use reserve pool for
> just exiting instead of swap out over network. But how can you make sure that
> we have only anonymous page when we try to reclaim?
> If there are some file-backed pages, we can avoid swapout at that time.
> Maybe we need some check.
>

That would be a fairly invasive set of checks for a corner case. The rule
would have to be something like: if swap-over-NFS is in use, memory is
critically low, we are about to OOM and file pages are available, then
reclaim only file pages.

This is also drifting away from why we're having this discussion in the
first place: looping due to improper handling of a pending fatal signal.

> > same reserves could have been used to just exit the process.
> >
> > I'll go your way if you insist because it's not like getting throttled
> > and killed before exit is a common situation and it should work either
> > way.
>
> I don't want to insist on. Just want to know what's the problem and find
> better solution. :)
>
In that case, I'm going to send the patch to Andrew on Monday and avoid
direct reclaim when a fatal signal is pending in the swap-over-network
case. Are you ok with that?
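
For reference, the three behaviours discussed in this thread can be compared
with a small userspace model. This is a sketch, not the patch itself: the enum
and function names are invented for illustration, and "skip" stands for
returning 1 from try_to_free_pages() without reclaiming so that the page
allocator does not consider triggering OOM.

```c
#include <stdbool.h>

/* Hypothetical labels for the three variants under discussion. */
enum policy {
	POLICY_ALWAYS_CHECK,		/* existing check, throttled or not */
	POLICY_NEVER_CHECK,		/* check removed, as in the quoted diff */
	POLICY_CHECK_IF_THROTTLED,	/* check only after being throttled */
};

/*
 * Returns true if direct reclaim should be skipped and forward
 * progress reported to the caller instead.
 */
static bool skip_direct_reclaim(enum policy p, bool was_throttled,
				bool fatal_signal_pending)
{
	switch (p) {
	case POLICY_ALWAYS_CHECK:
		return fatal_signal_pending;
	case POLICY_NEVER_CHECK:
		return false;
	case POLICY_CHECK_IF_THROTTLED:
		return was_throttled && fatal_signal_pending;
	}
	return false;
}
```

Under the last policy a killed process that was never throttled still enters
direct reclaim as normal, and only the throttled-and-killed case goes straight
to exit.
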
--
Mel Gorman
SUSE Labs