From: Minchan Kim <minchan@kernel.org>
To: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>,
	Luigi Semenzato <semenzato@google.com>,
	linux-mm@kvack.org, Dan Magenheimer <dan.magenheimer@oracle.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Sonny Rao <sonnyrao@google.com>
Subject: Re: zram OOM behavior
Date: Mon, 12 Nov 2012 22:32:18 +0900
Message-ID: <20121112133218.GA3156@barrios>
In-Reply-To: <20121109095024.GI8218@suse.de>

Sorry for the late reply.
I'm on a training course through this week, so my responses will be delayed as well.

On Fri, Nov 09, 2012 at 09:50:24AM +0000, Mel Gorman wrote:
> On Tue, Nov 06, 2012 at 07:17:20PM +0900, Minchan Kim wrote:
> > On Tue, Nov 06, 2012 at 08:58:22AM +0000, Mel Gorman wrote:
> > > On Tue, Nov 06, 2012 at 09:25:50AM +0900, Minchan Kim wrote:
> > > > On Mon, Nov 05, 2012 at 02:46:14PM +0000, Mel Gorman wrote:
> > > > > On Sat, Nov 03, 2012 at 07:36:31AM +0900, Minchan Kim wrote:
> > > > > > > <SNIP>
> > > > > > > In the first version it would never try to enter direct reclaim if a
> > > > > > > fatal signal was pending but always claim that forward progress was
> > > > > > > being made.
> > > > > > 
> > > > > > Surely we need a fix to prevent the deadlock with the OOM kill, which is why
> > > > > > I have Cced you, and this patch fixes it, but my question is why we need
> > > > > > such a fatal signal checking trick.
> > > > > > 
> > > > > > How about this?
> > > > > > 
> > > > > 
> > > > > Both will work as expected but....
> > > > > 
> > > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > > > index 10090c8..881619e 100644
> > > > > > --- a/mm/vmscan.c
> > > > > > +++ b/mm/vmscan.c
> > > > > > @@ -2306,13 +2306,6 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
> > > > > >  
> > > > > >         throttle_direct_reclaim(gfp_mask, zonelist, nodemask);
> > > > > >  
> > > > > > -       /*
> > > > > > -        * Do not enter reclaim if fatal signal is pending. 1 is returned so
> > > > > > -        * that the page allocator does not consider triggering OOM
> > > > > > -        */
> > > > > > -       if (fatal_signal_pending(current))
> > > > > > -               return 1;
> > > > > > -
> > > > > >         trace_mm_vmscan_direct_reclaim_begin(order,
> > > > > >                                 sc.may_writepage,
> > > > > >                                 gfp_mask);
> > > > > >  
> > > > > > In this case, after throttling, current will try to do direct reclaim and
> > > > > > if it makes forward progress, it will get memory and exit if it has received a KILL signal.
> > > > > 
> > > > > It may be completely unnecessary to reclaim memory if the process that was
> > > > > throttled and killed just exits quickly. As the fatal signal is pending
> > > > > it will be able to use the pfmemalloc reserves.
> > > > > 
> > > > > > If it can't make forward progress with direct reclaim, it can end up in the OOM path, but
> > > > > > out_of_memory checks for a pending signal on current and allows access to the reserved memory pool
> > > > > > for a quick exit, returning without selecting another victim to kill.
> > > > > 
> > > > > While this is true, what advantage is there to having a killed process
> > > > > potentially reclaiming memory it does not need to?
> > > > 
> > > > A killed process needs memory in order to terminate. I think it's not a good idea for it
> > > > to use the reserved memory pool unconditionally even though it was throttled and killed.
> > > > The reserved memory pool is a very restricted resource for emergencies, so using it
> > > > should be a last resort after the process fails to reclaim.
> > > > 
> > > 
> > > Part of that reclaim can be the process reclaiming its own pages and
> > > putting them in swap just so it can exit shortly afterwards. If it was
> > > throttled in this path, it implies that swap-over-NFS is enabled where
> > 
> > Could we make sure it's only the case for swap-over-NFS?
> 
> The PFMEMALLOC reserves being consumed to the point of throttling is only
> expected in the case of swap-over-network -- check the pgscan_direct_throttle
> counter to be sure. So it's already the case that this throttling logic and
> its signal handling is mostly a swap-over-NFS thing. It is possible that
> a badly behaving driver using GFP_ATOMIC to allocate long-lived buffers
> could force a situation where a process gets throttled but I'm not aware
> of a case where this happens today.
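
For anyone who wants to check that counter from user space, here is a
minimal sketch in C. It assumes /proc/vmstat uses the usual layout of one
"name value" pair per line. The function names vmstat_lookup and
vmstat_read_counter are made up for this example and are not kernel or
libc APIs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Find "name value" in vmstat-style text (one pair per line).
 * Returns 0 and stores the value on success, -1 if the counter is absent.
 * Hypothetical helper, written only for this illustration.
 */
int vmstat_lookup(const char *text, const char *name,
                  unsigned long long *value)
{
    const char *line;
    size_t len;

    assert(text != NULL && name != NULL && value != NULL);
    len = strlen(name);

    for (line = text; line != NULL && *line != '\0'; ) {
        /* Require the full counter name followed by a space. */
        if (strncmp(line, name, len) == 0 && line[len] == ' ' &&
            sscanf(line + len, "%llu", value) == 1)
            return 0;

        /* Advance to the start of the next line, if any. */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return -1;
}

/*
 * Read /proc/vmstat and look up one counter.
 * Returns -1 if the file cannot be opened or the counter is missing.
 * The 64 KiB buffer is an assumption about the file's size.
 */
int vmstat_read_counter(const char *name, unsigned long long *value)
{
    static char buf[65536];
    size_t n;
    FILE *f = fopen("/proc/vmstat", "r");

    if (f == NULL)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return vmstat_lookup(buf, name, value);
}
```

Calling vmstat_read_counter("pgscan_direct_throttle", &v) before and after
the workload and comparing the two values should show whether any process
was throttled in between.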

I have seen some custom drivers on the embedded side use GFP_ATOMIC liberally to
avoid deadlock. Of course, that's not good behavior, but we have to live with it.
We can't even fix it because we don't have the source. :(

> 
> > I think it can happen if the system has a very slow thumb card.
> > 
> 
> How? They shouldn't be stuck in throttling in this case. They should be
> blocked on IO, congestion wait, dirty throttling etc.

Some block drivers (e.g., mmc) use a thread model with PF_MEMALLOC, so I think
they can get stuck in the throttling logic.

> 
> > > such reclaim in fact might require the pfmemalloc reserves to be used to
> > > allocate network buffers. It's potentially unnecessary work because the
> > 
> > You mean we need pfmemalloc reserve to swap out anon pages by swap-over-NFS?
> 
> In very low-memory situations - yes. We can be at the min watermark but
> still need to allocate a page for a network buffer to swap out the anon page.
> 
> > Yes. In this case, you're right. It would be better to use the reserve pool for
> > just exiting instead of swapping out over the network. But how can you make sure that
> > we have only anonymous pages when we try to reclaim?
> > If there are some file-backed pages, we can avoid swapout at that time.
> > Maybe we need some check.
> > 
> 
> That would be a fairly invasive set of checks for a corner case. If
> swap-over-nfs + critically low + about to OOM + file pages available then
> only reclaim files.
> 
> It's getting off track as to why we're having this discussion in the first
> place -- looping due to improper handling of fatal signal pending.

If a user tunes /proc/sys/vm/swappiness, we could have many page cache pages
when swap-over-NFS happens.
My point is: why should we use the emergency memory pool when we still have
reclaimable memory?

> 
> > > same reserves could have been used to just exit the process.
> > > 
> > > I'll go your way if you insist because it's not like getting throttled
> > > and killed before exit is a common situation and it should work either
> > > way.
> > 
> > I don't want to insist. I just want to know what the problem is and find
> > a better solution. :)
> > 
> 
> In that case, I'm going to send the patch to Andrew on Monday and avoid
> direct reclaim when a fatal signal is pending in the swap-over-network
> case. Are you ok with that?

Sorry, but I don't think your patch is the best approach.

> 
> -- 
> Mel Gorman
> SUSE Labs

-- 
Kind Regards,
Minchan Kim

