From: Minchan Kim <minchan.kim@gmail.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Shaohua Li <shaohua.li@intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@suse.cz>, mel <mel@csn.ul.ie>,
Rik van Riel <riel@redhat.com>, linux-mm <linux-mm@kvack.org>,
Johannes Weiner <jweiner@redhat.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Subject: Re: [patch 1/2]vmscan: correct all_unreclaimable for zone without lru pages
Date: Tue, 11 Oct 2011 18:36:55 +0900
Message-ID: <20111011093655.GA16425@barrios-desktop>
In-Reply-To: <20111011182948.82525d89.kamezawa.hiroyu@jp.fujitsu.com>
On Tue, Oct 11, 2011 at 06:29:48PM +0900, KAMEZAWA Hiroyuki wrote:
> On Tue, 11 Oct 2011 18:07:56 +0900
> Minchan Kim <minchan.kim@gmail.com> wrote:
>
> > Hi Kame,
> >
> > On Tue, Oct 11, 2011 at 05:09:41PM +0900, KAMEZAWA Hiroyuki wrote:
> > > On Sun, 9 Oct 2011 16:45:58 +0900
> > > Minchan Kim <minchan.kim@gmail.com> wrote:
> > > > Thanks for your careful review.
> > > > I will send a formal version.
> > > >
> > > > From 49078e0ebccae371b04930ae76dfd5ba158032ca Mon Sep 17 00:00:00 2001
> > > > From: Minchan Kim <minchan.kim@gmail.com>
> > > > Date: Sun, 9 Oct 2011 16:38:40 +0900
> > > > Subject: [PATCH] vmscan: judge zone's all_unreclaimable carefully
> > > >
> > > > Shaohua Li reported that all_unreclaimable of the DMA zone is always set
> > > > because the system has a big HIGH memory zone, so lowmem_reserve[HIGH]
> > > > could be big.
> > > >
> > > > It could be a problem, as follows:
> > > >
> > > > Assumption :
> > > > 1. The system has a big high memory zone, so lowmem_reserve[HIGH] of the DMA zone would be big.
> > > > 2. The HIGH/NORMAL zones are full but the DMA zone has enough free pages.
> > > >
> > > > Scenario
> > > > 1. A requests to allocate a page in the HIGH zone.
> > > > 2. The HIGH/NORMAL zones have already consumed lots of pages, so the request falls back to the DMA zone.
> > > > 3. In the DMA zone, the allocator fails too, because lowmem_reserve[HIGH] is very big, so it wakes up kswapd.
> > > > 4. kswapd calls shrink_zone when it sees the DMA zone, since the DMA zone's lowmem_reserve[HIGHMEM]
> > > > is so big that the zone cannot meet zone_watermark_ok_safe(high_wmark_pages(zone) + balance_gap,
> > > > *end_zone*).
> > > > 5. The DMA zone doesn't meet the stop condition (nr_slab != 0, !zone_reclaimable) because the zone has few lru pages
> > > > and no slab pages, so kswapd easily sets all_unreclaimable of the zone to *1*.
> > > > 6. B requests to allocate many pages in the NORMAL zone, but the NORMAL zone has no free pages,
> > > > so the request falls back to the DMA zone.
> > > > 7. The DMA zone allocates many pages for the NORMAL request because lowmem_reserve[NORMAL] is small.
> > > > These pages are used by an application (i.e., they are LRU pages, so now the DMA zone can have many reclaimable pages).
> > > > 8. C requests to allocate a page in the NORMAL zone but fails because the DMA zone doesn't have enough free pages.
> > > > (Most of the pages in the DMA zone are consumed by B.)
> > > > 9. kswapd tries to reclaim lru pages in the DMA zone but fails because all_unreclaimable of the zone is 1. Otherwise,
> > > > it could reclaim many of the pages used by B.
> > > >
> > > > Of course, we can do something in DEF_PRIORITY, but it couldn't do enough because it can't raise
> > > > synchronous reclaim in the direct reclaim path if the zone has many dirty pages,
> > > > so the process is killed by OOM.
> > > >
> > > > The principal problem is caused by step 8.
> > > > By step 8, the zone's lru size has increased very much but zone->all_unreclaimable is still 1.
> > > > If the lru size increases, it is worth trying to reclaim again.
> > > > The rationale is that we reset all_unreclaimable to 0 even when we free just one page.
> > > >
> > > > Cc: Mel Gorman <mel@csn.ul.ie>
> > > > Cc: Rik van Riel <riel@redhat.com>
> > > > Cc: Michal Hocko <mhocko@suse.cz>
> > > > Cc: Johannes Weiner <jweiner@redhat.com>
> > > > Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> > > > Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > > > Reported-by: Shaohua Li <shaohua.li@intel.com>
> > > > Reviewed-by: Shaohua Li <shaohua.li@intel.com>
> > > > Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
> > >
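Steps 3 and 7 of the quoted scenario both hinge on the same check: the DMA zone's free pages are compared against a watermark plus the lowmem_reserve entry selected by the zone the request was originally aimed at. Below is a minimal userspace sketch of that comparison for order-0 requests. The struct, the function name, and all numbers are made up for illustration; this is not the kernel's struct zone or its watermark code.

```c
#include <stdbool.h>

/* Zone indices, lowest first; a simplified stand-in for the kernel's zone types. */
enum { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };

/* Hypothetical, stripped-down zone: only the fields this check needs. */
struct zone_model {
	unsigned long free_pages;
	unsigned long watermark;                    /* e.g. the min or high watermark */
	unsigned long lowmem_reserve[MAX_NR_ZONES]; /* indexed by the request's classzone */
};

/*
 * A zone can serve an order-0 request only if its free pages exceed the
 * watermark plus the reserve it holds back from requests of that classzone.
 */
bool zone_can_allocate(const struct zone_model *z, int classzone_idx)
{
	return z->free_pages > z->watermark + z->lowmem_reserve[classzone_idx];
}
```

With free_pages = 3000, watermark = 100 and lowmem_reserve = {0, 500, 8000}, a request aimed at ZONE_HIGHMEM is refused (3000 <= 8100) while one aimed at ZONE_NORMAL is served (3000 > 600). That is the asymmetry the scenario relies on.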
> > > Hmm, catching changes of page usage in a zone ?
> >
> > Not exactly.
> > It catches only an increase in the zone's lru pages.
> >
> Sure.
>
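The idea of "catching only lru page increases" can be sketched roughly as follows. The diff itself is not part of this message, so this is only an illustration of the rationale in the quoted changelog, not the patch. The struct, the hook name, and the choice to also reset pages_scanned are assumptions.

```c
#include <stdbool.h>

/* Hypothetical, stripped-down zone state; not the kernel's struct zone. */
struct zone_state {
	unsigned long nr_lru_pages;   /* pages currently on the zone's LRU lists */
	unsigned long pages_scanned;  /* pages scanned since the last reset */
	bool all_unreclaimable;       /* set once reclaim has given up on the zone */
};

/*
 * Hypothetical hook called whenever the zone's LRU size changes by delta
 * pages. Growth clears all_unreclaimable so reclaim tries the zone again;
 * shrinkage only updates the count (clamped at zero).
 */
void zone_lru_size_changed(struct zone_state *z, long delta)
{
	if (delta > 0) {
		z->nr_lru_pages += (unsigned long)delta;
		z->all_unreclaimable = false;
		z->pages_scanned = 0; /* assumption: restart the scan accounting too */
	} else {
		unsigned long dec = 0UL - (unsigned long)delta;

		z->nr_lru_pages = dec > z->nr_lru_pages ? 0 : z->nr_lru_pages - dec;
	}
}
```

In terms of the scenario, step 7 would call the hook with a large positive delta, so by step 9 kswapd would no longer skip the zone.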
> > > And this will allow us to catch swap_on() and make a zone reclaimable
> > > even if page usage does not change, right?
> >
> > It's not in the patch, but I think it could be another patch.
> > Could you post it if you really need it?
> >
> What I mean is "zone_reclaimable_pages() takes swappable or not
> into account for anon pages. So, it's already covered."
Got it. I thought you were talking about a swap-on race like the following:
when the VM decides the zone is all_unreclaimable, a user could suddenly
do swap_on. From then on, we could reclaim anon pages, so we would have to
reset all_unreclaimable.
Anyway, it's an idea. If anyone thinks we should handle it, feel free to post.
But I am sure.
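To make the swap point concrete, here is a small userspace model of "anon pages count as reclaimable only when swap is available". The struct and function names are hypothetical, and nr_swap_pages is passed as a plain parameter rather than read from kernel state.

```c
/* Hypothetical per-zone LRU counters; a simplified stand-in for zone page state. */
struct zone_lru_counts {
	unsigned long active_file;
	unsigned long inactive_file;
	unsigned long active_anon;
	unsigned long inactive_anon;
};

/*
 * File pages are always candidates for reclaim; anon pages are counted
 * only if there is swap space to write them to.
 */
unsigned long reclaimable_pages(const struct zone_lru_counts *z, long nr_swap_pages)
{
	unsigned long nr = z->active_file + z->inactive_file;

	if (nr_swap_pages > 0)
		nr += z->active_anon + z->inactive_anon;

	return nr;
}
```

For a zone holding only anon pages, the count is 0 without swap and becomes nonzero right after swap is enabled. The race described above is that an all_unreclaimable flag set before swap_on would not notice this change by itself.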
>
> I have no requirements.
>
> Thanks,
> -Kame
>
--
Kind regards,
Minchan Kim