* [PATCH v2 -mm] limit direct reclaim for higher order allocations

From: Rik van Riel @ 2011-09-27 14:52 UTC (permalink / raw)
To: linux-mm; +Cc: linux-kernel, Mel Gorman, akpm, Johannes Weiner, aarcange

When suffering from memory fragmentation due to unfreeable pages,
THP page faults will repeatedly try to compact memory.  Due to
the unfreeable pages, compaction fails.

Needless to say, at that point page reclaim also fails to create
free contiguous 2MB areas.  However, that doesn't stop the current
code from trying, over and over again, and freeing a minimum of
4MB (2UL << sc->order pages) at every single invocation.

This resulted in my 12GB system having 2-3GB free memory, a
corresponding amount of used swap and very sluggish response times.

This can be avoided by having the direct reclaim code not reclaim
from zones that already have plenty of free memory available for
compaction.

If compaction still fails due to unmovable memory, doing additional
reclaim will only hurt the system, not help.

Signed-off-by: Rik van Riel <riel@redhat.com>

---
-v2: shrink_zones now uses the same thresholds as used by compaction itself;
     not only is this conceptually nicer, it also results in kswapd doing
     some actual work (before, all the page freeing work was done by THP
     allocators).  I seem to see fewer application stalls after this change.

 mm/vmscan.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index b7719ec..117eb4d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2083,6 +2083,16 @@ static void shrink_zones(int priority, struct zonelist *zonelist,
 			continue;
 		if (zone->all_unreclaimable && priority != DEF_PRIORITY)
 			continue;	/* Let kswapd poll it */
+		if (COMPACTION_BUILD) {
+			/*
+			 * If we already have plenty of memory free
+			 * for compaction, don't free any more.
+			 */
+			if (sc->order > PAGE_ALLOC_COSTLY_ORDER &&
+			    (compaction_suitable(zone, sc->order) ||
+			     compaction_deferred(zone)))
+				continue;
+		}
 		/*
 		 * This steals pages from memory cgroups over softlimit
 		 * and returns the number of reclaimed pages and

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to
majordomo@kvack.org.  For more info on Linux MM, see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: email@kvack.org
* Re: [PATCH v2 -mm] limit direct reclaim for higher order allocations

From: Johannes Weiner @ 2011-09-27 16:06 UTC (permalink / raw)
To: Rik van Riel
Cc: linux-mm, linux-kernel, Mel Gorman, akpm, Johannes Weiner, aarcange

On Tue, Sep 27, 2011 at 10:52:46AM -0400, Rik van Riel wrote:
> When suffering from memory fragmentation due to unfreeable pages,
> THP page faults will repeatedly try to compact memory.  Due to
> the unfreeable pages, compaction fails.
>
> Needless to say, at that point page reclaim also fails to create
> free contiguous 2MB areas.  However, that doesn't stop the current
> code from trying, over and over again, and freeing a minimum of
> 4MB (2UL << sc->order pages) at every single invocation.
>
> This resulted in my 12GB system having 2-3GB free memory, a
> corresponding amount of used swap and very sluggish response times.
>
> This can be avoided by having the direct reclaim code not reclaim
> from zones that already have plenty of free memory available for
> compaction.
>
> If compaction still fails due to unmovable memory, doing additional
> reclaim will only hurt the system, not help.
>
> Signed-off-by: Rik van Riel <riel@redhat.com>
>
> ---
> -v2: shrink_zones now uses the same thresholds as used by compaction itself;
>      not only is this conceptually nicer, it also results in kswapd doing
>      some actual work (before, all the page freeing work was done by THP
>      allocators).  I seem to see fewer application stalls after this change.
>
>  mm/vmscan.c |   10 ++++++++++
>  1 files changed, 10 insertions(+), 0 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index b7719ec..117eb4d 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2083,6 +2083,16 @@ static void shrink_zones(int priority, struct zonelist *zonelist,
>  			continue;
>  		if (zone->all_unreclaimable && priority != DEF_PRIORITY)
>  			continue;	/* Let kswapd poll it */
> +		if (COMPACTION_BUILD) {
> +			/*
> +			 * If we already have plenty of memory free
> +			 * for compaction, don't free any more.
> +			 */
> +			if (sc->order > PAGE_ALLOC_COSTLY_ORDER &&
> +			    (compaction_suitable(zone, sc->order) ||
> +			     compaction_deferred(zone)))
> +				continue;
> +		}

I don't think the comment is complete in combination with the check
for order > PAGE_ALLOC_COSTLY_ORDER, as compaction is invoked for all
non-zero orders.

But the traditional behaviour does less harm if the orders are small
and your problem was triggered by THP allocations, so I agree with the
code itself.

Acked-by: Johannes Weiner <jweiner@redhat.com>
* Re: [PATCH v2 -mm] limit direct reclaim for higher order allocations

From: Johannes Weiner @ 2011-10-07 9:07 UTC (permalink / raw)
To: Rik van Riel
Cc: linux-mm, linux-kernel, Mel Gorman, akpm, Johannes Weiner, aarcange

On Tue, Sep 27, 2011 at 06:06:48PM +0200, Johannes Weiner wrote:
> On Tue, Sep 27, 2011 at 10:52:46AM -0400, Rik van Riel wrote:
> > When suffering from memory fragmentation due to unfreeable pages,
> > THP page faults will repeatedly try to compact memory.  Due to
> > the unfreeable pages, compaction fails.
> >
> > Needless to say, at that point page reclaim also fails to create
> > free contiguous 2MB areas.  However, that doesn't stop the current
> > code from trying, over and over again, and freeing a minimum of
> > 4MB (2UL << sc->order pages) at every single invocation.
> >
> > This resulted in my 12GB system having 2-3GB free memory, a
> > corresponding amount of used swap and very sluggish response times.
> >
> > This can be avoided by having the direct reclaim code not reclaim
> > from zones that already have plenty of free memory available for
> > compaction.
> >
> > If compaction still fails due to unmovable memory, doing additional
> > reclaim will only hurt the system, not help.
> >
> > Signed-off-by: Rik van Riel <riel@redhat.com>
> >
> > ---
> > -v2: shrink_zones now uses the same thresholds as used by compaction itself;
> >      not only is this conceptually nicer, it also results in kswapd doing
> >      some actual work (before, all the page freeing work was done by THP
> >      allocators).  I seem to see fewer application stalls after this change.
> >
> >  mm/vmscan.c |   10 ++++++++++
> >  1 files changed, 10 insertions(+), 0 deletions(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index b7719ec..117eb4d 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -2083,6 +2083,16 @@ static void shrink_zones(int priority, struct zonelist *zonelist,
> >  			continue;
> >  		if (zone->all_unreclaimable && priority != DEF_PRIORITY)
> >  			continue;	/* Let kswapd poll it */
> > +		if (COMPACTION_BUILD) {
> > +			/*
> > +			 * If we already have plenty of memory free
> > +			 * for compaction, don't free any more.
> > +			 */
> > +			if (sc->order > PAGE_ALLOC_COSTLY_ORDER &&
> > +			    (compaction_suitable(zone, sc->order) ||
> > +			     compaction_deferred(zone)))
> > +				continue;
> > +		}
>
> I don't think the comment is complete in combination with the check
> for order > PAGE_ALLOC_COSTLY_ORDER, as compaction is invoked for all
> non-zero orders.
>
> But the traditional behaviour does less harm if the orders are small
> and your problem was triggered by THP allocations, so I agree with the
> code itself.

FWIW, an incremental patch to explain the order check.  What do you
think?

Signed-off-by: Johannes Weiner <jweiner@redhat.com>
---
 mm/vmscan.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 3817fa9..930085a 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2068,8 +2068,14 @@ static void shrink_zones(int priority, struct zonelist *zonelist,
 			continue;	/* Let kswapd poll it */
 		if (COMPACTION_BUILD) {
 			/*
-			 * If we already have plenty of memory free
-			 * for compaction, don't free any more.
+			 * If we already have plenty of memory
+			 * free for compaction, don't free any
+			 * more.  Even though compaction is
+			 * invoked for any non-zero order,
+			 * only frequent costly order
+			 * reclamation is disruptive enough to
+			 * become a noticeable problem, like
+			 * transparent huge page allocations.
 			 */
 			if (sc->order > PAGE_ALLOC_COSTLY_ORDER &&
 			    (compaction_suitable(zone, sc->order) ||
--
1.7.6.4
* Re: [PATCH v2 -mm] limit direct reclaim for higher order allocations

From: Mel Gorman @ 2011-09-28 10:02 UTC (permalink / raw)
To: Rik van Riel; +Cc: linux-mm, linux-kernel, akpm, Johannes Weiner, aarcange

On Tue, Sep 27, 2011 at 10:52:46AM -0400, Rik van Riel wrote:
> When suffering from memory fragmentation due to unfreeable pages,
> THP page faults will repeatedly try to compact memory.  Due to
> the unfreeable pages, compaction fails.
>
> Needless to say, at that point page reclaim also fails to create
> free contiguous 2MB areas.  However, that doesn't stop the current
> code from trying, over and over again, and freeing a minimum of
> 4MB (2UL << sc->order pages) at every single invocation.
>
> This resulted in my 12GB system having 2-3GB free memory, a
> corresponding amount of used swap and very sluggish response times.
>
> This can be avoided by having the direct reclaim code not reclaim
> from zones that already have plenty of free memory available for
> compaction.
>
> If compaction still fails due to unmovable memory, doing additional
> reclaim will only hurt the system, not help.
>
> Signed-off-by: Rik van Riel <riel@redhat.com>
>

Because this patch improves things;

Acked-by: Mel Gorman <mgorman@suse.de>

That said, shrink_zones potentially returns having scanned and
reclaimed 0 pages.  We still fall through to shrink_slab and, because
we are reclaiming 0 pages, we loop over all priorities in
do_try_to_free_pages() and potentially even call wait_iff_congested.
I think this patch would be better if shrink_zones() returned true
when it skipped reclaim because compaction was ready, so that the
caller could return early.

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 117eb4d..ead9c94 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2061,14 +2061,19 @@ restart:
  *
  * If a zone is deemed to be full of pinned pages then just give it a light
  * scan then give up on it.
+ *
+ * This function returns true if a zone is being reclaimed for a high-order
+ * allocation that will use compaction and compaction is ready to begin. This
+ * indicates to the caller that further reclaim is unnecessary.
  */
-static void shrink_zones(int priority, struct zonelist *zonelist,
+static bool shrink_zones(int priority, struct zonelist *zonelist,
 					struct scan_control *sc)
 {
 	struct zoneref *z;
 	struct zone *zone;
 	unsigned long nr_soft_reclaimed;
 	unsigned long nr_soft_scanned;
+	bool abort_reclaim_compaction = false;
 
 	for_each_zone_zonelist_nodemask(zone, z, zonelist,
 					gfp_zone(sc->gfp_mask), sc->nodemask) {
@@ -2090,8 +2095,10 @@ static void shrink_zones(int priority, struct zonelist *zonelist,
 			 */
 			if (sc->order > PAGE_ALLOC_COSTLY_ORDER &&
 			    (compaction_suitable(zone, sc->order) ||
-			     compaction_deferred(zone)))
+			     compaction_deferred(zone))) {
+				abort_reclaim_compaction = true;
 				continue;
+			}
 		}
 		/*
 		 * This steals pages from memory cgroups over softlimit
@@ -2110,6 +2117,8 @@ static void shrink_zones(int priority, struct zonelist *zonelist,
 
 		shrink_zone(priority, zone, sc);
 	}
+
+	return abort_reclaim_compaction;
 }
 
 static bool zone_reclaimable(struct zone *zone)
@@ -2174,7 +2183,8 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 		sc->nr_scanned = 0;
 		if (!priority)
 			disable_swap_token(sc->mem_cgroup);
-		shrink_zones(priority, zonelist, sc);
+		if (shrink_zones(priority, zonelist, sc))
+			break;
 
 		/*
 		 * Don't shrink slabs when reclaiming memory from
 		 * over limit cgroups