* [PATCH] mm: vmscan: check mem cgroup over reclaimed

From: Hillf Danton @ 2012-01-23  1:55 UTC
To: linux-mm
Cc: Michal Hocko, KAMEZAWA Hiroyuki, Ying Han, Hugh Dickins, Andrew Morton, LKML, Hillf Danton

To avoid reduction in performance of reclaimee, checking overreclaim is added
after shrinking lru list, when pages are reclaimed from mem cgroup.

If over reclaim occurs, shrinking remaining lru lists is skipped, and no more
reclaim for reclaim/compaction.

Signed-off-by: Hillf Danton <dhillf@gmail.com>
---

--- a/mm/vmscan.c	Mon Jan 23 00:23:10 2012
+++ b/mm/vmscan.c	Mon Jan 23 09:57:20 2012
@@ -2086,6 +2086,7 @@ static void shrink_mem_cgroup_zone(int p
 	unsigned long nr_reclaimed, nr_scanned;
 	unsigned long nr_to_reclaim = sc->nr_to_reclaim;
 	struct blk_plug plug;
+	bool memcg_over_reclaimed = false;

 restart:
 	nr_reclaimed = 0;
@@ -2103,6 +2104,11 @@ restart:

 			nr_reclaimed += shrink_list(lru, nr_to_scan,
 						    mz, sc, priority);
+
+			memcg_over_reclaimed = !scanning_global_lru(mz)
+					&& (nr_reclaimed >= nr_to_reclaim);
+			if (memcg_over_reclaimed)
+				goto out;
 		}
 	}
 	/*
@@ -2116,6 +2122,7 @@ restart:
 		if (nr_reclaimed >= nr_to_reclaim && priority < DEF_PRIORITY)
 			break;
 	}
+out:
 	blk_finish_plug(&plug);
 	sc->nr_reclaimed += nr_reclaimed;

@@ -2127,7 +2134,8 @@ restart:
 	shrink_active_list(SWAP_CLUSTER_MAX, mz, sc, priority, 0);

 	/* reclaim/compaction might need reclaim to continue */
-	if (should_continue_reclaim(mz, nr_reclaimed,
+	if (!memcg_over_reclaimed &&
+	    should_continue_reclaim(mz, nr_reclaimed,
 					sc->nr_scanned - nr_scanned, sc))
 		goto restart;
* Re: [PATCH] mm: vmscan: check mem cgroup over reclaimed

From: Johannes Weiner @ 2012-01-23 10:47 UTC
To: Hillf Danton
Cc: linux-mm, Michal Hocko, KAMEZAWA Hiroyuki, Ying Han, Hugh Dickins, Andrew Morton, LKML

On Mon, Jan 23, 2012 at 09:55:07AM +0800, Hillf Danton wrote:
> To avoid reduction in performance of reclaimee, checking overreclaim is added
> after shrinking lru list, when pages are reclaimed from mem cgroup.
>
> If over reclaim occurs, shrinking remaining lru lists is skipped, and no more
> reclaim for reclaim/compaction.
>
> Signed-off-by: Hillf Danton <dhillf@gmail.com>
> ---
>
> --- a/mm/vmscan.c	Mon Jan 23 00:23:10 2012
> +++ b/mm/vmscan.c	Mon Jan 23 09:57:20 2012
> @@ -2086,6 +2086,7 @@ static void shrink_mem_cgroup_zone(int p
>  	unsigned long nr_reclaimed, nr_scanned;
>  	unsigned long nr_to_reclaim = sc->nr_to_reclaim;
>  	struct blk_plug plug;
> +	bool memcg_over_reclaimed = false;
>
>  restart:
>  	nr_reclaimed = 0;
> @@ -2103,6 +2104,11 @@ restart:
>
>  			nr_reclaimed += shrink_list(lru, nr_to_scan,
>  						    mz, sc, priority);
> +
> +			memcg_over_reclaimed = !scanning_global_lru(mz)
> +					&& (nr_reclaimed >= nr_to_reclaim);
> +			if (memcg_over_reclaimed)
> +				goto out;

Since this merge window, scanning_global_lru() is always false when
the memory controller is enabled, i.e. most common configurations and
distribution kernels.

This will quite likely have bad effects on zone balancing, pressure
balancing between anon/file lru etc, while you haven't shown that any
workloads actually benefit from this.

Submitting patches like this without mentioning a problematic scenario
and numbers that demonstrate that the patch improves it is not helpful.
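(Context for the scanning_global_lru() remark above: in kernels of this period the helper is, in essence, the test sketched below -- a simplified illustration, not the verbatim mm/vmscan.c source. Once global reclaim is also performed on behalf of the root memcg, mz->mem_cgroup is non-NULL whenever the memory controller is compiled in, so the test returns false even for global reclaim.)

==
/* Simplified sketch of the idea behind scanning_global_lru(); not verbatim. */
static bool scanning_global_lru(struct mem_cgroup_zone *mz)
{
	/*
	 * mz->mem_cgroup is only NULL when reclaim runs without any
	 * cgroup context.  With the memory controller enabled, even
	 * global reclaim now walks the hierarchy starting at the root
	 * memcg, so this returns false.
	 */
	return !mz->mem_cgroup;
}
==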
* Re: [PATCH] mm: vmscan: check mem cgroup over reclaimed 2012-01-23 10:47 ` Johannes Weiner @ 2012-01-23 12:30 ` Hillf Danton 2012-01-24 8:33 ` Johannes Weiner 0 siblings, 1 reply; 12+ messages in thread From: Hillf Danton @ 2012-01-23 12:30 UTC (permalink / raw) To: Johannes Weiner Cc: linux-mm, Michal Hocko, KAMEZAWA Hiroyuki, Ying Han, Hugh Dickins, Andrew Morton, LKML On Mon, Jan 23, 2012 at 6:47 PM, Johannes Weiner <hannes@cmpxchg.org> wrote: > On Mon, Jan 23, 2012 at 09:55:07AM +0800, Hillf Danton wrote: >> To avoid reduction in performance of reclaimee, checking overreclaim is added >> after shrinking lru list, when pages are reclaimed from mem cgroup. >> >> If over reclaim occurs, shrinking remaining lru lists is skipped, and no more >> reclaim for reclaim/compaction. >> >> Signed-off-by: Hillf Danton <dhillf@gmail.com> >> --- >> >> --- a/mm/vmscan.c Mon Jan 23 00:23:10 2012 >> +++ b/mm/vmscan.c Mon Jan 23 09:57:20 2012 >> @@ -2086,6 +2086,7 @@ static void shrink_mem_cgroup_zone(int p >> unsigned long nr_reclaimed, nr_scanned; >> unsigned long nr_to_reclaim = sc->nr_to_reclaim; >> struct blk_plug plug; >> + bool memcg_over_reclaimed = false; >> >> restart: >> nr_reclaimed = 0; >> @@ -2103,6 +2104,11 @@ restart: >> >> nr_reclaimed += shrink_list(lru, nr_to_scan, >> mz, sc, priority); >> + >> + memcg_over_reclaimed = !scanning_global_lru(mz) >> + && (nr_reclaimed >= nr_to_reclaim); >> + if (memcg_over_reclaimed) >> + goto out; > > Since this merge window, scanning_global_lru() is always false when > the memory controller is enabled, i.e. most common configurations and > distribution kernels. > > This will with quite likely have bad effects on zone balancing, > pressure balancing between anon/file lru etc, while you haven't shown > that any workloads actually benefit from this. > Hi Johannes Thanks for your comment, first. Impact on zone balance and lru-list balance is introduced actually, but I dont think the patch is totally responsible for the balance mentioned, because soft limit, embedded in mem cgroup, is setup by users according to whatever tastes they have. Though there is room for the patch to be fine tuned in this direction or that, over reclaim should not be neglected entirely, but be avoided as much as we could, or users are enforced to set up soft limit with much care not to mess up zone balance. Hillf -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mm: vmscan: check mem cgroup over reclaimed 2012-01-23 12:30 ` Hillf Danton @ 2012-01-24 8:33 ` Johannes Weiner 2012-01-24 9:08 ` KAMEZAWA Hiroyuki 0 siblings, 1 reply; 12+ messages in thread From: Johannes Weiner @ 2012-01-24 8:33 UTC (permalink / raw) To: Hillf Danton Cc: linux-mm, Michal Hocko, KAMEZAWA Hiroyuki, Ying Han, Hugh Dickins, Andrew Morton, LKML On Mon, Jan 23, 2012 at 08:30:42PM +0800, Hillf Danton wrote: > On Mon, Jan 23, 2012 at 6:47 PM, Johannes Weiner <hannes@cmpxchg.org> wrote: > > On Mon, Jan 23, 2012 at 09:55:07AM +0800, Hillf Danton wrote: > >> To avoid reduction in performance of reclaimee, checking overreclaim is added > >> after shrinking lru list, when pages are reclaimed from mem cgroup. > >> > >> If over reclaim occurs, shrinking remaining lru lists is skipped, and no more > >> reclaim for reclaim/compaction. > >> > >> Signed-off-by: Hillf Danton <dhillf@gmail.com> > >> --- > >> > >> --- a/mm/vmscan.c Mon Jan 23 00:23:10 2012 > >> +++ b/mm/vmscan.c Mon Jan 23 09:57:20 2012 > >> @@ -2086,6 +2086,7 @@ static void shrink_mem_cgroup_zone(int p > >> unsigned long nr_reclaimed, nr_scanned; > >> unsigned long nr_to_reclaim = sc->nr_to_reclaim; > >> struct blk_plug plug; > >> + bool memcg_over_reclaimed = false; > >> > >> restart: > >> nr_reclaimed = 0; > >> @@ -2103,6 +2104,11 @@ restart: > >> > >> nr_reclaimed += shrink_list(lru, nr_to_scan, > >> mz, sc, priority); > >> + > >> + memcg_over_reclaimed = !scanning_global_lru(mz) > >> + && (nr_reclaimed >= nr_to_reclaim); > >> + if (memcg_over_reclaimed) > >> + goto out; > > > > Since this merge window, scanning_global_lru() is always false when > > the memory controller is enabled, i.e. most common configurations and > > distribution kernels. > > > > This will with quite likely have bad effects on zone balancing, > > pressure balancing between anon/file lru etc, while you haven't shown > > that any workloads actually benefit from this. > > > Hi Johannes > > Thanks for your comment, first. > > Impact on zone balance and lru-list balance is introduced actually, but I > dont think the patch is totally responsible for the balance mentioned, > because soft limit, embedded in mem cgroup, is setup by users according to > whatever tastes they have. > > Though there is room for the patch to be fine tuned in this direction or that, > over reclaim should not be neglected entirely, but be avoided as much as we > could, or users are enforced to set up soft limit with much care not to mess > up zone balance. Overreclaim is absolutely horrible with soft limits, but I think there are more direct reasons than checking nr_to_reclaim only after a full zone scan, for example, soft limit reclaim is invoked on zones that are totally fine. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mm: vmscan: check mem cgroup over reclaimed

From: KAMEZAWA Hiroyuki @ 2012-01-24  9:08 UTC
To: Johannes Weiner
Cc: Hillf Danton, linux-mm, Michal Hocko, Ying Han, Hugh Dickins, Andrew Morton, LKML

On Tue, 24 Jan 2012 09:33:47 +0100
Johannes Weiner <hannes@cmpxchg.org> wrote:

> On Mon, Jan 23, 2012 at 08:30:42PM +0800, Hillf Danton wrote:
> > On Mon, Jan 23, 2012 at 6:47 PM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> > > On Mon, Jan 23, 2012 at 09:55:07AM +0800, Hillf Danton wrote:
> > >> To avoid reduction in performance of reclaimee, checking overreclaim is added
> > >> after shrinking lru list, when pages are reclaimed from mem cgroup.
> > >>
> > >> If over reclaim occurs, shrinking remaining lru lists is skipped, and no more
> > >> reclaim for reclaim/compaction.
> > >>
> > >> Signed-off-by: Hillf Danton <dhillf@gmail.com>
> > >> ---
> > >>
> > >> --- a/mm/vmscan.c	Mon Jan 23 00:23:10 2012
> > >> +++ b/mm/vmscan.c	Mon Jan 23 09:57:20 2012
> > >> @@ -2086,6 +2086,7 @@ static void shrink_mem_cgroup_zone(int p
> > >>  	unsigned long nr_reclaimed, nr_scanned;
> > >>  	unsigned long nr_to_reclaim = sc->nr_to_reclaim;
> > >>  	struct blk_plug plug;
> > >> +	bool memcg_over_reclaimed = false;
> > >>
> > >>  restart:
> > >>  	nr_reclaimed = 0;
> > >> @@ -2103,6 +2104,11 @@ restart:
> > >>
> > >>  			nr_reclaimed += shrink_list(lru, nr_to_scan,
> > >>  						    mz, sc, priority);
> > >> +
> > >> +			memcg_over_reclaimed = !scanning_global_lru(mz)
> > >> +					&& (nr_reclaimed >= nr_to_reclaim);
> > >> +			if (memcg_over_reclaimed)
> > >> +				goto out;
> > >
> > > Since this merge window, scanning_global_lru() is always false when
> > > the memory controller is enabled, i.e. most common configurations and
> > > distribution kernels.
> > >
> > > This will with quite likely have bad effects on zone balancing,
> > > pressure balancing between anon/file lru etc, while you haven't shown
> > > that any workloads actually benefit from this.
> > >
> > Hi Johannes
> >
> > Thanks for your comment, first.
> >
> > Impact on zone balance and lru-list balance is introduced actually, but I
> > dont think the patch is totally responsible for the balance mentioned,
> > because soft limit, embedded in mem cgroup, is setup by users according to
> > whatever tastes they have.
> >
> > Though there is room for the patch to be fine tuned in this direction or that,
> > over reclaim should not be neglected entirely, but be avoided as much as we
> > could, or users are enforced to set up soft limit with much care not to mess
> > up zone balance.
>
> Overreclaim is absolutely horrible with soft limits, but I think there
> are more direct reasons than checking nr_to_reclaim only after a full
> zone scan, for example, soft limit reclaim is invoked on zones that
> are totally fine.
>

IIUC..
 - Because zonelist is all visited by alloc_pages(), _all_ zones in zonelist
   are in memory shortage.
 - taking care of zone/node balancing.

I know this 'full zone scan' affects latency of alloc_pages() if the number
of node is big.
IMHO, in case of direct-reclaim caused by memcg's limit, we should avoid
full zone scan because the reclaim is not caused by any memory shortage
in the zonelist.

In case of global memory reclaim, kswapd doesn't use zonelist.

So, only global-direct-reclaim is a problem here.
I think do-full-zone-scan will reduce the calls of try_to_free_pages()
in future and may reduce lock contention, but adds too much penalty to a
thread.

In a typical case, considering 4-node x86/64 NUMA, a GFP_HIGHUSER_MOVABLE
allocation failure will reclaim 4*ZONE_NORMAL+ZONE_DMA32 = 160 pages per scan.

If 16-node, it will be 16*ZONE_NORMAL+ZONE_DMA32 = 544? pages per scan.

32 pages may be too small, but don't we need to have some threshold to quit
full-zone-scan?

Here, the topic is about softlimit reclaim. I think...

1. a follow-up for the following comment (*) is required.
==
			nr_soft_scanned = 0;
			nr_soft_reclaimed = mem_cgroup_soft_limit_reclaim(zone,
						sc->order, sc->gfp_mask,
						&nr_soft_scanned);
			sc->nr_reclaimed += nr_soft_reclaimed;
			sc->nr_scanned += nr_soft_scanned;
			/* need some check to avoid more shrink_zone() */ <----(*)
==

2. some threshold for avoiding full zone scan may be good.
   (But this may need deep discussion...)

3. About the patch, I think it will not break zone-balancing if (*) is
   handled in a good way.

   This check is not good.

+			memcg_over_reclaimed = !scanning_global_lru(mz)
+					&& (nr_reclaimed >= nr_to_reclaim);

   I like the following

   If (we-are-doing-softlimit-reclaim-for-global-direct-reclaim &&
       res_counter_soft_limit_excess(memcg->res))
	memcg_over_reclaimed = true;

   Then another memcg will be picked up and soft-limit-reclaim() will continue.

Thanks,
-Kame
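(A rough C rendering of the check suggested above, as a sketch only: sc_is_softlimit_reclaim() is a made-up placeholder for however "soft-limit reclaim on behalf of global direct reclaim" would actually be tracked in struct scan_control, and since struct mem_cgroup is private to mm/memcontrol.c, a real patch would need a small helper there rather than touching memcg->res directly from vmscan.c.)

==
	/*
	 * Hypothetical bail-out: give up on this memcg/zone early only
	 * when we got here via soft limit reclaim from global direct
	 * reclaim; mem_cgroup_soft_limit_reclaim() then picks the next
	 * memcg from its tree and reclaim continues there.
	 */
	if (sc_is_softlimit_reclaim(sc) &&		/* placeholder */
	    res_counter_soft_limit_excess(&memcg->res))
		memcg_over_reclaimed = true;
==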
* Re: [PATCH] mm: vmscan: check mem cgroup over reclaimed 2012-01-24 9:08 ` KAMEZAWA Hiroyuki @ 2012-01-24 23:33 ` Ying Han 2012-01-26 9:16 ` KAMEZAWA Hiroyuki 0 siblings, 1 reply; 12+ messages in thread From: Ying Han @ 2012-01-24 23:33 UTC (permalink / raw) To: KAMEZAWA Hiroyuki Cc: Johannes Weiner, Hillf Danton, linux-mm, Michal Hocko, Hugh Dickins, Andrew Morton, LKML On Tue, Jan 24, 2012 at 1:08 AM, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote: > On Tue, 24 Jan 2012 09:33:47 +0100 > Johannes Weiner <hannes@cmpxchg.org> wrote: > >> On Mon, Jan 23, 2012 at 08:30:42PM +0800, Hillf Danton wrote: >> > On Mon, Jan 23, 2012 at 6:47 PM, Johannes Weiner <hannes@cmpxchg.org> wrote: >> > > On Mon, Jan 23, 2012 at 09:55:07AM +0800, Hillf Danton wrote: >> > >> To avoid reduction in performance of reclaimee, checking overreclaim is added >> > >> after shrinking lru list, when pages are reclaimed from mem cgroup. >> > >> >> > >> If over reclaim occurs, shrinking remaining lru lists is skipped, and no more >> > >> reclaim for reclaim/compaction. >> > >> >> > >> Signed-off-by: Hillf Danton <dhillf@gmail.com> >> > >> --- >> > >> >> > >> --- a/mm/vmscan.c Mon Jan 23 00:23:10 2012 >> > >> +++ b/mm/vmscan.c Mon Jan 23 09:57:20 2012 >> > >> @@ -2086,6 +2086,7 @@ static void shrink_mem_cgroup_zone(int p >> > >> unsigned long nr_reclaimed, nr_scanned; >> > >> unsigned long nr_to_reclaim = sc->nr_to_reclaim; >> > >> struct blk_plug plug; >> > >> + bool memcg_over_reclaimed = false; >> > >> >> > >> restart: >> > >> nr_reclaimed = 0; >> > >> @@ -2103,6 +2104,11 @@ restart: >> > >> >> > >> nr_reclaimed += shrink_list(lru, nr_to_scan, >> > >> mz, sc, priority); >> > >> + >> > >> + memcg_over_reclaimed = !scanning_global_lru(mz) >> > >> + && (nr_reclaimed >= nr_to_reclaim); >> > >> + if (memcg_over_reclaimed) >> > >> + goto out; >> > > >> > > Since this merge window, scanning_global_lru() is always false when >> > > the memory controller is enabled, i.e. most common configurations and >> > > distribution kernels. >> > > >> > > This will with quite likely have bad effects on zone balancing, >> > > pressure balancing between anon/file lru etc, while you haven't shown >> > > that any workloads actually benefit from this. >> > > >> > Hi Johannes >> > >> > Thanks for your comment, first. >> > >> > Impact on zone balance and lru-list balance is introduced actually, but I >> > dont think the patch is totally responsible for the balance mentioned, >> > because soft limit, embedded in mem cgroup, is setup by users according to >> > whatever tastes they have. >> > >> > Though there is room for the patch to be fine tuned in this direction or that, >> > over reclaim should not be neglected entirely, but be avoided as much as we >> > could, or users are enforced to set up soft limit with much care not to mess >> > up zone balance. >> >> Overreclaim is absolutely horrible with soft limits, but I think there >> are more direct reasons than checking nr_to_reclaim only after a full >> zone scan, for example, soft limit reclaim is invoked on zones that >> are totally fine. >> > > > IIUC.. > - Because zonelist is all visited by alloc_pages(), _all_ zones in zonelist > are in memory shortage. > - taking care of zone/node balancing. > > I know this 'full zone scan' affects latency of alloc_pages() if the number > of node is big. > > IMHO, in case of direct-reclaim caused by memcg's limit, we should avoid > full zone scan because the reclaim is not caused by any memory shortage in zonelist. 
> > In case of global memory reclaim, kswapd doesn't use zonelist. > > So, only global-direct-reclaim is a problem here. > I think do-full-zone-scan will reduce the calls of try_to_free_pages() > in future and may reduce lock contention but adds a thread too much > penalty. > In typical case, considering 4-node x86/64 NUMA, GFP_HIGHUSER_MOVABLE > allocation failure will reclaim 4*ZONE_NORMAL+ZONE_DMA32 = 160pages per scan. > > If 16-node, it will be 16*ZONE_NORMAL+ZONE_DMA32 = 544? pages per scan. > > 32pages may be too small but don't we need to have some threshold to quit > full-zone-scan ? Sorry I am confused. Are we talking about doing full zonelist scanning within a memcg or doing anon/file lru balance within a zone? AFAIU, it is the later one. In this patch, we do early breakout (memcg_over_reclaimed) without finish scanning other lrus per-memcg-per-zone. I think the concern is what is the side effect of that ? > Here, the topic is about softlimit reclaim. I think... > > 1. follow up for following comment(*) is required. > == > nr_soft_scanned = 0; > nr_soft_reclaimed = mem_cgroup_soft_limit_reclaim(zone, > sc->order, sc->gfp_mask, > &nr_soft_scanned); > sc->nr_reclaimed += nr_soft_reclaimed; > sc->nr_scanned += nr_soft_scanned; > /* need some check for avoid more shrink_zone() */ <----(*) > == > > 2. some threshold for avoinding full zone scan may be good. > (But this may need deep discussion...) > > 3. About the patch, I think it will not break zone-balancing if (*) is > handled in a good way. > > This check is not good. > > + memcg_over_reclaimed = !scanning_global_lru(mz) > + && (nr_reclaimed >= nr_to_reclaim); > > > I like following > > If (we-are-doing-softlimit-reclaim-for-global-direct-reclaim && > res_counter_soft_limit_excess(memcg->res)) > memcg_over_reclaimed = true; This condition looks quite similar to what we've discussed on another thread, except that we do allow over-reclaim under softlimit after certain priority loop. (assume we have hard-to-reclaim memory on other cgroups above their softlimit) There are some works needed to be done ( like reverting the rb-tree ) on current soft limit implementation before we can even further to optimize it. It would be nice to settle the first part before everything else. --Ying > Then another memcg will be picked up and soft-limit-reclaim() will continue. > > Thanks, > -Kame > > > > > > > > > > > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mm: vmscan: check mem cgroup over reclaimed 2012-01-24 23:33 ` Ying Han @ 2012-01-26 9:16 ` KAMEZAWA Hiroyuki 0 siblings, 0 replies; 12+ messages in thread From: KAMEZAWA Hiroyuki @ 2012-01-26 9:16 UTC (permalink / raw) To: Ying Han Cc: Johannes Weiner, Hillf Danton, linux-mm, Michal Hocko, Hugh Dickins, Andrew Morton, LKML On Tue, 24 Jan 2012 15:33:11 -0800 Ying Han <yinghan@google.com> wrote: > On Tue, Jan 24, 2012 at 1:08 AM, KAMEZAWA Hiroyuki > <kamezawa.hiroyu@jp.fujitsu.com> wrote: > > On Tue, 24 Jan 2012 09:33:47 +0100 > > Johannes Weiner <hannes@cmpxchg.org> wrote: > > > >> On Mon, Jan 23, 2012 at 08:30:42PM +0800, Hillf Danton wrote: > >> > On Mon, Jan 23, 2012 at 6:47 PM, Johannes Weiner <hannes@cmpxchg.org> wrote: > >> > > On Mon, Jan 23, 2012 at 09:55:07AM +0800, Hillf Danton wrote: > >> > >> To avoid reduction in performance of reclaimee, checking overreclaim is added > >> > >> after shrinking lru list, when pages are reclaimed from mem cgroup. > >> > >> > >> > >> If over reclaim occurs, shrinking remaining lru lists is skipped, and no more > >> > >> reclaim for reclaim/compaction. > >> > >> > >> > >> Signed-off-by: Hillf Danton <dhillf@gmail.com> > >> > >> --- > >> > >> > >> > >> --- a/mm/vmscan.c A A Mon Jan 23 00:23:10 2012 > >> > >> +++ b/mm/vmscan.c A A Mon Jan 23 09:57:20 2012 > >> > >> @@ -2086,6 +2086,7 @@ static void shrink_mem_cgroup_zone(int p > >> > >> A A A unsigned long nr_reclaimed, nr_scanned; > >> > >> A A A unsigned long nr_to_reclaim = sc->nr_to_reclaim; > >> > >> A A A struct blk_plug plug; > >> > >> + A A bool memcg_over_reclaimed = false; > >> > >> > >> > >> A restart: > >> > >> A A A nr_reclaimed = 0; > >> > >> @@ -2103,6 +2104,11 @@ restart: > >> > >> > >> > >> A A A A A A A A A A A A A A A nr_reclaimed += shrink_list(lru, nr_to_scan, > >> > >> A A A A A A A A A A A A A A A A A A A A A A A A A A A A A mz, sc, priority); > >> > >> + > >> > >> + A A A A A A A A A A A A A A memcg_over_reclaimed = !scanning_global_lru(mz) > >> > >> + A A A A A A A A A A A A A A A A A A && (nr_reclaimed >= nr_to_reclaim); > >> > >> + A A A A A A A A A A A A A A if (memcg_over_reclaimed) > >> > >> + A A A A A A A A A A A A A A A A A A goto out; > >> > > > >> > > Since this merge window, scanning_global_lru() is always false when > >> > > the memory controller is enabled, i.e. most common configurations and > >> > > distribution kernels. > >> > > > >> > > This will with quite likely have bad effects on zone balancing, > >> > > pressure balancing between anon/file lru etc, while you haven't shown > >> > > that any workloads actually benefit from this. > >> > > > >> > Hi Johannes > >> > > >> > Thanks for your comment, first. > >> > > >> > Impact on zone balance and lru-list balance is introduced actually, but I > >> > dont think the patch is totally responsible for the balance mentioned, > >> > because soft limit, embedded in mem cgroup, is setup by users according to > >> > whatever tastes they have. > >> > > >> > Though there is room for the patch to be fine tuned in this direction or that, > >> > over reclaim should not be neglected entirely, but be avoided as much as we > >> > could, or users are enforced to set up soft limit with much care not to mess > >> > up zone balance. > >> > >> Overreclaim is absolutely horrible with soft limits, but I think there > >> are more direct reasons than checking nr_to_reclaim only after a full > >> zone scan, for example, soft limit reclaim is invoked on zones that > >> are totally fine. > >> > > > > > > IIUC.. 
> > A - Because zonelist is all visited by alloc_pages(), _all_ zones in zonelist > > A are in memory shortage. > > A - taking care of zone/node balancing. > > > > I know this 'full zone scan' affects latency of alloc_pages() if the number > > of node is big. > > > > > IMHO, in case of direct-reclaim caused by memcg's limit, we should avoid > > full zone scan because the reclaim is not caused by any memory shortage in zonelist. > > This text is talking about memcg's direct reclaim scanning caused by 'limit'. > > In case of global memory reclaim, kswapd doesn't use zonelist. > > > > So, only global-direct-reclaim is a problem here. > > I think do-full-zone-scan will reduce the calls of try_to_free_pages() > > in future and may reduce lock contention but adds a thread too much > > penalty. > > > In typical case, considering 4-node x86/64 NUMA, GFP_HIGHUSER_MOVABLE > > allocation failure will reclaim 4*ZONE_NORMAL+ZONE_DMA32 = 160pages per scan. > > > > If 16-node, it will be 16*ZONE_NORMAL+ZONE_DMA32 = 544? pages per scan. > > > > 32pages may be too small but don't we need to have some threshold to quit > > full-zone-scan ? > > Sorry I am confused. Are we talking about doing full zonelist scanning > within a memcg or doing anon/file lru balance within a zone? AFAIU, it > is the later one. > I'm sorry for confusing. Above test is talking about global lru scanning, not memcg related. > In this patch, we do early breakout (memcg_over_reclaimed) without > finish scanning other lrus per-memcg-per-zone. I think the concern is > what is the side effect of that ? > > > Here, the topic is about softlimit reclaim. I think... > > > > 1. follow up for following comment(*) is required. > > == > > A A A A A A A A A A A A nr_soft_scanned = 0; > > A A A A A A A A A A A A nr_soft_reclaimed = mem_cgroup_soft_limit_reclaim(zone, > > A A A A A A A A A A A A A A A A A A A A A A A A sc->order, sc->gfp_mask, > > A A A A A A A A A A A A A A A A A A A A A A A A &nr_soft_scanned); > > A A A A A A A A A A A A sc->nr_reclaimed += nr_soft_reclaimed; > > A A A A A A A A A A A A sc->nr_scanned += nr_soft_scanned; > > A A A A A A A A A A A A /* need some check for avoid more shrink_zone() */ <----(*) > > == > > > > 2. some threshold for avoinding full zone scan may be good. > > A (But this may need deep discussion...) > > > > 3. About the patch, I think it will not break zone-balancing if (*) is > > A handled in a good way. > > > > A This check is not good. > > > > + A A A A A A A A A A A A A A A memcg_over_reclaimed = !scanning_global_lru(mz) > > + A A A A A A A A A A A A A A A A A A A && (nr_reclaimed >= nr_to_reclaim); > > > > > > A I like following > > > > A If (we-are-doing-softlimit-reclaim-for-global-direct-reclaim && > > A A A res_counter_soft_limit_excess(memcg->res)) > > A A A memcg_over_reclaimed = true; > > This condition looks quite similar to what we've discussed on another > thread, except that we do allow over-reclaim under softlimit after > certain priority loop. (assume we have hard-to-reclaim memory on other > cgroups above their softlimit) > yes. I've cut this from that thread. > There are some works needed to be done ( like reverting the rb-tree ) > on current soft limit implementation before we can even further to > optimize it. It would be nice to settle the first part before > everything else. > Agreed. I personally think Johannes' clean up should go first and removing rb-tree before optimization is better. 
Thanks,
-Kame
* Re: [PATCH] mm: vmscan: check mem cgroup over reclaimed 2012-01-23 1:55 [PATCH] mm: vmscan: check mem cgroup over reclaimed Hillf Danton 2012-01-23 10:47 ` Johannes Weiner @ 2012-01-23 19:04 ` Ying Han 2012-01-24 3:45 ` Hillf Danton 1 sibling, 1 reply; 12+ messages in thread From: Ying Han @ 2012-01-23 19:04 UTC (permalink / raw) To: Hillf Danton Cc: linux-mm, Michal Hocko, KAMEZAWA Hiroyuki, Hugh Dickins, Andrew Morton, LKML On Sun, Jan 22, 2012 at 5:55 PM, Hillf Danton <dhillf@gmail.com> wrote: > To avoid reduction in performance of reclaimee, checking overreclaim is added > after shrinking lru list, when pages are reclaimed from mem cgroup. > > If over reclaim occurs, shrinking remaining lru lists is skipped, and no more > reclaim for reclaim/compaction. > > Signed-off-by: Hillf Danton <dhillf@gmail.com> > --- > > --- a/mm/vmscan.c Mon Jan 23 00:23:10 2012 > +++ b/mm/vmscan.c Mon Jan 23 09:57:20 2012 > @@ -2086,6 +2086,7 @@ static void shrink_mem_cgroup_zone(int p > unsigned long nr_reclaimed, nr_scanned; > unsigned long nr_to_reclaim = sc->nr_to_reclaim; > struct blk_plug plug; > + bool memcg_over_reclaimed = false; > > restart: > nr_reclaimed = 0; > @@ -2103,6 +2104,11 @@ restart: > > nr_reclaimed += shrink_list(lru, nr_to_scan, > mz, sc, priority); > + > + memcg_over_reclaimed = !scanning_global_lru(mz) > + && (nr_reclaimed >= nr_to_reclaim); > + if (memcg_over_reclaimed) > + goto out; Why we need the change here? Do we have number to demonstrate? > } > } > /* > @@ -2116,6 +2122,7 @@ restart: > if (nr_reclaimed >= nr_to_reclaim && priority < DEF_PRIORITY) > break; > } > +out: > blk_finish_plug(&plug); > sc->nr_reclaimed += nr_reclaimed; > > @@ -2127,7 +2134,8 @@ restart: > shrink_active_list(SWAP_CLUSTER_MAX, mz, sc, priority, 0); > > /* reclaim/compaction might need reclaim to continue */ > - if (should_continue_reclaim(mz, nr_reclaimed, > + if (!memcg_over_reclaimed && > + should_continue_reclaim(mz, nr_reclaimed, > sc->nr_scanned - nr_scanned, sc)) This changes the existing logic. What if the nr_reclaimed is greater than nr_to_reclaim, but smaller than pages_for_compaction? The existing logic is to continue reclaiming. --Ying > goto restart; -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 12+ messages in thread
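(For reference, the should_continue_reclaim() behaviour Ying refers to is, roughly, the check below -- paraphrased from mm/vmscan.c of that era, with the surrounding details trimmed. It is what keeps reclaim going past nr_to_reclaim while compaction still needs more order-0 pages.)

==
	/*
	 * Paraphrased: for reclaim/compaction, keep reclaiming until at
	 * least 2^(order+1) base pages have been freed, provided enough
	 * inactive pages remain to make progress -- even if the
	 * nr_to_reclaim target has already been met.
	 */
	pages_for_compaction = (2UL << sc->order);
	if (nr_reclaimed < pages_for_compaction &&
	    inactive_lru_pages > pages_for_compaction)
		return true;	/* continue reclaiming */
==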
* Re: [PATCH] mm: vmscan: check mem cgroup over reclaimed 2012-01-23 19:04 ` Ying Han @ 2012-01-24 3:45 ` Hillf Danton 2012-01-24 23:22 ` Ying Han 0 siblings, 1 reply; 12+ messages in thread From: Hillf Danton @ 2012-01-24 3:45 UTC (permalink / raw) To: Ying Han Cc: linux-mm, Michal Hocko, KAMEZAWA Hiroyuki, Hugh Dickins, Andrew Morton, LKML Hi all On Tue, Jan 24, 2012 at 3:04 AM, Ying Han <yinghan@google.com> wrote: > On Sun, Jan 22, 2012 at 5:55 PM, Hillf Danton <dhillf@gmail.com> wrote: >> To avoid reduction in performance of reclaimee, checking overreclaim is added >> after shrinking lru list, when pages are reclaimed from mem cgroup. >> >> If over reclaim occurs, shrinking remaining lru lists is skipped, and no more >> reclaim for reclaim/compaction. >> >> Signed-off-by: Hillf Danton <dhillf@gmail.com> >> --- >> >> --- a/mm/vmscan.c Mon Jan 23 00:23:10 2012 >> +++ b/mm/vmscan.c Mon Jan 23 09:57:20 2012 >> @@ -2086,6 +2086,7 @@ static void shrink_mem_cgroup_zone(int p >> unsigned long nr_reclaimed, nr_scanned; >> unsigned long nr_to_reclaim = sc->nr_to_reclaim; >> struct blk_plug plug; >> + bool memcg_over_reclaimed = false; >> >> restart: >> nr_reclaimed = 0; >> @@ -2103,6 +2104,11 @@ restart: >> >> nr_reclaimed += shrink_list(lru, nr_to_scan, >> mz, sc, priority); >> + >> + memcg_over_reclaimed = !scanning_global_lru(mz) >> + && (nr_reclaimed >= nr_to_reclaim); >> + if (memcg_over_reclaimed) >> + goto out; > > Why we need the change here? Do we have number to demonstrate? See below please 8-) > > >> } >> } >> /* >> @@ -2116,6 +2122,7 @@ restart: >> if (nr_reclaimed >= nr_to_reclaim && priority < DEF_PRIORITY) >> break; >> } >> +out: >> blk_finish_plug(&plug); >> sc->nr_reclaimed += nr_reclaimed; >> >> @@ -2127,7 +2134,8 @@ restart: >> shrink_active_list(SWAP_CLUSTER_MAX, mz, sc, priority, 0); >> >> /* reclaim/compaction might need reclaim to continue */ >> - if (should_continue_reclaim(mz, nr_reclaimed, >> + if (!memcg_over_reclaimed && >> + should_continue_reclaim(mz, nr_reclaimed, >> sc->nr_scanned - nr_scanned, sc)) > > This changes the existing logic. What if the nr_reclaimed is greater > than nr_to_reclaim, but smaller than pages_for_compaction? The > existing logic is to continue reclaiming. > With soft limit available, what if nr_to_reclaim set to be the number of pages exceeding soft limit? With over reclaim abused, what are the targets of soft limit? Thanks Hillf -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mm: vmscan: check mem cgroup over reclaimed 2012-01-24 3:45 ` Hillf Danton @ 2012-01-24 23:22 ` Ying Han 2012-01-25 1:47 ` Hillf Danton 0 siblings, 1 reply; 12+ messages in thread From: Ying Han @ 2012-01-24 23:22 UTC (permalink / raw) To: Hillf Danton Cc: linux-mm, Michal Hocko, KAMEZAWA Hiroyuki, Hugh Dickins, Andrew Morton, LKML On Mon, Jan 23, 2012 at 7:45 PM, Hillf Danton <dhillf@gmail.com> wrote: > Hi all > > On Tue, Jan 24, 2012 at 3:04 AM, Ying Han <yinghan@google.com> wrote: >> On Sun, Jan 22, 2012 at 5:55 PM, Hillf Danton <dhillf@gmail.com> wrote: >>> To avoid reduction in performance of reclaimee, checking overreclaim is added >>> after shrinking lru list, when pages are reclaimed from mem cgroup. >>> >>> If over reclaim occurs, shrinking remaining lru lists is skipped, and no more >>> reclaim for reclaim/compaction. >>> >>> Signed-off-by: Hillf Danton <dhillf@gmail.com> >>> --- >>> >>> --- a/mm/vmscan.c Mon Jan 23 00:23:10 2012 >>> +++ b/mm/vmscan.c Mon Jan 23 09:57:20 2012 >>> @@ -2086,6 +2086,7 @@ static void shrink_mem_cgroup_zone(int p >>> unsigned long nr_reclaimed, nr_scanned; >>> unsigned long nr_to_reclaim = sc->nr_to_reclaim; >>> struct blk_plug plug; >>> + bool memcg_over_reclaimed = false; >>> >>> restart: >>> nr_reclaimed = 0; >>> @@ -2103,6 +2104,11 @@ restart: >>> >>> nr_reclaimed += shrink_list(lru, nr_to_scan, >>> mz, sc, priority); >>> + >>> + memcg_over_reclaimed = !scanning_global_lru(mz) >>> + && (nr_reclaimed >= nr_to_reclaim); >>> + if (memcg_over_reclaimed) >>> + goto out; >> >> Why we need the change here? Do we have number to demonstrate? > > See below please 8-) > >> >> >>> } >>> } >>> /* >>> @@ -2116,6 +2122,7 @@ restart: >>> if (nr_reclaimed >= nr_to_reclaim && priority < DEF_PRIORITY) >>> break; >>> } >>> +out: >>> blk_finish_plug(&plug); >>> sc->nr_reclaimed += nr_reclaimed; >>> >>> @@ -2127,7 +2134,8 @@ restart: >>> shrink_active_list(SWAP_CLUSTER_MAX, mz, sc, priority, 0); >>> >>> /* reclaim/compaction might need reclaim to continue */ >>> - if (should_continue_reclaim(mz, nr_reclaimed, >>> + if (!memcg_over_reclaimed && >>> + should_continue_reclaim(mz, nr_reclaimed, >>> sc->nr_scanned - nr_scanned, sc)) >> >> This changes the existing logic. What if the nr_reclaimed is greater >> than nr_to_reclaim, but smaller than pages_for_compaction? The >> existing logic is to continue reclaiming. >> > With soft limit available, what if nr_to_reclaim set to be the number of > pages exceeding soft limit? With over reclaim abused, what are the targets > of soft limit? The nr_to_reclaim is set to SWAP_CLUSTER_MAX (32) for direct reclaim and ULONG_MAX for background reclaim. Not sure we can set it, but it is possible the res_counter_soft_limit_excess equal to that target value. The current soft limit mechanism provides a clue of WHERE to reclaim pages when there is memory pressure, it doesn't change the reclaim target as it was before. Overreclaim a cgroup under its softlimit is bad, but we should be careful not introducing side effect before providing the guarantee. Here, the should_continue_reclaim() has logic of freeing a bit more order-0 pages for compaction. The logic got changed after this. --Ying > Thanks > Hillf -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
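(The nr_to_reclaim figures quoted above come from the scan_control setup at the two reclaim entry points; abridged sketches below, with all other fields omitted.)

==
	/* direct reclaim: try_to_free_pages() */
	struct scan_control sc = {
		.nr_to_reclaim	= SWAP_CLUSTER_MAX,	/* 32 pages */
	};

	/* background reclaim: kswapd's balance_pgdat() */
	struct scan_control sc = {
		.nr_to_reclaim	= ULONG_MAX,		/* no per-call cap */
	};
==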
* Re: [PATCH] mm: vmscan: check mem cgroup over reclaimed 2012-01-24 23:22 ` Ying Han @ 2012-01-25 1:47 ` Hillf Danton 2012-01-25 19:20 ` Ying Han 0 siblings, 1 reply; 12+ messages in thread From: Hillf Danton @ 2012-01-25 1:47 UTC (permalink / raw) To: Ying Han Cc: linux-mm, Michal Hocko, KAMEZAWA Hiroyuki, Hugh Dickins, Andrew Morton, LKML On Wed, Jan 25, 2012 at 7:22 AM, Ying Han <yinghan@google.com> wrote: > On Mon, Jan 23, 2012 at 7:45 PM, Hillf Danton <dhillf@gmail.com> wrote: >> With soft limit available, what if nr_to_reclaim set to be the number of >> pages exceeding soft limit? With over reclaim abused, what are the targets >> of soft limit? > > The nr_to_reclaim is set to SWAP_CLUSTER_MAX (32) for direct reclaim > and ULONG_MAX for background reclaim. Not sure we can set it, but it > is possible the res_counter_soft_limit_excess equal to that target > value. The current soft limit mechanism provides a clue of WHERE to > reclaim pages when there is memory pressure, it doesn't change the > reclaim target as it was before. > Decrement in sc->nr_to_reclaim was tried in another patch, you already saw it. > Overreclaim a cgroup under its softlimit is bad, but we should be > careful not introducing side effect before providing the guarantee. Yes 8-) > Here, the should_continue_reclaim() has logic of freeing a bit more > order-0 pages for compaction. The logic got changed after this. > Compaction is to increase the successful rate of THP allocation, and in turn to back up higher performance. In soft limit, performance guarantee is not extra request but treated with less care. Which one you prefer, compaction or guarantee? Thanks Hillf -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mm: vmscan: check mem cgroup over reclaimed 2012-01-25 1:47 ` Hillf Danton @ 2012-01-25 19:20 ` Ying Han 0 siblings, 0 replies; 12+ messages in thread From: Ying Han @ 2012-01-25 19:20 UTC (permalink / raw) To: Hillf Danton Cc: linux-mm, Michal Hocko, KAMEZAWA Hiroyuki, Hugh Dickins, Andrew Morton, LKML On Tue, Jan 24, 2012 at 5:47 PM, Hillf Danton <dhillf@gmail.com> wrote: > On Wed, Jan 25, 2012 at 7:22 AM, Ying Han <yinghan@google.com> wrote: >> On Mon, Jan 23, 2012 at 7:45 PM, Hillf Danton <dhillf@gmail.com> wrote: >>> With soft limit available, what if nr_to_reclaim set to be the number of >>> pages exceeding soft limit? With over reclaim abused, what are the targets >>> of soft limit? >> >> The nr_to_reclaim is set to SWAP_CLUSTER_MAX (32) for direct reclaim >> and ULONG_MAX for background reclaim. Not sure we can set it, but it >> is possible the res_counter_soft_limit_excess equal to that target >> value. The current soft limit mechanism provides a clue of WHERE to >> reclaim pages when there is memory pressure, it doesn't change the >> reclaim target as it was before. >> > > Decrement in sc->nr_to_reclaim was tried in another patch, you already saw it. > >> Overreclaim a cgroup under its softlimit is bad, but we should be >> careful not introducing side effect before providing the guarantee. > > Yes 8-) > >> Here, the should_continue_reclaim() has logic of freeing a bit more >> order-0 pages for compaction. The logic got changed after this. >> > > Compaction is to increase the successful rate of THP allocation, and in turn > to back up higher performance. In soft limit, performance guarantee is not > extra request but treated with less care. > > Which one you prefer, compaction or guarantee? The compaction is something we already supporting, while the softlimit implementation is a new design. I would say that we need to guarantee no regression introduced by any new code. --Ying > Thanks > Hillf -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 12+ messages in thread