* [PATCH -repost] memcg,vmscan: do not break out targeted reclaim without reclaimed pages @ 2013-01-03 18:09 Michal Hocko 2013-01-03 20:24 ` Andrew Morton 2013-01-29 8:51 ` [PATCH] mmotm: memcgvmscan-do-not-break-out-targeted-reclaim-without-reclaimed-pages.patch fix Michal Hocko 0 siblings, 2 replies; 6+ messages in thread From: Michal Hocko @ 2013-01-03 18:09 UTC (permalink / raw) To: linux-mm Cc: linux-kernel, KAMEZAWA Hiroyuki, Johannes Weiner, Tejun Heo, Glauber Costa, Li Zefan, Andrew Morton Hi, I have posted this quite some time ago (https://lkml.org/lkml/2012/12/14/102) but it probably slipped through --- ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH -repost] memcg,vmscan: do not break out targeted reclaim without reclaimed pages 2013-01-03 18:09 [PATCH -repost] memcg,vmscan: do not break out targeted reclaim without reclaimed pages Michal Hocko @ 2013-01-03 20:24 ` Andrew Morton 2013-01-04 14:52 ` Michal Hocko 2013-01-29 8:51 ` [PATCH] mmotm: memcgvmscan-do-not-break-out-targeted-reclaim-without-reclaimed-pages.patch fix Michal Hocko 1 sibling, 1 reply; 6+ messages in thread From: Andrew Morton @ 2013-01-03 20:24 UTC (permalink / raw) To: Michal Hocko Cc: linux-mm, linux-kernel, KAMEZAWA Hiroyuki, Johannes Weiner, Tejun Heo, Glauber Costa, Li Zefan On Thu, 3 Jan 2013 19:09:01 +0100 Michal Hocko <mhocko@suse.cz> wrote: > Hi, > I have posted this quite some time ago > (https://lkml.org/lkml/2012/12/14/102) but it probably slipped through > --- > >From 28b4e10bc3c18b82bee695b76f4bf25c03baa5f8 Mon Sep 17 00:00:00 2001 > From: Michal Hocko <mhocko@suse.cz> > Date: Fri, 14 Dec 2012 11:12:43 +0100 > Subject: [PATCH] memcg,vmscan: do not break out targeted reclaim without > reclaimed pages > > Targeted (hard resp. soft) reclaim has traditionally tried to scan one > group with decreasing priority until nr_to_reclaim (SWAP_CLUSTER_MAX > pages) is reclaimed or all priorities are exhausted. The reclaim is > then retried until the limit is met. > > This approach, however, doesn't work well with deeper hierarchies where > groups higher in the hierarchy do not have any or only very few pages > (this usually happens if those groups do not have any tasks and they > have only re-parented pages after some of their children is removed). > Those groups are reclaimed with decreasing priority pointlessly as there > is nothing to reclaim from them. > > An easiest fix is to break out of the memcg iteration loop in shrink_zone > only if the whole hierarchy has been visited or sufficient pages have > been reclaimed. This is also more natural because the reclaimer expects > that the hierarchy under the given root is reclaimed. As a result we can > simplify the soft limit reclaim which does its own iteration. > > Reported-by: Ying Han <yinghan@google.com> But what was in that report? My guess would be "excessive CPU consumption", and perhaps "excessive reclaim in the higher-level memcgs". IOW, what are the user-visible effects of this change? (And congrats - you're the first person I've sent that sentence to this year! But not, I fear, the last) I don't really understand what prevents limit reclaim from stealing lots of pages from the top-level groups. How do we ensure balancing/fairness in this case? > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -1973,18 +1973,17 @@ static void shrink_zone(struct zone *zone, struct scan_control *sc) shrink_zone() might be getting a bit bloaty for CONFIG_MEMCG=n kernels. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH -repost] memcg,vmscan: do not break out targeted reclaim without reclaimed pages 2013-01-03 20:24 ` Andrew Morton @ 2013-01-04 14:52 ` Michal Hocko 0 siblings, 0 replies; 6+ messages in thread From: Michal Hocko @ 2013-01-04 14:52 UTC (permalink / raw) To: Andrew Morton Cc: linux-mm, linux-kernel, KAMEZAWA Hiroyuki, Johannes Weiner, Tejun Heo, Glauber Costa, Li Zefan On Thu 03-01-13 12:24:04, Andrew Morton wrote: > On Thu, 3 Jan 2013 19:09:01 +0100 > Michal Hocko <mhocko@suse.cz> wrote: > > > Hi, > > I have posted this quite some time ago > > (https://lkml.org/lkml/2012/12/14/102) but it probably slipped through > > --- > > >From 28b4e10bc3c18b82bee695b76f4bf25c03baa5f8 Mon Sep 17 00:00:00 2001 > > From: Michal Hocko <mhocko@suse.cz> > > Date: Fri, 14 Dec 2012 11:12:43 +0100 > > Subject: [PATCH] memcg,vmscan: do not break out targeted reclaim without > > reclaimed pages > > > > Targeted (hard resp. soft) reclaim has traditionally tried to scan one > > group with decreasing priority until nr_to_reclaim (SWAP_CLUSTER_MAX > > pages) is reclaimed or all priorities are exhausted. The reclaim is > > then retried until the limit is met. > > > > This approach, however, doesn't work well with deeper hierarchies where > > groups higher in the hierarchy do not have any or only very few pages > > (this usually happens if those groups do not have any tasks and they > > have only re-parented pages after some of their children is removed). > > Those groups are reclaimed with decreasing priority pointlessly as there > > is nothing to reclaim from them. > > > > An easiest fix is to break out of the memcg iteration loop in shrink_zone > > only if the whole hierarchy has been visited or sufficient pages have > > been reclaimed. This is also more natural because the reclaimer expects > > that the hierarchy under the given root is reclaimed. As a result we can > > simplify the soft limit reclaim which does its own iteration. > > > > Reported-by: Ying Han <yinghan@google.com> > > But what was in that report? Well, Ying was complaining that targeted reclaim differs a lot from the global one regarding iteration because we reclaim each group in at the same priority because falling to lower one which targeted reclaim hammers one group for each zone and priority first before it visits a next group. More on that here: https://lkml.org/lkml/2012/12/13/712 > My guess would be "excessive CPU consumption", and perhaps "excessive > reclaim in the higher-level memcgs". > IOW, what are the user-visible effects of this change? I would expect a smaller CPU consumption because save a lot of per-zone-in-zonelist * per-prio loops without any useful work to do with deeper hierarchies. I haven't measured that though. I do not expect the win would be huge because shrink_lruvec should be mostly noop for groups without any pages (after Johannes recent changes in that area). The patch is more trying to make it more consistent with the global reclaim. > (And congrats - you're the first person I've sent that sentence to this > year! But not, I fear, the last) > > I don't really understand what prevents limit reclaim from stealing > lots of pages from the top-level groups. How do we ensure > balancing/fairness in this case? All groups in a hierarchy (with root_mem_cgroup for the global reclaim) are visited in the round robin fashion (per-node per-zone per-priority). Last visited group is cached in iterator (mem_cgroup_per_zone::mem_cgroup_reclaim_iter) and the following reclaim will start with the next one. Over reclaim is not an issue because of nr_to_reclaim checks both up the way in do_try_to_free_pages and shrink_lruvec which make sure we back off after SWAP_CLUSTER_MAX have been reclaimed. > > --- a/mm/vmscan.c > > +++ b/mm/vmscan.c > > @@ -1973,18 +1973,17 @@ static void shrink_zone(struct zone *zone, struct scan_control *sc) > > shrink_zone() might be getting a bit bloaty for CONFIG_MEMCG=n kernels. Most of the code should be compiled out because global_reclaim is compile time true and mem_cgroup_iter returns NULL right away. So all we do is we waste mem_cgroup_reclaim_cookie on the stack but I guess gcc should be able to optimize that one out as well (mine does that). If you are talking about uncompiled code then yes it is getting messy. Maybe we want to have something like mem_cgroup_shrink_zone and keep only: do { mem_cgroup_shrink_zone(); } while (should_continue_reclaim(...)); here. Dunno if that is a huge win though. -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] mmotm: memcgvmscan-do-not-break-out-targeted-reclaim-without-reclaimed-pages.patch fix 2013-01-03 18:09 [PATCH -repost] memcg,vmscan: do not break out targeted reclaim without reclaimed pages Michal Hocko 2013-01-03 20:24 ` Andrew Morton @ 2013-01-29 8:51 ` Michal Hocko 2013-01-30 16:22 ` Johannes Weiner 1 sibling, 1 reply; 6+ messages in thread From: Michal Hocko @ 2013-01-29 8:51 UTC (permalink / raw) To: Ying Han, Andrew Morton Cc: linux-kernel, KAMEZAWA Hiroyuki, Johannes Weiner, Tejun Heo, Glauber Costa, linux-mm, Li Zefan Ying has noticed me (via private email) that the patch is bogus because the break out condition is incorrect. She said she would post a fix but she's been probably too busy. If she doesn't oppose, could you add the follow up fix, please? I am really sorry about this mess. --- ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] mmotm: memcgvmscan-do-not-break-out-targeted-reclaim-without-reclaimed-pages.patch fix 2013-01-29 8:51 ` [PATCH] mmotm: memcgvmscan-do-not-break-out-targeted-reclaim-without-reclaimed-pages.patch fix Michal Hocko @ 2013-01-30 16:22 ` Johannes Weiner 2013-01-30 16:37 ` Michal Hocko 0 siblings, 1 reply; 6+ messages in thread From: Johannes Weiner @ 2013-01-30 16:22 UTC (permalink / raw) To: Michal Hocko Cc: Ying Han, Andrew Morton, linux-kernel, KAMEZAWA Hiroyuki, Tejun Heo, Glauber Costa, linux-mm, Li Zefan On Tue, Jan 29, 2013 at 09:51:04AM +0100, Michal Hocko wrote: > Ying has noticed me (via private email) that the patch is bogus because > the break out condition is incorrect. She said she would post a fix > but she's been probably too busy. If she doesn't oppose, could you add > the follow up fix, please? > > I am really sorry about this mess. > --- > >From 6d23b59e96b8173fae2d0d397cb5e99f16899874 Mon Sep 17 00:00:00 2001 > From: Ying Han <yinghan@google.com> > Date: Tue, 29 Jan 2013 09:42:28 +0100 > Subject: [PATCH] mmotm: > memcgvmscan-do-not-break-out-targeted-reclaim-without-reclaimed-pages.patch > fix > > We should break out of the hierarchy loop only if nr_reclaimed exceeded > nr_to_reclaim and not vice-versa. This patch fixes the condition. > > Signed-off-by: Ying Han <yinghan@google.com> > --- > mm/vmscan.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index d75c1ec..7528eae 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -1985,7 +1985,7 @@ static void shrink_zone(struct zone *zone, struct scan_control *sc) > * whole hierarchy is not sufficient. > */ > if (!global_reclaim(sc) && > - sc->nr_to_reclaim >= sc->nr_reclaimed) { > + sc->nr_to_reclaim <= sc->nr_reclaimed) { This is just a really weird ordering of the operands, isn't it? You compare the constant to the variable, like if (42 == foo->nr_pages). if (sc->nr_reclaimed >= sc->nr_to_reclaim) would be less surprising. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] mmotm: memcgvmscan-do-not-break-out-targeted-reclaim-without-reclaimed-pages.patch fix 2013-01-30 16:22 ` Johannes Weiner @ 2013-01-30 16:37 ` Michal Hocko 0 siblings, 0 replies; 6+ messages in thread From: Michal Hocko @ 2013-01-30 16:37 UTC (permalink / raw) To: Johannes Weiner Cc: Ying Han, Andrew Morton, linux-kernel, KAMEZAWA Hiroyuki, Tejun Heo, Glauber Costa, linux-mm, Li Zefan On Wed 30-01-13 11:22:57, Johannes Weiner wrote: > On Tue, Jan 29, 2013 at 09:51:04AM +0100, Michal Hocko wrote: > > Ying has noticed me (via private email) that the patch is bogus because > > the break out condition is incorrect. She said she would post a fix > > but she's been probably too busy. If she doesn't oppose, could you add > > the follow up fix, please? > > > > I am really sorry about this mess. > > --- > > >From 6d23b59e96b8173fae2d0d397cb5e99f16899874 Mon Sep 17 00:00:00 2001 > > From: Ying Han <yinghan@google.com> > > Date: Tue, 29 Jan 2013 09:42:28 +0100 > > Subject: [PATCH] mmotm: > > memcgvmscan-do-not-break-out-targeted-reclaim-without-reclaimed-pages.patch > > fix > > > > We should break out of the hierarchy loop only if nr_reclaimed exceeded > > nr_to_reclaim and not vice-versa. This patch fixes the condition. > > > > Signed-off-by: Ying Han <yinghan@google.com> > > --- > > mm/vmscan.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c > > index d75c1ec..7528eae 100644 > > --- a/mm/vmscan.c > > +++ b/mm/vmscan.c > > @@ -1985,7 +1985,7 @@ static void shrink_zone(struct zone *zone, struct scan_control *sc) > > * whole hierarchy is not sufficient. > > */ > > if (!global_reclaim(sc) && > > - sc->nr_to_reclaim >= sc->nr_reclaimed) { > > + sc->nr_to_reclaim <= sc->nr_reclaimed) { > > This is just a really weird ordering of the operands, isn't it? You > compare the constant to the variable, like if (42 == foo->nr_pages). > > if (sc->nr_reclaimed >= sc->nr_to_reclaim) > > would be less surprising. No objections from me. -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2013-01-30 16:38 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-01-03 18:09 [PATCH -repost] memcg,vmscan: do not break out targeted reclaim without reclaimed pages Michal Hocko 2013-01-03 20:24 ` Andrew Morton 2013-01-04 14:52 ` Michal Hocko 2013-01-29 8:51 ` [PATCH] mmotm: memcgvmscan-do-not-break-out-targeted-reclaim-without-reclaimed-pages.patch fix Michal Hocko 2013-01-30 16:22 ` Johannes Weiner 2013-01-30 16:37 ` Michal Hocko
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).