* [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long @ 2012-09-10 1:18 Shaohua Li 2012-09-10 8:11 ` Mel Gorman ` (2 more replies) 0 siblings, 3 replies; 18+ messages in thread From: Shaohua Li @ 2012-09-10 1:18 UTC (permalink / raw) To: linux-mm; +Cc: akpm, mgorman, aarcange isolate_migratepages_range() might isolate none pages, for example, when zone->lru_lock is contended and compaction is async. In this case, we should abort compaction, otherwise, compact_zone will run a useless loop and make zone->lru_lock is even contended. V2: only abort the compaction if lock is contended or run too long Rearranged the code by Andrea Arcangeli. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Shaohua Li <shli@fusionio.com> --- mm/compaction.c | 12 +++++++----- mm/internal.h | 2 +- 2 files changed, 8 insertions(+), 6 deletions(-) Index: linux/mm/compaction.c =================================================================== --- linux.orig/mm/compaction.c 2012-09-06 18:37:52.636413761 +0800 +++ linux/mm/compaction.c 2012-09-10 08:49:40.377869710 +0800 @@ -70,8 +70,7 @@ static bool compact_checklock_irqsave(sp /* async aborts if taking too long or contended */ if (!cc->sync) { - if (cc->contended) - *cc->contended = true; + cc->contended = true; return false; } @@ -634,7 +633,7 @@ static isolate_migrate_t isolate_migrate /* Perform the isolation */ low_pfn = isolate_migratepages_range(zone, cc, low_pfn, end_pfn); - if (!low_pfn) + if (!low_pfn || cc->contended) return ISOLATE_ABORT; cc->migrate_pfn = low_pfn; @@ -831,6 +830,7 @@ static unsigned long compact_zone_order( int order, gfp_t gfp_mask, bool sync, bool *contended) { + unsigned long ret; struct compact_control cc = { .nr_freepages = 0, .nr_migratepages = 0, @@ -838,12 +838,14 @@ static unsigned long compact_zone_order( .migratetype = allocflags_to_migratetype(gfp_mask), .zone = zone, .sync = sync, - .contended = contended, }; INIT_LIST_HEAD(&cc.freepages); INIT_LIST_HEAD(&cc.migratepages); - return compact_zone(zone, &cc); + ret = compact_zone(zone, &cc); + if (contended) + *contended = cc.contended; + return ret; } int sysctl_extfrag_threshold = 500; Index: linux/mm/internal.h =================================================================== --- linux.orig/mm/internal.h 2012-09-03 15:16:30.566299444 +0800 +++ linux/mm/internal.h 2012-09-10 08:45:41.980866645 +0800 @@ -131,7 +131,7 @@ struct compact_control { int order; /* order a direct compactor needs */ int migratetype; /* MOVABLE, RECLAIMABLE etc */ struct zone *zone; - bool *contended; /* True if a lock was contended */ + bool contended; /* True if a lock was contended */ }; unsigned long -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long 2012-09-10 1:18 [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long Shaohua Li @ 2012-09-10 8:11 ` Mel Gorman 2012-09-11 1:45 ` Minchan Kim 2012-09-11 23:34 ` Andrew Morton 2 siblings, 0 replies; 18+ messages in thread From: Mel Gorman @ 2012-09-10 8:11 UTC (permalink / raw) To: Shaohua Li; +Cc: linux-mm, akpm, aarcange On Mon, Sep 10, 2012 at 09:18:30AM +0800, Shaohua Li wrote: > isolate_migratepages_range() might isolate none pages, for example, when > zone->lru_lock is contended and compaction is async. In this case, we should > abort compaction, otherwise, compact_zone will run a useless loop and make > zone->lru_lock is even contended. > > V2: > only abort the compaction if lock is contended or run too long > Rearranged the code by Andrea Arcangeli. > > Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> > Signed-off-by: Shaohua Li <shli@fusionio.com> Acked-by: Mel Gorman <mgorman@suse.de> -- Mel Gorman SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long 2012-09-10 1:18 [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long Shaohua Li 2012-09-10 8:11 ` Mel Gorman @ 2012-09-11 1:45 ` Minchan Kim 2012-09-11 8:29 ` Shaohua Li 2012-09-11 23:34 ` Andrew Morton 2 siblings, 1 reply; 18+ messages in thread From: Minchan Kim @ 2012-09-11 1:45 UTC (permalink / raw) To: Shaohua Li; +Cc: linux-mm, akpm, mgorman, aarcange On Mon, Sep 10, 2012 at 09:18:30AM +0800, Shaohua Li wrote: > isolate_migratepages_range() might isolate none pages, for example, when > zone->lru_lock is contended and compaction is async. In this case, we should > abort compaction, otherwise, compact_zone will run a useless loop and make > zone->lru_lock is even contended. > As I read old thread, you have the scenario and number. Please include them in this description. > V2: > only abort the compaction if lock is contended or run too long > Rearranged the code by Andrea Arcangeli. > > Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> > Signed-off-by: Shaohua Li <shli@fusionio.com> Acked-by: Minchan Kim <minchan@kernel.org> Other than some nitpicks, looks good to me. > --- > mm/compaction.c | 12 +++++++----- > mm/internal.h | 2 +- > 2 files changed, 8 insertions(+), 6 deletions(-) > > Index: linux/mm/compaction.c > =================================================================== > --- linux.orig/mm/compaction.c 2012-09-06 18:37:52.636413761 +0800 > +++ linux/mm/compaction.c 2012-09-10 08:49:40.377869710 +0800 > @@ -70,8 +70,7 @@ static bool compact_checklock_irqsave(sp > > /* async aborts if taking too long or contended */ > if (!cc->sync) { > - if (cc->contended) > - *cc->contended = true; > + cc->contended = true; > return false; > } > > @@ -634,7 +633,7 @@ static isolate_migrate_t isolate_migrate > > /* Perform the isolation */ > low_pfn = isolate_migratepages_range(zone, cc, low_pfn, end_pfn); > - if (!low_pfn) > + if (!low_pfn || cc->contended) > return ISOLATE_ABORT; > > cc->migrate_pfn = low_pfn; > @@ -831,6 +830,7 @@ static unsigned long compact_zone_order( > int order, gfp_t gfp_mask, > bool sync, bool *contended) > { > + unsigned long ret; > struct compact_control cc = { > .nr_freepages = 0, > .nr_migratepages = 0, > @@ -838,12 +838,14 @@ static unsigned long compact_zone_order( > .migratetype = allocflags_to_migratetype(gfp_mask), > .zone = zone, > .sync = sync, > - .contended = contended, > }; > INIT_LIST_HEAD(&cc.freepages); > INIT_LIST_HEAD(&cc.migratepages); > > - return compact_zone(zone, &cc); > + ret = compact_zone(zone, &cc); > + if (contended) > + *contended = cc.contended; > + return ret; > } > > int sysctl_extfrag_threshold = 500; > Index: linux/mm/internal.h > =================================================================== > --- linux.orig/mm/internal.h 2012-09-03 15:16:30.566299444 +0800 > +++ linux/mm/internal.h 2012-09-10 08:45:41.980866645 +0800 > @@ -131,7 +131,7 @@ struct compact_control { > int order; /* order a direct compactor needs */ > int migratetype; /* MOVABLE, RECLAIMABLE etc */ > struct zone *zone; > - bool *contended; /* True if a lock was contended */ > + bool contended; /* True if a lock was contended */ I would like to add more specific condition. /* True if a lock was contended in case of async compaction */ > }; > > unsigned long > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> -- Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long 2012-09-11 1:45 ` Minchan Kim @ 2012-09-11 8:29 ` Shaohua Li 2012-09-11 8:40 ` Minchan Kim 0 siblings, 1 reply; 18+ messages in thread From: Shaohua Li @ 2012-09-11 8:29 UTC (permalink / raw) To: Minchan Kim; +Cc: linux-mm, akpm, mgorman, aarcange On Tue, Sep 11, 2012 at 10:45:55AM +0900, Minchan Kim wrote: > On Mon, Sep 10, 2012 at 09:18:30AM +0800, Shaohua Li wrote: > > isolate_migratepages_range() might isolate none pages, for example, when > > zone->lru_lock is contended and compaction is async. In this case, we should > > abort compaction, otherwise, compact_zone will run a useless loop and make > > zone->lru_lock is even contended. > > > > As I read old thread, you have the scenario and number. > Please include them in this description. that is just to show how the contention is, not performance data. I thought explaining the contention is enough. How did you think? -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long 2012-09-11 8:29 ` Shaohua Li @ 2012-09-11 8:40 ` Minchan Kim 0 siblings, 0 replies; 18+ messages in thread From: Minchan Kim @ 2012-09-11 8:40 UTC (permalink / raw) To: Shaohua Li; +Cc: linux-mm, akpm, mgorman, aarcange On Tue, Sep 11, 2012 at 04:29:46PM +0800, Shaohua Li wrote: > On Tue, Sep 11, 2012 at 10:45:55AM +0900, Minchan Kim wrote: > > On Mon, Sep 10, 2012 at 09:18:30AM +0800, Shaohua Li wrote: > > > isolate_migratepages_range() might isolate none pages, for example, when > > > zone->lru_lock is contended and compaction is async. In this case, we should > > > abort compaction, otherwise, compact_zone will run a useless loop and make > > > zone->lru_lock is even contended. > > > > > > > As I read old thread, you have the scenario and number. > > Please include them in this description. > > that is just to show how the contention is, not performance data. I thought > explaining the contention is enough. How did you think? If you measured it with perf, you had data which is percentage of sampling of the lock. It's enough. I meant it, NOT performance data. > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> -- Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long 2012-09-10 1:18 [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long Shaohua Li 2012-09-10 8:11 ` Mel Gorman 2012-09-11 1:45 ` Minchan Kim @ 2012-09-11 23:34 ` Andrew Morton 2012-09-12 0:48 ` Andrea Arcangeli 2012-09-12 11:08 ` Mel Gorman 2 siblings, 2 replies; 18+ messages in thread From: Andrew Morton @ 2012-09-11 23:34 UTC (permalink / raw) To: Shaohua Li; +Cc: linux-mm, mgorman, aarcange On Mon, 10 Sep 2012 09:18:30 +0800 Shaohua Li <shli@kernel.org> wrote: > isolate_migratepages_range() might isolate none pages, for example, when > zone->lru_lock is contended and compaction is async. In this case, we should > abort compaction, otherwise, compact_zone will run a useless loop and make > zone->lru_lock is even contended. > > ... > > @@ -838,12 +838,14 @@ static unsigned long compact_zone_order( > .migratetype = allocflags_to_migratetype(gfp_mask), > .zone = zone, > .sync = sync, > - .contended = contended, > }; > INIT_LIST_HEAD(&cc.freepages); > INIT_LIST_HEAD(&cc.migratepages); > > - return compact_zone(zone, &cc); > + ret = compact_zone(zone, &cc); > + if (contended) > + *contended = cc.contended; > + return ret; > } > >From a quick read, `contended' is never NULL here. And defining the interface so that `contended' must be a valid pointer is a good change, IMO - it results in simpler and faster code. Alas, try_to_compact_pages()'s kerneldoc altogether forgets to describe this argument. Mel's mm-compaction-capture-a-suitable-high-order-page-immediately-when-it-is-made-available.patch adds a `pages' arg and forgets to document that as well. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long 2012-09-11 23:34 ` Andrew Morton @ 2012-09-12 0:48 ` Andrea Arcangeli 2012-09-12 21:20 ` Andrew Morton 2012-09-12 11:08 ` Mel Gorman 1 sibling, 1 reply; 18+ messages in thread From: Andrea Arcangeli @ 2012-09-12 0:48 UTC (permalink / raw) To: Andrew Morton; +Cc: Shaohua Li, linux-mm, mgorman On Tue, Sep 11, 2012 at 04:34:55PM -0700, Andrew Morton wrote: > On Mon, 10 Sep 2012 09:18:30 +0800 > Shaohua Li <shli@kernel.org> wrote: > > > isolate_migratepages_range() might isolate none pages, for example, when > > zone->lru_lock is contended and compaction is async. In this case, we should > > abort compaction, otherwise, compact_zone will run a useless loop and make > > zone->lru_lock is even contended. > > > > ... > > > > @@ -838,12 +838,14 @@ static unsigned long compact_zone_order( > > .migratetype = allocflags_to_migratetype(gfp_mask), > > .zone = zone, > > .sync = sync, > > - .contended = contended, > > }; > > INIT_LIST_HEAD(&cc.freepages); > > INIT_LIST_HEAD(&cc.migratepages); > > > > - return compact_zone(zone, &cc); > > + ret = compact_zone(zone, &cc); > > + if (contended) > > + *contended = cc.contended; > > + return ret; > > } > > > > From a quick read, `contended' is never NULL here. And defining the "contended" pointer can be null with some caller so the if is needed. The inner code was checking it before. This is also why we couldn't use *contended for the loop break bugfix, because contended could have been null at times. Now cc.contended is always available so we can use that to fix the bug and it's self documenting as well. > interface so that `contended' must be a valid pointer is a good change, > IMO - it results in simpler and faster code. Agreed. > Alas, try_to_compact_pages()'s kerneldoc altogether forgets to describe > this argument. Mel's > mm-compaction-capture-a-suitable-high-order-page-immediately-when-it-is-made-available.patch > adds a `pages' arg and forgets to document that as well. > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long 2012-09-12 0:48 ` Andrea Arcangeli @ 2012-09-12 21:20 ` Andrew Morton 2012-09-12 23:48 ` Andrea Arcangeli 2012-09-13 10:03 ` Mel Gorman 0 siblings, 2 replies; 18+ messages in thread From: Andrew Morton @ 2012-09-12 21:20 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: Shaohua Li, linux-mm, mgorman On Wed, 12 Sep 2012 02:48:40 +0200 Andrea Arcangeli <aarcange@redhat.com> wrote: > On Tue, Sep 11, 2012 at 04:34:55PM -0700, Andrew Morton wrote: > > On Mon, 10 Sep 2012 09:18:30 +0800 > > Shaohua Li <shli@kernel.org> wrote: > > > > > isolate_migratepages_range() might isolate none pages, for example, when > > > zone->lru_lock is contended and compaction is async. In this case, we should > > > abort compaction, otherwise, compact_zone will run a useless loop and make > > > zone->lru_lock is even contended. > > > > > > ... > > > > > > @@ -838,12 +838,14 @@ static unsigned long compact_zone_order( > > > .migratetype = allocflags_to_migratetype(gfp_mask), > > > .zone = zone, > > > .sync = sync, > > > - .contended = contended, > > > }; > > > INIT_LIST_HEAD(&cc.freepages); > > > INIT_LIST_HEAD(&cc.migratepages); > > > > > > - return compact_zone(zone, &cc); > > > + ret = compact_zone(zone, &cc); > > > + if (contended) > > > + *contended = cc.contended; > > > + return ret; > > > } > > > > > > > From a quick read, `contended' is never NULL here. And defining the > > "contended" pointer can be null with some caller so the if is > needed. The inner code was checking it before. This is also why we > couldn't use *contended for the loop break bugfix, because contended > could have been null at times. Confused. I can see only two call sites: __alloc_pages_slowpath ->__alloc_pages_direct_compact ->try_to_compact_pages ->compact_zone_order and in both cases, `contended' points at valid storage. > Now cc.contended is always available so > we can use that to fix the bug and it's self documenting as well. > > > interface so that `contended' must be a valid pointer is a good change, > > IMO - it results in simpler and faster code. > > Agreed. OK, I'll slip this in there: --- a/mm/compaction.c~mm-compaction-abort-compaction-loop-if-lock-is-contended-or-run-too-long-fix +++ a/mm/compaction.c @@ -909,8 +909,7 @@ static unsigned long compact_zone_order( INIT_LIST_HEAD(&cc.migratepages); ret = compact_zone(zone, &cc); - if (contended) - *contended = cc.contended; + *contended = cc.contended; return ret; } > > Alas, try_to_compact_pages()'s kerneldoc altogether forgets to describe > > this argument. Mel's > > mm-compaction-capture-a-suitable-high-order-page-immediately-when-it-is-made-available.patch > > adds a `pages' arg and forgets to document that as well. poke poke -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long 2012-09-12 21:20 ` Andrew Morton @ 2012-09-12 23:48 ` Andrea Arcangeli 2012-09-13 0:47 ` Minchan Kim 2012-09-13 10:03 ` Mel Gorman 1 sibling, 1 reply; 18+ messages in thread From: Andrea Arcangeli @ 2012-09-12 23:48 UTC (permalink / raw) To: Andrew Morton; +Cc: Shaohua Li, linux-mm, mgorman On Wed, Sep 12, 2012 at 02:20:19PM -0700, Andrew Morton wrote: > OK, I'll slip this in there: > > --- a/mm/compaction.c~mm-compaction-abort-compaction-loop-if-lock-is-contended-or-run-too-long-fix > +++ a/mm/compaction.c > @@ -909,8 +909,7 @@ static unsigned long compact_zone_order( > INIT_LIST_HEAD(&cc.migratepages); > > ret = compact_zone(zone, &cc); > - if (contended) > - *contended = cc.contended; > + *contended = cc.contended; > return ret; > } Ack the above, thanks. One more thing, today a bug tripped while building cyanogenmod10 (it swaps despite so much ram) after I added the cc->contended loop break patch. The original version of the fix from Shaohua didn't have this problem because it would only abort compaction if the low_pfn didn't advance and in turn the list would be guaranteed empty. Verifying the list is empty before aborting compaction (which takes a path that ignores the cc->migratelist) should be enough to fix it and it makes it really equivalent to the previous fix. Both cachelines should be cache hot so it should be practically zero cost to check it. Only lightly tested so far. === ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long 2012-09-12 23:48 ` Andrea Arcangeli @ 2012-09-13 0:47 ` Minchan Kim 2012-09-13 2:49 ` Shaohua Li 2012-09-13 9:38 ` Mel Gorman 0 siblings, 2 replies; 18+ messages in thread From: Minchan Kim @ 2012-09-13 0:47 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: Andrew Morton, Shaohua Li, linux-mm, mgorman Hi Andrea, On Thu, Sep 13, 2012 at 01:48:08AM +0200, Andrea Arcangeli wrote: > On Wed, Sep 12, 2012 at 02:20:19PM -0700, Andrew Morton wrote: > > OK, I'll slip this in there: > > > > --- a/mm/compaction.c~mm-compaction-abort-compaction-loop-if-lock-is-contended-or-run-too-long-fix > > +++ a/mm/compaction.c > > @@ -909,8 +909,7 @@ static unsigned long compact_zone_order( > > INIT_LIST_HEAD(&cc.migratepages); > > > > ret = compact_zone(zone, &cc); > > - if (contended) > > - *contended = cc.contended; > > + *contended = cc.contended; > > return ret; > > } > > Ack the above, thanks. > > One more thing, today a bug tripped while building cyanogenmod10 (it > swaps despite so much ram) after I added the cc->contended loop break > patch. The original version of the fix from Shaohua didn't have this > problem because it would only abort compaction if the low_pfn didn't > advance and in turn the list would be guaranteed empty. Nice catch! > > Verifying the list is empty before aborting compaction (which takes a > path that ignores the cc->migratelist) should be enough to fix it and > it makes it really equivalent to the previous fix. Both cachelines > should be cache hot so it should be practically zero cost to check it. > > Only lightly tested so far. > > === > >From b2a50e49d65596d3920773316ad9b7dd54e4acaf Mon Sep 17 00:00:00 2001 > From: Andrea Arcangeli <aarcange@redhat.com> > Date: Thu, 13 Sep 2012 01:22:03 +0200 > Subject: [PATCH] mm: compaction: fix leak in cc->contended loop breaking > logic > > We cannot return ISOLATE_ABORT when cc->contended is true, if we have > some pages already successfully isolated in the cc->migratepages > list, or they will be leaked. > > The bug was highlighted by a nice VM_BUG_ON in the async compaction in > kswapd. So I also added the symmetric VM_BUG_ON to the other caller of > the function considering it looks a worthwhile VM_BUG_ON. Fair enough. > > ------------[ cut here ]------------ > kernel BUG at mm/compaction.c:934! > invalid opcode: 0000 [#1] SMP > Modules linked in: tun usbhid kvm_intel xhci_hcd kvm snd_hda_codec_realtek ehci_hcd usbcore snd_hda_intel sn > er crc32c_intel psmouse ghash_clmulni_intel sr_mod snd sg cdrom snd_page_alloc usb_common pcspkr [last unloa > > CPU 0 > Pid: 513, comm: kswapd0 Not tainted 3.6.0-rc4+ #17 /DH61BE > RIP: 0010:[<ffffffff8111302c>] [<ffffffff8111302c>] __compact_pgdat+0x1ac/0x1b0 > RSP: 0018:ffff880216fa5cb0 EFLAGS: 00010283 > RAX: 0000000000000003 RBX: ffff880216fa5d00 RCX: 0000000000000002 > RDX: 00000000000008d7 RSI: 0000000000000002 RDI: ffffffff8195b058 > RBP: ffffffff8195b000 R08: 0000000000000be4 R09: ffffffff8195a9c0 > R10: ffffffff8195b400 R11: ffffffff8195b570 R12: 0000000000000001 > R13: 0000000000000001 R14: ffff880216fa5d10 R15: 0000000000000003 > FS: 0000000000000000(0000) GS:ffff88021fa00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > CR2: 00007f14d4167000 CR3: 00000000018f1000 CR4: 00000000000407f0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Process kswapd0 (pid: 513, threadinfo ffff880216fa4000, task ffff880216cfef20) > Stack: > ffffffff8195a9c0 ffffffff8195b000 0000000000000320 0000000000000003 > ffffffff8195a9c0 ffffffff8195b640 0000000000000002 0000000000000c80 > 0000000000000001 ffffffff811132f3 ffff880216fa5d00 ffff880216fa5d00 > Call Trace: > [<ffffffff811132f3>] ? compact_pgdat+0x23/0x30 > [<ffffffff8110503f>] ? kswapd+0x89f/0xac0 > [<ffffffff8106f450>] ? wake_up_bit+0x40/0x40 > [<ffffffff811047a0>] ? shrink_lruvec+0x510/0x510 > [<ffffffff811047a0>] ? shrink_lruvec+0x510/0x510 > [<ffffffff8106ef1e>] ? kthread+0x9e/0xb0 > > Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> > --- > mm/compaction.c | 6 +++++- > 1 files changed, 5 insertions(+), 1 deletions(-) > > diff --git a/mm/compaction.c b/mm/compaction.c > index 6066a35..0292984 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -633,7 +633,7 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone, > > /* Perform the isolation */ > low_pfn = isolate_migratepages_range(zone, cc, low_pfn, end_pfn); > - if (!low_pfn || cc->contended) > + if (!low_pfn || (cc->contended && !cc->nr_migratepages)) > return ISOLATE_ABORT; I'm not sure it's best. As you mentioned, it's same with first version of Shaohua. But it could mitigate the goal of the patch if lock contention or need_resched happens in the middle of loop once we isolate a migratable page. What do you think about this? diff --git a/mm/compaction.c b/mm/compaction.c index 0fbc6b7..7a009dd 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -848,6 +848,10 @@ static int compact_zone(struct zone *zone, struct compact_control *cc) switch (isolate_migratepages(zone, cc)) { case ISOLATE_ABORT: ret = COMPACT_PARTIAL; + if (!list_empty(&cc->migratepages)) { + putback_lru_pages(&cc->migratepages); + cc->nr_migratepages = 0; + } goto out; case ISOLATE_NONE: continue; > > cc->migrate_pfn = low_pfn; > @@ -843,6 +843,10 @@ static unsigned long compact_zone_order(struct zone *zone, > INIT_LIST_HEAD(&cc.migratepages); > > ret = compact_zone(zone, &cc); > + > + VM_BUG_ON(!list_empty(&cc.freepages)); > + VM_BUG_ON(!list_empty(&cc.migratepages)); > + > *contended = cc.contended; > return ret; > } > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> -- Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long 2012-09-13 0:47 ` Minchan Kim @ 2012-09-13 2:49 ` Shaohua Li 2012-09-13 9:38 ` Mel Gorman 1 sibling, 0 replies; 18+ messages in thread From: Shaohua Li @ 2012-09-13 2:49 UTC (permalink / raw) To: Minchan Kim; +Cc: Andrea Arcangeli, Andrew Morton, linux-mm, mgorman On Thu, Sep 13, 2012 at 09:47:22AM +0900, Minchan Kim wrote: > Hi Andrea, > > On Thu, Sep 13, 2012 at 01:48:08AM +0200, Andrea Arcangeli wrote: > > On Wed, Sep 12, 2012 at 02:20:19PM -0700, Andrew Morton wrote: > > > OK, I'll slip this in there: > > > > > > --- a/mm/compaction.c~mm-compaction-abort-compaction-loop-if-lock-is-contended-or-run-too-long-fix > > > +++ a/mm/compaction.c > > > @@ -909,8 +909,7 @@ static unsigned long compact_zone_order( > > > INIT_LIST_HEAD(&cc.migratepages); > > > > > > ret = compact_zone(zone, &cc); > > > - if (contended) > > > - *contended = cc.contended; > > > + *contended = cc.contended; > > > return ret; > > > } > > > > Ack the above, thanks. > > > > One more thing, today a bug tripped while building cyanogenmod10 (it > > swaps despite so much ram) after I added the cc->contended loop break > > patch. The original version of the fix from Shaohua didn't have this > > problem because it would only abort compaction if the low_pfn didn't > > advance and in turn the list would be guaranteed empty. > > Nice catch! > > > > > Verifying the list is empty before aborting compaction (which takes a > > path that ignores the cc->migratelist) should be enough to fix it and > > it makes it really equivalent to the previous fix. Both cachelines > > should be cache hot so it should be practically zero cost to check it. > > > > Only lightly tested so far. > > > > === > > >From b2a50e49d65596d3920773316ad9b7dd54e4acaf Mon Sep 17 00:00:00 2001 > > From: Andrea Arcangeli <aarcange@redhat.com> > > Date: Thu, 13 Sep 2012 01:22:03 +0200 > > Subject: [PATCH] mm: compaction: fix leak in cc->contended loop breaking > > logic > > > > We cannot return ISOLATE_ABORT when cc->contended is true, if we have > > some pages already successfully isolated in the cc->migratepages > > list, or they will be leaked. > > > > The bug was highlighted by a nice VM_BUG_ON in the async compaction in > > kswapd. So I also added the symmetric VM_BUG_ON to the other caller of > > the function considering it looks a worthwhile VM_BUG_ON. > > Fair enough. > > > > > ------------[ cut here ]------------ > > kernel BUG at mm/compaction.c:934! > > invalid opcode: 0000 [#1] SMP > > Modules linked in: tun usbhid kvm_intel xhci_hcd kvm snd_hda_codec_realtek ehci_hcd usbcore snd_hda_intel sn > > er crc32c_intel psmouse ghash_clmulni_intel sr_mod snd sg cdrom snd_page_alloc usb_common pcspkr [last unloa > > > > CPU 0 > > Pid: 513, comm: kswapd0 Not tainted 3.6.0-rc4+ #17 /DH61BE > > RIP: 0010:[<ffffffff8111302c>] [<ffffffff8111302c>] __compact_pgdat+0x1ac/0x1b0 > > RSP: 0018:ffff880216fa5cb0 EFLAGS: 00010283 > > RAX: 0000000000000003 RBX: ffff880216fa5d00 RCX: 0000000000000002 > > RDX: 00000000000008d7 RSI: 0000000000000002 RDI: ffffffff8195b058 > > RBP: ffffffff8195b000 R08: 0000000000000be4 R09: ffffffff8195a9c0 > > R10: ffffffff8195b400 R11: ffffffff8195b570 R12: 0000000000000001 > > R13: 0000000000000001 R14: ffff880216fa5d10 R15: 0000000000000003 > > FS: 0000000000000000(0000) GS:ffff88021fa00000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > > CR2: 00007f14d4167000 CR3: 00000000018f1000 CR4: 00000000000407f0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > > Process kswapd0 (pid: 513, threadinfo ffff880216fa4000, task ffff880216cfef20) > > Stack: > > ffffffff8195a9c0 ffffffff8195b000 0000000000000320 0000000000000003 > > ffffffff8195a9c0 ffffffff8195b640 0000000000000002 0000000000000c80 > > 0000000000000001 ffffffff811132f3 ffff880216fa5d00 ffff880216fa5d00 > > Call Trace: > > [<ffffffff811132f3>] ? compact_pgdat+0x23/0x30 > > [<ffffffff8110503f>] ? kswapd+0x89f/0xac0 > > [<ffffffff8106f450>] ? wake_up_bit+0x40/0x40 > > [<ffffffff811047a0>] ? shrink_lruvec+0x510/0x510 > > [<ffffffff811047a0>] ? shrink_lruvec+0x510/0x510 > > [<ffffffff8106ef1e>] ? kthread+0x9e/0xb0 > > > > Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> > > --- > > mm/compaction.c | 6 +++++- > > 1 files changed, 5 insertions(+), 1 deletions(-) > > > > diff --git a/mm/compaction.c b/mm/compaction.c > > index 6066a35..0292984 100644 > > --- a/mm/compaction.c > > +++ b/mm/compaction.c > > @@ -633,7 +633,7 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone, > > > > /* Perform the isolation */ > > low_pfn = isolate_migratepages_range(zone, cc, low_pfn, end_pfn); > > - if (!low_pfn || cc->contended) > > + if (!low_pfn || (cc->contended && !cc->nr_migratepages)) > > return ISOLATE_ABORT; > > I'm not sure it's best. > As you mentioned, it's same with first version of Shaohua. > But it could mitigate the goal of the patch if lock contention or > need_resched happens in the middle of loop once we isolate a > migratable page. > > What do you think about this? Thanks for catching this issue. Both looks sane, but I would vote for Andrea's. If some pages are isolated, better we use them. Thanks, Shaohua -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long 2012-09-13 0:47 ` Minchan Kim 2012-09-13 2:49 ` Shaohua Li @ 2012-09-13 9:38 ` Mel Gorman 2012-09-13 10:13 ` Shaohua Li 2012-09-13 16:04 ` Andrea Arcangeli 1 sibling, 2 replies; 18+ messages in thread From: Mel Gorman @ 2012-09-13 9:38 UTC (permalink / raw) To: Minchan Kim; +Cc: Andrea Arcangeli, Andrew Morton, Shaohua Li, linux-mm On Thu, Sep 13, 2012 at 09:47:22AM +0900, Minchan Kim wrote: > Hi Andrea, > > On Thu, Sep 13, 2012 at 01:48:08AM +0200, Andrea Arcangeli wrote: > > On Wed, Sep 12, 2012 at 02:20:19PM -0700, Andrew Morton wrote: > > > OK, I'll slip this in there: > > > > > > --- a/mm/compaction.c~mm-compaction-abort-compaction-loop-if-lock-is-contended-or-run-too-long-fix > > > +++ a/mm/compaction.c > > > @@ -909,8 +909,7 @@ static unsigned long compact_zone_order( > > > INIT_LIST_HEAD(&cc.migratepages); > > > > > > ret = compact_zone(zone, &cc); > > > - if (contended) > > > - *contended = cc.contended; > > > + *contended = cc.contended; > > > return ret; > > > } > > > > Ack the above, thanks. > > > > One more thing, today a bug tripped while building cyanogenmod10 (it > > swaps despite so much ram) after I added the cc->contended loop break > > patch. The original version of the fix from Shaohua didn't have this > > problem because it would only abort compaction if the low_pfn didn't > > advance and in turn the list would be guaranteed empty. > > Nice catch! > > > > > Verifying the list is empty before aborting compaction (which takes a > > path that ignores the cc->migratelist) should be enough to fix it and > > it makes it really equivalent to the previous fix. Both cachelines > > should be cache hot so it should be practically zero cost to check it. > > > > Only lightly tested so far. > > > > === > > >From b2a50e49d65596d3920773316ad9b7dd54e4acaf Mon Sep 17 00:00:00 2001 > > From: Andrea Arcangeli <aarcange@redhat.com> > > Date: Thu, 13 Sep 2012 01:22:03 +0200 > > Subject: [PATCH] mm: compaction: fix leak in cc->contended loop breaking > > logic > > > > We cannot return ISOLATE_ABORT when cc->contended is true, if we have > > some pages already successfully isolated in the cc->migratepages > > list, or they will be leaked. > > > > The bug was highlighted by a nice VM_BUG_ON in the async compaction in > > kswapd. So I also added the symmetric VM_BUG_ON to the other caller of > > the function considering it looks a worthwhile VM_BUG_ON. > > Fair enough. > > > > > ------------[ cut here ]------------ > > kernel BUG at mm/compaction.c:934! > > invalid opcode: 0000 [#1] SMP > > Modules linked in: tun usbhid kvm_intel xhci_hcd kvm snd_hda_codec_realtek ehci_hcd usbcore snd_hda_intel sn > > er crc32c_intel psmouse ghash_clmulni_intel sr_mod snd sg cdrom snd_page_alloc usb_common pcspkr [last unloa > > > > CPU 0 > > Pid: 513, comm: kswapd0 Not tainted 3.6.0-rc4+ #17 /DH61BE > > RIP: 0010:[<ffffffff8111302c>] [<ffffffff8111302c>] __compact_pgdat+0x1ac/0x1b0 > > RSP: 0018:ffff880216fa5cb0 EFLAGS: 00010283 > > RAX: 0000000000000003 RBX: ffff880216fa5d00 RCX: 0000000000000002 > > RDX: 00000000000008d7 RSI: 0000000000000002 RDI: ffffffff8195b058 > > RBP: ffffffff8195b000 R08: 0000000000000be4 R09: ffffffff8195a9c0 > > R10: ffffffff8195b400 R11: ffffffff8195b570 R12: 0000000000000001 > > R13: 0000000000000001 R14: ffff880216fa5d10 R15: 0000000000000003 > > FS: 0000000000000000(0000) GS:ffff88021fa00000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > > CR2: 00007f14d4167000 CR3: 00000000018f1000 CR4: 00000000000407f0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > > Process kswapd0 (pid: 513, threadinfo ffff880216fa4000, task ffff880216cfef20) > > Stack: > > ffffffff8195a9c0 ffffffff8195b000 0000000000000320 0000000000000003 > > ffffffff8195a9c0 ffffffff8195b640 0000000000000002 0000000000000c80 > > 0000000000000001 ffffffff811132f3 ffff880216fa5d00 ffff880216fa5d00 > > Call Trace: > > [<ffffffff811132f3>] ? compact_pgdat+0x23/0x30 > > [<ffffffff8110503f>] ? kswapd+0x89f/0xac0 > > [<ffffffff8106f450>] ? wake_up_bit+0x40/0x40 > > [<ffffffff811047a0>] ? shrink_lruvec+0x510/0x510 > > [<ffffffff811047a0>] ? shrink_lruvec+0x510/0x510 > > [<ffffffff8106ef1e>] ? kthread+0x9e/0xb0 > > > > Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> > > --- > > mm/compaction.c | 6 +++++- > > 1 files changed, 5 insertions(+), 1 deletions(-) > > > > diff --git a/mm/compaction.c b/mm/compaction.c > > index 6066a35..0292984 100644 > > --- a/mm/compaction.c > > +++ b/mm/compaction.c > > @@ -633,7 +633,7 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone, > > > > /* Perform the isolation */ > > low_pfn = isolate_migratepages_range(zone, cc, low_pfn, end_pfn); > > - if (!low_pfn || cc->contended) > > + if (!low_pfn || (cc->contended && !cc->nr_migratepages)) > > return ISOLATE_ABORT; > > I'm not sure it's best. > As you mentioned, it's same with first version of Shaohua. > But it could mitigate the goal of the patch if lock contention or > need_resched happens in the middle of loop once we isolate a > migratable page. > > What do you think about this? > > diff --git a/mm/compaction.c b/mm/compaction.c > index 0fbc6b7..7a009dd 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -848,6 +848,10 @@ static int compact_zone(struct zone *zone, struct compact_control *cc) > switch (isolate_migratepages(zone, cc)) { > case ISOLATE_ABORT: > ret = COMPACT_PARTIAL; > + if (!list_empty(&cc->migratepages)) { > + putback_lru_pages(&cc->migratepages); > + cc->nr_migratepages = 0; > + } > goto out; > case ISOLATE_NONE: > continue; > I agree with Minchan. Andrea's patch ignores the fact that free page isolation might have aborted due to lock contention. It's not necessarily going to be isolating the pages it needs for migration. > > > > > cc->migrate_pfn = low_pfn; > > @@ -843,6 +843,10 @@ static unsigned long compact_zone_order(struct zone *zone, > > INIT_LIST_HEAD(&cc.migratepages); > > > > ret = compact_zone(zone, &cc); > > + > > + VM_BUG_ON(!list_empty(&cc.freepages)); > > + VM_BUG_ON(!list_empty(&cc.migratepages)); > > + > > *contended = cc.contended; > > return ret; > > } > > -- Mel Gorman SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long 2012-09-13 9:38 ` Mel Gorman @ 2012-09-13 10:13 ` Shaohua Li 2012-09-13 16:04 ` Andrea Arcangeli 1 sibling, 0 replies; 18+ messages in thread From: Shaohua Li @ 2012-09-13 10:13 UTC (permalink / raw) To: Mel Gorman; +Cc: Minchan Kim, Andrea Arcangeli, Andrew Morton, linux-mm On Thu, Sep 13, 2012 at 10:38:26AM +0100, Mel Gorman wrote: > On Thu, Sep 13, 2012 at 09:47:22AM +0900, Minchan Kim wrote: > > Hi Andrea, > > > > On Thu, Sep 13, 2012 at 01:48:08AM +0200, Andrea Arcangeli wrote: > > > On Wed, Sep 12, 2012 at 02:20:19PM -0700, Andrew Morton wrote: > > > > OK, I'll slip this in there: > > > > > > > > --- a/mm/compaction.c~mm-compaction-abort-compaction-loop-if-lock-is-contended-or-run-too-long-fix > > > > +++ a/mm/compaction.c > > > > @@ -909,8 +909,7 @@ static unsigned long compact_zone_order( > > > > INIT_LIST_HEAD(&cc.migratepages); > > > > > > > > ret = compact_zone(zone, &cc); > > > > - if (contended) > > > > - *contended = cc.contended; > > > > + *contended = cc.contended; > > > > return ret; > > > > } > > > > > > Ack the above, thanks. > > > > > > One more thing, today a bug tripped while building cyanogenmod10 (it > > > swaps despite so much ram) after I added the cc->contended loop break > > > patch. The original version of the fix from Shaohua didn't have this > > > problem because it would only abort compaction if the low_pfn didn't > > > advance and in turn the list would be guaranteed empty. > > > > Nice catch! > > > > > > > > Verifying the list is empty before aborting compaction (which takes a > > > path that ignores the cc->migratelist) should be enough to fix it and > > > it makes it really equivalent to the previous fix. Both cachelines > > > should be cache hot so it should be practically zero cost to check it. > > > > > > Only lightly tested so far. > > > > > > === > > > >From b2a50e49d65596d3920773316ad9b7dd54e4acaf Mon Sep 17 00:00:00 2001 > > > From: Andrea Arcangeli <aarcange@redhat.com> > > > Date: Thu, 13 Sep 2012 01:22:03 +0200 > > > Subject: [PATCH] mm: compaction: fix leak in cc->contended loop breaking > > > logic > > > > > > We cannot return ISOLATE_ABORT when cc->contended is true, if we have > > > some pages already successfully isolated in the cc->migratepages > > > list, or they will be leaked. > > > > > > The bug was highlighted by a nice VM_BUG_ON in the async compaction in > > > kswapd. So I also added the symmetric VM_BUG_ON to the other caller of > > > the function considering it looks a worthwhile VM_BUG_ON. > > > > Fair enough. > > > > > > > > ------------[ cut here ]------------ > > > kernel BUG at mm/compaction.c:934! > > > invalid opcode: 0000 [#1] SMP > > > Modules linked in: tun usbhid kvm_intel xhci_hcd kvm snd_hda_codec_realtek ehci_hcd usbcore snd_hda_intel sn > > > er crc32c_intel psmouse ghash_clmulni_intel sr_mod snd sg cdrom snd_page_alloc usb_common pcspkr [last unloa > > > > > > CPU 0 > > > Pid: 513, comm: kswapd0 Not tainted 3.6.0-rc4+ #17 /DH61BE > > > RIP: 0010:[<ffffffff8111302c>] [<ffffffff8111302c>] __compact_pgdat+0x1ac/0x1b0 > > > RSP: 0018:ffff880216fa5cb0 EFLAGS: 00010283 > > > RAX: 0000000000000003 RBX: ffff880216fa5d00 RCX: 0000000000000002 > > > RDX: 00000000000008d7 RSI: 0000000000000002 RDI: ffffffff8195b058 > > > RBP: ffffffff8195b000 R08: 0000000000000be4 R09: ffffffff8195a9c0 > > > R10: ffffffff8195b400 R11: ffffffff8195b570 R12: 0000000000000001 > > > R13: 0000000000000001 R14: ffff880216fa5d10 R15: 0000000000000003 > > > FS: 0000000000000000(0000) GS:ffff88021fa00000(0000) knlGS:0000000000000000 > > > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > > > CR2: 00007f14d4167000 CR3: 00000000018f1000 CR4: 00000000000407f0 > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > > > Process kswapd0 (pid: 513, threadinfo ffff880216fa4000, task ffff880216cfef20) > > > Stack: > > > ffffffff8195a9c0 ffffffff8195b000 0000000000000320 0000000000000003 > > > ffffffff8195a9c0 ffffffff8195b640 0000000000000002 0000000000000c80 > > > 0000000000000001 ffffffff811132f3 ffff880216fa5d00 ffff880216fa5d00 > > > Call Trace: > > > [<ffffffff811132f3>] ? compact_pgdat+0x23/0x30 > > > [<ffffffff8110503f>] ? kswapd+0x89f/0xac0 > > > [<ffffffff8106f450>] ? wake_up_bit+0x40/0x40 > > > [<ffffffff811047a0>] ? shrink_lruvec+0x510/0x510 > > > [<ffffffff811047a0>] ? shrink_lruvec+0x510/0x510 > > > [<ffffffff8106ef1e>] ? kthread+0x9e/0xb0 > > > > > > Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> > > > --- > > > mm/compaction.c | 6 +++++- > > > 1 files changed, 5 insertions(+), 1 deletions(-) > > > > > > diff --git a/mm/compaction.c b/mm/compaction.c > > > index 6066a35..0292984 100644 > > > --- a/mm/compaction.c > > > +++ b/mm/compaction.c > > > @@ -633,7 +633,7 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone, > > > > > > /* Perform the isolation */ > > > low_pfn = isolate_migratepages_range(zone, cc, low_pfn, end_pfn); > > > - if (!low_pfn || cc->contended) > > > + if (!low_pfn || (cc->contended && !cc->nr_migratepages)) > > > return ISOLATE_ABORT; > > > > I'm not sure it's best. > > As you mentioned, it's same with first version of Shaohua. > > But it could mitigate the goal of the patch if lock contention or > > need_resched happens in the middle of loop once we isolate a > > migratable page. > > > > What do you think about this? > > > > diff --git a/mm/compaction.c b/mm/compaction.c > > index 0fbc6b7..7a009dd 100644 > > --- a/mm/compaction.c > > +++ b/mm/compaction.c > > @@ -848,6 +848,10 @@ static int compact_zone(struct zone *zone, struct compact_control *cc) > > switch (isolate_migratepages(zone, cc)) { > > case ISOLATE_ABORT: > > ret = COMPACT_PARTIAL; > > + if (!list_empty(&cc->migratepages)) { > > + putback_lru_pages(&cc->migratepages); > > + cc->nr_migratepages = 0; > > + } > > goto out; > > case ISOLATE_NONE: > > continue; > > > > I agree with Minchan. Andrea's patch ignores the fact that free page > isolation might have aborted due to lock contention. It's not necessarily > going to be isolating the pages it needs for migration. So it's the free page isolation abort, I misunderstood Minchan's point. Thanks for clarifying. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long 2012-09-13 9:38 ` Mel Gorman 2012-09-13 10:13 ` Shaohua Li @ 2012-09-13 16:04 ` Andrea Arcangeli 2012-09-13 16:31 ` Mel Gorman 1 sibling, 1 reply; 18+ messages in thread From: Andrea Arcangeli @ 2012-09-13 16:04 UTC (permalink / raw) To: Mel Gorman; +Cc: Minchan Kim, Andrew Morton, Shaohua Li, linux-mm On Thu, Sep 13, 2012 at 10:38:26AM +0100, Mel Gorman wrote: > I agree with Minchan. Andrea's patch ignores the fact that free page > isolation might have aborted due to lock contention. It's not necessarily > going to be isolating the pages it needs for migration. Actually I thought of calling putback_lru_pages first, but then I thought it was better to just complete the current slice. Note that putback_lru_pages can take the lru_lock immediately too when the pagevec gets full which won't work any better than if the cc->contended was set by the freepages isolation and we do migrate_pages. There's no way to abort lockless from that point, so I think it's better to take the last locks to finish the current slice of work and then abort if it's still contended (which confirms we're really trashing). Skipping isolated pages without rewinding low_pfn would also reduce compaction reliability so that should be evaluated as well. And rewinding with the putback_lru_pages would risk livelocks. I agree Minchan's patch would fix the problem too, and this should be a fairly uncommon path so either ways shouldn't make a noticeable difference. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long 2012-09-13 16:04 ` Andrea Arcangeli @ 2012-09-13 16:31 ` Mel Gorman 2012-09-13 17:11 ` Andrea Arcangeli 0 siblings, 1 reply; 18+ messages in thread From: Mel Gorman @ 2012-09-13 16:31 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: Minchan Kim, Andrew Morton, Shaohua Li, linux-mm On Thu, Sep 13, 2012 at 06:04:32PM +0200, Andrea Arcangeli wrote: > On Thu, Sep 13, 2012 at 10:38:26AM +0100, Mel Gorman wrote: > > I agree with Minchan. Andrea's patch ignores the fact that free page > > isolation might have aborted due to lock contention. It's not necessarily > > going to be isolating the pages it needs for migration. > > Actually I thought of calling putback_lru_pages first, but then I > thought it was better to just complete the current slice. > Unfortunately that will end up calling compaction_alloc() -> isolate_freepages and probably end up contending again. > Note that putback_lru_pages can take the lru_lock immediately too when True, but in that case there is no choice in the matter. We can't just leak the pages. > the pagevec gets full which won't work any better than if the > cc->contended was set by the freepages isolation and we do > migrate_pages. > > There's no way to abort lockless from that point, so I think it's > better to take the last locks to finish the current slice of work and > then abort if it's still contended (which confirms we're really > trashing). > To me, that will just contend more than we have to. We're aborting compaction and finishing off the current slice will not make any meaningful difference to whether tha allocation succeeds or not. > Skipping isolated pages without rewinding low_pfn would also reduce > compaction reliability so that should be evaluated as well. And > rewinding with the putback_lru_pages would risk livelocks. > > I agree Minchan's patch would fix the problem too, and this should be > a fairly uncommon path so either ways shouldn't make a noticeable > difference. -- Mel Gorman SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long 2012-09-13 16:31 ` Mel Gorman @ 2012-09-13 17:11 ` Andrea Arcangeli 0 siblings, 0 replies; 18+ messages in thread From: Andrea Arcangeli @ 2012-09-13 17:11 UTC (permalink / raw) To: Mel Gorman; +Cc: Minchan Kim, Andrew Morton, Shaohua Li, linux-mm On Thu, Sep 13, 2012 at 05:31:51PM +0100, Mel Gorman wrote: > On Thu, Sep 13, 2012 at 06:04:32PM +0200, Andrea Arcangeli wrote: > > On Thu, Sep 13, 2012 at 10:38:26AM +0100, Mel Gorman wrote: > > > I agree with Minchan. Andrea's patch ignores the fact that free page > > > isolation might have aborted due to lock contention. It's not necessarily > > > going to be isolating the pages it needs for migration. > > > > Actually I thought of calling putback_lru_pages first, but then I > > thought it was better to just complete the current slice. > > > > Unfortunately that will end up calling compaction_alloc() -> > isolate_freepages and probably end up contending again. > > > Note that putback_lru_pages can take the lru_lock immediately too when > > True, but in that case there is no choice in the matter. We can't just > leak the pages. This is why in that case (if the contention was generated by the lru_lock) we would be better off to go ahead and do migrate_pages. We could track contended_lru_lock and contended_zone_lock separately to know if it's that case or not, but then I doubt it matters that much. > To me, that will just contend more than we have to. We're aborting compaction > and finishing off the current slice will not make any meaningful difference > to whether tha allocation succeeds or not. If you prefer the putback_lru_pages I'm fine, I only wanted to clarify neither of the two solutions is going to do the optimal thing at all times. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long 2012-09-12 21:20 ` Andrew Morton 2012-09-12 23:48 ` Andrea Arcangeli @ 2012-09-13 10:03 ` Mel Gorman 1 sibling, 0 replies; 18+ messages in thread From: Mel Gorman @ 2012-09-13 10:03 UTC (permalink / raw) To: Andrew Morton Cc: Andrea Arcangeli, Shaohua Li, linux-mm, minchan, Richard Davies On Wed, Sep 12, 2012 at 02:20:19PM -0700, Andrew Morton wrote: > On Wed, 12 Sep 2012 02:48:40 +0200 > Andrea Arcangeli <aarcange@redhat.com> wrote: > > > On Tue, Sep 11, 2012 at 04:34:55PM -0700, Andrew Morton wrote: > > > On Mon, 10 Sep 2012 09:18:30 +0800 > > > Shaohua Li <shli@kernel.org> wrote: > > > > > > > isolate_migratepages_range() might isolate none pages, for example, when > > > > zone->lru_lock is contended and compaction is async. In this case, we should > > > > abort compaction, otherwise, compact_zone will run a useless loop and make > > > > zone->lru_lock is even contended. > > > > > > > > ... > > > > > > > > @@ -838,12 +838,14 @@ static unsigned long compact_zone_order( > > > > .migratetype = allocflags_to_migratetype(gfp_mask), > > > > .zone = zone, > > > > .sync = sync, > > > > - .contended = contended, > > > > }; > > > > INIT_LIST_HEAD(&cc.freepages); > > > > INIT_LIST_HEAD(&cc.migratepages); > > > > > > > > - return compact_zone(zone, &cc); > > > > + ret = compact_zone(zone, &cc); > > > > + if (contended) > > > > + *contended = cc.contended; > > > > + return ret; > > > > } > > > > > > > > > > From a quick read, `contended' is never NULL here. And defining the > > > > "contended" pointer can be null with some caller so the if is > > needed. The inner code was checking it before. This is also why we > > couldn't use *contended for the loop break bugfix, because contended > > could have been null at times. > > Confused. I can see only two call sites: > __alloc_pages_slowpath > ->__alloc_pages_direct_compact > ->try_to_compact_pages > ->compact_zone_order > and in both cases, `contended' points at valid storage. > Andrea encountered out an additional bug as well but I preferred a variant of Minchan's fix for it. Can you replace patches mm-compaction-abort-compaction-loop-if-lock-is-contended-or-run-too-long.patch mm-compaction-abort-compaction-loop-if-lock-is-contended-or-run-too-long-fix.patch with this version please? FWIW, I've asked Richard Davies (added to cc) to test with this version of the patch as he is also reporting contention problems when booting a windows guest under qemu. I see you already picked up the documentation patch so I'll ignore the additional "poke poke" in your mail :) ---8<--- From: Shaohua Li <shli@fusionio.com> Subject: [PATCH] compaction: abort compaction loop if lock is contended or run too long Changelog since V2 o Fix BUG_ON triggered due to pages left on cc.migratepages o Make compact_zone_order() require non-NULL arg `contended' Changelog since V1 o only abort the compaction if lock is contended or run too long o Rearranged the code by Andrea Arcangeli. isolate_migratepages_range() might isolate no pages if for example when zone->lru_lock is contended and running asynchronous compaction. In this case, we should abort compaction, otherwise, compact_zone will run a useless loop and make zone->lru_lock is even contended. An additional check is added to ensure that cc.migratepages and cc.freepages get properly drained whan compaction is aborted. [minchan@kernel.org: Putback pages isolated for migration if aborting] [akpm@linux-foundation.org: compact_zone_order requires non-NULL arg contended] Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: Mel Gorman <mgorman@suse.de> --- mm/compaction.c | 17 ++++++++++++----- mm/internal.h | 2 +- 2 files changed, 13 insertions(+), 6 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index 7fcd3a5..a8de20d 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -70,8 +70,7 @@ static bool compact_checklock_irqsave(spinlock_t *lock, unsigned long *flags, /* async aborts if taking too long or contended */ if (!cc->sync) { - if (cc->contended) - *cc->contended = true; + cc->contended = true; return false; } @@ -634,7 +633,7 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone, /* Perform the isolation */ low_pfn = isolate_migratepages_range(zone, cc, low_pfn, end_pfn); - if (!low_pfn) + if (!low_pfn || cc->contended) return ISOLATE_ABORT; cc->migrate_pfn = low_pfn; @@ -787,6 +786,8 @@ static int compact_zone(struct zone *zone, struct compact_control *cc) switch (isolate_migratepages(zone, cc)) { case ISOLATE_ABORT: ret = COMPACT_PARTIAL; + putback_lru_pages(&cc->migratepages); + cc->nr_migratepages = 0; goto out; case ISOLATE_NONE: continue; @@ -831,6 +832,7 @@ static unsigned long compact_zone_order(struct zone *zone, int order, gfp_t gfp_mask, bool sync, bool *contended) { + unsigned long ret; struct compact_control cc = { .nr_freepages = 0, .nr_migratepages = 0, @@ -838,12 +840,17 @@ static unsigned long compact_zone_order(struct zone *zone, .migratetype = allocflags_to_migratetype(gfp_mask), .zone = zone, .sync = sync, - .contended = contended, }; INIT_LIST_HEAD(&cc.freepages); INIT_LIST_HEAD(&cc.migratepages); - return compact_zone(zone, &cc); + ret = compact_zone(zone, &cc); + + VM_BUG_ON(!list_empty(&cc.freepages)); + VM_BUG_ON(!list_empty(&cc.migratepages)); + + *contended = cc.contended; + return ret; } int sysctl_extfrag_threshold = 500; diff --git a/mm/internal.h b/mm/internal.h index b8c91b3..4bd7c0e 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -130,7 +130,7 @@ struct compact_control { int order; /* order a direct compactor needs */ int migratetype; /* MOVABLE, RECLAIMABLE etc */ struct zone *zone; - bool *contended; /* True if a lock was contended */ + bool contended; /* True if a lock was contended */ }; unsigned long -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long 2012-09-11 23:34 ` Andrew Morton 2012-09-12 0:48 ` Andrea Arcangeli @ 2012-09-12 11:08 ` Mel Gorman 1 sibling, 0 replies; 18+ messages in thread From: Mel Gorman @ 2012-09-12 11:08 UTC (permalink / raw) To: Andrew Morton; +Cc: Shaohua Li, linux-mm, aarcange On Tue, Sep 11, 2012 at 04:34:55PM -0700, Andrew Morton wrote: > Alas, try_to_compact_pages()'s kerneldoc altogether forgets to describe > this argument. Mel's > mm-compaction-capture-a-suitable-high-order-page-immediately-when-it-is-made-available.patch > adds a `pages' arg and forgets to document that as well. > *slaps* This covers both of them. ---8<--- mm: compaction: Update try_to_compact_pages kernel doc comment Parameters were added without documentation, tut tut. Signed-off-by: Mel Gorman <mgorman@suse.de> --- mm/compaction.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/compaction.c b/mm/compaction.c index 364e12f..614f18b 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -919,6 +919,8 @@ int sysctl_extfrag_threshold = 500; * @gfp_mask: The GFP mask of the current allocation * @nodemask: The allowed nodes to allocate from * @sync: Whether migration is synchronous or not + * @contended: Return value that is true if compaction was aborted due to lock contention + * @page: Optionally capture a free page of the requested order during compaction * * This is the main entry point for direct page compaction. */ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 18+ messages in thread
end of thread, other threads:[~2012-09-13 17:11 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-09-10 1:18 [patch 1/2 v2]compaction: abort compaction loop if lock is contended or run too long Shaohua Li 2012-09-10 8:11 ` Mel Gorman 2012-09-11 1:45 ` Minchan Kim 2012-09-11 8:29 ` Shaohua Li 2012-09-11 8:40 ` Minchan Kim 2012-09-11 23:34 ` Andrew Morton 2012-09-12 0:48 ` Andrea Arcangeli 2012-09-12 21:20 ` Andrew Morton 2012-09-12 23:48 ` Andrea Arcangeli 2012-09-13 0:47 ` Minchan Kim 2012-09-13 2:49 ` Shaohua Li 2012-09-13 9:38 ` Mel Gorman 2012-09-13 10:13 ` Shaohua Li 2012-09-13 16:04 ` Andrea Arcangeli 2012-09-13 16:31 ` Mel Gorman 2012-09-13 17:11 ` Andrea Arcangeli 2012-09-13 10:03 ` Mel Gorman 2012-09-12 11:08 ` Mel Gorman
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).