From: Andrea Arcangeli <aarcange@redhat.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/8] mm: migration: Allow migration to operate asynchronously and avoid synchronous compaction in the faster path
Date: Thu, 18 Nov 2010 20:00:23 +0100
Message-ID: <20101118190023.GE30376@random.random>
In-Reply-To: <20101118183437.GP8135@csn.ul.ie>

On Thu, Nov 18, 2010 at 06:34:38PM +0000, Mel Gorman wrote:
> On Thu, Nov 18, 2010 at 07:21:06PM +0100, Andrea Arcangeli wrote:
> > On Wed, Nov 17, 2010 at 04:22:45PM +0000, Mel Gorman wrote:
> > > @@ -484,6 +486,7 @@ static unsigned long compact_zone_order(struct zone *zone,
> > >  		.order = order,
> > >  		.migratetype = allocflags_to_migratetype(gfp_mask),
> > >  		.zone = zone,
> > > +		.sync = false,
> > >  	};
> > >  	INIT_LIST_HEAD(&cc.freepages);
> > >  	INIT_LIST_HEAD(&cc.migratepages);
> > 
> > I like this because I'm very keen to avoid wait-I/O latencies
> > being introduced into hugepage allocations, which I prefer to fail
> > quickly and be handled later by khugepaged ;).
> > 
> 
> As you can see from the graphs in the leader, it makes a big difference to
> latency as well to avoid sync migration where possible.

Yep, amazing benchmarking work you did! Great job indeed.

I thought of this sync wait in migrate as being troublesome myself a
few days ago, while reviewing the btrfs migration bug that I helped
track down this week. It triggered only with THP because THP exercises
compaction, and in turn migration, more often than upstream: it's rare
to get any order > 4 allocation with upstream that would exercise
compaction and trip on the btrfs fs corruption. As I expected, it
really had nothing to do with THP itself.

> We could pass gfp flags in I guess and abuse __GFP_NO_KSWAPD (from the THP
> series obviously)?

That would work for me... :)

> Yes, it's the "slower" path where we've already reclaimed pages and are
> more willing to wait for the compaction to occur as the alternative is
> failing the allocation.

I've noticed, which is why I think it's equivalent to invoking the
second try_to_compact_pages with (fast_scan=false, sync=true) (and the
first of course with (fast_scan=true, sync=false)).

> I'll think about it more. I could just leave it at try_to_compact_pages
> doing the zonelist scan although it's not immediately occurring to me how I
> should decide between sync and async other than "async the first time and
> sync after that". The allocator does not have the same "reclaim priority"
> awareness that reclaim does.

I think the "migrate async & fast scan first, migrate sync and full
scan later" is a simpler heuristic we can use, and I expect it to work
just as well (if not better).
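
The two-pass heuristic can be sketched in userspace C as follows. This
is only an illustration, not kernel code: struct compact_mode, the
compact_fn callback type, compact_two_pass() and the stub_* helpers are
all hypothetical names; only the fast_scan/sync pairing comes from the
discussion above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical knobs named after the thread; not an existing kernel structure. */
struct compact_mode {
	bool fast_scan;	/* true: scan only part of the zone */
	bool sync;	/* true: migration may wait on I/O */
};

/* Hypothetical callback: returns true if a free page of 'order' resulted. */
typedef bool (*compact_fn)(unsigned int order, struct compact_mode mode);

/*
 * Run the cheap pass first and the expensive pass only if it fails.
 * Returns 1 or 2 for the pass that succeeded, 0 if both failed.
 */
int compact_two_pass(unsigned int order, compact_fn compact)
{
	const struct compact_mode passes[] = {
		{ .fast_scan = true,  .sync = false },	/* first try */
		{ .fast_scan = false, .sync = true  },	/* second try */
	};
	size_t i;

	for (i = 0; i < sizeof(passes) / sizeof(passes[0]); i++)
		if (compact(order, passes[i]))
			return (int)i + 1;
	return 0;
}

/* Stand-in compactors for trying the helper out; they do no real work. */
bool stub_async_ok(unsigned int order, struct compact_mode mode)
{
	(void)order;
	(void)mode;
	return true;		/* succeeds in any mode */
}

bool stub_needs_sync(unsigned int order, struct compact_mode mode)
{
	(void)order;
	return mode.sync;	/* succeeds only when allowed to wait */
}

bool stub_never(unsigned int order, struct compact_mode mode)
{
	(void)order;
	(void)mode;
	return false;		/* the allocation would fail */
}
```

With these stubs, a compactor that only succeeds synchronously is
reached on the second pass, and the expensive mode is never entered
when the async fast scan already works.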

I'm undecided whether it's worth running the hugepage page fault with
"async & fast scan always", either by abusing __GFP_NO_KSWAPD or by
adding a __GFP_COMPACT_FAST. Of course it would mostly make a
difference only if the hugepage allocation has to fail often (like 95%
of ram in hugepages with slab spread over 10% of ram), so that is a
corner case that not everybody experiences... Probably not worth it.
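
As a rough sketch of what keying the behaviour off a gfp flag could
look like: the bit values below are made up for the example and do not
match the kernel's gfp layout, SKETCH_GFP_COMPACT_FAST stands in for
the merely proposed __GFP_COMPACT_FAST, and may_retry_sync_full() is a
hypothetical helper. Either flag is treated as the "async & fast scan
always" marker, since the two are discussed as alternatives.

```c
#include <stdbool.h>

/* Illustrative bit values only; not the kernel's gfp flag layout. */
#define SKETCH_GFP_NO_KSWAPD	0x1u	/* stands in for __GFP_NO_KSWAPD */
#define SKETCH_GFP_COMPACT_FAST	0x2u	/* stands in for the proposed flag */

/*
 * Returns true if the caller may fall back to the slow
 * (sync migration, full scan) compaction pass after the fast one fails.
 */
bool may_retry_sync_full(unsigned int gfp_mask)
{
	return !(gfp_mask & (SKETCH_GFP_NO_KSWAPD | SKETCH_GFP_COMPACT_FAST));
}
```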

Increasing nr_to_reclaim to 1<<order only when the compaction_suitable
checks are not satisfied and compaction becomes a noop may also be
worth investigating (as long as there are enough cond_resched() calls
inside those loops ;). But hey, I'm not sure if it's really needed...
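
A minimal sketch of that idea, under stated assumptions: the baseline
of 32 pages is a value chosen for the example (the kernel's
SWAP_CLUSTER_MAX plays that role in reclaim), pick_nr_to_reclaim() is a
hypothetical helper, and the boolean argument stands in for the result
of the compaction_suitable checks.

```c
#include <stdbool.h>

/* Assumed baseline batch size for this sketch. */
#define SKETCH_BASE_RECLAIM	32UL

/*
 * Reclaim the usual batch while compaction can make progress; ask for
 * 1 << order pages only when compaction would be a noop, and never go
 * below the baseline.
 */
unsigned long pick_nr_to_reclaim(unsigned int order, bool compaction_is_suitable)
{
	unsigned long wanted = 1UL << order;

	if (compaction_is_suitable || wanted < SKETCH_BASE_RECLAIM)
		return SKETCH_BASE_RECLAIM;
	return wanted;
}
```

For an order-9 request this asks for 512 pages only while compaction
cannot run, and falls back to the baseline batch otherwise.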

