linux-mm.kvack.org archive mirror
* [PATCH 00/25] Cleanup and optimise the page allocator V6
@ 2009-04-20 22:19 Mel Gorman
  2009-04-20 22:19 ` [PATCH 01/25] Replace __alloc_pages_internal() with __alloc_pages_nodemask() Mel Gorman
                   ` (25 more replies)
  0 siblings, 26 replies; 105+ messages in thread
From: Mel Gorman @ 2009-04-20 22:19 UTC
  To: Mel Gorman, Linux Memory Management List
  Cc: KOSAKI Motohiro, Christoph Lameter, Nick Piggin,
	Linux Kernel Mailing List, Lin Ming, Zhang Yanmin, Peter Zijlstra,
	Pekka Enberg, Andrew Morton

Here is V6 of the cleanup and optimisation of the page allocator and it
should be ready for wider testing. Please consider it for merging as
Pass 1 at making the page allocator faster. Other passes
will occur later when this one has had a bit of exercise. This patchset
is based on mmotm-2009-04-17 but I haven't widely tested it myself due to
problems I'm encountering with the test grid I use (mostly unrelated to
the kernel). It doesn't apply cleanly to linux-next due to dependencies on
patches in -mm but the conflicts are fairly straight-forward to resolve.
I'm working on getting three local test machines built to test there but
it'll take a while and I wanted to get these patches out.

Hence, the following report is the same as in V5 and is based on an older
kernel. However, I expect similar results on a newer kernel.

======== Old Report ========

Performance is improved in a variety of cases but note it's not universal due
to lock contention which I'll explain later. Text is reduced by 497 bytes on
the x86-64 config I checked. 18.78% fewer clock cycles were sampled in the page
allocator paths, excluding zeroing which is roughly the same in either kernel.
Cache misses incurred within the allocator itself are also reduced: L1 cache
misses by about 7.36% and L2 cache misses by 17.91%.

Lock contention on some machines goes up for the zone->lru_lock
and zone->lock locks, which can regress some workloads even though others on
the same machine still go faster. For netperf, a lock called slock-AF_INET
seemed very important although I didn't look too closely other than noting
contention went up. The zone->lock gets hammered a lot by high-order allocs
and frees coming from SLUB which are not covered by the PCP allocator in
this patchset. Why contention on zone->lru_lock goes up is less clear as it's
page cache releases that take it, but overall contention may be up because
CPUs are spending less time with interrupts disabled and more time trying to
do real work but contending on the locks.

============

Changes since V5
  o Rebase to mmotm-2009-04-17

Changes since V4
  o Drop the more controversial patches for now and focus on the "obvious win"
    material.
  o Add reviewed-by notes
  o Fix changelog entry to say __rmqueue_fallback instead of __rmqueue
  o Add unlikely() for the clearMlocked check
  o Change where PGFREE is accounted in free_hot_cold_page() to have symmetry
    with __free_pages_ok()
  o Convert num_online_nodes() to use a static value so that callers do
    not have to be individually updated
  o Rebase to mmotm-2009-03-13
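
As a rough illustration of the num_online_nodes() item above, the sketch
below shows the general idea of replacing a per-call recount with a cached
value that is only refreshed when the set of online nodes changes. It is a
user-space model, not the patch itself: node_online_sketch,
nr_online_nodes_sketch, set_node_online_sketch() and
single_node_fast_check() are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical upper bound on node IDs, for this sketch only. */
#define MAX_NODES_SKETCH 8

/* Stand-in for the online-node bitmap. */
static bool node_online_sketch[MAX_NODES_SKETCH];

/* Cached count, refreshed only when a node changes state. */
static int nr_online_nodes_sketch;

/* Slow path: walks every possible node to count the online ones. */
static int count_online_nodes_slow(void)
{
    int n = 0;

    for (int i = 0; i < MAX_NODES_SKETCH; i++)
        n += node_online_sketch[i] ? 1 : 0;
    return n;
}

/* Rare event: update the map, then refresh the cached count once. */
static void set_node_online_sketch(int nid, bool online)
{
    assert(nid >= 0 && nid < MAX_NODES_SKETCH);
    node_online_sketch[nid] = online;
    nr_online_nodes_sketch = count_online_nodes_slow();
}

/* Fast path: a single load and compare, no walk over the map. */
static bool single_node_fast_check(void)
{
    return nr_online_nodes_sketch == 1;
}
```

The trade-off is the usual one for cached values: every writer of the map
has to go through the update helper, in exchange for readers in the hot path
paying only for a plain load.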

Changes since V3
  o Drop the more controversial patches for now and focus on the "obvious win"
    material
  o Add reviewed-by notes
  o Fix changelog entry to say __rmqueue_fallback instead of __rmqueue
  o Add unlikely() for the clearMlocked check
  o Change where PGFREE is accounted in free_hot_cold_page() to have symmetry
    with __free_pages_ok()

Changes since V2
  o Remove branches by treating watermark flags as array indices
  o Remove branch by assuming __GFP_HIGH == ALLOC_HIGH
  o Do not check for compound on every page free
  o Remove branch by always ensuring the migratetype is known on free
  o Simplify buffered_rmqueue further
  o Reintroduce improved version of batched bulk free of pcp pages
  o Use allocation flags as an index to zone watermarks
  o Work out __GFP_COLD only once
  o Reduce the number of times zone stats are updated
  o Do not dump reserve pages back into the allocator. Instead treat them
    as MOVABLE so that MIGRATE_RESERVE gets used on the max-order-overlapped
    boundaries without causing trouble
  o Allow pages up to PAGE_ALLOC_COSTLY_ORDER to use the per-cpu allocator.
    order-1 allocations in particular are frequent enough to justify this
  o Rearrange inlining such that the hot-path is inlined but not in a way
    that increases the text size of the page allocator
  o Make the check for needing additional zonelist filtering due to NUMA
    or cpusets as light as possible
  o Do not destroy compound pages going to the PCP lists
  o Delay the merging of buddies until a high-order allocation needs them
    or anti-fragmentation is being forced to fall back
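
Two of the items above (treating watermark flags as array indices, and
assuming __GFP_HIGH == ALLOC_HIGH) can be sketched in user space as follows.
This is a minimal model under stated assumptions rather than the code in the
patches: the flag values, struct zone_sketch, GFP_HIGH_SKETCH,
zone_watermark_sketch() and alloc_high_from_gfp() are all hypothetical and
chosen only so the example is self-consistent.

```c
#include <assert.h>

/* Watermark slots; the low bits of alloc_flags select one directly. */
enum wmark_sketch { WMARK_MIN, WMARK_LOW, WMARK_HIGH, NR_WMARK };

/* Hypothetical flag layout for this sketch only. */
#define ALLOC_WMARK_MIN      WMARK_MIN
#define ALLOC_WMARK_LOW      WMARK_LOW
#define ALLOC_WMARK_HIGH     WMARK_HIGH
#define ALLOC_NO_WATERMARKS  0x04u
#define ALLOC_WMARK_MASK     (ALLOC_NO_WATERMARKS - 1)
#define ALLOC_HIGH           0x20u
#define GFP_HIGH_SKETCH      0x20u   /* stand-in for __GFP_HIGH */

struct zone_sketch {
    unsigned long watermark[NR_WMARK];
};

/* One masked array load replaces an if/else chain over three flags. */
static unsigned long zone_watermark_sketch(const struct zone_sketch *z,
                                           unsigned int alloc_flags)
{
    unsigned int idx = alloc_flags & ALLOC_WMARK_MASK;

    assert(idx < NR_WMARK);   /* index 3 has no watermark slot */
    return z->watermark[idx];
}

/* The trick is only valid while both flags share the same bit. */
_Static_assert(GFP_HIGH_SKETCH == ALLOC_HIGH,
               "branch-free conversion needs identical bit values");

/* Masking copies the bit across without testing it. */
static unsigned int alloc_high_from_gfp(unsigned int gfp_mask)
{
    return gfp_mask & GFP_HIGH_SKETCH;
}
```

The compile-time assertion is the important design choice here: the
branch-free conversion silently breaks if either flag is renumbered, so the
assumption is checked at build time instead of being left implicit.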

Changes since V1
  o Remove the ifdef CONFIG_CPUSETS from inside get_page_from_freelist()
  o Use non-lock bit operations for clearing the mlock flag
  o Factor out alloc_flags calculation so it is only done once (Peter)
  o Make gfp.h a bit prettier and clear-cut (Peter)
  o Instead of deleting a debugging check, replace page_count() in the
    free path with a version that does not check for compound pages (Nick)
  o Drop the alteration for hot/cold page freeing until we know if it
    helps or not
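
The "non-lock bit operations" item above relies on a general distinction
that can be shown in user space with C11 atomics. The sketch below contrasts
an atomic test-and-clear with a plain one; the plain version is only safe
when the caller has exclusive ownership of the flags word, which is the
assumption being made for a page that is on its way to being freed.
PG_MLOCKED_SKETCH and both helper names are hypothetical and do not
correspond to kernel symbols.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bit for this sketch only. */
#define PG_MLOCKED_SKETCH 0x1UL

/* Atomic read-modify-write: safe against concurrent updaters, but costs
 * a locked or otherwise serialising instruction on most CPUs. */
static bool test_and_clear_atomic(atomic_ulong *flags, unsigned long bit)
{
    unsigned long old = atomic_fetch_and(flags, ~bit);

    return (old & bit) != 0;
}

/* Plain load and store: cheaper, but correct only if nobody else can
 * modify *flags at the same time. */
static bool test_and_clear_plain(unsigned long *flags, unsigned long bit)
{
    bool was_set;

    assert(bit != 0);
    was_set = (*flags & bit) != 0;
    *flags &= ~bit;
    return was_set;
}
```

Both helpers return whether the bit was previously set, so a caller can
switch between them without changing its own logic.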

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

* [PATCH 00/25] Cleanup and optimise the page allocator V5
@ 2009-03-20 10:02 Mel Gorman
  2009-03-20 10:02 ` [PATCH 02/25] Do not sanity check order in the fast path Mel Gorman
  0 siblings, 1 reply; 105+ messages in thread
From: Mel Gorman @ 2009-03-20 10:02 UTC
  To: Mel Gorman, Linux Memory Management List
  Cc: KOSAKI Motohiro, Christoph Lameter, Nick Piggin,
	Linux Kernel Mailing List, Lin Ming, Zhang Yanmin, Peter Zijlstra,
	Andrew Morton

Here is V5 of the cleanup and optimisation of the page allocator and it should
be ready for wider testing. Please consider it for merging as
Pass 1 at making the page allocator faster. Other passes will occur later
when this one has had a bit of exercise. The patchset completed a series
of tests based on the latest MMOTM.

Performance is improved in a variety of cases but note it's not universal due
to lock contention which I'll explain later. Text is reduced by 497 bytes on
the x86-64 config I checked. 18.78% fewer clock cycles were sampled in the page
allocator paths, excluding zeroing which is roughly the same in either kernel.
Cache misses incurred within the allocator itself are also reduced: L1 cache
misses by about 7.36% and L2 cache misses by 17.91%.

Lock contention on some machines goes up for the zone->lru_lock
and zone->lock locks, which can regress some workloads even though others on
the same machine still go faster. For netperf, a lock called slock-AF_INET
seemed very important although I didn't look too closely other than noting
contention went up. The zone->lock gets hammered a lot by high-order allocs
and frees coming from SLUB which are not covered by the PCP allocator in
this patchset. Why contention on zone->lru_lock goes up is less clear as it's
page cache releases that take it, but overall contention may be up because
CPUs are spending less time with interrupts disabled and more time trying to
do real work but contending on the locks.

Changes since V4
  o Drop the more controversial patches for now and focus on the "obvious win"
    material.
  o Add reviewed-by notes
  o Fix changelog entry to say __rmqueue_fallback instead of __rmqueue
  o Add unlikely() for the clearMlocked check
  o Change where PGFREE is accounted in free_hot_cold_page() to have symmetry
    with __free_pages_ok()
  o Convert num_online_nodes() to use a static value so that callers do
    not have to be individually updated
  o Rebase to mmotm-2009-03-13

Changes since V3
  o Drop the more controversial patches for now and focus on the "obvious win"
    material
  o Add reviewed-by notes
  o Fix changelog entry to say __rmqueue_fallback instead of __rmqueue
  o Add unlikely() for the clearMlocked check
  o Change where PGFREE is accounted in free_hot_cold_page() to have symmetry
    with __free_pages_ok()

Changes since V2
  o Remove branches by treating watermark flags as array indices
  o Remove branch by assuming __GFP_HIGH == ALLOC_HIGH
  o Do not check for compound on every page free
  o Remove branch by always ensuring the migratetype is known on free
  o Simplify buffered_rmqueue further
  o Reintroduce improved version of batched bulk free of pcp pages
  o Use allocation flags as an index to zone watermarks
  o Work out __GFP_COLD only once
  o Reduce the number of times zone stats are updated
  o Do not dump reserve pages back into the allocator. Instead treat them
    as MOVABLE so that MIGRATE_RESERVE gets used on the max-order-overlapped
    boundaries without causing trouble
  o Allow pages up to PAGE_ALLOC_COSTLY_ORDER to use the per-cpu allocator.
    order-1 allocations in particular are frequent enough to justify this
  o Rearrange inlining such that the hot-path is inlined but not in a way
    that increases the text size of the page allocator
  o Make the check for needing additional zonelist filtering due to NUMA
    or cpusets as light as possible
  o Do not destroy compound pages going to the PCP lists
  o Delay the merging of buddies until a high-order allocation needs them
    or anti-fragmentation is being forced to fall back

Changes since V1
  o Remove the ifdef CONFIG_CPUSETS from inside get_page_from_freelist()
  o Use non-lock bit operations for clearing the mlock flag
  o Factor out alloc_flags calculation so it is only done once (Peter)
  o Make gfp.h a bit prettier and clear-cut (Peter)
  o Instead of deleting a debugging check, replace page_count() in the
    free path with a version that does not check for compound pages (Nick)
  o Drop the alteration for hot/cold page freeing until we know if it
    helps or not



Thread overview: 105+ messages
2009-04-20 22:19 [PATCH 00/25] Cleanup and optimise the page allocator V6 Mel Gorman
2009-04-20 22:19 ` [PATCH 01/25] Replace __alloc_pages_internal() with __alloc_pages_nodemask() Mel Gorman
2009-04-21  1:44   ` KOSAKI Motohiro
2009-04-21  5:55   ` Pekka Enberg
2009-04-20 22:19 ` [PATCH 02/25] Do not sanity check order in the fast path Mel Gorman
2009-04-21  1:45   ` KOSAKI Motohiro
2009-04-21  5:55   ` Pekka Enberg
2009-04-20 22:19 ` [PATCH 03/25] Do not check NUMA node ID when the caller knows the node is valid Mel Gorman
2009-04-21  2:44   ` KOSAKI Motohiro
2009-04-21  6:00   ` Pekka Enberg
2009-04-21  6:33   ` Paul Mundt
2009-04-20 22:19 ` [PATCH 04/25] Check only once if the zonelist is suitable for the allocation Mel Gorman
2009-04-21  3:03   ` KOSAKI Motohiro
2009-04-21  7:09   ` Pekka Enberg
2009-04-20 22:19 ` [PATCH 05/25] Break up the allocator entry point into fast and slow paths Mel Gorman
2009-04-21  6:35   ` KOSAKI Motohiro
2009-04-21  7:13     ` Pekka Enberg
2009-04-21  9:30       ` Mel Gorman
2009-04-21  9:29     ` Mel Gorman
2009-04-21 10:44       ` KOSAKI Motohiro
2009-04-20 22:19 ` [PATCH 06/25] Move check for disabled anti-fragmentation out of fastpath Mel Gorman
2009-04-21  6:37   ` KOSAKI Motohiro
2009-04-20 22:19 ` [PATCH 07/25] Check in advance if the zonelist needs additional filtering Mel Gorman
2009-04-21  6:52   ` KOSAKI Motohiro
2009-04-21  9:47     ` Mel Gorman
2009-04-21  7:21   ` Pekka Enberg
2009-04-21  9:49     ` Mel Gorman
2009-04-20 22:19 ` [PATCH 08/25] Calculate the preferred zone for allocation only once Mel Gorman
2009-04-21  7:03   ` KOSAKI Motohiro
2009-04-21  8:23     ` Mel Gorman
2009-04-21  7:37   ` Pekka Enberg
2009-04-21  8:27     ` Mel Gorman
2009-04-21  8:29       ` Pekka Enberg
2009-04-20 22:19 ` [PATCH 09/25] Calculate the migratetype " Mel Gorman
2009-04-21  7:37   ` KOSAKI Motohiro
2009-04-21  8:35     ` Mel Gorman
2009-04-21 10:19       ` KOSAKI Motohiro
2009-04-21 10:30         ` Mel Gorman
2009-04-20 22:19 ` [PATCH 10/25] Calculate the alloc_flags " Mel Gorman
2009-04-21  9:03   ` KOSAKI Motohiro
2009-04-21 10:05     ` Mel Gorman
2009-04-21 10:12       ` KOSAKI Motohiro
2009-04-21 10:37         ` Mel Gorman
2009-04-21 10:40           ` KOSAKI Motohiro
2009-04-20 22:19 ` [PATCH 11/25] Calculate the cold parameter " Mel Gorman
2009-04-21  7:43   ` Pekka Enberg
2009-04-21  8:41     ` Mel Gorman
2009-04-21  9:07   ` KOSAKI Motohiro
2009-04-21 10:08     ` Mel Gorman
2009-04-21 14:59     ` Christoph Lameter
2009-04-21 14:58   ` Christoph Lameter
2009-04-20 22:19 ` [PATCH 12/25] Remove a branch by assuming __GFP_HIGH == ALLOC_HIGH Mel Gorman
2009-04-21  7:46   ` Pekka Enberg
2009-04-21  8:45     ` Mel Gorman
2009-04-21 10:25       ` Pekka Enberg
2009-04-21  9:08   ` KOSAKI Motohiro
2009-04-21 10:31     ` KOSAKI Motohiro
2009-04-21 10:43       ` Mel Gorman
2009-04-20 22:19 ` [PATCH 13/25] Inline __rmqueue_smallest() Mel Gorman
2009-04-21  7:58   ` Pekka Enberg
2009-04-21  8:48     ` Mel Gorman
2009-04-21  9:52   ` KOSAKI Motohiro
2009-04-21 10:11     ` Mel Gorman
2009-04-21 10:22       ` KOSAKI Motohiro
2009-04-20 22:20 ` [PATCH 14/25] Inline buffered_rmqueue() Mel Gorman
2009-04-21  9:56   ` KOSAKI Motohiro
2009-04-20 22:20 ` [PATCH 15/25] Inline __rmqueue_fallback() Mel Gorman
2009-04-21  9:56   ` KOSAKI Motohiro
2009-04-20 22:20 ` [PATCH 16/25] Save text by reducing call sites of __rmqueue() Mel Gorman
2009-04-21 10:47   ` KOSAKI Motohiro
2009-04-20 22:20 ` [PATCH 17/25] Do not call get_pageblock_migratetype() more than necessary Mel Gorman
2009-04-21 11:03   ` KOSAKI Motohiro
2009-04-21 16:12     ` Mel Gorman
2009-04-22  2:25       ` KOSAKI Motohiro
2009-04-20 22:20 ` [PATCH 18/25] Do not disable interrupts in free_page_mlock() Mel Gorman
2009-04-21  7:55   ` Pekka Enberg
2009-04-21  8:50     ` Mel Gorman
2009-04-21 15:05       ` Christoph Lameter
2009-04-22  0:13   ` KOSAKI Motohiro
2009-04-22 14:43     ` Lee Schermerhorn
2009-04-20 22:20 ` [PATCH 19/25] Do not setup zonelist cache when there is only one node Mel Gorman
2009-04-20 22:20 ` [PATCH 20/25] Do not check for compound pages during the page allocator sanity checks Mel Gorman
2009-04-22  0:20   ` KOSAKI Motohiro
2009-04-22 10:09     ` Mel Gorman
2009-04-22 10:41       ` KOSAKI Motohiro
2009-04-20 22:20 ` [PATCH 21/25] Use allocation flags as an index to the zone watermark Mel Gorman
2009-04-22  0:26   ` KOSAKI Motohiro
2009-04-22  0:41     ` David Rientjes
2009-04-22 10:21     ` Mel Gorman
2009-04-22 10:23       ` Mel Gorman
2009-04-20 22:20 ` [PATCH 22/25] Update NR_FREE_PAGES only as necessary Mel Gorman
2009-04-22  0:35   ` KOSAKI Motohiro
2009-04-20 22:20 ` [PATCH 23/25] Get the pageblock migratetype without disabling interrupts Mel Gorman
2009-04-20 22:20 ` [PATCH 24/25] Re-sort GFP flags and fix whitespace alignment for easier reading Mel Gorman
2009-04-21  8:04   ` Pekka Enberg
2009-04-21  8:52     ` Mel Gorman
2009-04-21 15:08       ` Christoph Lameter
2009-04-21 15:24         ` Mel Gorman
2009-04-20 22:20 ` [PATCH 25/25] Use a pre-calculated value instead of num_online_nodes() in fast paths Mel Gorman
2009-04-21  8:08   ` Pekka Enberg
2009-04-21  9:01     ` Mel Gorman
2009-04-21 15:09       ` Christoph Lameter
2009-04-21  8:13 ` [PATCH 00/25] Cleanup and optimise the page allocator V6 Pekka Enberg
2009-04-22 14:13   ` Mel Gorman
  -- strict thread matches above, loose matches on Subject: below --
2009-03-20 10:02 [PATCH 00/25] Cleanup and optimise the page allocator V5 Mel Gorman
2009-03-20 10:02 ` [PATCH 02/25] Do not sanity check order in the fast path Mel Gorman
