From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx195.postini.com [74.125.245.195]) by kanga.kvack.org (Postfix) with SMTP id 271AF6B0062 for ; Mon, 22 Oct 2012 04:07:37 -0400 (EDT) From: Mel Gorman Subject: [RFC PATCH 0/5] vmstats for compaction, migration and autonuma Date: Mon, 22 Oct 2012 08:59:46 +0100 Message-Id: <1350892791-2682-1-git-send-email-mgorman@suse.de> Sender: owner-linux-mm@kvack.org List-ID: To: Linux-MM Cc: Peter Zijlstra , Andrea Arcangeli , Rik van Riel , Mel Gorman , LKML I'm travelling for a conference at the moment so these patches are not tested but with the ongoing NUMA migration work I figured it was best to post these sooner rather than later. This series adds vmstat counters and tracepoints for migration, compaction and autonuma. Using them it's possible to create a basic cost model to estimate the overhead due to compaction or autonuma. Using the stats it is also possible to measure if a workload is converging on autonuma or not and potentially measure how quickly it is converging. Ideally the same stats would be available for schednuma but I did not review the series when it was last posted in July and had not seen a recent posting. I only recently heard they were in the -tip tree but will not get the chance to look at them until I've finished travelling in a week's time. If schednuma had similar stats it would then be possible to compare schednuma and autonuma in terms of how quickly a workload converges with either approach.
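As a rough illustration of the "basic cost model" mentioned above, the compaction formula given in patch 3/5 of this series could be evaluated in userspace along the following lines. This is only a sketch and not part of the series. The counter names are the ones the series exports to /proc/vmstat. The helper names (`parse_vmstat`, `compaction_cost`) are made up for the example, and the default sizes (64-byte struct page, 4096-byte page, 8-byte word) and the value of Wi are assumptions that would need adjusting for a real machine.

```python
# Sketch of the compaction cost model from patch 3/5.
# Assumptions (not from the series): struct page is 64 bytes, PAGE_SIZE is
# 4096, a word is 8 bytes and Wi (locking weight) is 10. Tune per machine.


def parse_vmstat(text):
    """Turn '/proc/vmstat'-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def compaction_cost(counters, struct_page_size=64, page_size=4096,
                    word_size=8, wi=10):
    """Evaluate the overall compaction cost in units of one word access."""
    u = word_size
    ca = struct_page_size / u        # Ca:  cost of struct page access
    cmc = (ca + page_size / u) * 2   # Cmc: cost of a migrate page copy
    cmf = ca * 2                     # Cmf: cost of a migrate failure
    ci = ca + wi                     # Ci:  cost of page isolation
    csm = ca                         # Csm: cost of migrate scanning
    csf = ca                         # Csf: cost of free scanning
    return (csm * counters.get("compact_migrate_scanned", 0)
            + csf * counters.get("compact_free_scanned", 0)
            + ci * counters.get("compact_isolated", 0)
            + cmc * counters.get("pgmigrate_success", 0)
            + cmf * counters.get("pgmigrate_fail", 0))
```

Feeding it the contents of /proc/vmstat read before and after a test run and subtracting the two results gives the estimated cost of that run. Note that pgmigrate_success and pgmigrate_fail count all page migration, not just migration on behalf of compaction, so the estimate is only meaningful when compaction is the dominant user of migration.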
include/linux/migrate.h | 14 +++++++++- include/linux/vm_event_item.h | 12 ++++++++- include/trace/events/migrate.h | 52 ++++++++++++++++++++++++++++++++++++++++ mm/autonuma.c | 22 +++++++++++++---- mm/compaction.c | 15 +++++++---- mm/memory-failure.c | 3 +- mm/memory_hotplug.c | 3 +- mm/mempolicy.c | 6 +++- mm/migrate.c | 16 ++++++++++- mm/page_alloc.c | 3 +- mm/vmstat.c | 16 ++++++++++-- 11 files changed, 139 insertions(+), 23 deletions(-) create mode 100644 include/trace/events/migrate.h -- 1.7.7 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx119.postini.com [74.125.245.119]) by kanga.kvack.org (Postfix) with SMTP id 18C596B0062 for ; Mon, 22 Oct 2012 04:09:14 -0400 (EDT) From: Mel Gorman Subject: [PATCH 1/5] mm: compaction: Move migration fail/success stats to migrate.c Date: Mon, 22 Oct 2012 08:59:47 +0100 Message-Id: <1350892791-2682-2-git-send-email-mgorman@suse.de> In-Reply-To: <1350892791-2682-1-git-send-email-mgorman@suse.de> References: <1350892791-2682-1-git-send-email-mgorman@suse.de> Sender: owner-linux-mm@kvack.org List-ID: To: Linux-MM Cc: Peter Zijlstra , Andrea Arcangeli , Rik van Riel , Mel Gorman , LKML The compact_pages_moved and compact_pagemigrate_failed events are convenient for determining if compaction is active and to what degree migration is succeeding but it's at the wrong level. Other users of migration may also want to know if migration is working properly and this will be particularly true for any automated NUMA migration. This patch moves the counters down to migration with the new events called pgmigrate_success and pgmigrate_fail. The compact_blocks_moved counter is removed because while it was useful for debugging initially, it's worthless now as no meaningful conclusions can be drawn from its value. 
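With the counters at the migration level, a quick check of whether migration is working properly could look something like the sketch below. It is illustrative only: `migration_delta` is a hypothetical helper, and it assumes the caller has already parsed two /proc/vmstat snapshots into dictionaries mapping counter names to integers.

```python
def migration_delta(before, after):
    """Return (succeeded, failed, failure_ratio) between two snapshots.

    'before' and 'after' are dicts of vmstat counters. The ratio is the
    fraction of migration attempts in the interval that failed, or 0.0 if
    no pages were migrated at all.
    """
    succeeded = (after.get("pgmigrate_success", 0)
                 - before.get("pgmigrate_success", 0))
    failed = after.get("pgmigrate_fail", 0) - before.get("pgmigrate_fail", 0)
    total = succeeded + failed
    ratio = failed / total if total else 0.0
    return succeeded, failed, ratio
```

Because the counters are cumulative since boot, working on the difference between two snapshots is what makes the numbers attributable to a particular workload or test.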
Signed-off-by: Mel Gorman --- include/linux/vm_event_item.h | 4 +++- mm/compaction.c | 4 ---- mm/migrate.c | 6 ++++++ mm/vmstat.c | 7 ++++--- 4 files changed, 13 insertions(+), 8 deletions(-) diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h index 57f7b10..5ce5c5f 100644 --- a/include/linux/vm_event_item.h +++ b/include/linux/vm_event_item.h @@ -38,8 +38,10 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT, KSWAPD_LOW_WMARK_HIT_QUICKLY, KSWAPD_HIGH_WMARK_HIT_QUICKLY, KSWAPD_SKIP_CONGESTION_WAIT, PAGEOUTRUN, ALLOCSTALL, PGROTATED, +#ifdef CONFIG_MIGRATION + PGMIGRATE_SUCCESS, PGMIGRATE_FAIL, +#endif #ifdef CONFIG_COMPACTION - COMPACTBLOCKS, COMPACTPAGES, COMPACTPAGEFAILED, COMPACTSTALL, COMPACTFAIL, COMPACTSUCCESS, #endif #ifdef CONFIG_HUGETLB_PAGE diff --git a/mm/compaction.c b/mm/compaction.c index 7fcd3a5..8c1a53a 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -801,10 +801,6 @@ static int compact_zone(struct zone *zone, struct compact_control *cc) update_nr_listpages(cc); nr_remaining = cc->nr_migratepages; - count_vm_event(COMPACTBLOCKS); - count_vm_events(COMPACTPAGES, nr_migrate - nr_remaining); - if (nr_remaining) - count_vm_events(COMPACTPAGEFAILED, nr_remaining); trace_mm_compaction_migratepages(nr_migrate - nr_remaining, nr_remaining); diff --git a/mm/migrate.c b/mm/migrate.c index 77ed2d7..04687f6 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -962,6 +962,7 @@ int migrate_pages(struct list_head *from, { int retry = 1; int nr_failed = 0; + int nr_succeeded = 0; int pass = 0; struct page *page; struct page *page2; @@ -988,6 +989,7 @@ int migrate_pages(struct list_head *from, retry++; break; case 0: + nr_succeeded++; break; default: /* Permanent failure */ @@ -998,6 +1000,10 @@ int migrate_pages(struct list_head *from, } rc = 0; out: + if (nr_succeeded) + count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded); + if (nr_failed) + count_vm_events(PGMIGRATE_FAIL, nr_failed); if (!swapwrite) current->flags &= ~PF_SWAPWRITE; diff 
--git a/mm/vmstat.c b/mm/vmstat.c index df7a674..4849241 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -761,10 +761,11 @@ const char * const vmstat_text[] = { "pgrotated", +#ifdef CONFIG_MIGRATION + "pgmigrate_success", + "pgmigrate_fail", +#endif #ifdef CONFIG_COMPACTION - "compact_blocks_moved", - "compact_pages_moved", - "compact_pagemigrate_failed", "compact_stall", "compact_fail", "compact_success", -- 1.7.7 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx166.postini.com [74.125.245.166]) by kanga.kvack.org (Postfix) with SMTP id 37B1F6B0069 for ; Mon, 22 Oct 2012 04:09:14 -0400 (EDT) From: Mel Gorman Subject: [PATCH 2/5] mm: migrate: Add a tracepoint for migrate_pages Date: Mon, 22 Oct 2012 08:59:48 +0100 Message-Id: <1350892791-2682-3-git-send-email-mgorman@suse.de> In-Reply-To: <1350892791-2682-1-git-send-email-mgorman@suse.de> References: <1350892791-2682-1-git-send-email-mgorman@suse.de> Sender: owner-linux-mm@kvack.org List-ID: To: Linux-MM Cc: Peter Zijlstra , Andrea Arcangeli , Rik van Riel , Mel Gorman , LKML The pgmigrate_success and pgmigrate_fail vmstat counters tell the user about migration activity but not the type or the reason. This patch adds a tracepoint to identify the type of page migration and why the page is being migrated.
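To give an idea of how the tracepoint output might be consumed, the sketch below tallies succeeded and failed pages per migration reason. It matches only the field layout from the TP_printk format string in this patch ("nr_succeeded=%lu nr_failed=%lu mode=%s reason=%s") and ignores whatever prefix the tracing infrastructure puts in front of it. The function name `tally_by_reason` is made up for the example.

```python
import re
from collections import defaultdict

# Field layout taken from the TP_printk() format of mm_migrate_pages.
EVENT_RE = re.compile(
    r"nr_succeeded=(\d+) nr_failed=(\d+) mode=(\S+) reason=(\S+)")


def tally_by_reason(lines):
    """Sum succeeded/failed page counts per migration reason.

    Lines that do not contain an mm_migrate_pages payload are skipped.
    Returns {reason: (total_succeeded, total_failed)}.
    """
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue
        succeeded, failed, _mode, reason = match.groups()
        totals[reason][0] += int(succeeded)
        totals[reason][1] += int(failed)
    return {reason: tuple(counts) for reason, counts in totals.items()}
```

The same approach extends to grouping by mode, which would show whether failures are concentrated in asynchronous migration.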
Signed-off-by: Mel Gorman --- include/linux/migrate.h | 13 ++++++++- include/trace/events/migrate.h | 51 ++++++++++++++++++++++++++++++++++++++++ mm/compaction.c | 3 +- mm/memory-failure.c | 3 +- mm/memory_hotplug.c | 3 +- mm/mempolicy.c | 6 +++- mm/migrate.c | 10 ++++++- mm/page_alloc.c | 3 +- 8 files changed, 82 insertions(+), 10 deletions(-) create mode 100644 include/trace/events/migrate.h diff --git a/include/linux/migrate.h b/include/linux/migrate.h index ce7e667..9d1c159 100644 --- a/include/linux/migrate.h +++ b/include/linux/migrate.h @@ -7,6 +7,15 @@ typedef struct page *new_page_t(struct page *, unsigned long private, int **); +enum migrate_reason { + MR_COMPACTION, + MR_MEMORY_FAILURE, + MR_MEMORY_HOTPLUG, + MR_SYSCALL, /* also applies to cpusets */ + MR_MEMPOLICY_MBIND, + MR_CMA +}; + #ifdef CONFIG_MIGRATION extern void putback_lru_pages(struct list_head *l); @@ -14,7 +23,7 @@ extern int migrate_page(struct address_space *, struct page *, struct page *, enum migrate_mode); extern int migrate_pages(struct list_head *l, new_page_t x, unsigned long private, bool offlining, - enum migrate_mode mode); + enum migrate_mode mode, int reason); extern int migrate_huge_page(struct page *, new_page_t x, unsigned long private, bool offlining, enum migrate_mode mode); @@ -35,7 +44,7 @@ extern int migrate_huge_page_move_mapping(struct address_space *mapping, static inline void putback_lru_pages(struct list_head *l) {} static inline int migrate_pages(struct list_head *l, new_page_t x, unsigned long private, bool offlining, - enum migrate_mode mode) { return -ENOSYS; } + enum migrate_mode mode, int reason) { return -ENOSYS; } static inline int migrate_huge_page(struct page *page, new_page_t x, unsigned long private, bool offlining, enum migrate_mode mode) { return -ENOSYS; } diff --git a/include/trace/events/migrate.h b/include/trace/events/migrate.h new file mode 100644 index 0000000..ec2a6cc --- /dev/null +++ b/include/trace/events/migrate.h @@ -0,0 +1,51 @@ +#undef 
TRACE_SYSTEM +#define TRACE_SYSTEM migrate + +#if !defined(_TRACE_MIGRATE_H) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_MIGRATE_H + +#define MIGRATE_MODE \ + {MIGRATE_ASYNC, "MIGRATE_ASYNC"}, \ + {MIGRATE_SYNC_LIGHT, "MIGRATE_SYNC_LIGHT"}, \ + {MIGRATE_SYNC, "MIGRATE_SYNC"} + +#define MIGRATE_REASON \ + {MR_COMPACTION, "compaction"}, \ + {MR_MEMORY_FAILURE, "memory_failure"}, \ + {MR_MEMORY_HOTPLUG, "memory_hotplug"}, \ + {MR_SYSCALL, "syscall_or_cpuset"}, \ + {MR_MEMPOLICY_MBIND, "mempolicy_mbind"}, \ + {MR_CMA, "cma"} + +TRACE_EVENT(mm_migrate_pages, + + TP_PROTO(unsigned long succeeded, unsigned long failed, + enum migrate_mode mode, int reason), + + TP_ARGS(succeeded, failed, mode, reason), + + TP_STRUCT__entry( + __field( unsigned long, succeeded) + __field( unsigned long, failed) + __field( enum migrate_mode, mode) + __field( int, reason) + ), + + TP_fast_assign( + __entry->succeeded = succeeded; + __entry->failed = failed; + __entry->mode = mode; + __entry->reason = reason; + ), + + TP_printk("nr_succeeded=%lu nr_failed=%lu mode=%s reason=%s", + __entry->succeeded, + __entry->failed, + __print_symbolic(__entry->mode, MIGRATE_MODE), + __print_symbolic(__entry->reason, MIGRATE_REASON)) +); + +#endif /* _TRACE_MIGRATE_H */ + +/* This part must be outside protection */ +#include diff --git a/mm/compaction.c b/mm/compaction.c index 8c1a53a..11b455b 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -797,7 +797,8 @@ static int compact_zone(struct zone *zone, struct compact_control *cc) nr_migrate = cc->nr_migratepages; err = migrate_pages(&cc->migratepages, compaction_alloc, (unsigned long)cc, false, - cc->sync ? MIGRATE_SYNC_LIGHT : MIGRATE_ASYNC); + cc->sync ? 
MIGRATE_SYNC_LIGHT : MIGRATE_ASYNC, + MR_COMPACTION); update_nr_listpages(cc); nr_remaining = cc->nr_migratepages; diff --git a/mm/memory-failure.c b/mm/memory-failure.c index a6e2141..9d4489c 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1556,7 +1556,8 @@ int soft_offline_page(struct page *page, int flags) page_is_file_cache(page)); list_add(&page->lru, &pagelist); ret = migrate_pages(&pagelist, new_page, MPOL_MF_MOVE_ALL, - false, MIGRATE_SYNC); + false, MIGRATE_SYNC, + MR_MEMORY_FAILURE); if (ret) { putback_lru_pages(&pagelist); pr_info("soft offline: %#lx: migration failed %d, type %lx\n", diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 6a5b90d..b299e83 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -815,7 +815,8 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn) } /* this function returns # of failed pages */ ret = migrate_pages(&source, hotremove_migrate_alloc, 0, - true, MIGRATE_SYNC); + true, MIGRATE_SYNC, + MR_MEMORY_HOTPLUG); if (ret) putback_lru_pages(&source); } diff --git a/mm/mempolicy.c b/mm/mempolicy.c index 5cffcb6..bd4fc4c 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -936,7 +936,8 @@ static int migrate_to_node(struct mm_struct *mm, int source, int dest, if (!list_empty(&pagelist)) { err = migrate_pages(&pagelist, new_node_page, dest, - false, MIGRATE_SYNC); + false, MIGRATE_SYNC, + MR_SYSCALL); if (err) putback_lru_pages(&pagelist); } @@ -1177,7 +1178,8 @@ static long do_mbind(unsigned long start, unsigned long len, if (!list_empty(&pagelist)) { nr_failed = migrate_pages(&pagelist, new_vma_page, (unsigned long)vma, - false, MIGRATE_SYNC); + false, MIGRATE_SYNC, + MR_MEMPOLICY_MBIND); if (nr_failed) putback_lru_pages(&pagelist); } diff --git a/mm/migrate.c b/mm/migrate.c index 04687f6..27be9c9 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -38,6 +38,9 @@ #include +#define CREATE_TRACE_POINTS +#include + #include "internal.h" /* @@ -958,7 +961,7 @@ out: */ int migrate_pages(struct 
list_head *from, new_page_t get_new_page, unsigned long private, bool offlining, - enum migrate_mode mode) + enum migrate_mode mode, int reason) { int retry = 1; int nr_failed = 0; @@ -1004,6 +1007,8 @@ out: count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded); if (nr_failed) count_vm_events(PGMIGRATE_FAIL, nr_failed); + trace_mm_migrate_pages(nr_succeeded, nr_failed, mode, reason); + if (!swapwrite) current->flags &= ~PF_SWAPWRITE; @@ -1145,7 +1150,8 @@ set_status: err = 0; if (!list_empty(&pagelist)) { err = migrate_pages(&pagelist, new_page_node, - (unsigned long)pm, 0, MIGRATE_SYNC); + (unsigned long)pm, 0, MIGRATE_SYNC, + MR_SYSCALL); if (err) putback_lru_pages(&pagelist); } diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 55df691..3d361f6 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5677,7 +5677,8 @@ static int __alloc_contig_migrate_range(unsigned long start, unsigned long end) ret = migrate_pages(&cc.migratepages, __alloc_contig_migrate_alloc, - 0, false, MIGRATE_SYNC); + 0, false, MIGRATE_SYNC, + MR_CMA); } putback_lru_pages(&cc.migratepages); -- 1.7.7 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx114.postini.com [74.125.245.114]) by kanga.kvack.org (Postfix) with SMTP id C7BC86B0071 for ; Mon, 22 Oct 2012 04:13:07 -0400 (EDT) Date: Mon, 22 Oct 2012 09:05:25 +0100 From: Mel Gorman Subject: [PATCH 3/5] mm: compaction: Add scanned and isolated counters for compaction Message-ID: <20121022080525.GB2198@suse.de> References: <1350892791-2682-1-git-send-email-mgorman@suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <1350892791-2682-1-git-send-email-mgorman@suse.de> Sender: owner-linux-mm@kvack.org List-ID: To: Linux-MM Cc: Peter Zijlstra , Andrea Arcangeli , Rik van Riel , LKML

Compaction already has tracepoints to count scanned and isolated pages
but they require that ftrace be enabled and, if that information has to
be written to disk, it can be disruptive. This patch adds vmstat counters
for compaction called compact_migrate_scanned, compact_free_scanned and
compact_isolated.

With these counters, it is possible to define a basic cost model for
compaction. This approximates how much work compaction is doing and can
be compared with an oprofile showing TLB misses to see if the cost of
compaction is being offset by THP, for example. Minimally a compaction
patch can be evaluated in terms of whether it increases or decreases cost.
The basic cost model looks like this

Fundamental unit u: a word = sizeof(void *)

Ca  = cost of struct page access = sizeof(struct page) / u

Cmc = Cost migrate page copy = (Ca + PAGE_SIZE/u) * 2
Cmf = Cost migrate failure   = Ca * 2
Ci  = Cost page isolation    = (Ca + Wi)
      where Wi is a constant that should reflect the approximate
      cost of the locking operation.

Csm = Cost migrate scanning = Ca
Csf = Cost free scanning    = Ca

Overall cost = (Csm * compact_migrate_scanned) +
               (Csf * compact_free_scanned)    +
               (Ci  * compact_isolated)        +
               (Cmc * pgmigrate_success)       +
               (Cmf * pgmigrate_fail)

Where the values are read from /proc/vmstat.

This is very basic and ignores certain costs such as the allocation cost
to do a migrate page copy but any improvement to the model would still
use the same vmstat counters.

Signed-off-by: Mel Gorman --- include/linux/vm_event_item.h | 2 ++ mm/compaction.c | 8 ++++++++ mm/vmstat.c | 3 +++ 3 files changed, 13 insertions(+), 0 deletions(-) diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h index 5ce5c5f..83ea0b6 100644 --- a/include/linux/vm_event_item.h +++ b/include/linux/vm_event_item.h @@ -42,6 +42,8 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT, PGMIGRATE_SUCCESS, PGMIGRATE_FAIL, #endif #ifdef CONFIG_COMPACTION + COMPACTMIGRATE_SCANNED, COMPACTFREE_SCANNED, + COMPACTISOLATED, COMPACTSTALL, COMPACTFAIL, COMPACTSUCCESS, #endif #ifdef CONFIG_HUGETLB_PAGE diff --git a/mm/compaction.c b/mm/compaction.c index 11b455b..8422dd4 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -143,6 +143,10 @@ static unsigned long isolate_freepages_block(unsigned long blockpfn, } trace_mm_compaction_isolate_freepages(nr_scanned, total_isolated); + count_vm_events(COMPACTFREE_SCANNED, nr_scanned); + if (total_isolated) + count_vm_events(COMPACTISOLATED, total_isolated); + return total_isolated; } @@ -402,6 +406,10 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc, trace_mm_compaction_isolate_migratepages(nr_scanned, nr_isolated); + count_vm_events(COMPACTMIGRATE_SCANNED, nr_scanned); + if (nr_isolated) + count_vm_events(COMPACTISOLATED, nr_isolated); + return low_pfn; } diff --git a/mm/vmstat.c b/mm/vmstat.c index 4849241..ab0b1b1 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -766,6 +766,9 @@ const char * const vmstat_text[] = { "pgmigrate_fail", #endif #ifdef
CONFIG_COMPACTION + "compact_migrate_scanned", + "compact_free_scanned", + "compact_isolated", "compact_stall", "compact_fail", "compact_success", -- 1.7.7 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx137.postini.com [74.125.245.137]) by kanga.kvack.org (Postfix) with SMTP id 05D276B0072 for ; Mon, 22 Oct 2012 04:13:46 -0400 (EDT) Date: Mon, 22 Oct 2012 09:06:16 +0100 From: Mel Gorman Subject: [PATCH 4/5] mm: autonuma: Add pte updates, hinting and migration stats for AutoNUMA Message-ID: <20121022080616.GC2198@suse.de> References: <1350892791-2682-1-git-send-email-mgorman@suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <1350892791-2682-1-git-send-email-mgorman@suse.de> Sender: owner-linux-mm@kvack.org List-ID: To: Linux-MM Cc: Peter Zijlstra , Andrea Arcangeli , Rik van Riel , LKML

The system CPU cost of AutoNUMA is known to be high but it is tricky to
quantify the cost in a meaningful manner. This patch adds some vmstats
that can be used as part of a basic costing model.

u    = basic unit = sizeof(void *)
Ca   = cost of struct page access = sizeof(struct page) / u
Cpte = Cost PTE access = Ca
Cupdate = Cost PTE update = (2 * Cpte) + (2 * Wlock)
      where Cpte is incurred twice for a read and a write and Wlock
      is a constant representing the cost of taking or releasing a
      lock
Cnumahint = Cost of a minor page fault = some high constant e.g. 1000
Cpagerw = Cost to read or write a full page = Ca + PAGE_SIZE/u
Ci = Cost of page isolation = Ca + Wi
      where Wi is a constant that should reflect the approximate cost
      of the locking operation
Cpagecopy = Cpagerw + (Cpagerw * Wnuma) + Ci + (Ci * Wnuma)
      where Wnuma is the approximate NUMA factor. 1 is local. 1.2
      would imply that remote accesses are 20% more expensive

AutoNUMA cost = Cpte * numa_pte_updates +
                Cnumahint * numa_hint_faults +
                Ci * numa_pages_migrated +
                Cpagecopy * numa_pages_migrated

Note that numa_pages_migrated is used as a measure of how many pages
were isolated even though it would miss pages that failed to migrate. A
vmstat counter could have been added for it but the isolation cost is
pretty marginal in comparison to the overall cost so it seemed overkill.

The ideal way to measure AutoNUMA benefit would be to count the number
of remote accesses versus local accesses and do something like

      benefit = (remote_accesses_before - remote_accesses_after) * Wnuma

but the information is not readily available. However, for two given
versions of AutoNUMA we can at least estimate if one is better than the
other in terms of convergence. As a workload converges, the expectation
would be that the number of remote numa hints would reduce to 0.

      convergence = numa_hint_faults_local / numa_hint_faults
      where this is measured for the last N number of
      numa hints recorded. When the workload is fully
      converged the value is 1.

This can measure if AutoNUMA is converging and how fast it is doing it.
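For illustration, the cost and convergence calculations above might be scripted roughly as follows. This is a sketch under stated assumptions, not part of the patch. The counter names are the vmstat names added here. The function names are hypothetical, and the defaults for the struct page size (64), PAGE_SIZE (4096), word size (8) and Wi (10) are assumed values. Wnuma = 1.2 and Cnumahint = 1000 are the example values from the description. Since "the last N numa hints" are not directly available from vmstat, convergence is approximated over the interval between two snapshots.

```python
def autonuma_cost(counters, struct_page_size=64, page_size=4096,
                  word_size=8, wi=10, wnuma=1.2, cnumahint=1000):
    """Evaluate the AutoNUMA cost formula for a dict of vmstat counters."""
    u = word_size
    ca = struct_page_size / u          # Ca: cost of struct page access
    cpte = ca                          # Cpte: cost of a PTE access
    cpagerw = ca + page_size / u       # Cpagerw: read or write a full page
    ci = ca + wi                       # Ci: cost of page isolation
    cpagecopy = cpagerw + cpagerw * wnuma + ci + ci * wnuma
    migrated = counters.get("numa_pages_migrated", 0)
    return (cpte * counters.get("numa_pte_updates", 0)
            + cnumahint * counters.get("numa_hint_faults", 0)
            + ci * migrated
            + cpagecopy * migrated)


def convergence(before, after):
    """Fraction of hinting faults in the interval that were node-local.

    Returns None when no hinting faults were recorded between the two
    snapshots, as the ratio is undefined in that case.
    """
    faults = (after.get("numa_hint_faults", 0)
              - before.get("numa_hint_faults", 0))
    local = (after.get("numa_hint_faults_local", 0)
             - before.get("numa_hint_faults_local", 0))
    if faults <= 0:
        return None
    return local / faults
```

Sampling convergence at fixed intervals over the lifetime of a workload gives a series that should approach 1 as the workload converges, and how quickly it gets there is the basis for comparing two versions.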
Signed-off-by: Mel Gorman --- include/linux/vm_event_item.h | 6 ++++++ mm/autonuma.c | 19 +++++++++++++++---- mm/vmstat.c | 6 ++++++ 3 files changed, 27 insertions(+), 4 deletions(-) diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h index 83ea0b6..53eb132 100644 --- a/include/linux/vm_event_item.h +++ b/include/linux/vm_event_item.h @@ -38,6 +38,12 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT, KSWAPD_LOW_WMARK_HIT_QUICKLY, KSWAPD_HIGH_WMARK_HIT_QUICKLY, KSWAPD_SKIP_CONGESTION_WAIT, PAGEOUTRUN, ALLOCSTALL, PGROTATED, +#ifdef CONFIG_AUTONUMA + NUMA_PTE_UPDATES, + NUMA_HINT_FAULTS, + NUMA_HINT_FAULTS_LOCAL, + NUMA_PAGE_MIGRATE, +#endif #ifdef CONFIG_MIGRATION PGMIGRATE_SUCCESS, PGMIGRATE_FAIL, #endif diff --git a/mm/autonuma.c b/mm/autonuma.c index f1e699f..4db53a1 100644 --- a/mm/autonuma.c +++ b/mm/autonuma.c @@ -245,11 +245,13 @@ static bool autonuma_migrate_page(struct page *page, int dst_nid, migrated); if (isolated) { - int err; + int nr_remaining; pages_migrated += isolated; /* FIXME: per node */ - err = migrate_pages(&migratepages, alloc_migrate_dst_page, + nr_remaining = migrate_pages(&migratepages, + alloc_migrate_dst_page, pgdat->node_id, false, MIGRATE_ASYNC); - if (err) + count_vm_events(NUMA_PAGE_MIGRATE, isolated - nr_remaining); + if (nr_remaining) putback_lru_pages(&migratepages); } BUG_ON(!list_empty(&migratepages)); @@ -364,6 +366,8 @@ bool numa_hinting_fault(struct page *page, int numpages) p->mm->mm_autonuma->mm_numa_fault_pass; page_nid = page_to_nid(page); this_nid = numa_node_id(); + if (page_nid == this_nid) + count_vm_event(NUMA_HINT_FAULTS_LOCAL); VM_BUG_ON(this_nid < 0); VM_BUG_ON(this_nid >= MAX_NUMNODES); access_nid = numa_hinting_fault_memory_follow_cpu(page, @@ -423,6 +427,7 @@ out: out_unlock: pte_unmap_unlock(ptep, ptl); + count_vm_event(NUMA_HINT_FAULTS); goto out; } @@ -571,6 +576,7 @@ static int knuma_scand_pmd(struct mm_struct *mm, unsigned long _address, end; spinlock_t *ptl; int ret = 0; + 
int nr_pte_updates = 0; VM_BUG_ON(address & ~PAGE_MASK); @@ -616,6 +622,7 @@ static int knuma_scand_pmd(struct mm_struct *mm, } set_pmd_at(mm, address, pmd, pmd_mknuma(*pmd)); + nr_pte_updates++; /* defer TLB flush to lower the overhead */ spin_unlock(&mm->page_table_lock); goto out; @@ -648,8 +655,10 @@ static int knuma_scand_pmd(struct mm_struct *mm, if (pte_numa(pteval)) continue; - if (!autonuma_scan_pmd()) + if (!autonuma_scan_pmd()) { set_pte_at(mm, _address, _pte, pte_mknuma(pteval)); + nr_pte_updates++; + } /* defer TLB flush to lower the overhead */ ret++; @@ -668,6 +677,8 @@ static int knuma_scand_pmd(struct mm_struct *mm, } out: + if (nr_pte_updates) + count_vm_events(NUMA_PTE_UPDATES, nr_pte_updates); return ret; } diff --git a/mm/vmstat.c b/mm/vmstat.c index ab0b1b1..58c2757 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -761,6 +761,12 @@ const char * const vmstat_text[] = { "pgrotated", +#ifdef CONFIG_AUTONUMA + "numa_pte_updates", + "numa_hint_faults", + "numa_hint_faults_local", + "numa_pages_migrated", +#endif #ifdef CONFIG_MIGRATION "pgmigrate_success", "pgmigrate_fail", -- 1.7.7 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx108.postini.com [74.125.245.108]) by kanga.kvack.org (Postfix) with SMTP id 091116B0073 for ; Mon, 22 Oct 2012 04:14:24 -0400 (EDT) Date: Mon, 22 Oct 2012 09:06:55 +0100 From: Mel Gorman Subject: [PATCH 5/5] mm: autonuma: Specify the migration reason for the tracepoint Message-ID: <20121022080655.GD2198@suse.de> References: <1350892791-2682-1-git-send-email-mgorman@suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <1350892791-2682-1-git-send-email-mgorman@suse.de> Sender: owner-linux-mm@kvack.org List-ID: To: Linux-MM Cc: Peter Zijlstra , Andrea Arcangeli , Rik van Riel , LKML Record in the migrate_pages tracepoint that the migration is for AutoNUMA. Signed-off-by: Mel Gorman --- include/linux/migrate.h | 1 + include/trace/events/migrate.h | 1 + mm/autonuma.c | 3 ++- 3 files changed, 4 insertions(+), 1 deletions(-) diff --git a/include/linux/migrate.h b/include/linux/migrate.h index 9d1c159..ba17e56 100644 --- a/include/linux/migrate.h +++ b/include/linux/migrate.h @@ -13,6 +13,7 @@ enum migrate_reason { MR_MEMORY_HOTPLUG, MR_SYSCALL, /* also applies to cpusets */ MR_MEMPOLICY_MBIND, + MR_AUTONUMA, MR_CMA }; diff --git a/include/trace/events/migrate.h b/include/trace/events/migrate.h index ec2a6cc..2eaaf90 100644 --- a/include/trace/events/migrate.h +++ b/include/trace/events/migrate.h @@ -15,6 +15,7 @@ {MR_MEMORY_HOTPLUG, "memory_hotplug"}, \ {MR_SYSCALL, "syscall_or_cpuset"}, \ {MR_MEMPOLICY_MBIND, "mempolicy_mbind"}, \ + {MR_AUTONUMA, "autonuma"}, \ {MR_CMA, "cma"} TRACE_EVENT(mm_migrate_pages, diff --git a/mm/autonuma.c b/mm/autonuma.c index 4db53a1..cb02641 100644 --- a/mm/autonuma.c +++ b/mm/autonuma.c @@ -249,7 +249,8 @@ static bool autonuma_migrate_page(struct page *page, int dst_nid, pages_migrated += isolated; /* FIXME: per node */ nr_remaining = migrate_pages(&migratepages, 
alloc_migrate_dst_page, - pgdat->node_id, false, MIGRATE_ASYNC); + pgdat->node_id, false, MIGRATE_ASYNC, + MR_AUTONUMA); count_vm_events(NUMA_PAGE_MIGRATE, isolated - nr_remaining); if (nr_remaining) putback_lru_pages(&migratepages); -- 1.7.7 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx167.postini.com [74.125.245.167]) by kanga.kvack.org (Postfix) with SMTP id CDEB76B005D for ; Wed, 24 Oct 2012 22:57:12 -0400 (EDT) Received: by mail-pa0-f41.google.com with SMTP id fa10so891238pad.14 for ; Wed, 24 Oct 2012 19:57:12 -0700 (PDT) Date: Wed, 24 Oct 2012 19:57:10 -0700 (PDT) From: David Rientjes Subject: Re: [PATCH 1/5] mm: compaction: Move migration fail/success stats to migrate.c In-Reply-To: <1350892791-2682-2-git-send-email-mgorman@suse.de> Message-ID: References: <1350892791-2682-1-git-send-email-mgorman@suse.de> <1350892791-2682-2-git-send-email-mgorman@suse.de> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-linux-mm@kvack.org List-ID: To: Mel Gorman Cc: Linux-MM , Peter Zijlstra , Andrea Arcangeli , Rik van Riel , LKML On Mon, 22 Oct 2012, Mel Gorman wrote: > The compact_pages_moved and compact_pagemigrate_failed events are > convenient for determining if compaction is active and to what > degree migration is succeeding but it's at the wrong level. Other > users of migration may also want to know if migration is working > properly and this will be particularly true for any automated > NUMA migration. This patch moves the counters down to migration > with the new events called pgmigrate_success and pgmigrate_fail. > The compact_blocks_moved counter is removed because while it was > useful for debugging initially, it's worthless now as no meaningful > conclusions can be drawn from its value. 
>

Agreed, "compact_blocks_moved" should have been named "compact_blocks_scanned" to accurately describe what it was representing.

> Signed-off-by: Mel Gorman

Acked-by: David Rientjes

-- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org
list_head *from, new_page_t get_new_page, unsigned long private, bool offlining, - enum migrate_mode mode) + enum migrate_mode mode, int reason) { int retry = 1; int nr_failed = 0; @@ -1004,6 +1007,8 @@ out: count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded); if (nr_failed) count_vm_events(PGMIGRATE_FAIL, nr_failed); + trace_mm_migrate_pages(nr_succeeded, nr_failed, mode, reason); + if (!swapwrite) current->flags &= ~PF_SWAPWRITE; @@ -1145,7 +1150,8 @@ set_status: err = 0; if (!list_empty(&pagelist)) { err = migrate_pages(&pagelist, new_page_node, - (unsigned long)pm, 0, MIGRATE_SYNC); + (unsigned long)pm, 0, MIGRATE_SYNC, + MR_SYSCALL); if (err) putback_lru_pages(&pagelist); } diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 55df691..3d361f6 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5677,7 +5677,8 @@ static int __alloc_contig_migrate_range(unsigned long start, unsigned long end) ret = migrate_pages(&cc.migratepages, __alloc_contig_migrate_alloc, - 0, false, MIGRATE_SYNC); + 0, false, MIGRATE_SYNC, + MR_CMA); } putback_lru_pages(&cc.migratepages); -- 1.7.7 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752029Ab2JVIJO (ORCPT ); Mon, 22 Oct 2012 04:09:14 -0400 Received: from cantor2.suse.de ([195.135.220.15]:57274 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750793Ab2JVIJN (ORCPT ); Mon, 22 Oct 2012 04:09:13 -0400 From: Mel Gorman To: Linux-MM Cc: Peter Zijlstra , Andrea Arcangeli , Rik van Riel , Mel Gorman , LKML Subject: [PATCH 1/5] mm: compaction: Move migration fail/success stats to migrate.c Date: Mon, 22 Oct 2012 08:59:47 +0100 Message-Id: <1350892791-2682-2-git-send-email-mgorman@suse.de> X-Mailer: git-send-email 1.7.7 In-Reply-To: <1350892791-2682-1-git-send-email-mgorman@suse.de> References: <1350892791-2682-1-git-send-email-mgorman@suse.de> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: 
linux-kernel@vger.kernel.org The compact_pages_moved and compact_pagemigrate_failed events are convenient for determining if compaction is active and to what degree migration is succeeding but it's at the wrong level. Other users of migration may also want to know if migration is working properly and this will be particularly true for any automated NUMA migration. This patch moves the counters down to migration with the new events called pgmigrate_success and pgmigrate_fail. The compact_blocks_moved counter is removed because while it was useful for debugging initially, it's worthless now as no meaningful conclusions can be drawn from its value. Signed-off-by: Mel Gorman --- include/linux/vm_event_item.h | 4 +++- mm/compaction.c | 4 ---- mm/migrate.c | 6 ++++++ mm/vmstat.c | 7 ++++--- 4 files changed, 13 insertions(+), 8 deletions(-) diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h index 57f7b10..5ce5c5f 100644 --- a/include/linux/vm_event_item.h +++ b/include/linux/vm_event_item.h @@ -38,8 +38,10 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT, KSWAPD_LOW_WMARK_HIT_QUICKLY, KSWAPD_HIGH_WMARK_HIT_QUICKLY, KSWAPD_SKIP_CONGESTION_WAIT, PAGEOUTRUN, ALLOCSTALL, PGROTATED, +#ifdef CONFIG_MIGRATION + PGMIGRATE_SUCCESS, PGMIGRATE_FAIL, +#endif #ifdef CONFIG_COMPACTION - COMPACTBLOCKS, COMPACTPAGES, COMPACTPAGEFAILED, COMPACTSTALL, COMPACTFAIL, COMPACTSUCCESS, #endif #ifdef CONFIG_HUGETLB_PAGE diff --git a/mm/compaction.c b/mm/compaction.c index 7fcd3a5..8c1a53a 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -801,10 +801,6 @@ static int compact_zone(struct zone *zone, struct compact_control *cc) update_nr_listpages(cc); nr_remaining = cc->nr_migratepages; - count_vm_event(COMPACTBLOCKS); - count_vm_events(COMPACTPAGES, nr_migrate - nr_remaining); - if (nr_remaining) - count_vm_events(COMPACTPAGEFAILED, nr_remaining); trace_mm_compaction_migratepages(nr_migrate - nr_remaining, nr_remaining); diff --git a/mm/migrate.c b/mm/migrate.c 
index 77ed2d7..04687f6 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -962,6 +962,7 @@ int migrate_pages(struct list_head *from, { int retry = 1; int nr_failed = 0; + int nr_succeeded = 0; int pass = 0; struct page *page; struct page *page2; @@ -988,6 +989,7 @@ int migrate_pages(struct list_head *from, retry++; break; case 0: + nr_succeeded++; break; default: /* Permanent failure */ @@ -998,6 +1000,10 @@ int migrate_pages(struct list_head *from, } rc = 0; out: + if (nr_succeeded) + count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded); + if (nr_failed) + count_vm_events(PGMIGRATE_FAIL, nr_failed); if (!swapwrite) current->flags &= ~PF_SWAPWRITE; diff --git a/mm/vmstat.c b/mm/vmstat.c index df7a674..4849241 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -761,10 +761,11 @@ const char * const vmstat_text[] = { "pgrotated", +#ifdef CONFIG_MIGRATION + "pgmigrate_success", + "pgmigrate_fail", +#endif #ifdef CONFIG_COMPACTION - "compact_blocks_moved", - "compact_pages_moved", - "compact_pagemigrate_failed", "compact_stall", "compact_fail", "compact_success", -- 1.7.7 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752135Ab2JVINI (ORCPT ); Mon, 22 Oct 2012 04:13:08 -0400 Received: from cantor2.suse.de ([195.135.220.15]:57489 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751603Ab2JVINH (ORCPT ); Mon, 22 Oct 2012 04:13:07 -0400 Date: Mon, 22 Oct 2012 09:05:25 +0100 From: Mel Gorman To: Linux-MM Cc: Peter Zijlstra , Andrea Arcangeli , Rik van Riel , LKML Subject: [PATCH 3/5] mm: compaction: Add scanned and isolated counters for compaction Message-ID: <20121022080525.GB2198@suse.de> References: <1350892791-2682-1-git-send-email-mgorman@suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <1350892791-2682-1-git-send-email-mgorman@suse.de> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: 
linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org

Compaction already has tracepoints to count scanned and isolated pages,
but they require ftrace to be enabled, and writing that information to
disk can be disruptive. This patch adds vmstat counters for compaction
called compact_migrate_scanned, compact_free_scanned and compact_isolated.
With these counters, it is possible to define a basic cost model for
compaction. The model approximates how much work compaction is doing, and
the result can be compared with an oprofile showing TLB misses to see if
the cost of compaction is being offset by THP, for example. Minimally, a
compaction patch can be evaluated in terms of whether it increases or
decreases cost. The basic cost model looks like this:

Fundamental unit u: a word = sizeof(void *)

Ca  = cost of struct page access = sizeof(struct page) / u

Cmc = Cost migrate page copy = (Ca + PAGE_SIZE/u) * 2
Cmf = Cost migrate failure   = Ca * 2
Ci  = Cost page isolation    = (Ca + Wi)
	where Wi is a constant that should reflect the approximate
	cost of the locking operation.

Csm = Cost migrate scanning = Ca
Csf = Cost free scanning    = Ca

Overall cost =	(Csm * compact_migrate_scanned) +
		(Csf * compact_free_scanned) +
		(Ci  * compact_isolated) +
		(Cmc * pgmigrate_success) +
		(Cmf * pgmigrate_fail)

where the counter values are read from /proc/vmstat.

This is very basic and ignores certain costs, such as the allocation cost
of a migrate page copy, but any improvement to the model would still use
the same vmstat counters.
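As a rough illustration of the model above, the sketch below evaluates it in userspace from /proc/vmstat-style text. It is not part of the patch. The function names, the sample counter values, and the defaults (8-byte word, 64-byte struct page, 4096-byte page, Wi = 10) are assumptions chosen only to make the arithmetic concrete.

```python
def parse_vmstat(text):
    """Parse 'name value' lines, as found in /proc/vmstat, into a dict."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def compaction_cost(counters, word=8, struct_page=64, page_size=4096, wi=10):
    """Evaluate the basic compaction cost model from the changelog.

    word, struct_page, page_size and wi are illustrative assumptions,
    not values taken from the patch.
    """
    ca = struct_page / word               # Ca: struct page access
    cmc = (ca + page_size / word) * 2     # Cmc: migrate page copy
    cmf = ca * 2                          # Cmf: migrate failure
    ci = ca + wi                          # Ci: page isolation
    csm = csf = ca                        # Csm, Csf: scanning costs

    def get(name):
        # Counters missing from the input are treated as zero.
        return counters.get(name, 0)

    return (csm * get("compact_migrate_scanned")
            + csf * get("compact_free_scanned")
            + ci * get("compact_isolated")
            + cmc * get("pgmigrate_success")
            + cmf * get("pgmigrate_fail"))


# Hypothetical counter values, not measurements.
SAMPLE = """\
compact_migrate_scanned 1000
compact_free_scanned 2000
compact_isolated 100
pgmigrate_success 90
pgmigrate_fail 10
"""

if __name__ == "__main__":
    print(compaction_cost(parse_vmstat(SAMPLE)))
```

On a real system the text would come from reading /proc/vmstat, and it is probably more useful to feed in the difference between two snapshots taken around a workload than the absolute counters since boot.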
Signed-off-by: Mel Gorman --- include/linux/vm_event_item.h | 2 ++ mm/compaction.c | 8 ++++++++ mm/vmstat.c | 3 +++ 3 files changed, 13 insertions(+), 0 deletions(-) diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h index 5ce5c5f..83ea0b6 100644 --- a/include/linux/vm_event_item.h +++ b/include/linux/vm_event_item.h @@ -42,6 +42,8 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT, PGMIGRATE_SUCCESS, PGMIGRATE_FAIL, #endif #ifdef CONFIG_COMPACTION + COMPACTMIGRATE_SCANNED, COMPACTFREE_SCANNED, + COMPACTISOLATED, COMPACTSTALL, COMPACTFAIL, COMPACTSUCCESS, #endif #ifdef CONFIG_HUGETLB_PAGE diff --git a/mm/compaction.c b/mm/compaction.c index 11b455b..8422dd4 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -143,6 +143,10 @@ static unsigned long isolate_freepages_block(unsigned long blockpfn, } trace_mm_compaction_isolate_freepages(nr_scanned, total_isolated); + count_vm_events(COMPACTFREE_SCANNED, nr_scanned); + if (total_isolated) + count_vm_events(COMPACTISOLATED, total_isolated); + return total_isolated; } @@ -402,6 +406,10 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc, trace_mm_compaction_isolate_migratepages(nr_scanned, nr_isolated); + count_vm_events(COMPACTMIGRATE_SCANNED, nr_scanned); + if (nr_isolated) + count_vm_events(COMPACTISOLATED, nr_isolated); + return low_pfn; } diff --git a/mm/vmstat.c b/mm/vmstat.c index 4849241..ab0b1b1 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -766,6 +766,9 @@ const char * const vmstat_text[] = { "pgmigrate_fail", #endif #ifdef CONFIG_COMPACTION + "compact_migrate_scanned", + "compact_free_scanned", + "compact_isolated", "compact_stall", "compact_fail", "compact_success", -- 1.7.7 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752250Ab2JVINu (ORCPT ); Mon, 22 Oct 2012 04:13:50 -0400 Received: from cantor2.suse.de ([195.135.220.15]:57504 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) 
by vger.kernel.org with ESMTP id S1752168Ab2JVINq (ORCPT ); Mon, 22 Oct 2012 04:13:46 -0400 Date: Mon, 22 Oct 2012 09:06:16 +0100 From: Mel Gorman To: Linux-MM Cc: Peter Zijlstra , Andrea Arcangeli , Rik van Riel , LKML Subject: [PATCH 4/5] mm: autonuma: Add pte updates, hinting and migration stats for AutoNUMA Message-ID: <20121022080616.GC2198@suse.de> References: <1350892791-2682-1-git-send-email-mgorman@suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <1350892791-2682-1-git-send-email-mgorman@suse.de> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org

The system CPU cost of AutoNUMA is known to be high but it is tricky to
quantify the cost in a meaningful manner. This patch adds some vmstats
that can be used as part of a basic costing model.

u    = basic unit = sizeof(void *)
Ca   = cost of struct page access = sizeof(struct page) / u
Cpte = Cost PTE access = Ca
Cupdate = Cost PTE update = (2 * Cpte) + (2 * Wlock)
	where Cpte is incurred twice for a read and a write and Wlock
	is a constant representing the cost of taking or releasing a
	lock
Cnumahint = Cost of a minor page fault = some high constant e.g. 1000
Cpagerw = Cost to read or write a full page = Ca + PAGE_SIZE/u
Ci = Cost of page isolation = Ca + Wi
	where Wi is a constant that should reflect the approximate cost
	of the locking operation
Cpagecopy = Cpagerw + (Cpagerw * Wnuma) + Ci + (Ci * Wnuma)
	where Wnuma is the approximate NUMA factor. 1 is local. 1.2
	would imply that remote accesses are 20% more expensive

AutoNUMA cost = Cpte * numa_pte_updates +
		Cnumahint * numa_hint_faults +
		Ci * numa_pages_migrated +
		Cpagecopy * numa_pages_migrated

Note that numa_pages_migrated is used as a measure of how many pages
were isolated even though it would miss pages that failed to migrate.
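The AutoNUMA cost expression above can be sketched in the same way. This is a minimal illustration and not part of the patch. The function name and all default constants (8-byte word, 64-byte struct page, 4096-byte page, Wi = 10, Wnuma = 1.2, Cnumahint = 1000) are assumptions. The sum follows the changelog as written, so it charges Cpte per PTE update (Cupdate is defined in the model but does not appear in the stated total) and adds Ci separately even though Cpagecopy already contains an isolation term.

```python
def autonuma_cost(counters, word=8, struct_page=64, page_size=4096,
                  wi=10, wnuma=1.2, cnumahint=1000):
    """Evaluate the AutoNUMA cost expression from the changelog.

    Every default here is an illustrative assumption, not a value
    taken from the patch.
    """
    ca = struct_page / word                 # Ca: struct page access
    cpte = ca                               # Cpte: PTE access
    cpagerw = ca + page_size / word         # Cpagerw: read or write a page
    ci = ca + wi                            # Ci: page isolation
    # Cpagecopy: local and remote halves of the copy plus isolation.
    cpagecopy = cpagerw + cpagerw * wnuma + ci + ci * wnuma

    pte_updates = counters.get("numa_pte_updates", 0)
    hint_faults = counters.get("numa_hint_faults", 0)
    migrated = counters.get("numa_pages_migrated", 0)

    return (cpte * pte_updates
            + cnumahint * hint_faults
            + ci * migrated
            + cpagecopy * migrated)


if __name__ == "__main__":
    # Hypothetical counter values, not measurements.
    sample = {
        "numa_pte_updates": 1000,
        "numa_hint_faults": 100,
        "numa_pages_migrated": 10,
    }
    print(autonuma_cost(sample))
```

Because the weights are guesses, the absolute number means little; the sketch is mainly useful for comparing two kernels or two tunings under the same assumed constants.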
A separate vmstat counter could have been added for isolated pages, but
the isolation cost is marginal compared with the overall cost, so it
seemed overkill.

The ideal way to measure AutoNUMA benefit would be to count the number
of remote accesses versus local accesses and do something like

	benefit = (remote_accesses_before - remote_accesses_after) * Wnuma

but the information is not readily available. However, for two given
versions of AutoNUMA we can at least estimate if one is better than the
other in terms of convergence. As a workload converges, the expectation
would be that the number of remote numa hints would reduce to 0.

	convergence = numa_hint_faults_local / numa_hint_faults
		where this is measured for the last N number of
		numa hints recorded. When the workload is fully
		converged the value is 1.

This can measure if AutoNUMA is converging and how fast it is doing it.

Signed-off-by: Mel Gorman --- include/linux/vm_event_item.h | 6 ++++++ mm/autonuma.c | 19 +++++++++++++++---- mm/vmstat.c | 6 ++++++ 3 files changed, 27 insertions(+), 4 deletions(-) diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h index 83ea0b6..53eb132 100644 --- a/include/linux/vm_event_item.h +++ b/include/linux/vm_event_item.h @@ -38,6 +38,12 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT, KSWAPD_LOW_WMARK_HIT_QUICKLY, KSWAPD_HIGH_WMARK_HIT_QUICKLY, KSWAPD_SKIP_CONGESTION_WAIT, PAGEOUTRUN, ALLOCSTALL, PGROTATED, +#ifdef CONFIG_AUTONUMA + NUMA_PTE_UPDATES, + NUMA_HINT_FAULTS, + NUMA_HINT_FAULTS_LOCAL, + NUMA_PAGE_MIGRATE, +#endif #ifdef CONFIG_MIGRATION PGMIGRATE_SUCCESS, PGMIGRATE_FAIL, #endif diff --git a/mm/autonuma.c b/mm/autonuma.c index f1e699f..4db53a1 100644 --- a/mm/autonuma.c +++ b/mm/autonuma.c @@ -245,11 +245,13 @@ static bool autonuma_migrate_page(struct page *page, int dst_nid, migrated); if (isolated) { - int err; + int nr_remaining; pages_migrated += isolated; /* FIXME: per node */ - err = migrate_pages(&migratepages, alloc_migrate_dst_page, + nr_remaining
= migrate_pages(&migratepages, + alloc_migrate_dst_page, pgdat->node_id, false, MIGRATE_ASYNC); - if (err) + count_vm_events(NUMA_PAGE_MIGRATE, isolated - nr_remaining); + if (nr_remaining) putback_lru_pages(&migratepages); } BUG_ON(!list_empty(&migratepages)); @@ -364,6 +366,8 @@ bool numa_hinting_fault(struct page *page, int numpages) p->mm->mm_autonuma->mm_numa_fault_pass; page_nid = page_to_nid(page); this_nid = numa_node_id(); + if (page_nid == this_nid) + count_vm_event(NUMA_HINT_FAULTS_LOCAL); VM_BUG_ON(this_nid < 0); VM_BUG_ON(this_nid >= MAX_NUMNODES); access_nid = numa_hinting_fault_memory_follow_cpu(page, @@ -423,6 +427,7 @@ out: out_unlock: pte_unmap_unlock(ptep, ptl); + count_vm_event(NUMA_HINT_FAULTS); goto out; } @@ -571,6 +576,7 @@ static int knuma_scand_pmd(struct mm_struct *mm, unsigned long _address, end; spinlock_t *ptl; int ret = 0; + int nr_pte_updates = 0; VM_BUG_ON(address & ~PAGE_MASK); @@ -616,6 +622,7 @@ static int knuma_scand_pmd(struct mm_struct *mm, } set_pmd_at(mm, address, pmd, pmd_mknuma(*pmd)); + nr_pte_updates++; /* defer TLB flush to lower the overhead */ spin_unlock(&mm->page_table_lock); goto out; @@ -648,8 +655,10 @@ static int knuma_scand_pmd(struct mm_struct *mm, if (pte_numa(pteval)) continue; - if (!autonuma_scan_pmd()) + if (!autonuma_scan_pmd()) { set_pte_at(mm, _address, _pte, pte_mknuma(pteval)); + nr_pte_updates++; + } /* defer TLB flush to lower the overhead */ ret++; @@ -668,6 +677,8 @@ static int knuma_scand_pmd(struct mm_struct *mm, } out: + if (nr_pte_updates) + count_vm_events(NUMA_PTE_UPDATES, nr_pte_updates); return ret; } diff --git a/mm/vmstat.c b/mm/vmstat.c index ab0b1b1..58c2757 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -761,6 +761,12 @@ const char * const vmstat_text[] = { "pgrotated", +#ifdef CONFIG_AUTONUMA + "numa_pte_updates", + "numa_hint_faults", + "numa_hint_faults_local", + "numa_pages_migrated", +#endif #ifdef CONFIG_MIGRATION "pgmigrate_success", "pgmigrate_fail", -- 1.7.7 From mboxrd@z Thu 
Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752366Ab2JVIO0 (ORCPT ); Mon, 22 Oct 2012 04:14:26 -0400 Received: from cantor2.suse.de ([195.135.220.15]:57524 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752303Ab2JVIOY (ORCPT ); Mon, 22 Oct 2012 04:14:24 -0400 Date: Mon, 22 Oct 2012 09:06:55 +0100 From: Mel Gorman To: Linux-MM Cc: Peter Zijlstra , Andrea Arcangeli , Rik van Riel , LKML Subject: [PATCH 5/5] mm: autonuma: Specify the migration reason for the tracepoint Message-ID: <20121022080655.GD2198@suse.de> References: <1350892791-2682-1-git-send-email-mgorman@suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <1350892791-2682-1-git-send-email-mgorman@suse.de> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Record in the migrate_pages tracepoint that the migration is for AutoNUMA. 
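With the reason recorded, the mm_migrate_pages events can be grouped by caller. The sketch below is a hedged illustration, not part of the series: it matches only the payload produced by the TP_printk format in the tracepoint patch ("nr_succeeded=%lu nr_failed=%lu mode=%s reason=%s") and sums the counts per reason. The function name and the sample lines are hypothetical, and whatever prefix the tracer puts before the payload is simply ignored.

```python
import re
from collections import defaultdict

# Matches the payload of the mm_migrate_pages TP_printk format string.
EVENT_RE = re.compile(
    r"nr_succeeded=(\d+) nr_failed=(\d+) mode=(\S+) reason=(\S+)")


def summarize_by_reason(lines):
    """Return {reason: (total_succeeded, total_failed)} for matching lines."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue  # not an mm_migrate_pages payload
        succeeded, failed, _mode, reason = match.groups()
        totals[reason][0] += int(succeeded)
        totals[reason][1] += int(failed)
    return {reason: tuple(counts) for reason, counts in totals.items()}


# Hypothetical payloads, written by hand to match the format string.
SAMPLE_LINES = [
    "nr_succeeded=32 nr_failed=0 mode=MIGRATE_ASYNC reason=autonuma",
    "nr_succeeded=5 nr_failed=3 mode=MIGRATE_ASYNC reason=autonuma",
    "nr_succeeded=12 nr_failed=1 mode=MIGRATE_SYNC_LIGHT reason=compaction",
    "some unrelated trace line",
]

if __name__ == "__main__":
    for reason, (ok, failed) in sorted(summarize_by_reason(SAMPLE_LINES).items()):
        print(f"{reason}: succeeded={ok} failed={failed}")
```

Splitting the totals this way separates AutoNUMA migrations from compaction or mbind activity, which the global pgmigrate_success and pgmigrate_fail counters cannot do on their own.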
Signed-off-by: Mel Gorman --- include/linux/migrate.h | 1 + include/trace/events/migrate.h | 1 + mm/autonuma.c | 3 ++- 3 files changed, 4 insertions(+), 1 deletions(-) diff --git a/include/linux/migrate.h b/include/linux/migrate.h index 9d1c159..ba17e56 100644 --- a/include/linux/migrate.h +++ b/include/linux/migrate.h @@ -13,6 +13,7 @@ enum migrate_reason { MR_MEMORY_HOTPLUG, MR_SYSCALL, /* also applies to cpusets */ MR_MEMPOLICY_MBIND, + MR_AUTONUMA, MR_CMA }; diff --git a/include/trace/events/migrate.h b/include/trace/events/migrate.h index ec2a6cc..2eaaf90 100644 --- a/include/trace/events/migrate.h +++ b/include/trace/events/migrate.h @@ -15,6 +15,7 @@ {MR_MEMORY_HOTPLUG, "memory_hotplug"}, \ {MR_SYSCALL, "syscall_or_cpuset"}, \ {MR_MEMPOLICY_MBIND, "mempolicy_mbind"}, \ + {MR_AUTONUMA, "autonuma"}, \ {MR_CMA, "cma"} TRACE_EVENT(mm_migrate_pages, diff --git a/mm/autonuma.c b/mm/autonuma.c index 4db53a1..cb02641 100644 --- a/mm/autonuma.c +++ b/mm/autonuma.c @@ -249,7 +249,8 @@ static bool autonuma_migrate_page(struct page *page, int dst_nid, pages_migrated += isolated; /* FIXME: per node */ nr_remaining = migrate_pages(&migratepages, alloc_migrate_dst_page, - pgdat->node_id, false, MIGRATE_ASYNC); + pgdat->node_id, false, MIGRATE_ASYNC, + MR_AUTONUMA); count_vm_events(NUMA_PAGE_MIGRATE, isolated - nr_remaining); if (nr_remaining) putback_lru_pages(&migratepages); -- 1.7.7 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756948Ab2JYC5N (ORCPT ); Wed, 24 Oct 2012 22:57:13 -0400 Received: from mail-pa0-f46.google.com ([209.85.220.46]:55744 "EHLO mail-pa0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754798Ab2JYC5M (ORCPT ); Wed, 24 Oct 2012 22:57:12 -0400 Date: Wed, 24 Oct 2012 19:57:10 -0700 (PDT) From: David Rientjes X-X-Sender: rientjes@chino.kir.corp.google.com To: Mel Gorman cc: Linux-MM , Peter Zijlstra , Andrea Arcangeli , Rik van Riel , LKML 
Subject: Re: [PATCH 1/5] mm: compaction: Move migration fail/success stats to migrate.c In-Reply-To: <1350892791-2682-2-git-send-email-mgorman@suse.de> Message-ID: References: <1350892791-2682-1-git-send-email-mgorman@suse.de> <1350892791-2682-2-git-send-email-mgorman@suse.de> User-Agent: Alpine 2.00 (DEB 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 22 Oct 2012, Mel Gorman wrote:

> The compact_pages_moved and compact_pagemigrate_failed events are
> convenient for determining if compaction is active and to what
> degree migration is succeeding but it's at the wrong level. Other
> users of migration may also want to know if migration is working
> properly and this will be particularly true for any automated
> NUMA migration. This patch moves the counters down to migration
> with the new events called pgmigrate_success and pgmigrate_fail.
> The compact_blocks_moved counter is removed because while it was
> useful for debugging initially, it's worthless now as no meaningful
> conclusions can be drawn from its value.
>

Agreed, "compact_blocks_moved" should have been named
"compact_blocks_scanned" to accurately describe what it was representing.

> Signed-off-by: Mel Gorman

Acked-by: David Rientjes