* [PATCH v2 0/2] mm, thp: Fix unnecessarry resource consuming in swapin
From: Ebru Akagunduz @ 2016-03-13 9:28 UTC
To: linux-mm
Cc: hughd, riel, akpm, kirill.shutemov, n-horiguchi, aarcange, iamjoonsoo.kim, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hannes, mhocko, boaz, Ebru Akagunduz

This patch series avoids unnecessary resource consumption in khugepaged
swapin and introduces a new function that calculates the value of a
specific vm event.

Ebru Akagunduz (2):
  mm, vmstat: calculate particular vm event
  mm, thp: avoid unnecessary swapin in khugepaged

 include/linux/vmstat.h |  2 ++
 mm/huge_memory.c       | 13 +++++++++++--
 mm/vmstat.c            | 12 ++++++++++++
 3 files changed, 25 insertions(+), 2 deletions(-)

-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* [PATCH v2 1/2] mm, vmstat: calculate particular vm event
From: Ebru Akagunduz @ 2016-03-13 9:28 UTC
To: linux-mm
Cc: hughd, riel, akpm, kirill.shutemov, n-horiguchi, aarcange, iamjoonsoo.kim, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hannes, mhocko, boaz, Ebru Akagunduz

Currently, vmstat can calculate a specific vm event with all_vm_events(),
but that places all vm events on the stack. This patch introduces
a helper that sums the value of a specific vm event over all CPUs
without loading all the events.

Signed-off-by: Ebru Akagunduz <ebru.akagunduz@gmail.com>
---
Changes in v2:
 - this patch is newly created in this version
 - create sum event function to
   calculate particular vm event (Kirill A. Shutemov)

 include/linux/vmstat.h |  2 ++
 mm/vmstat.c            | 12 ++++++++++++
 2 files changed, 14 insertions(+)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 73fae8c..add0cc1 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -53,6 +53,8 @@ static inline void count_vm_events(enum vm_event_item item, long delta)
 
 extern void all_vm_events(unsigned long *);
 
+extern unsigned long sum_vm_event(enum vm_event_item item);
+
 extern void vm_events_fold_cpu(int cpu);
 
 #else
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 5e43004..b76d664 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -34,6 +34,18 @@
 DEFINE_PER_CPU(struct vm_event_state, vm_event_states) = {{0}};
 EXPORT_PER_CPU_SYMBOL(vm_event_states);
 
+unsigned long sum_vm_event(enum vm_event_item item)
+{
+	int cpu;
+	unsigned long ret = 0;
+
+	get_online_cpus();
+	for_each_online_cpu(cpu)
+		ret += per_cpu(vm_event_states, cpu).event[item];
+	put_online_cpus();
+	return ret;
+}
+
 static void sum_vm_events(unsigned long *ret)
 {
 	int cpu;
-- 
1.9.1
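
The difference between summing every event and summing one event can be sketched in plain userspace C. This is only an illustration under stated assumptions: the `demo_*` names, the fixed CPU count, and the three-entry event enum are hypothetical stand-ins, not the kernel's per-CPU `vm_event_states` machinery, and CPU hotplug locking is left out entirely.

```c
/* Hypothetical stand-ins for the kernel's per-CPU vm_event_states. */
#define NR_CPUS_DEMO   4
#define NR_EVENTS_DEMO 3

enum demo_event { DEMO_PGFAULT, DEMO_ALLOCSTALL, DEMO_PSWPIN };

struct demo_event_state {
	unsigned long event[NR_EVENTS_DEMO];
};

/* One counter block per (simulated) CPU. */
static struct demo_event_state demo_states[NR_CPUS_DEMO];

/* Sum every event over every CPU; the caller must supply a full array. */
static void demo_all_events(unsigned long *ret)
{
	int cpu, i;

	for (i = 0; i < NR_EVENTS_DEMO; i++)
		ret[i] = 0;
	for (cpu = 0; cpu < NR_CPUS_DEMO; cpu++)
		for (i = 0; i < NR_EVENTS_DEMO; i++)
			ret[i] += demo_states[cpu].event[i];
}

/* Sum a single event over every CPU; no scratch array is needed. */
static unsigned long demo_sum_event(enum demo_event item)
{
	unsigned long ret = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_DEMO; cpu++)
		ret += demo_states[cpu].event[item];
	return ret;
}
```

A caller interested only in `DEMO_ALLOCSTALL` can call `demo_sum_event(DEMO_ALLOCSTALL)` and avoid declaring an `NR_EVENTS_DEMO`-sized array, which appears to be the saving the patch description is after.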

* Re: [PATCH v2 1/2] mm, vmstat: calculate particular vm event
From: Kirill A. Shutemov @ 2016-03-13 23:08 UTC
To: Ebru Akagunduz
Cc: linux-mm, hughd, riel, akpm, kirill.shutemov, n-horiguchi, aarcange, iamjoonsoo.kim, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hannes, mhocko, boaz

On Sun, Mar 13, 2016 at 11:28:54AM +0200, Ebru Akagunduz wrote:
> Currently, vmstat can calculate a specific vm event with all_vm_events(),
> but that places all vm events on the stack. This patch introduces
> a helper that sums the value of a specific vm event over all CPUs
> without loading all the events.
> 
> Signed-off-by: Ebru Akagunduz <ebru.akagunduz@gmail.com>
> ---
> Changes in v2:
>  - this patch is newly created in this version
>  - create sum event function to
>    calculate particular vm event (Kirill A. Shutemov)
> 
>  include/linux/vmstat.h |  2 ++
>  mm/vmstat.c            | 12 ++++++++++++
>  2 files changed, 14 insertions(+)
> 
> diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
> index 73fae8c..add0cc1 100644
> --- a/include/linux/vmstat.h
> +++ b/include/linux/vmstat.h
> @@ -53,6 +53,8 @@ static inline void count_vm_events(enum vm_event_item item, long delta)
>  
>  extern void all_vm_events(unsigned long *);
>  
> +extern unsigned long sum_vm_event(enum vm_event_item item);
> +
>  extern void vm_events_fold_cpu(int cpu);
>  
>  #else

You need a dummy definition of the function for the
!CONFIG_VM_EVENT_COUNTERS case here. Otherwise the build will fail.
See the 0-day report.

Otherwise looks good to me:

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 5e43004..b76d664 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -34,6 +34,18 @@
>  DEFINE_PER_CPU(struct vm_event_state, vm_event_states) = {{0}};
>  EXPORT_PER_CPU_SYMBOL(vm_event_states);
>  
> +unsigned long sum_vm_event(enum vm_event_item item)
> +{
> +	int cpu;
> +	unsigned long ret = 0;
> +
> +	get_online_cpus();
> +	for_each_online_cpu(cpu)
> +		ret += per_cpu(vm_event_states, cpu).event[item];
> +	put_online_cpus();
> +	return ret;
> +}
> +
>  static void sum_vm_events(unsigned long *ret)
>  {
>  	int cpu;
> -- 
> 1.9.1

-- 
 Kirill A. Shutemov
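
The dummy definition requested in the review above usually takes the form of a `static inline` stub in the disabled branch of the header. A minimal sketch of that pattern follows; `DEMO_VM_EVENT_COUNTERS` and `demo_sum_vm_event()` are hypothetical names chosen for illustration, and returning 0 from the stub is an assumption, not something the thread specifies.

```c
/*
 * DEMO_VM_EVENT_COUNTERS is a hypothetical stand-in for
 * CONFIG_VM_EVENT_COUNTERS. When it is defined, only a declaration is
 * emitted and the real implementation is expected in another file.
 */
#ifdef DEMO_VM_EVENT_COUNTERS
unsigned long demo_sum_vm_event(int item);
#else
/*
 * Stub for the disabled case: callers still compile and link, and every
 * event reads as zero (an assumed choice for this sketch).
 */
static inline unsigned long demo_sum_vm_event(int item)
{
	(void)item;	/* unused in the stub */
	return 0;
}
#endif
```

One consequence worth noting if the stub returns a constant: a caller that compares two readings, as patch 2/2 does, would always see them equal, so the swapin path would never be skipped in a configuration without event counters.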

* [PATCH v2 2/2] mm, thp: avoid unnecessary swapin in khugepaged
From: Ebru Akagunduz @ 2016-03-13 9:28 UTC
To: linux-mm
Cc: hughd, riel, akpm, kirill.shutemov, n-horiguchi, aarcange, iamjoonsoo.kim, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hannes, mhocko, boaz, Ebru Akagunduz

Currently khugepaged does swapin readahead to improve the
THP collapse rate. This patch checks vm statistics
to avoid the swapin workload when it is unnecessary, so that
when the system is under pressure, khugepaged won't consume
resources on swapin.

The patch was tested with a test program that allocates
800MB of memory, writes to it, and then sleeps. The system
was forced to swap out all of it. Afterwards, the test program
touches the area by writing, skipping one page in every
20 pages of the area. While waiting for swapin readahead
of the remaining part, the memory was forced to be busy
doing page reclaim. There was enough free memory during the
test; khugepaged did not do swapin readahead because the
system was busy.

Test results:

                        After swapped out
-------------------------------------------------------------------
              | Anonymous | AnonHugePages | Swap      | Fraction  |
-------------------------------------------------------------------
With patch    | 325784 kB | 325632 kB     | 474216 kB | 99%       |
-------------------------------------------------------------------
Without patch | 351308 kB | 350208 kB     | 448692 kB | 99%       |
-------------------------------------------------------------------

                After swapped in (waiting 10 minutes)
-------------------------------------------------------------------
              | Anonymous | AnonHugePages | Swap      | Fraction  |
-------------------------------------------------------------------
With patch    | 714164 kB | 489472 kB     | 85836 kB  | 68%       |
-------------------------------------------------------------------
Without patch | 586816 kB | 464896 kB     | 213184 kB | 79%       |
-------------------------------------------------------------------

Signed-off-by: Ebru Akagunduz <ebru.akagunduz@gmail.com>
Fixes: 363cd76e5b11c ("mm: make swapin readahead to improve thp collapse rate")
---
Changes in v2:
 - Add reference to specify which patch fixed (Ebru Akagunduz)
 - Fix commit subject line (Ebru Akagunduz)

 mm/huge_memory.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 86e9666..4a60035 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -102,6 +102,7 @@ static DECLARE_WAIT_QUEUE_HEAD(khugepaged_wait);
  */
 static unsigned int khugepaged_max_ptes_none __read_mostly;
 static unsigned int khugepaged_max_ptes_swap __read_mostly;
+static unsigned long int allocstall = 0;
 
 static int khugepaged(void *none);
 static int khugepaged_slab_init(void);
@@ -2438,7 +2439,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	struct page *new_page;
 	spinlock_t *pmd_ptl, *pte_ptl;
 	int isolated = 0, result = 0;
-	unsigned long hstart, hend;
+	unsigned long hstart, hend, swap = 0, curr_allocstall = 0;
 	struct mem_cgroup *memcg;
 	unsigned long mmun_start;	/* For mmu_notifiers */
 	unsigned long mmun_end;		/* For mmu_notifiers */
@@ -2493,7 +2494,14 @@ static void collapse_huge_page(struct mm_struct *mm,
 		goto out;
 	}
 
-	__collapse_huge_page_swapin(mm, vma, address, pmd);
+	swap = get_mm_counter(mm, MM_SWAPENTS);
+	curr_allocstall = sum_vm_event(ALLOCSTALL);
+	/*
+	 * When system under pressure, don't swapin readahead.
+	 * So that avoid unnecessary resource consuming.
+	 */
+	if (allocstall == curr_allocstall && swap != 0)
+		__collapse_huge_page_swapin(mm, vma, address, pmd);
 
 	anon_vma_lock_write(vma->anon_vma);
 
@@ -2790,6 +2798,7 @@ skip:
 			VM_BUG_ON(khugepaged_scan.address < hstart ||
 				  khugepaged_scan.address + HPAGE_PMD_SIZE >
 				  hend);
+			allocstall = sum_vm_event(ALLOCSTALL);
 			ret = khugepaged_scan_pmd(mm, vma,
 						  khugepaged_scan.address,
 						  hpage);
-- 
1.9.1

* Re: [PATCH v2 2/2] mm, thp: avoid unnecessary swapin in khugepaged
From: kbuild test robot @ 2016-03-13 9:45 UTC
To: Ebru Akagunduz
Cc: kbuild-all, linux-mm, hughd, riel, akpm, kirill.shutemov, n-horiguchi, aarcange, iamjoonsoo.kim, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hannes, mhocko, boaz

Hi Ebru,

[auto build test ERROR on next-20160311]
[cannot apply to v4.5-rc7 v4.5-rc6 v4.5-rc5 v4.5-rc7]
[if your patch is applied to the wrong git tree, please drop us a note to help improving the system]

url:    https://github.com/0day-ci/linux/commits/Ebru-Akagunduz/mm-vmstat-calculate-particular-vm-event/20160313-173055
config: i386-randconfig-x009-201611 (attached as .config)
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386

All errors (new ones prefixed by >>):

   mm/huge_memory.c: In function 'collapse_huge_page':
>> mm/huge_memory.c:2498:20: error: implicit declaration of function 'sum_vm_event' [-Werror=implicit-function-declaration]
     curr_allocstall = sum_vm_event(ALLOCSTALL);
                       ^
   cc1: some warnings being treated as errors

vim +/sum_vm_event +2498 mm/huge_memory.c

  2492		if (!pmd) {
  2493			result = SCAN_PMD_NULL;
  2494			goto out;
  2495		}
  2496	
  2497		swap = get_mm_counter(mm, MM_SWAPENTS);
> 2498		curr_allocstall = sum_vm_event(ALLOCSTALL);
  2499		/*
  2500		 * When system under pressure, don't swapin readahead.
  2501		 * So that avoid unnecessary resource consuming.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

* Re: [PATCH v2 2/2] mm, thp: avoid unnecessary swapin in khugepaged
From: Kirill A. Shutemov @ 2016-03-13 23:33 UTC
To: Ebru Akagunduz
Cc: linux-mm, hughd, riel, akpm, kirill.shutemov, n-horiguchi, aarcange, iamjoonsoo.kim, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hannes, mhocko, boaz

On Sun, Mar 13, 2016 at 11:28:55AM +0200, Ebru Akagunduz wrote:
> Currently khugepaged does swapin readahead to improve the
> THP collapse rate. This patch checks vm statistics
> to avoid the swapin workload when it is unnecessary, so that
> when the system is under pressure, khugepaged won't consume
> resources on swapin.
> 
> The patch was tested with a test program that allocates
> 800MB of memory, writes to it, and then sleeps. The system
> was forced to swap out all of it. Afterwards, the test program
> touches the area by writing, skipping one page in every
> 20 pages of the area. While waiting for swapin readahead
> of the remaining part, the memory was forced to be busy
> doing page reclaim. There was enough free memory during the
> test; khugepaged did not do swapin readahead because the
> system was busy.
> 
> Test results:
> 
>                         After swapped out
> -------------------------------------------------------------------
>               | Anonymous | AnonHugePages | Swap      | Fraction  |
> -------------------------------------------------------------------
> With patch    | 325784 kB | 325632 kB     | 474216 kB | 99%       |
> -------------------------------------------------------------------
> Without patch | 351308 kB | 350208 kB     | 448692 kB | 99%       |
> -------------------------------------------------------------------
> 
>                 After swapped in (waiting 10 minutes)
> -------------------------------------------------------------------
>               | Anonymous | AnonHugePages | Swap      | Fraction  |
> -------------------------------------------------------------------
> With patch    | 714164 kB | 489472 kB     | 85836 kB  | 68%       |
> -------------------------------------------------------------------
> Without patch | 586816 kB | 464896 kB     | 213184 kB | 79%       |
> -------------------------------------------------------------------
> 
> Signed-off-by: Ebru Akagunduz <ebru.akagunduz@gmail.com>
> Fixes: 363cd76e5b11c ("mm: make swapin readahead to improve thp collapse rate")
> ---
> Changes in v2:
>  - Add reference to specify which patch fixed (Ebru Akagunduz)
>  - Fix commit subject line (Ebru Akagunduz)
> 
>  mm/huge_memory.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 86e9666..4a60035 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -102,6 +102,7 @@ static DECLARE_WAIT_QUEUE_HEAD(khugepaged_wait);
>   */
>  static unsigned int khugepaged_max_ptes_none __read_mostly;
>  static unsigned int khugepaged_max_ptes_swap __read_mostly;
> +static unsigned long int allocstall = 0;

No need to zero it out. The variable is in .bss.

>  static int khugepaged(void *none);
>  static int khugepaged_slab_init(void);
> @@ -2438,7 +2439,7 @@ static void collapse_huge_page(struct mm_struct *mm,
>  	struct page *new_page;
>  	spinlock_t *pmd_ptl, *pte_ptl;
>  	int isolated = 0, result = 0;
> -	unsigned long hstart, hend;
> +	unsigned long hstart, hend, swap = 0, curr_allocstall = 0;

No need to zero these out either, because you always initialize them
anyway.

>  	struct mem_cgroup *memcg;
>  	unsigned long mmun_start;	/* For mmu_notifiers */
>  	unsigned long mmun_end;		/* For mmu_notifiers */
> @@ -2493,7 +2494,14 @@ static void collapse_huge_page(struct mm_struct *mm,
>  		goto out;
>  	}
>  
> -	__collapse_huge_page_swapin(mm, vma, address, pmd);
> +	swap = get_mm_counter(mm, MM_SWAPENTS);
> +	curr_allocstall = sum_vm_event(ALLOCSTALL);
> +	/*
> +	 * When system under pressure, don't swapin readahead.
> +	 * So that avoid unnecessary resource consuming.
> +	 */
> +	if (allocstall == curr_allocstall && swap != 0)
> +		__collapse_huge_page_swapin(mm, vma, address, pmd);

So, between these two points, where do new ALLOCSTALL events come from?

I would guess that in most cases they would come from allocation of the
huge page itself (if khugepaged defrag is enabled). So we are willing to
pay for allocating a new huge page, but not for swapping in.

I wonder if it was wise to allocate the huge page in the first place?

Or shouldn't we at least have consistent behaviour on swap-in vs.
allocation wrt the khugepaged defragmentation option?

Or am I wrong and ALLOCSTALLs aren't caused by khugepaged?

>  	anon_vma_lock_write(vma->anon_vma);
>  
> @@ -2790,6 +2798,7 @@ skip:
>  			VM_BUG_ON(khugepaged_scan.address < hstart ||
>  				  khugepaged_scan.address + HPAGE_PMD_SIZE >
>  				  hend);
> +			allocstall = sum_vm_event(ALLOCSTALL);
>  			ret = khugepaged_scan_pmd(mm, vma,
>  						  khugepaged_scan.address,
>  						  hpage);
> -- 
> 1.9.1

-- 
 Kirill A. Shutemov

* Re: [PATCH v2 2/2] mm, thp: avoid unnecessary swapin in khugepaged
From: Rik van Riel @ 2016-03-14 1:14 UTC
To: Kirill A. Shutemov, Ebru Akagunduz
Cc: linux-mm, hughd, akpm, kirill.shutemov, n-horiguchi, aarcange, iamjoonsoo.kim, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hannes, mhocko, boaz

On Mon, 2016-03-14 at 02:33 +0300, Kirill A. Shutemov wrote:
> On Sun, Mar 13, 2016 at 11:28:55AM +0200, Ebru Akagunduz wrote:
> > 
> > @@ -2493,7 +2494,14 @@ static void collapse_huge_page(struct mm_struct *mm,
> >  		goto out;
> >  	}
> >  
> > -	__collapse_huge_page_swapin(mm, vma, address, pmd);
> > +	swap = get_mm_counter(mm, MM_SWAPENTS);
> > +	curr_allocstall = sum_vm_event(ALLOCSTALL);
> > +	/*
> > +	 * When system under pressure, don't swapin readahead.
> > +	 * So that avoid unnecessary resource consuming.
> > +	 */
> > +	if (allocstall == curr_allocstall && swap != 0)
> > +		__collapse_huge_page_swapin(mm, vma, address, pmd);
> 
> So, between these two points, where do new ALLOCSTALL events come from?
> 
> I would guess that in most cases they would come from allocation of the
> huge page itself (if khugepaged defrag is enabled). So we are willing to
> pay for allocating a new huge page, but not for swapping in.
> 
> I wonder if it was wise to allocate the huge page in the first place?
> 
> Or shouldn't we at least have consistent behaviour on swap-in vs.
> allocation wrt the khugepaged defragmentation option?
> 
> Or am I wrong and ALLOCSTALLs aren't caused by khugepaged?

It could be caused by khugepaged, but it could just as well
be caused by any other task running in the system.

Khugepaged stores the allocstall value when it goes to sleep,
and checks it before calling (or not) __collapse_huge_page_swapin.

-- 
All Rights Reversed.
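
The snapshot-and-compare check discussed in this subthread can be sketched in userspace C. The sketch assumes a single global counter and hypothetical `demo_*` names; in the patch the counter is the ALLOCSTALL vm event summed over CPUs, and the snapshot is the file-scope `allocstall` variable in mm/huge_memory.c.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the system-wide ALLOCSTALL event count. */
static unsigned long demo_allocstall_events;

/* Value remembered at the earlier point (the patch's `allocstall`). */
static unsigned long demo_snapshot;

/* Any task that stalls in direct reclaim would bump the counter. */
static void demo_record_stall(void)
{
	demo_allocstall_events++;
}

/* Remember the current count before the scan starts. */
static void demo_take_snapshot(void)
{
	demo_snapshot = demo_allocstall_events;
}

/*
 * Swap in only if nothing stalled since the snapshot and the process
 * actually has swapped-out entries, mirroring the patch's condition
 * `allocstall == curr_allocstall && swap != 0`.
 */
static bool demo_should_swapin(unsigned long swap_entries)
{
	return demo_snapshot == demo_allocstall_events && swap_entries != 0;
}
```

Because the counter is global, a stall recorded by any task between the snapshot and the check suppresses the swapin, which matches the point made in the reply above that the events need not come from khugepaged itself.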