* [RFC v5 0/3] mm: make swapin readahead to gain more thp performance
From: Ebru Akagunduz
Date: 2015-09-14 19:31 UTC
To: linux-mm
Cc: akpm, kirill.shutemov, n-horiguchi, aarcange, riel, iamjoonsoo.kim, xiexiuqi, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hughd, hannes, mhocko, boaz, raindel, Ebru Akagunduz

This patch series makes swapin readahead up to a certain number to gain more THP performance, and adds tracepoints for khugepaged_scan_pmd, collapse_huge_page and __collapse_huge_page_isolate.

This patch series was written to deal with programs that access most, but not all, of their memory after they get swapped out. Currently these programs do not get their memory collapsed into THPs after the system has swapped their memory out, while they would get THPs before the swapping happened.

This patch series was tested with a test program that allocates 400MB of memory, writes to it, and then sleeps. I force the system to swap it all out. Afterwards, the test program touches the area by writing to it, leaving a piece of it untouched. This shows how much swapin readahead is done by the patch.

Test results:

                 After swapped out
-------------------------------------------------------------------
              | Anonymous | AnonHugePages | Swap      | Fraction |
-------------------------------------------------------------------
With patch    | 90076 kB  | 88064 kB      | 309928 kB | 99%      |
-------------------------------------------------------------------
Without patch | 194068 kB | 192512 kB     | 205936 kB | 99%      |
-------------------------------------------------------------------

                 After swapped in
-------------------------------------------------------------------
              | Anonymous | AnonHugePages | Swap      | Fraction |
-------------------------------------------------------------------
With patch    | 201408 kB | 198656 kB     | 198596 kB | 98%      |
-------------------------------------------------------------------
Without patch | 292624 kB | 192512 kB     | 107380 kB | 65%      |
-------------------------------------------------------------------

Ebru Akagunduz (3):
  mm: add tracepoint for scanning pages
  mm: make optimistic check for swapin readahead
  mm: make swapin readahead to improve thp collapse rate

 include/trace/events/huge_memory.h | 165 ++++++++++++++++++++++++
 mm/huge_memory.c                   | 248 ++++++++++++++++++++++++++++++++-----
 mm/internal.h                      |   4 +
 mm/memory.c                        |   2 +-
 4 files changed, 386 insertions(+), 33 deletions(-)
 create mode 100644 include/trace/events/huge_memory.h

-- 
1.9.1
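For reference, a minimal sketch of the kind of test program described above might look like this. This is not the author's actual test: the 400MB size and the skip-one-page-in-twenty pattern are taken from the descriptions in this thread, everything else is an assumption.

#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SIZE (400UL << 20)	/* 400MB, as in the description above */
#define PAGE 4096UL

int main(void)
{
	char *area = malloc(SIZE);
	unsigned long off;

	if (!area)
		return 1;
	memset(area, 1, SIZE);	/* fault in and dirty every page */
	sleep(600);		/* idle while the tester forces swapout */
	/* touch the area again, skipping one page in each twenty */
	for (off = 0; off < SIZE; off += PAGE)
		if ((off / PAGE) % 20)
			area[off] = 2;
	pause();		/* hold the mapping while THP usage is measured */
	return 0;
}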
* [RFC v5 1/3] mm: add tracepoint for scanning pages
From: Ebru Akagunduz
Date: 2015-09-14 19:31 UTC
To: linux-mm
Cc: akpm, kirill.shutemov, n-horiguchi, aarcange, riel, iamjoonsoo.kim, xiexiuqi, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hughd, hannes, mhocko, boaz, raindel, Ebru Akagunduz

Static tracepoints record data from functions. This makes it possible to automate debugging without making a lot of changes in the source code. This patch adds tracepoints for khugepaged_scan_pmd, collapse_huge_page and __collapse_huge_page_isolate.

Signed-off-by: Ebru Akagunduz <ebru.akagunduz@gmail.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Rik van Riel <riel@redhat.com>
---
Changes in v2:
 - Nothing changed

Changes in v3:
 - Print the page address instead of vm_start (Vlastimil Babka)
 - Define constants to specify the exact tracepoint result (Vlastimil Babka)

Changes in v4:
 - Change the constant prefix to SCAN_ instead of MM_ (Vlastimil Babka)
 - Move the constants into the enum (Vlastimil Babka)
 - Move the constants from mm.h to huge_memory.c (because they will only be used in huge_memory.c) (Vlastimil Babka)
 - Print the pfn in tracepoints (Vlastimil Babka)
 - Print the scan result as a string in the tracepoint (Vlastimil Babka) (I tried to do the same thing mm/compaction.c does to print strings. My patch did not print the string; I was missing something but could not see why)
 - Do not change function return values for success and failure; leave them as they were, in agreement with Documentation/CodingStyle (Vlastimil Babka)
 - Define scan_result to specify the tracepoint result (Ebru Akagunduz)
 - Add an out_nolock label to avoid multiple tracepoints (Vlastimil Babka)

Changes in v5:
 - Use tracepoint macros to print the string in userspace (fixes the string printing problem) (Vlastimil Babka)

 include/trace/events/huge_memory.h | 137 +++++++++++++++++++++++++++++++
 mm/huge_memory.c                   | 164 ++++++++++++++++++++++++++++++-------
 2 files changed, 270 insertions(+), 31 deletions(-)
 create mode 100644 include/trace/events/huge_memory.h

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
new file mode 100644
index 0000000..1df9bf5
--- /dev/null
+++ b/include/trace/events/huge_memory.h
@@ -0,0 +1,137 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM huge_memory
+
+#if !defined(__HUGE_MEMORY_H) || defined(TRACE_HEADER_MULTI_READ)
+#define __HUGE_MEMORY_H
+
+#include <linux/tracepoint.h>
+
+#include <trace/events/gfpflags.h>
+
+#define SCAN_STATUS \
+	EM( SCAN_FAIL, "failed") \
+	EM( SCAN_SUCCEED, "succeeded") \
+	EM( SCAN_PMD_NULL, "pmd_null") \
+	EM( SCAN_EXCEED_NONE_PTE, "exceed_none_pte") \
+	EM( SCAN_PTE_NON_PRESENT, "pte_non_present") \
+	EM( SCAN_PAGE_RO, "no_writable_page") \
+	EM( SCAN_NO_REFERENCED_PAGE, "no_referenced_page") \
+	EM( SCAN_PAGE_NULL, "page_null") \
+	EM( SCAN_SCAN_ABORT, "scan_aborted") \
+	EM( SCAN_PAGE_COUNT, "not_suitable_page_count") \
+	EM( SCAN_PAGE_LRU, "page_not_in_lru") \
+	EM( SCAN_PAGE_LOCK, "page_locked") \
+	EM( SCAN_PAGE_ANON, "page_not_anon") \
+	EM( SCAN_ANY_PROCESS, "no_process_for_page") \
+	EM( SCAN_VMA_NULL, "vma_null") \
+	EM( SCAN_VMA_CHECK, "vma_check_failed") \
+	EM( SCAN_ADDRESS_RANGE, "not_suitable_address_range") \
+	EM( SCAN_SWAP_CACHE_PAGE,
"page_swap_cache") \ + EM( SCAN_DEL_PAGE_LRU, "could_not_delete_page_from_lru")\ + EM( SCAN_ALLOC_HUGE_PAGE_FAIL, "alloc_huge_page_failed") \ + EMe( SCAN_CGROUP_CHARGE_FAIL, "ccgroup_charge_failed") + +#undef EM +#undef EMe +#define EM(a, b) TRACE_DEFINE_ENUM(a); +#define EMe(a, b) TRACE_DEFINE_ENUM(a); + +SCAN_STATUS + +#undef EM +#undef EMe +#define EM(a, b) {a, b}, +#define EMe(a, b) {a, b} + +TRACE_EVENT(mm_khugepaged_scan_pmd, + + TP_PROTO(struct mm_struct *mm, unsigned long pfn, bool writable, + bool referenced, int none_or_zero, int status), + + TP_ARGS(mm, pfn, writable, referenced, none_or_zero, status), + + TP_STRUCT__entry( + __field(struct mm_struct *, mm) + __field(unsigned long, pfn) + __field(bool, writable) + __field(bool, referenced) + __field(int, none_or_zero) + __field(int, status) + ), + + TP_fast_assign( + __entry->mm = mm; + __entry->pfn = pfn; + __entry->writable = writable; + __entry->referenced = referenced; + __entry->none_or_zero = none_or_zero; + __entry->status = status; + ), + + TP_printk("mm=%p, scan_pfn=0x%lx, writable=%d, referenced=%d, none_or_zero=%d, status=%s", + __entry->mm, + __entry->pfn, + __entry->writable, + __entry->referenced, + __entry->none_or_zero, + __print_symbolic(__entry->status, SCAN_STATUS)) +); + +TRACE_EVENT(mm_collapse_huge_page, + + TP_PROTO(struct mm_struct *mm, int isolated, int status), + + TP_ARGS(mm, isolated, status), + + TP_STRUCT__entry( + __field(struct mm_struct *, mm) + __field(int, isolated) + __field(int, status) + ), + + TP_fast_assign( + __entry->mm = mm; + __entry->isolated = isolated; + __entry->status = status; + ), + + TP_printk("mm=%p, isolated=%d, status=%s", + __entry->mm, + __entry->isolated, + __print_symbolic(__entry->status, SCAN_STATUS)) +); + +TRACE_EVENT(mm_collapse_huge_page_isolate, + + TP_PROTO(unsigned long pfn, int none_or_zero, + bool referenced, bool writable, int status), + + TP_ARGS(pfn, none_or_zero, referenced, writable, status), + + TP_STRUCT__entry( + __field(unsigned long, pfn) + __field(int, none_or_zero) + __field(bool, referenced) + __field(bool, writable) + __field(int, status) + ), + + TP_fast_assign( + __entry->pfn = pfn; + __entry->none_or_zero = none_or_zero; + __entry->referenced = referenced; + __entry->writable = writable; + __entry->status = status; + ), + + TP_printk("scan_pfn=0x%lx, none_or_zero=%d, referenced=%d, writable=%d, status=%s", + __entry->pfn, + __entry->none_or_zero, + __entry->referenced, + __entry->writable, + __print_symbolic(__entry->status, SCAN_STATUS)) +); + +#endif /* __HUGE_MEMORY_H */ +#include <trace/define_trace.h> + diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 4b06b8d..4215cee 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -31,6 +31,33 @@ #include <asm/pgalloc.h> #include "internal.h" +enum scan_result { + SCAN_FAIL, + SCAN_SUCCEED, + SCAN_PMD_NULL, + SCAN_EXCEED_NONE_PTE, + SCAN_PTE_NON_PRESENT, + SCAN_PAGE_RO, + SCAN_NO_REFERENCED_PAGE, + SCAN_PAGE_NULL, + SCAN_SCAN_ABORT, + SCAN_PAGE_COUNT, + SCAN_PAGE_LRU, + SCAN_PAGE_LOCK, + SCAN_PAGE_ANON, + SCAN_ANY_PROCESS, + SCAN_VMA_NULL, + SCAN_VMA_CHECK, + SCAN_ADDRESS_RANGE, + SCAN_SWAP_CACHE_PAGE, + SCAN_DEL_PAGE_LRU, + SCAN_ALLOC_HUGE_PAGE_FAIL, + SCAN_CGROUP_CHARGE_FAIL +}; + +#define CREATE_TRACE_POINTS +#include <trace/events/huge_memory.h> + /* * By default transparent hugepage support is disabled in order that avoid * to risk increase the memory footprint of applications without a guaranteed @@ -2199,25 +2226,31 @@ static int __collapse_huge_page_isolate(struct vm_area_struct 
*vma, unsigned long address, pte_t *pte) { - struct page *page; + struct page *page = NULL; pte_t *_pte; - int none_or_zero = 0; + int none_or_zero = 0, result = 0; bool referenced = false, writable = false; for (_pte = pte; _pte < pte+HPAGE_PMD_NR; _pte++, address += PAGE_SIZE) { pte_t pteval = *_pte; if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) { if (!userfaultfd_armed(vma) && - ++none_or_zero <= khugepaged_max_ptes_none) + ++none_or_zero <= khugepaged_max_ptes_none) { continue; - else + } else { + result = SCAN_EXCEED_NONE_PTE; goto out; + } } - if (!pte_present(pteval)) + if (!pte_present(pteval)) { + result = SCAN_PTE_NON_PRESENT; goto out; + } page = vm_normal_page(vma, address, pteval); - if (unlikely(!page)) + if (unlikely(!page)) { + result = SCAN_PAGE_NULL; goto out; + } VM_BUG_ON_PAGE(PageCompound(page), page); VM_BUG_ON_PAGE(!PageAnon(page), page); @@ -2229,8 +2262,10 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma, * is needed to serialize against split_huge_page * when invoked from the VM. */ - if (!trylock_page(page)) + if (!trylock_page(page)) { + result = SCAN_PAGE_LOCK; goto out; + } /* * cannot use mapcount: can't collapse if there's a gup pin. @@ -2239,6 +2274,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma, */ if (page_count(page) != 1 + !!PageSwapCache(page)) { unlock_page(page); + result = SCAN_PAGE_COUNT; goto out; } if (pte_write(pteval)) { @@ -2246,6 +2282,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma, } else { if (PageSwapCache(page) && !reuse_swap_page(page)) { unlock_page(page); + result = SCAN_SWAP_CACHE_PAGE; goto out; } /* @@ -2260,6 +2297,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma, */ if (isolate_lru_page(page)) { unlock_page(page); + result = SCAN_DEL_PAGE_LRU; goto out; } /* 0 stands for page_is_file_cache(page) == false */ @@ -2273,10 +2311,21 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma, mmu_notifier_test_young(vma->vm_mm, address)) referenced = true; } - if (likely(referenced && writable)) - return 1; + if (likely(writable)) { + if (likely(referenced)) { + result = SCAN_SUCCEED; + trace_mm_collapse_huge_page_isolate(page_to_pfn(page), none_or_zero, + referenced, writable, result); + return 1; + } + } else { + result = SCAN_PAGE_RO; + } + out: release_pte_pages(pte, _pte); + trace_mm_collapse_huge_page_isolate(page_to_pfn(page), none_or_zero, + referenced, writable, result); return 0; } @@ -2515,7 +2564,7 @@ static void collapse_huge_page(struct mm_struct *mm, pgtable_t pgtable; struct page *new_page; spinlock_t *pmd_ptl, *pte_ptl; - int isolated; + int isolated, result = 0; unsigned long hstart, hend; struct mem_cgroup *memcg; unsigned long mmun_start; /* For mmu_notifiers */ @@ -2530,12 +2579,16 @@ static void collapse_huge_page(struct mm_struct *mm, /* release the mmap_sem read lock. */ new_page = khugepaged_alloc_page(hpage, gfp, mm, vma, address, node); - if (!new_page) - return; + if (!new_page) { + result = SCAN_ALLOC_HUGE_PAGE_FAIL; + goto out_nolock; + } if (unlikely(mem_cgroup_try_charge(new_page, mm, - gfp, &memcg))) - return; + gfp, &memcg))) { + result = SCAN_CGROUP_CHARGE_FAIL; + goto out_nolock; + } /* * Prevent all access to pagetables with the exception of @@ -2543,21 +2596,31 @@ static void collapse_huge_page(struct mm_struct *mm, * handled by the anon_vma lock + PG_lock. 
*/ down_write(&mm->mmap_sem); - if (unlikely(khugepaged_test_exit(mm))) + if (unlikely(khugepaged_test_exit(mm))) { + result = SCAN_ANY_PROCESS; goto out; + } vma = find_vma(mm, address); - if (!vma) + if (!vma) { + result = SCAN_VMA_NULL; goto out; + } hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK; hend = vma->vm_end & HPAGE_PMD_MASK; - if (address < hstart || address + HPAGE_PMD_SIZE > hend) + if (address < hstart || address + HPAGE_PMD_SIZE > hend) { + result = SCAN_ADDRESS_RANGE; goto out; - if (!hugepage_vma_check(vma)) + } + if (!hugepage_vma_check(vma)) { + result = SCAN_VMA_CHECK; goto out; + } pmd = mm_find_pmd(mm, address); - if (!pmd) + if (!pmd) { + result = SCAN_PMD_NULL; goto out; + } anon_vma_lock_write(vma->anon_vma); @@ -2594,6 +2657,7 @@ static void collapse_huge_page(struct mm_struct *mm, pmd_populate(mm, pmd, pmd_pgtable(_pmd)); spin_unlock(pmd_ptl); anon_vma_unlock_write(vma->anon_vma); + result = SCAN_FAIL; goto out; } @@ -2631,10 +2695,15 @@ static void collapse_huge_page(struct mm_struct *mm, *hpage = NULL; khugepaged_pages_collapsed++; + result = SCAN_SUCCEED; out_up_write: up_write(&mm->mmap_sem); + trace_mm_collapse_huge_page(mm, isolated, result); return; +out_nolock: + trace_mm_collapse_huge_page(mm, isolated, result); + return; out: mem_cgroup_cancel_charge(new_page, memcg); goto out_up_write; @@ -2647,8 +2716,8 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, { pmd_t *pmd; pte_t *pte, *_pte; - int ret = 0, none_or_zero = 0; - struct page *page; + int ret = 0, none_or_zero = 0, result = 0; + struct page *page = NULL; unsigned long _address; spinlock_t *ptl; int node = NUMA_NO_NODE; @@ -2657,8 +2726,10 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, VM_BUG_ON(address & ~HPAGE_PMD_MASK); pmd = mm_find_pmd(mm, address); - if (!pmd) + if (!pmd) { + result = SCAN_PMD_NULL; goto out; + } memset(khugepaged_node_load, 0, sizeof(khugepaged_node_load)); pte = pte_offset_map_lock(mm, pmd, address, &ptl); @@ -2667,19 +2738,25 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, pte_t pteval = *_pte; if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) { if (!userfaultfd_armed(vma) && - ++none_or_zero <= khugepaged_max_ptes_none) + ++none_or_zero <= khugepaged_max_ptes_none) { continue; - else + } else { + result = SCAN_EXCEED_NONE_PTE; goto out_unmap; + } } - if (!pte_present(pteval)) + if (!pte_present(pteval)) { + result = SCAN_PTE_NON_PRESENT; goto out_unmap; + } if (pte_write(pteval)) writable = true; page = vm_normal_page(vma, _address, pteval); - if (unlikely(!page)) + if (unlikely(!page)) { + result = SCAN_PAGE_NULL; goto out_unmap; + } /* * Record which node the original page is from and save this * information to khugepaged_node_load[]. @@ -2687,26 +2764,49 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, * hit record. */ node = page_to_nid(page); - if (khugepaged_scan_abort(node)) + if (khugepaged_scan_abort(node)) { + result = SCAN_SCAN_ABORT; goto out_unmap; + } khugepaged_node_load[node]++; VM_BUG_ON_PAGE(PageCompound(page), page); - if (!PageLRU(page) || PageLocked(page) || !PageAnon(page)) + if (!PageLRU(page)) { + result = SCAN_SCAN_ABORT; + goto out_unmap; + } + if (PageLocked(page)) { + result = SCAN_PAGE_LOCK; goto out_unmap; + } + if (!PageAnon(page)) { + result = SCAN_PAGE_ANON; + goto out_unmap; + } + /* * cannot use mapcount: can't collapse if there's a gup pin. * The page must only be referenced by the scanned process * and page swap cache. 
 		 */
-		if (page_count(page) != 1 + !!PageSwapCache(page))
+		if (page_count(page) != 1 + !!PageSwapCache(page)) {
+			result = SCAN_PAGE_COUNT;
 			goto out_unmap;
+		}
 		if (pte_young(pteval) || page_is_young(page) ||
 		    PageReferenced(page) ||
 		    mmu_notifier_test_young(vma->vm_mm, address))
 			referenced = true;
 	}
-	if (referenced && writable)
-		ret = 1;
+	if (writable) {
+		if (referenced) {
+			result = SCAN_SUCCEED;
+			ret = 1;
+		} else {
+			result = SCAN_NO_REFERENCED_PAGE;
+		}
+	} else {
+		result = SCAN_PAGE_RO;
+	}
 out_unmap:
 	pte_unmap_unlock(pte, ptl);
 	if (ret) {
@@ -2715,6 +2815,8 @@ out_unmap:
 		collapse_huge_page(mm, address, hpage, vma, node);
 	}
 out:
+	trace_mm_khugepaged_scan_pmd(mm, page_to_pfn(page), writable, referenced,
+				     none_or_zero, result);
 	return ret;
 }
-- 
1.9.1
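A side note on the EM()/EMe() construction in the header added above, since it recurs in the following patches: it is the classic X-macro pattern. The SCAN_STATUS list is expanded twice with different macro definitions, once so that TRACE_DEFINE_ENUM() exports each enum value to userspace tracing tools, and once as {value, "string"} pairs for __print_symbolic() to translate the status into a readable name. A stripped-down illustration of the same pattern (generic C for illustration, not kernel code):

#define COLOR_LIST \
	EM(COLOR_RED, "red") \
	EMe(COLOR_BLUE, "blue")

/* first expansion: declare the enum values themselves */
#define EM(a, b)	a,
#define EMe(a, b)	a
enum color { COLOR_LIST };
#undef EM
#undef EMe

/* second expansion: build a value -> name table from the same list */
#define EM(a, b)	{ a, b },
#define EMe(a, b)	{ a, b }
static const struct {
	int val;
	const char *name;
} color_names[] = { COLOR_LIST };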
* [RFC v5 2/3] mm: make optimistic check for swapin readahead
From: Ebru Akagunduz
Date: 2015-09-14 19:31 UTC
To: linux-mm
Cc: akpm, kirill.shutemov, n-horiguchi, aarcange, riel, iamjoonsoo.kim, xiexiuqi, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hughd, hannes, mhocko, boaz, raindel, Ebru Akagunduz

This patch introduces a new sysfs integer knob
/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_swap
which makes an optimistic check for swapin readahead to
increase the THP collapse rate. Before bringing the swapped-out
pages back to memory, khugepaged checks them and allows up to a
certain number. It also prints out the amount of unmapped ptes
via the tracepoints.

Signed-off-by: Ebru Akagunduz <ebru.akagunduz@gmail.com>
---
Changes in v2:
 - Nothing changed

Changes in v3:
 - Define a constant for the exact tracepoint result (Vlastimil Babka)

Changes in v4:
 - Add the sysfs knob, as requested (Kirill A. Shutemov)

Changes in v5:
 - Rename MM_EXCEED_SWAP_PTE to SCAN_EXCEED_SWAP_PTE (Vlastimil Babka)
 - Add tracepoint macros to print the string (Vlastimil Babka)

 include/trace/events/huge_memory.h | 14 +++++++-----
 mm/huge_memory.c                   | 45 +++++++++++++++++++++++++++++++++++---
 2 files changed, 51 insertions(+), 8 deletions(-)

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index 1df9bf5..153274c 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -29,7 +29,8 @@
 	EM( SCAN_SWAP_CACHE_PAGE, "page_swap_cache") \
 	EM( SCAN_DEL_PAGE_LRU, "could_not_delete_page_from_lru")\
 	EM( SCAN_ALLOC_HUGE_PAGE_FAIL, "alloc_huge_page_failed") \
-	EMe( SCAN_CGROUP_CHARGE_FAIL, "ccgroup_charge_failed")
+	EM( SCAN_CGROUP_CHARGE_FAIL, "ccgroup_charge_failed") \
+	EMe( SCAN_EXCEED_SWAP_PTE, "exceed_swap_pte")
 
 #undef EM
 #undef EMe
@@ -46,9 +47,9 @@ SCAN_STATUS
 TRACE_EVENT(mm_khugepaged_scan_pmd,
 
 	TP_PROTO(struct mm_struct *mm, unsigned long pfn, bool writable,
-		 bool referenced, int none_or_zero, int status),
+		 bool referenced, int none_or_zero, int status, int unmapped),
 
-	TP_ARGS(mm, pfn, writable, referenced, none_or_zero, status),
+	TP_ARGS(mm, pfn, writable, referenced, none_or_zero, status, unmapped),
 
 	TP_STRUCT__entry(
 		__field(struct mm_struct *, mm)
@@ -57,6 +58,7 @@ TRACE_EVENT(mm_khugepaged_scan_pmd,
 		__field(bool, referenced)
 		__field(int, none_or_zero)
 		__field(int, status)
+		__field(int, unmapped)
 	),
 
 	TP_fast_assign(
@@ -66,15 +68,17 @@ TRACE_EVENT(mm_khugepaged_scan_pmd,
 		__entry->referenced = referenced;
 		__entry->none_or_zero = none_or_zero;
 		__entry->status = status;
+		__entry->unmapped = unmapped;
 	),
 
-	TP_printk("mm=%p, scan_pfn=0x%lx, writable=%d, referenced=%d, none_or_zero=%d, status=%s",
+	TP_printk("mm=%p, scan_pfn=0x%lx, writable=%d, referenced=%d, none_or_zero=%d, status=%s, unmapped=%d",
 		__entry->mm,
 		__entry->pfn,
 		__entry->writable,
 		__entry->referenced,
 		__entry->none_or_zero,
-		__print_symbolic(__entry->status, SCAN_STATUS))
+		__print_symbolic(__entry->status, SCAN_STATUS),
+		__entry->unmapped)
 );
TRACE_EVENT(mm_collapse_huge_page, diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 4215cee..049b0db 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -26,6 +26,7 @@ #include <linux/hashtable.h> #include <linux/userfaultfd_k.h> #include <linux/page_idle.h> +#include <linux/swapops.h> #include <asm/tlb.h> #include <asm/pgalloc.h> @@ -52,7 +53,8 @@ enum scan_result { SCAN_SWAP_CACHE_PAGE, SCAN_DEL_PAGE_LRU, SCAN_ALLOC_HUGE_PAGE_FAIL, - SCAN_CGROUP_CHARGE_FAIL + SCAN_CGROUP_CHARGE_FAIL, + SCAN_EXCEED_SWAP_PTE }; #define CREATE_TRACE_POINTS @@ -94,6 +96,7 @@ static DECLARE_WAIT_QUEUE_HEAD(khugepaged_wait); * fault. */ static unsigned int khugepaged_max_ptes_none __read_mostly = HPAGE_PMD_NR-1; +static unsigned int khugepaged_max_ptes_swap __read_mostly = HPAGE_PMD_NR/8; static int khugepaged(void *none); static int khugepaged_slab_init(void); @@ -580,6 +583,33 @@ static struct kobj_attribute khugepaged_max_ptes_none_attr = __ATTR(max_ptes_none, 0644, khugepaged_max_ptes_none_show, khugepaged_max_ptes_none_store); +static ssize_t khugepaged_max_ptes_swap_show(struct kobject *kobj, + struct kobj_attribute *attr, + char *buf) +{ + return sprintf(buf, "%u\n", khugepaged_max_ptes_swap); +} + +static ssize_t khugepaged_max_ptes_swap_store(struct kobject *kobj, + struct kobj_attribute *attr, + const char *buf, size_t count) +{ + int err; + unsigned long max_ptes_swap; + + err = kstrtoul(buf, 10, &max_ptes_swap); + if (err || max_ptes_swap > HPAGE_PMD_NR-1) + return -EINVAL; + + khugepaged_max_ptes_swap = max_ptes_swap; + + return count; +} + +static struct kobj_attribute khugepaged_max_ptes_swap_attr = + __ATTR(max_ptes_swap, 0644, khugepaged_max_ptes_swap_show, + khugepaged_max_ptes_swap_store); + static struct attribute *khugepaged_attr[] = { &khugepaged_defrag_attr.attr, &khugepaged_max_ptes_none_attr.attr, @@ -588,6 +618,7 @@ static struct attribute *khugepaged_attr[] = { &full_scans_attr.attr, &scan_sleep_millisecs_attr.attr, &alloc_sleep_millisecs_attr.attr, + &khugepaged_max_ptes_swap_attr.attr, NULL, }; @@ -2720,7 +2751,7 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct page *page = NULL; unsigned long _address; spinlock_t *ptl; - int node = NUMA_NO_NODE; + int node = NUMA_NO_NODE, unmapped = 0; bool writable = false, referenced = false; VM_BUG_ON(address & ~HPAGE_PMD_MASK); @@ -2736,6 +2767,14 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, for (_address = address, _pte = pte; _pte < pte+HPAGE_PMD_NR; _pte++, _address += PAGE_SIZE) { pte_t pteval = *_pte; + if (is_swap_pte(pteval)) { + if (++unmapped <= khugepaged_max_ptes_swap) { + continue; + } else { + ret = SCAN_EXCEED_SWAP_PTE; + goto out_unmap; + } + } if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) { if (!userfaultfd_armed(vma) && ++none_or_zero <= khugepaged_max_ptes_none) { @@ -2816,7 +2855,7 @@ out_unmap: } out: trace_mm_khugepaged_scan_pmd(mm, page_to_pfn(page), writable, referenced, - none_or_zero, result); + none_or_zero, result, unmapped); return ret; } -- 1.9.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 17+ messages in thread
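For scale: with 2MB huge pages and 4kB base pages, HPAGE_PMD_NR is 512, so the default of HPAGE_PMD_NR/8 above works out to max_ptes_swap = 64. In other words, khugepaged will tolerate up to 64 of the 512 ptes in a candidate 2MB range being swapped out before it gives up on collapsing that range.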
* Re: [RFC v5 2/3] mm: make optimistic check for swapin readahead
From: Rik van Riel
Date: 2015-09-14 19:47 UTC
To: Ebru Akagunduz, linux-mm
Cc: akpm, kirill.shutemov, n-horiguchi, aarcange, iamjoonsoo.kim, xiexiuqi, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hughd, hannes, mhocko, boaz, raindel

On 09/14/2015 03:31 PM, Ebru Akagunduz wrote:
> This patch introduces a new sysfs integer knob
> /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_swap
> which makes an optimistic check for swapin readahead to
> increase the THP collapse rate. Before bringing the swapped-out
> pages back to memory, khugepaged checks them and allows up to a
> certain number. It also prints out the amount of unmapped ptes
> via the tracepoints.

This may need some more refinement in the future, but your patch series seems to create a large improvement over what we have now.

> Signed-off-by: Ebru Akagunduz <ebru.akagunduz@gmail.com>

Acked-by: Rik van Riel <riel@redhat.com>

-- 
All rights reversed
* Re: [RFC v5 2/3] mm: make optimistic check for swapin readahead
From: Andrew Morton
Date: 2015-09-14 21:33 UTC
To: Ebru Akagunduz
Cc: linux-mm, kirill.shutemov, n-horiguchi, aarcange, riel, iamjoonsoo.kim, xiexiuqi, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hughd, hannes, mhocko, boaz, raindel

On Mon, 14 Sep 2015 22:31:44 +0300 Ebru Akagunduz <ebru.akagunduz@gmail.com> wrote:

> This patch introduces a new sysfs integer knob
> /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_swap
> which makes an optimistic check for swapin readahead to
> increase the THP collapse rate. Before bringing the swapped-out
> pages back to memory, khugepaged checks them and allows up to a
> certain number. It also prints out the amount of unmapped ptes
> via the tracepoints.

Can we please get this control documented? Documentation/vm/transhuge.txt appears to be the place for it.
* Re: [RFC v5 2/3] mm: make optimistic check for swapin readahead
From: Ebru Akagunduz
Date: 2015-09-15 20:08 UTC
To: Andrew Morton
Cc: linux-mm, kirill.shutemov, n-horiguchi, aarcange, riel, iamjoonsoo.kim, xiexiuqi, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hughd, hannes, mhocko, boaz, raindel

On Mon, Sep 14, 2015 at 02:33:55PM -0700, Andrew Morton wrote:
> On Mon, 14 Sep 2015 22:31:44 +0300 Ebru Akagunduz <ebru.akagunduz@gmail.com> wrote:
>
> > This patch introduces a new sysfs integer knob
> > /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_swap
> > which makes an optimistic check for swapin readahead to
> > increase the THP collapse rate. Before bringing the swapped-out
> > pages back to memory, khugepaged checks them and allows up to a
> > certain number. It also prints out the amount of unmapped ptes
> > via the tracepoints.
>
> Can we please get this control documented?
> Documentation/vm/transhuge.txt appears to be the place for it.

I will add documentation about max_ptes_swap to Documentation/vm/transhuge.txt and send it with a new patch.

Kind regards,
Ebru
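For illustration, the documentation Andrew asks for might read roughly as follows in Documentation/vm/transhuge.txt. The wording here is only a sketch, not the text that was eventually submitted:

max_ptes_swap specifies how many pages of a huge-page-sized range may
currently be swapped out for khugepaged to still consider collapsing
the range; the missing pages are swapped back in first. A higher value
recovers THPs sooner after swapout, at the cost of extra swap I/O; a
lower value avoids speculative swapin while memory is still scarce.

/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_swap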
* [RFC v5 3/3] mm: make swapin readahead to improve thp collapse rate
From: Ebru Akagunduz
Date: 2015-09-14 19:31 UTC
To: linux-mm
Cc: akpm, kirill.shutemov, n-horiguchi, aarcange, riel, iamjoonsoo.kim, xiexiuqi, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hughd, hannes, mhocko, boaz, raindel, Ebru Akagunduz

This patch makes swapin readahead improve the THP collapse rate.
When khugepaged scans pages, a few of the pages can be in the
swap area.

With the patch, khugepaged can collapse 4kB pages into a THP when
there are up to max_ptes_swap swap ptes in a 2MB range.

The patch was tested with a test program that allocates 400MB of
memory, writes to it, and then sleeps. I force the system to swap
it all out. Afterwards, the test program touches the area by
writing to it, skipping one page in each 20 pages of the area.

Without the patch, the system did no swapin readahead. The THP
rate was 65% of the program's memory and did not change over time.

With this patch, after 10 minutes of waiting, khugepaged had
collapsed 99% of the program's memory.

Signed-off-by: Ebru Akagunduz <ebru.akagunduz@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
---
Changes in v2:
 - Use the FAULT_FLAG_ALLOW_RETRY|FAULT_FLAG_RETRY_NOWAIT flags instead of 0x0 when calling do_swap_page from __collapse_huge_page_swapin (Rik van Riel)

Changes in v3:
 - Catch the VM_FAULT_HWPOISON and VM_FAULT_OOM return cases in __collapse_huge_page_swapin (Kirill A. Shutemov)

Changes in v4:
 - Fix broken indentation, reverting the if (...) statement in __collapse_huge_page_swapin (Kirill A. Shutemov)
 - Fix the check statement of ret (Kirill A. Shutemov)
 - Use the swapped_in name instead of swap_pte

Changes in v5:
 - Export do_swap_page in mm/internal.h instead of outside of mm (Vlastimil Babka)

Test results:

                 After swapped out
-------------------------------------------------------------------
              | Anonymous | AnonHugePages | Swap      | Fraction |
-------------------------------------------------------------------
With patch    | 90076 kB  | 88064 kB      | 309928 kB | 99%      |
-------------------------------------------------------------------
Without patch | 194068 kB | 192512 kB     | 205936 kB | 99%      |
-------------------------------------------------------------------

                 After swapped in
-------------------------------------------------------------------
              | Anonymous | AnonHugePages | Swap      | Fraction |
-------------------------------------------------------------------
With patch    | 201408 kB | 198656 kB     | 198596 kB | 98%      |
-------------------------------------------------------------------
Without patch | 292624 kB | 192512 kB     | 107380 kB | 65%      |
-------------------------------------------------------------------

 include/trace/events/huge_memory.h | 24 +++++++++++++++++++++
 mm/huge_memory.c                   | 43 ++++++++++++++++++++++++++++++++++++++
 mm/internal.h                      |  4 ++++
 mm/memory.c                        |  2 +-
 4 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index 153274c..1efc7f1 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -136,6 +136,30 @@ TRACE_EVENT(mm_collapse_huge_page_isolate,
 		__print_symbolic(__entry->status, SCAN_STATUS))
 );
 
+TRACE_EVENT(mm_collapse_huge_page_swapin,
+
+	TP_PROTO(struct mm_struct *mm, int swapped_in, int ret),
+
+	TP_ARGS(mm, swapped_in, ret),
+
+	TP_STRUCT__entry(
+		__field(struct mm_struct *, mm)
+		__field(int, swapped_in)
+		__field(int, ret)
+	),
+
+	TP_fast_assign(
+		__entry->mm = mm;
+		__entry->swapped_in = swapped_in;
+		__entry->ret = ret;
+	),
+
+	TP_printk("mm=%p, swapped_in=%d, ret=%d",
+		__entry->mm,
+		__entry->swapped_in,
+		__entry->ret)
+);
+
 #endif /* __HUGE_MEMORY_H */
 #include <trace/define_trace.h>

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 049b0db..e83f20a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2584,6 +2584,47 @@ static bool hugepage_vma_check(struct vm_area_struct *vma)
 	return true;
 }
 
+/*
+ * Bring missing pages in from swap, to complete THP collapse.
+ * Only done if khugepaged_scan_pmd believes it is worthwhile.
+ *
+ * Called and returns without pte mapped or spinlocks held,
+ * but with mmap_sem held to protect against vma changes.
+ */ + +static void __collapse_huge_page_swapin(struct mm_struct *mm, + struct vm_area_struct *vma, + unsigned long address, pmd_t *pmd, + pte_t *pte) +{ + unsigned long _address; + pte_t pteval = *pte; + int swapped_in = 0, ret = 0; + + pte = pte_offset_map(pmd, address); + for (_address = address; _address < address + HPAGE_PMD_NR*PAGE_SIZE; + pte++, _address += PAGE_SIZE) { + pteval = *pte; + if (!is_swap_pte(pteval)) + continue; + swapped_in++; + ret = do_swap_page(mm, vma, _address, pte, pmd, + FAULT_FLAG_ALLOW_RETRY|FAULT_FLAG_RETRY_NOWAIT, + pteval); + if (ret & VM_FAULT_ERROR) { + trace_mm_collapse_huge_page_swapin(mm, swapped_in, 0); + return; + } + /* pte is unmapped now, we need to map it */ + pte = pte_offset_map(pmd, _address); + } + pte--; + pte_unmap(pte); + trace_mm_collapse_huge_page_swapin(mm, swapped_in, 1); +} + + + static void collapse_huge_page(struct mm_struct *mm, unsigned long address, struct page **hpage, @@ -2655,6 +2696,8 @@ static void collapse_huge_page(struct mm_struct *mm, anon_vma_lock_write(vma->anon_vma); + __collapse_huge_page_swapin(mm, vma, address, pmd, pte); + pte = pte_offset_map(pmd, address); pte_ptl = pte_lockptr(mm, pmd); diff --git a/mm/internal.h b/mm/internal.h index bc0fa9a..867ea14 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -14,6 +14,10 @@ #include <linux/fs.h> #include <linux/mm.h> +extern int do_swap_page(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long address, pte_t *page_table, pmd_t *pmd, + unsigned int flags, pte_t orig_pte); + void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma, unsigned long floor, unsigned long ceiling); diff --git a/mm/memory.c b/mm/memory.c index 9cb2747..caecc64 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2441,7 +2441,7 @@ EXPORT_SYMBOL(unmap_mapping_range); * We return with the mmap_sem locked or unlocked in the same cases * as does filemap_fault(). */ -static int do_swap_page(struct mm_struct *mm, struct vm_area_struct *vma, +int do_swap_page(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long address, pte_t *page_table, pmd_t *pmd, unsigned int flags, pte_t orig_pte) { -- 1.9.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 17+ messages in thread
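One behavioural detail worth spelling out, since it comes up later in this thread: because __collapse_huge_page_swapin() passes FAULT_FLAG_ALLOW_RETRY|FAULT_FLAG_RETRY_NOWAIT, do_swap_page() only initiates the swap reads and returns without waiting for the I/O to complete. The first scan of an extent therefore typically just starts the swapins, and the collapse itself tends to succeed on a later pass, once the pages have landed in the swap cache.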
* Re: [RFC v5 3/3] mm: make swapin readahead to improve thp collapse rate
From: Kirill A. Shutemov
Date: 2015-09-17 13:28 UTC
To: Ebru Akagunduz
Cc: linux-mm, akpm, kirill.shutemov, n-horiguchi, aarcange, riel, iamjoonsoo.kim, xiexiuqi, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hughd, hannes, mhocko, boaz, raindel

On Mon, Sep 14, 2015 at 10:31:45PM +0300, Ebru Akagunduz wrote:
> @@ -2655,6 +2696,8 @@ static void collapse_huge_page(struct mm_struct *mm,
>
>  	anon_vma_lock_write(vma->anon_vma);
>
> +	__collapse_huge_page_swapin(mm, vma, address, pmd, pte);
> +

Am I missing something, or is 'pte' not initialized at this point? And the value is not really used in __collapse_huge_page_swapin().

>  	pte = pte_offset_map(pmd, address);
>  	pte_ptl = pte_lockptr(mm, pmd);
* Re: [RFC v5 3/3] mm: make swapin readahead to improve thp collapse rate
From: Kirill A. Shutemov
Date: 2015-09-17 15:13 UTC
To: Ebru Akagunduz
Cc: linux-mm, akpm, kirill.shutemov, n-horiguchi, aarcange, riel, iamjoonsoo.kim, xiexiuqi, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hughd, hannes, mhocko, boaz, raindel

On Mon, Sep 14, 2015 at 10:31:45PM +0300, Ebru Akagunduz wrote:
> This patch makes swapin readahead improve the THP collapse rate.
> When khugepaged scans pages, a few of the pages can be in the
> swap area.
>
> With the patch, khugepaged can collapse 4kB pages into a THP when
> there are up to max_ptes_swap swap ptes in a 2MB range.
>
> The patch was tested with a test program that allocates 400MB of
> memory, writes to it, and then sleeps. I force the system to swap
> it all out. Afterwards, the test program touches the area by
> writing to it, skipping one page in each 20 pages of the area.
>
> Without the patch, the system did no swapin readahead. The THP
> rate was 65% of the program's memory and did not change over time.
>
> With this patch, after 10 minutes of waiting, khugepaged had
> collapsed 99% of the program's memory.
>
> Signed-off-by: Ebru Akagunduz <ebru.akagunduz@gmail.com>
> Acked-by: Rik van Riel <riel@redhat.com>

[ resend with correct TO/CC lists ]
* Re: [RFC v5 0/3] mm: make swapin readahead to gain more thp performance
From: Andrew Morton
Date: 2015-09-14 21:41 UTC
To: Ebru Akagunduz
Cc: linux-mm, kirill.shutemov, n-horiguchi, aarcange, riel, iamjoonsoo.kim, xiexiuqi, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hughd, hannes, mhocko, boaz, raindel

On Mon, 14 Sep 2015 22:31:42 +0300 Ebru Akagunduz <ebru.akagunduz@gmail.com> wrote:

> This patch series makes swapin readahead up to a certain number
> to gain more THP performance, and adds tracepoints for
> khugepaged_scan_pmd, collapse_huge_page and
> __collapse_huge_page_isolate.

I'll merge this series for testing. Hopefully Andrea and/or Hugh will find time for a quality think about the issue before 4.3 comes around.

It would be much better if we didn't have that sysfs knob - make the control automatic in some fashion.

If we can't think of a way of doing that then at least let's document max_ptes_swap very carefully. Explain to our users what it does, why they should care about it, how they should set about determining (ie: measuring) its effect upon their workloads.
* Re: [RFC v5 0/3] mm: make swapin readahead to gain more thp performance
From: Hugh Dickins
Date: 2016-02-25 7:36 UTC
To: Ebru Akagunduz
Cc: Andrew Morton, linux-mm, kirill.shutemov, n-horiguchi, aarcange, riel, iamjoonsoo.kim, xiexiuqi, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hughd, hannes, mhocko, boaz, raindel

On Mon, 14 Sep 2015, Andrew Morton wrote:
> On Mon, 14 Sep 2015 22:31:42 +0300 Ebru Akagunduz <ebru.akagunduz@gmail.com> wrote:
>
> > This patch series makes swapin readahead up to a certain number
> > to gain more THP performance, and adds tracepoints for
> > khugepaged_scan_pmd, collapse_huge_page and
> > __collapse_huge_page_isolate.
>
> I'll merge this series for testing. Hopefully Andrea and/or Hugh will
> find time for a quality think about the issue before 4.3 comes around.
>
> It would be much better if we didn't have that sysfs knob - make the
> control automatic in some fashion.
>
> If we can't think of a way of doing that then at least let's document
> max_ptes_swap very carefully. Explain to our users what it does, why
> they should care about it, how they should set about determining (ie:
> measuring) its effect upon their workloads.

Ebru, I don't know whether you realize, but your THP swapin work has been languishing in mmotm for five months now, without getting any nearer to Linus's tree.

That's partly my fault - sorry - for not responding to Andrew's nudge above. But I think you also got caught up in conference, and in the end did not get around to answering outstanding issues: please take a look at your mailbox from last September, to see what more is needed.

Here's what mmotm's series file says...

#mm-add-tracepoint-for-scanning-pages.patch+2: Andrea/Hugh review?. 2 Fengguang warnings, one "kernel test robot" oops
#mm-make-optimistic-check-for-swapin-readahead.patch: TBU (docs)
mm-make-optimistic-check-for-swapin-readahead.patch
mm-make-optimistic-check-for-swapin-readahead-fix-2.patch
#mm-make-swapin-readahead-to-improve-thp-collapse-rate.patch: Hugh/Kirill want collapse_huge_page() rework
mm-make-swapin-readahead-to-improve-thp-collapse-rate.patch
mm-make-swapin-readahead-to-improve-thp-collapse-rate-fix.patch
mm-make-swapin-readahead-to-improve-thp-collapse-rate-fix-2.patch
#mm-make-swapin-readahead-to-improve-thp-collapse-rate-fix-3.patch: Ebru to test?
mm-make-swapin-readahead-to-improve-thp-collapse-rate-fix-3.patch

...but I think some of that is stale. There were a few little bugs when it first went into mmotm, which Kirill very swiftly fixed up, and I don't think it has given anybody any trouble since then.

But do I want to see this work go in? Yes and no. The problem it fixes (that although we give out a THP to someone who faults a single page of it, after swapout the THP cannot be recovered until they have faulted in every page of it) is real and embarrassing; the code is good; and I don't mind the max_ptes_swap tunable that concerns Andrew above; but Kirill and Vlastimil made important points that still trouble me.

I can't locate Kirill's mail right now, perhaps I'm misremembering: but wasn't he concerned by your __collapse_huge_page_swapin() (likely to be allocating many small pages) being called under down_write of mmap_sem?
That's usually something we soon regret, and even down_read of mmap_sem across many memory allocations would be unfortunate (khugepaged used to allocate its THP that way, but we have Vlastimil to thank for stopping that in his 8b1645685acf).

And didn't Vlastimil (9/4/15) make some other unanswered observations about the call to __collapse_huge_page_swapin():

> Hmm it seems rather wasteful to call this when no swap entries were detected.
> Also it seems pointless to try continue collapsing when we have just only issued
> async swap-in? What are the chances they would finish in time?
>
> I'm less sure about the relation vs khugepaged_alloc_page(). At this point, we
> have already succeeded the hugepage allocation. It makes sense not to swap-in if
> we can't allocate a hugepage. It makes also sense not to allocate a hugepage if
> we will just issue async swap-ins and then free the hugepage back. Swap-in means
> disk I/O that's best avoided if not useful. But the reclaim for hugepage
> allocation might also involve disk I/O. At worst, it could be creating new swap
> pte's in the very pmd we are scanning... Thoughts?

Doesn't this imply that __collapse_huge_page_swapin() will initiate all the necessary swapins for a THP, then (given the FAULT_FLAG_ALLOW_RETRY) not wait for them to complete, so khugepaged will give up on that extent and move on to another; then after another full circuit of all the mms it needs to examine, it will arrive back at this extent and build a THP from the swapins it arranged last time.

Which may work well when a system transitions from busy+swappingout to idle+swappingin, but isn't that rather a special case? It feels (meaning, I've not measured at all) as if the in-between busyish case will waste a lot of I/O and memory on swapins that have to be discarded again before khugepaged has made its sedate way back to slotting them in.

So I wonder how useful this is in its present form. The problem being, not with your code as such, but the whole nature of khugepaged. When I had to solve a similar problem with recovering huge tmpfs pages (not yet posted), I did briefly consider whether to hook in to use khugepaged; but rejected that, and have never regretted using a workqueue item for the extent instead. Did Vlastimil (argh, him again!) propose something similar to replace khugepaged? Or should khugepaged fire off workqueue items for THP extents needing swapin?

Hugh
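For concreteness, the workqueue alternative Hugh alludes to might look roughly like the sketch below. This is purely illustrative: collapse_swapin_work and both function names are invented here and do not come from any posted patch.

#include <linux/workqueue.h>
#include <linux/slab.h>

struct collapse_swapin_work {
	struct work_struct work;
	struct mm_struct *mm;
	unsigned long address;	/* start of the huge-page-sized extent */
};

static void collapse_swapin_fn(struct work_struct *work)
{
	struct collapse_swapin_work *csw =
		container_of(work, struct collapse_swapin_work, work);

	/*
	 * Swap the extent's missing pages in synchronously here, then
	 * retry the collapse, instead of waiting for khugepaged's next
	 * full circuit of the mm list. A reference on csw->mm would
	 * need to be held and dropped here; that is omitted in this sketch.
	 */
	kfree(csw);
}

/* khugepaged would queue one item per extent it finds worth swapping in */
static void queue_collapse_swapin(struct mm_struct *mm, unsigned long address)
{
	struct collapse_swapin_work *csw = kmalloc(sizeof(*csw), GFP_KERNEL);

	if (!csw)
		return;
	csw->mm = mm;
	csw->address = address;
	INIT_WORK(&csw->work, collapse_swapin_fn);
	schedule_work(&csw->work);
}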
* Re: [RFC v5 0/3] mm: make swapin readahead to gain more thp performance
From: Rik van Riel
Date: 2016-02-25 22:35 UTC
To: Hugh Dickins, Ebru Akagunduz
Cc: Andrew Morton, linux-mm, kirill.shutemov, n-horiguchi, aarcange, iamjoonsoo.kim, xiexiuqi, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hannes, mhocko, boaz, raindel

On Wed, 2016-02-24 at 23:36 -0800, Hugh Dickins wrote:
>
> Doesn't this imply that __collapse_huge_page_swapin() will initiate
> all the necessary swapins for a THP, then (given the
> FAULT_FLAG_ALLOW_RETRY) not wait for them to complete, so khugepaged
> will give up on that extent and move on to another; then after
> another full circuit of all the mms it needs to examine, it will
> arrive back at this extent and build a THP from the swapins it
> arranged last time.
>
> Which may work well when a system transitions from busy+swappingout
> to idle+swappingin, but isn't that rather a special case? It feels
> (meaning, I've not measured at all) as if the in-between busyish case
> will waste a lot of I/O and memory on swapins that have to be
> discarded again before khugepaged has made its sedate way back to
> slotting them in.

There may be a fairly simple way to prevent that from becoming an issue.

When khugepaged wakes up, it can check the PSWPOUT or even the PGSTEAL_* stats for the system, and skip swapin readahead if there was swapout activity (or any page reclaim activity?) since the time it last ran.

That way the swapin readahead will do its thing when transitioning from busy + swapout to idle + swapin, but not while the system is under permanent memory pressure.

Am I forgetting anything obvious?

Is this too aggressive? Not aggressive enough?

Could PGPGOUT + PSWPOUT be a useful in-between between just PSWPOUT or PGSTEAL_*?

-- 
All rights reversed
* Re: [RFC v5 0/3] mm: make swapin readahead to gain more thp performance
From: Ebru Akagunduz
Date: 2016-02-25 23:30 UTC
To: linux-mm, riel, hughd
Cc: akpm, kirill.shutemov, n-horiguchi, aarcange, iamjoonsoo.kim, xiexiuqi, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hannes, mhocko, boaz, raindel

On Thu, Feb 25, 2016 at 05:35:50PM -0500, Rik van Riel wrote:
> On Wed, 2016-02-24 at 23:36 -0800, Hugh Dickins wrote:
> >
> > Doesn't this imply that __collapse_huge_page_swapin() will initiate
> > all the necessary swapins for a THP, then (given the
> > FAULT_FLAG_ALLOW_RETRY) not wait for them to complete, so khugepaged
> > will give up on that extent and move on to another; then after
> > another full circuit of all the mms it needs to examine, it will
> > arrive back at this extent and build a THP from the swapins it
> > arranged last time.
> >
> > Which may work well when a system transitions from busy+swappingout
> > to idle+swappingin, but isn't that rather a special case? It feels
> > (meaning, I've not measured at all) as if the in-between busyish case
> > will waste a lot of I/O and memory on swapins that have to be
> > discarded again before khugepaged has made its sedate way back to
> > slotting them in.
>
> There may be a fairly simple way to prevent that from becoming an issue.
>
> When khugepaged wakes up, it can check the PSWPOUT or even the
> PGSTEAL_* stats for the system, and skip swapin readahead if there
> was swapout activity (or any page reclaim activity?) since the time
> it last ran.
>
> That way the swapin readahead will do its thing when transitioning
> from busy + swapout to idle + swapin, but not while the system is
> under permanent memory pressure.

The idea makes sense to me.

> Am I forgetting anything obvious?
>
> Is this too aggressive? Not aggressive enough?
>
> Could PGPGOUT + PSWPOUT be a useful in-between between just PSWPOUT
> or PGSTEAL_*?
>
> -- 
> All rights reversed
* Re: [RFC v5 0/3] mm: make swapin readahead to gain more thp performance
From: Hugh Dickins
Date: 2016-02-26 6:17 UTC
To: Ebru Akagunduz
Cc: linux-mm, riel, hughd, akpm, kirill.shutemov, n-horiguchi, aarcange, iamjoonsoo.kim, xiexiuqi, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hannes, mhocko, boaz, raindel

On Fri, 26 Feb 2016, Ebru Akagunduz wrote:
> On Thu, Feb 25, 2016 at 05:35:50PM -0500, Rik van Riel wrote:
> > On Wed, 2016-02-24 at 23:36 -0800, Hugh Dickins wrote:
> > >
> > > Doesn't this imply that __collapse_huge_page_swapin() will initiate
> > > all the necessary swapins for a THP, then (given the
> > > FAULT_FLAG_ALLOW_RETRY) not wait for them to complete, so khugepaged
> > > will give up on that extent and move on to another; then after
> > > another full circuit of all the mms it needs to examine, it will
> > > arrive back at this extent and build a THP from the swapins it
> > > arranged last time.
> > >
> > > Which may work well when a system transitions from busy+swappingout
> > > to idle+swappingin, but isn't that rather a special case? It feels
> > > (meaning, I've not measured at all) as if the in-between busyish case
> > > will waste a lot of I/O and memory on swapins that have to be
> > > discarded again before khugepaged has made its sedate way back to
> > > slotting them in.
> >
> > There may be a fairly simple way to prevent that from becoming an issue.
> >
> > When khugepaged wakes up, it can check the PSWPOUT or even the
> > PGSTEAL_* stats for the system, and skip swapin readahead if there
> > was swapout activity (or any page reclaim activity?) since the time
> > it last ran.
> >
> > That way the swapin readahead will do its thing when transitioning
> > from busy + swapout to idle + swapin, but not while the system is
> > under permanent memory pressure.
>
> The idea makes sense to me.

Yes, it does sound a promising approach: please give it a try.

> > Am I forgetting anything obvious?
> >
> > Is this too aggressive? Not aggressive enough?
> >
> > Could PGPGOUT + PSWPOUT be a useful in-between between just PSWPOUT
> > or PGSTEAL_*?

I've no idea offhand, would have to study what each of those actually means: I'm really not familiar with them myself.

I did wonder whether to suggest using swapin_readahead_hits instead, but there's probably several reasons why that would be a bad idea (its volatility, its intent for a different and private purpose, and perhaps an inappropriate feedback effect - the swap pages of a split THP are much more likely to be adjacent than usually happens, so readahead probably pays off well for them, which is good, but should not feed back into the decision).

There is also a question of where to position the test or tests: allocating the THP, and allocating pages for swapin, will apply their own pressure, in danger of generating swapout.

Hugh
* Re: [RFC v5 0/3] mm: make swapin readahead to gain more thp performance
From: Rik van Riel
Date: 2016-02-26 14:51 UTC
To: Hugh Dickins, Ebru Akagunduz
Cc: linux-mm, akpm, kirill.shutemov, n-horiguchi, aarcange, iamjoonsoo.kim, xiexiuqi, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hannes, mhocko, boaz, raindel

On Thu, 2016-02-25 at 22:17 -0800, Hugh Dickins wrote:
> On Fri, 26 Feb 2016, Ebru Akagunduz wrote:
> > On Thu, Feb 25, 2016 at 05:35:50PM -0500, Rik van Riel wrote:
> > >
> > > Am I forgetting anything obvious?
> > >
> > > Is this too aggressive? Not aggressive enough?
> > >
> > > Could PGPGOUT + PSWPOUT be a useful in-between between just
> > > PSWPOUT or PGSTEAL_*?
>
> I've no idea offhand, would have to study what each of those
> actually means: I'm really not familiar with them myself.

There are a few levels of page reclaim activity:

PGSTEAL_* - any page was reclaimed; this could just be file pages for streaming file IO, etc.

PGPGOUT - the VM wrote pages back to disk to reclaim them; this could include file pages

PSWPOUT - the VM wrote something to swap to reclaim memory

I am not sure which level of aggressiveness khugepaged should check against, but my gut instinct would probably be the second or third.

-- 
All Rights Reversed.
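A rough sketch of the kind of check being discussed, assuming khugepaged samples the global event counters through all_vm_events() (available with CONFIG_VM_EVENT_COUNTERS); the function and variable names here are invented for illustration and do not come from any posted patch:

#include <linux/vmstat.h>

static unsigned long khugepaged_last_pswpout;

/* true if any swapout has happened since khugepaged last looked */
static bool swapout_since_last_scan(void)
{
	unsigned long events[NR_VM_EVENT_ITEMS];
	bool active;

	all_vm_events(events);
	active = events[PSWPOUT] != khugepaged_last_pswpout;
	khugepaged_last_pswpout = events[PSWPOUT];
	return active;
}

khugepaged would then skip __collapse_huge_page_swapin() whenever this returns true; substituting PGPGOUT or the PGSTEAL_* events for PSWPOUT would move the check between the aggressiveness levels Rik lists above.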
* Re: [RFC v5 0/3] mm: make swapin readahead to gain more thp performance
From: Ebru Akagunduz
Date: 2016-03-03 22:08 UTC
To: hughd, riel
Cc: linux-mm, akpm, kirill.shutemov, n-horiguchi, aarcange, iamjoonsoo.kim, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hannes, mhocko, boaz, raindel

On Fri, Feb 26, 2016 at 09:51:56AM -0500, Rik van Riel wrote:
> On Thu, 2016-02-25 at 22:17 -0800, Hugh Dickins wrote:
> > On Fri, 26 Feb 2016, Ebru Akagunduz wrote:
> > > On Thu, Feb 25, 2016 at 05:35:50PM -0500, Rik van Riel wrote:
> > > >
> > > > Am I forgetting anything obvious?
> > > >
> > > > Is this too aggressive? Not aggressive enough?
> > > >
> > > > Could PGPGOUT + PSWPOUT be a useful in-between between just
> > > > PSWPOUT or PGSTEAL_*?
> >
> > I've no idea offhand, would have to study what each of those
> > actually means: I'm really not familiar with them myself.
>
> There are a few levels of page reclaim activity:
>
> PGSTEAL_* - any page was reclaimed; this could just be file pages
> for streaming file IO, etc.
>
> PGPGOUT - the VM wrote pages back to disk to reclaim them; this
> could include file pages
>
> PSWPOUT - the VM wrote something to swap to reclaim memory
>
> I am not sure which level of aggressiveness khugepaged should check
> against, but my gut instinct would probably be the second or third.

I tested with PGPGOUT; it does not help as I expected. Following Rik's suggestion, PSWPOUT and ALLOCSTALL look good. I started to prepare the patch last week. Just wanted to keep you posted.

Kind regards.
* Re: [RFC v5 0/3] mm: make swapin readahead to gain more thp performance
  2016-02-25  7:36 ` Hugh Dickins
  2016-02-25 22:35 ` Rik van Riel
@ 2016-02-25 23:16 ` Ebru Akagunduz
  1 sibling, 0 replies; 17+ messages in thread
From: Ebru Akagunduz @ 2016-02-25 23:16 UTC (permalink / raw)
To: Hugh Dickins
Cc: akpm, linux-mm, kirill.shutemov, n-horiguchi, aarcange, riel, iamjoonsoo.kim, xiexiuqi, gorcunov, linux-kernel, mgorman, rientjes, vbabka, aneesh.kumar, hannes, mhocko, boaz, raindel

On Wed, Feb 24, 2016 at 11:36:30PM -0800, Hugh Dickins wrote:
> On Mon, 14 Sep 2015, Andrew Morton wrote:
> > On Mon, 14 Sep 2015 22:31:42 +0300 Ebru Akagunduz <ebru.akagunduz@gmail.com> wrote:
> > > This patch series makes swapin readahead up to a certain number to gain more thp performance and adds tracepoint for khugepaged_scan_pmd, collapse_huge_page, __collapse_huge_page_isolate.
> >
> > I'll merge this series for testing. Hopefully Andrea and/or Hugh will find time for a quality think about the issue before 4.3 comes around.
> >
> > It would be much better if we didn't have that sysfs knob - make the control automatic in some fashion.
> >
> > If we can't think of a way of doing that then at least let's document max_ptes_swap very carefully. Explain to our users what it does, why they should care about it, how they should set about determining (ie: measuring) its effect upon their workloads.
>
> Ebru, I don't know whether you realize, but your THP swapin work has been languishing in mmotm for five months now, without getting any nearer to Linus's tree.
>
> That's partly my fault - sorry - for not responding to Andrew's nudge above. But I think you also got caught up in conference, and in the end did not get around to answering outstanding issues: please take a look at your mailbox from last September, to see what more is needed.

I've seen my patch series in the mmotm mails, but I thought other parts of THP had problems and that the series would be forwarded to Linus's tree once those were fixed. I did not know about this file: http://www.ozlabs.org/~akpm/mmotm/series which explicitly shows the status of each patch. Thank you for summarizing it below.

> Here's what mmotm's series file says...
>
> #mm-add-tracepoint-for-scanning-pages.patch+2: Andrea/Hugh review?. 2 Fengguang warnings, one "kernel test robot" oops
> #mm-make-optimistic-check-for-swapin-readahead.patch: TBU (docs)

I've sent a doc patch: http://lkml.iu.edu/hypermail/linux/kernel/1509.2/01783.html

> mm-make-optimistic-check-for-swapin-readahead.patch
> mm-make-optimistic-check-for-swapin-readahead-fix-2.patch
> #mm-make-swapin-readahead-to-improve-thp-collapse-rate.patch: Hugh/Kirill want collapse_huge_page() rework
> mm-make-swapin-readahead-to-improve-thp-collapse-rate.patch
> mm-make-swapin-readahead-to-improve-thp-collapse-rate-fix.patch
> mm-make-swapin-readahead-to-improve-thp-collapse-rate-fix-2.patch
> #mm-make-swapin-readahead-to-improve-thp-collapse-rate-fix-3.patch: Ebru to test?

I've tested my whole patch series and could not reproduce the fault. I've also seen a Tested-by tag from Sergey, so I did not send the tag.

> mm-make-swapin-readahead-to-improve-thp-collapse-rate-fix-3.patch
>
> ...but I think some of that is stale. There were a few little bugs when it first went into mmotm, which Kirill very swiftly fixed up, and I don't think it has given anybody any trouble since then.
>
> But do I want to see this work go in? Yes and no.
> The problem it fixes (that although we give out a THP to someone who faults a single page of it, after swapout the THP cannot be recovered until they have faulted in every page of it) is real and embarrassing; the code is good; and I don't mind the max_ptes_swap tunable that concerns Andrew above; but Kirill and Vlastimil made important points that still trouble me.
>
> I can't locate Kirill's mail right now, perhaps I'm misremembering: but wasn't he concerned by your __collapse_huge_page_swapin() (likely to be allocating many small pages) being called under down_write of mmap_sem? That's usually something we soon regret, and even down_read of mmap_sem across many memory allocations would be unfortunate (khugepaged used to allocate its THP that way, but we have Vlastimil to thank for stopping that in his 8b1645685acf).
>
> And didn't Vlastimil (9/4/15) make some other unanswered observations about the call to __collapse_huge_page_swapin():
>
> > Hmm it seems rather wasteful to call this when no swap entries were detected. Also it seems pointless to try to continue collapsing when we have just only issued async swap-in? What are the chances they would finish in time?
> >
> > I'm less sure about the relation vs khugepaged_alloc_page(). At this point, we have already succeeded the hugepage allocation. It makes sense not to swap in if we can't allocate a hugepage. It also makes sense not to allocate a hugepage if we will just issue async swap-ins and then free the hugepage back. Swap-in means disk I/O that's best avoided if not useful. But the reclaim for hugepage allocation might also involve disk I/O. At worst, it could be creating new swap pte's in the very pmd we are scanning... Thoughts?

You're right, I did not take enough responsibility. I should at least have asked about the patch.

> Doesn't this imply that __collapse_huge_page_swapin() will initiate all the necessary swapins for a THP, then (given the FAULT_FLAG_ALLOW_RETRY) not wait for them to complete, so khugepaged will give up on that extent and move on to another; then after another full circuit of all the mms it needs to examine, it will arrive back at this extent and build a THP from the swapins it arranged last time.
>
> Which may work well when a system transitions from busy+swappingout to idle+swappingin, but isn't that rather a special case? It feels (meaning, I've not measured at all) as if the inbetween busyish case will waste a lot of I/O and memory on swapins that have to be discarded again before khugepaged has made its sedate way back to slotting them in.
>
> So I wonder how useful this is in its present form. The problem being, not with your code as such, but the whole nature of khugepaged. When I had to solve a similar problem with recovering huge tmpfs pages (not yet posted), I did briefly consider whether to hook in to use khugepaged; but rejected that, and have never regretted using a workqueue item for the extent instead. Did Vlastimil (argh, him again!) propose something similar to replace khugepaged? Or should khugepaged fire off workqueue items for THP extents needing swapin?
>
> Hugh
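[Editor's note: for reference, a simplified sketch of the behaviour Hugh describes, not the exact code from the series. It is written as if in mm/huge_memory.c with pre-4.8 APIs, and assumes do_swap_page() has been made visible to khugepaged as the series arranges. Each swapped-out pte is faulted with FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT, so the swap I/O is only initiated here; the pages become resident later, on a subsequent khugepaged pass, which is exactly the round trip Hugh questions above.]

static void __collapse_huge_page_swapin(struct mm_struct *mm,
					struct vm_area_struct *vma,
					unsigned long address, pmd_t *pmd)
{
	unsigned long addr;
	pte_t *pte, pteval;

	for (addr = address; addr < address + HPAGE_PMD_NR * PAGE_SIZE;
	     addr += PAGE_SIZE) {
		pte = pte_offset_map(pmd, addr);
		pteval = *pte;
		if (!is_swap_pte(pteval)) {
			pte_unmap(pte);
			continue;
		}
		/*
		 * Kicks off the swapin and returns without waiting for
		 * the I/O; do_swap_page() drops the pte mapping itself.
		 */
		if (do_swap_page(mm, vma, addr, pte, pmd,
				 FAULT_FLAG_ALLOW_RETRY |
				 FAULT_FLAG_RETRY_NOWAIT,
				 pteval) & VM_FAULT_ERROR)
			break;
	}
}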
end of thread, other threads: [~2016-03-03 22:08 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
2015-09-14 19:31 [RFC v5 0/3] mm: make swapin readahead to gain more thp performance Ebru Akagunduz
2015-09-14 19:31 ` [RFC v5 1/3] mm: add tracepoint for scanning pages Ebru Akagunduz
2015-09-14 19:31 ` [RFC v5 2/3] mm: make optimistic check for swapin readahead Ebru Akagunduz
2015-09-14 19:47 ` Rik van Riel
2015-09-14 21:33 ` Andrew Morton
2015-09-15 20:08 ` Ebru Akagunduz
2015-09-14 19:31 ` [RFC v5 3/3] mm: make swapin readahead to improve thp collapse rate Ebru Akagunduz
2015-09-17 13:28 ` Kirill A. Shutemov
2015-09-17 15:13 ` Kirill A. Shutemov
2015-09-14 21:41 ` [RFC v5 0/3] mm: make swapin readahead to gain more thp performance Andrew Morton
2016-02-25  7:36 ` Hugh Dickins
2016-02-25 22:35 ` Rik van Riel
2016-02-25 23:30 ` Ebru Akagunduz
2016-02-26  6:17 ` Hugh Dickins
2016-02-26 14:51 ` Rik van Riel
2016-03-03 22:08 ` Ebru Akagunduz
2016-02-25 23:16 ` Ebru Akagunduz