diff for duplicates of <20151130092229.GA10745@bbox>

diff --git a/a/1.txt b/N1/1.txt
index 5fdf196..96acccf 100644
--- a/a/1.txt
+++ b/N1/1.txt
@@ -29,544 +29,3 @@ On Mon, Nov 30, 2015 at 10:20:25AM +0200, Mika Penttilä wrote:
 
 Even, I missed unlock_page.
 Thanks for the review!
-
->From d22483fae454b100bcf73d514dd7d903fd84f744 Mon Sep 17 00:00:00 2001
-From: Minchan Kim <minchan@kernel.org>
-Date: Fri, 30 Oct 2015 16:01:37 +0900
-Subject: [PATCH v5 01/12] mm: support madvise(MADV_FREE)
-
-Linux doesn't have an ability to free pages lazy while other OS already
-have been supported that named by madvise(MADV_FREE).
-
-The gain is clear that kernel can discard freed pages rather than swapping
-out or OOM if memory pressure happens.
-
-Without memory pressure, freed pages would be reused by userspace without
-another additional overhead(ex, page fault + allocation + zeroing).
-
-Jason Evans said:
-
-: Facebook has been using MAP_UNINITIALIZED
-: (https://lkml.org/lkml/2012/1/18/308) in some of its applications for
-: several years, but there are operational costs to maintaining this
-: out-of-tree in our kernel and in jemalloc, and we are anxious to retire it
-: in favor of MADV_FREE.  When we first enabled MAP_UNINITIALIZED it
-: increased throughput for much of our workload by ~5%, and although the
-: benefit has decreased using newer hardware and kernels, there is still
-: enough benefit that we cannot reasonably retire it without a replacement.
-:
-: Aside from Facebook operations, there are numerous broadly used
-: applications that would benefit from MADV_FREE.  The ones that immediately
-: come to mind are redis, varnish, and MariaDB.  I don't have much insight
-: into Android internals and development process, but I would hope to see
-: MADV_FREE support eventually end up there as well to benefit applications
-: linked with the integrated jemalloc.
-:
-: jemalloc will use MADV_FREE once it becomes available in the Linux kernel.
-: In fact, jemalloc already uses MADV_FREE or equivalent everywhere it's
-: available: *BSD, OS X, Windows, and Solaris -- every platform except Linux
-: (and AIX, but I'm not sure it even compiles on AIX).  The lack of
-: MADV_FREE on Linux forced me down a long series of increasingly
-: sophisticated heuristics for madvise() volume reduction, and even so this
-: remains a common performance issue for people using jemalloc on Linux.
-: Please integrate MADV_FREE; many people will benefit substantially.
-
-How it works:
-
-When madvise syscall is called, VM clears dirty bit of ptes of the range.
-If memory pressure happens, VM checks dirty bit of page table and if it
-found still "clean", it means it's a "lazyfree pages" so VM could discard
-the page instead of swapping out.  Once there was store operation for the
-page before VM peek a page to reclaim, dirty bit is set so VM can swap out
-the page instead of discarding.
-
-One thing we should notice is that basically, MADV_FREE relies on dirty bit
-in page table entry to decide whether VM allows to discard the page or not.
-IOW, if page table entry includes marked dirty bit, VM shouldn't discard
-the page.
-
-However, as a example, if swap-in by read fault happens, page table entry
-doesn't have dirty bit so MADV_FREE could discard the page wrongly.
-
-For avoiding the problem, MADV_FREE did more checks with PageDirty
-and PageSwapCache. It worked out because swapped-in page lives on
-swap cache and since it is evicted from the swap cache, the page has
-PG_dirty flag. So both page flags check effectively prevent
-wrong discarding by MADV_FREE.
-
-However, a problem in above logic is that swapped-in page has
-PG_dirty still after they are removed from swap cache so VM cannot
-consider the page as freeable any more even if madvise_free is
-called in future.
-
-Look at below example for detail.
-
-    ptr = malloc();
-    memset(ptr);
-    ..
-    ..
-    .. heavy memory pressure so all of pages are swapped out
-    ..
-    ..
-    var = *ptr; -> a page swapped-in and could be removed from
-                   swapcache. Then, page table doesn't mark
-                   dirty bit and page descriptor includes PG_dirty
-    ..
-    ..
-    madvise_free(ptr); -> It doesn't clear PG_dirty of the page.
-    ..
-    ..
-    ..
-    .. heavy memory pressure again.
-    .. In this time, VM cannot discard the page because the page
-    .. has *PG_dirty*
-
-To solve the problem, this patch clears PG_dirty if only the page is owned
-exclusively by current process when madvise is called because PG_dirty
-represents ptes's dirtiness in several processes so we could clear it only
-if we own it exclusively.
-
-Firstly, heavy users would be general allocators(ex, jemalloc, tcmalloc
-and hope glibc supports it) and jemalloc/tcmalloc already have supported
-the feature for other OS(ex, FreeBSD)
-
-barrios@blaptop:~/benchmark/ebizzy$ lscpu
-Architecture:          x86_64
-CPU op-mode(s):        32-bit, 64-bit
-Byte Order:            Little Endian
-CPU(s):                12
-On-line CPU(s) list:   0-11
-Thread(s) per core:    1
-Core(s) per socket:    1
-Socket(s):             12
-NUMA node(s):          1
-Vendor ID:             GenuineIntel
-CPU family:            6
-Model:                 2
-Stepping:              3
-CPU MHz:               3200.185
-BogoMIPS:              6400.53
-Virtualization:        VT-x
-Hypervisor vendor:     KVM
-Virtualization type:   full
-L1d cache:             32K
-L1i cache:             32K
-L2 cache:              4096K
-NUMA node0 CPU(s):     0-11
-ebizzy benchmark(./ebizzy -S 10 -n 512)
-
-Higher avg is better.
-
- vanilla-jemalloc		MADV_free-jemalloc
-
-1 thread
-records: 10			    records: 10
-avg:	2961.90			    avg:   12069.70
-std:	  71.96(2.43%)		    std:     186.68(1.55%)
-max:	3070.00			    max:   12385.00
-min:	2796.00			    min:   11746.00
-
-2 thread
-records: 10			    records: 10
-avg:	5020.00			    avg:   17827.00
-std:	 264.87(5.28%)		    std:     358.52(2.01%)
-max:	5244.00			    max:   18760.00
-min:	4251.00			    min:   17382.00
-
-4 thread
-records: 10			    records: 10
-avg:	8988.80			    avg:   27930.80
-std:	1175.33(13.08%)		    std:    3317.33(11.88%)
-max:	9508.00			    max:   30879.00
-min:	5477.00			    min:   21024.00
-
-8 thread
-records: 10			    records: 10
-avg:   13036.50			    avg:   33739.40
-std:	 170.67(1.31%)		    std:    5146.22(15.25%)
-max:   13371.00			    max:   40572.00
-min:   12785.00			    min:   24088.00
-
-16 thread
-records: 10			    records: 10
-avg:   11092.40			    avg:   31424.20
-std:	 710.60(6.41%)		    std:    3763.89(11.98%)
-max:   12446.00			    max:   36635.00
-min:	9949.00			    min:   25669.00
-
-32 thread
-records: 10			    records: 10
-avg:   11067.00			    avg:   34495.80
-std:	 971.06(8.77%)		    std:    2721.36(7.89%)
-max:   12010.00			    max:   38598.00
-min:	9002.00			    min:   30636.00
-
-In summary, MADV_FREE is about much faster than MADV_DONTNEED.
-
-Acked-by: Michal Hocko <mhocko@suse.com>
-Acked-by: Hugh Dickins <hughd@google.com>
-Signed-off-by: Minchan Kim <minchan@kernel.org>
----
- include/linux/rmap.h                   |   1 +
- include/linux/vm_event_item.h          |   1 +
- include/uapi/asm-generic/mman-common.h |   1 +
- mm/madvise.c                           | 170 +++++++++++++++++++++++++++++++++
- mm/rmap.c                              |   8 ++
- mm/swap_state.c                        |   5 +-
- mm/vmscan.c                            |  10 +-
- mm/vmstat.c                            |   1 +
- 8 files changed, 192 insertions(+), 5 deletions(-)
-
-diff --git a/include/linux/rmap.h b/include/linux/rmap.h
-index 77d1ba57d495..04d2aec64e57 100644
---- a/include/linux/rmap.h
-+++ b/include/linux/rmap.h
-@@ -85,6 +85,7 @@ enum ttu_flags {
- 	TTU_UNMAP = 1,			/* unmap mode */
- 	TTU_MIGRATION = 2,		/* migration mode */
- 	TTU_MUNLOCK = 4,		/* munlock mode */
-+	TTU_LZFREE = 8,			/* lazy free mode */
- 
- 	TTU_IGNORE_MLOCK = (1 << 8),	/* ignore mlock */
- 	TTU_IGNORE_ACCESS = (1 << 9),	/* don't age */
-diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
-index e1f8c993e73b..67c1dbd19c6d 100644
---- a/include/linux/vm_event_item.h
-+++ b/include/linux/vm_event_item.h
-@@ -25,6 +25,7 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
- 		FOR_ALL_ZONES(PGALLOC),
- 		PGFREE, PGACTIVATE, PGDEACTIVATE,
- 		PGFAULT, PGMAJFAULT,
-+		PGLAZYFREED,
- 		FOR_ALL_ZONES(PGREFILL),
- 		FOR_ALL_ZONES(PGSTEAL_KSWAPD),
- 		FOR_ALL_ZONES(PGSTEAL_DIRECT),
-diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
-index a74dd84bbb6d..0e821e3c3d45 100644
---- a/include/uapi/asm-generic/mman-common.h
-+++ b/include/uapi/asm-generic/mman-common.h
-@@ -39,6 +39,7 @@
- #define MADV_SEQUENTIAL	2		/* expect sequential page references */
- #define MADV_WILLNEED	3		/* will need these pages */
- #define MADV_DONTNEED	4		/* don't need these pages */
-+#define MADV_FREE	5		/* free pages only if memory pressure */
- 
- /* common parameters: try to keep these consistent across architectures */
- #define MADV_REMOVE	9		/* remove these pages & resources */
-diff --git a/mm/madvise.c b/mm/madvise.c
-index c889fcbb530e..ed137fde4459 100644
---- a/mm/madvise.c
-+++ b/mm/madvise.c
-@@ -20,6 +20,9 @@
- #include <linux/backing-dev.h>
- #include <linux/swap.h>
- #include <linux/swapops.h>
-+#include <linux/mmu_notifier.h>
-+
-+#include <asm/tlb.h>
- 
- /*
-  * Any behaviour which results in changes to the vma->vm_flags needs to
-@@ -32,6 +35,7 @@ static int madvise_need_mmap_write(int behavior)
- 	case MADV_REMOVE:
- 	case MADV_WILLNEED:
- 	case MADV_DONTNEED:
-+	case MADV_FREE:
- 		return 0;
- 	default:
- 		/* be safe, default to 1. list exceptions explicitly */
-@@ -256,6 +260,163 @@ static long madvise_willneed(struct vm_area_struct *vma,
- 	return 0;
- }
- 
-+static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
-+				unsigned long end, struct mm_walk *walk)
-+
-+{
-+	struct mmu_gather *tlb = walk->private;
-+	struct mm_struct *mm = tlb->mm;
-+	struct vm_area_struct *vma = walk->vma;
-+	spinlock_t *ptl;
-+	pte_t *orig_pte, *pte, ptent;
-+	struct page *page;
-+
-+	split_huge_pmd(vma, pmd, addr);
-+	if (pmd_trans_unstable(pmd))
-+		return 0;
-+
-+	orig_pte = pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
-+	arch_enter_lazy_mmu_mode();
-+	for (; addr != end; pte++, addr += PAGE_SIZE) {
-+		ptent = *pte;
-+
-+		if (!pte_present(ptent))
-+			continue;
-+
-+		page = vm_normal_page(vma, addr, ptent);
-+		if (!page)
-+			continue;
-+
-+		/*
-+		 * If pmd isn't transhuge but the page is THP and
-+		 * is owned by only this process, split it and
-+		 * deactivate all pages.
-+		 */
-+		if (PageTransCompound(page)) {
-+			if (page_mapcount(page) != 1)
-+				goto out;
-+			get_page(page);
-+			if (!trylock_page(page)) {
-+				put_page(page);
-+				goto out;
-+			}
-+			pte_unmap_unlock(orig_pte, ptl);
-+			if (split_huge_page(page)) {
-+				unlock_page(page);
-+				put_page(page);
-+				pte_offset_map_lock(mm, pmd, addr, &ptl);
-+				goto out;
-+			}
-+			put_page(page);
-+			unlock_page(page);
-+			pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
-+			pte--;
-+			addr -= PAGE_SIZE;
-+			continue;
-+		}
-+
-+		VM_BUG_ON_PAGE(PageTransCompound(page), page);
-+
-+		if (PageSwapCache(page) || PageDirty(page)) {
-+			if (!trylock_page(page))
-+				continue;
-+			/*
-+			 * If page is shared with others, we couldn't clear
-+			 * PG_dirty of the page.
-+			 */
-+			if (page_mapcount(page) != 1) {
-+				unlock_page(page);
-+				continue;
-+			}
-+
-+			if (PageSwapCache(page) && !try_to_free_swap(page)) {
-+				unlock_page(page);
-+				continue;
-+			}
-+
-+			ClearPageDirty(page);
-+			unlock_page(page);
-+		}
-+
-+		if (pte_young(ptent) || pte_dirty(ptent)) {
-+			/*
-+			 * Some of architecture(ex, PPC) don't update TLB
-+			 * with set_pte_at and tlb_remove_tlb_entry so for
-+			 * the portability, remap the pte with old|clean
-+			 * after pte clearing.
-+			 */
-+			ptent = ptep_get_and_clear_full(mm, addr, pte,
-+							tlb->fullmm);
-+
-+			ptent = pte_mkold(ptent);
-+			ptent = pte_mkclean(ptent);
-+			set_pte_at(mm, addr, pte, ptent);
-+			tlb_remove_tlb_entry(tlb, pte, addr);
-+		}
-+	}
-+out:
-+	arch_leave_lazy_mmu_mode();
-+	pte_unmap_unlock(orig_pte, ptl);
-+	cond_resched();
-+	return 0;
-+}
-+
-+static void madvise_free_page_range(struct mmu_gather *tlb,
-+			     struct vm_area_struct *vma,
-+			     unsigned long addr, unsigned long end)
-+{
-+	struct mm_walk free_walk = {
-+		.pmd_entry = madvise_free_pte_range,
-+		.mm = vma->vm_mm,
-+		.private = tlb,
-+	};
-+
-+	tlb_start_vma(tlb, vma);
-+	walk_page_range(addr, end, &free_walk);
-+	tlb_end_vma(tlb, vma);
-+}
-+
-+static int madvise_free_single_vma(struct vm_area_struct *vma,
-+			unsigned long start_addr, unsigned long end_addr)
-+{
-+	unsigned long start, end;
-+	struct mm_struct *mm = vma->vm_mm;
-+	struct mmu_gather tlb;
-+
-+	if (vma->vm_flags & (VM_LOCKED|VM_HUGETLB|VM_PFNMAP))
-+		return -EINVAL;
-+
-+	/* MADV_FREE works for only anon vma at the moment */
-+	if (!vma_is_anonymous(vma))
-+		return -EINVAL;
-+
-+	start = max(vma->vm_start, start_addr);
-+	if (start >= vma->vm_end)
-+		return -EINVAL;
-+	end = min(vma->vm_end, end_addr);
-+	if (end <= vma->vm_start)
-+		return -EINVAL;
-+
-+	lru_add_drain();
-+	tlb_gather_mmu(&tlb, mm, start, end);
-+	update_hiwater_rss(mm);
-+
-+	mmu_notifier_invalidate_range_start(mm, start, end);
-+	madvise_free_page_range(&tlb, vma, start, end);
-+	mmu_notifier_invalidate_range_end(mm, start, end);
-+	tlb_finish_mmu(&tlb, start, end);
-+
-+	return 0;
-+}
-+
-+static long madvise_free(struct vm_area_struct *vma,
-+			     struct vm_area_struct **prev,
-+			     unsigned long start, unsigned long end)
-+{
-+	*prev = vma;
-+	return madvise_free_single_vma(vma, start, end);
-+}
-+
- /*
-  * Application no longer needs these pages.  If the pages are dirty,
-  * it's OK to just throw them away.  The app will be more careful about
-@@ -379,6 +540,14 @@ madvise_vma(struct vm_area_struct *vma, struct vm_area_struct **prev,
- 		return madvise_remove(vma, prev, start, end);
- 	case MADV_WILLNEED:
- 		return madvise_willneed(vma, prev, start, end);
-+	case MADV_FREE:
-+		/*
-+		 * XXX: In this implementation, MADV_FREE works like
-+		 * MADV_DONTNEED on swapless system or full swap.
-+		 */
-+		if (get_nr_swap_pages() > 0)
-+			return madvise_free(vma, prev, start, end);
-+		/* passthrough */
- 	case MADV_DONTNEED:
- 		return madvise_dontneed(vma, prev, start, end);
- 	default:
-@@ -398,6 +567,7 @@ madvise_behavior_valid(int behavior)
- 	case MADV_REMOVE:
- 	case MADV_WILLNEED:
- 	case MADV_DONTNEED:
-+	case MADV_FREE:
- #ifdef CONFIG_KSM
- 	case MADV_MERGEABLE:
- 	case MADV_UNMERGEABLE:
-diff --git a/mm/rmap.c b/mm/rmap.c
-index 6f371261dd12..321b633ee559 100644
---- a/mm/rmap.c
-+++ b/mm/rmap.c
-@@ -1508,6 +1508,13 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
- 		 * See handle_pte_fault() ...
- 		 */
- 		VM_BUG_ON_PAGE(!PageSwapCache(page), page);
-+
-+		if (!PageDirty(page) && (flags & TTU_LZFREE)) {
-+			/* It's a freeable page by MADV_FREE */
-+			dec_mm_counter(mm, MM_ANONPAGES);
-+			goto discard;
-+		}
-+
- 		if (swap_duplicate(entry) < 0) {
- 			set_pte_at(mm, address, pte, pteval);
- 			ret = SWAP_FAIL;
-@@ -1528,6 +1535,7 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
- 	} else
- 		dec_mm_counter(mm, mm_counter_file(page));
- 
-+discard:
- 	page_remove_rmap(page, PageHuge(page));
- 	page_cache_release(page);
- 
-diff --git a/mm/swap_state.c b/mm/swap_state.c
-index d783872d746c..676ff2991380 100644
---- a/mm/swap_state.c
-+++ b/mm/swap_state.c
-@@ -185,13 +185,12 @@ int add_to_swap(struct page *page, struct list_head *list)
- 	 * deadlock in the swap out path.
- 	 */
- 	/*
--	 * Add it to the swap cache and mark it dirty
-+	 * Add it to the swap cache.
- 	 */
- 	err = add_to_swap_cache(page, entry,
- 			__GFP_HIGH|__GFP_NOMEMALLOC|__GFP_NOWARN);
- 
--	if (!err) {	/* Success */
--		SetPageDirty(page);
-+	if (!err) {
- 		return 1;
- 	} else {	/* -ENOMEM radix-tree allocation failure */
- 		/*
-diff --git a/mm/vmscan.c b/mm/vmscan.c
-index 4589cfdbe405..c2f69445190c 100644
---- a/mm/vmscan.c
-+++ b/mm/vmscan.c
-@@ -908,6 +908,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
- 		int may_enter_fs;
- 		enum page_references references = PAGEREF_RECLAIM_CLEAN;
- 		bool dirty, writeback;
-+		bool lazyfree = false;
- 
- 		cond_resched();
- 
-@@ -1051,6 +1052,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
- 				goto keep_locked;
- 			if (!add_to_swap(page, page_list))
- 				goto activate_locked;
-+			lazyfree = true;
- 			may_enter_fs = 1;
- 
- 			/* Adding to swap updated mapping */
-@@ -1062,8 +1064,9 @@ static unsigned long shrink_page_list(struct list_head *page_list,
- 		 * processes. Try to unmap it here.
- 		 */
- 		if (page_mapped(page) && mapping) {
--			switch (try_to_unmap(page,
--					ttu_flags|TTU_BATCH_FLUSH)) {
-+			switch (try_to_unmap(page, lazyfree ?
-+				(ttu_flags | TTU_BATCH_FLUSH | TTU_LZFREE) :
-+				(ttu_flags | TTU_BATCH_FLUSH))) {
- 			case SWAP_FAIL:
- 				goto activate_locked;
- 			case SWAP_AGAIN:
-@@ -1188,6 +1191,9 @@ static unsigned long shrink_page_list(struct list_head *page_list,
- 		 */
- 		__ClearPageLocked(page);
- free_it:
-+		if (lazyfree && !PageDirty(page))
-+			count_vm_event(PGLAZYFREED);
-+
- 		nr_reclaimed++;
- 
- 		/*
-diff --git a/mm/vmstat.c b/mm/vmstat.c
-index d13cd8eebf70..38929dc79c3d 100644
---- a/mm/vmstat.c
-+++ b/mm/vmstat.c
-@@ -781,6 +781,7 @@ const char * const vmstat_text[] = {
- 
- 	"pgfault",
- 	"pgmajfault",
-+	"pglazyfreed",
- 
- 	TEXTS_FOR_ZONES("pgrefill")
- 	TEXTS_FOR_ZONES("pgsteal_kswapd")
--- 
-1.9.1
-
---
-To unsubscribe, send a message with 'unsubscribe linux-mm' in
-the body to majordomo@kvack.org.  For more info on Linux MM,
-see: http://www.linux-mm.org/ .
-Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
diff --git a/a/content_digest b/N1/content_digest
index 43edac9..3ae9247 100644
--- a/a/content_digest
+++ b/N1/content_digest
@@ -55,547 +55,6 @@
  "> (returns zero).\n"
  "\n"
  "Even, I missed unlock_page.\n"
- "Thanks for the review!\n"
- "\n"
- ">From d22483fae454b100bcf73d514dd7d903fd84f744 Mon Sep 17 00:00:00 2001\n"
- "From: Minchan Kim <minchan@kernel.org>\n"
- "Date: Fri, 30 Oct 2015 16:01:37 +0900\n"
- "Subject: [PATCH v5 01/12] mm: support madvise(MADV_FREE)\n"
- "\n"
- "Linux doesn't have an ability to free pages lazy while other OS already\n"
- "have been supported that named by madvise(MADV_FREE).\n"
- "\n"
- "The gain is clear that kernel can discard freed pages rather than swapping\n"
- "out or OOM if memory pressure happens.\n"
- "\n"
- "Without memory pressure, freed pages would be reused by userspace without\n"
- "another additional overhead(ex, page fault + allocation + zeroing).\n"
- "\n"
- "Jason Evans said:\n"
- "\n"
- ": Facebook has been using MAP_UNINITIALIZED\n"
- ": (https://lkml.org/lkml/2012/1/18/308) in some of its applications for\n"
- ": several years, but there are operational costs to maintaining this\n"
- ": out-of-tree in our kernel and in jemalloc, and we are anxious to retire it\n"
- ": in favor of MADV_FREE.  When we first enabled MAP_UNINITIALIZED it\n"
- ": increased throughput for much of our workload by ~5%, and although the\n"
- ": benefit has decreased using newer hardware and kernels, there is still\n"
- ": enough benefit that we cannot reasonably retire it without a replacement.\n"
- ":\n"
- ": Aside from Facebook operations, there are numerous broadly used\n"
- ": applications that would benefit from MADV_FREE.  The ones that immediately\n"
- ": come to mind are redis, varnish, and MariaDB.  I don't have much insight\n"
- ": into Android internals and development process, but I would hope to see\n"
- ": MADV_FREE support eventually end up there as well to benefit applications\n"
- ": linked with the integrated jemalloc.\n"
- ":\n"
- ": jemalloc will use MADV_FREE once it becomes available in the Linux kernel.\n"
- ": In fact, jemalloc already uses MADV_FREE or equivalent everywhere it's\n"
- ": available: *BSD, OS X, Windows, and Solaris -- every platform except Linux\n"
- ": (and AIX, but I'm not sure it even compiles on AIX).  The lack of\n"
- ": MADV_FREE on Linux forced me down a long series of increasingly\n"
- ": sophisticated heuristics for madvise() volume reduction, and even so this\n"
- ": remains a common performance issue for people using jemalloc on Linux.\n"
- ": Please integrate MADV_FREE; many people will benefit substantially.\n"
- "\n"
- "How it works:\n"
- "\n"
- "When madvise syscall is called, VM clears dirty bit of ptes of the range.\n"
- "If memory pressure happens, VM checks dirty bit of page table and if it\n"
- "found still \"clean\", it means it's a \"lazyfree pages\" so VM could discard\n"
- "the page instead of swapping out.  Once there was store operation for the\n"
- "page before VM peek a page to reclaim, dirty bit is set so VM can swap out\n"
- "the page instead of discarding.\n"
- "\n"
- "One thing we should notice is that basically, MADV_FREE relies on dirty bit\n"
- "in page table entry to decide whether VM allows to discard the page or not.\n"
- "IOW, if page table entry includes marked dirty bit, VM shouldn't discard\n"
- "the page.\n"
- "\n"
- "However, as a example, if swap-in by read fault happens, page table entry\n"
- "doesn't have dirty bit so MADV_FREE could discard the page wrongly.\n"
- "\n"
- "For avoiding the problem, MADV_FREE did more checks with PageDirty\n"
- "and PageSwapCache. It worked out because swapped-in page lives on\n"
- "swap cache and since it is evicted from the swap cache, the page has\n"
- "PG_dirty flag. So both page flags check effectively prevent\n"
- "wrong discarding by MADV_FREE.\n"
- "\n"
- "However, a problem in above logic is that swapped-in page has\n"
- "PG_dirty still after they are removed from swap cache so VM cannot\n"
- "consider the page as freeable any more even if madvise_free is\n"
- "called in future.\n"
- "\n"
- "Look at below example for detail.\n"
- "\n"
- "    ptr = malloc();\n"
- "    memset(ptr);\n"
- "    ..\n"
- "    ..\n"
- "    .. heavy memory pressure so all of pages are swapped out\n"
- "    ..\n"
- "    ..\n"
- "    var = *ptr; -> a page swapped-in and could be removed from\n"
- "                   swapcache. Then, page table doesn't mark\n"
- "                   dirty bit and page descriptor includes PG_dirty\n"
- "    ..\n"
- "    ..\n"
- "    madvise_free(ptr); -> It doesn't clear PG_dirty of the page.\n"
- "    ..\n"
- "    ..\n"
- "    ..\n"
- "    .. heavy memory pressure again.\n"
- "    .. In this time, VM cannot discard the page because the page\n"
- "    .. has *PG_dirty*\n"
- "\n"
- "To solve the problem, this patch clears PG_dirty if only the page is owned\n"
- "exclusively by current process when madvise is called because PG_dirty\n"
- "represents ptes's dirtiness in several processes so we could clear it only\n"
- "if we own it exclusively.\n"
- "\n"
- "Firstly, heavy users would be general allocators(ex, jemalloc, tcmalloc\n"
- "and hope glibc supports it) and jemalloc/tcmalloc already have supported\n"
- "the feature for other OS(ex, FreeBSD)\n"
- "\n"
- "barrios@blaptop:~/benchmark/ebizzy$ lscpu\n"
- "Architecture:          x86_64\n"
- "CPU op-mode(s):        32-bit, 64-bit\n"
- "Byte Order:            Little Endian\n"
- "CPU(s):                12\n"
- "On-line CPU(s) list:   0-11\n"
- "Thread(s) per core:    1\n"
- "Core(s) per socket:    1\n"
- "Socket(s):             12\n"
- "NUMA node(s):          1\n"
- "Vendor ID:             GenuineIntel\n"
- "CPU family:            6\n"
- "Model:                 2\n"
- "Stepping:              3\n"
- "CPU MHz:               3200.185\n"
- "BogoMIPS:              6400.53\n"
- "Virtualization:        VT-x\n"
- "Hypervisor vendor:     KVM\n"
- "Virtualization type:   full\n"
- "L1d cache:             32K\n"
- "L1i cache:             32K\n"
- "L2 cache:              4096K\n"
- "NUMA node0 CPU(s):     0-11\n"
- "ebizzy benchmark(./ebizzy -S 10 -n 512)\n"
- "\n"
- "Higher avg is better.\n"
- "\n"
- " vanilla-jemalloc\t\tMADV_free-jemalloc\n"
- "\n"
- "1 thread\n"
- "records: 10\t\t\t    records: 10\n"
- "avg:\t2961.90\t\t\t    avg:   12069.70\n"
- "std:\t  71.96(2.43%)\t\t    std:     186.68(1.55%)\n"
- "max:\t3070.00\t\t\t    max:   12385.00\n"
- "min:\t2796.00\t\t\t    min:   11746.00\n"
- "\n"
- "2 thread\n"
- "records: 10\t\t\t    records: 10\n"
- "avg:\t5020.00\t\t\t    avg:   17827.00\n"
- "std:\t 264.87(5.28%)\t\t    std:     358.52(2.01%)\n"
- "max:\t5244.00\t\t\t    max:   18760.00\n"
- "min:\t4251.00\t\t\t    min:   17382.00\n"
- "\n"
- "4 thread\n"
- "records: 10\t\t\t    records: 10\n"
- "avg:\t8988.80\t\t\t    avg:   27930.80\n"
- "std:\t1175.33(13.08%)\t\t    std:    3317.33(11.88%)\n"
- "max:\t9508.00\t\t\t    max:   30879.00\n"
- "min:\t5477.00\t\t\t    min:   21024.00\n"
- "\n"
- "8 thread\n"
- "records: 10\t\t\t    records: 10\n"
- "avg:   13036.50\t\t\t    avg:   33739.40\n"
- "std:\t 170.67(1.31%)\t\t    std:    5146.22(15.25%)\n"
- "max:   13371.00\t\t\t    max:   40572.00\n"
- "min:   12785.00\t\t\t    min:   24088.00\n"
- "\n"
- "16 thread\n"
- "records: 10\t\t\t    records: 10\n"
- "avg:   11092.40\t\t\t    avg:   31424.20\n"
- "std:\t 710.60(6.41%)\t\t    std:    3763.89(11.98%)\n"
- "max:   12446.00\t\t\t    max:   36635.00\n"
- "min:\t9949.00\t\t\t    min:   25669.00\n"
- "\n"
- "32 thread\n"
- "records: 10\t\t\t    records: 10\n"
- "avg:   11067.00\t\t\t    avg:   34495.80\n"
- "std:\t 971.06(8.77%)\t\t    std:    2721.36(7.89%)\n"
- "max:   12010.00\t\t\t    max:   38598.00\n"
- "min:\t9002.00\t\t\t    min:   30636.00\n"
- "\n"
- "In summary, MADV_FREE is about much faster than MADV_DONTNEED.\n"
- "\n"
- "Acked-by: Michal Hocko <mhocko@suse.com>\n"
- "Acked-by: Hugh Dickins <hughd@google.com>\n"
- "Signed-off-by: Minchan Kim <minchan@kernel.org>\n"
- "---\n"
- " include/linux/rmap.h                   |   1 +\n"
- " include/linux/vm_event_item.h          |   1 +\n"
- " include/uapi/asm-generic/mman-common.h |   1 +\n"
- " mm/madvise.c                           | 170 +++++++++++++++++++++++++++++++++\n"
- " mm/rmap.c                              |   8 ++\n"
- " mm/swap_state.c                        |   5 +-\n"
- " mm/vmscan.c                            |  10 +-\n"
- " mm/vmstat.c                            |   1 +\n"
- " 8 files changed, 192 insertions(+), 5 deletions(-)\n"
- "\n"
- "diff --git a/include/linux/rmap.h b/include/linux/rmap.h\n"
- "index 77d1ba57d495..04d2aec64e57 100644\n"
- "--- a/include/linux/rmap.h\n"
- "+++ b/include/linux/rmap.h\n"
- "@@ -85,6 +85,7 @@ enum ttu_flags {\n"
- " \tTTU_UNMAP = 1,\t\t\t/* unmap mode */\n"
- " \tTTU_MIGRATION = 2,\t\t/* migration mode */\n"
- " \tTTU_MUNLOCK = 4,\t\t/* munlock mode */\n"
- "+\tTTU_LZFREE = 8,\t\t\t/* lazy free mode */\n"
- " \n"
- " \tTTU_IGNORE_MLOCK = (1 << 8),\t/* ignore mlock */\n"
- " \tTTU_IGNORE_ACCESS = (1 << 9),\t/* don't age */\n"
- "diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h\n"
- "index e1f8c993e73b..67c1dbd19c6d 100644\n"
- "--- a/include/linux/vm_event_item.h\n"
- "+++ b/include/linux/vm_event_item.h\n"
- "@@ -25,6 +25,7 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,\n"
- " \t\tFOR_ALL_ZONES(PGALLOC),\n"
- " \t\tPGFREE, PGACTIVATE, PGDEACTIVATE,\n"
- " \t\tPGFAULT, PGMAJFAULT,\n"
- "+\t\tPGLAZYFREED,\n"
- " \t\tFOR_ALL_ZONES(PGREFILL),\n"
- " \t\tFOR_ALL_ZONES(PGSTEAL_KSWAPD),\n"
- " \t\tFOR_ALL_ZONES(PGSTEAL_DIRECT),\n"
- "diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h\n"
- "index a74dd84bbb6d..0e821e3c3d45 100644\n"
- "--- a/include/uapi/asm-generic/mman-common.h\n"
- "+++ b/include/uapi/asm-generic/mman-common.h\n"
- "@@ -39,6 +39,7 @@\n"
- " #define MADV_SEQUENTIAL\t2\t\t/* expect sequential page references */\n"
- " #define MADV_WILLNEED\t3\t\t/* will need these pages */\n"
- " #define MADV_DONTNEED\t4\t\t/* don't need these pages */\n"
- "+#define MADV_FREE\t5\t\t/* free pages only if memory pressure */\n"
- " \n"
- " /* common parameters: try to keep these consistent across architectures */\n"
- " #define MADV_REMOVE\t9\t\t/* remove these pages & resources */\n"
- "diff --git a/mm/madvise.c b/mm/madvise.c\n"
- "index c889fcbb530e..ed137fde4459 100644\n"
- "--- a/mm/madvise.c\n"
- "+++ b/mm/madvise.c\n"
- "@@ -20,6 +20,9 @@\n"
- " #include <linux/backing-dev.h>\n"
- " #include <linux/swap.h>\n"
- " #include <linux/swapops.h>\n"
- "+#include <linux/mmu_notifier.h>\n"
- "+\n"
- "+#include <asm/tlb.h>\n"
- " \n"
- " /*\n"
- "  * Any behaviour which results in changes to the vma->vm_flags needs to\n"
- "@@ -32,6 +35,7 @@ static int madvise_need_mmap_write(int behavior)\n"
- " \tcase MADV_REMOVE:\n"
- " \tcase MADV_WILLNEED:\n"
- " \tcase MADV_DONTNEED:\n"
- "+\tcase MADV_FREE:\n"
- " \t\treturn 0;\n"
- " \tdefault:\n"
- " \t\t/* be safe, default to 1. list exceptions explicitly */\n"
- "@@ -256,6 +260,163 @@ static long madvise_willneed(struct vm_area_struct *vma,\n"
- " \treturn 0;\n"
- " }\n"
- " \n"
- "+static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,\n"
- "+\t\t\t\tunsigned long end, struct mm_walk *walk)\n"
- "+\n"
- "+{\n"
- "+\tstruct mmu_gather *tlb = walk->private;\n"
- "+\tstruct mm_struct *mm = tlb->mm;\n"
- "+\tstruct vm_area_struct *vma = walk->vma;\n"
- "+\tspinlock_t *ptl;\n"
- "+\tpte_t *orig_pte, *pte, ptent;\n"
- "+\tstruct page *page;\n"
- "+\n"
- "+\tsplit_huge_pmd(vma, pmd, addr);\n"
- "+\tif (pmd_trans_unstable(pmd))\n"
- "+\t\treturn 0;\n"
- "+\n"
- "+\torig_pte = pte = pte_offset_map_lock(mm, pmd, addr, &ptl);\n"
- "+\tarch_enter_lazy_mmu_mode();\n"
- "+\tfor (; addr != end; pte++, addr += PAGE_SIZE) {\n"
- "+\t\tptent = *pte;\n"
- "+\n"
- "+\t\tif (!pte_present(ptent))\n"
- "+\t\t\tcontinue;\n"
- "+\n"
- "+\t\tpage = vm_normal_page(vma, addr, ptent);\n"
- "+\t\tif (!page)\n"
- "+\t\t\tcontinue;\n"
- "+\n"
- "+\t\t/*\n"
- "+\t\t * If pmd isn't transhuge but the page is THP and\n"
- "+\t\t * is owned by only this process, split it and\n"
- "+\t\t * deactivate all pages.\n"
- "+\t\t */\n"
- "+\t\tif (PageTransCompound(page)) {\n"
- "+\t\t\tif (page_mapcount(page) != 1)\n"
- "+\t\t\t\tgoto out;\n"
- "+\t\t\tget_page(page);\n"
- "+\t\t\tif (!trylock_page(page)) {\n"
- "+\t\t\t\tput_page(page);\n"
- "+\t\t\t\tgoto out;\n"
- "+\t\t\t}\n"
- "+\t\t\tpte_unmap_unlock(orig_pte, ptl);\n"
- "+\t\t\tif (split_huge_page(page)) {\n"
- "+\t\t\t\tunlock_page(page);\n"
- "+\t\t\t\tput_page(page);\n"
- "+\t\t\t\tpte_offset_map_lock(mm, pmd, addr, &ptl);\n"
- "+\t\t\t\tgoto out;\n"
- "+\t\t\t}\n"
- "+\t\t\tput_page(page);\n"
- "+\t\t\tunlock_page(page);\n"
- "+\t\t\tpte = pte_offset_map_lock(mm, pmd, addr, &ptl);\n"
- "+\t\t\tpte--;\n"
- "+\t\t\taddr -= PAGE_SIZE;\n"
- "+\t\t\tcontinue;\n"
- "+\t\t}\n"
- "+\n"
- "+\t\tVM_BUG_ON_PAGE(PageTransCompound(page), page);\n"
- "+\n"
- "+\t\tif (PageSwapCache(page) || PageDirty(page)) {\n"
- "+\t\t\tif (!trylock_page(page))\n"
- "+\t\t\t\tcontinue;\n"
- "+\t\t\t/*\n"
- "+\t\t\t * If page is shared with others, we couldn't clear\n"
- "+\t\t\t * PG_dirty of the page.\n"
- "+\t\t\t */\n"
- "+\t\t\tif (page_mapcount(page) != 1) {\n"
- "+\t\t\t\tunlock_page(page);\n"
- "+\t\t\t\tcontinue;\n"
- "+\t\t\t}\n"
- "+\n"
- "+\t\t\tif (PageSwapCache(page) && !try_to_free_swap(page)) {\n"
- "+\t\t\t\tunlock_page(page);\n"
- "+\t\t\t\tcontinue;\n"
- "+\t\t\t}\n"
- "+\n"
- "+\t\t\tClearPageDirty(page);\n"
- "+\t\t\tunlock_page(page);\n"
- "+\t\t}\n"
- "+\n"
- "+\t\tif (pte_young(ptent) || pte_dirty(ptent)) {\n"
- "+\t\t\t/*\n"
- "+\t\t\t * Some architectures (e.g., PPC) don't update the TLB\n"
- "+\t\t\t * with set_pte_at and tlb_remove_tlb_entry, so for\n"
- "+\t\t\t * portability, remap the pte as old|clean\n"
- "+\t\t\t * after clearing it.\n"
- "+\t\t\t */\n"
- "+\t\t\tptent = ptep_get_and_clear_full(mm, addr, pte,\n"
- "+\t\t\t\t\t\t\ttlb->fullmm);\n"
- "+\n"
- "+\t\t\tptent = pte_mkold(ptent);\n"
- "+\t\t\tptent = pte_mkclean(ptent);\n"
- "+\t\t\tset_pte_at(mm, addr, pte, ptent);\n"
- "+\t\t\ttlb_remove_tlb_entry(tlb, pte, addr);\n"
- "+\t\t}\n"
- "+\t}\n"
- "+out:\n"
- "+\tarch_leave_lazy_mmu_mode();\n"
- "+\tpte_unmap_unlock(orig_pte, ptl);\n"
- "+\tcond_resched();\n"
- "+\treturn 0;\n"
- "+}\n"
- "+\n"
- "+static void madvise_free_page_range(struct mmu_gather *tlb,\n"
- "+\t\t\t     struct vm_area_struct *vma,\n"
- "+\t\t\t     unsigned long addr, unsigned long end)\n"
- "+{\n"
- "+\tstruct mm_walk free_walk = {\n"
- "+\t\t.pmd_entry = madvise_free_pte_range,\n"
- "+\t\t.mm = vma->vm_mm,\n"
- "+\t\t.private = tlb,\n"
- "+\t};\n"
- "+\n"
- "+\ttlb_start_vma(tlb, vma);\n"
- "+\twalk_page_range(addr, end, &free_walk);\n"
- "+\ttlb_end_vma(tlb, vma);\n"
- "+}\n"
- "+\n"
- "+static int madvise_free_single_vma(struct vm_area_struct *vma,\n"
- "+\t\t\tunsigned long start_addr, unsigned long end_addr)\n"
- "+{\n"
- "+\tunsigned long start, end;\n"
- "+\tstruct mm_struct *mm = vma->vm_mm;\n"
- "+\tstruct mmu_gather tlb;\n"
- "+\n"
- "+\tif (vma->vm_flags & (VM_LOCKED|VM_HUGETLB|VM_PFNMAP))\n"
- "+\t\treturn -EINVAL;\n"
- "+\n"
- "+\t/* MADV_FREE works only for anonymous vmas at the moment */\n"
- "+\tif (!vma_is_anonymous(vma))\n"
- "+\t\treturn -EINVAL;\n"
- "+\n"
- "+\tstart = max(vma->vm_start, start_addr);\n"
- "+\tif (start >= vma->vm_end)\n"
- "+\t\treturn -EINVAL;\n"
- "+\tend = min(vma->vm_end, end_addr);\n"
- "+\tif (end <= vma->vm_start)\n"
- "+\t\treturn -EINVAL;\n"
- "+\n"
- "+\tlru_add_drain();\n"
- "+\ttlb_gather_mmu(&tlb, mm, start, end);\n"
- "+\tupdate_hiwater_rss(mm);\n"
- "+\n"
- "+\tmmu_notifier_invalidate_range_start(mm, start, end);\n"
- "+\tmadvise_free_page_range(&tlb, vma, start, end);\n"
- "+\tmmu_notifier_invalidate_range_end(mm, start, end);\n"
- "+\ttlb_finish_mmu(&tlb, start, end);\n"
- "+\n"
- "+\treturn 0;\n"
- "+}\n"
- "+\n"
- "+static long madvise_free(struct vm_area_struct *vma,\n"
- "+\t\t\t     struct vm_area_struct **prev,\n"
- "+\t\t\t     unsigned long start, unsigned long end)\n"
- "+{\n"
- "+\t*prev = vma;\n"
- "+\treturn madvise_free_single_vma(vma, start, end);\n"
- "+}\n"
- "+\n"
- " /*\n"
- "  * Application no longer needs these pages.  If the pages are dirty,\n"
- "  * it's OK to just throw them away.  The app will be more careful about\n"
- "@@ -379,6 +540,14 @@ madvise_vma(struct vm_area_struct *vma, struct vm_area_struct **prev,\n"
- " \t\treturn madvise_remove(vma, prev, start, end);\n"
- " \tcase MADV_WILLNEED:\n"
- " \t\treturn madvise_willneed(vma, prev, start, end);\n"
- "+\tcase MADV_FREE:\n"
- "+\t\t/*\n"
- "+\t\t * XXX: In this implementation, MADV_FREE works like\n"
- "+\t\t * MADV_DONTNEED on a swapless system or when swap is full.\n"
- "+\t\t */\n"
- "+\t\tif (get_nr_swap_pages() > 0)\n"
- "+\t\t\treturn madvise_free(vma, prev, start, end);\n"
- "+\t\t/* passthrough */\n"
- " \tcase MADV_DONTNEED:\n"
- " \t\treturn madvise_dontneed(vma, prev, start, end);\n"
- " \tdefault:\n"
- "@@ -398,6 +567,7 @@ madvise_behavior_valid(int behavior)\n"
- " \tcase MADV_REMOVE:\n"
- " \tcase MADV_WILLNEED:\n"
- " \tcase MADV_DONTNEED:\n"
- "+\tcase MADV_FREE:\n"
- " #ifdef CONFIG_KSM\n"
- " \tcase MADV_MERGEABLE:\n"
- " \tcase MADV_UNMERGEABLE:\n"
- "diff --git a/mm/rmap.c b/mm/rmap.c\n"
- "index 6f371261dd12..321b633ee559 100644\n"
- "--- a/mm/rmap.c\n"
- "+++ b/mm/rmap.c\n"
- "@@ -1508,6 +1508,13 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,\n"
- " \t\t * See handle_pte_fault() ...\n"
- " \t\t */\n"
- " \t\tVM_BUG_ON_PAGE(!PageSwapCache(page), page);\n"
- "+\n"
- "+\t\tif (!PageDirty(page) && (flags & TTU_LZFREE)) {\n"
- "+\t\t\t/* The page was made freeable by MADV_FREE */\n"
- "+\t\t\tdec_mm_counter(mm, MM_ANONPAGES);\n"
- "+\t\t\tgoto discard;\n"
- "+\t\t}\n"
- "+\n"
- " \t\tif (swap_duplicate(entry) < 0) {\n"
- " \t\t\tset_pte_at(mm, address, pte, pteval);\n"
- " \t\t\tret = SWAP_FAIL;\n"
- "@@ -1528,6 +1535,7 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,\n"
- " \t} else\n"
- " \t\tdec_mm_counter(mm, mm_counter_file(page));\n"
- " \n"
- "+discard:\n"
- " \tpage_remove_rmap(page, PageHuge(page));\n"
- " \tpage_cache_release(page);\n"
- " \n"
- "diff --git a/mm/swap_state.c b/mm/swap_state.c\n"
- "index d783872d746c..676ff2991380 100644\n"
- "--- a/mm/swap_state.c\n"
- "+++ b/mm/swap_state.c\n"
- "@@ -185,13 +185,12 @@ int add_to_swap(struct page *page, struct list_head *list)\n"
- " \t * deadlock in the swap out path.\n"
- " \t */\n"
- " \t/*\n"
- "-\t * Add it to the swap cache and mark it dirty\n"
- "+\t * Add it to the swap cache.\n"
- " \t */\n"
- " \terr = add_to_swap_cache(page, entry,\n"
- " \t\t\t__GFP_HIGH|__GFP_NOMEMALLOC|__GFP_NOWARN);\n"
- " \n"
- "-\tif (!err) {\t/* Success */\n"
- "-\t\tSetPageDirty(page);\n"
- "+\tif (!err) {\n"
- " \t\treturn 1;\n"
- " \t} else {\t/* -ENOMEM radix-tree allocation failure */\n"
- " \t\t/*\n"
- "diff --git a/mm/vmscan.c b/mm/vmscan.c\n"
- "index 4589cfdbe405..c2f69445190c 100644\n"
- "--- a/mm/vmscan.c\n"
- "+++ b/mm/vmscan.c\n"
- "@@ -908,6 +908,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,\n"
- " \t\tint may_enter_fs;\n"
- " \t\tenum page_references references = PAGEREF_RECLAIM_CLEAN;\n"
- " \t\tbool dirty, writeback;\n"
- "+\t\tbool lazyfree = false;\n"
- " \n"
- " \t\tcond_resched();\n"
- " \n"
- "@@ -1051,6 +1052,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,\n"
- " \t\t\t\tgoto keep_locked;\n"
- " \t\t\tif (!add_to_swap(page, page_list))\n"
- " \t\t\t\tgoto activate_locked;\n"
- "+\t\t\tlazyfree = true;\n"
- " \t\t\tmay_enter_fs = 1;\n"
- " \n"
- " \t\t\t/* Adding to swap updated mapping */\n"
- "@@ -1062,8 +1064,9 @@ static unsigned long shrink_page_list(struct list_head *page_list,\n"
- " \t\t * processes. Try to unmap it here.\n"
- " \t\t */\n"
- " \t\tif (page_mapped(page) && mapping) {\n"
- "-\t\t\tswitch (try_to_unmap(page,\n"
- "-\t\t\t\t\tttu_flags|TTU_BATCH_FLUSH)) {\n"
- "+\t\t\tswitch (try_to_unmap(page, lazyfree ?\n"
- "+\t\t\t\t(ttu_flags | TTU_BATCH_FLUSH | TTU_LZFREE) :\n"
- "+\t\t\t\t(ttu_flags | TTU_BATCH_FLUSH))) {\n"
- " \t\t\tcase SWAP_FAIL:\n"
- " \t\t\t\tgoto activate_locked;\n"
- " \t\t\tcase SWAP_AGAIN:\n"
- "@@ -1188,6 +1191,9 @@ static unsigned long shrink_page_list(struct list_head *page_list,\n"
- " \t\t */\n"
- " \t\t__ClearPageLocked(page);\n"
- " free_it:\n"
- "+\t\tif (lazyfree && !PageDirty(page))\n"
- "+\t\t\tcount_vm_event(PGLAZYFREED);\n"
- "+\n"
- " \t\tnr_reclaimed++;\n"
- " \n"
- " \t\t/*\n"
- "diff --git a/mm/vmstat.c b/mm/vmstat.c\n"
- "index d13cd8eebf70..38929dc79c3d 100644\n"
- "--- a/mm/vmstat.c\n"
- "+++ b/mm/vmstat.c\n"
- "@@ -781,6 +781,7 @@ const char * const vmstat_text[] = {\n"
- " \n"
- " \t\"pgfault\",\n"
- " \t\"pgmajfault\",\n"
- "+\t\"pglazyfreed\",\n"
- " \n"
- " \tTEXTS_FOR_ZONES(\"pgrefill\")\n"
- " \tTEXTS_FOR_ZONES(\"pgsteal_kswapd\")\n"
- "-- \n"
- "1.9.1\n"
- "\n"
- "--\n"
- "To unsubscribe, send a message with 'unsubscribe linux-mm' in\n"
- "the body to majordomo@kvack.org.  For more info on Linux MM,\n"
- "see: http://www.linux-mm.org/ .\n"
- "Don't email: <a href=mailto:\"dont@kvack.org\"> email@kvack.org </a>"
+ Thanks for the review!
 
-c1345d54795b40771caedc2e93ee908877928a9278a344fcb45602321f95bc61
+1c99e283df269c72323b1f5156bec96c43628d1d6c2202451a161197ea9096bd

diff --git a/a/1.txt b/N2/1.txt
index 5fdf196..983b5ec 100644
--- a/a/1.txt
+++ b/N2/1.txt
@@ -564,9 +564,3 @@ index d13cd8eebf70..38929dc79c3d 100644
  	TEXTS_FOR_ZONES("pgsteal_kswapd")
 -- 
 1.9.1
-
---
-To unsubscribe, send a message with 'unsubscribe linux-mm' in
-the body to majordomo@kvack.org.  For more info on Linux MM,
-see: http://www.linux-mm.org/ .
-Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
diff --git a/a/content_digest b/N2/content_digest
index 43edac9..756a36d 100644
--- a/a/content_digest
+++ b/N2/content_digest
@@ -590,12 +590,6 @@
  " \tTEXTS_FOR_ZONES(\"pgrefill\")\n"
  " \tTEXTS_FOR_ZONES(\"pgsteal_kswapd\")\n"
  "-- \n"
- "1.9.1\n"
- "\n"
- "--\n"
- "To unsubscribe, send a message with 'unsubscribe linux-mm' in\n"
- "the body to majordomo@kvack.org.  For more info on Linux MM,\n"
- "see: http://www.linux-mm.org/ .\n"
- "Don't email: <a href=mailto:\"dont@kvack.org\"> email@kvack.org </a>"
+ 1.9.1
 
-c1345d54795b40771caedc2e93ee908877928a9278a344fcb45602321f95bc61
+4af17eb29d43d1b48127a9b56909da0617f4fc37e3c18089c64d3e0ebbe8aeba
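
For readers who want to see the userspace side of the interface this patch adds, here is a minimal sketch, not taken from the thread. It assumes a Linux system with glibc; the helper name lazy_free_demo is hypothetical, and the fallback to MADV_DONTNEED is an illustrative choice for kernels or headers that lack MADV_FREE.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): map a few anonymous
 * pages, dirty them, then tell the kernel they may be freed lazily.
 * Returns 0 on success, -1 on failure.
 */
int lazy_free_demo(void)
{
	long page = sysconf(_SC_PAGESIZE);
	if (page <= 0)
		return -1;

	size_t len = (size_t)page * 4;
	char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	/* Dirty the pages so there is something to discard. */
	memset(buf, 0xAA, len);

	int rc = -1;
#ifdef MADV_FREE
	/* Only anonymous private mappings are accepted by this patch. */
	rc = madvise(buf, len, MADV_FREE);
#endif
	/* Assumed fallback: older kernels reject MADV_FREE with EINVAL. */
	if (rc != 0)
		rc = madvise(buf, len, MADV_DONTNEED);

	/*
	 * The range stays mapped after either advice. Its old contents
	 * must not be relied on, but writing to it again is legal.
	 */
	if (rc == 0) {
		buf[0] = 1;
		if (buf[0] != 1)
			rc = -1;
	}

	munmap(buf, len);
	return rc;
}
```

A caller would simply check that lazy_free_demo() returns 0. Note that, per the XXX comment in the madvise_vma() hunk, this implementation treats MADV_FREE like MADV_DONTNEED when no swap space is available, so the two branches may behave identically on such systems.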
