From: Andrew Morton <akpm@linux-foundation.org>
To: mm-commits@vger.kernel.org, ziy@nvidia.com, willy@infradead.org,
	vbabka@suse.cz, tsbogend@alpha.franken.de, songliubraving@fb.com,
	sj@kernel.org, shy828301@gmail.com,
	rongwei.wang@linux.alibaba.com, rientjes@google.com,
	peterx@redhat.com, pasha.tatashin@soleen.com, minchan@kernel.org,
	mhocko@suse.com, mattst88@gmail.com, linmiaohe@huawei.com,
	kirill.shutemov@linux.intel.com, jrdr.linux@gmail.com,
	jcmvbkbc@gmail.com, James.Bottomley@HansenPartnership.com,
	ink@jurassic.park.msu.ru, hughd@google.com, deller@gmx.de,
	david@redhat.com, dan.carpenter@oracle.com, ckennelly@google.com,
	chris@zankel.net, axelrasmussen@google.com, axboe@kernel.dk,
	asml.silence@gmail.com, arnd@arndb.de,
	alex.shi@linux.alibaba.com, aarcange@redhat.com,
	zokeefe@google.com, akpm@linux-foundation.org
Subject: [merged mm-stable] mm-khugepaged-add-flag-to-predicate-khugepaged-only-behavior.patch removed from -mm tree
Date: Sun, 11 Sep 2022 20:28:01 -0700
Message-ID: <20220912032802.8BE34C433D7@smtp.kernel.org>


The quilt patch titled
     Subject: mm/khugepaged: add flag to predicate khugepaged-only behavior
has been removed from the -mm tree.  Its filename was
     mm-khugepaged-add-flag-to-predicate-khugepaged-only-behavior.patch

This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: "Zach O'Keefe" <zokeefe@google.com>
Subject: mm/khugepaged: add flag to predicate khugepaged-only behavior
Date: Wed, 6 Jul 2022 16:59:24 -0700

Add an .is_khugepaged flag to struct collapse_control so that
khugepaged-specific behavior can be skipped in MADV_COLLAPSE context.

Start by guarding khugepaged-specific heuristics with this flag.  In
MADV_COLLAPSE, the user presumably has reason to believe the collapse will
be beneficial, so the following khugepaged heuristics shouldn't prevent
the collapse:

1) sysfs-controlled knobs khugepaged_max_ptes_[none|swap|shared]

2) requirement that some pages in region being collapsed be young or
   referenced
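
The gating pattern described above can be illustrated with a small
userspace sketch.  This is not the kernel code: the sketch_* names, the
boolean arrays standing in for PTEs, and the max_ptes_none parameter are
all hypothetical, and the userfaultfd, writable, swap and shared checks
are left out.  It only aims to show that the counters are always updated
while the limit and the young/referenced requirement apply only when
is_khugepaged is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption for the sketch: 4 KiB base pages and a 2 MiB PMD. */
#define SKETCH_PTES_PER_PMD 512u

/* Hypothetical stand-in for struct collapse_control (flag only). */
struct sketch_cc {
	bool is_khugepaged;
};

/* Hypothetical result codes, loosely mirroring the SCAN_* values. */
enum sketch_result {
	SKETCH_SUCCEED,
	SKETCH_EXCEED_NONE_PTE,
	SKETCH_LACK_REFERENCED_PAGE,
};

/*
 * Walk nr_ptes entries.  pte_is_none[i] marks an empty entry and
 * pte_is_young[i] marks a recently referenced one.
 */
static enum sketch_result sketch_scan(const struct sketch_cc *cc,
				      const bool *pte_is_none,
				      const bool *pte_is_young,
				      unsigned int nr_ptes,
				      unsigned int max_ptes_none)
{
	unsigned int none_or_zero = 0, referenced = 0;
	unsigned int i;

	assert(nr_ptes <= SKETCH_PTES_PER_PMD);

	for (i = 0; i < nr_ptes; i++) {
		if (pte_is_none[i]) {
			/* Count unconditionally, enforce only for khugepaged. */
			++none_or_zero;
			if (!cc->is_khugepaged ||
			    none_or_zero <= max_ptes_none)
				continue;
			return SKETCH_EXCEED_NONE_PTE;
		}
		/* Young pages only matter to the khugepaged heuristic. */
		if (cc->is_khugepaged && pte_is_young[i])
			referenced++;
	}

	if (cc->is_khugepaged && !referenced)
		return SKETCH_LACK_REFERENCED_PAGE;
	return SKETCH_SUCCEED;
}
```

With the same input, a control structure with is_khugepaged set can fail
on either heuristic, while one with the flag clear (the MADV_COLLAPSE
case) goes through to success.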

[zokeefe@google.com: consistently order cc->is_khugepaged and pte_* checks]
  Link: https://lkml.kernel.org/r/20220720140603.1958773-3-zokeefe@google.com
  Link: https://lore.kernel.org/linux-mm/Ys2qJm6FaOQcxkha@google.com/
Link: https://lkml.kernel.org/r/20220706235936.2197195-7-zokeefe@google.com
Signed-off-by: Zach O'Keefe <zokeefe@google.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Chris Kennelly <ckennelly@google.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: "Souptick Joarder (HPE)" <jrdr.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/khugepaged.c |   83 ++++++++++++++++++++++++++++++++--------------
 1 file changed, 58 insertions(+), 25 deletions(-)

--- a/mm/khugepaged.c~mm-khugepaged-add-flag-to-predicate-khugepaged-only-behavior
+++ a/mm/khugepaged.c
@@ -73,6 +73,8 @@ static DECLARE_WAIT_QUEUE_HEAD(khugepage
  * default collapse hugepages if there is at least one pte mapped like
  * it would have happened if the vma was large enough during page
  * fault.
+ *
+ * Note that these are only respected if collapse was initiated by khugepaged.
  */
 static unsigned int khugepaged_max_ptes_none __read_mostly;
 static unsigned int khugepaged_max_ptes_swap __read_mostly;
@@ -86,6 +88,8 @@ static struct kmem_cache *mm_slot_cache
 #define MAX_PTE_MAPPED_THP 8
 
 struct collapse_control {
+	bool is_khugepaged;
+
 	/* Num pages scanned per node */
 	u32 node_load[MAX_NUMNODES];
 
@@ -554,6 +558,7 @@ static bool is_refcount_suitable(struct
 static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 					unsigned long address,
 					pte_t *pte,
+					struct collapse_control *cc,
 					struct list_head *compound_pagelist)
 {
 	struct page *page = NULL;
@@ -566,8 +571,10 @@ static int __collapse_huge_page_isolate(
 		pte_t pteval = *_pte;
 		if (pte_none(pteval) || (pte_present(pteval) &&
 				is_zero_pfn(pte_pfn(pteval)))) {
+			++none_or_zero;
 			if (!userfaultfd_armed(vma) &&
-			    ++none_or_zero <= khugepaged_max_ptes_none) {
+			    (!cc->is_khugepaged ||
+			     none_or_zero <= khugepaged_max_ptes_none)) {
 				continue;
 			} else {
 				result = SCAN_EXCEED_NONE_PTE;
@@ -587,11 +594,14 @@ static int __collapse_huge_page_isolate(
 
 		VM_BUG_ON_PAGE(!PageAnon(page), page);
 
-		if (page_mapcount(page) > 1 &&
-				++shared > khugepaged_max_ptes_shared) {
-			result = SCAN_EXCEED_SHARED_PTE;
-			count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
-			goto out;
+		if (page_mapcount(page) > 1) {
+			++shared;
+			if (cc->is_khugepaged &&
+			    shared > khugepaged_max_ptes_shared) {
+				result = SCAN_EXCEED_SHARED_PTE;
+				count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
+				goto out;
+			}
 		}
 
 		if (PageCompound(page)) {
@@ -654,10 +664,14 @@ static int __collapse_huge_page_isolate(
 		if (PageCompound(page))
 			list_add_tail(&page->lru, compound_pagelist);
 next:
-		/* There should be enough young pte to collapse the page */
-		if (pte_young(pteval) ||
-		    page_is_young(page) || PageReferenced(page) ||
-		    mmu_notifier_test_young(vma->vm_mm, address))
+		/*
+		 * If collapse was initiated by khugepaged, check that there is
+		 * enough young pte to justify collapsing the page
+		 */
+		if (cc->is_khugepaged &&
+		    (pte_young(pteval) || page_is_young(page) ||
+		     PageReferenced(page) || mmu_notifier_test_young(vma->vm_mm,
+								     address)))
 			referenced++;
 
 		if (pte_write(pteval))
@@ -666,7 +680,7 @@ next:
 
 	if (unlikely(!writable)) {
 		result = SCAN_PAGE_RO;
-	} else if (unlikely(!referenced)) {
+	} else if (unlikely(cc->is_khugepaged && !referenced)) {
 		result = SCAN_LACK_REFERENCED_PAGE;
 	} else {
 		result = SCAN_SUCCEED;
@@ -745,6 +759,7 @@ static void khugepaged_alloc_sleep(void)
 
 
 struct collapse_control khugepaged_collapse_control = {
+	.is_khugepaged = true,
 	.last_target_node = NUMA_NO_NODE,
 };
 
@@ -1025,7 +1040,7 @@ static int collapse_huge_page(struct mm_
 	mmu_notifier_invalidate_range_end(&range);
 
 	spin_lock(pte_ptl);
-	result =  __collapse_huge_page_isolate(vma, address, pte,
+	result =  __collapse_huge_page_isolate(vma, address, pte, cc,
 					       &compound_pagelist);
 	spin_unlock(pte_ptl);
 
@@ -1116,7 +1131,9 @@ static int khugepaged_scan_pmd(struct mm
 	     _pte++, _address += PAGE_SIZE) {
 		pte_t pteval = *_pte;
 		if (is_swap_pte(pteval)) {
-			if (++unmapped <= khugepaged_max_ptes_swap) {
+			++unmapped;
+			if (!cc->is_khugepaged ||
+			    unmapped <= khugepaged_max_ptes_swap) {
 				/*
 				 * Always be strict with uffd-wp
 				 * enabled swap entries.  Please see
@@ -1134,8 +1151,10 @@ static int khugepaged_scan_pmd(struct mm
 			}
 		}
 		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
+			++none_or_zero;
 			if (!userfaultfd_armed(vma) &&
-			    ++none_or_zero <= khugepaged_max_ptes_none) {
+			    (!cc->is_khugepaged ||
+			     none_or_zero <= khugepaged_max_ptes_none)) {
 				continue;
 			} else {
 				result = SCAN_EXCEED_NONE_PTE;
@@ -1165,11 +1184,14 @@ static int khugepaged_scan_pmd(struct mm
 			goto out_unmap;
 		}
 
-		if (page_mapcount(page) > 1 &&
-				++shared > khugepaged_max_ptes_shared) {
-			result = SCAN_EXCEED_SHARED_PTE;
-			count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
-			goto out_unmap;
+		if (page_mapcount(page) > 1) {
+			++shared;
+			if (cc->is_khugepaged &&
+			    shared > khugepaged_max_ptes_shared) {
+				result = SCAN_EXCEED_SHARED_PTE;
+				count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
+				goto out_unmap;
+			}
 		}
 
 		page = compound_head(page);
@@ -1220,14 +1242,22 @@ static int khugepaged_scan_pmd(struct mm
 			result = SCAN_PAGE_COUNT;
 			goto out_unmap;
 		}
-		if (pte_young(pteval) ||
-		    page_is_young(page) || PageReferenced(page) ||
-		    mmu_notifier_test_young(vma->vm_mm, address))
+
+		/*
+		 * If collapse was initiated by khugepaged, check that there is
+		 * enough young pte to justify collapsing the page
+		 */
+		if (cc->is_khugepaged &&
+		    (pte_young(pteval) || page_is_young(page) ||
+		     PageReferenced(page) || mmu_notifier_test_young(vma->vm_mm,
+								     address)))
 			referenced++;
 	}
 	if (!writable) {
 		result = SCAN_PAGE_RO;
-	} else if (!referenced || (unmapped && referenced < HPAGE_PMD_NR/2)) {
+	} else if (cc->is_khugepaged &&
+		   (!referenced ||
+		    (unmapped && referenced < HPAGE_PMD_NR / 2))) {
 		result = SCAN_LACK_REFERENCED_PAGE;
 	} else {
 		result = SCAN_SUCCEED;
@@ -1896,7 +1926,9 @@ static int khugepaged_scan_file(struct m
 			continue;
 
 		if (xa_is_value(page)) {
-			if (++swap > khugepaged_max_ptes_swap) {
+			++swap;
+			if (cc->is_khugepaged &&
+			    swap > khugepaged_max_ptes_swap) {
 				result = SCAN_EXCEED_SWAP_PTE;
 				count_vm_event(THP_SCAN_EXCEED_SWAP_PTE);
 				break;
@@ -1947,7 +1979,8 @@ static int khugepaged_scan_file(struct m
 	rcu_read_unlock();
 
 	if (result == SCAN_SUCCEED) {
-		if (present < HPAGE_PMD_NR - khugepaged_max_ptes_none) {
+		if (cc->is_khugepaged &&
+		    present < HPAGE_PMD_NR - khugepaged_max_ptes_none) {
 			result = SCAN_EXCEED_NONE_PTE;
 			count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
 		} else {
_

Patches currently in -mm which might be from zokeefe@google.com are

mm-shmem-add-flag-to-enforce-shmem-thp-in-hugepage_vma_check.patch
mm-khugepaged-attempt-to-map-file-shmem-backed-pte-mapped-thps-by-pmds.patch
mm-madvise-add-file-and-shmem-support-to-madv_collapse.patch
mm-khugepaged-add-tracepoint-to-hpage_collapse_scan_file.patch
selftests-vm-dedup-thp-helpers.patch
selftests-vm-modularize-thp-collapse-memory-operations.patch
selftests-vm-add-thp-collapse-file-and-tmpfs-testing.patch
selftests-vm-add-thp-collapse-shmem-testing.patch
selftests-vm-add-file-shmem-madv_collapse-selftest-for-cleared-pmd.patch
selftests-vm-add-selftest-for-madv_collapse-of-uffd-minor-memory.patch

