+ kvm-explicitly-set-foll_honor_numa_fault-in-hva_to_pfn_slow.patch added to mm-unstable branch

All of lore.kernel.org
 help / color / mirror / Atom feed

* + kvm-explicitly-set-foll_honor_numa_fault-in-hva_to_pfn_slow.patch added to mm-unstable branch
@ 2023-08-04 18:28 Andrew Morton
  0 siblings, 0 replies; only message in thread
From: Andrew Morton @ 2023-08-04 18:28 UTC (permalink / raw)
  To: mm-commits, willy, torvalds, shuah, peterx, pbonzini, mgorman,
	mgorman, liubo254, jhubbard, jgg, hughd, david, akpm


The patch titled
     Subject: kvm: explicitly set FOLL_HONOR_NUMA_FAULT in hva_to_pfn_slow()
has been added to the -mm mm-unstable branch.  Its filename is
     kvm-explicitly-set-foll_honor_numa_fault-in-hva_to_pfn_slow.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/kvm-explicitly-set-foll_honor_numa_fault-in-hva_to_pfn_slow.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: David Hildenbrand <david@redhat.com>
Subject: kvm: explicitly set FOLL_HONOR_NUMA_FAULT in hva_to_pfn_slow()
Date: Thu, 3 Aug 2023 16:32:04 +0200

KVM is *the* case we know that really wants to honor NUMA hinting falls.
As we want to stop setting FOLL_HONOR_NUMA_FAULT implicitly, set
FOLL_HONOR_NUMA_FAULT whenever we might obtain pages on behalf of a VCPU
to map them into a secondary MMU, and add a comment why.

Do that unconditionally in hva_to_pfn_slow() when calling
get_user_pages_unlocked().

kvmppc_book3s_instantiate_page(), hva_to_pfn_fast() and
gfn_to_page_many_atomic() are similarly used to map pages into a
secondary MMU. However, FOLL_WRITE and get_user_page_fast_only() always
implicitly honor NUMA hinting faults -- as documented for
FOLL_HONOR_NUMA_FAULT -- so we can limit this change to a single location
for now.

Don't set it in check_user_page_hwpoison(), where we really only want to
check if the mapped page is HW-poisoned.

We won't set it for other KVM users of get_user_pages()/pin_user_pages()
* arch/powerpc/kvm/book3s_64_mmu_hv.c: not used to map pages into a
  secondary MMU.
* arch/powerpc/kvm/e500_mmu.c: only used on shared TLB pages with userspace
* arch/s390/kvm/*: s390x only supports a single NUMA node either way
* arch/x86/kvm/svm/sev.c: not used to map pages into a secondary MMU.

This is a preparation for making FOLL_HONOR_NUMA_FAULT no longer
implicitly be set by get_user_pages() and friends.

Link: https://lkml.kernel.org/r/20230803143208.383663-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: liubo <liubo254@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 virt/kvm/kvm_main.c |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

--- a/virt/kvm/kvm_main.c~kvm-explicitly-set-foll_honor_numa_fault-in-hva_to_pfn_slow
+++ a/virt/kvm/kvm_main.c
@@ -2517,7 +2517,18 @@ static bool hva_to_pfn_fast(unsigned lon
 static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
 			   bool interruptible, bool *writable, kvm_pfn_t *pfn)
 {
-	unsigned int flags = FOLL_HWPOISON;
+	/*
+	 * When a VCPU accesses a page that is not mapped into the secondary
+	 * MMU, we lookup the page using GUP to map it, so the guest VCPU can
+	 * make progress. We always want to honor NUMA hinting faults in that
+	 * case, because GUP usage corresponds to memory accesses from the VCPU.
+	 * Otherwise, we'd not trigger NUMA hinting faults once a page is
+	 * mapped into the secondary MMU and gets accessed by a VCPU.
+	 *
+	 * Note that get_user_page_fast_only() and FOLL_WRITE for now
+	 * implicitly honor NUMA hinting faults and don't need this flag.
+	 */
+	unsigned int flags = FOLL_HWPOISON | FOLL_HONOR_NUMA_FAULT;
 	struct page *page;
 	int npages;
 
_

Patches currently in -mm which might be from david@redhat.com are

mm-gup-reintroduce-foll_numa-as-foll_honor_numa_fault.patch
smaps-use-vm_normal_page_pmd-instead-of-follow_trans_huge_pmd.patch
mm-memory_hotplug-document-the-signal_pending-check-in-offline_pages.patch
kvm-explicitly-set-foll_honor_numa_fault-in-hva_to_pfn_slow.patch
mm-gup-dont-implicitly-set-foll_honor_numa_fault.patch
pgtable-improve-pte_protnone-comment.patch
selftest-mm-ksm_functional_tests-test-in-mmap_and_merge_range-if-anything-got-merged.patch
selftest-mm-ksm_functional_tests-add-prot_none-test.patch


^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2023-08-04 18:32 UTC | newest]

Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-04 18:28 + kvm-explicitly-set-foll_honor_numa_fault-in-hva_to_pfn_slow.patch added to mm-unstable branch Andrew Morton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.