From: Chao Peng <chao.p.peng@linux.intel.com>
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
qemu-devel@nongnu.org
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Jonathan Corbet <corbet@lwn.net>,
Sean Christopherson <seanjc@google.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
x86@kernel.org, "H . Peter Anvin" <hpa@zytor.com>,
Hugh Dickins <hughd@google.com>, Jeff Layton <jlayton@kernel.org>,
"J . Bruce Fields" <bfields@fieldses.org>,
Andrew Morton <akpm@linux-foundation.org>,
Yu Zhang <yu.c.zhang@linux.intel.com>,
Chao Peng <chao.p.peng@linux.intel.com>,
"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
luto@kernel.org, john.ji@intel.com, susie.li@intel.com,
jun.nakajima@intel.com, dave.hansen@intel.com,
ak@linux.intel.com, david@redhat.com
Subject: [PATCH v3 kvm/queue 08/16] KVM: Special handling for fd-based memory invalidation
Date: Thu, 23 Dec 2021 20:30:03 +0800
Message-ID: <20211223123011.41044-9-chao.p.peng@linux.intel.com>
In-Reply-To: <20211223123011.41044-1-chao.p.peng@linux.intel.com>

For fd-based guest memory, the memory backend (e.g. the fd provider)
should notify KVM to unmap/invalidate the private memory from the KVM
secondary MMU when userspace punches a hole in the fd (e.g. when
userspace converts private memory to shared memory).

To support fd-based memory invalidation, the existing hva-based memory
invalidation needs to be extended. A new 'inode' for the fd is passed in
from memfd_falloc_notifier, and 'start/end' then represent the start/end
offsets in the fd instead of an hva range. During invalidation, KVM
checks this inode against the one in the memslot and invalidates the
mapping in KVM only when the inode in the memslot equals the passed-in
inode.

Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
---
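Note: for readers unfamiliar with the trigger described above, the
following is a minimal userspace sketch of punching a hole in an
fd-backed region. It uses a plain memfd; the MFD_INACCESSIBLE flag and
the memfd_falloc_notifier from this series are not used here, and the
helper names make_backing_fd() and punch_hole() are hypothetical. It only
illustrates the fallocate() call that, with this series applied, would be
expected to reach KVM as an (inode, start offset, end offset) range.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* memfd_create(), FALLOC_FL_* */
#endif
#include <assert.h>
#include <fcntl.h>		/* fallocate() */
#include <sys/mman.h>		/* memfd_create() */
#include <sys/stat.h>		/* fstat() */
#include <sys/types.h>
#include <unistd.h>		/* ftruncate(), close(), pread(), pwrite() */

/* Hypothetical helper: create a memfd of 'size' bytes; returns fd or -1. */
static int make_backing_fd(off_t size)
{
	int fd = memfd_create("guest-mem", 0);

	if (fd < 0)
		return -1;
	if (ftruncate(fd, size) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper: drop the backing pages for [offset, offset + len)
 * while keeping the file size unchanged. Returns 0 on success, -1 on error.
 */
static int punch_hole(int fd, off_t offset, off_t len)
{
	return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			 offset, len);
}
```

A caller would typically do fd = make_backing_fd(size) once and then
punch_hole(fd, offset, len) for each range it converts; the offset/len
pair corresponds to the 'start/end' file offsets mentioned in the
changelog.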
virt/kvm/kvm_main.c | 30 ++++++++++++++++++++++++------
1 file changed, 24 insertions(+), 6 deletions(-)
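Note: the per-memslot logic in the hunks below can be modelled in
userspace roughly as follows. This is a simplified sketch, not the kernel
code: the model_* types, MODEL_PAGE_SHIFT and model_slot_gfn_range() are
hypothetical stand-ins, the inode is reduced to an opaque pointer, and
the address-to-gfn conversion is assumed to be
base_gfn + ((addr - base) >> PAGE_SHIFT), which is how
useraddr_to_gfn_memslot() from the previous patch is presumed to behave.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT	12UL			/* assumption: 4 KiB pages */
#define MODEL_PAGE_SIZE		(1UL << MODEL_PAGE_SHIFT)

/* Simplified stand-in for struct kvm_memory_slot. */
struct model_slot {
	const void *inode;		/* models slot->file->f_inode; NULL if no file */
	unsigned long ofs;		/* models slot->ofs (offset into the fd) */
	unsigned long userspace_addr;	/* hva base of the slot */
	unsigned long base_gfn;
	unsigned long npages;
};

/* Simplified stand-in for struct kvm_useraddr_range: [start, end). */
struct model_range {
	unsigned long start;
	unsigned long end;
	const void *inode;		/* NULL selects the hva path */
};

struct model_gfn_range {
	unsigned long start;
	unsigned long end;		/* exclusive */
};

/*
 * Returns true and fills 'out' if 'slot' must be invalidated for 'range'.
 * With an inode set, slots backed by a different inode (or by no file)
 * are skipped and start/end are treated as file offsets; otherwise they
 * are treated as hvas.
 */
static bool model_slot_gfn_range(const struct model_slot *slot,
				 const struct model_range *range,
				 struct model_gfn_range *out)
{
	unsigned long base, lo, hi, slot_end;

	if (range->inode) {
		if (!slot->inode || slot->inode != range->inode)
			return false;
		base = slot->ofs;
	} else {
		base = slot->userspace_addr;
	}

	/* Clamp the request to the part covered by this slot. */
	slot_end = base + (slot->npages << MODEL_PAGE_SHIFT);
	lo = range->start > base ? range->start : base;
	hi = range->end < slot_end ? range->end : slot_end;
	if (lo >= hi)
		return false;	/* the kernel's interval-tree walk only yields overlaps */

	/* Round the end up so a partially covered last page is included. */
	out->start = slot->base_gfn + ((lo - base) >> MODEL_PAGE_SHIFT);
	out->end = slot->base_gfn +
		   ((hi + MODEL_PAGE_SIZE - 1 - base) >> MODEL_PAGE_SHIFT);
	return true;
}
```

For example, with a slot at file offset 0x10000, base_gfn 0x100 and 16
pages, a request for offsets [0x11000, 0x13800) on the same inode yields
gfns [0x101, 0x104), while the same request with a different inode is
skipped.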
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b7a1c4d7eaaa..19736a0013a0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -494,6 +494,7 @@ typedef void (*on_lock_fn_t)(struct kvm *kvm, unsigned long start,
 struct kvm_useraddr_range {
 	unsigned long start;
 	unsigned long end;
+	struct inode *inode;
 	pte_t pte;
 	gfn_handler_t handler;
 	on_lock_fn_t on_lock;
@@ -544,14 +545,27 @@ static __always_inline int __kvm_handle_useraddr_range(struct kvm *kvm,
 		struct interval_tree_node *node;
 
 		slots = __kvm_memslots(kvm, i);
-		useraddr_tree = &slots->hva_tree;
+		useraddr_tree = range->inode ? &slots->ofs_tree : &slots->hva_tree;
 		kvm_for_each_memslot_in_useraddr_range(node, useraddr_tree,
 			range->start, range->end - 1) {
 			unsigned long useraddr_start, useraddr_end;
+			unsigned long useraddr_base;
+
+			if (range->inode) {
+				slot = container_of(node, struct kvm_memory_slot,
+						    ofs_node[slots->node_idx]);
+				if (!slot->file ||
+				    slot->file->f_inode != range->inode)
+					continue;
+				useraddr_base = slot->ofs;
+			} else {
+				slot = container_of(node, struct kvm_memory_slot,
+						    hva_node[slots->node_idx]);
+				useraddr_base = slot->userspace_addr;
+			}
 
-			slot = container_of(node, struct kvm_memory_slot, hva_node[slots->node_idx]);
-			useraddr_start = max(range->start, slot->userspace_addr);
-			useraddr_end = min(range->end, slot->userspace_addr +
+			useraddr_start = max(range->start, useraddr_base);
+			useraddr_end = min(range->end, useraddr_base +
 						  (slot->npages << PAGE_SHIFT));
 
 			/*
@@ -568,10 +582,10 @@ static __always_inline int __kvm_handle_useraddr_range(struct kvm *kvm,
 			 * {gfn_start, gfn_start+1, ..., gfn_end-1}.
 			 */
 			gfn_range.start = useraddr_to_gfn_memslot(useraddr_start,
-								  slot, true);
+								  slot, !range->inode);
 			gfn_range.end = useraddr_to_gfn_memslot(
 						useraddr_end + PAGE_SIZE - 1,
-						slot, true);
+						slot, !range->inode);
 			gfn_range.slot = slot;
 
 			if (!locked) {
@@ -613,6 +627,7 @@ static __always_inline int kvm_handle_hva_range(struct mmu_notifier *mn,
 		.on_lock = (void *)kvm_null_fn,
 		.flush_on_ret = true,
 		.may_block = false,
+		.inode = NULL,
 	};
 
 	return __kvm_handle_useraddr_range(kvm, &range);
@@ -632,6 +647,7 @@ static __always_inline int kvm_handle_hva_range_no_flush(struct mmu_notifier *mn
 		.on_lock = (void *)kvm_null_fn,
 		.flush_on_ret = false,
 		.may_block = false,
+		.inode = NULL,
 	};
 
 	return __kvm_handle_useraddr_range(kvm, &range);
@@ -700,6 +716,7 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
 		.on_lock = kvm_inc_notifier_count,
 		.flush_on_ret = true,
 		.may_block = mmu_notifier_range_blockable(range),
+		.inode = NULL,
 	};
 
 	trace_kvm_unmap_hva_range(range->start, range->end);
@@ -751,6 +768,7 @@ static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn,
 		.on_lock = kvm_dec_notifier_count,
 		.flush_on_ret = false,
 		.may_block = mmu_notifier_range_blockable(range),
+		.inode = NULL,
 	};
 	bool wake;
 
--
2.17.1