From: Chao Peng <chao.p.peng@linux.intel.com>
To: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
linux-arch@vger.kernel.org, linux-api@vger.kernel.org,
linux-doc@vger.kernel.org, qemu-devel@nongnu.org,
Paolo Bonzini <pbonzini@redhat.com>,
Jonathan Corbet <corbet@lwn.net>,
Sean Christopherson <seanjc@google.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
x86@kernel.org, "H . Peter Anvin" <hpa@zytor.com>,
Hugh Dickins <hughd@google.com>, Jeff Layton <jlayton@kernel.org>,
"J . Bruce Fields" <bfields@fieldses.org>,
Andrew Morton <akpm@linux-foundation.org>,
Shuah Khan <shuah@kernel.org>, Mike Rapoport <rppt@kernel.org>,
Steven Price <steven.price@arm.com>,
"Maciej S . Szmigiero" <mail@maciej.szmigiero.name>,
Vlastimil Babka <vbabka@suse.cz>,
Vishal Annapurve <vannapurve@google.com>,
Yu Zhang <yu.c.zhang@linux.intel.com>,
"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
luto@kernel.org, jun.nakajima@intel.com, dave.hansen@intel.com,
ak@linux.intel.com, david@redhat.com, aarcange@redhat.com,
ddutile@redhat.com, dhildenb@redhat.com,
Quentin Perret <qperret@google.com>,
tabba@google.com, Michael Roth <michael.roth@amd.com>,
mhocko@suse.com, Muchun Song <songmuchun@bytedance.com>,
wei.w.wang@intel.com
Subject: Re: [PATCH v9 2/8] KVM: Extend the memslot to support fd-based private memory
Date: Mon, 31 Oct 2022 22:14:26 +0800
Message-ID: <20221031141426.GA3994099@chaop.bj.intel.com>
In-Reply-To: <f324f02c-cf76-08a9-07a3-4af60778056f@intel.com>
On Fri, Oct 28, 2022 at 03:04:27PM +0800, Xiaoyao Li wrote:
> On 10/25/2022 11:13 PM, Chao Peng wrote:
> > In memory encryption usage, guest memory may be encrypted with special
> > key and can be accessed only by the guest itself. We call such memory
> > private memory. It's valueless and sometimes can cause problem to allow
> > userspace to access guest private memory. This new KVM memslot extension
> > allows guest private memory being provided though a restrictedmem
> ^
>
> typo
Thanks!
>
> > backed file descriptor(fd) and userspace is restricted to access the
> > bookmarked memory in the fd.
> >
> > This new extension, indicated by the new flag KVM_MEM_PRIVATE, adds two
> > additional KVM memslot fields restricted_fd/restricted_offset to allow
> > userspace to instruct KVM to provide guest memory through restricted_fd.
> > 'guest_phys_addr' is mapped at the restricted_offset of restricted_fd
> > and the size is 'memory_size'.
> >
> > The extended memslot can still have the userspace_addr(hva). When use, a
> > single memslot can maintain both private memory through restricted_fd
> > and shared memory through userspace_addr. Whether the private or shared
> > part is visible to guest is maintained by other KVM code.
> >
> > A restrictedmem_notifier field is also added to the memslot structure to
> > allow the restricted_fd's backing store to notify KVM the memory change,
> > KVM then can invalidate its page table entries.
> >
> > Together with the change, a new config HAVE_KVM_RESTRICTED_MEM is added
> > and right now it is selected on X86_64 only. A KVM_CAP_PRIVATE_MEM is
> > also introduced to indicate KVM support for KVM_MEM_PRIVATE.
> >
> > To make code maintenance easy, internally we use a binary compatible
> > alias struct kvm_user_mem_region to handle both the normal and the
> > '_ext' variants.
> >
> > Co-developed-by: Yu Zhang <yu.c.zhang@linux.intel.com>
> > Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
> > Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> > ---
> > Documentation/virt/kvm/api.rst | 48 ++++++++++++++++++++++++++++-----
> > arch/x86/kvm/Kconfig | 2 ++
> > arch/x86/kvm/x86.c | 2 +-
> > include/linux/kvm_host.h | 13 +++++++--
> > include/uapi/linux/kvm.h | 29 ++++++++++++++++++++
> > virt/kvm/Kconfig | 3 +++
> > virt/kvm/kvm_main.c | 49 ++++++++++++++++++++++++++++------
> > 7 files changed, 128 insertions(+), 18 deletions(-)
> >
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index eee9f857a986..f3fa75649a78 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -1319,7 +1319,7 @@ yet and must be cleared on entry.
> > :Capability: KVM_CAP_USER_MEMORY
> > :Architectures: all
> > :Type: vm ioctl
> > -:Parameters: struct kvm_userspace_memory_region (in)
> > +:Parameters: struct kvm_userspace_memory_region(_ext) (in)
> > :Returns: 0 on success, -1 on error
> > ::
> > @@ -1332,9 +1332,18 @@ yet and must be cleared on entry.
> > __u64 userspace_addr; /* start of the userspace allocated memory */
> > };
> > + struct kvm_userspace_memory_region_ext {
> > + struct kvm_userspace_memory_region region;
> > + __u64 restricted_offset;
> > + __u32 restricted_fd;
> > + __u32 pad1;
> > + __u64 pad2[14];
> > + };
> > +
> > /* for kvm_memory_region::flags */
> > #define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0)
> > #define KVM_MEM_READONLY (1UL << 1)
> > + #define KVM_MEM_PRIVATE (1UL << 2)
> > This ioctl allows the user to create, modify or delete a guest physical
> > memory slot. Bits 0-15 of "slot" specify the slot id and this value
> > @@ -1365,12 +1374,27 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
> > be identical. This allows large pages in the guest to be backed by large
> > pages in the host.
> > -The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
> > -KVM_MEM_READONLY. The former can be set to instruct KVM to keep track of
> > -writes to memory within the slot. See KVM_GET_DIRTY_LOG ioctl to know how to
> > -use it. The latter can be set, if KVM_CAP_READONLY_MEM capability allows it,
> > -to make a new slot read-only. In this case, writes to this memory will be
> > -posted to userspace as KVM_EXIT_MMIO exits.
> > +kvm_userspace_memory_region_ext struct includes all fields of
> > +kvm_userspace_memory_region struct, while also adds additional fields for some
> > +other features. See below description of flags field for more information.
> > +It's recommended to use kvm_userspace_memory_region_ext in new userspace code.
> > +
> > +The flags field supports following flags:
> > +
> > +- KVM_MEM_LOG_DIRTY_PAGES to instruct KVM to keep track of writes to memory
> > + within the slot. For more details, see KVM_GET_DIRTY_LOG ioctl.
> > +
> > +- KVM_MEM_READONLY, if KVM_CAP_READONLY_MEM allows, to make a new slot
> > + read-only. In this case, writes to this memory will be posted to userspace as
> > + KVM_EXIT_MMIO exits.
> > +
> > +- KVM_MEM_PRIVATE, if KVM_CAP_PRIVATE_MEM allows, to indicate a new slot has
> > + private memory backed by a file descriptor(fd) and userspace access to the
> > + fd may be restricted. Userspace should use restricted_fd/restricted_offset in
> > + kvm_userspace_memory_region_ext to instruct KVM to provide private memory
> > + to guest. Userspace should guarantee not to map the same pfn indicated by
> > + restricted_fd/restricted_offset to different gfns with multiple memslots.
> > + Failed to do this may result undefined behavior.
> > When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
> > the memory region are automatically reflected into the guest. For example, an
> > @@ -8215,6 +8239,16 @@ structure.
> > When getting the Modified Change Topology Report value, the attr->addr
> > must point to a byte where the value will be stored or retrieved from.
> > +8.36 KVM_CAP_PRIVATE_MEM
> > +------------------------
> > +
> > +:Architectures: x86
> > +
> > +This capability indicates that private memory is supported and userspace can
> > +set KVM_MEM_PRIVATE flag for KVM_SET_USER_MEMORY_REGION ioctl. See
> > +KVM_SET_USER_MEMORY_REGION for details on the usage of KVM_MEM_PRIVATE and
> > +kvm_userspace_memory_region_ext fields.
> > +
> > 9. Known KVM API problems
> > =========================
> > diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> > index 67be7f217e37..8d2bd455c0cd 100644
> > --- a/arch/x86/kvm/Kconfig
> > +++ b/arch/x86/kvm/Kconfig
> > @@ -49,6 +49,8 @@ config KVM
> > select SRCU
> > select INTERVAL_TREE
> > select HAVE_KVM_PM_NOTIFIER if PM
> > + select HAVE_KVM_RESTRICTED_MEM if X86_64
> > + select RESTRICTEDMEM if HAVE_KVM_RESTRICTED_MEM
> > help
> > Support hosting fully virtualized guest machines using hardware
> > virtualization extensions. You will need a fairly recent
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 4bd5f8a751de..02ad31f46dd7 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -12425,7 +12425,7 @@ void __user * __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa,
> > }
> > for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
> > - struct kvm_userspace_memory_region m;
> > + struct kvm_user_mem_region m;
> > m.slot = id | (i << 16);
> > m.flags = 0;
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index 32f259fa5801..739a7562a1f3 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -44,6 +44,7 @@
> > #include <asm/kvm_host.h>
> > #include <linux/kvm_dirty_ring.h>
> > +#include <linux/restrictedmem.h>
> > #ifndef KVM_MAX_VCPU_IDS
> > #define KVM_MAX_VCPU_IDS KVM_MAX_VCPUS
> > @@ -575,8 +576,16 @@ struct kvm_memory_slot {
> > u32 flags;
> > short id;
> > u16 as_id;
> > + struct file *restricted_file;
> > + loff_t restricted_offset;
> > + struct restrictedmem_notifier notifier;
> > };
> > +static inline bool kvm_slot_can_be_private(const struct kvm_memory_slot *slot)
> > +{
> > + return slot && (slot->flags & KVM_MEM_PRIVATE);
> > +}
> > +
>
> We can introduce this function in patch 6 when it's first used.
Sounds good to me.
Chao
>
>
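
For readers following the uAPI part of the quoted patch, here is a minimal
userspace sketch of how the extended region could be filled in. The struct
layouts and the KVM_MEM_PRIVATE value are copied from the quoted api.rst hunk
rather than taken from an installed <linux/kvm.h>. The helper names
make_private_region() and region_is_private() are hypothetical and exist only
for illustration. The static assertions check that the original struct sits at
offset 0 of the '_ext' variant, which is what lets one layout serve both.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copies of the layouts quoted in the patch (assumption: a real
 * build would get these from the patched uapi header instead). */
struct kvm_userspace_memory_region {
	uint32_t slot;
	uint32_t flags;
	uint64_t guest_phys_addr;
	uint64_t memory_size;    /* bytes */
	uint64_t userspace_addr; /* start of the userspace allocated memory */
};

struct kvm_userspace_memory_region_ext {
	struct kvm_userspace_memory_region region;
	uint64_t restricted_offset;
	uint32_t restricted_fd;
	uint32_t pad1;
	uint64_t pad2[14];
};

#define KVM_MEM_PRIVATE (1UL << 2)

/* The legacy struct is a strict prefix of the extended one. */
static_assert(offsetof(struct kvm_userspace_memory_region_ext, region) == 0,
	      "legacy region must be the first member");
static_assert(offsetof(struct kvm_userspace_memory_region_ext,
		       restricted_offset) ==
	      sizeof(struct kvm_userspace_memory_region),
	      "new fields must start right after the legacy struct");
static_assert(sizeof(struct kvm_userspace_memory_region_ext) ==
	      sizeof(struct kvm_userspace_memory_region) + 16 * sizeof(uint64_t),
	      "extension adds offset, fd+pad1 and 14 padding words");

/* Hypothetical helper: build a private-memory slot description.
 * Zeroing first keeps pad1/pad2 cleared. */
static struct kvm_userspace_memory_region_ext
make_private_region(uint32_t slot, uint64_t gpa, uint64_t size, uint64_t hva,
		    uint32_t restricted_fd, uint64_t restricted_offset)
{
	struct kvm_userspace_memory_region_ext ext;

	memset(&ext, 0, sizeof(ext));
	ext.region.slot = slot;
	ext.region.flags = KVM_MEM_PRIVATE;
	ext.region.guest_phys_addr = gpa;
	ext.region.memory_size = size;
	ext.region.userspace_addr = hva; /* shared part may still use the hva */
	ext.restricted_fd = restricted_fd;
	ext.restricted_offset = restricted_offset;
	return ext;
}

/* Hypothetical userspace analogue of the flag test in the quoted
 * kvm_slot_can_be_private(): NULL-safe check of KVM_MEM_PRIVATE. */
static int region_is_private(const struct kvm_userspace_memory_region_ext *ext)
{
	return ext && (ext->region.flags & KVM_MEM_PRIVATE) != 0;
}
```

Per the quoted documentation, a VMM would then pass a pointer to this struct to
the KVM_SET_USER_MEMORY_REGION ioctl, after checking KVM_CAP_PRIVATE_MEM. That
call is left out here because it needs a kernel carrying this series.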