From: Sean Christopherson <seanjc@google.com>
To: Yan Zhao <yan.y.zhao@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Marc Zyngier <maz@kernel.org>,
Oliver Upton <oliver.upton@linux.dev>,
Huacai Chen <chenhuacai@kernel.org>,
Michael Ellerman <mpe@ellerman.id.au>,
Anup Patel <anup@brainfault.org>,
Paul Walmsley <paul.walmsley@sifive.com>,
Palmer Dabbelt <palmer@dabbelt.com>,
Albert Ou <aou@eecs.berkeley.edu>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
Paul Moore <paul@paul-moore.com>,
James Morris <jmorris@namei.org>,
"Serge E. Hallyn" <serge@hallyn.com>,
kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
kvmarm@lists.linux.dev, linux-mips@vger.kernel.org,
linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org, linux-security-module@vger.kernel.org,
linux-kernel@vger.kernel.org,
Chao Peng <chao.p.peng@linux.intel.com>,
Fuad Tabba <tabba@google.com>,
Jarkko Sakkinen <jarkko@kernel.org>,
Anish Moorthy <amoorthy@google.com>,
Yu Zhang <yu.c.zhang@linux.intel.com>,
Isaku Yamahata <isaku.yamahata@intel.com>,
Xu Yilun <yilun.xu@intel.com>, Vlastimil Babka <vbabka@suse.cz>,
Vishal Annapurve <vannapurve@google.com>,
Ackerley Tng <ackerleytng@google.com>,
Maciej Szmigiero <mail@maciej.szmigiero.name>,
David Hildenbrand <david@redhat.com>,
Quentin Perret <qperret@google.com>,
Michael Roth <michael.roth@amd.com>, Wang <wei.w.wang@intel.com>,
Liam Merwick <liam.merwick@oracle.com>,
Isaku Yamahata <isaku.yamahata@gmail.com>,
"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [RFC PATCH v12 18/33] KVM: x86/mmu: Handle page fault for private memory
Date: Thu, 21 Sep 2023 07:59:10 -0700
Message-ID: <ZQxaPsNiFgfXH7aq@google.com>
In-Reply-To: <ZQef3zAEYjZMkn9k@yzhao56-desk.sh.intel.com>

On Mon, Sep 18, 2023, Yan Zhao wrote:
> On Fri, Sep 15, 2023 at 07:26:16AM -0700, Sean Christopherson wrote:
> > On Fri, Sep 15, 2023, Yan Zhao wrote:
> > > > static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> > > > {
> > > > struct kvm_memory_slot *slot = fault->slot;
> > > > @@ -4293,6 +4356,14 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
> > > > return RET_PF_EMULATE;
> > > > }
> > > >
> > > > + if (fault->is_private != kvm_mem_is_private(vcpu->kvm, fault->gfn)) {
> > > In patch 21,
> > > fault->is_private is set as:
> > > ".is_private = kvm_mem_is_private(vcpu->kvm, cr2_or_gpa >> PAGE_SHIFT)",
> > > then, the inequality here means the memory attribute has been updated after
> > > the last check.
> > > So, why is an exit to user space for conversion required instead of a mere retry?
> > >
> > > Or, is it because how .is_private is assigned in patch 21 is subject to change
> > > in the future?
> >
> > This. Retrying on SNP or TDX would hang the guest. I suppose we could special
> Is this because if the guest accesses a page in a private way (e.g. via
> private key in TDX), the returned page must be a private page?

Yes, the returned page must be private, because the GHCI (TDX) and GHCB (SNP)
require that the host allow implicit conversions. I.e. if the guest accesses
memory as private (or shared), then the host must map memory as private (or shared).
Simply resuming the guest will not change the guest access, nor will it change KVM's
memory attributes.

Ideally (IMO), implicit conversions would be disallowed, but even if implicit
conversions weren't a thing, retrying would still be wrong as KVM would either
inject an exception into the guest or exit to userspace to let userspace handle
the illegal access.
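
To illustrate why a bare retry cannot make progress, here is a minimal
standalone sketch (a userspace toy model, not the KVM code). All names
(toy_vm, toy_faultin, toy_convert, the FAULT_* outcomes) are hypothetical and
exist only for this example; the one assumption is that the private vs. shared
state of the access is fixed by the guest, while the attribute is only changed
by userspace.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome codes for this sketch; not KVM's RET_PF_* values. */
enum fault_outcome {
	FAULT_MAP_PRIVATE,	/* access and attribute agree: map a private page */
	FAULT_MAP_SHARED,	/* access and attribute agree: map a shared page */
	FAULT_EXIT_TO_USER,	/* mismatch: userspace must convert (or reject) */
};

/* Toy stand-in for per-page memory attributes: one flag per GFN. */
#define TOY_NR_GFNS 16
struct toy_vm {
	bool gfn_is_private[TOY_NR_GFNS];
};

/*
 * Resolve one fault. 'access_is_private' is how the guest accessed the page.
 * Nothing on the host side changes it, so re-entering the guest without
 * changing the attribute would simply hit the same mismatch again.
 */
static enum fault_outcome toy_faultin(const struct toy_vm *vm, uint64_t gfn,
				      bool access_is_private)
{
	bool attr_is_private = vm->gfn_is_private[gfn % TOY_NR_GFNS];

	if (access_is_private != attr_is_private)
		return FAULT_EXIT_TO_USER;

	return attr_is_private ? FAULT_MAP_PRIVATE : FAULT_MAP_SHARED;
}

/* What userspace might do on the exit: make the attribute match the access. */
static void toy_convert(struct toy_vm *vm, uint64_t gfn, bool to_private)
{
	vm->gfn_is_private[gfn % TOY_NR_GFNS] = to_private;
}
```

In this model, calling toy_faultin() repeatedly with the same arguments returns
FAULT_EXIT_TO_USER every time until toy_convert() runs, which is the "hang"
a retry-only policy would produce.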

> > case VMs where .is_private is derived from the memory attributes, but the
> > SW_PROTECTED_VM type is primarily a development vehicle at this point. I'd like to
> > have it mimic SNP/TDX as much as possible; performance is a secondary concern.
>
> Ok. But this mimicry is somewhat confusing, as it may be problematic in the scenario
> below, though a sane guest should ensure no one is accessing a page before doing
> memory conversion.
>
>
> CPU 0                           CPU 1
> access GFN A in private way
> fault->is_private=true
>                                 convert GFN A to shared
>                                 set memory attribute of A to shared
>
> faultin, mismatch and exit
> set memory attribute of A
> to private
>
>                                 vCPU access GFN A in shared way
>                                 fault->is_private = true
>                                 faultin, match and map a private PFN B
>
>                                 vCPU accesses private PFN B in shared way

If this is a TDX or SNP VM, then the private vs. shared information comes from
the guest itself, e.g. this sequence

    vCPU access GFN A in shared way
    fault->is_private = true

cannot happen because is_private will be false based on the error code (SNP) or
the GPA (TDX).
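
As a rough sketch of that distinction (again a standalone toy, not the KVM
code), is_private could be modeled as a function of the VM type. Every name
here is hypothetical, and the two masks are deliberately left as parameters:
which error code bit or GPA bit the hardware actually uses is an assumption
the caller has to supply, not something this example asserts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical VM types for this sketch. */
enum toy_vm_type {
	TOY_VM_SNP,		/* private vs. shared reported in the fault error code */
	TOY_VM_TDX,		/* private vs. shared encoded in the faulting GPA */
	TOY_VM_SW_PROTECTED,	/* no hardware notion: fall back to memory attributes */
};

struct toy_fault_cfg {
	enum toy_vm_type type;
	uint64_t err_private_mask;	/* assumed: error code bit set on a private access */
	uint64_t gpa_shared_mask;	/* assumed: GPA bit set on a shared access */
};

/*
 * Derive is_private for one fault. For the SNP-like and TDX-like types the
 * answer comes from the guest's access itself, so the current attribute
 * ('attr_is_private') is ignored. Only the software-protected type derives
 * it from the attribute, which is why it cannot see an "implicit conversion".
 */
static bool toy_fault_is_private(const struct toy_fault_cfg *cfg,
				 uint64_t error_code, uint64_t gpa,
				 bool attr_is_private)
{
	switch (cfg->type) {
	case TOY_VM_SNP:
		return (error_code & cfg->err_private_mask) != 0;
	case TOY_VM_TDX:
		return (gpa & cfg->gpa_shared_mask) == 0;
	case TOY_VM_SW_PROTECTED:
	default:
		return attr_is_private;
	}
}
```

With a TDX-like config, a fault on a GPA that has the shared bit set yields
is_private == false no matter what the attribute says, so the
"shared access, is_private = true" step in the quoted sequence has no way to
occur in this model.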

And when hardware doesn't generate page faults based on private vs. shared, i.e.
for non-TDX/SNP VMs, from a fault handling perspective there is no concept of the
guest accessing a GFN in a "private way" or a "shared way". I.e. there are no
implicit conversions.

For SEV and SEV-ES, the guest can access memory as private vs. shared, but the
guest and the host VMM absolutely must be in agreement and synchronized with
respect to the state of a page, otherwise guest memory will be corrupted. But that
has nothing to do with the fault handling, e.g. creating aliases in the guest to
access a single GFN as shared and private from two CPUs will create incoherent
cache entries and/or corrupt data without any involvement from KVM.

In other words, the above isn't possible for TDX/SNP, and for all other types,
the conflict between CPU0 and CPU1 is unequivocally a guest bug.