From: Michael Roth <michael.roth@amd.com>
To: "Huang, Kai" <kai.huang@intel.com>
Cc: "isaku.yamahata@gmail.com" <isaku.yamahata@gmail.com>,
"tglx@linutronix.de" <tglx@linutronix.de>,
"tobin@ibm.com" <tobin@ibm.com>,
"liam.merwick@oracle.com" <liam.merwick@oracle.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"Luck, Tony" <tony.luck@intel.com>,
"jmattson@google.com" <jmattson@google.com>,
"Lutomirski, Andy" <luto@kernel.org>,
"ak@linux.intel.com" <ak@linux.intel.com>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
"pgonda@google.com" <pgonda@google.com>,
"srinivas.pandruvada@linux.intel.com"
<srinivas.pandruvada@linux.intel.com>,
"slp@redhat.com" <slp@redhat.com>,
"rientjes@google.com" <rientjes@google.com>,
"alpergun@google.com" <alpergun@google.com>,
"peterz@infradead.org" <peterz@infradead.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"dovmurik@linux.ibm.com" <dovmurik@linux.ibm.com>,
"thomas.lendacky@amd.com" <thomas.lendacky@amd.com>,
"Wang, Zhi A" <zhi.a.wang@intel.com>,
"x86@kernel.org" <x86@kernel.org>, "bp@alien8.de" <bp@alien8.de>,
"Annapurve, Vishal" <vannapurve@google.com>,
"dgilbert@redhat.com" <dgilbert@redhat.com>, "Christopherson,,
Sean" <seanjc@google.com>,
"vkuznets@redhat.com" <vkuznets@redhat.com>,
"vbabka@suse.cz" <vbabka@suse.cz>,
"marcorr@google.com" <marcorr@google.com>,
"ashish.kalra@amd.com" <ashish.kalra@amd.com>,
"linux-coco@lists.linux.dev" <linux-coco@lists.linux.dev>,
"nikunj.dadhania@amd.com" <nikunj.dadhania@amd.com>,
"Rodel, Jorg" <jroedel@suse.de>,
"mingo@redhat.com" <mingo@redhat.com>,
"sathyanarayanan.kuppuswamy@linux.intel.com"
<sathyanarayanan.kuppuswamy@linux.intel.com>,
"hpa@zytor.com" <hpa@zytor.com>,
"kirill@shutemov.name" <kirill@shutemov.name>,
"jarkko@kernel.org" <jarkko@kernel.org>,
"ardb@kernel.org" <ardb@kernel.org>,
"linux-crypto@vger.kernel.org" <linux-crypto@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>
Subject: Re: [PATCH RFC v9 04/51] KVM: x86: Determine shared/private faults using a configurable mask
Date: Thu, 22 Jun 2023 10:32:29 -0500 [thread overview]
Message-ID: <20230622153229.vjkrzi6rgiolstns@amd.com> (raw)
In-Reply-To: <c116b7ab8ca02116f2b8d19a8214161c3b30576c.camel@intel.com>
On Thu, Jun 22, 2023 at 09:55:22AM +0000, Huang, Kai wrote:
>
> >
> > So if we were to straight-forwardly implement that based on how TDX
> > currently handles checking for the shared bit in GPA, paired with how
> > SEV-SNP handles checking for private bit in fault flags, it would look
> > something like:
> >
> > bool kvm_fault_is_private(kvm, gpa, err)
> > {
> > /* SEV-SNP handling */
> > if (kvm->arch.mmu_private_fault_mask)
> > return !!(err & arch.mmu_private_fault_mask);
> >
> > /* TDX handling */
> > if (kvm->arch.gfn_shared_mask)
> > return !!(gpa & arch.gfn_shared_mask);
>
> The logic of the two are identical. I think they need to be converged.
I think they're just different enough that trying too hard to converge
them might obfuscate things. If the determination didn't come from 2
completely different fields (gpa vs. fault flags) maybe it could be
simplified a bit more, but have well-defined open-coded handler that
gets called once to set fault->is_private during initial fault time so
that that ugliness never needs to be looked at again by KVM MMU seems
like a good way to keep things simple through the rest of the handling.
>
> Either SEV-SNP should convert the error code private bit to the gfn_shared_mask,
> or TDX's shared bit should be converted to some private error bit.
struct kvm_page_fault seems to be the preferred way to pass additional
params/metadata around, and .is_private field was introduced to track
this private/shared state as part of UPM base series:
https://lore.kernel.org/lkml/20221202061347.1070246-9-chao.p.peng@linux.intel.com/
So it seems like unecessary complexity to track/encode that state into
other additional places rather than just encapsulating it all in
fault->is_private (or some similar field), and synthesizing all this
platform-specific handling into a well-defined value that's conveyed
by something like fault->is_private in a way where KVM MMU doesn't need
to worry as much about platform-specific stuff seems like a good thing,
and in line with what the UPM base series was trying to do by adding the
fault->is_private field.
So all I'm really proposing is that whatever SNP and TDX end up doing
should center around setting that fault->is_private field and keeping
everything contained there. If there are better ways to handle *how*
that's done I don't have any complaints there, but moving/adding bits
to GPA/error_flags after fault time just seems unecessary to me when
fault->is_private field can serve that purpose just as well.
>
> Perhaps converting SEV-SNP makes more sense because if I recall correctly SEV
> guest also has a C-bit, correct?
That's correct, but the C-bit doesn't show in the GPA that gets passed
up to KVM during an #NPF, and instead gets encoded into the fault's
error_flags.
>
> Or, ...
>
> >
> > return false;
> > }
> >
> > kvm_mmu_do_page_fault(vcpu, gpa, err, ...)
> > {
> > struct kvm_page_fault fault = {
> > ...
> > .is_private = kvm_fault_is_private(vcpu->kvm, gpa, err)
>
> ... should we do something like:
>
> .is_private = static_call(kvm_x86_fault_is_private)(vcpu->kvm, gpa,
> err);
We actually had exactly this in v7 of SNP hypervisor patches:
https://lore.kernel.org/linux-coco/20221214194056.161492-7-michael.roth@amd.com/T/#m17841f5bfdfb8350d69d78c6831dd8f3a4748638
but Sean was hoping to avoid a callback, which is why we ended up using
a bitmask in this version since it basically all that callback would
need to do. It's unfortunately that we don't have a common scheme
between SNP/TDX, but maybe that's still possible, I just think that
whatever that ends up being, it should live and be contained inside
whatever helper ends up setting fault->is_private.
There's some other awkwardness with a callback approach. It sort of ties
into your question about selftests so I'll address it below...
>
> ?
>
> > };
> >
> > ...
> > }
> >
> > And then arch.mmu_private_fault_mask and arch.gfn_shared_mask would be
> > set per-KVM-instance, just like they are now with current SNP and TDX
> > patchsets, since stuff like KVM self-test wouldn't be setting those
> > masks, so it makes sense to do it per-instance in that regard.
> >
> > But that still gets a little awkward for the KVM self-test use-case where
> > .is_private should sort of be ignored in favor of whatever the xarray
> > reports via kvm_mem_is_private().
> >
>
> I must have missed something. Why does KVM self-test have impact to how does
> KVM handles private fault?
The self-tests I'm referring to here are the ones from Vishal that shipped with
v10 of Chao's UPM/fd-based private memory series, and also as part of Sean's
gmem tree:
https://github.com/sean-jc/linux/commit/a0f5f8c911804f55935094ad3a277301704330a6
These exercise gmem/UPM handling without the need for any SNP/TDX
hardware support. They do so by "trusting" the shared/private state
that the VMM sets via KVM_SET_MEMORY_ATTRIBUTES. So if VMM says it
should be private, KVM MMU will treat it as private, so we'd never
get a mismatch, so KVM_EXIT_MEMORY_FAULT will never be generated.
>
> > In your Misc. series I believe you
> > handled this by introducing a PFERR_HASATTR_MASK bit so we can determine
> > whether existing value of fault->is_private should be
> > ignored/overwritten or not.
> >
> > So maybe kvm_fault_is_private() needs to return an integer value
> > instead, like:
> >
> > enum {
> > KVM_FAULT_VMM_DEFINED,
> > KVM_FAULT_SHARED,
> > KVM_FAULT_PRIVATE,
> > }
> >
> > bool kvm_fault_is_private(kvm, gpa, err)
> > {
> > /* SEV-SNP handling */
> > if (kvm->arch.mmu_private_fault_mask)
> > (err & arch.mmu_private_fault_mask) ? KVM_FAULT_PRIVATE : KVM_FAULT_SHARED
> >
> > /* TDX handling */
> > if (kvm->arch.gfn_shared_mask)
> > (gpa & arch.gfn_shared_mask) ? KVM_FAULT_SHARED : KVM_FAULT_PRIVATE
> >
> > return KVM_FAULT_VMM_DEFINED;
> > }
> >
> > And then down in __kvm_faultin_pfn() we do:
> >
> > if (fault->is_private == KVM_FAULT_VMM_DEFINED)
> > fault->is_private = kvm_mem_is_private(vcpu->kvm, fault->gfn);
> > else if (fault->is_private != kvm_mem_is_private(vcpu->kvm, fault->gfn))
> > return kvm_do_memory_fault_exit(vcpu, fault);
> >
> > if (fault->is_private)
> > return kvm_faultin_pfn_private(vcpu, fault);
>
>
> What does KVM_FAULT_VMM_DEFINED mean, exactly?
>
> Shouldn't the fault type come from _hardware_?
In above self-test use-case, there is no reliance on hardware support,
and fault->is_private should always be treated as being whatever was
set by the VMM via KVM_SET_MEMORY_ATTRIBUTES, so that's why I proposed
the KVM_FAULT_VMM_DEFINED value to encode that case into
fault->is_private so KVM MMU and handle protected self-test VMs of this
sort.
In the future, this protected self-test VMs might become the basis of
a new protected VM type where some sort of guest-issued hypercall can
be used to set whether a fault should be treated as shared/private,
rather than relying on VMM-defined value. There's some current discussion
about that here:
https://lore.kernel.org/lkml/20230620190443.GU2244082@ls.amr.corp.intel.com/T/#me627bed3d9acf73ea882e8baa76dfcb27759c440
Going back to your callback question above, that makes things a little
awkward, since kvm_x86_ops is statically defined for both
kvm_amd/kvm_intel modules, and either can run these self-tests guests as
well as these proposed "non-CC VMs" which rely on enlightened guest
kernels instead of TDX/SNPhardware support for managing private/shared
access.
So you either need to duplicate the handling for how to determine
private/shared for these other types into the kvm_intel/kvm_amd callbacks,
or have some way for the callback to say to "fall back to the common
handling for self-tests and non-CC VMs". The latter is what we implemented
in v8 of this series, but Isaku suggested it was a bit too heavyweight
and proposed dropping the fall-back logic in favor of updating the
kvm_x86_ops at run-time once we know whether or not it's a TDX/SNP guest:
https://lkml.iu.edu/hypermail/linux/kernel/2303.2/03009.html
which could work, but it still doesn't address Sean's desire to avoid
callbacks completely, and still amounts to a somewhat convulated way
to hide away TDX/SNP-specific bit checks for shared/private. Rather
than hide them away in callbacks that are already frowned upon by
maintainer, I think it makes sense to "open-code" all these checks in a
common handler like kvm_fault_is_private() to we can make some progress
toward a consensus, and then iterate on it from there rather than
refining what may already be a dead-end path.
-Mike
next prev parent reply other threads:[~2023-06-22 15:33 UTC|newest]
Thread overview: 102+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-12 4:25 [PATCH RFC v9 00/51] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 01/51] KVM: x86: Add gmem hook for initializing private memory Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 02/51] KVM: x86: Add gmem hook for invalidating " Michael Roth
2023-06-12 10:49 ` Borislav Petkov
2023-06-19 13:39 ` Borislav Petkov
2023-06-12 4:25 ` [PATCH RFC v9 03/51] KVM: x86: Use full 64-bit error code for kvm_mmu_do_page_fault Michael Roth
2023-06-14 14:24 ` Isaku Yamahata
2023-06-12 4:25 ` [PATCH RFC v9 04/51] KVM: x86: Determine shared/private faults using a configurable mask Michael Roth
2023-06-14 16:47 ` Isaku Yamahata
2023-06-20 20:28 ` Michael Roth
2023-06-20 21:18 ` Isaku Yamahata
2023-06-21 23:00 ` Michael Roth
2023-06-22 8:01 ` Isaku Yamahata
2023-06-22 9:55 ` Huang, Kai
2023-06-22 15:32 ` Michael Roth [this message]
2023-06-22 22:31 ` Huang, Kai
2023-06-22 23:39 ` Isaku Yamahata
2023-06-22 23:52 ` Huang, Kai
2023-06-23 14:43 ` Isaku Yamahata
2023-06-19 16:27 ` Borislav Petkov
2023-06-20 20:36 ` Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 05/51] x86/coco: move CONFIG_HAS_CC_PLATFORM check down into coco/Makefile Michael Roth
2023-06-12 7:07 ` Kirill A . Shutemov
2023-06-20 12:09 ` Borislav Petkov
2023-06-20 20:43 ` Michael Roth
2023-06-21 8:54 ` Borislav Petkov
2023-06-29 21:02 ` Michael Roth
2023-07-10 3:05 ` Sathyanarayanan Kuppuswamy
2023-07-10 13:11 ` Tom Lendacky
2023-06-12 4:25 ` [PATCH RFC v9 06/51] x86/cpufeatures: Add SEV-SNP CPU feature Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 07/51] x86/sev: Add the host SEV-SNP initialization support Michael Roth
2023-06-12 15:34 ` Dave Hansen
2023-06-21 9:15 ` Borislav Petkov
2023-06-21 14:31 ` Dave Hansen
2023-06-21 15:59 ` Borislav Petkov
2023-06-21 9:42 ` Borislav Petkov
2023-06-21 14:36 ` Tom Lendacky
2023-06-21 19:15 ` Kalra, Ashish
2023-08-09 13:03 ` Jeremi Piotrowski
2023-06-12 4:25 ` [PATCH RFC v9 08/51] x86/speculation: Do not enable Automatic IBRS if SEV SNP is enabled Michael Roth
2023-06-12 15:39 ` Dave Hansen
2023-07-18 22:34 ` Kim Phillips
2023-07-18 23:17 ` Dave Hansen
2023-07-20 19:11 ` Kim Phillips
2023-07-20 22:24 ` Dave Hansen
2023-07-21 16:56 ` Kim Phillips
2023-06-12 4:25 ` [PATCH RFC v9 09/51] x86/sev: Add RMP entry lookup helpers Michael Roth
2023-06-12 16:08 ` Dave Hansen
2023-06-30 21:57 ` Michael Roth
2023-06-30 22:29 ` Dave Hansen
2023-06-12 4:25 ` [PATCH RFC v9 10/51] x86/fault: Add helper for dumping RMP entries Michael Roth
2023-06-12 16:12 ` Dave Hansen
2023-06-12 4:25 ` [PATCH RFC v9 11/51] x86/traps: Define RMP violation #PF error code Michael Roth
2023-06-12 16:26 ` Dave Hansen
2023-06-12 4:25 ` [PATCH RFC v9 12/51] x86/fault: Report RMP page faults for kernel addresses Michael Roth
2023-06-12 16:30 ` Dave Hansen
2023-06-12 4:25 ` [PATCH RFC v9 13/51] x86/fault: Handle RMP page faults for user addresses Michael Roth
2023-06-12 16:40 ` Dave Hansen
2023-06-12 4:25 ` [PATCH RFC v9 14/51] x86/sev: Add helper functions for RMPUPDATE and PSMASH instruction Michael Roth
2023-06-12 17:00 ` Dave Hansen
2023-06-12 4:25 ` [PATCH RFC v9 15/51] x86/sev: Invalidate pages from the direct map when adding them to the RMP table Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 16/51] crypto: ccp: Define the SEV-SNP commands Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 17/51] crypto: ccp: Add support to initialize the AMD-SP for SEV-SNP Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 18/51] crypto: ccp: Provide API to issue SEV and SNP commands Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 19/51] x86/sev: Introduce snp leaked pages list Michael Roth
2023-08-09 12:46 ` Jeremi Piotrowski
2023-06-12 4:25 ` [PATCH RFC v9 20/51] crypto: ccp: Handle the legacy TMR allocation when SNP is enabled Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 21/51] crypto: ccp: Handle the legacy SEV command " Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 22/51] crypto: ccp: Add the SNP_PLATFORM_STATUS command Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 23/51] KVM: SEV: Select CONFIG_KVM_PROTECTED_VM when CONFIG_KVM_AMD_SEV=y Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 24/51] KVM: SVM: Add support to handle AP reset MSR protocol Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 25/51] KVM: SVM: Add GHCB handling for Hypervisor Feature Support requests Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 26/51] KVM: SVM: Make AVIC backing, VMSA and VMCB memory allocation SNP safe Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 27/51] KVM: SVM: Add initial SEV-SNP support Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 28/51] KVM: SVM: Add KVM_SNP_INIT command Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 29/51] KVM: SVM: Add KVM_SEV_SNP_LAUNCH_START command Michael Roth
2023-06-12 17:08 ` Peter Gonda
2023-06-12 4:25 ` [PATCH RFC v9 30/51] KVM: Add HVA range operator Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 31/51] KVM: Split out memory attribute xarray updates to helper function Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 32/51] KVM: SVM: Add KVM_SEV_SNP_LAUNCH_UPDATE command Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 33/51] KVM: SVM: Add KVM_SEV_SNP_LAUNCH_FINISH command Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 34/51] KVM: SVM: Add support to handle GHCB GPA register VMGEXIT Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 35/51] KVM: SVM: Add KVM_EXIT_VMGEXIT Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 36/51] KVM: SVM: Add support to handle MSR based Page State Change VMGEXIT Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 37/51] KVM: SVM: Add support to handle " Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 38/51] KVM: x86: Export the kvm_zap_gfn_range() for the SNP use Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 39/51] KVM: x86: Define RMP page fault error bits for #NPF Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 40/51] KVM: SVM: Add support to handle RMP nested page faults Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 41/51] KVM: SVM: Use a VMSA physical address variable for populating VMCB Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 42/51] KVM: SVM: Support SEV-SNP AP Creation NAE event Michael Roth
2023-08-15 16:00 ` Peter Gonda
2023-06-12 4:25 ` [PATCH RFC v9 43/51] KVM: SEV: Configure MMU to check for private fault flags Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 44/51] KVM: SEV: Implement gmem hook for initializing private pages Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 45/51] KVM: SEV: Implement gmem hook for invalidating " Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 46/51] KVM: SVM: Add module parameter to enable the SEV-SNP Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 47/51] iommu/amd: Add IOMMU_SNP_SHUTDOWN support Michael Roth
2023-09-07 10:31 ` Suthikulpanit, Suravee
2023-06-12 4:25 ` [PATCH RFC v9 48/51] crypto: ccp: Add the SNP_{SET,GET}_EXT_CONFIG command Michael Roth
2023-06-13 6:24 ` Alexey Kardashevskiy
2023-06-12 4:25 ` [PATCH RFC v9 49/51] x86/sev: Add KVM commands for per-instance certs Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 50/51] KVM: SVM: Provide support for SNP_GUEST_REQUEST NAE event Michael Roth
2023-06-12 4:25 ` [PATCH RFC v9 51/51] crypto: ccp: Add debug support for decrypting pages Michael Roth
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230622153229.vjkrzi6rgiolstns@amd.com \
--to=michael.roth@amd.com \
--cc=ak@linux.intel.com \
--cc=alpergun@google.com \
--cc=ardb@kernel.org \
--cc=ashish.kalra@amd.com \
--cc=bp@alien8.de \
--cc=dave.hansen@linux.intel.com \
--cc=dgilbert@redhat.com \
--cc=dovmurik@linux.ibm.com \
--cc=hpa@zytor.com \
--cc=isaku.yamahata@gmail.com \
--cc=jarkko@kernel.org \
--cc=jmattson@google.com \
--cc=jroedel@suse.de \
--cc=kai.huang@intel.com \
--cc=kirill@shutemov.name \
--cc=kvm@vger.kernel.org \
--cc=liam.merwick@oracle.com \
--cc=linux-coco@lists.linux.dev \
--cc=linux-crypto@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=luto@kernel.org \
--cc=marcorr@google.com \
--cc=mingo@redhat.com \
--cc=nikunj.dadhania@amd.com \
--cc=pbonzini@redhat.com \
--cc=peterz@infradead.org \
--cc=pgonda@google.com \
--cc=rientjes@google.com \
--cc=sathyanarayanan.kuppuswamy@linux.intel.com \
--cc=seanjc@google.com \
--cc=slp@redhat.com \
--cc=srinivas.pandruvada@linux.intel.com \
--cc=tglx@linutronix.de \
--cc=thomas.lendacky@amd.com \
--cc=tobin@ibm.com \
--cc=tony.luck@intel.com \
--cc=vannapurve@google.com \
--cc=vbabka@suse.cz \
--cc=vkuznets@redhat.com \
--cc=x86@kernel.org \
--cc=zhi.a.wang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox