Re: [RFC V1 PATCH 0/5] selftests: KVM: selftests for fd-based approach of supporting private memory

From: Chao Peng <chao.p.peng@linux.intel.com>
To: Michael Roth <michael.roth@amd.com>
Cc: Andy Lutomirski <luto@kernel.org>,
	Vishal Annapurve <vannapurve@google.com>,
	the arch/x86 maintainers <x86@kernel.org>,
	kvm list <kvm@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-kselftest@vger.kernel.org,
	Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	shauh@kernel.org, yang.zhong@intel.com, drjones@redhat.com,
	ricarkol@google.com, aaronlewis@google.com, wei.w.wang@intel.com,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Jonathan Corbet <corbet@lwn.net>, Hugh Dickins <hughd@google.com>,
	Jeff Layton <jlayton@kernel.org>,
	"J . Bruce Fields" <bfields@fieldses.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Yu Zhang <yu.c.zhang@linux.intel.com>,
	"Nakajima, Jun" <jun.nakajima@intel.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Quentin Perret <qperret@google.com>,
	Steven Price <steven.price@arm.com>,
	Andi Kleen <ak@linux.intel.com>,
	David Hildenbrand <david@redhat.com>,
	Vlastimil Babka <vbabka@suse.cz>, Marc Orr <marcorr@google.com>,
	Erdem Aktas <erdemaktas@google.com>,
	Peter Gonda <pgonda@google.com>,
	Sean Christopherson <seanjc@google.com>,
	diviness@google.com
Subject: Re: [RFC V1 PATCH 0/5] selftests: KVM: selftests for fd-based approach of supporting private memory
Date: Thu, 14 Apr 2022 18:07:50 +0800	[thread overview]
Message-ID: <20220414100750.GA16626@chaop.bj.intel.com> (raw)
In-Reply-To: <20220413134200.ms5lscs7lvvih7a5@amd.com>

On Wed, Apr 13, 2022 at 08:42:00AM -0500, Michael Roth wrote:
> On Tue, Apr 12, 2022 at 05:16:22PM -0700, Andy Lutomirski wrote:
> > On Fri, Apr 8, 2022, at 2:05 PM, Vishal Annapurve wrote:
> > > This series implements selftests targeting the feature floated by Chao
> > > via:
> > > https://lore.kernel.org/linux-mm/20220310140911.50924-1-chao.p.peng@linux.intel.com/
> > >
> > > Below changes aim to test the fd based approach for guest private memory
> > > in context of normal (non-confidential) VMs executing on non-confidential
> > > platforms.
> > >
> > > Confidential platforms along with the confidentiality aware software
> > > stack support a notion of private/shared accesses from the confidential
> > > VMs. Generally, a bit in the GPA conveys the shared/private-ness of the
> > > access. Non-confidential platforms don't have a notion of private or
> > > shared accesses from the guest VMs. To support this notion,
> > > KVM_HC_MAP_GPA_RANGE is modified to allow marking an access from a VM
> > > within a GPA range as always shared or private. Any suggestions
> > > regarding implementing this ioctl alternatively/cleanly are appreciated.
> > 
> > This is fantastic.  I do think we need to decide how this should work in general.  We have a few platforms with somewhat different properties:
> > 
> > TDX: The guest decides, per memory access (using a GPA bit), whether an access is private or shared.  In principle, the same address could be *both* and be distinguished by only that bit, and the two addresses would refer to different pages.
> > 
> > SEV: The guest decides, per memory access (using a GPA bit), whether an access is private or shared.  At any given time, a physical address (with that bit masked off) can be private, shared, or invalid, but it can't be valid as private and shared at the same time.
> > 
> > pKVM (currently, as I understand it): the guest decides by hypercall, in advance of an access, which addresses are private and which are shared.
> > 
> > This series, if I understood it correctly, is like TDX except with no hardware security.
> > 
> > Sean or Chao, do you have a clear sense of whether the current fd-based private memory proposal can cleanly support SEV and pKVM?  What, if anything, needs to be done on the API side to get that working well?  I don't think we need to support SEV or pKVM right away to get this merged, but I do think we should understand how the API can map to them.
> 
> I've been looking at porting the SEV-SNP hypervisor patches over to
> using memfd, and I hit an issue that I think is generally applicable
> to SEV/SEV-ES as well. Namely at guest init time we have something
> like the following flow:
> 
>   VMM:
>     - allocate shared memory to back the guest and map it into guest
>       address space
>     - initialize shared memory with initial memory contents (namely
>       the BIOS)
>     - ask KVM to encrypt these pages in-place and measure them to
>       generate the initial measured payload for attestation, via
>       KVM_SEV_LAUNCH_UPDATE with the GPA for each range of memory to
>       encrypt.
>   KVM:
>     - issue SEV_LAUNCH_UPDATE firmware command, which takes an HPA as
>       input and does an in-place encryption/measure of the page.
> 
> With the current v5 of the memfd/UPM series, I think the expected flow is
> that we would fallocate() these ranges from the private fd backend in advance
> of calling KVM_SEV_LAUNCH_UPDATE (if the VMM does it afterwards, we'd destroy
> the initial guest payload, since the pages would be replaced by newly
> allocated ones). But if the VMM does it before, the VMM has no way to
> initialize the guest memory contents, since mmap()/pwrite() are disallowed
> due to MFD_INACCESSIBLE.

OK, so for SEV, the VMM basically puts the vBIOS directly into guest memory
and then does an in-place measurement.

TDX has no problem because TDX temporarily uses a VMM buffer (vs. guest memory)
to hold the vBIOS and then asks SEAM-MODULE to measure and copy that to guest
memory.
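
To make the difference concrete, here is a rough toy model of the two
population flows. It is only a sketch: the function names, the dict of
pages and the use of SHA-256 as a "measurement" are all made up for
illustration and are not KVM, SEAM module or SEV firmware interfaces.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size for this toy model


def measure(page: bytes) -> str:
    """Stand-in for the launch measurement: a plain SHA-256 of the page."""
    return hashlib.sha256(page).hexdigest()


def copy_style_add(src_page: bytes, private_pages: dict, gfn: int) -> str:
    """TDX-like flow (hypothetical helper): the trusted module copies from a
    VMM-owned source buffer into the private page, then measures it."""
    private_pages[gfn] = bytes(src_page)
    return measure(private_pages[gfn])


def in_place_update(private_pages: dict, gfn: int) -> str:
    """SEV-like flow (hypothetical helper): measure whatever is already in
    the guest page. A page nobody could write is treated as all zeros."""
    page = private_pages.get(gfn, bytes(PAGE_SIZE))
    return measure(page)


vbios = b"\x90" * PAGE_SIZE  # placeholder payload, not a real vBIOS

# Copy style: the VMM never needs write access to the private page itself.
tdx_pages = {}
assert copy_style_add(vbios, tdx_pages, 0) == measure(vbios)

# In-place style on a page userspace could not populate: the measurement
# covers zeros rather than the intended payload.
sev_pages = {}
assert in_place_update(sev_pages, 0) == measure(bytes(PAGE_SIZE))
```

In this model the in-place flow only produces the intended measurement if
something was able to place the payload in the private page beforehand.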

Maybe something like SHM_LOCK should be used instead of the aggressive
MFD_INACCESSIBLE. Before the VMM calls SHM_LOCK on the memfd, the content
can be changed, but after that it is no longer visible to the userspace VMM.
This gives userspace a chance to modify the data in the private pages.
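
A minimal sketch of the semantics I have in mind, as a userspace toy
model. The class and method names are hypothetical and nothing here is an
existing memfd or shmctl interface; it only shows the intended two-phase
behaviour (writable until locked, inaccessible to userspace afterwards).

```python
class TwoPhaseMemfd:
    """Toy model of a memfd that userspace may populate until it is locked."""

    def __init__(self, size: int):
        self._data = bytearray(size)
        self._locked = False

    def _check(self, length: int, offset: int) -> None:
        # Reject out-of-range accesses, in the same spirit as a real fd would.
        if offset < 0 or length < 0 or offset + length > len(self._data):
            raise ValueError("access outside the file")

    def pwrite(self, data: bytes, offset: int) -> int:
        """Userspace write: allowed only before lock()."""
        if self._locked:
            raise PermissionError("memfd is locked against userspace access")
        self._check(len(data), offset)
        self._data[offset:offset + len(data)] = data
        return len(data)

    def pread(self, length: int, offset: int) -> bytes:
        """Userspace read: allowed only before lock()."""
        if self._locked:
            raise PermissionError("memfd is locked against userspace access")
        self._check(length, offset)
        return bytes(self._data[offset:offset + length])

    def lock(self) -> None:
        """One-way transition, modelled after the SHM_LOCK-like idea above."""
        self._locked = True

    def guest_read(self, length: int, offset: int) -> bytes:
        """What the guest side would still see after the lock (assumption)."""
        self._check(length, offset)
        return bytes(self._data[offset:offset + length])


fd = TwoPhaseMemfd(8192)
fd.pwrite(b"vBIOS", 0)   # the VMM populates the initial payload
fd.lock()                # from here on userspace access fails
```

With this ordering the VMM could write the vBIOS first, lock the fd, and
only then ask for the in-place encryption and measurement.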

Chao
> 
> I think something similar to your proposal[1] here of making pread()/pwrite()
> possible for private-fd-backed memory that's been flagged as "shareable"
> would work for this case. Although here the "shareable" flag could be
> removed immediately upon successful completion of the SEV_LAUNCH_UPDATE
> firmware command.
> 
> I think with TDX this isn't an issue because their analogous TDH.MEM.PAGE.ADD
> seamcall takes a pair of source/dest HPA as input params, so the VMM
> wouldn't need write access to dest HPA at any point, just source HPA.
> 
> [1] https://lwn.net/ml/linux-kernel/eefc3c74-acca-419c-8947-726ce2458446@www.fastmail.com/
