linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Fuad Tabba <tabba@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Marc Zyngier <maz@kernel.org>,
	 Oliver Upton <oliver.upton@linux.dev>,
	Huacai Chen <chenhuacai@kernel.org>,
	 Michael Ellerman <mpe@ellerman.id.au>,
	Anup Patel <anup@brainfault.org>,
	 Paul Walmsley <paul.walmsley@sifive.com>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	 Albert Ou <aou@eecs.berkeley.edu>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Paul Moore <paul@paul-moore.com>,
	 James Morris <jmorris@namei.org>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	kvm@vger.kernel.org,  linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.linux.dev,  linux-mips@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org,  kvm-riscv@lists.infradead.org,
	linux-riscv@lists.infradead.org,  linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org,  linux-security-module@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	 Chao Peng <chao.p.peng@linux.intel.com>,
	Jarkko Sakkinen <jarkko@kernel.org>,
	 Anish Moorthy <amoorthy@google.com>,
	Yu Zhang <yu.c.zhang@linux.intel.com>,
	 Isaku Yamahata <isaku.yamahata@intel.com>,
	Xu Yilun <yilun.xu@intel.com>,  Vlastimil Babka <vbabka@suse.cz>,
	Vishal Annapurve <vannapurve@google.com>,
	 Ackerley Tng <ackerleytng@google.com>,
	Maciej Szmigiero <mail@maciej.szmigiero.name>,
	 David Hildenbrand <david@redhat.com>,
	Quentin Perret <qperret@google.com>,
	 Michael Roth <michael.roth@amd.com>, Wang <wei.w.wang@intel.com>,
	 Liam Merwick <liam.merwick@oracle.com>,
	Isaku Yamahata <isaku.yamahata@gmail.com>,
	 "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [RFC PATCH v12 11/33] KVM: Introduce per-page memory attributes
Date: Tue, 3 Oct 2023 08:59:27 -0700	[thread overview]
Message-ID: <ZRw6X2BptZnRPNK7@google.com> (raw)
In-Reply-To: <CA+EHjTzSUXx8P9gWmUERg4owxH6r6yNPm1_RL-BzS_2CNPtRKw@mail.gmail.com>

On Tue, Oct 03, 2023, Fuad Tabba wrote:
> Hi,
> 
> > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> > index d2d913acf0df..f8642ff2eb9d 100644
> > --- a/include/uapi/linux/kvm.h
> > +++ b/include/uapi/linux/kvm.h
> > @@ -1227,6 +1227,7 @@ struct kvm_ppc_resize_hpt {
> >  #define KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE 228
> >  #define KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES 229
> >  #define KVM_CAP_USER_MEMORY2 230
> > +#define KVM_CAP_MEMORY_ATTRIBUTES 231
> >
> >  #ifdef KVM_CAP_IRQ_ROUTING
> >
> > @@ -2293,4 +2294,17 @@ struct kvm_s390_zpci_op {
> >  /* flags for kvm_s390_zpci_op->u.reg_aen.flags */
> >  #define KVM_S390_ZPCIOP_REGAEN_HOST    (1 << 0)
> >
> > +/* Available with KVM_CAP_MEMORY_ATTRIBUTES */
> > +#define KVM_GET_SUPPORTED_MEMORY_ATTRIBUTES    _IOR(KVMIO,  0xd2, __u64)
> > +#define KVM_SET_MEMORY_ATTRIBUTES              _IOW(KVMIO,  0xd3, struct kvm_memory_attributes)
> > +
> > +struct kvm_memory_attributes {
> > +       __u64 address;
> > +       __u64 size;
> > +       __u64 attributes;
> > +       __u64 flags;
> > +};
> > +
> > +#define KVM_MEMORY_ATTRIBUTE_PRIVATE           (1ULL << 3)
> > +
> 
> In pKVM, we don't want to allow setting (or clearing) of PRIVATE/SHARED
> attributes from userspace.

Why not?  The whole thing falls apart if userspace doesn't *know* the state of a
page, and the only way for userspace to know the state of a page at a given moment
in time is if userspace controls the attributes.  E.g. even if KVM were to provide
a way for userspace to query attributes, the attributes exposed to usrspace would
become stale the instant KVM drops slots_lock (or whatever lock protects the attributes)
since userspace couldn't prevent future changes.

Why does pKVM need to prevent userspace from stating *its* view of attributes?

If the goal is to reduce memory overhead, that can be solved by using an internal,
non-ABI attributes flag to track pKVM's view of SHARED vs. PRIVATE.  If the guest
attempts to access memory where pKVM and userspace don't agree on the state,
generate an exit to userspace.  Or kill the guest.  Or do something else entirely.

> However, we'd like to use the attributes xarray to track the sharing state of
> guest pages at the host kernel.
> 
> Moreover, we'd rather the default guest page state be PRIVATE, and
> only specify which pages are shared. All pKVM guest pages start off as
> private, and the majority will remain so.

I would rather optimize kvm_vm_set_mem_attributes() to generate range-based
xarray entries, at which point it shouldn't matter all that much whether PRIVATE
or SHARED is the default "empty" state.  We opted not to do that for the initial
merge purely to keep the code as simple as possible (which is obviously still not
exactly simple).

With range-based xarray entries, the cost of tagging huge chunks of memory as
PRIVATE should be a non-issue.  And if that's not enough for whatever reason, I
would rather define the polarity of PRIVATE on a per-VM basis, but only for internal
storage.
 
> I'm not sure if this is the best way to do this: One idea would be to move
> the definition of KVM_MEMORY_ATTRIBUTE_PRIVATE to
> arch/*/include/asm/kvm_host.h, which is where kvm_arch_supported_attributes()
> lives as well. This would allow different architectures to specify their own
> attributes (i.e., instead we'd have a KVM_MEMORY_ATTRIBUTE_SHARED for pKVM).
> This wouldn't help in terms of preventing userspace from clearing attributes
> (i.e., setting a 0 attribute) though.
> 
> The other thing, which we need for pKVM anyway, is to make
> kvm_vm_set_mem_attributes() global, so that it can be called from outside of
> kvm_main.c (already have a local patch for this that declares it in
> kvm_host.h),

That's no problem, but I am definitely opposed to KVM modifying attributes that
are owned by userspace.

> and not gate this function by KVM_GENERIC_MEMORY_ATTRIBUTES.

As above, I am opposed to pKVM having a completely different ABI for managing
PRIVATE vs. SHARED.  I have no objection to pKVM using unclaimed flags in the
attributes to store extra metadata, but if KVM_SET_MEMORY_ATTRIBUTES doesn't work
for pKVM, then we've failed miserably and should revist the uAPI.


  reply	other threads:[~2023-10-03 15:59 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-14  1:54 [RFC PATCH v12 00/33] KVM: guest_memfd() and per-page attributes Sean Christopherson
2023-09-14  1:54 ` [RFC PATCH v12 01/33] KVM: Tweak kvm_hva_range and hva_handler_t to allow reusing for gfn ranges Sean Christopherson
2023-09-15  6:47   ` Xiaoyao Li
2023-09-15 21:05     ` Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 02/33] KVM: Use gfn instead of hva for mmu_notifier_retry Sean Christopherson
2023-09-14  3:07   ` Binbin Wu
2023-09-14 14:19     ` Sean Christopherson
2023-09-20  6:07   ` Xu Yilun
2023-09-20 13:55     ` Sean Christopherson
2023-09-21  2:39       ` Xu Yilun
2023-09-21 14:24         ` Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 03/33] KVM: PPC: Drop dead code related to KVM_ARCH_WANT_MMU_NOTIFIER Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 04/33] KVM: PPC: Return '1' unconditionally for KVM_CAP_SYNC_MMU Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 05/33] KVM: Convert KVM_ARCH_WANT_MMU_NOTIFIER to CONFIG_KVM_GENERIC_MMU_NOTIFIER Sean Christopherson
2023-10-09 16:42   ` Anup Patel
2023-09-14  1:55 ` [RFC PATCH v12 06/33] KVM: Introduce KVM_SET_USER_MEMORY_REGION2 Sean Christopherson
2023-09-15  6:59   ` Xiaoyao Li
2023-09-14  1:55 ` [RFC PATCH v12 07/33] KVM: Add KVM_EXIT_MEMORY_FAULT exit to report faults to userspace Sean Christopherson
2023-09-22  6:03   ` Xiaoyao Li
2023-09-22 14:30     ` Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 08/33] KVM: Add a dedicated mmu_notifier flag for reclaiming freed memory Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 09/33] KVM: Drop .on_unlock() mmu_notifier hook Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 10/33] KVM: Set the stage for handling only shared mappings in mmu_notifier events Sean Christopherson
2023-09-18  1:14   ` Binbin Wu
2023-09-18 15:57     ` Sean Christopherson
2023-09-18 18:07   ` Michael Roth
2023-09-19  0:08     ` Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 11/33] KVM: Introduce per-page memory attributes Sean Christopherson
2023-09-15  6:32   ` Yan Zhao
2023-09-20 21:00     ` Sean Christopherson
2023-09-21  1:21       ` Yan Zhao
2023-09-25 17:37         ` Sean Christopherson
2023-09-18  7:51   ` Binbin Wu
2023-09-20 21:03     ` Sean Christopherson
2023-09-27  5:19       ` Binbin Wu
2023-10-03 12:47   ` Fuad Tabba
2023-10-03 15:59     ` Sean Christopherson [this message]
2023-10-03 18:33       ` Fuad Tabba
2023-10-03 20:51         ` Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 12/33] mm: Add AS_UNMOVABLE to mark mapping as completely unmovable Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 13/33] security: Export security_inode_init_security_anon() for use by KVM Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 14/33] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory Sean Christopherson
2023-09-15  6:11   ` Yan Zhao
2023-09-18 16:36   ` Michael Roth
2023-09-20 23:44     ` Sean Christopherson
2023-09-19  9:01   ` Binbin Wu
2023-09-20 14:24     ` Sean Christopherson
2023-09-21  5:58       ` Binbin Wu
2023-09-21 19:10   ` Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 15/33] KVM: Add transparent hugepage support for dedicated guest memory Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 16/33] KVM: x86: "Reset" vcpu->run->exit_reason early in KVM_RUN Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 17/33] KVM: x86: Disallow hugepages when memory attributes are mixed Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 18/33] KVM: x86/mmu: Handle page fault for private memory Sean Christopherson
2023-09-15  5:40   ` Yan Zhao
2023-09-15 14:26     ` Sean Christopherson
2023-09-18  0:54       ` Yan Zhao
2023-09-21 14:59         ` Sean Christopherson
2023-09-21  5:51       ` Binbin Wu
2023-09-14  1:55 ` [RFC PATCH v12 19/33] KVM: Drop superfluous __KVM_VCPU_MULTIPLE_ADDRESS_SPACE macro Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 20/33] KVM: Allow arch code to track number of memslot address spaces per VM Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 21/33] KVM: x86: Add support for "protected VMs" that can utilize private memory Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 22/33] KVM: selftests: Drop unused kvm_userspace_memory_region_find() helper Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 23/33] KVM: selftests: Convert lib's mem regions to KVM_SET_USER_MEMORY_REGION2 Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 24/33] KVM: selftests: Add support for creating private memslots Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 25/33] KVM: selftests: Add helpers to convert guest memory b/w private and shared Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 26/33] KVM: selftests: Add helpers to do KVM_HC_MAP_GPA_RANGE hypercalls (x86) Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 27/33] KVM: selftests: Introduce VM "shape" to allow tests to specify the VM type Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 28/33] KVM: selftests: Add GUEST_SYNC[1-6] macros for synchronizing more data Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 29/33] KVM: selftests: Add x86-only selftest for private memory conversions Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 30/33] KVM: selftests: Add KVM_SET_USER_MEMORY_REGION2 helper Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 31/33] KVM: selftests: Expand set_memory_region_test to validate guest_memfd() Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 32/33] KVM: selftests: Add basic selftest for guest_memfd() Sean Christopherson
2023-09-14  1:55 ` [RFC PATCH v12 33/33] KVM: selftests: Test KVM exit behavior for private memory/access Sean Christopherson
  -- strict thread matches above, loose matches on Subject: below --
2023-10-05 14:42 [RFC PATCH v12 11/33] KVM: Introduce per-page memory attributes Fuad Tabba
2023-10-06  3:21 ` Sean Christopherson
2023-10-06 12:47   ` Fuad Tabba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZRw6X2BptZnRPNK7@google.com \
    --to=seanjc@google.com \
    --cc=ackerleytng@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=amoorthy@google.com \
    --cc=anup@brainfault.org \
    --cc=aou@eecs.berkeley.edu \
    --cc=chao.p.peng@linux.intel.com \
    --cc=chenhuacai@kernel.org \
    --cc=david@redhat.com \
    --cc=isaku.yamahata@gmail.com \
    --cc=isaku.yamahata@intel.com \
    --cc=jarkko@kernel.org \
    --cc=jmorris@namei.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kvm-riscv@lists.infradead.org \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=liam.merwick@oracle.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mail@maciej.szmigiero.name \
    --cc=maz@kernel.org \
    --cc=michael.roth@amd.com \
    --cc=mpe@ellerman.id.au \
    --cc=oliver.upton@linux.dev \
    --cc=palmer@dabbelt.com \
    --cc=paul.walmsley@sifive.com \
    --cc=paul@paul-moore.com \
    --cc=pbonzini@redhat.com \
    --cc=qperret@google.com \
    --cc=serge@hallyn.com \
    --cc=tabba@google.com \
    --cc=vannapurve@google.com \
    --cc=vbabka@suse.cz \
    --cc=wei.w.wang@intel.com \
    --cc=willy@infradead.org \
    --cc=yilun.xu@intel.com \
    --cc=yu.c.zhang@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).