From: Sean Christopherson <seanjc@google.com>
To: Fuad Tabba <tabba@google.com>
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org,
	linux-mm@kvack.org,  pbonzini@redhat.com, chenhuacai@kernel.org,
	mpe@ellerman.id.au,  anup@brainfault.org,
	paul.walmsley@sifive.com, palmer@dabbelt.com,
	 aou@eecs.berkeley.edu, viro@zeniv.linux.org.uk,
	brauner@kernel.org,  willy@infradead.org,
	akpm@linux-foundation.org, xiaoyao.li@intel.com,
	 yilun.xu@intel.com, chao.p.peng@linux.intel.com,
	jarkko@kernel.org,  amoorthy@google.com, dmatlack@google.com,
	isaku.yamahata@intel.com,  mic@digikod.net, vbabka@suse.cz,
	vannapurve@google.com,  ackerleytng@google.com,
	mail@maciej.szmigiero.name, david@redhat.com,
	 michael.roth@amd.com, wei.w.wang@intel.com,
	liam.merwick@oracle.com,  isaku.yamahata@gmail.com,
	kirill.shutemov@linux.intel.com,  suzuki.poulose@arm.com,
	steven.price@arm.com, quic_eberman@quicinc.com,
	 quic_mnalajal@quicinc.com, quic_tsoni@quicinc.com,
	quic_svaddagi@quicinc.com,  quic_cvanscha@quicinc.com,
	quic_pderrin@quicinc.com, quic_pheragu@quicinc.com,
	 catalin.marinas@arm.com, james.morse@arm.com,
	yuzenghui@huawei.com,  oliver.upton@linux.dev, maz@kernel.org,
	will@kernel.org, qperret@google.com,  keirf@google.com,
	roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org,
	 jgg@nvidia.com, rientjes@google.com, jhubbard@nvidia.com,
	fvdl@google.com,  hughd@google.com, jthoughton@google.com,
	peterx@redhat.com,  pankaj.gupta@amd.com
Subject: Re: [PATCH v7 4/7] KVM: guest_memfd: Folio sharing states and functions that manage their transition
Date: Wed, 2 Apr 2025 17:19:38 -0700	[thread overview]
Message-ID: <Z-3UGmcCwJtaP-yF@google.com> (raw)
In-Reply-To: <20250328153133.3504118-5-tabba@google.com>

On Fri, Mar 28, 2025, Fuad Tabba wrote:
> @@ -389,22 +381,211 @@ static void kvm_gmem_init_mount(void)
>  }
>  
>  #ifdef CONFIG_KVM_GMEM_SHARED_MEM
> -static bool kvm_gmem_offset_is_shared(struct file *file, pgoff_t index)
> +/*
> + * An enum of the valid folio sharing states:
> + * Bit 0: set if not shared with the guest (guest cannot fault it in)
> + * Bit 1: set if not shared with the host (host cannot fault it in)
> + */
> +enum folio_shareability {
> +	KVM_GMEM_ALL_SHARED	= 0b00,	/* Shared with the host and the guest. */
> +	KVM_GMEM_GUEST_SHARED	= 0b10, /* Shared only with the guest. */
> +	KVM_GMEM_NONE_SHARED	= 0b11, /* Not shared, transient state. */

Absolutely not.  The proper way to define bitmasks is to use BIT(xxx).  Based on
past discussions, I suspect you went this route so that the most common value
is '0' to avoid extra xarray storage, but that should be an implementation detail
buried deep in the low level xarray handling, not something that dictates how the
enum itself is defined.

The name is also bizarre and confusing.  To map memory into the guest as private,
it needs to be in KVM_GMEM_GUEST_SHARED.  That's completely unworkable.
Of course, it's not at all obvious that you're actually trying to create a bitmask.
The above looks like an inverted bitmask, but then it's used as if the values don't
matter.

	return (r == KVM_GMEM_ALL_SHARED || r == KVM_GMEM_GUEST_SHARED);

Given that I can't think of a sane use case for allowing guest_memfd to be mapped
into the host but not the guest (modulo temporary demand paging scenarios), I
think all we need is:

	KVM_GMEM_SHARED		  = BIT(0),
	KVM_GMEM_INVALID	  = BIT(1),
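
E.g. as a complete definition (illustrative only; the comments are my reading of
the above, with "no bits set" meaning private, i.e. mappable only by the guest):

	enum folio_shareability {
		KVM_GMEM_SHARED		= BIT(0),	/* also mappable by host userspace */
		KVM_GMEM_INVALID	= BIT(1),	/* transient, not mappable by anyone */
	};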

As for optimizing xarray storage, assuming it's actually a bitmask, simply let
KVM specify which bits to invert when storing/loading to/from the xarray so that
KVM can optimize storage for the most common value (which is presumably
KVM_GMEM_SHARED on arm64?).

If KVM_GMEM_SHARED is the desired "default", invert bit 0, otherwise don't.  If
for some reason we get to a state where the default value is multiple bits, the
inversion trick still works.  E.g. if KVM_GMEM_SHARED were a composite value,
then invert bits 0 and 1.  The polarity shenanigans should be easy to hide in two
low level macros/helpers.
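
Something like the below should be all it takes (illustrative sketch only; the
helper names, KVM_GMEM_DEFAULT_STATE, and the lack of locking are made up here,
not taken from the patch):

	/*
	 * XOR against the arch's most common state on store/load so that the
	 * common state is represented by "no entry", i.e. costs no memory.
	 */
	static int kvm_gmem_store_state(struct xarray *xa, pgoff_t index,
					unsigned long state)
	{
		unsigned long stored = state ^ KVM_GMEM_DEFAULT_STATE;

		if (!stored) {
			/* Common value: erase instead of storing an entry. */
			xa_erase(xa, index);
			return 0;
		}
		return xa_err(xa_store(xa, index, xa_mk_value(stored), GFP_KERNEL));
	}

	static unsigned long kvm_gmem_load_state(struct xarray *xa, pgoff_t index)
	{
		/* An absent entry decodes back to the default state. */
		return xa_to_value(xa_load(xa, index)) ^ KVM_GMEM_DEFAULT_STATE;
	}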

> +/*
> + * Returns true if the folio is shared with the host and the guest.

This is a superfluous comment.  Simple predicates should be self-explanatory
based on function name alone.

> + *
> + * Must be called with the offsets_lock lock held.

Drop these types of comments and document through code, i.e. via lockdep
assertions (which you already have).

> + */
> +static bool kvm_gmem_offset_is_shared(struct inode *inode, pgoff_t index)
> +{
> +	struct xarray *shared_offsets = &kvm_gmem_private(inode)->shared_offsets;
> +	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
> +	unsigned long r;
> +
> +	lockdep_assert_held(offsets_lock);
>  
> -	/* For now, VMs that support shared memory share all their memory. */
> -	return kvm_arch_gmem_supports_shared_mem(gmem->kvm);
> +	r = xa_to_value(xa_load(shared_offsets, index));
> +
> +	return r == KVM_GMEM_ALL_SHARED;
> +}
> +
> +/*
> + * Returns true if the folio is shared with the guest (not transitioning).
> + *
> + * Must be called with the offsets_lock lock held.

See above.

>  static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)

This should be something like kvm_gmem_fault_shared() to make it abundantly clear
what's being done, because it took me a few looks to realize this is faulting
memory into host userspace, not into the guest.


