From: Sean Christopherson <seanjc@google.com>
To: Shivank Garg <shivankg@amd.com>
Cc: Ackerley Tng <ackerleytng@google.com>,
Marc Zyngier <maz@kernel.org>,
Oliver Upton <oliver.upton@linux.dev>,
Paolo Bonzini <pbonzini@redhat.com>,
linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
David Hildenbrand <david@redhat.com>,
Fuad Tabba <tabba@google.com>,
Ashish Kalra <ashish.kalra@amd.com>,
Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH v12 05/12] KVM: guest_memfd: Enforce NUMA mempolicy using shared policy
Date: Fri, 10 Oct 2025 13:33:14 -0700 [thread overview]
Message-ID: <aOltikRvKzCy1DXN@google.com> (raw)
In-Reply-To: <e9bd02ba-ff0e-47a3-a12e-9a53717dde9b@amd.com>
On Fri, Oct 10, 2025, Shivank Garg wrote:
> >> @@ -112,6 +114,19 @@ static int kvm_gmem_prepare_folio(struct kvm *kvm, struct kvm_memory_slot *slot,
> >> return r;
> >> }
> >>
> >> +static struct mempolicy *kvm_gmem_get_folio_policy(struct gmem_inode *gi,
> >> + pgoff_t index)
> >
> > How about kvm_gmem_get_index_policy() instead, since the policy is keyed
> > by index?
But isn't the policy tied to the folio? I assume/hope that something will split
folios if they have different policies for their indices when a folio contains
more than one page. In other words, how will this work when hugepage support
comes along?
So yeah, I agree that the lookup is keyed on the index, but conceptually aren't
we getting the policy for the folio? The index is a means to an end.
> >> +{
> >> +#ifdef CONFIG_NUMA
> >> + struct mempolicy *mpol;
> >> +
> >> + mpol = mpol_shared_policy_lookup(&gi->policy, index);
> >> + return mpol ? mpol : get_task_policy(current);
> >
> > Should we be returning NULL if no shared policy was defined?
> >
> > By returning NULL, __filemap_get_folio_mpol() can handle the case where
> > cpuset_do_page_mem_spread().
> >
> > If we always return current's task policy, what if the user wants to use
> > cpuset_do_page_mem_spread()?
> >
>
> I initially followed shmem's approach here.
> I agree that returning NULL maintains consistency with the current default
> behavior of cpuset_do_page_mem_spread(), regardless of CONFIG_NUMA.
>
> I'm curious what could be the practical implications of cpuset_do_page_mem_spread()
> v/s get_task_policy() as the fallback?
Userspace could enable page spreading on the task that triggers guest_memfd
allocation. I can't conjure up a reason to do that, but I've been surprised
more than once by KVM setups.
> Which is more appropriate for guest_memfd when no policy is explicitly set
> via mbind()?
I don't think we need to answer that question? Userspace _has_ set a policy,
just through cpuset, not via mbind(). So while I can't imagine there's a sane
use case for cpuset_do_page_mem_spread() with guest_memfd, I also don't see a
reason why KVM should effectively disallow it.
And unless I'm missing something, allocation will eventually fallback to
get_task_policy() (in alloc_frozen_pages_noprof()), so by explicitly getting the
task policy in guest_memfd, KVM is doing _more_ work than necessary _and_ is
unnecessarily restricting usersepace.
Add in that returning NULL would align this code with the ->get_policy hook (and
could be shared again, I assume), and my vote is definitely to return NULL and
not get in the way.
next prev parent reply other threads:[~2025-10-10 20:33 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-07 22:14 [PATCH v12 00/12] KVM: guest_memfd: Add NUMA mempolicy support Sean Christopherson
2025-10-07 22:14 ` [PATCH v12 01/12] KVM: guest_memfd: Rename "struct kvm_gmem" to "struct gmem_file" Sean Christopherson
2025-10-08 5:25 ` Garg, Shivank
2025-10-09 21:08 ` Ackerley Tng
2025-10-10 15:07 ` David Hildenbrand
2025-10-07 22:14 ` [PATCH v12 02/12] KVM: guest_memfd: Add macro to iterate over gmem_files for a mapping/inode Sean Christopherson
2025-10-08 5:30 ` Garg, Shivank
2025-10-09 21:27 ` Ackerley Tng
2025-10-07 22:14 ` [PATCH v12 03/12] KVM: guest_memfd: Use guest mem inodes instead of anonymous inodes Sean Christopherson
2025-10-07 22:14 ` [PATCH v12 04/12] KVM: guest_memfd: Add slab-allocated inode cache Sean Christopherson
2025-10-09 21:39 ` Ackerley Tng
2025-10-09 22:16 ` Ackerley Tng
2025-10-07 22:14 ` [PATCH v12 05/12] KVM: guest_memfd: Enforce NUMA mempolicy using shared policy Sean Christopherson
2025-10-09 22:15 ` Ackerley Tng
2025-10-10 7:57 ` Garg, Shivank
2025-10-10 20:33 ` Sean Christopherson [this message]
2025-10-10 21:57 ` Ackerley Tng
2025-10-12 20:00 ` Garg, Shivank
2025-10-15 16:56 ` Sean Christopherson
2025-10-07 22:14 ` [PATCH v12 06/12] KVM: selftests: Define wrappers for common syscalls to assert success Sean Christopherson
2025-10-09 21:44 ` Ackerley Tng
2025-10-07 22:14 ` [PATCH v12 07/12] KVM: selftests: Report stacktraces SIGBUS, SIGSEGV, SIGILL, and SIGFPE by default Sean Christopherson
2025-10-09 22:31 ` Ackerley Tng
2025-10-07 22:14 ` [PATCH v12 08/12] KVM: selftests: Add additional equivalents to libnuma APIs in KVM's numaif.h Sean Christopherson
2025-10-09 22:34 ` Ackerley Tng
2025-10-07 22:14 ` [PATCH v12 09/12] KVM: selftests: Use proper uAPI headers to pick up mempolicy.h definitions Sean Christopherson
2025-10-10 17:59 ` Ackerley Tng
2025-10-07 22:14 ` [PATCH v12 10/12] KVM: selftests: Add helpers to probe for NUMA support, and multi-node systems Sean Christopherson
2025-10-07 22:14 ` [PATCH v12 11/12] KVM: selftests: Add guest_memfd tests for mmap and NUMA policy support Sean Christopherson
2025-10-09 23:08 ` Ackerley Tng
2025-10-07 22:14 ` [PATCH v12 12/12] KVM: guest_memfd: Add gmem_inode.flags field instead of using i_private Sean Christopherson
2025-10-09 20:58 ` [PATCH v12 00/12] KVM: guest_memfd: Add NUMA mempolicy support Ackerley Tng
2025-10-10 4:59 ` Garg, Shivank
2025-10-10 17:56 ` Ackerley Tng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aOltikRvKzCy1DXN@google.com \
--to=seanjc@google.com \
--cc=ackerleytng@google.com \
--cc=ashish.kalra@amd.com \
--cc=david@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=maz@kernel.org \
--cc=oliver.upton@linux.dev \
--cc=pbonzini@redhat.com \
--cc=shivankg@amd.com \
--cc=tabba@google.com \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.