All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Ritesh Harjani <ritesh.list@gmail.com>
Cc: kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
	linuxppc-dev@lists.ozlabs.org,  linux-kernel@vger.kernel.org,
	Michael Ellerman <mpe@ellerman.id.au>,
	 Christophe Leroy <chleroy@kernel.org>,
	Anushree Mathur <anushree.mathur@linux.ibm.com>,
	 Venkat Rao Bagalkote <venkat88@linux.ibm.com>,
	Harsh Prateek Bora <harshpb@linux.ibm.com>,
	 Ackerley Tng <ackerleytng@google.com>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	 Claudio Imbrenda <imbrenda@linux.ibm.com>,
	Nicholas Piggin <npiggin@gmail.com>
Subject: Re: [PATCH v3 RESEND 02/10] KVM: selftests: Add aligned guest physical page allocator
Date: Thu, 11 Jun 2026 10:54:08 -0700	[thread overview]
Message-ID: <air2QB-HUzxpCm_M@google.com> (raw)
In-Reply-To: <ldcldopw.ritesh.list@gmail.com>

On Thu, Jun 11, 2026, Ritesh Harjani wrote:
> Sean Christopherson <seanjc@google.com> writes:
> 
> > On Wed, Jun 10, 2026, Ritesh Harjani (IBM) wrote:
> >> From: Nicholas Piggin <npiggin@gmail.com>
> >> 
> >> powerpc will require this to allocate MMU tables in guest memory that
> >> are larger than guest base page size.
> >> 
> >> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> >> [Rebased to latest mainline tree]
> >> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> >> ---
> >>  .../testing/selftests/kvm/include/kvm_util.h  | 20 +++++++++--
> >>  tools/testing/selftests/kvm/lib/kvm_util.c    | 33 +++++++++----------
> >>  2 files changed, 33 insertions(+), 20 deletions(-)
> >> 
> >> diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
> >> index 3666a8530f31..c515c918c2c9 100644
> >> --- a/tools/testing/selftests/kvm/include/kvm_util.h
> >> +++ b/tools/testing/selftests/kvm/include/kvm_util.h
> >> @@ -991,8 +991,8 @@ void kvm_gsi_routing_write(struct kvm_vm *vm, struct kvm_irq_routing *routing);
> >>  const char *exit_reason_str(unsigned int exit_reason);
> >>  
> >>  gpa_t vm_phy_page_alloc(struct kvm_vm *vm, gpa_t min_gpa, u32 memslot);
> >> -gpa_t __vm_phy_pages_alloc(struct kvm_vm *vm, size_t num, gpa_t min_gpa,
> >> -			   u32 memslot, bool protected);
> >> +gpa_t __vm_phy_pages_alloc(struct kvm_vm *vm, size_t num, size_t align,
> >> +			   gpa_t min_gpa, u32 memslot, bool protected);
> >>  gpa_t vm_alloc_page_table(struct kvm_vm *vm);
> >>  
> >>  static inline gpa_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num,
> >> @@ -1003,10 +1003,24 @@ static inline gpa_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num,
> >>  	 * protected memory, as the majority of memory for such VMs is
> >>  	 * protected, i.e. using shared memory is effectively opt-in.
> >>  	 */
> >> -	return __vm_phy_pages_alloc(vm, num, min_gpa, memslot,
> >> +	return __vm_phy_pages_alloc(vm, num, 1, min_gpa, memslot,
> >>  				    vm_arch_has_protected_memory(vm));
> >>  }
> >>  
> >> +static inline gpa_t vm_phy_pages_alloc_align(struct kvm_vm *vm, size_t num,
> >> +					     size_t align, gpa_t min_gpa,
> >> +					     u32 memslot)
> >
> > Given that the PPC usage is all for naturally aligned allocations, I think it
> > makes sense for that to be the API, i.e. have "bool naturally_aligned" instead
> > of an arbitrary alignment.
> >
> 
> I would still prefer passing an explicit align value to the allocator,
> because IMHO that's a useful allocator interface to have.

Why?  What are the use cases?  I can't think of anything that requires multi-page
allocations to have specific alignment, all of the cases I can think of require
natural alignment.

For sub-page allocations, supporting semi-arbitrary alignments makes sense, but
I'm not convinced we should try and support that for page-granularity allocations.

> However, if we do want to go down this road than I don't have any strong
> objection either, since as you mentioned powerpc mostly just needs
> natual alignment for it's page table region. So alignment can be
> extracted from the region type as you described below, so no need to
> pass it all the time.

..

> > The bonus is that @min_gpa goes away too.
> 
> powerpc needs min_gpa for it's exception handling pages. see.
> 
> 	excp_paddr = vm_phy_pages_alloc(vm, excp_pages, 0,
> 					vm->memslots[MEM_REGION_DATA]);
> 
> 	TEST_ASSERT(excp_paddr == 0,
> 		    "Interrupt vectors not allocated at gPA address 0: (0x%lx)",
> 		    excp_paddr);
> 
> So, will arch still have an access to the API for passing min_gpa = 0
> for cases like above?

Yep, ____vm_phy_pages_alloc() will be globally visible, and I think is quite
appropriate in this case since it's more than just specifying a minimum GPA,
it's really specifying an _exact_ GPA for the allocation.

gpa_t ____vm_phy_pages_alloc(struct kvm_vm *vm, size_t nr_pages, gpa_t min_gpa,
			     u32 memslot, bool protected, bool naturally_aligned)

> > It'll probably take me a few days/weeks, but I'll try get a series posted before
> > the 7.2 merge window closes, so that you can build on top to get the PPC selftests
> > support landed in 7.3.
> >
> 
> Thanks Sean for your help! Yes, landing kvm selftests for powerpc will
> be definitely helpful to verify against any kvm regressions.
> 
> BTW, I was thinking whether landing powerpc first will be easier for you
> to consider all the API requirements from all users / usecases?  But
> either way is fine please. I can work on top of your changes too and if
> something is missing for powerpc, we can add / modify on top, once your
> series is finished.

One idea would be for me to include the PPC support in the series; it's "just"
a patch or two on top, albeit one pretty big patch.

> >> @@ -2039,23 +2039,22 @@ gpa_t __vm_phy_pages_alloc(struct kvm_vm *vm, size_t num,
> >>  	TEST_ASSERT(!protected || region->protected_phy_pages,
> >>  		    "Region doesn't support protected memory");
> >>  
> >> -	base = pg = min_gpa >> vm->page_shift;
> >> -	do {
> >> -		for (; pg < base + num; ++pg) {
> >> -			if (!sparsebit_is_set(region->unused_phy_pages, pg)) {
> >> -				base = pg = sparsebit_next_set(region->unused_phy_pages, pg);
> >> -				break;
> >> +	base = min_gpa >> vm->page_shift;
> >> +again:
> >> +	base = (base + align - 1) & ~(align - 1);
> >> +	for (pg = base; pg < base + num; ++pg) {
> >> +		if (!sparsebit_is_set(region->unused_phy_pages, pg)) {
> >> +			base = sparsebit_next_set(region->unused_phy_pages, pg);
> >> +			if (!base) {
> >> +				fprintf(stderr, "No guest physical page available, "
> >> +					"min_gpa: 0x%lx page_size: 0x%x memslot: %u\n",
> >> +					min_gpa, vm->page_size, memslot);
> >> +				fputs("---- vm dump ----\n", stderr);
> >> +				vm_dump(stderr, vm, 2);
> >> +				abort();
> >>  			}
> >> +			goto again;
> >>  		}
> >> -	} while (pg && pg != base + num);
> >> -
> >> -	if (pg == 0) {
> >> -		fprintf(stderr, "No guest physical page available, "
> >> -			"min_gpa: 0x%lx page_size: 0x%x memslot: %u\n",
> >> -			min_gpa, vm->page_size, memslot);
> >> -		fputs("---- vm dump ----\n", stderr);
> >> -		vm_dump(stderr, vm, 2);
> >> -		abort();
> >>  	}
> >
> > This is unnecessary churn.  I'm not saying the current code is pretty or anything,
> > but unless I'm missing something, this can simply be:
> 
> Not really, we need the base to be aligned everytime, that's why the
> goto again loop aligns the base in the new code.

Ooh, right, pg needs to be advanced by the aligned number of pages.  Ugh, the
whole do-while part is bizarre, and AFAICT only "works" by dumb luck. 

> Plus I feel the above refactoring also simplifies the special handling of pg
> == 0 case, which earlier was being handled separately after the loop ends.

Yeah, I fiddled with a few other options, but I think I like your approach the
most.


  reply	other threads:[~2026-06-11 17:54 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-10 12:47 [PATCH v3 RESEND 00/10] KVM: selftests: add powerpc support Ritesh Harjani (IBM)
2026-06-10 12:47 ` [PATCH v3 RESEND 01/10] KVM: selftests: Move pgd_created check into virt_pgd_alloc Ritesh Harjani (IBM)
2026-06-10 12:47 ` [PATCH v3 RESEND 02/10] KVM: selftests: Add aligned guest physical page allocator Ritesh Harjani (IBM)
2026-06-10 16:18   ` Sean Christopherson
2026-06-11 16:30     ` Ritesh Harjani
2026-06-11 17:54       ` Sean Christopherson [this message]
2026-06-10 12:47 ` [PATCH v3 RESEND 03/10] KVM: PPC: selftests: add support for powerpc Ritesh Harjani (IBM)
2026-06-10 12:47 ` [PATCH v3 RESEND 04/10] KVM: PPC: selftests: powerpc enable kvm_create_max_vcpus test Ritesh Harjani (IBM)
2026-06-10 17:59   ` Sean Christopherson
2026-06-10 12:47 ` [PATCH v3 RESEND 05/10] KVM: selftests: Print the vcpu_id when KVM_CREATE_VCPU ioctl fails Ritesh Harjani (IBM)
2026-06-10 12:59   ` Sean Christopherson
2026-06-10 12:47 ` [PATCH v3 RESEND 06/10] KVM: PPC: selftests: Use u64 instead of uint64_t Ritesh Harjani (IBM)
2026-06-10 12:47 ` [PATCH v3 RESEND 07/10] KVM: PPC: selftests: Use s64 instead of int64_t Ritesh Harjani (IBM)
2026-06-10 12:47 ` [PATCH v3 RESEND 08/10] KVM: PPC: selftests: Use u32 instead of uint32_t Ritesh Harjani (IBM)
2026-06-10 12:47 ` [PATCH v3 RESEND 09/10] KVM: PPC: selftests: Use u8 instead of uint8_t Ritesh Harjani (IBM)
2026-06-10 12:54   ` sashiko-bot
2026-06-10 12:47 ` [PATCH v3 RESEND 10/10] KVM: PPC: selftests: Replace u64 gpa, u64 gva|vaddr with gpa_t and gva_t Ritesh Harjani (IBM)
2026-06-10 12:51 ` [PATCH v3 RESEND 00/10] KVM: selftests: add powerpc support Sean Christopherson
2026-06-10 12:53   ` Ritesh Harjani
2026-06-10 16:19     ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=air2QB-HUzxpCm_M@google.com \
    --to=seanjc@google.com \
    --cc=ackerleytng@google.com \
    --cc=anushree.mathur@linux.ibm.com \
    --cc=borntraeger@linux.ibm.com \
    --cc=chleroy@kernel.org \
    --cc=harshpb@linux.ibm.com \
    --cc=imbrenda@linux.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mpe@ellerman.id.au \
    --cc=npiggin@gmail.com \
    --cc=pbonzini@redhat.com \
    --cc=ritesh.list@gmail.com \
    --cc=venkat88@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.