[PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area
@ 2024-04-18  3:07 Li RongQing
  2024-04-23  0:29 ` Sean Christopherson
  0 siblings, 1 reply; 9+ messages in thread
From: Li RongQing @ 2024-04-18  3:07 UTC (permalink / raw)
  To: seanjc, pbonzini, kvm; +Cc: Li RongQing

save_area of per-CPU svm_data are dominantly accessed from their
own local CPUs, so allocate them node-local for performance reason

Signed-off-by: Li RongQing <lirongqing@baidu.com>
---
 arch/x86/kvm/svm/sev.c | 6 +++---
 arch/x86/kvm/svm/svm.c | 2 +-
 arch/x86/kvm/svm/svm.h | 6 +++++-
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 61a7531..cce8ec7 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3179,13 +3179,13 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
 	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
 }
 
-struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
+struct page *snp_safe_alloc_page_node(struct kvm_vcpu *vcpu, int node)
 {
 	unsigned long pfn;
 	struct page *p;
 
 	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
-		return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+		return alloc_pages_node(node, GFP_KERNEL_ACCOUNT | __GFP_ZERO, 0);
 
 	/*
 	 * Allocate an SNP-safe page to workaround the SNP erratum where
@@ -3196,7 +3196,7 @@ struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
 	 * Allocate one extra page, choose a page which is not
 	 * 2MB-aligned, and free the other.
 	 */
-	p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
+	p = alloc_pages_node(node, GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
 	if (!p)
 		return NULL;
 
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d1a9f995..69fc809 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -703,7 +703,7 @@ static int svm_cpu_init(int cpu)
 	int ret = -ENOMEM;
 
 	memset(sd, 0, sizeof(struct svm_cpu_data));
-	sd->save_area = snp_safe_alloc_page(NULL);
+	sd->save_area = snp_safe_alloc_page_node(NULL, cpu_to_node(cpu));
 	if (!sd->save_area)
 		return ret;
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 7f1fbd8..3bbf638 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -694,8 +694,12 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm);
 void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
 void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa);
 void sev_es_unmap_ghcb(struct vcpu_svm *svm);
-struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu);
+struct page *snp_safe_alloc_page_node(struct kvm_vcpu *vcpu, int node);
 
+static inline struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
+{
+	return snp_safe_alloc_page_node(vcpu, NUMA_NO_NODE);
+}
 /* vmenter.S */
 
 void __svm_sev_es_vcpu_run(struct vcpu_svm *svm, bool spec_ctrl_intercepted);
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area
  2024-04-18  3:07 [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area Li RongQing
@ 2024-04-23  0:29 ` Sean Christopherson
  2024-04-23 13:30   ` Peter Gonda
  0 siblings, 1 reply; 9+ messages in thread
From: Sean Christopherson @ 2024-04-23  0:29 UTC (permalink / raw)
  To: Li RongQing; +Cc: pbonzini, kvm, Tom Lendacky, Michael Roth, Peter Gonda

+Tom, Mike, and Peter

On Thu, Apr 18, 2024, Li RongQing wrote:
> save_area of per-CPU svm_data are dominantly accessed from their
> own local CPUs, so allocate them node-local for performance reason
> 
> Signed-off-by: Li RongQing <lirongqing@baidu.com>
> ---
>  arch/x86/kvm/svm/sev.c | 6 +++---
>  arch/x86/kvm/svm/svm.c | 2 +-
>  arch/x86/kvm/svm/svm.h | 6 +++++-
>  3 files changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 61a7531..cce8ec7 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3179,13 +3179,13 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
>  	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
>  }
>  
> -struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
> +struct page *snp_safe_alloc_page_node(struct kvm_vcpu *vcpu, int node)
>  {
>  	unsigned long pfn;
>  	struct page *p;
>  
>  	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
> -		return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> +		return alloc_pages_node(node, GFP_KERNEL_ACCOUNT | __GFP_ZERO, 0);
>  
>  	/*
>  	 * Allocate an SNP-safe page to workaround the SNP erratum where
> @@ -3196,7 +3196,7 @@ struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
>  	 * Allocate one extra page, choose a page which is not
>  	 * 2MB-aligned, and free the other.
>  	 */
> -	p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
> +	p = alloc_pages_node(node, GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);

This made me realize the existing code is buggy.  The allocation for the per-CPU
save area shouldn't be accounted.

Also, what's the point of taking @vcpu?  It's a nice enough flag to say "this
should be accounted", but it's decidely odd.

How about we fix both in a single series, and end up with this over 3-4 patches?
I.e. pass -1 where vcpu is non-NULL, and the current CPU for the save area.

struct page *snp_safe_alloc_page(int cpu)
{
	unsigned long pfn;
	struct page *p;
	gfp_t gpf;
	int node;

	if (cpu >= 0) {
		node = cpu_to_node(cpu);
		gfp = GFP_KERNEL;
	} else {
		node = NUMA_NO_NODE;
		gfp = GFP_KERNEL_ACCOUNT
	}
	gfp |= __GFP_ZERO;

	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
		return alloc_pages_node(node, gfp, 0);

	/*
	 * Allocate an SNP-safe page to workaround the SNP erratum where
	 * the CPU will incorrectly signal an RMP violation #PF if a
	 * hugepage (2MB or 1GB) collides with the RMP entry of a
	 * 2MB-aligned VMCB, VMSA, or AVIC backing page.
	 *
	 * Allocate one extra page, choose a page which is not
	 * 2MB-aligned, and free the other.
	 */
	p = alloc_pages_node(node, gfp, 1);
	if (!p)
		return NULL;

	split_page(p, 1);

	pfn = page_to_pfn(p);
	if (IS_ALIGNED(pfn, PTRS_PER_PMD))
		__free_page(p++);
	else
		__free_page(p + 1);

	return p;
}


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area
  2024-04-23  0:29 ` Sean Christopherson
@ 2024-04-23 13:30   ` Peter Gonda
  2024-04-23 18:00     ` Yosry Ahmed
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Gonda @ 2024-04-23 13:30 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Li RongQing, pbonzini, kvm, Tom Lendacky, Michael Roth,
	David Rientjes, Yosry Ahmed

On Mon, Apr 22, 2024 at 6:29 PM Sean Christopherson <seanjc@google.com> wrote:
>
> +Tom, Mike, and Peter
>
> On Thu, Apr 18, 2024, Li RongQing wrote:
> > save_area of per-CPU svm_data are dominantly accessed from their
> > own local CPUs, so allocate them node-local for performance reason
> >
> > Signed-off-by: Li RongQing <lirongqing@baidu.com>
> > ---
> >  arch/x86/kvm/svm/sev.c | 6 +++---
> >  arch/x86/kvm/svm/svm.c | 2 +-
> >  arch/x86/kvm/svm/svm.h | 6 +++++-
> >  3 files changed, 9 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > index 61a7531..cce8ec7 100644
> > --- a/arch/x86/kvm/svm/sev.c
> > +++ b/arch/x86/kvm/svm/sev.c
> > @@ -3179,13 +3179,13 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
> >       ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
> >  }
> >
> > -struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
> > +struct page *snp_safe_alloc_page_node(struct kvm_vcpu *vcpu, int node)
> >  {
> >       unsigned long pfn;
> >       struct page *p;
> >
> >       if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
> > -             return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> > +             return alloc_pages_node(node, GFP_KERNEL_ACCOUNT | __GFP_ZERO, 0);
> >
> >       /*
> >        * Allocate an SNP-safe page to workaround the SNP erratum where
> > @@ -3196,7 +3196,7 @@ struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
> >        * Allocate one extra page, choose a page which is not
> >        * 2MB-aligned, and free the other.
> >        */
> > -     p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
> > +     p = alloc_pages_node(node, GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
>
> This made me realize the existing code is buggy.  The allocation for the per-CPU
> save area shouldn't be accounted.
>
> Also, what's the point of taking @vcpu?  It's a nice enough flag to say "this
> should be accounted", but it's decidely odd.
>
> How about we fix both in a single series, and end up with this over 3-4 patches?
> I.e. pass -1 where vcpu is non-NULL, and the current CPU for the save area.

Looks good to me. Internally we already use GFP_KERNEL for these
allocations. But we had an issue with split_page() and memcg
accounting internally. Yosry submitted the following:

+  if (memcg_kmem_charge(p, GFP_KERNEL_ACCOUNT, 0)) {
+    __free_page(p);
+    return NULL;
+  }

Not sure if this is an issue with our kernel or if we should use
split_page_memcg() here? It was suggested internally but we didn't
want to backport it.

>
> struct page *snp_safe_alloc_page(int cpu)
> {
>         unsigned long pfn;
>         struct page *p;
>         gfp_t gpf;
>         int node;
>
>         if (cpu >= 0) {
>                 node = cpu_to_node(cpu);
>                 gfp = GFP_KERNEL;
>         } else {
>                 node = NUMA_NO_NODE;
>                 gfp = GFP_KERNEL_ACCOUNT
>         }
>         gfp |= __GFP_ZERO;
>
>         if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
>                 return alloc_pages_node(node, gfp, 0);
>
>         /*
>          * Allocate an SNP-safe page to workaround the SNP erratum where
>          * the CPU will incorrectly signal an RMP violation #PF if a
>          * hugepage (2MB or 1GB) collides with the RMP entry of a
>          * 2MB-aligned VMCB, VMSA, or AVIC backing page.
>          *
>          * Allocate one extra page, choose a page which is not
>          * 2MB-aligned, and free the other.
>          */
>         p = alloc_pages_node(node, gfp, 1);
>         if (!p)
>                 return NULL;
>
>         split_page(p, 1);
>
>         pfn = page_to_pfn(p);
>         if (IS_ALIGNED(pfn, PTRS_PER_PMD))
>                 __free_page(p++);
>         else
>                 __free_page(p + 1);
>
>         return p;
> }
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area
  2024-04-23 13:30   ` Peter Gonda
@ 2024-04-23 18:00     ` Yosry Ahmed
  2024-04-23 18:43       ` Sean Christopherson
  0 siblings, 1 reply; 9+ messages in thread
From: Yosry Ahmed @ 2024-04-23 18:00 UTC (permalink / raw)
  To: Peter Gonda
  Cc: Sean Christopherson, Li RongQing, pbonzini, kvm, Tom Lendacky,
	Michael Roth, David Rientjes

On Tue, Apr 23, 2024 at 6:30 AM Peter Gonda <pgonda@google.com> wrote:
>
> On Mon, Apr 22, 2024 at 6:29 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > +Tom, Mike, and Peter
> >
> > On Thu, Apr 18, 2024, Li RongQing wrote:
> > > save_area of per-CPU svm_data are dominantly accessed from their
> > > own local CPUs, so allocate them node-local for performance reason
> > >
> > > Signed-off-by: Li RongQing <lirongqing@baidu.com>
> > > ---
> > >  arch/x86/kvm/svm/sev.c | 6 +++---
> > >  arch/x86/kvm/svm/svm.c | 2 +-
> > >  arch/x86/kvm/svm/svm.h | 6 +++++-
> > >  3 files changed, 9 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > > index 61a7531..cce8ec7 100644
> > > --- a/arch/x86/kvm/svm/sev.c
> > > +++ b/arch/x86/kvm/svm/sev.c
> > > @@ -3179,13 +3179,13 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
> > >       ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
> > >  }
> > >
> > > -struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
> > > +struct page *snp_safe_alloc_page_node(struct kvm_vcpu *vcpu, int node)
> > >  {
> > >       unsigned long pfn;
> > >       struct page *p;
> > >
> > >       if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
> > > -             return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> > > +             return alloc_pages_node(node, GFP_KERNEL_ACCOUNT | __GFP_ZERO, 0);
> > >
> > >       /*
> > >        * Allocate an SNP-safe page to workaround the SNP erratum where
> > > @@ -3196,7 +3196,7 @@ struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
> > >        * Allocate one extra page, choose a page which is not
> > >        * 2MB-aligned, and free the other.
> > >        */
> > > -     p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
> > > +     p = alloc_pages_node(node, GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
> >
> > This made me realize the existing code is buggy.  The allocation for the per-CPU
> > save area shouldn't be accounted.
> >
> > Also, what's the point of taking @vcpu?  It's a nice enough flag to say "this
> > should be accounted", but it's decidely odd.
> >
> > How about we fix both in a single series, and end up with this over 3-4 patches?
> > I.e. pass -1 where vcpu is non-NULL, and the current CPU for the save area.
>
> Looks good to me. Internally we already use GFP_KERNEL for these
> allocations. But we had an issue with split_page() and memcg
> accounting internally. Yosry submitted the following:
>
> +  if (memcg_kmem_charge(p, GFP_KERNEL_ACCOUNT, 0)) {
> +    __free_page(p);
> +    return NULL;
> +  }
>
> Not sure if this is an issue with our kernel or if we should use
> split_page_memcg() here? It was suggested internally but we didn't
> want to backport it.

The referenced internal code is in an older kernel where split_page()
was buggy and did not handle memcg charging correctly (did not call
split_page_memcg()). So we needed to make the allocation with
GFP_KERNEL, then manually account the used page after splitting and
freeing the unneeded page.

AFAICT, this is not needed upstream and the current code is fine.

>
> >
> > struct page *snp_safe_alloc_page(int cpu)
> > {
> >         unsigned long pfn;
> >         struct page *p;
> >         gfp_t gpf;
> >         int node;
> >
> >         if (cpu >= 0) {
> >                 node = cpu_to_node(cpu);
> >                 gfp = GFP_KERNEL;
> >         } else {
> >                 node = NUMA_NO_NODE;
> >                 gfp = GFP_KERNEL_ACCOUNT
> >         }

FWIW, from the pov of someone who has zero familiarity with this code,
passing @cpu only to make inferences about the GFP flags and numa
nodes is confusing tbh.

Would it be clearer to pass in the gfp flags and node id directly? I
think it would make it clearer why we choose to account the allocation
and/or specify a node in some cases but not others.

> >         gfp |= __GFP_ZERO;
> >
> >         if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
> >                 return alloc_pages_node(node, gfp, 0);
> >
> >         /*
> >          * Allocate an SNP-safe page to workaround the SNP erratum where
> >          * the CPU will incorrectly signal an RMP violation #PF if a
> >          * hugepage (2MB or 1GB) collides with the RMP entry of a
> >          * 2MB-aligned VMCB, VMSA, or AVIC backing page.
> >          *
> >          * Allocate one extra page, choose a page which is not
> >          * 2MB-aligned, and free the other.
> >          */
> >         p = alloc_pages_node(node, gfp, 1);
> >         if (!p)
> >                 return NULL;
> >
> >         split_page(p, 1);
> >
> >         pfn = page_to_pfn(p);
> >         if (IS_ALIGNED(pfn, PTRS_PER_PMD))
> >                 __free_page(p++);
> >         else
> >                 __free_page(p + 1);
> >
> >         return p;
> > }
> >

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area
  2024-04-23 18:00     ` Yosry Ahmed
@ 2024-04-23 18:43       ` Sean Christopherson
  2024-04-23 18:52         ` Yosry Ahmed
  0 siblings, 1 reply; 9+ messages in thread
From: Sean Christopherson @ 2024-04-23 18:43 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Peter Gonda, Li RongQing, pbonzini, kvm, Tom Lendacky,
	Michael Roth, David Rientjes

On Tue, Apr 23, 2024, Yosry Ahmed wrote:
> On Tue, Apr 23, 2024 at 6:30 AM Peter Gonda <pgonda@google.com> wrote:
> > > struct page *snp_safe_alloc_page(int cpu)
> > > {
> > >         unsigned long pfn;
> > >         struct page *p;
> > >         gfp_t gpf;
> > >         int node;
> > >
> > >         if (cpu >= 0) {
> > >                 node = cpu_to_node(cpu);
> > >                 gfp = GFP_KERNEL;
> > >         } else {
> > >                 node = NUMA_NO_NODE;
> > >                 gfp = GFP_KERNEL_ACCOUNT
> > >         }
> 
> FWIW, from the pov of someone who has zero familiarity with this code,
> passing @cpu only to make inferences about the GFP flags and numa
> nodes is confusing tbh.
> 
> Would it be clearer to pass in the gfp flags and node id directly? I
> think it would make it clearer why we choose to account the allocation
> and/or specify a node in some cases but not others.

Hmm, yeah, passing GFP directly makes sense, if only to align with alloc_page()
and not reinvent the wheel.  But forcing all callers to explicitly provide a node
ID is a net negative, i.e. having snp_safe_alloc_page() and snp_safe_alloc_page_node()
as originally suggested makes sense.

But snp_safe_alloc_page() should again flow alloc_pages() and pass numa_node_id(),
not NUMA_NO_NODE.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area
  2024-04-23 18:43       ` Sean Christopherson
@ 2024-04-23 18:52         ` Yosry Ahmed
  2024-04-23 19:00           ` Sean Christopherson
  0 siblings, 1 reply; 9+ messages in thread
From: Yosry Ahmed @ 2024-04-23 18:52 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Peter Gonda, Li RongQing, pbonzini, kvm, Tom Lendacky,
	Michael Roth, David Rientjes

On Tue, Apr 23, 2024 at 11:43 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Apr 23, 2024, Yosry Ahmed wrote:
> > On Tue, Apr 23, 2024 at 6:30 AM Peter Gonda <pgonda@google.com> wrote:
> > > > struct page *snp_safe_alloc_page(int cpu)
> > > > {
> > > >         unsigned long pfn;
> > > >         struct page *p;
> > > >         gfp_t gpf;
> > > >         int node;
> > > >
> > > >         if (cpu >= 0) {
> > > >                 node = cpu_to_node(cpu);
> > > >                 gfp = GFP_KERNEL;
> > > >         } else {
> > > >                 node = NUMA_NO_NODE;
> > > >                 gfp = GFP_KERNEL_ACCOUNT
> > > >         }
> >
> > FWIW, from the pov of someone who has zero familiarity with this code,
> > passing @cpu only to make inferences about the GFP flags and numa
> > nodes is confusing tbh.
> >
> > Would it be clearer to pass in the gfp flags and node id directly? I
> > think it would make it clearer why we choose to account the allocation
> > and/or specify a node in some cases but not others.
>
> Hmm, yeah, passing GFP directly makes sense, if only to align with alloc_page()
> and not reinvent the wheel.  But forcing all callers to explicitly provide a node
> ID is a net negative, i.e. having snp_safe_alloc_page() and snp_safe_alloc_page_node()
> as originally suggested makes sense.

Yeah if most callers do not need to provide a node then it makes sense
to have a specialized snp_safe_alloc_page_node() variant.

>
> But snp_safe_alloc_page() should again flow alloc_pages() and pass numa_node_id(),
> not NUMA_NO_NODE.

alloc_pages_node() will use numa_node_id() if you pass in
NUMA_NO_NODE. That's the documented behavior and it seems to be widely
used. I don't see anyone using numa_node_id() directly with
alloc_pages_node().

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area
  2024-04-23 18:52         ` Yosry Ahmed
@ 2024-04-23 19:00           ` Sean Christopherson
  2024-04-23 19:08             ` Yosry Ahmed
  0 siblings, 1 reply; 9+ messages in thread
From: Sean Christopherson @ 2024-04-23 19:00 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Peter Gonda, Li RongQing, pbonzini, kvm, Tom Lendacky,
	Michael Roth, David Rientjes

On Tue, Apr 23, 2024, Yosry Ahmed wrote:
> On Tue, Apr 23, 2024 at 11:43 AM Sean Christopherson <seanjc@google.com> wrote:
> > But snp_safe_alloc_page() should again flow alloc_pages() and pass numa_node_id(),
> > not NUMA_NO_NODE.
> 
> alloc_pages_node() will use numa_node_id() if you pass in NUMA_NO_NODE.

The code uses numa_mem_id()

	if (nid == NUMA_NO_NODE)
		nid = numa_mem_id();

which is presumably the exact same thing as numa_node_id() on x86.  But I don't
want to have to think that hard :-)

In other words, all I'm saying is that if we want to mirror alloc_pages() and
alloc_pages_node(), then we should mirror them exactly.

> That's the documented behavior and it seems to be widely
> used. I don't see anyone using numa_node_id() directly with
> alloc_pages_node().



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area
  2024-04-23 19:00           ` Sean Christopherson
@ 2024-04-23 19:08             ` Yosry Ahmed
  2024-04-23 20:38               ` Tom Lendacky
  0 siblings, 1 reply; 9+ messages in thread
From: Yosry Ahmed @ 2024-04-23 19:08 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Peter Gonda, Li RongQing, pbonzini, kvm, Tom Lendacky,
	Michael Roth, David Rientjes

On Tue, Apr 23, 2024 at 12:00 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Apr 23, 2024, Yosry Ahmed wrote:
> > On Tue, Apr 23, 2024 at 11:43 AM Sean Christopherson <seanjc@google.com> wrote:
> > > But snp_safe_alloc_page() should again flow alloc_pages() and pass numa_node_id(),
> > > not NUMA_NO_NODE.
> >
> > alloc_pages_node() will use numa_node_id() if you pass in NUMA_NO_NODE.
>
> The code uses numa_mem_id()
>
>         if (nid == NUMA_NO_NODE)
>                 nid = numa_mem_id();
>
> which is presumably the exact same thing as numa_node_id() on x86.  But I don't
> want to have to think that hard :-)

Uh yes, I missed numa_node_id() vs numa_mem_id(). Anyway, using
NUMA_NO_NODE with alloc_pages_node() is intended as an abstraction
such that you don't need to think about it :P

>
> In other words, all I'm saying is that if we want to mirror alloc_pages() and
> alloc_pages_node(), then we should mirror them exactly.

It's different though, these functions are core MM APIs used by the
entire kernel. snp_safe_alloc_page() is just a helper here wrapping
those core MM APIs rather than mirroring them.

In this case snp_safe_alloc_page() would wrap
snp_safe_alloc_page_node() which would wrap alloc_pages_node(). So it
should use alloc_pages_node() as intended: pass in a node id or
NUMA_NO_NODE if you just want the closest node.

Just my 2c, not that it will make a difference in practice.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area
  2024-04-23 19:08             ` Yosry Ahmed
@ 2024-04-23 20:38               ` Tom Lendacky
  0 siblings, 0 replies; 9+ messages in thread
From: Tom Lendacky @ 2024-04-23 20:38 UTC (permalink / raw)
  To: Yosry Ahmed, Sean Christopherson
  Cc: Peter Gonda, Li RongQing, pbonzini, kvm, Michael Roth,
	David Rientjes

On 4/23/24 14:08, Yosry Ahmed wrote:
> On Tue, Apr 23, 2024 at 12:00 PM Sean Christopherson <seanjc@google.com> wrote:
>> On Tue, Apr 23, 2024, Yosry Ahmed wrote:
>>> On Tue, Apr 23, 2024 at 11:43 AM Sean Christopherson <seanjc@google.com> wrote:
>>>> But snp_safe_alloc_page() should again flow alloc_pages() and pass numa_node_id(),
>>>> not NUMA_NO_NODE.
>>>
>>> alloc_pages_node() will use numa_node_id() if you pass in NUMA_NO_NODE.
>>
>> The code uses numa_mem_id()
>>
>>          if (nid == NUMA_NO_NODE)
>>                  nid = numa_mem_id();
>>
>> which is presumably the exact same thing as numa_node_id() on x86.  But I don't
>> want to have to think that hard :-)
> 
> Uh yes, I missed numa_node_id() vs numa_mem_id(). Anyway, using
> NUMA_NO_NODE with alloc_pages_node() is intended as an abstraction
> such that you don't need to think about it :P
> 
>>
>> In other words, all I'm saying is that if we want to mirror alloc_pages() and
>> alloc_pages_node(), then we should mirror them exactly.
> 
> It's different though, these functions are core MM APIs used by the
> entire kernel. snp_safe_alloc_page() is just a helper here wrapping
> those core MM APIs rather than mirroring them.
> 
> In this case snp_safe_alloc_page() would wrap
> snp_safe_alloc_page_node() which would wrap alloc_pages_node(). So it
> should use alloc_pages_node() as intended: pass in a node id or
> NUMA_NO_NODE if you just want the closest node.
> 
> Just my 2c, not that it will make a difference in practice.

Pretty much agree with everything everyone has said. The per-CPU 
shouldn't be GFP_KERNEL_ACCOUNT and having a snp_safe_alloc_page() that 
wraps snp_safe_alloc_page_node() which then calls alloc_pages_node() 
seems the best way to me.

Thanks,
Tom

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-04-23 20:38 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-18  3:07 [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area Li RongQing
2024-04-23  0:29 ` Sean Christopherson
2024-04-23 13:30   ` Peter Gonda
2024-04-23 18:00     ` Yosry Ahmed
2024-04-23 18:43       ` Sean Christopherson
2024-04-23 18:52         ` Yosry Ahmed
2024-04-23 19:00           ` Sean Christopherson
2024-04-23 19:08             ` Yosry Ahmed
2024-04-23 20:38               ` Tom Lendacky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.