* [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area
@ 2024-04-18 3:07 Li RongQing
2024-04-23 0:29 ` Sean Christopherson
0 siblings, 1 reply; 9+ messages in thread
From: Li RongQing @ 2024-04-18 3:07 UTC (permalink / raw)
To: seanjc, pbonzini, kvm; +Cc: Li RongQing
save_area of per-CPU svm_data are dominantly accessed from their
own local CPUs, so allocate them node-local for performance reason
Signed-off-by: Li RongQing <lirongqing@baidu.com>
---
arch/x86/kvm/svm/sev.c | 6 +++---
arch/x86/kvm/svm/svm.c | 2 +-
arch/x86/kvm/svm/svm.h | 6 +++++-
3 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 61a7531..cce8ec7 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3179,13 +3179,13 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
}
-struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
+struct page *snp_safe_alloc_page_node(struct kvm_vcpu *vcpu, int node)
{
unsigned long pfn;
struct page *p;
if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
- return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+ return alloc_pages_node(node, GFP_KERNEL_ACCOUNT | __GFP_ZERO, 0);
/*
* Allocate an SNP-safe page to workaround the SNP erratum where
@@ -3196,7 +3196,7 @@ struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
* Allocate one extra page, choose a page which is not
* 2MB-aligned, and free the other.
*/
- p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
+ p = alloc_pages_node(node, GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
if (!p)
return NULL;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d1a9f995..69fc809 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -703,7 +703,7 @@ static int svm_cpu_init(int cpu)
int ret = -ENOMEM;
memset(sd, 0, sizeof(struct svm_cpu_data));
- sd->save_area = snp_safe_alloc_page(NULL);
+ sd->save_area = snp_safe_alloc_page_node(NULL, cpu_to_node(cpu));
if (!sd->save_area)
return ret;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 7f1fbd8..3bbf638 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -694,8 +694,12 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm);
void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa);
void sev_es_unmap_ghcb(struct vcpu_svm *svm);
-struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu);
+struct page *snp_safe_alloc_page_node(struct kvm_vcpu *vcpu, int node);
+static inline struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
+{
+ return snp_safe_alloc_page_node(vcpu, NUMA_NO_NODE);
+}
/* vmenter.S */
void __svm_sev_es_vcpu_run(struct vcpu_svm *svm, bool spec_ctrl_intercepted);
--
2.9.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area
2024-04-18 3:07 [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area Li RongQing
@ 2024-04-23 0:29 ` Sean Christopherson
2024-04-23 13:30 ` Peter Gonda
0 siblings, 1 reply; 9+ messages in thread
From: Sean Christopherson @ 2024-04-23 0:29 UTC (permalink / raw)
To: Li RongQing; +Cc: pbonzini, kvm, Tom Lendacky, Michael Roth, Peter Gonda
+Tom, Mike, and Peter
On Thu, Apr 18, 2024, Li RongQing wrote:
> save_area of per-CPU svm_data are dominantly accessed from their
> own local CPUs, so allocate them node-local for performance reason
>
> Signed-off-by: Li RongQing <lirongqing@baidu.com>
> ---
> arch/x86/kvm/svm/sev.c | 6 +++---
> arch/x86/kvm/svm/svm.c | 2 +-
> arch/x86/kvm/svm/svm.h | 6 +++++-
> 3 files changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 61a7531..cce8ec7 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3179,13 +3179,13 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
> ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
> }
>
> -struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
> +struct page *snp_safe_alloc_page_node(struct kvm_vcpu *vcpu, int node)
> {
> unsigned long pfn;
> struct page *p;
>
> if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
> - return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> + return alloc_pages_node(node, GFP_KERNEL_ACCOUNT | __GFP_ZERO, 0);
>
> /*
> * Allocate an SNP-safe page to workaround the SNP erratum where
> @@ -3196,7 +3196,7 @@ struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
> * Allocate one extra page, choose a page which is not
> * 2MB-aligned, and free the other.
> */
> - p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
> + p = alloc_pages_node(node, GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
This made me realize the existing code is buggy. The allocation for the per-CPU
save area shouldn't be accounted.
Also, what's the point of taking @vcpu? It's a nice enough flag to say "this
should be accounted", but it's decidely odd.
How about we fix both in a single series, and end up with this over 3-4 patches?
I.e. pass -1 where vcpu is non-NULL, and the current CPU for the save area.
struct page *snp_safe_alloc_page(int cpu)
{
unsigned long pfn;
struct page *p;
gfp_t gpf;
int node;
if (cpu >= 0) {
node = cpu_to_node(cpu);
gfp = GFP_KERNEL;
} else {
node = NUMA_NO_NODE;
gfp = GFP_KERNEL_ACCOUNT
}
gfp |= __GFP_ZERO;
if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
return alloc_pages_node(node, gfp, 0);
/*
* Allocate an SNP-safe page to workaround the SNP erratum where
* the CPU will incorrectly signal an RMP violation #PF if a
* hugepage (2MB or 1GB) collides with the RMP entry of a
* 2MB-aligned VMCB, VMSA, or AVIC backing page.
*
* Allocate one extra page, choose a page which is not
* 2MB-aligned, and free the other.
*/
p = alloc_pages_node(node, gfp, 1);
if (!p)
return NULL;
split_page(p, 1);
pfn = page_to_pfn(p);
if (IS_ALIGNED(pfn, PTRS_PER_PMD))
__free_page(p++);
else
__free_page(p + 1);
return p;
}
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area
2024-04-23 0:29 ` Sean Christopherson
@ 2024-04-23 13:30 ` Peter Gonda
2024-04-23 18:00 ` Yosry Ahmed
0 siblings, 1 reply; 9+ messages in thread
From: Peter Gonda @ 2024-04-23 13:30 UTC (permalink / raw)
To: Sean Christopherson
Cc: Li RongQing, pbonzini, kvm, Tom Lendacky, Michael Roth,
David Rientjes, Yosry Ahmed
On Mon, Apr 22, 2024 at 6:29 PM Sean Christopherson <seanjc@google.com> wrote:
>
> +Tom, Mike, and Peter
>
> On Thu, Apr 18, 2024, Li RongQing wrote:
> > save_area of per-CPU svm_data are dominantly accessed from their
> > own local CPUs, so allocate them node-local for performance reason
> >
> > Signed-off-by: Li RongQing <lirongqing@baidu.com>
> > ---
> > arch/x86/kvm/svm/sev.c | 6 +++---
> > arch/x86/kvm/svm/svm.c | 2 +-
> > arch/x86/kvm/svm/svm.h | 6 +++++-
> > 3 files changed, 9 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > index 61a7531..cce8ec7 100644
> > --- a/arch/x86/kvm/svm/sev.c
> > +++ b/arch/x86/kvm/svm/sev.c
> > @@ -3179,13 +3179,13 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
> > ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
> > }
> >
> > -struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
> > +struct page *snp_safe_alloc_page_node(struct kvm_vcpu *vcpu, int node)
> > {
> > unsigned long pfn;
> > struct page *p;
> >
> > if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
> > - return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> > + return alloc_pages_node(node, GFP_KERNEL_ACCOUNT | __GFP_ZERO, 0);
> >
> > /*
> > * Allocate an SNP-safe page to workaround the SNP erratum where
> > @@ -3196,7 +3196,7 @@ struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
> > * Allocate one extra page, choose a page which is not
> > * 2MB-aligned, and free the other.
> > */
> > - p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
> > + p = alloc_pages_node(node, GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
>
> This made me realize the existing code is buggy. The allocation for the per-CPU
> save area shouldn't be accounted.
>
> Also, what's the point of taking @vcpu? It's a nice enough flag to say "this
> should be accounted", but it's decidely odd.
>
> How about we fix both in a single series, and end up with this over 3-4 patches?
> I.e. pass -1 where vcpu is non-NULL, and the current CPU for the save area.
Looks good to me. Internally we already use GFP_KERNEL for these
allocations. But we had an issue with split_page() and memcg
accounting internally. Yosry submitted the following:
+ if (memcg_kmem_charge(p, GFP_KERNEL_ACCOUNT, 0)) {
+ __free_page(p);
+ return NULL;
+ }
Not sure if this is an issue with our kernel or if we should use
split_page_memcg() here? It was suggested internally but we didn't
want to backport it.
>
> struct page *snp_safe_alloc_page(int cpu)
> {
> unsigned long pfn;
> struct page *p;
> gfp_t gpf;
> int node;
>
> if (cpu >= 0) {
> node = cpu_to_node(cpu);
> gfp = GFP_KERNEL;
> } else {
> node = NUMA_NO_NODE;
> gfp = GFP_KERNEL_ACCOUNT
> }
> gfp |= __GFP_ZERO;
>
> if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
> return alloc_pages_node(node, gfp, 0);
>
> /*
> * Allocate an SNP-safe page to workaround the SNP erratum where
> * the CPU will incorrectly signal an RMP violation #PF if a
> * hugepage (2MB or 1GB) collides with the RMP entry of a
> * 2MB-aligned VMCB, VMSA, or AVIC backing page.
> *
> * Allocate one extra page, choose a page which is not
> * 2MB-aligned, and free the other.
> */
> p = alloc_pages_node(node, gfp, 1);
> if (!p)
> return NULL;
>
> split_page(p, 1);
>
> pfn = page_to_pfn(p);
> if (IS_ALIGNED(pfn, PTRS_PER_PMD))
> __free_page(p++);
> else
> __free_page(p + 1);
>
> return p;
> }
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area
2024-04-23 13:30 ` Peter Gonda
@ 2024-04-23 18:00 ` Yosry Ahmed
2024-04-23 18:43 ` Sean Christopherson
0 siblings, 1 reply; 9+ messages in thread
From: Yosry Ahmed @ 2024-04-23 18:00 UTC (permalink / raw)
To: Peter Gonda
Cc: Sean Christopherson, Li RongQing, pbonzini, kvm, Tom Lendacky,
Michael Roth, David Rientjes
On Tue, Apr 23, 2024 at 6:30 AM Peter Gonda <pgonda@google.com> wrote:
>
> On Mon, Apr 22, 2024 at 6:29 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > +Tom, Mike, and Peter
> >
> > On Thu, Apr 18, 2024, Li RongQing wrote:
> > > save_area of per-CPU svm_data are dominantly accessed from their
> > > own local CPUs, so allocate them node-local for performance reason
> > >
> > > Signed-off-by: Li RongQing <lirongqing@baidu.com>
> > > ---
> > > arch/x86/kvm/svm/sev.c | 6 +++---
> > > arch/x86/kvm/svm/svm.c | 2 +-
> > > arch/x86/kvm/svm/svm.h | 6 +++++-
> > > 3 files changed, 9 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > > index 61a7531..cce8ec7 100644
> > > --- a/arch/x86/kvm/svm/sev.c
> > > +++ b/arch/x86/kvm/svm/sev.c
> > > @@ -3179,13 +3179,13 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
> > > ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
> > > }
> > >
> > > -struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
> > > +struct page *snp_safe_alloc_page_node(struct kvm_vcpu *vcpu, int node)
> > > {
> > > unsigned long pfn;
> > > struct page *p;
> > >
> > > if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
> > > - return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> > > + return alloc_pages_node(node, GFP_KERNEL_ACCOUNT | __GFP_ZERO, 0);
> > >
> > > /*
> > > * Allocate an SNP-safe page to workaround the SNP erratum where
> > > @@ -3196,7 +3196,7 @@ struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
> > > * Allocate one extra page, choose a page which is not
> > > * 2MB-aligned, and free the other.
> > > */
> > > - p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
> > > + p = alloc_pages_node(node, GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
> >
> > This made me realize the existing code is buggy. The allocation for the per-CPU
> > save area shouldn't be accounted.
> >
> > Also, what's the point of taking @vcpu? It's a nice enough flag to say "this
> > should be accounted", but it's decidely odd.
> >
> > How about we fix both in a single series, and end up with this over 3-4 patches?
> > I.e. pass -1 where vcpu is non-NULL, and the current CPU for the save area.
>
> Looks good to me. Internally we already use GFP_KERNEL for these
> allocations. But we had an issue with split_page() and memcg
> accounting internally. Yosry submitted the following:
>
> + if (memcg_kmem_charge(p, GFP_KERNEL_ACCOUNT, 0)) {
> + __free_page(p);
> + return NULL;
> + }
>
> Not sure if this is an issue with our kernel or if we should use
> split_page_memcg() here? It was suggested internally but we didn't
> want to backport it.
The referenced internal code is in an older kernel where split_page()
was buggy and did not handle memcg charging correctly (did not call
split_page_memcg()). So we needed to make the allocation with
GFP_KERNEL, then manually account the used page after splitting and
freeing the unneeded page.
AFAICT, this is not needed upstream and the current code is fine.
>
> >
> > struct page *snp_safe_alloc_page(int cpu)
> > {
> > unsigned long pfn;
> > struct page *p;
> > gfp_t gpf;
> > int node;
> >
> > if (cpu >= 0) {
> > node = cpu_to_node(cpu);
> > gfp = GFP_KERNEL;
> > } else {
> > node = NUMA_NO_NODE;
> > gfp = GFP_KERNEL_ACCOUNT
> > }
FWIW, from the pov of someone who has zero familiarity with this code,
passing @cpu only to make inferences about the GFP flags and numa
nodes is confusing tbh.
Would it be clearer to pass in the gfp flags and node id directly? I
think it would make it clearer why we choose to account the allocation
and/or specify a node in some cases but not others.
> > gfp |= __GFP_ZERO;
> >
> > if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
> > return alloc_pages_node(node, gfp, 0);
> >
> > /*
> > * Allocate an SNP-safe page to workaround the SNP erratum where
> > * the CPU will incorrectly signal an RMP violation #PF if a
> > * hugepage (2MB or 1GB) collides with the RMP entry of a
> > * 2MB-aligned VMCB, VMSA, or AVIC backing page.
> > *
> > * Allocate one extra page, choose a page which is not
> > * 2MB-aligned, and free the other.
> > */
> > p = alloc_pages_node(node, gfp, 1);
> > if (!p)
> > return NULL;
> >
> > split_page(p, 1);
> >
> > pfn = page_to_pfn(p);
> > if (IS_ALIGNED(pfn, PTRS_PER_PMD))
> > __free_page(p++);
> > else
> > __free_page(p + 1);
> >
> > return p;
> > }
> >
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area
2024-04-23 18:00 ` Yosry Ahmed
@ 2024-04-23 18:43 ` Sean Christopherson
2024-04-23 18:52 ` Yosry Ahmed
0 siblings, 1 reply; 9+ messages in thread
From: Sean Christopherson @ 2024-04-23 18:43 UTC (permalink / raw)
To: Yosry Ahmed
Cc: Peter Gonda, Li RongQing, pbonzini, kvm, Tom Lendacky,
Michael Roth, David Rientjes
On Tue, Apr 23, 2024, Yosry Ahmed wrote:
> On Tue, Apr 23, 2024 at 6:30 AM Peter Gonda <pgonda@google.com> wrote:
> > > struct page *snp_safe_alloc_page(int cpu)
> > > {
> > > unsigned long pfn;
> > > struct page *p;
> > > gfp_t gpf;
> > > int node;
> > >
> > > if (cpu >= 0) {
> > > node = cpu_to_node(cpu);
> > > gfp = GFP_KERNEL;
> > > } else {
> > > node = NUMA_NO_NODE;
> > > gfp = GFP_KERNEL_ACCOUNT
> > > }
>
> FWIW, from the pov of someone who has zero familiarity with this code,
> passing @cpu only to make inferences about the GFP flags and numa
> nodes is confusing tbh.
>
> Would it be clearer to pass in the gfp flags and node id directly? I
> think it would make it clearer why we choose to account the allocation
> and/or specify a node in some cases but not others.
Hmm, yeah, passing GFP directly makes sense, if only to align with alloc_page()
and not reinvent the wheel. But forcing all callers to explicitly provide a node
ID is a net negative, i.e. having snp_safe_alloc_page() and snp_safe_alloc_page_node()
as originally suggested makes sense.
But snp_safe_alloc_page() should again flow alloc_pages() and pass numa_node_id(),
not NUMA_NO_NODE.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area
2024-04-23 18:43 ` Sean Christopherson
@ 2024-04-23 18:52 ` Yosry Ahmed
2024-04-23 19:00 ` Sean Christopherson
0 siblings, 1 reply; 9+ messages in thread
From: Yosry Ahmed @ 2024-04-23 18:52 UTC (permalink / raw)
To: Sean Christopherson
Cc: Peter Gonda, Li RongQing, pbonzini, kvm, Tom Lendacky,
Michael Roth, David Rientjes
On Tue, Apr 23, 2024 at 11:43 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Apr 23, 2024, Yosry Ahmed wrote:
> > On Tue, Apr 23, 2024 at 6:30 AM Peter Gonda <pgonda@google.com> wrote:
> > > > struct page *snp_safe_alloc_page(int cpu)
> > > > {
> > > > unsigned long pfn;
> > > > struct page *p;
> > > > gfp_t gpf;
> > > > int node;
> > > >
> > > > if (cpu >= 0) {
> > > > node = cpu_to_node(cpu);
> > > > gfp = GFP_KERNEL;
> > > > } else {
> > > > node = NUMA_NO_NODE;
> > > > gfp = GFP_KERNEL_ACCOUNT
> > > > }
> >
> > FWIW, from the pov of someone who has zero familiarity with this code,
> > passing @cpu only to make inferences about the GFP flags and numa
> > nodes is confusing tbh.
> >
> > Would it be clearer to pass in the gfp flags and node id directly? I
> > think it would make it clearer why we choose to account the allocation
> > and/or specify a node in some cases but not others.
>
> Hmm, yeah, passing GFP directly makes sense, if only to align with alloc_page()
> and not reinvent the wheel. But forcing all callers to explicitly provide a node
> ID is a net negative, i.e. having snp_safe_alloc_page() and snp_safe_alloc_page_node()
> as originally suggested makes sense.
Yeah if most callers do not need to provide a node then it makes sense
to have a specialized snp_safe_alloc_page_node() variant.
>
> But snp_safe_alloc_page() should again flow alloc_pages() and pass numa_node_id(),
> not NUMA_NO_NODE.
alloc_pages_node() will use numa_node_id() if you pass in
NUMA_NO_NODE. That's the documented behavior and it seems to be widely
used. I don't see anyone using numa_node_id() directly with
alloc_pages_node().
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area
2024-04-23 18:52 ` Yosry Ahmed
@ 2024-04-23 19:00 ` Sean Christopherson
2024-04-23 19:08 ` Yosry Ahmed
0 siblings, 1 reply; 9+ messages in thread
From: Sean Christopherson @ 2024-04-23 19:00 UTC (permalink / raw)
To: Yosry Ahmed
Cc: Peter Gonda, Li RongQing, pbonzini, kvm, Tom Lendacky,
Michael Roth, David Rientjes
On Tue, Apr 23, 2024, Yosry Ahmed wrote:
> On Tue, Apr 23, 2024 at 11:43 AM Sean Christopherson <seanjc@google.com> wrote:
> > But snp_safe_alloc_page() should again flow alloc_pages() and pass numa_node_id(),
> > not NUMA_NO_NODE.
>
> alloc_pages_node() will use numa_node_id() if you pass in NUMA_NO_NODE.
The code uses numa_mem_id()
if (nid == NUMA_NO_NODE)
nid = numa_mem_id();
which is presumably the exact same thing as numa_node_id() on x86. But I don't
want to have to think that hard :-)
In other words, all I'm saying is that if we want to mirror alloc_pages() and
alloc_pages_node(), then we should mirror them exactly.
> That's the documented behavior and it seems to be widely
> used. I don't see anyone using numa_node_id() directly with
> alloc_pages_node().
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area
2024-04-23 19:00 ` Sean Christopherson
@ 2024-04-23 19:08 ` Yosry Ahmed
2024-04-23 20:38 ` Tom Lendacky
0 siblings, 1 reply; 9+ messages in thread
From: Yosry Ahmed @ 2024-04-23 19:08 UTC (permalink / raw)
To: Sean Christopherson
Cc: Peter Gonda, Li RongQing, pbonzini, kvm, Tom Lendacky,
Michael Roth, David Rientjes
On Tue, Apr 23, 2024 at 12:00 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Apr 23, 2024, Yosry Ahmed wrote:
> > On Tue, Apr 23, 2024 at 11:43 AM Sean Christopherson <seanjc@google.com> wrote:
> > > But snp_safe_alloc_page() should again flow alloc_pages() and pass numa_node_id(),
> > > not NUMA_NO_NODE.
> >
> > alloc_pages_node() will use numa_node_id() if you pass in NUMA_NO_NODE.
>
> The code uses numa_mem_id()
>
> if (nid == NUMA_NO_NODE)
> nid = numa_mem_id();
>
> which is presumably the exact same thing as numa_node_id() on x86. But I don't
> want to have to think that hard :-)
Uh yes, I missed numa_node_id() vs numa_mem_id(). Anyway, using
NUMA_NO_NODE with alloc_pages_node() is intended as an abstraction
such that you don't need to think about it :P
>
> In other words, all I'm saying is that if we want to mirror alloc_pages() and
> alloc_pages_node(), then we should mirror them exactly.
It's different though, these functions are core MM APIs used by the
entire kernel. snp_safe_alloc_page() is just a helper here wrapping
those core MM APIs rather than mirroring them.
In this case snp_safe_alloc_page() would wrap
snp_safe_alloc_page_node() which would wrap alloc_pages_node(). So it
should use alloc_pages_node() as intended: pass in a node id or
NUMA_NO_NODE if you just want the closest node.
Just my 2c, not that it will make a difference in practice.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area
2024-04-23 19:08 ` Yosry Ahmed
@ 2024-04-23 20:38 ` Tom Lendacky
0 siblings, 0 replies; 9+ messages in thread
From: Tom Lendacky @ 2024-04-23 20:38 UTC (permalink / raw)
To: Yosry Ahmed, Sean Christopherson
Cc: Peter Gonda, Li RongQing, pbonzini, kvm, Michael Roth,
David Rientjes
On 4/23/24 14:08, Yosry Ahmed wrote:
> On Tue, Apr 23, 2024 at 12:00 PM Sean Christopherson <seanjc@google.com> wrote:
>> On Tue, Apr 23, 2024, Yosry Ahmed wrote:
>>> On Tue, Apr 23, 2024 at 11:43 AM Sean Christopherson <seanjc@google.com> wrote:
>>>> But snp_safe_alloc_page() should again flow alloc_pages() and pass numa_node_id(),
>>>> not NUMA_NO_NODE.
>>>
>>> alloc_pages_node() will use numa_node_id() if you pass in NUMA_NO_NODE.
>>
>> The code uses numa_mem_id()
>>
>> if (nid == NUMA_NO_NODE)
>> nid = numa_mem_id();
>>
>> which is presumably the exact same thing as numa_node_id() on x86. But I don't
>> want to have to think that hard :-)
>
> Uh yes, I missed numa_node_id() vs numa_mem_id(). Anyway, using
> NUMA_NO_NODE with alloc_pages_node() is intended as an abstraction
> such that you don't need to think about it :P
>
>>
>> In other words, all I'm saying is that if we want to mirror alloc_pages() and
>> alloc_pages_node(), then we should mirror them exactly.
>
> It's different though, these functions are core MM APIs used by the
> entire kernel. snp_safe_alloc_page() is just a helper here wrapping
> those core MM APIs rather than mirroring them.
>
> In this case snp_safe_alloc_page() would wrap
> snp_safe_alloc_page_node() which would wrap alloc_pages_node(). So it
> should use alloc_pages_node() as intended: pass in a node id or
> NUMA_NO_NODE if you just want the closest node.
>
> Just my 2c, not that it will make a difference in practice.
Pretty much agree with everything everyone has said. The per-CPU
shouldn't be GFP_KERNEL_ACCOUNT and having a snp_safe_alloc_page() that
wraps snp_safe_alloc_page_node() which then calls alloc_pages_node()
seems the best way to me.
Thanks,
Tom
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-04-23 20:38 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-18 3:07 [PATCH] KVM: SVM: Consider NUMA affinity when allocating per-CPU save_area Li RongQing
2024-04-23 0:29 ` Sean Christopherson
2024-04-23 13:30 ` Peter Gonda
2024-04-23 18:00 ` Yosry Ahmed
2024-04-23 18:43 ` Sean Christopherson
2024-04-23 18:52 ` Yosry Ahmed
2024-04-23 19:00 ` Sean Christopherson
2024-04-23 19:08 ` Yosry Ahmed
2024-04-23 20:38 ` Tom Lendacky
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.