linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH] KVM: arm64: Consider NUMA affinity when allocating per-CPU stack_page
@ 2024-04-15  3:36 Li RongQing
  2024-04-15  7:36 ` Marc Zyngier
  0 siblings, 1 reply; 4+ messages in thread
From: Li RongQing @ 2024-04-15  3:36 UTC
  To: maz, oliver.upton, james.morse, suzuki.poulose, yuzenghui,
	catalin.marinas, will, linux-arm-kernel, kvmarm
  Cc: Li RongQing

The per-CPU stack pages are predominantly accessed from their own local
CPUs, so allocate them node-local to improve performance.

Signed-off-by: Li RongQing <lirongqing@baidu.com>
---
 arch/arm64/kvm/arm.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index c4a0a35..d745d01 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -2330,15 +2330,15 @@ static int __init init_hyp_mode(void)
 	 * Allocate stack pages for Hypervisor-mode
 	 */
 	for_each_possible_cpu(cpu) {
-		unsigned long stack_page;
+		struct page *page;
 
-		stack_page = __get_free_page(GFP_KERNEL);
-		if (!stack_page) {
+		page = alloc_pages_node(cpu_to_node(cpu), GFP_KERNEL, 0);
+		if (!page) {
 			err = -ENOMEM;
 			goto out_err;
 		}
 
-		per_cpu(kvm_arm_hyp_stack_page, cpu) = stack_page;
+		per_cpu(kvm_arm_hyp_stack_page, cpu) = page_address(page);
 	}
 
 	/*
-- 
2.9.4


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
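
A note on the type change in the diff above: the old code stored the `unsigned long` returned by `__get_free_page()`, whereas `page_address()` returns a `void *`, so the new assignment may need an explicit cast unless the per-CPU variable's type changes as well. The following is a minimal userspace sketch of the same loop, not kernel code. `cpu_to_node_stub()`, `alloc_page_on_node_stub()`, the `*_SKETCH` constants, and the two-CPUs-per-node topology are hypothetical stand-ins, and `aligned_alloc()` only imitates a page allocation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_CPUS_SKETCH   4      /* assumption: 4 possible CPUs */
#define SKETCH_PAGE_SIZE 4096   /* assumption: 4 KiB pages */

/* Per-CPU slots; the stack slot is an unsigned long, as in the pre-patch code. */
static unsigned long hyp_stack_page[NR_CPUS_SKETCH];
static int hyp_stack_node[NR_CPUS_SKETCH];

/* Hypothetical stand-in for cpu_to_node(): two CPUs per NUMA node. */
static int cpu_to_node_stub(int cpu)
{
	return cpu / 2;
}

/*
 * Hypothetical stand-in for alloc_pages_node() followed by page_address():
 * returns a page-aligned buffer as a void pointer. The node is only a hint
 * here; a userspace allocator does not honour it.
 */
static void *alloc_page_on_node_stub(int node)
{
	(void)node;
	return aligned_alloc(SKETCH_PAGE_SIZE, SKETCH_PAGE_SIZE);
}

/*
 * Mirrors the for_each_possible_cpu() loop. Returns 0 on success, -1 on
 * failure (the cleanup done at out_err in the kernel is omitted here).
 */
static int alloc_hyp_stacks(void)
{
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		int node = cpu_to_node_stub(cpu);
		void *addr = alloc_page_on_node_stub(node);

		if (!addr)
			return -1;

		/* Explicit pointer-to-integer conversion, not a bare assignment. */
		hyp_stack_page[cpu] = (unsigned long)(uintptr_t)addr;
		hyp_stack_node[cpu] = node;

		/* A page allocation is expected to be page-aligned. */
		assert((hyp_stack_page[cpu] & (SKETCH_PAGE_SIZE - 1)) == 0);
	}
	return 0;
}
```

In the kernel loop the corresponding change would presumably be a cast along the lines of `(unsigned long)page_address(page)`; that is an assumption based on the types shown in the diff, not something confirmed in this thread.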


* Re: [PATCH] KVM: arm64: Consider NUMA affinity when allocating per-CPU stack_page
  2024-04-15  3:36 [PATCH] KVM: arm64: Consider NUMA affinity when allocating per-CPU stack_page Li RongQing
@ 2024-04-15  7:36 ` Marc Zyngier
  2024-04-18  6:53   ` Li,Rongqing
  0 siblings, 1 reply; 4+ messages in thread
From: Marc Zyngier @ 2024-04-15  7:36 UTC
  To: Li RongQing
  Cc: oliver.upton, james.morse, suzuki.poulose, yuzenghui,
	catalin.marinas, will, linux-arm-kernel, kvmarm

On Mon, 15 Apr 2024 04:36:14 +0100,
Li RongQing <lirongqing@baidu.com> wrote:
> 
> The per-CPU stack pages are predominantly accessed from their own local
> CPUs, so allocate them node-local to improve performance.

Do you have any performance data to back this up?

Given that this is only used in the non-VHE case, and that by doing so
you have left quite a lot of performance on the floor already, I'm
even more surprised to see the performance argument.

Furthermore, you don't address the allocation of per-CPU data, which
has a much larger potential impact, given how the HYP code is
structured.

> 
> Signed-off-by: Li RongQing <lirongqing@baidu.com>
> ---
>  arch/arm64/kvm/arm.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index c4a0a35..d745d01 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -2330,15 +2330,15 @@ static int __init init_hyp_mode(void)
>  	 * Allocate stack pages for Hypervisor-mode
>  	 */
>  	for_each_possible_cpu(cpu) {
> -		unsigned long stack_page;
> +		struct page *page;
>  
> -		stack_page = __get_free_page(GFP_KERNEL);
> -		if (!stack_page) {
> +		page = alloc_pages_node(cpu_to_node(cpu), GFP_KERNEL, 0);
> +		if (!page) {
>  			err = -ENOMEM;
>  			goto out_err;
>  		}
>  
> -		per_cpu(kvm_arm_hyp_stack_page, cpu) = stack_page;
> +		per_cpu(kvm_arm_hyp_stack_page, cpu) = page_address(page);
>  	}
>  
>  	/*

The patch itself looks correct, however partial and lacking some
convincing data.

	M.

-- 
Without deviation from the norm, progress is not possible.


* RE: [PATCH] KVM: arm64: Consider NUMA affinity when allocating per-CPU stack_page
  2024-04-15  7:36 ` Marc Zyngier
@ 2024-04-18  6:53   ` Li,Rongqing
  2024-04-18  9:03     ` Marc Zyngier
  0 siblings, 1 reply; 4+ messages in thread
From: Li,Rongqing @ 2024-04-18  6:53 UTC
  To: Marc Zyngier
  Cc: oliver.upton@linux.dev, james.morse@arm.com,
	suzuki.poulose@arm.com, yuzenghui@huawei.com,
	catalin.marinas@arm.com, will@kernel.org,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev



> Li RongQing <lirongqing@baidu.com> wrote:
> >
> > The per-CPU stack pages are predominantly accessed from their own local
> > CPUs, so allocate them node-local to improve performance.
> 
> Do you have any performance data to back this up?
> 
> Given that this is only used in the non-VHE case, and that by doing so you have
> left quite a lot of performance on the floor already, I'm even more surprised to
> see the performance argument.
> 

Sorry, I do not have a setup to test it.

> Furthermore, you don't address the allocation of per-CPU data, which has a
> much larger potential impact, given how the HYP code is structured.
> 

I will add it in V2

Thanks

-LiRongQing



* Re: [PATCH] KVM: arm64: Consider NUMA affinity when allocating per-CPU stack_page
  2024-04-18  6:53   ` Li,Rongqing
@ 2024-04-18  9:03     ` Marc Zyngier
  0 siblings, 0 replies; 4+ messages in thread
From: Marc Zyngier @ 2024-04-18  9:03 UTC
  To: Li,Rongqing
  Cc: oliver.upton@linux.dev, james.morse@arm.com,
	suzuki.poulose@arm.com, yuzenghui@huawei.com,
	catalin.marinas@arm.com, will@kernel.org,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev

On Thu, 18 Apr 2024 07:53:58 +0100,
"Li,Rongqing" <lirongqing@baidu.com> wrote:
> > Li RongQing <lirongqing@baidu.com> wrote:
> > >
> > > The per-CPU stack pages are predominantly accessed from their own local
> > > CPUs, so allocate them node-local to improve performance.
> > 
> > Do you have any performance data to back this up?
> > 
> > Given that this is only used in the non-VHE case, and that by doing so you have
> > left quite a lot of performance on the floor already, I'm even more surprised to
> > see the performance argument.
> > 
> 
> Sorry, I do not have a setup to test it.
> 
> > Furthermore, you don't address the allocation of per-CPU data, which has a
> > much larger potential impact, given how the HYP code is structured.
> > 
> 
> I will add it in V2

Please test your patch before posting it. I'm not interested in
looking at patches that have not been tested by their author.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
