linux-arch.vger.kernel.org archive mirror
* [PATCH] arm64: irq: set the correct node for VMAP stack
@ 2023-11-14  9:16 Huang Shijie
  2023-11-15 14:50 ` kernel test robot
  2023-11-16 17:18 ` Catalin Marinas
  0 siblings, 2 replies; 10+ messages in thread
From: Huang Shijie @ 2023-11-14  9:16 UTC (permalink / raw)
  To: catalin.marinas
  Cc: will, gregkh, rafael, arnd, mark.rutland, broonie, keescook,
	linux-arm-kernel, linux-kernel, linux-arch, patches, Huang Shijie

Currently, init_irq_stacks() calls cpu_to_node(), which depends on the
per-CPU "numa_node" variable. That variable is initialized in:
     arch_call_rest_init() --> rest_init() --> kernel_init()
	--> kernel_init_freeable() --> smp_prepare_cpus()

But init_irq_stacks() is called from init_IRQ(), which runs before
arch_call_rest_init().

So in init_irq_stacks(), cpu_to_node() does not work yet: it always
returns 0. On a NUMA system, this means a CPU on node 1 ends up using an
IRQ stack allocated on node 0.

This patch fixes it by making early_cpu_to_node() non-static and using
it in init_irq_stacks().

Signed-off-by: Huang Shijie <shijie@os.amperecomputing.com>
---
 arch/arm64/kernel/irq.c    | 2 +-
 drivers/base/arch_numa.c   | 2 +-
 include/asm-generic/numa.h | 1 +
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 6ad5c6ef5329..e62d3cb3f74c 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -57,7 +57,7 @@ static void init_irq_stacks(void)
 	unsigned long *p;
 
 	for_each_possible_cpu(cpu) {
-		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, cpu_to_node(cpu));
+		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, early_cpu_to_node(cpu));
 		per_cpu(irq_stack_ptr, cpu) = p;
 	}
 }
diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
index eaa31e567d1e..90519d981471 100644
--- a/drivers/base/arch_numa.c
+++ b/drivers/base/arch_numa.c
@@ -144,7 +144,7 @@ void __init early_map_cpu_to_node(unsigned int cpu, int nid)
 unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
 EXPORT_SYMBOL(__per_cpu_offset);
 
-static int __init early_cpu_to_node(int cpu)
+int early_cpu_to_node(int cpu)
 {
 	return cpu_to_node_map[cpu];
 }
diff --git a/include/asm-generic/numa.h b/include/asm-generic/numa.h
index 1a3ad6d29833..fc8a9bd6a444 100644
--- a/include/asm-generic/numa.h
+++ b/include/asm-generic/numa.h
@@ -38,6 +38,7 @@ void __init early_map_cpu_to_node(unsigned int cpu, int nid);
 void numa_store_cpu_info(unsigned int cpu);
 void numa_add_cpu(unsigned int cpu);
 void numa_remove_cpu(unsigned int cpu);
+int early_cpu_to_node(int cpu);
 
 #else	/* CONFIG_NUMA */
 
-- 
2.40.1
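
The ordering problem described in the commit message can be illustrated with a small userspace sketch. This is not kernel code: the arrays and function bodies below are simplified stand-ins that reuse the kernel names for readability, and the body of numa_store_cpu_info() is an assumption about how the late copy works. It only shows why a lookup backed by a not-yet-initialized per-CPU value returns 0, while a lookup backed by the early map already returns the right node.

```c
#include <assert.h>

#define NR_CPUS 4

/* Stand-in for the per-CPU "numa_node" variable: stays 0 until late boot. */
int percpu_numa_node[NR_CPUS];

/* Stand-in for cpu_to_node_map[]: filled early, from firmware tables. */
int cpu_to_node_map[NR_CPUS];

/* Early boot: record the CPU -> node mapping. */
void early_map_cpu_to_node(int cpu, int nid)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_to_node_map[cpu] = nid;
}

/* Late boot (assumed behaviour): copy the early mapping into the per-CPU value. */
void numa_store_cpu_info(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	percpu_numa_node[cpu] = cpu_to_node_map[cpu];
}

/* Lookup that is only valid after numa_store_cpu_info() has run. */
int cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return percpu_numa_node[cpu];
}

/* Lookup that is valid as soon as early_map_cpu_to_node() has run. */
int early_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu_to_node_map[cpu];
}
```

In this model, calling early_map_cpu_to_node(2, 1) and then querying CPU 2 gives node 0 from cpu_to_node() but node 1 from early_cpu_to_node(); the two only agree once numa_store_cpu_info(2) has run, which corresponds to a point after init_IRQ() in the real boot sequence.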



* Re: [PATCH] arm64: irq: set the correct node for VMAP stack
  2023-11-14  9:16 [PATCH] arm64: irq: set the correct node for VMAP stack Huang Shijie
@ 2023-11-15 14:50 ` kernel test robot
  2023-11-16 17:18 ` Catalin Marinas
  1 sibling, 0 replies; 10+ messages in thread
From: kernel test robot @ 2023-11-15 14:50 UTC (permalink / raw)
  To: Huang Shijie, catalin.marinas
  Cc: oe-kbuild-all, will, gregkh, rafael, arnd, mark.rutland, broonie,
	keescook, linux-arm-kernel, linux-kernel, linux-arch, patches,
	Huang Shijie

Hi Huang,

kernel test robot noticed the following build errors:

[auto build test ERROR on arm64/for-next/core]
[also build test ERROR on driver-core/driver-core-testing driver-core/driver-core-next driver-core/driver-core-linus arnd-asm-generic/master linus/master v6.7-rc1 next-20231115]
[If your patch was applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Huang-Shijie/arm64-irq-set-the-correct-node-for-VMAP-stack/20231114-171932
base:   https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/core
patch link:    https://lore.kernel.org/r/20231114091643.59530-1-shijie%40os.amperecomputing.com
patch subject: [PATCH] arm64: irq: set the correct node for VMAP stack
config: arm64-randconfig-001-20231115 (https://download.01.org/0day-ci/archive/20231115/202311152250.ozO781vZ-lkp@intel.com/config)
compiler: aarch64-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231115/202311152250.ozO781vZ-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202311152250.ozO781vZ-lkp@intel.com/

All errors (new ones prefixed by >>):

   arch/arm64/kernel/irq.c: In function 'init_irq_stacks':
>> arch/arm64/kernel/irq.c:60:59: error: implicit declaration of function 'early_cpu_to_node'; did you mean 'early_pfn_to_nid'? [-Werror=implicit-function-declaration]
      60 |                 p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, early_cpu_to_node(cpu));
         |                                                           ^~~~~~~~~~~~~~~~~
         |                                                           early_pfn_to_nid
   cc1: some warnings being treated as errors


vim +60 arch/arm64/kernel/irq.c

    52	
    53	#ifdef CONFIG_VMAP_STACK
    54	static void init_irq_stacks(void)
    55	{
    56		int cpu;
    57		unsigned long *p;
    58	
    59		for_each_possible_cpu(cpu) {
  > 60			p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, early_cpu_to_node(cpu));
    61			per_cpu(irq_stack_ptr, cpu) = p;
    62		}
    63	}
    64	#else
    65	/* irq stack only needs to be 16 byte aligned - not IRQ_STACK_SIZE aligned. */
    66	DEFINE_PER_CPU_ALIGNED(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack);
    67	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH] arm64: irq: set the correct node for VMAP stack
  2023-11-14  9:16 [PATCH] arm64: irq: set the correct node for VMAP stack Huang Shijie
  2023-11-15 14:50 ` kernel test robot
@ 2023-11-16 17:18 ` Catalin Marinas
  2023-11-17  2:50   ` Shijie Huang
                     ` (2 more replies)
  1 sibling, 3 replies; 10+ messages in thread
From: Catalin Marinas @ 2023-11-16 17:18 UTC (permalink / raw)
  To: Huang Shijie
  Cc: will, gregkh, rafael, arnd, mark.rutland, broonie, keescook,
	linux-arm-kernel, linux-kernel, linux-arch, patches

On Tue, Nov 14, 2023 at 05:16:43PM +0800, Huang Shijie wrote:
> diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
> index 6ad5c6ef5329..e62d3cb3f74c 100644
> --- a/arch/arm64/kernel/irq.c
> +++ b/arch/arm64/kernel/irq.c
> @@ -57,7 +57,7 @@ static void init_irq_stacks(void)
>  	unsigned long *p;
>  
>  	for_each_possible_cpu(cpu) {
> -		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, cpu_to_node(cpu));
> +		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, early_cpu_to_node(cpu));
>  		per_cpu(irq_stack_ptr, cpu) = p;
>  	}
>  }

This looks alright to me, I don't have a better suggestion. The generic
code already has the cpu_to_node_map[] array populated by
early_map_cpu_to_node(), so let's reuse it.

> diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
> index eaa31e567d1e..90519d981471 100644
> --- a/drivers/base/arch_numa.c
> +++ b/drivers/base/arch_numa.c
> @@ -144,7 +144,7 @@ void __init early_map_cpu_to_node(unsigned int cpu, int nid)
>  unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
>  EXPORT_SYMBOL(__per_cpu_offset);
>  
> -static int __init early_cpu_to_node(int cpu)
> +int early_cpu_to_node(int cpu)
>  {
>  	return cpu_to_node_map[cpu];
>  }
> diff --git a/include/asm-generic/numa.h b/include/asm-generic/numa.h
> index 1a3ad6d29833..fc8a9bd6a444 100644
> --- a/include/asm-generic/numa.h
> +++ b/include/asm-generic/numa.h
> @@ -38,6 +38,7 @@ void __init early_map_cpu_to_node(unsigned int cpu, int nid);
>  void numa_store_cpu_info(unsigned int cpu);
>  void numa_add_cpu(unsigned int cpu);
>  void numa_remove_cpu(unsigned int cpu);
> +int early_cpu_to_node(int cpu);

Here I'd move this just below early_map_cpu_to_node() and, for
completeness, also add the dummy static inline for the !NUMA case.

-- 
Catalin
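
The "dummy static inline for the !NUMA case" suggested above follows a common header pattern: the real lookup is available only when NUMA support is configured, and a stub returning node 0 keeps callers compiling otherwise. A minimal userspace sketch, assuming a hypothetical DEMO_CONFIG_NUMA macro and demo_* names that do not exist in the kernel:

```c
#include <assert.h>

#define DEMO_NR_CPUS 4

#ifdef DEMO_CONFIG_NUMA

/* Hypothetical early map; in this sketch CPUs 2 and 3 sit on node 1. */
static int demo_cpu_to_node_map[DEMO_NR_CPUS] = { 0, 0, 1, 1 };

/* NUMA build: look the node up in the early map. */
static inline int demo_early_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	return demo_cpu_to_node_map[cpu];
}

#else	/* !DEMO_CONFIG_NUMA */

/* Non-NUMA build: there is only one node, so every CPU maps to node 0. */
static inline int demo_early_cpu_to_node(int cpu)
{
	(void)cpu;
	return 0;
}

#endif	/* DEMO_CONFIG_NUMA */

/* A caller that compiles unchanged in both configurations. */
int demo_stack_node(int cpu)
{
	return demo_early_cpu_to_node(cpu);
}
```

Without DEMO_CONFIG_NUMA defined, demo_stack_node() returns 0 for every CPU, so the caller needs no #ifdef of its own. Without such a stub, a !NUMA build of the caller would have no definition to use at all.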


* Re: [PATCH] arm64: irq: set the correct node for VMAP stack
  2023-11-16 17:18 ` Catalin Marinas
@ 2023-11-17  2:50   ` Shijie Huang
  2023-11-18 15:47   ` [PATCH v2] " Huang Shijie
  2023-11-18 16:02   ` [PATCH v3] " Huang Shijie
  2 siblings, 0 replies; 10+ messages in thread
From: Shijie Huang @ 2023-11-17  2:50 UTC (permalink / raw)
  To: Catalin Marinas, Huang Shijie
  Cc: will, gregkh, rafael, arnd, mark.rutland, broonie, keescook,
	linux-arm-kernel, linux-kernel, linux-arch, patches

Hi Catalin,

On 2023/11/17 1:18, Catalin Marinas wrote:
> On Tue, Nov 14, 2023 at 05:16:43PM +0800, Huang Shijie wrote:
>> diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
>> index 6ad5c6ef5329..e62d3cb3f74c 100644
>> --- a/arch/arm64/kernel/irq.c
>> +++ b/arch/arm64/kernel/irq.c
>> @@ -57,7 +57,7 @@ static void init_irq_stacks(void)
>>   	unsigned long *p;
>>   
>>   	for_each_possible_cpu(cpu) {
>> -		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, cpu_to_node(cpu));
>> +		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, early_cpu_to_node(cpu));
>>   		per_cpu(irq_stack_ptr, cpu) = p;
>>   	}
>>   }
> This looks alright to me, I don't have a better suggestion. The generic
> code already has the cpu_to_node_map[] array populated by
> early_map_cpu_to_node(), so let's reuse it.
>
>> diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
>> index eaa31e567d1e..90519d981471 100644
>> --- a/drivers/base/arch_numa.c
>> +++ b/drivers/base/arch_numa.c
>> @@ -144,7 +144,7 @@ void __init early_map_cpu_to_node(unsigned int cpu, int nid)
>>   unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
>>   EXPORT_SYMBOL(__per_cpu_offset);
>>   
>> -static int __init early_cpu_to_node(int cpu)
>> +int early_cpu_to_node(int cpu)
>>   {
>>   	return cpu_to_node_map[cpu];
>>   }
>> diff --git a/include/asm-generic/numa.h b/include/asm-generic/numa.h
>> index 1a3ad6d29833..fc8a9bd6a444 100644
>> --- a/include/asm-generic/numa.h
>> +++ b/include/asm-generic/numa.h
>> @@ -38,6 +38,7 @@ void __init early_map_cpu_to_node(unsigned int cpu, int nid);
>>   void numa_store_cpu_info(unsigned int cpu);
>>   void numa_add_cpu(unsigned int cpu);
>>   void numa_remove_cpu(unsigned int cpu);
>> +int early_cpu_to_node(int cpu);
> Here I'd move this just below early_map_cpu_to_node() and, for
> completeness, also add the dummy static inline for the !NUMA case.

Thanks a lot.  It seems there is no need for me to send the V2 for this.


Thanks

Huang Shijie




* [PATCH v2] arm64: irq: set the correct node for VMAP stack
  2023-11-16 17:18 ` Catalin Marinas
  2023-11-17  2:50   ` Shijie Huang
@ 2023-11-18 15:47   ` Huang Shijie
  2023-11-18 16:02   ` [PATCH v3] " Huang Shijie
  2 siblings, 0 replies; 10+ messages in thread
From: Huang Shijie @ 2023-11-18 15:47 UTC (permalink / raw)
  To: catalin.marinas
  Cc: will, gregkh, rafael, arnd, mark.rutland, broonie, keescook,
	linux-arm-kernel, linux-kernel, linux-arch, patches, Huang Shijie

Currently, init_irq_stacks() calls cpu_to_node(), which depends on the
per-CPU "numa_node" variable. That variable is initialized in:
     arch_call_rest_init() --> rest_init() --> kernel_init()
	--> kernel_init_freeable() --> smp_prepare_cpus()

But init_irq_stacks() is called from init_IRQ(), which runs before
arch_call_rest_init().

So in init_irq_stacks(), cpu_to_node() does not work yet: it always
returns 0. On a NUMA system, this means a CPU on node 1 ends up using an
IRQ stack allocated on node 0.

This patch fixes it by making early_cpu_to_node() non-static and using
it in init_irq_stacks().

Signed-off-by: Huang Shijie <shijie@os.amperecomputing.com>
---
v1 --> v2:
	Fix the !NUMA build error.

---
 arch/arm64/kernel/irq.c    | 3 ++-
 drivers/base/arch_numa.c   | 2 +-
 include/asm-generic/numa.h | 2 ++
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 6ad5c6ef5329..5226030979ae 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -25,6 +25,7 @@
 #include <asm/softirq_stack.h>
 #include <asm/stacktrace.h>
 #include <asm/vmap_stack.h>
+#include <asm/numa.h>
 
 /* Only access this in an NMI enter/exit */
 DEFINE_PER_CPU(struct nmi_ctx, nmi_contexts);
@@ -57,7 +58,7 @@ static void init_irq_stacks(void)
 	unsigned long *p;
 
 	for_each_possible_cpu(cpu) {
-		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, cpu_to_node(cpu));
+		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, early_cpu_to_node(cpu));
 		per_cpu(irq_stack_ptr, cpu) = p;
 	}
 }
diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
index eaa31e567d1e..90519d981471 100644
--- a/drivers/base/arch_numa.c
+++ b/drivers/base/arch_numa.c
@@ -144,7 +144,7 @@ void __init early_map_cpu_to_node(unsigned int cpu, int nid)
 unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
 EXPORT_SYMBOL(__per_cpu_offset);
 
-static int __init early_cpu_to_node(int cpu)
+int early_cpu_to_node(int cpu)
 {
 	return cpu_to_node_map[cpu];
 }
diff --git a/include/asm-generic/numa.h b/include/asm-generic/numa.h
index 1a3ad6d29833..16073111bffc 100644
--- a/include/asm-generic/numa.h
+++ b/include/asm-generic/numa.h
@@ -35,6 +35,7 @@ int __init numa_add_memblk(int nodeid, u64 start, u64 end);
 void __init numa_set_distance(int from, int to, int distance);
 void __init numa_free_distance(void);
 void __init early_map_cpu_to_node(unsigned int cpu, int nid);
+int early_cpu_to_node(int cpu);
 void numa_store_cpu_info(unsigned int cpu);
 void numa_add_cpu(unsigned int cpu);
 void numa_remove_cpu(unsigned int cpu);
@@ -46,6 +47,7 @@ static inline void numa_add_cpu(unsigned int cpu) { }
 static inline void numa_remove_cpu(unsigned int cpu) { }
 static inline void arch_numa_init(void) { }
 static inline void early_map_cpu_to_node(unsigned int cpu, int nid) { }
+static inline int early_cpu_to_node(int cpu) { return 0; }
 
 #endif	/* CONFIG_NUMA */
 
-- 
2.40.1



* [PATCH v3] arm64: irq: set the correct node for VMAP stack
  2023-11-16 17:18 ` Catalin Marinas
  2023-11-17  2:50   ` Shijie Huang
  2023-11-18 15:47   ` [PATCH v2] " Huang Shijie
@ 2023-11-18 16:02   ` Huang Shijie
  2023-11-23 16:55     ` Catalin Marinas
  2 siblings, 1 reply; 10+ messages in thread
From: Huang Shijie @ 2023-11-18 16:02 UTC (permalink / raw)
  To: catalin.marinas
  Cc: will, gregkh, rafael, arnd, mark.rutland, broonie, keescook,
	linux-arm-kernel, linux-kernel, linux-arch, patches, Huang Shijie

Currently, init_irq_stacks() calls cpu_to_node(), which depends on the
per-CPU "numa_node" variable. That variable is initialized in:
     arch_call_rest_init() --> rest_init() --> kernel_init()
	--> kernel_init_freeable() --> smp_prepare_cpus()

But init_irq_stacks() is called from init_IRQ(), which runs before
arch_call_rest_init().

So in init_irq_stacks(), cpu_to_node() does not work yet: it always
returns 0. On a NUMA system, this means a CPU on node 1 ends up using an
IRQ stack allocated on node 0.

This patch fixes it by making early_cpu_to_node() non-static and using
it in init_irq_stacks().

Signed-off-by: Huang Shijie <shijie@os.amperecomputing.com>
---
v2 --> v3:
	Move the <asm/numa.h> include to its alphabetical position.
---
 arch/arm64/kernel/irq.c    | 3 ++-
 drivers/base/arch_numa.c   | 2 +-
 include/asm-generic/numa.h | 2 ++
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 6ad5c6ef5329..d9ee14723478 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -22,6 +22,7 @@
 #include <linux/vmalloc.h>
 #include <asm/daifflags.h>
 #include <asm/exception.h>
+#include <asm/numa.h>
 #include <asm/softirq_stack.h>
 #include <asm/stacktrace.h>
 #include <asm/vmap_stack.h>
@@ -57,7 +58,7 @@ static void init_irq_stacks(void)
 	unsigned long *p;
 
 	for_each_possible_cpu(cpu) {
-		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, cpu_to_node(cpu));
+		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, early_cpu_to_node(cpu));
 		per_cpu(irq_stack_ptr, cpu) = p;
 	}
 }
diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
index eaa31e567d1e..90519d981471 100644
--- a/drivers/base/arch_numa.c
+++ b/drivers/base/arch_numa.c
@@ -144,7 +144,7 @@ void __init early_map_cpu_to_node(unsigned int cpu, int nid)
 unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
 EXPORT_SYMBOL(__per_cpu_offset);
 
-static int __init early_cpu_to_node(int cpu)
+int early_cpu_to_node(int cpu)
 {
 	return cpu_to_node_map[cpu];
 }
diff --git a/include/asm-generic/numa.h b/include/asm-generic/numa.h
index 1a3ad6d29833..16073111bffc 100644
--- a/include/asm-generic/numa.h
+++ b/include/asm-generic/numa.h
@@ -35,6 +35,7 @@ int __init numa_add_memblk(int nodeid, u64 start, u64 end);
 void __init numa_set_distance(int from, int to, int distance);
 void __init numa_free_distance(void);
 void __init early_map_cpu_to_node(unsigned int cpu, int nid);
+int early_cpu_to_node(int cpu);
 void numa_store_cpu_info(unsigned int cpu);
 void numa_add_cpu(unsigned int cpu);
 void numa_remove_cpu(unsigned int cpu);
@@ -46,6 +47,7 @@ static inline void numa_add_cpu(unsigned int cpu) { }
 static inline void numa_remove_cpu(unsigned int cpu) { }
 static inline void arch_numa_init(void) { }
 static inline void early_map_cpu_to_node(unsigned int cpu, int nid) { }
+static inline int early_cpu_to_node(int cpu) { return 0; }
 
 #endif	/* CONFIG_NUMA */
 
-- 
2.40.1



* Re: [PATCH v3] arm64: irq: set the correct node for VMAP stack
  2023-11-18 16:02   ` [PATCH v3] " Huang Shijie
@ 2023-11-23 16:55     ` Catalin Marinas
  2023-11-24  3:15       ` [PATCH v4] " Huang Shijie
  0 siblings, 1 reply; 10+ messages in thread
From: Catalin Marinas @ 2023-11-23 16:55 UTC (permalink / raw)
  To: Huang Shijie
  Cc: will, gregkh, rafael, arnd, mark.rutland, broonie, keescook,
	linux-arm-kernel, linux-kernel, linux-arch, patches

On Sun, Nov 19, 2023 at 12:02:05AM +0800, Huang Shijie wrote:
> diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
> index eaa31e567d1e..90519d981471 100644
> --- a/drivers/base/arch_numa.c
> +++ b/drivers/base/arch_numa.c
> @@ -144,7 +144,7 @@ void __init early_map_cpu_to_node(unsigned int cpu, int nid)
>  unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
>  EXPORT_SYMBOL(__per_cpu_offset);
>  
> -static int __init early_cpu_to_node(int cpu)
> +int early_cpu_to_node(int cpu)
>  {
>  	return cpu_to_node_map[cpu];
>  }

I don't think we need this change, let's make the arm64
init_irq_stacks() an __init function instead.

> diff --git a/include/asm-generic/numa.h b/include/asm-generic/numa.h
> index 1a3ad6d29833..16073111bffc 100644
> --- a/include/asm-generic/numa.h
> +++ b/include/asm-generic/numa.h
> @@ -35,6 +35,7 @@ int __init numa_add_memblk(int nodeid, u64 start, u64 end);
>  void __init numa_set_distance(int from, int to, int distance);
>  void __init numa_free_distance(void);
>  void __init early_map_cpu_to_node(unsigned int cpu, int nid);
> +int early_cpu_to_node(int cpu);

And add __init here.

With these changes:

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

Happy to take this patch through the arm64 tree if I get an ack from
Greg or Rafael on the drivers/* change.
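
The reasoning behind this request can be sketched as follows. In the kernel, functions marked __init are placed in a dedicated section whose memory is released once boot has finished, so an __init function should only be called from other __init code. Making init_irq_stacks() itself __init therefore lets early_cpu_to_node() keep its annotation. The userspace sketch below only imitates the section placement: the demo_init macro, the section name and all demo_* names are hypothetical, and nothing is actually freed here.

```c
#include <assert.h>

/* Imitation of __init: group boot-only functions in their own ELF section. */
#if defined(__GNUC__) && defined(__ELF__)
#define demo_init __attribute__((__section__("demo_init_text")))
#else
#define demo_init
#endif

#define DEMO_NR_CPUS 4

/* Hypothetical early map: CPUs 2 and 3 sit on node 1. */
static int demo_cpu_to_node_map[DEMO_NR_CPUS] = { 0, 0, 1, 1 };

/* Node chosen for each CPU's IRQ stack; this data outlives boot. */
int demo_stack_node[DEMO_NR_CPUS];

/* Boot-only lookup: keeps its annotation, as requested in the review. */
demo_init int demo_early_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	return demo_cpu_to_node_map[cpu];
}

/* The caller is boot-only too, so calling the lookup above is legitimate. */
demo_init void demo_init_irq_stacks(void)
{
	int cpu;

	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		demo_stack_node[cpu] = demo_early_cpu_to_node(cpu);
}
```

After demo_init_irq_stacks() has run once, demo_stack_node[] holds the per-CPU result and neither function is needed again, which is the property that makes both safe to mark as init-only.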


* [PATCH v4] arm64: irq: set the correct node for VMAP stack
  2023-11-23 16:55     ` Catalin Marinas
@ 2023-11-24  3:15       ` Huang Shijie
  2023-11-24 11:47         ` Catalin Marinas
  2023-12-05 15:16         ` Will Deacon
  0 siblings, 2 replies; 10+ messages in thread
From: Huang Shijie @ 2023-11-24  3:15 UTC (permalink / raw)
  To: catalin.marinas
  Cc: will, gregkh, rafael, arnd, linux-arm-kernel, linux-kernel,
	linux-arch, Huang Shijie

Currently, init_irq_stacks() calls cpu_to_node(), which depends on the
per-CPU "numa_node" variable. That variable is initialized in:
     arch_call_rest_init() --> rest_init() --> kernel_init()
	--> kernel_init_freeable() --> smp_prepare_cpus()

But init_irq_stacks() is called from init_IRQ(), which runs before
arch_call_rest_init().

So in init_irq_stacks(), cpu_to_node() does not work yet: it always
returns 0. On a NUMA system, this means a CPU on node 1 ends up using an
IRQ stack allocated on node 0.

This patch fixes it by:
  1.) making early_cpu_to_node() non-static and using it in init_irq_stacks().
  2.) marking init_irq_stacks() as an __init function.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Huang Shijie <shijie@os.amperecomputing.com>
---
v3 --> v4:
	1.) keep early_cpu_to_node() as an __init function.
	2.) change init_irq_stacks() to an __init function.

---
 arch/arm64/kernel/irq.c    | 5 +++--
 drivers/base/arch_numa.c   | 2 +-
 include/asm-generic/numa.h | 2 ++
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 6ad5c6ef5329..9f253d8efe90 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -22,6 +22,7 @@
 #include <linux/vmalloc.h>
 #include <asm/daifflags.h>
 #include <asm/exception.h>
+#include <asm/numa.h>
 #include <asm/softirq_stack.h>
 #include <asm/stacktrace.h>
 #include <asm/vmap_stack.h>
@@ -51,13 +52,13 @@ static void init_irq_scs(void)
 }
 
 #ifdef CONFIG_VMAP_STACK
-static void init_irq_stacks(void)
+static void __init init_irq_stacks(void)
 {
 	int cpu;
 	unsigned long *p;
 
 	for_each_possible_cpu(cpu) {
-		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, cpu_to_node(cpu));
+		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, early_cpu_to_node(cpu));
 		per_cpu(irq_stack_ptr, cpu) = p;
 	}
 }
diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
index eaa31e567d1e..5b59d133b6af 100644
--- a/drivers/base/arch_numa.c
+++ b/drivers/base/arch_numa.c
@@ -144,7 +144,7 @@ void __init early_map_cpu_to_node(unsigned int cpu, int nid)
 unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
 EXPORT_SYMBOL(__per_cpu_offset);
 
-static int __init early_cpu_to_node(int cpu)
+int __init early_cpu_to_node(int cpu)
 {
 	return cpu_to_node_map[cpu];
 }
diff --git a/include/asm-generic/numa.h b/include/asm-generic/numa.h
index 1a3ad6d29833..c32e0cf23c90 100644
--- a/include/asm-generic/numa.h
+++ b/include/asm-generic/numa.h
@@ -35,6 +35,7 @@ int __init numa_add_memblk(int nodeid, u64 start, u64 end);
 void __init numa_set_distance(int from, int to, int distance);
 void __init numa_free_distance(void);
 void __init early_map_cpu_to_node(unsigned int cpu, int nid);
+int __init early_cpu_to_node(int cpu);
 void numa_store_cpu_info(unsigned int cpu);
 void numa_add_cpu(unsigned int cpu);
 void numa_remove_cpu(unsigned int cpu);
@@ -46,6 +47,7 @@ static inline void numa_add_cpu(unsigned int cpu) { }
 static inline void numa_remove_cpu(unsigned int cpu) { }
 static inline void arch_numa_init(void) { }
 static inline void early_map_cpu_to_node(unsigned int cpu, int nid) { }
+static inline int early_cpu_to_node(int cpu) { return 0; }
 
 #endif	/* CONFIG_NUMA */
 
-- 
2.40.1



* Re: [PATCH v4] arm64: irq: set the correct node for VMAP stack
  2023-11-24  3:15       ` [PATCH v4] " Huang Shijie
@ 2023-11-24 11:47         ` Catalin Marinas
  2023-12-05 15:16         ` Will Deacon
  1 sibling, 0 replies; 10+ messages in thread
From: Catalin Marinas @ 2023-11-24 11:47 UTC (permalink / raw)
  To: Huang Shijie
  Cc: will, gregkh, rafael, arnd, linux-arm-kernel, linux-kernel,
	linux-arch

On Fri, Nov 24, 2023 at 11:15:13AM +0800, Huang Shijie wrote:
> Currently, init_irq_stacks() calls cpu_to_node(), which depends on the
> per-CPU "numa_node" variable. That variable is initialized in:
>      arch_call_rest_init() --> rest_init() --> kernel_init()
> 	--> kernel_init_freeable() --> smp_prepare_cpus()
> 
> But init_irq_stacks() is called from init_IRQ(), which runs before
> arch_call_rest_init().
> 
> So in init_irq_stacks(), cpu_to_node() does not work yet: it always
> returns 0. On a NUMA system, this means a CPU on node 1 ends up using an
> IRQ stack allocated on node 0.
> 
> This patch fixes it by:
>   1.) making early_cpu_to_node() non-static and using it in init_irq_stacks().
>   2.) marking init_irq_stacks() as an __init function.
> 
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Huang Shijie <shijie@os.amperecomputing.com>
> ---
> v3 --> v4:
> 	1.) keep early_cpu_to_node() as __init function.
> 	2.) change init_irq_stacks() to __init function.
> 
> ---
>  arch/arm64/kernel/irq.c    | 5 +++--
>  drivers/base/arch_numa.c   | 2 +-
>  include/asm-generic/numa.h | 2 ++
>  3 files changed, 6 insertions(+), 3 deletions(-)

Greg, Rafael - any objections to taking this patch through the arm64
tree?

-- 
Catalin


* Re: [PATCH v4] arm64: irq: set the correct node for VMAP stack
  2023-11-24  3:15       ` [PATCH v4] " Huang Shijie
  2023-11-24 11:47         ` Catalin Marinas
@ 2023-12-05 15:16         ` Will Deacon
  1 sibling, 0 replies; 10+ messages in thread
From: Will Deacon @ 2023-12-05 15:16 UTC (permalink / raw)
  To: catalin.marinas, Huang Shijie
  Cc: kernel-team, Will Deacon, rafael, linux-arch, gregkh,
	linux-arm-kernel, arnd, linux-kernel

On Fri, 24 Nov 2023 11:15:13 +0800, Huang Shijie wrote:
> Currently, init_irq_stacks() calls cpu_to_node(), which depends on the
> per-CPU "numa_node" variable. That variable is initialized in:
>      arch_call_rest_init() --> rest_init() --> kernel_init()
> 	--> kernel_init_freeable() --> smp_prepare_cpus()
> 
> But init_irq_stacks() is called from init_IRQ(), which runs before
> arch_call_rest_init().
> 
> [...]

Applied to arm64 (for-next/mm), thanks!

[1/1] arm64: irq: set the correct node for VMAP stack
      https://git.kernel.org/arm64/c/75b5e0bf90bf

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev


Thread overview: 10+ messages
2023-11-14  9:16 [PATCH] arm64: irq: set the correct node for VMAP stack Huang Shijie
2023-11-15 14:50 ` kernel test robot
2023-11-16 17:18 ` Catalin Marinas
2023-11-17  2:50   ` Shijie Huang
2023-11-18 15:47   ` [PATCH v2] " Huang Shijie
2023-11-18 16:02   ` [PATCH v3] " Huang Shijie
2023-11-23 16:55     ` Catalin Marinas
2023-11-24  3:15       ` [PATCH v4] " Huang Shijie
2023-11-24 11:47         ` Catalin Marinas
2023-12-05 15:16         ` Will Deacon
