* [PATCH v2 01/11] percpu: Introduce percpu hot section
2025-02-26 18:05 [PATCH v2 00/11] Add a percpu subsection for cache hot data Brian Gerst
@ 2025-02-26 18:05 ` Brian Gerst
2025-02-26 19:36 ` Uros Bizjak
` (2 more replies)
2025-02-26 18:05 ` [PATCH v2 02/11] x86/percpu: Move pcpu_hot to " Brian Gerst
` (10 subsequent siblings)
11 siblings, 3 replies; 23+ messages in thread
From: Brian Gerst @ 2025-02-26 18:05 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Ingo Molnar, H . Peter Anvin, Thomas Gleixner, Borislav Petkov,
Ard Biesheuvel, Uros Bizjak, Linus Torvalds, Andy Lutomirski,
Peter Zijlstra, Andrew Morton, Brian Gerst
Add a subsection to the percpu data for frequently accessed variables
that should remain cached on each processor. These variables should not
be accessed from other processors to avoid cacheline bouncing.
This will replace the pcpu_hot struct on x86, and open up similar
functionality to other architectures and the kernel core.
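As an illustrative sketch only (not part of this patch; my_hot_counter and
bump_counter are made-up names), a user of the new subsection would declare
the variable in a header, define it in core or arch code, and access it from
the local CPU with the usual this_cpu_*() accessors:

  #include <linux/percpu.h>

  /* header */
  DECLARE_PER_CPU_CACHE_HOT(int, my_hot_counter);

  /* C file */
  DEFINE_PER_CPU_CACHE_HOT(int, my_hot_counter);

  static void bump_counter(void)
  {
          /* Only the owning CPU should touch this, to avoid cacheline bouncing. */
          this_cpu_inc(my_hot_counter);
  }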
Signed-off-by: Brian Gerst <brgerst@gmail.com>
---
include/asm-generic/vmlinux.lds.h | 10 ++++++++++
include/linux/percpu-defs.h | 12 ++++++++++++
2 files changed, 22 insertions(+)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 92fc06f7da74..92dd6065fd0a 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -385,6 +385,11 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
. = ALIGN(PAGE_SIZE); \
__nosave_end = .;
+#define CACHE_HOT_DATA(align) \
+ . = ALIGN(align); \
+ *(SORT_BY_ALIGNMENT(.data..hot.*)) \
+ . = ALIGN(align);
+
#define PAGE_ALIGNED_DATA(page_align) \
. = ALIGN(page_align); \
*(.data..page_aligned) \
@@ -1065,6 +1070,10 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
. = ALIGN(PAGE_SIZE); \
*(.data..percpu..page_aligned) \
. = ALIGN(cacheline); \
+ __per_cpu_hot_start = .; \
+ *(SORT_BY_ALIGNMENT(.data..percpu..hot.*)) \
+ . = ALIGN(cacheline); \
+ __per_cpu_hot_end = .; \
*(.data..percpu..read_mostly) \
. = ALIGN(cacheline); \
*(.data..percpu) \
@@ -1112,6 +1121,7 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
INIT_TASK_DATA(inittask) \
NOSAVE_DATA \
PAGE_ALIGNED_DATA(pagealigned) \
+ CACLE_HOT_DATA(cacheline) \
CACHELINE_ALIGNED_DATA(cacheline) \
READ_MOSTLY_DATA(cacheline) \
DATA_DATA \
diff --git a/include/linux/percpu-defs.h b/include/linux/percpu-defs.h
index 40d34e032d5b..eb3393f96e5a 100644
--- a/include/linux/percpu-defs.h
+++ b/include/linux/percpu-defs.h
@@ -112,6 +112,18 @@
#define DEFINE_PER_CPU(type, name) \
DEFINE_PER_CPU_SECTION(type, name, "")
+/*
+ * Declaration/definition used for per-CPU variables that are frequently
+ * accessed and should be in a single cacheline.
+ *
+ * For use only by architecture and core code.
+ */
+#define DECLARE_PER_CPU_CACHE_HOT(type, name) \
+ DECLARE_PER_CPU_SECTION(type, name, "..hot.." #name)
+
+#define DEFINE_PER_CPU_CACHE_HOT(type, name) \
+ DEFINE_PER_CPU_SECTION(type, name, "..hot.." #name)
+
/*
* Declaration/definition used for per-CPU variables that must be cacheline
* aligned under SMP conditions so that, whilst a particular instance of the
--
2.48.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH v2 01/11] percpu: Introduce percpu hot section
2025-02-26 18:05 ` [PATCH v2 01/11] percpu: Introduce percpu hot section Brian Gerst
@ 2025-02-26 19:36 ` Uros Bizjak
2025-02-27 2:09 ` Brian Gerst
2025-02-27 14:16 ` kernel test robot
2025-02-27 19:29 ` kernel test robot
2 siblings, 1 reply; 23+ messages in thread
From: Uros Bizjak @ 2025-02-26 19:36 UTC (permalink / raw)
To: Brian Gerst
Cc: linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Thomas Gleixner,
Borislav Petkov, Ard Biesheuvel, Linus Torvalds, Andy Lutomirski,
Peter Zijlstra, Andrew Morton
On Wed, Feb 26, 2025 at 7:05 PM Brian Gerst <brgerst@gmail.com> wrote:
>
> Add a subsection to the percpu data for frequently accessed variables
> that should remain cached on each processor. These variables should not
> be accessed from other processors to avoid cacheline bouncing.
>
> This will replace the pcpu_hot struct on x86, and open up similar
> functionality to other architectures and the kernel core.
>
> Signed-off-by: Brian Gerst <brgerst@gmail.com>
> ---
> include/asm-generic/vmlinux.lds.h | 10 ++++++++++
> include/linux/percpu-defs.h | 12 ++++++++++++
> 2 files changed, 22 insertions(+)
>
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index 92fc06f7da74..92dd6065fd0a 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -385,6 +385,11 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
> . = ALIGN(PAGE_SIZE); \
> __nosave_end = .;
>
> +#define CACHE_HOT_DATA(align) \
> + . = ALIGN(align); \
> + *(SORT_BY_ALIGNMENT(.data..hot.*)) \
> + . = ALIGN(align);
> +
> #define PAGE_ALIGNED_DATA(page_align) \
> . = ALIGN(page_align); \
> *(.data..page_aligned) \
> @@ -1065,6 +1070,10 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
> . = ALIGN(PAGE_SIZE); \
> *(.data..percpu..page_aligned) \
> . = ALIGN(cacheline); \
> + __per_cpu_hot_start = .; \
> + *(SORT_BY_ALIGNMENT(.data..percpu..hot.*)) \
> + . = ALIGN(cacheline); \
> + __per_cpu_hot_end = .; \
> *(.data..percpu..read_mostly) \
> . = ALIGN(cacheline); \
> *(.data..percpu) \
> @@ -1112,6 +1121,7 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
> INIT_TASK_DATA(inittask) \
> NOSAVE_DATA \
> PAGE_ALIGNED_DATA(pagealigned) \
> + CACLE_HOT_DATA(cacheline) \
There is a typo in the above macro name.
Uros.
> CACHELINE_ALIGNED_DATA(cacheline) \
> READ_MOSTLY_DATA(cacheline) \
> DATA_DATA \
> diff --git a/include/linux/percpu-defs.h b/include/linux/percpu-defs.h
> index 40d34e032d5b..eb3393f96e5a 100644
> --- a/include/linux/percpu-defs.h
> +++ b/include/linux/percpu-defs.h
> @@ -112,6 +112,18 @@
> #define DEFINE_PER_CPU(type, name) \
> DEFINE_PER_CPU_SECTION(type, name, "")
>
> +/*
> + * Declaration/definition used for per-CPU variables that are frequently
> + * accessed and should be in a single cacheline.
> + *
> + * For use only by architecture and core code.
> + */
> +#define DECLARE_PER_CPU_CACHE_HOT(type, name) \
> + DECLARE_PER_CPU_SECTION(type, name, "..hot.." #name)
> +
> +#define DEFINE_PER_CPU_CACHE_HOT(type, name) \
> + DEFINE_PER_CPU_SECTION(type, name, "..hot.." #name)
> +
> /*
> * Declaration/definition used for per-CPU variables that must be cacheline
> * aligned under SMP conditions so that, whilst a particular instance of the
> --
> 2.48.1
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 01/11] percpu: Introduce percpu hot section
2025-02-26 19:36 ` Uros Bizjak
@ 2025-02-27 2:09 ` Brian Gerst
0 siblings, 0 replies; 23+ messages in thread
From: Brian Gerst @ 2025-02-27 2:09 UTC (permalink / raw)
To: Uros Bizjak
Cc: linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Thomas Gleixner,
Borislav Petkov, Ard Biesheuvel, Linus Torvalds, Andy Lutomirski,
Peter Zijlstra, Andrew Morton
On Wed, Feb 26, 2025 at 2:36 PM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Wed, Feb 26, 2025 at 7:05 PM Brian Gerst <brgerst@gmail.com> wrote:
> >
> > Add a subsection to the percpu data for frequently accessed variables
> > that should remain cached on each processor. These variables should not
> > be accessed from other processors to avoid cacheline bouncing.
> >
> > This will replace the pcpu_hot struct on x86, and open up similar
> > functionality to other architectures and the kernel core.
> >
> > Signed-off-by: Brian Gerst <brgerst@gmail.com>
> > ---
> > include/asm-generic/vmlinux.lds.h | 10 ++++++++++
> > include/linux/percpu-defs.h | 12 ++++++++++++
> > 2 files changed, 22 insertions(+)
> >
> > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> > index 92fc06f7da74..92dd6065fd0a 100644
> > --- a/include/asm-generic/vmlinux.lds.h
> > +++ b/include/asm-generic/vmlinux.lds.h
> > @@ -385,6 +385,11 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
> > . = ALIGN(PAGE_SIZE); \
> > __nosave_end = .;
> >
> > +#define CACHE_HOT_DATA(align) \
> > + . = ALIGN(align); \
> > + *(SORT_BY_ALIGNMENT(.data..hot.*)) \
> > + . = ALIGN(align);
> > +
> > #define PAGE_ALIGNED_DATA(page_align) \
> > . = ALIGN(page_align); \
> > *(.data..page_aligned) \
> > @@ -1065,6 +1070,10 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
> > . = ALIGN(PAGE_SIZE); \
> > *(.data..percpu..page_aligned) \
> > . = ALIGN(cacheline); \
> > + __per_cpu_hot_start = .; \
> > + *(SORT_BY_ALIGNMENT(.data..percpu..hot.*)) \
> > + . = ALIGN(cacheline); \
> > + __per_cpu_hot_end = .; \
> > *(.data..percpu..read_mostly) \
> > . = ALIGN(cacheline); \
> > *(.data..percpu) \
> > @@ -1112,6 +1121,7 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
> > INIT_TASK_DATA(inittask) \
> > NOSAVE_DATA \
> > PAGE_ALIGNED_DATA(pagealigned) \
> > + CACLE_HOT_DATA(cacheline) \
>
> There is a typo in the above macro name.
Fixed in the next version.
Brian Gerst
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 01/11] percpu: Introduce percpu hot section
2025-02-26 18:05 ` [PATCH v2 01/11] percpu: Introduce percpu hot section Brian Gerst
2025-02-26 19:36 ` Uros Bizjak
@ 2025-02-27 14:16 ` kernel test robot
2025-02-27 19:29 ` kernel test robot
2 siblings, 0 replies; 23+ messages in thread
From: kernel test robot @ 2025-02-27 14:16 UTC (permalink / raw)
To: Brian Gerst, linux-kernel, x86
Cc: llvm, oe-kbuild-all, Ingo Molnar, H . Peter Anvin,
Thomas Gleixner, Borislav Petkov, Ard Biesheuvel, Uros Bizjak,
Andy Lutomirski, Peter Zijlstra, Andrew Morton,
Linux Memory Management List, Brian Gerst
Hi Brian,
kernel test robot noticed the following build errors:
[auto build test ERROR on 79165720f31868d9a9f7e5a50a09d5fe510d1822]
url: https://github.com/intel-lab-lkp/linux/commits/Brian-Gerst/percpu-Introduce-percpu-hot-section/20250227-021212
base: 79165720f31868d9a9f7e5a50a09d5fe510d1822
patch link: https://lore.kernel.org/r/20250226180531.1242429-2-brgerst%40gmail.com
patch subject: [PATCH v2 01/11] percpu: Introduce percpu hot section
config: s390-allnoconfig (https://download.01.org/0day-ci/archive/20250227/202502272142.2EFoWquv-lkp@intel.com/config)
compiler: clang version 15.0.7 (https://github.com/llvm/llvm-project 8dfdcc7b7bf66834a761bd8de445840ef68e4d1a)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250227/202502272142.2EFoWquv-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202502272142.2EFoWquv-lkp@intel.com/
All errors (new ones prefixed by >>):
>> s390x-linux-ld: cannot find CACLE_HOT_DATA: No such file or directory
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 01/11] percpu: Introduce percpu hot section
2025-02-26 18:05 ` [PATCH v2 01/11] percpu: Introduce percpu hot section Brian Gerst
2025-02-26 19:36 ` Uros Bizjak
2025-02-27 14:16 ` kernel test robot
@ 2025-02-27 19:29 ` kernel test robot
2 siblings, 0 replies; 23+ messages in thread
From: kernel test robot @ 2025-02-27 19:29 UTC (permalink / raw)
To: Brian Gerst, linux-kernel, x86
Cc: oe-kbuild-all, Ingo Molnar, H . Peter Anvin, Thomas Gleixner,
Borislav Petkov, Ard Biesheuvel, Uros Bizjak, Andy Lutomirski,
Peter Zijlstra, Andrew Morton, Linux Memory Management List,
Brian Gerst
Hi Brian,
kernel test robot noticed the following build errors:
[auto build test ERROR on 79165720f31868d9a9f7e5a50a09d5fe510d1822]
url: https://github.com/intel-lab-lkp/linux/commits/Brian-Gerst/percpu-Introduce-percpu-hot-section/20250227-021212
base: 79165720f31868d9a9f7e5a50a09d5fe510d1822
patch link: https://lore.kernel.org/r/20250226180531.1242429-2-brgerst%40gmail.com
patch subject: [PATCH v2 01/11] percpu: Introduce percpu hot section
config: arm64-allnoconfig (https://download.01.org/0day-ci/archive/20250228/202502280328.SFEgOJ50-lkp@intel.com/config)
compiler: aarch64-linux-gcc (GCC) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250228/202502280328.SFEgOJ50-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202502280328.SFEgOJ50-lkp@intel.com/
All errors (new ones prefixed by >>):
>> aarch64-linux-ld:./arch/arm64/kernel/vmlinux.lds:107: syntax error
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2 02/11] x86/percpu: Move pcpu_hot to percpu hot section
2025-02-26 18:05 [PATCH v2 00/11] Add a percpu subsection for cache hot data Brian Gerst
2025-02-26 18:05 ` [PATCH v2 01/11] percpu: Introduce percpu hot section Brian Gerst
@ 2025-02-26 18:05 ` Brian Gerst
2025-02-26 18:05 ` [PATCH v2 03/11] x86/preempt: Move preempt count " Brian Gerst
` (9 subsequent siblings)
11 siblings, 0 replies; 23+ messages in thread
From: Brian Gerst @ 2025-02-26 18:05 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Ingo Molnar, H . Peter Anvin, Thomas Gleixner, Borislav Petkov,
Ard Biesheuvel, Uros Bizjak, Linus Torvalds, Andy Lutomirski,
Peter Zijlstra, Andrew Morton, Brian Gerst
No functional change.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
---
arch/x86/include/asm/current.h | 28 +++++++++++-----------------
arch/x86/kernel/cpu/common.c | 2 +-
arch/x86/kernel/vmlinux.lds.S | 3 +++
3 files changed, 15 insertions(+), 18 deletions(-)
diff --git a/arch/x86/include/asm/current.h b/arch/x86/include/asm/current.h
index bf5953883ec3..60bc66edca83 100644
--- a/arch/x86/include/asm/current.h
+++ b/arch/x86/include/asm/current.h
@@ -13,32 +13,26 @@
struct task_struct;
struct pcpu_hot {
- union {
- struct {
- struct task_struct *current_task;
- int preempt_count;
- int cpu_number;
+ struct task_struct *current_task;
+ int preempt_count;
+ int cpu_number;
#ifdef CONFIG_MITIGATION_CALL_DEPTH_TRACKING
- u64 call_depth;
+ u64 call_depth;
#endif
- unsigned long top_of_stack;
- void *hardirq_stack_ptr;
- u16 softirq_pending;
+ unsigned long top_of_stack;
+ void *hardirq_stack_ptr;
+ u16 softirq_pending;
#ifdef CONFIG_X86_64
- bool hardirq_stack_inuse;
+ bool hardirq_stack_inuse;
#else
- void *softirq_stack_ptr;
+ void *softirq_stack_ptr;
#endif
- };
- u8 pad[64];
- };
};
-static_assert(sizeof(struct pcpu_hot) == 64);
-DECLARE_PER_CPU_ALIGNED(struct pcpu_hot, pcpu_hot);
+DECLARE_PER_CPU_CACHE_HOT(struct pcpu_hot, pcpu_hot);
/* const-qualified alias to pcpu_hot, aliased by linker. */
-DECLARE_PER_CPU_ALIGNED(const struct pcpu_hot __percpu_seg_override,
+DECLARE_PER_CPU_CACHE_HOT(const struct pcpu_hot __percpu_seg_override,
const_pcpu_hot);
static __always_inline struct task_struct *get_current(void)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 8b49b1338f76..9b8bf43019e8 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2014,7 +2014,7 @@ static __init int setup_clearcpuid(char *arg)
}
__setup("clearcpuid=", setup_clearcpuid);
-DEFINE_PER_CPU_ALIGNED(struct pcpu_hot, pcpu_hot) = {
+DEFINE_PER_CPU_CACHE_HOT(struct pcpu_hot, pcpu_hot) = {
.current_task = &init_task,
.preempt_count = INIT_PREEMPT_COUNT,
.top_of_stack = TOP_OF_INIT_STACK,
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 1769a7126224..7586a9be8c59 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -187,6 +187,8 @@ SECTIONS
PAGE_ALIGNED_DATA(PAGE_SIZE)
+ CACHE_HOT_DATA(L1_CACHE_BYTES)
+
CACHELINE_ALIGNED_DATA(L1_CACHE_BYTES)
DATA_DATA
@@ -328,6 +330,7 @@ SECTIONS
}
PERCPU_SECTION(INTERNODE_CACHE_BYTES)
+ ASSERT(__per_cpu_hot_end - __per_cpu_hot_start <= 64, "percpu cache hot section too large")
RUNTIME_CONST_VARIABLES
RUNTIME_CONST(ptr, USER_PTR_MAX)
--
2.48.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 03/11] x86/preempt: Move preempt count to percpu hot section
2025-02-26 18:05 [PATCH v2 00/11] Add a percpu subsection for cache hot data Brian Gerst
2025-02-26 18:05 ` [PATCH v2 01/11] percpu: Introduce percpu hot section Brian Gerst
2025-02-26 18:05 ` [PATCH v2 02/11] x86/percpu: Move pcpu_hot to " Brian Gerst
@ 2025-02-26 18:05 ` Brian Gerst
2025-02-26 18:05 ` [PATCH v2 04/11] x86/smp: Move cpu number " Brian Gerst
` (8 subsequent siblings)
11 siblings, 0 replies; 23+ messages in thread
From: Brian Gerst @ 2025-02-26 18:05 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Ingo Molnar, H . Peter Anvin, Thomas Gleixner, Borislav Petkov,
Ard Biesheuvel, Uros Bizjak, Linus Torvalds, Andy Lutomirski,
Peter Zijlstra, Andrew Morton, Brian Gerst
No functional change.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
---
arch/x86/include/asm/current.h | 1 -
arch/x86/include/asm/preempt.h | 25 +++++++++++++------------
arch/x86/kernel/cpu/common.c | 4 +++-
include/linux/preempt.h | 1 +
4 files changed, 17 insertions(+), 14 deletions(-)
diff --git a/arch/x86/include/asm/current.h b/arch/x86/include/asm/current.h
index 60bc66edca83..46a736d6f2ec 100644
--- a/arch/x86/include/asm/current.h
+++ b/arch/x86/include/asm/current.h
@@ -14,7 +14,6 @@ struct task_struct;
struct pcpu_hot {
struct task_struct *current_task;
- int preempt_count;
int cpu_number;
#ifdef CONFIG_MITIGATION_CALL_DEPTH_TRACKING
u64 call_depth;
diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h
index 919909d8cb77..578441db09f0 100644
--- a/arch/x86/include/asm/preempt.h
+++ b/arch/x86/include/asm/preempt.h
@@ -4,10 +4,11 @@
#include <asm/rmwcc.h>
#include <asm/percpu.h>
-#include <asm/current.h>
#include <linux/static_call_types.h>
+DECLARE_PER_CPU_CACHE_HOT(int, __preempt_count);
+
/* We use the MSB mostly because its available */
#define PREEMPT_NEED_RESCHED 0x80000000
@@ -23,18 +24,18 @@
*/
static __always_inline int preempt_count(void)
{
- return raw_cpu_read_4(pcpu_hot.preempt_count) & ~PREEMPT_NEED_RESCHED;
+ return raw_cpu_read_4(__preempt_count) & ~PREEMPT_NEED_RESCHED;
}
static __always_inline void preempt_count_set(int pc)
{
int old, new;
- old = raw_cpu_read_4(pcpu_hot.preempt_count);
+ old = raw_cpu_read_4(__preempt_count);
do {
new = (old & PREEMPT_NEED_RESCHED) |
(pc & ~PREEMPT_NEED_RESCHED);
- } while (!raw_cpu_try_cmpxchg_4(pcpu_hot.preempt_count, &old, new));
+ } while (!raw_cpu_try_cmpxchg_4(__preempt_count, &old, new));
}
/*
@@ -43,7 +44,7 @@ static __always_inline void preempt_count_set(int pc)
#define init_task_preempt_count(p) do { } while (0)
#define init_idle_preempt_count(p, cpu) do { \
- per_cpu(pcpu_hot.preempt_count, (cpu)) = PREEMPT_DISABLED; \
+ per_cpu(__preempt_count, (cpu)) = PREEMPT_DISABLED; \
} while (0)
/*
@@ -57,17 +58,17 @@ static __always_inline void preempt_count_set(int pc)
static __always_inline void set_preempt_need_resched(void)
{
- raw_cpu_and_4(pcpu_hot.preempt_count, ~PREEMPT_NEED_RESCHED);
+ raw_cpu_and_4(__preempt_count, ~PREEMPT_NEED_RESCHED);
}
static __always_inline void clear_preempt_need_resched(void)
{
- raw_cpu_or_4(pcpu_hot.preempt_count, PREEMPT_NEED_RESCHED);
+ raw_cpu_or_4(__preempt_count, PREEMPT_NEED_RESCHED);
}
static __always_inline bool test_preempt_need_resched(void)
{
- return !(raw_cpu_read_4(pcpu_hot.preempt_count) & PREEMPT_NEED_RESCHED);
+ return !(raw_cpu_read_4(__preempt_count) & PREEMPT_NEED_RESCHED);
}
/*
@@ -76,12 +77,12 @@ static __always_inline bool test_preempt_need_resched(void)
static __always_inline void __preempt_count_add(int val)
{
- raw_cpu_add_4(pcpu_hot.preempt_count, val);
+ raw_cpu_add_4(__preempt_count, val);
}
static __always_inline void __preempt_count_sub(int val)
{
- raw_cpu_add_4(pcpu_hot.preempt_count, -val);
+ raw_cpu_add_4(__preempt_count, -val);
}
/*
@@ -91,7 +92,7 @@ static __always_inline void __preempt_count_sub(int val)
*/
static __always_inline bool __preempt_count_dec_and_test(void)
{
- return GEN_UNARY_RMWcc("decl", __my_cpu_var(pcpu_hot.preempt_count), e,
+ return GEN_UNARY_RMWcc("decl", __my_cpu_var(__preempt_count), e,
__percpu_arg([var]));
}
@@ -100,7 +101,7 @@ static __always_inline bool __preempt_count_dec_and_test(void)
*/
static __always_inline bool should_resched(int preempt_offset)
{
- return unlikely(raw_cpu_read_4(pcpu_hot.preempt_count) == preempt_offset);
+ return unlikely(raw_cpu_read_4(__preempt_count) == preempt_offset);
}
#ifdef CONFIG_PREEMPTION
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 9b8bf43019e8..1470f687f8d6 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2016,12 +2016,14 @@ __setup("clearcpuid=", setup_clearcpuid);
DEFINE_PER_CPU_CACHE_HOT(struct pcpu_hot, pcpu_hot) = {
.current_task = &init_task,
- .preempt_count = INIT_PREEMPT_COUNT,
.top_of_stack = TOP_OF_INIT_STACK,
};
EXPORT_PER_CPU_SYMBOL(pcpu_hot);
EXPORT_PER_CPU_SYMBOL(const_pcpu_hot);
+DEFINE_PER_CPU_CACHE_HOT(int, __preempt_count) = INIT_PREEMPT_COUNT;
+EXPORT_PER_CPU_SYMBOL(__preempt_count);
+
#ifdef CONFIG_X86_64
static void wrmsrl_cstar(unsigned long val)
{
diff --git a/include/linux/preempt.h b/include/linux/preempt.h
index ca86235ac15c..4c1af9b7e28b 100644
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -319,6 +319,7 @@ do { \
#ifdef CONFIG_PREEMPT_NOTIFIERS
struct preempt_notifier;
+struct task_struct;
/**
* preempt_ops - notifiers called when a task is preempted and rescheduled
--
2.48.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 04/11] x86/smp: Move cpu number to percpu hot section
2025-02-26 18:05 [PATCH v2 00/11] Add a percpu subsection for cache hot data Brian Gerst
` (2 preceding siblings ...)
2025-02-26 18:05 ` [PATCH v2 03/11] x86/preempt: Move preempt count " Brian Gerst
@ 2025-02-26 18:05 ` Brian Gerst
2025-02-26 18:05 ` [PATCH v2 05/11] x86/retbleed: Move call depth " Brian Gerst
` (7 subsequent siblings)
11 siblings, 0 replies; 23+ messages in thread
From: Brian Gerst @ 2025-02-26 18:05 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Ingo Molnar, H . Peter Anvin, Thomas Gleixner, Borislav Petkov,
Ard Biesheuvel, Uros Bizjak, Linus Torvalds, Andy Lutomirski,
Peter Zijlstra, Andrew Morton, Brian Gerst
No functional change.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
---
arch/x86/include/asm/current.h | 1 -
arch/x86/include/asm/smp.h | 7 ++++---
arch/x86/kernel/setup_percpu.c | 5 ++++-
kernel/bpf/verifier.c | 4 ++--
4 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/current.h b/arch/x86/include/asm/current.h
index 46a736d6f2ec..f988462d8b69 100644
--- a/arch/x86/include/asm/current.h
+++ b/arch/x86/include/asm/current.h
@@ -14,7 +14,6 @@ struct task_struct;
struct pcpu_hot {
struct task_struct *current_task;
- int cpu_number;
#ifdef CONFIG_MITIGATION_CALL_DEPTH_TRACKING
u64 call_depth;
#endif
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index ca073f40698f..d1db2c131b1d 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -5,9 +5,10 @@
#include <linux/cpumask.h>
#include <asm/cpumask.h>
-#include <asm/current.h>
#include <asm/thread_info.h>
+DECLARE_PER_CPU_CACHE_HOT(int, cpu_number);
+
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_map);
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_die_map);
@@ -133,8 +134,8 @@ __visible void smp_call_function_single_interrupt(struct pt_regs *r);
* This function is needed by all SMP systems. It must _always_ be valid
* from the initial startup.
*/
-#define raw_smp_processor_id() this_cpu_read(pcpu_hot.cpu_number)
-#define __smp_processor_id() __this_cpu_read(pcpu_hot.cpu_number)
+#define raw_smp_processor_id() this_cpu_read(cpu_number)
+#define __smp_processor_id() __this_cpu_read(cpu_number)
#ifdef CONFIG_X86_32
extern int safe_smp_processor_id(void);
diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c
index 1e7be9409aa2..175afc3ffb12 100644
--- a/arch/x86/kernel/setup_percpu.c
+++ b/arch/x86/kernel/setup_percpu.c
@@ -23,6 +23,9 @@
#include <asm/cpumask.h>
#include <asm/cpu.h>
+DEFINE_PER_CPU_CACHE_HOT(int, cpu_number);
+EXPORT_PER_CPU_SYMBOL(cpu_number);
+
DEFINE_PER_CPU_READ_MOSTLY(unsigned long, this_cpu_off);
EXPORT_PER_CPU_SYMBOL(this_cpu_off);
@@ -161,7 +164,7 @@ void __init setup_per_cpu_areas(void)
for_each_possible_cpu(cpu) {
per_cpu_offset(cpu) = delta + pcpu_unit_offsets[cpu];
per_cpu(this_cpu_off, cpu) = per_cpu_offset(cpu);
- per_cpu(pcpu_hot.cpu_number, cpu) = cpu;
+ per_cpu(cpu_number, cpu) = cpu;
setup_percpu_segment(cpu);
/*
* Copy data used in early init routines from the
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9971c03adfd5..604134d33282 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -21687,12 +21687,12 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
if (insn->imm == BPF_FUNC_get_smp_processor_id &&
verifier_inlines_helper_call(env, insn->imm)) {
/* BPF_FUNC_get_smp_processor_id inlining is an
- * optimization, so if pcpu_hot.cpu_number is ever
+ * optimization, so if cpu_number is ever
* changed in some incompatible and hard to support
* way, it's fine to back out this inlining logic
*/
#ifdef CONFIG_SMP
- insn_buf[0] = BPF_MOV32_IMM(BPF_REG_0, (u32)(unsigned long)&pcpu_hot.cpu_number);
+ insn_buf[0] = BPF_MOV32_IMM(BPF_REG_0, (u32)(unsigned long)&cpu_number);
insn_buf[1] = BPF_MOV64_PERCPU_REG(BPF_REG_0, BPF_REG_0);
insn_buf[2] = BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_0, 0);
cnt = 3;
--
2.48.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 05/11] x86/retbleed: Move call depth to percpu hot section
2025-02-26 18:05 [PATCH v2 00/11] Add a percpu subsection for cache hot data Brian Gerst
` (3 preceding siblings ...)
2025-02-26 18:05 ` [PATCH v2 04/11] x86/smp: Move cpu number " Brian Gerst
@ 2025-02-26 18:05 ` Brian Gerst
2025-02-26 18:05 ` [PATCH v2 06/11] x86/softirq: Move softirq_pending " Brian Gerst
` (6 subsequent siblings)
11 siblings, 0 replies; 23+ messages in thread
From: Brian Gerst @ 2025-02-26 18:05 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Ingo Molnar, H . Peter Anvin, Thomas Gleixner, Borislav Petkov,
Ard Biesheuvel, Uros Bizjak, Linus Torvalds, Andy Lutomirski,
Peter Zijlstra, Andrew Morton, Brian Gerst
No functional change.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
---
arch/x86/include/asm/current.h | 3 ---
arch/x86/include/asm/nospec-branch.h | 11 ++++++-----
arch/x86/kernel/asm-offsets.c | 3 ---
arch/x86/kernel/cpu/common.c | 8 ++++++++
arch/x86/lib/retpoline.S | 2 +-
5 files changed, 15 insertions(+), 12 deletions(-)
diff --git a/arch/x86/include/asm/current.h b/arch/x86/include/asm/current.h
index f988462d8b69..8ba2c0f8bcaf 100644
--- a/arch/x86/include/asm/current.h
+++ b/arch/x86/include/asm/current.h
@@ -14,9 +14,6 @@ struct task_struct;
struct pcpu_hot {
struct task_struct *current_task;
-#ifdef CONFIG_MITIGATION_CALL_DEPTH_TRACKING
- u64 call_depth;
-#endif
unsigned long top_of_stack;
void *hardirq_stack_ptr;
u16 softirq_pending;
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 7e8bf78c03d5..a5602055bfc9 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -12,7 +12,6 @@
#include <asm/msr-index.h>
#include <asm/unwind_hints.h>
#include <asm/percpu.h>
-#include <asm/current.h>
/*
* Call depth tracking for Intel SKL CPUs to address the RSB underflow
@@ -78,21 +77,21 @@
#include <asm/asm-offsets.h>
#define CREDIT_CALL_DEPTH \
- movq $-1, PER_CPU_VAR(pcpu_hot + X86_call_depth);
+ movq $-1, PER_CPU_VAR(__x86_call_depth);
#define RESET_CALL_DEPTH \
xor %eax, %eax; \
bts $63, %rax; \
- movq %rax, PER_CPU_VAR(pcpu_hot + X86_call_depth);
+ movq %rax, PER_CPU_VAR(__x86_call_depth);
#define RESET_CALL_DEPTH_FROM_CALL \
movb $0xfc, %al; \
shl $56, %rax; \
- movq %rax, PER_CPU_VAR(pcpu_hot + X86_call_depth); \
+ movq %rax, PER_CPU_VAR(__x86_call_depth); \
CALL_THUNKS_DEBUG_INC_CALLS
#define INCREMENT_CALL_DEPTH \
- sarq $5, PER_CPU_VAR(pcpu_hot + X86_call_depth); \
+ sarq $5, PER_CPU_VAR(__x86_call_depth); \
CALL_THUNKS_DEBUG_INC_CALLS
#else
@@ -388,6 +387,8 @@ extern void call_depth_return_thunk(void);
__stringify(INCREMENT_CALL_DEPTH), \
X86_FEATURE_CALL_DEPTH)
+DECLARE_PER_CPU_CACHE_HOT(u64, __x86_call_depth);
+
#ifdef CONFIG_CALL_THUNKS_DEBUG
DECLARE_PER_CPU(u64, __x86_call_count);
DECLARE_PER_CPU(u64, __x86_ret_count);
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index a98020bf31bb..6fae88f8ae1e 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -109,9 +109,6 @@ static void __used common(void)
OFFSET(TSS_sp2, tss_struct, x86_tss.sp2);
OFFSET(X86_top_of_stack, pcpu_hot, top_of_stack);
OFFSET(X86_current_task, pcpu_hot, current_task);
-#ifdef CONFIG_MITIGATION_CALL_DEPTH_TRACKING
- OFFSET(X86_call_depth, pcpu_hot, call_depth);
-#endif
#if IS_ENABLED(CONFIG_CRYPTO_ARIA_AESNI_AVX_X86_64)
/* Offset for fields in aria_ctx */
BLANK();
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 1470f687f8d6..01f33fb86f05 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2025,6 +2025,14 @@ DEFINE_PER_CPU_CACHE_HOT(int, __preempt_count) = INIT_PREEMPT_COUNT;
EXPORT_PER_CPU_SYMBOL(__preempt_count);
#ifdef CONFIG_X86_64
+/*
+ * Note: Do not make this dependent on CONFIG_MITIGATION_CALL_DEPTH_TRACKING
+ * so that this space is reserved in the hot cache section even when the
+ * mitigation is disabled.
+ */
+DEFINE_PER_CPU_CACHE_HOT(u64, __x86_call_depth);
+EXPORT_PER_CPU_SYMBOL(__x86_call_depth);
+
static void wrmsrl_cstar(unsigned long val)
{
/*
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 391059b2c6fb..04502e843de0 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -342,7 +342,7 @@ SYM_FUNC_START(call_depth_return_thunk)
* case.
*/
CALL_THUNKS_DEBUG_INC_RETS
- shlq $5, PER_CPU_VAR(pcpu_hot + X86_call_depth)
+ shlq $5, PER_CPU_VAR(__x86_call_depth)
jz 1f
ANNOTATE_UNRET_SAFE
ret
--
2.48.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 06/11] x86/softirq: Move softirq_pending to percpu hot section
2025-02-26 18:05 [PATCH v2 00/11] Add a percpu subsection for cache hot data Brian Gerst
` (4 preceding siblings ...)
2025-02-26 18:05 ` [PATCH v2 05/11] x86/retbleed: Move call depth " Brian Gerst
@ 2025-02-26 18:05 ` Brian Gerst
2025-02-26 18:05 ` [PATCH v2 07/11] x86/irq: Move irq stacks " Brian Gerst
` (5 subsequent siblings)
11 siblings, 0 replies; 23+ messages in thread
From: Brian Gerst @ 2025-02-26 18:05 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Ingo Molnar, H . Peter Anvin, Thomas Gleixner, Borislav Petkov,
Ard Biesheuvel, Uros Bizjak, Linus Torvalds, Andy Lutomirski,
Peter Zijlstra, Andrew Morton, Brian Gerst
No functional change.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
---
arch/x86/include/asm/current.h | 1 -
arch/x86/include/asm/hardirq.h | 4 ++--
arch/x86/kernel/irq.c | 3 +++
3 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/current.h b/arch/x86/include/asm/current.h
index 8ba2c0f8bcaf..f153c77853de 100644
--- a/arch/x86/include/asm/current.h
+++ b/arch/x86/include/asm/current.h
@@ -16,7 +16,6 @@ struct pcpu_hot {
struct task_struct *current_task;
unsigned long top_of_stack;
void *hardirq_stack_ptr;
- u16 softirq_pending;
#ifdef CONFIG_X86_64
bool hardirq_stack_inuse;
#else
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index 6ffa8b75f4cd..f00c09ffe6a9 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -3,7 +3,6 @@
#define _ASM_X86_HARDIRQ_H
#include <linux/threads.h>
-#include <asm/current.h>
typedef struct {
#if IS_ENABLED(CONFIG_KVM_INTEL)
@@ -66,7 +65,8 @@ extern u64 arch_irq_stat_cpu(unsigned int cpu);
extern u64 arch_irq_stat(void);
#define arch_irq_stat arch_irq_stat
-#define local_softirq_pending_ref pcpu_hot.softirq_pending
+DECLARE_PER_CPU_CACHE_HOT(u16, __softirq_pending);
+#define local_softirq_pending_ref __softirq_pending
#if IS_ENABLED(CONFIG_KVM_INTEL)
/*
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 385e3a5fc304..474af15ae017 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -31,6 +31,9 @@
DEFINE_PER_CPU_SHARED_ALIGNED(irq_cpustat_t, irq_stat);
EXPORT_PER_CPU_SYMBOL(irq_stat);
+DEFINE_PER_CPU_CACHE_HOT(u16, __softirq_pending);
+EXPORT_PER_CPU_SYMBOL(__softirq_pending);
+
atomic_t irq_err_count;
/*
--
2.48.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 07/11] x86/irq: Move irq stacks to percpu hot section
2025-02-26 18:05 [PATCH v2 00/11] Add a percpu subsection for cache hot data Brian Gerst
` (5 preceding siblings ...)
2025-02-26 18:05 ` [PATCH v2 06/11] x86/softirq: Move softirq_pending " Brian Gerst
@ 2025-02-26 18:05 ` Brian Gerst
2025-02-26 20:25 ` Peter Zijlstra
2025-02-26 18:05 ` [PATCH v2 08/11] x86/percpu: Move top_of_stack " Brian Gerst
` (4 subsequent siblings)
11 siblings, 1 reply; 23+ messages in thread
From: Brian Gerst @ 2025-02-26 18:05 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Ingo Molnar, H . Peter Anvin, Thomas Gleixner, Borislav Petkov,
Ard Biesheuvel, Uros Bizjak, Linus Torvalds, Andy Lutomirski,
Peter Zijlstra, Andrew Morton, Brian Gerst
No functional change.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
---
arch/x86/include/asm/current.h | 6 ------
arch/x86/include/asm/irq_stack.h | 12 ++++++------
arch/x86/include/asm/processor.h | 7 +++++++
arch/x86/kernel/dumpstack_32.c | 4 ++--
arch/x86/kernel/dumpstack_64.c | 2 +-
arch/x86/kernel/irq.c | 5 +++++
arch/x86/kernel/irq_32.c | 12 +++++++-----
arch/x86/kernel/irq_64.c | 6 +++---
arch/x86/kernel/process_64.c | 2 +-
9 files changed, 32 insertions(+), 24 deletions(-)
diff --git a/arch/x86/include/asm/current.h b/arch/x86/include/asm/current.h
index f153c77853de..6fad5a4c21d7 100644
--- a/arch/x86/include/asm/current.h
+++ b/arch/x86/include/asm/current.h
@@ -15,12 +15,6 @@ struct task_struct;
struct pcpu_hot {
struct task_struct *current_task;
unsigned long top_of_stack;
- void *hardirq_stack_ptr;
-#ifdef CONFIG_X86_64
- bool hardirq_stack_inuse;
-#else
- void *softirq_stack_ptr;
-#endif
};
DECLARE_PER_CPU_CACHE_HOT(struct pcpu_hot, pcpu_hot);
diff --git a/arch/x86/include/asm/irq_stack.h b/arch/x86/include/asm/irq_stack.h
index 562a547c29a5..735c3a491f60 100644
--- a/arch/x86/include/asm/irq_stack.h
+++ b/arch/x86/include/asm/irq_stack.h
@@ -116,7 +116,7 @@
ASM_CALL_ARG2
#define call_on_irqstack(func, asm_call, argconstr...) \
- call_on_stack(__this_cpu_read(pcpu_hot.hardirq_stack_ptr), \
+ call_on_stack(__this_cpu_read(hardirq_stack_ptr), \
func, asm_call, argconstr)
/* Macros to assert type correctness for run_*_on_irqstack macros */
@@ -135,7 +135,7 @@
* User mode entry and interrupt on the irq stack do not \
* switch stacks. If from user mode the task stack is empty. \
*/ \
- if (user_mode(regs) || __this_cpu_read(pcpu_hot.hardirq_stack_inuse)) { \
+ if (user_mode(regs) || __this_cpu_read(hardirq_stack_inuse)) { \
irq_enter_rcu(); \
func(c_args); \
irq_exit_rcu(); \
@@ -146,9 +146,9 @@
* places. Invoke the stack switch macro with the call \
* sequence which matches the above direct invocation. \
*/ \
- __this_cpu_write(pcpu_hot.hardirq_stack_inuse, true); \
+ __this_cpu_write(hardirq_stack_inuse, true); \
call_on_irqstack(func, asm_call, constr); \
- __this_cpu_write(pcpu_hot.hardirq_stack_inuse, false); \
+ __this_cpu_write(hardirq_stack_inuse, false); \
} \
}
@@ -212,9 +212,9 @@
*/
#define do_softirq_own_stack() \
{ \
- __this_cpu_write(pcpu_hot.hardirq_stack_inuse, true); \
+ __this_cpu_write(hardirq_stack_inuse, true); \
call_on_irqstack(__do_softirq, ASM_CALL_ARG0); \
- __this_cpu_write(pcpu_hot.hardirq_stack_inuse, false); \
+ __this_cpu_write(hardirq_stack_inuse, false); \
}
#endif
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index b3d153730f63..54fce8d7504d 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -420,6 +420,13 @@ struct irq_stack {
char stack[IRQ_STACK_SIZE];
} __aligned(IRQ_STACK_SIZE);
+DECLARE_PER_CPU_CACHE_HOT(struct irq_stack *, hardirq_stack_ptr);
+#ifdef CONFIG_X86_64
+DECLARE_PER_CPU_CACHE_HOT(bool, hardirq_stack_inuse);
+#else
+DECLARE_PER_CPU_CACHE_HOT(struct irq_stack *, softirq_stack_ptr);
+#endif
+
#ifdef CONFIG_X86_64
static inline unsigned long cpu_kernelmode_gs_base(int cpu)
{
diff --git a/arch/x86/kernel/dumpstack_32.c b/arch/x86/kernel/dumpstack_32.c
index b4905d5173fd..722fd712e1cf 100644
--- a/arch/x86/kernel/dumpstack_32.c
+++ b/arch/x86/kernel/dumpstack_32.c
@@ -37,7 +37,7 @@ const char *stack_type_name(enum stack_type type)
static bool in_hardirq_stack(unsigned long *stack, struct stack_info *info)
{
- unsigned long *begin = (unsigned long *)this_cpu_read(pcpu_hot.hardirq_stack_ptr);
+ unsigned long *begin = (unsigned long *)this_cpu_read(hardirq_stack_ptr);
unsigned long *end = begin + (THREAD_SIZE / sizeof(long));
/*
@@ -62,7 +62,7 @@ static bool in_hardirq_stack(unsigned long *stack, struct stack_info *info)
static bool in_softirq_stack(unsigned long *stack, struct stack_info *info)
{
- unsigned long *begin = (unsigned long *)this_cpu_read(pcpu_hot.softirq_stack_ptr);
+ unsigned long *begin = (unsigned long *)this_cpu_read(softirq_stack_ptr);
unsigned long *end = begin + (THREAD_SIZE / sizeof(long));
/*
diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
index f05339fee778..6c5defd6569a 100644
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -134,7 +134,7 @@ static __always_inline bool in_exception_stack(unsigned long *stack, struct stac
static __always_inline bool in_irq_stack(unsigned long *stack, struct stack_info *info)
{
- unsigned long *end = (unsigned long *)this_cpu_read(pcpu_hot.hardirq_stack_ptr);
+ unsigned long *end = (unsigned long *)this_cpu_read(hardirq_stack_ptr);
unsigned long *begin;
/*
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 474af15ae017..2cd2064457b1 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -34,6 +34,11 @@ EXPORT_PER_CPU_SYMBOL(irq_stat);
DEFINE_PER_CPU_CACHE_HOT(u16, __softirq_pending);
EXPORT_PER_CPU_SYMBOL(__softirq_pending);
+DEFINE_PER_CPU_CACHE_HOT(struct irq_stack *, hardirq_stack_ptr);
+#ifdef CONFIG_X86_64
+DEFINE_PER_CPU_CACHE_HOT(bool, hardirq_stack_inuse);
+#endif
+
atomic_t irq_err_count;
/*
diff --git a/arch/x86/kernel/irq_32.c b/arch/x86/kernel/irq_32.c
index dc1049c01f9b..48a27cde9635 100644
--- a/arch/x86/kernel/irq_32.c
+++ b/arch/x86/kernel/irq_32.c
@@ -52,6 +52,8 @@ static inline int check_stack_overflow(void) { return 0; }
static inline void print_stack_overflow(void) { }
#endif
+DEFINE_PER_CPU_CACHE_HOT(struct irq_stack *, softirq_stack_ptr);
+
static void call_on_stack(void *func, void *stack)
{
asm volatile("xchgl %%ebx,%%esp \n"
@@ -74,7 +76,7 @@ static inline int execute_on_irq_stack(int overflow, struct irq_desc *desc)
u32 *isp, *prev_esp, arg1;
curstk = (struct irq_stack *) current_stack();
- irqstk = __this_cpu_read(pcpu_hot.hardirq_stack_ptr);
+ irqstk = __this_cpu_read(hardirq_stack_ptr);
/*
* this is where we switch to the IRQ stack. However, if we are
@@ -112,7 +114,7 @@ int irq_init_percpu_irqstack(unsigned int cpu)
int node = cpu_to_node(cpu);
struct page *ph, *ps;
- if (per_cpu(pcpu_hot.hardirq_stack_ptr, cpu))
+ if (per_cpu(hardirq_stack_ptr, cpu))
return 0;
ph = alloc_pages_node(node, THREADINFO_GFP, THREAD_SIZE_ORDER);
@@ -124,8 +126,8 @@ int irq_init_percpu_irqstack(unsigned int cpu)
return -ENOMEM;
}
- per_cpu(pcpu_hot.hardirq_stack_ptr, cpu) = page_address(ph);
- per_cpu(pcpu_hot.softirq_stack_ptr, cpu) = page_address(ps);
+ per_cpu(hardirq_stack_ptr, cpu) = page_address(ph);
+ per_cpu(softirq_stack_ptr, cpu) = page_address(ps);
return 0;
}
@@ -135,7 +137,7 @@ void do_softirq_own_stack(void)
struct irq_stack *irqstk;
u32 *isp, *prev_esp;
- irqstk = __this_cpu_read(pcpu_hot.softirq_stack_ptr);
+ irqstk = __this_cpu_read(softirq_stack_ptr);
/* build the stack frame on the softirq stack */
isp = (u32 *) ((char *)irqstk + sizeof(*irqstk));
diff --git a/arch/x86/kernel/irq_64.c b/arch/x86/kernel/irq_64.c
index 56bdeecd8ee0..4834e317e568 100644
--- a/arch/x86/kernel/irq_64.c
+++ b/arch/x86/kernel/irq_64.c
@@ -50,7 +50,7 @@ static int map_irq_stack(unsigned int cpu)
return -ENOMEM;
/* Store actual TOS to avoid adjustment in the hotpath */
- per_cpu(pcpu_hot.hardirq_stack_ptr, cpu) = va + IRQ_STACK_SIZE - 8;
+ per_cpu(hardirq_stack_ptr, cpu) = va + IRQ_STACK_SIZE - 8;
return 0;
}
#else
@@ -63,14 +63,14 @@ static int map_irq_stack(unsigned int cpu)
void *va = per_cpu_ptr(&irq_stack_backing_store, cpu);
/* Store actual TOS to avoid adjustment in the hotpath */
- per_cpu(pcpu_hot.hardirq_stack_ptr, cpu) = va + IRQ_STACK_SIZE - 8;
+ per_cpu(hardirq_stack_ptr, cpu) = va + IRQ_STACK_SIZE - 8;
return 0;
}
#endif
int irq_init_percpu_irqstack(unsigned int cpu)
{
- if (per_cpu(pcpu_hot.hardirq_stack_ptr, cpu))
+ if (per_cpu(hardirq_stack_ptr, cpu))
return 0;
return map_irq_stack(cpu);
}
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 226472332a70..93de583c05d1 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -614,7 +614,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
int cpu = smp_processor_id();
WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ENTRY) &&
- this_cpu_read(pcpu_hot.hardirq_stack_inuse));
+ this_cpu_read(hardirq_stack_inuse));
if (!test_tsk_thread_flag(prev_p, TIF_NEED_FPU_LOAD))
switch_fpu_prepare(prev_p, cpu);
--
2.48.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH v2 07/11] x86/irq: Move irq stacks to percpu hot section
2025-02-26 18:05 ` [PATCH v2 07/11] x86/irq: Move irq stacks " Brian Gerst
@ 2025-02-26 20:25 ` Peter Zijlstra
2025-02-27 0:10 ` Brian Gerst
0 siblings, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2025-02-26 20:25 UTC (permalink / raw)
To: Brian Gerst
Cc: linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Thomas Gleixner,
Borislav Petkov, Ard Biesheuvel, Uros Bizjak, Linus Torvalds,
Andy Lutomirski, Andrew Morton
On Wed, Feb 26, 2025 at 01:05:26PM -0500, Brian Gerst wrote:
> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> index 474af15ae017..2cd2064457b1 100644
> --- a/arch/x86/kernel/irq.c
> +++ b/arch/x86/kernel/irq.c
> @@ -34,6 +34,11 @@ EXPORT_PER_CPU_SYMBOL(irq_stat);
> DEFINE_PER_CPU_CACHE_HOT(u16, __softirq_pending);
> EXPORT_PER_CPU_SYMBOL(__softirq_pending);
>
> +DEFINE_PER_CPU_CACHE_HOT(struct irq_stack *, hardirq_stack_ptr);
> +#ifdef CONFIG_X86_64
> +DEFINE_PER_CPU_CACHE_HOT(bool, hardirq_stack_inuse);
> +#endif
> +
> atomic_t irq_err_count;
>
> /*
Perhaps instead of the above #ifdef,...
> diff --git a/arch/x86/kernel/irq_32.c b/arch/x86/kernel/irq_32.c
> index dc1049c01f9b..48a27cde9635 100644
> --- a/arch/x86/kernel/irq_32.c
> +++ b/arch/x86/kernel/irq_32.c
> @@ -52,6 +52,8 @@ static inline int check_stack_overflow(void) { return 0; }
> static inline void print_stack_overflow(void) { }
> #endif
>
> +DEFINE_PER_CPU_CACHE_HOT(struct irq_stack *, softirq_stack_ptr);
> +
> static void call_on_stack(void *func, void *stack)
> {
> asm volatile("xchgl %%ebx,%%esp \n"
> diff --git a/arch/x86/kernel/irq_64.c b/arch/x86/kernel/irq_64.c
> index 56bdeecd8ee0..4834e317e568 100644
> --- a/arch/x86/kernel/irq_64.c
> +++ b/arch/x86/kernel/irq_64.c
stick it in this file, like you already did for the 32bit case?
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 07/11] x86/irq: Move irq stacks to percpu hot section
2025-02-26 20:25 ` Peter Zijlstra
@ 2025-02-27 0:10 ` Brian Gerst
0 siblings, 0 replies; 23+ messages in thread
From: Brian Gerst @ 2025-02-27 0:10 UTC (permalink / raw)
To: Peter Zijlstra
Cc: linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Thomas Gleixner,
Borislav Petkov, Ard Biesheuvel, Uros Bizjak, Linus Torvalds,
Andy Lutomirski, Andrew Morton
On Wed, Feb 26, 2025 at 3:25 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Feb 26, 2025 at 01:05:26PM -0500, Brian Gerst wrote:
>
> > diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> > index 474af15ae017..2cd2064457b1 100644
> > --- a/arch/x86/kernel/irq.c
> > +++ b/arch/x86/kernel/irq.c
> > @@ -34,6 +34,11 @@ EXPORT_PER_CPU_SYMBOL(irq_stat);
> > DEFINE_PER_CPU_CACHE_HOT(u16, __softirq_pending);
> > EXPORT_PER_CPU_SYMBOL(__softirq_pending);
> >
> > +DEFINE_PER_CPU_CACHE_HOT(struct irq_stack *, hardirq_stack_ptr);
> > +#ifdef CONFIG_X86_64
> > +DEFINE_PER_CPU_CACHE_HOT(bool, hardirq_stack_inuse);
> > +#endif
> > +
> > atomic_t irq_err_count;
> >
> > /*
>
> Perhaps instead of the above #ifdef,...
>
> > diff --git a/arch/x86/kernel/irq_32.c b/arch/x86/kernel/irq_32.c
> > index dc1049c01f9b..48a27cde9635 100644
> > --- a/arch/x86/kernel/irq_32.c
> > +++ b/arch/x86/kernel/irq_32.c
> > @@ -52,6 +52,8 @@ static inline int check_stack_overflow(void) { return 0; }
> > static inline void print_stack_overflow(void) { }
> > #endif
> >
> > +DEFINE_PER_CPU_CACHE_HOT(struct irq_stack *, softirq_stack_ptr);
> > +
> > static void call_on_stack(void *func, void *stack)
> > {
> > asm volatile("xchgl %%ebx,%%esp \n"
>
> > diff --git a/arch/x86/kernel/irq_64.c b/arch/x86/kernel/irq_64.c
> > index 56bdeecd8ee0..4834e317e568 100644
> > --- a/arch/x86/kernel/irq_64.c
> > +++ b/arch/x86/kernel/irq_64.c
>
> stick it in this file, like you already did for the 32bit case?
I had it that way originally, but it wasn't packing efficiently before
I added SORT_BY_ALIGNMENT() to the linker script. I'll change it
back.
Brian Gerst
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2 08/11] x86/percpu: Move top_of_stack to percpu hot section
2025-02-26 18:05 [PATCH v2 00/11] Add a percpu subsection for cache hot data Brian Gerst
` (6 preceding siblings ...)
2025-02-26 18:05 ` [PATCH v2 07/11] x86/irq: Move irq stacks " Brian Gerst
@ 2025-02-26 18:05 ` Brian Gerst
2025-02-26 20:08 ` Uros Bizjak
2025-02-26 18:05 ` [PATCH v2 09/11] x86/percpu: Move current_task " Brian Gerst
` (3 subsequent siblings)
11 siblings, 1 reply; 23+ messages in thread
From: Brian Gerst @ 2025-02-26 18:05 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Ingo Molnar, H . Peter Anvin, Thomas Gleixner, Borislav Petkov,
Ard Biesheuvel, Uros Bizjak, Linus Torvalds, Andy Lutomirski,
Peter Zijlstra, Andrew Morton, Brian Gerst
No functional change.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
---
arch/x86/entry/entry_32.S | 4 ++--
arch/x86/entry/entry_64.S | 6 +++---
arch/x86/entry/entry_64_compat.S | 4 ++--
arch/x86/include/asm/current.h | 1 -
arch/x86/include/asm/percpu.h | 2 +-
arch/x86/include/asm/processor.h | 8 ++++++--
arch/x86/kernel/asm-offsets.c | 1 -
arch/x86/kernel/cpu/common.c | 3 ++-
arch/x86/kernel/process_32.c | 4 ++--
arch/x86/kernel/process_64.c | 2 +-
arch/x86/kernel/smpboot.c | 2 +-
arch/x86/kernel/vmlinux.lds.S | 1 +
12 files changed, 21 insertions(+), 17 deletions(-)
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 20be5758c2d2..92c0b4a94e0a 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1153,7 +1153,7 @@ SYM_CODE_START(asm_exc_nmi)
* is using the thread stack right now, so it's safe for us to use it.
*/
movl %esp, %ebx
- movl PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %esp
+ movl PER_CPU_VAR(cpu_current_top_of_stack), %esp
call exc_nmi
movl %ebx, %esp
@@ -1217,7 +1217,7 @@ SYM_CODE_START(rewind_stack_and_make_dead)
/* Prevent any naive code from trying to unwind to our caller. */
xorl %ebp, %ebp
- movl PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %esi
+ movl PER_CPU_VAR(cpu_current_top_of_stack), %esi
leal -TOP_OF_KERNEL_STACK_PADDING-PTREGS_SIZE(%esi), %esp
call make_task_dead
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 33a955aa01d8..9baf32a7a118 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -92,7 +92,7 @@ SYM_CODE_START(entry_SYSCALL_64)
/* tss.sp2 is scratch space. */
movq %rsp, PER_CPU_VAR(cpu_tss_rw + TSS_sp2)
SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
- movq PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %rsp
+ movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
SYM_INNER_LABEL(entry_SYSCALL_64_safe_stack, SYM_L_GLOBAL)
ANNOTATE_NOENDBR
@@ -1166,7 +1166,7 @@ SYM_CODE_START(asm_exc_nmi)
FENCE_SWAPGS_USER_ENTRY
SWITCH_TO_KERNEL_CR3 scratch_reg=%rdx
movq %rsp, %rdx
- movq PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %rsp
+ movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
UNWIND_HINT_IRET_REGS base=%rdx offset=8
pushq 5*8(%rdx) /* pt_regs->ss */
pushq 4*8(%rdx) /* pt_regs->rsp */
@@ -1484,7 +1484,7 @@ SYM_CODE_START_NOALIGN(rewind_stack_and_make_dead)
/* Prevent any naive code from trying to unwind to our caller. */
xorl %ebp, %ebp
- movq PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %rax
+ movq PER_CPU_VAR(cpu_current_top_of_stack), %rax
leaq -PTREGS_SIZE(%rax), %rsp
UNWIND_HINT_REGS
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index ed0a5f2dc129..a45e1125fc6c 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -57,7 +57,7 @@ SYM_CODE_START(entry_SYSENTER_compat)
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
popq %rax
- movq PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %rsp
+ movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
/* Construct struct pt_regs on stack */
pushq $__USER_DS /* pt_regs->ss */
@@ -193,7 +193,7 @@ SYM_CODE_START(entry_SYSCALL_compat)
SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
/* Switch to the kernel stack */
- movq PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %rsp
+ movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
SYM_INNER_LABEL(entry_SYSCALL_compat_safe_stack, SYM_L_GLOBAL)
ANNOTATE_NOENDBR
diff --git a/arch/x86/include/asm/current.h b/arch/x86/include/asm/current.h
index 6fad5a4c21d7..3d1b123c2ee3 100644
--- a/arch/x86/include/asm/current.h
+++ b/arch/x86/include/asm/current.h
@@ -14,7 +14,6 @@ struct task_struct;
struct pcpu_hot {
struct task_struct *current_task;
- unsigned long top_of_stack;
};
DECLARE_PER_CPU_CACHE_HOT(struct pcpu_hot, pcpu_hot);
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 7cb4f64b2e60..044410462d36 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -554,7 +554,7 @@ do { \
* it is accessed while this_cpu_read_stable() allows the value to be cached.
* this_cpu_read_stable() is more efficient and can be used if its value
* is guaranteed to be valid across CPUs. The current users include
- * pcpu_hot.current_task and pcpu_hot.top_of_stack, both of which are
+ * pcpu_hot.current_task and cpu_current_top_of_stack, both of which are
* actually per-thread variables implemented as per-CPU variables and
* thus stable for the duration of the respective task.
*/
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 54fce8d7504d..b4d51de071f2 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -427,6 +427,10 @@ DECLARE_PER_CPU_CACHE_HOT(bool, hardirq_stack_inuse);
DECLARE_PER_CPU_CACHE_HOT(struct irq_stack *, softirq_stack_ptr);
#endif
+DECLARE_PER_CPU_CACHE_HOT(unsigned long, cpu_current_top_of_stack);
+/* const-qualified alias provided by the linker. */
+DECLARE_PER_CPU_CACHE_HOT(const unsigned long __percpu_seg_override, const_cpu_current_top_of_stack);
+
#ifdef CONFIG_X86_64
static inline unsigned long cpu_kernelmode_gs_base(int cpu)
{
@@ -552,9 +556,9 @@ static __always_inline unsigned long current_top_of_stack(void)
* entry trampoline.
*/
if (IS_ENABLED(CONFIG_USE_X86_SEG_SUPPORT))
- return this_cpu_read_const(const_pcpu_hot.top_of_stack);
+ return this_cpu_read_const(const_cpu_current_top_of_stack);
- return this_cpu_read_stable(pcpu_hot.top_of_stack);
+ return this_cpu_read_stable(cpu_current_top_of_stack);
}
static __always_inline bool on_thread_stack(void)
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 6fae88f8ae1e..54ace808defd 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -107,7 +107,6 @@ static void __used common(void)
OFFSET(TSS_sp0, tss_struct, x86_tss.sp0);
OFFSET(TSS_sp1, tss_struct, x86_tss.sp1);
OFFSET(TSS_sp2, tss_struct, x86_tss.sp2);
- OFFSET(X86_top_of_stack, pcpu_hot, top_of_stack);
OFFSET(X86_current_task, pcpu_hot, current_task);
#if IS_ENABLED(CONFIG_CRYPTO_ARIA_AESNI_AVX_X86_64)
/* Offset for fields in aria_ctx */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 01f33fb86f05..fc059e9c8867 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2016,7 +2016,6 @@ __setup("clearcpuid=", setup_clearcpuid);
DEFINE_PER_CPU_CACHE_HOT(struct pcpu_hot, pcpu_hot) = {
.current_task = &init_task,
- .top_of_stack = TOP_OF_INIT_STACK,
};
EXPORT_PER_CPU_SYMBOL(pcpu_hot);
EXPORT_PER_CPU_SYMBOL(const_pcpu_hot);
@@ -2024,6 +2023,8 @@ EXPORT_PER_CPU_SYMBOL(const_pcpu_hot);
DEFINE_PER_CPU_CACHE_HOT(int, __preempt_count) = INIT_PREEMPT_COUNT;
EXPORT_PER_CPU_SYMBOL(__preempt_count);
+DEFINE_PER_CPU_CACHE_HOT(unsigned long, cpu_current_top_of_stack) = TOP_OF_INIT_STACK;
+
#ifdef CONFIG_X86_64
/*
* Note: Do not make this dependant on CONFIG_MITIGATION_CALL_DEPTH_TRACKING
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 0917c7f25720..3afb2428bedb 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -190,13 +190,13 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
arch_end_context_switch(next_p);
/*
- * Reload esp0 and pcpu_hot.top_of_stack. This changes
+ * Reload esp0 and cpu_current_top_of_stack. This changes
* current_thread_info(). Refresh the SYSENTER configuration in
* case prev or next is vm86.
*/
update_task_stack(next_p);
refresh_sysenter_cs(next);
- this_cpu_write(pcpu_hot.top_of_stack,
+ this_cpu_write(cpu_current_top_of_stack,
(unsigned long)task_stack_page(next_p) +
THREAD_SIZE);
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 93de583c05d1..f68da7b7e50c 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -669,7 +669,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
* Switch the PDA and FPU contexts.
*/
raw_cpu_write(pcpu_hot.current_task, next_p);
- raw_cpu_write(pcpu_hot.top_of_stack, task_top_of_stack(next_p));
+ raw_cpu_write(cpu_current_top_of_stack, task_top_of_stack(next_p));
switch_fpu_finish(next_p);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index c10850ae6f09..15e054f4cbf6 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -851,7 +851,7 @@ int common_cpu_up(unsigned int cpu, struct task_struct *idle)
#ifdef CONFIG_X86_32
/* Stack for startup_32 can be just as for start_secondary onwards */
- per_cpu(pcpu_hot.top_of_stack, cpu) = task_top_of_stack(idle);
+ per_cpu(cpu_current_top_of_stack, cpu) = task_top_of_stack(idle);
#endif
return 0;
}
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 7586a9be8c59..85032c085af2 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -44,6 +44,7 @@ ENTRY(phys_startup_64)
jiffies = jiffies_64;
const_pcpu_hot = pcpu_hot;
+const_cpu_current_top_of_stack = cpu_current_top_of_stack;
#if defined(CONFIG_X86_64)
/*
--
2.48.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH v2 08/11] x86/percpu: Move top_of_stack to percpu hot section
2025-02-26 18:05 ` [PATCH v2 08/11] x86/percpu: Move top_of_stack " Brian Gerst
@ 2025-02-26 20:08 ` Uros Bizjak
2025-02-27 2:10 ` Brian Gerst
0 siblings, 1 reply; 23+ messages in thread
From: Uros Bizjak @ 2025-02-26 20:08 UTC (permalink / raw)
To: Brian Gerst
Cc: linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Thomas Gleixner,
Borislav Petkov, Ard Biesheuvel, Linus Torvalds, Andy Lutomirski,
Peter Zijlstra, Andrew Morton
On Wed, Feb 26, 2025 at 7:06 PM Brian Gerst <brgerst@gmail.com> wrote:
>
> No functional change.
>
> Signed-off-by: Brian Gerst <brgerst@gmail.com>
> ---
> arch/x86/entry/entry_32.S | 4 ++--
> arch/x86/entry/entry_64.S | 6 +++---
> arch/x86/entry/entry_64_compat.S | 4 ++--
> arch/x86/include/asm/current.h | 1 -
> arch/x86/include/asm/percpu.h | 2 +-
> arch/x86/include/asm/processor.h | 8 ++++++--
> arch/x86/kernel/asm-offsets.c | 1 -
> arch/x86/kernel/cpu/common.c | 3 ++-
> arch/x86/kernel/process_32.c | 4 ++--
> arch/x86/kernel/process_64.c | 2 +-
> arch/x86/kernel/smpboot.c | 2 +-
> arch/x86/kernel/vmlinux.lds.S | 1 +
> 12 files changed, 21 insertions(+), 17 deletions(-)
>
> diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
> index 20be5758c2d2..92c0b4a94e0a 100644
> --- a/arch/x86/entry/entry_32.S
> +++ b/arch/x86/entry/entry_32.S
> @@ -1153,7 +1153,7 @@ SYM_CODE_START(asm_exc_nmi)
> * is using the thread stack right now, so it's safe for us to use it.
> */
> movl %esp, %ebx
> - movl PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %esp
> + movl PER_CPU_VAR(cpu_current_top_of_stack), %esp
> call exc_nmi
> movl %ebx, %esp
>
> @@ -1217,7 +1217,7 @@ SYM_CODE_START(rewind_stack_and_make_dead)
> /* Prevent any naive code from trying to unwind to our caller. */
> xorl %ebp, %ebp
>
> - movl PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %esi
> + movl PER_CPU_VAR(cpu_current_top_of_stack), %esi
> leal -TOP_OF_KERNEL_STACK_PADDING-PTREGS_SIZE(%esi), %esp
>
> call make_task_dead
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 33a955aa01d8..9baf32a7a118 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -92,7 +92,7 @@ SYM_CODE_START(entry_SYSCALL_64)
> /* tss.sp2 is scratch space. */
> movq %rsp, PER_CPU_VAR(cpu_tss_rw + TSS_sp2)
> SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
> - movq PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %rsp
> + movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
>
> SYM_INNER_LABEL(entry_SYSCALL_64_safe_stack, SYM_L_GLOBAL)
> ANNOTATE_NOENDBR
> @@ -1166,7 +1166,7 @@ SYM_CODE_START(asm_exc_nmi)
> FENCE_SWAPGS_USER_ENTRY
> SWITCH_TO_KERNEL_CR3 scratch_reg=%rdx
> movq %rsp, %rdx
> - movq PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %rsp
> + movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
> UNWIND_HINT_IRET_REGS base=%rdx offset=8
> pushq 5*8(%rdx) /* pt_regs->ss */
> pushq 4*8(%rdx) /* pt_regs->rsp */
> @@ -1484,7 +1484,7 @@ SYM_CODE_START_NOALIGN(rewind_stack_and_make_dead)
> /* Prevent any naive code from trying to unwind to our caller. */
> xorl %ebp, %ebp
>
> - movq PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %rax
> + movq PER_CPU_VAR(cpu_current_top_of_stack), %rax
> leaq -PTREGS_SIZE(%rax), %rsp
> UNWIND_HINT_REGS
>
> diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
> index ed0a5f2dc129..a45e1125fc6c 100644
> --- a/arch/x86/entry/entry_64_compat.S
> +++ b/arch/x86/entry/entry_64_compat.S
> @@ -57,7 +57,7 @@ SYM_CODE_START(entry_SYSENTER_compat)
> SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
> popq %rax
>
> - movq PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %rsp
> + movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
>
> /* Construct struct pt_regs on stack */
> pushq $__USER_DS /* pt_regs->ss */
> @@ -193,7 +193,7 @@ SYM_CODE_START(entry_SYSCALL_compat)
> SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
>
> /* Switch to the kernel stack */
> - movq PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %rsp
> + movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
>
> SYM_INNER_LABEL(entry_SYSCALL_compat_safe_stack, SYM_L_GLOBAL)
> ANNOTATE_NOENDBR
> diff --git a/arch/x86/include/asm/current.h b/arch/x86/include/asm/current.h
> index 6fad5a4c21d7..3d1b123c2ee3 100644
> --- a/arch/x86/include/asm/current.h
> +++ b/arch/x86/include/asm/current.h
> @@ -14,7 +14,6 @@ struct task_struct;
>
> struct pcpu_hot {
> struct task_struct *current_task;
> - unsigned long top_of_stack;
> };
>
> DECLARE_PER_CPU_CACHE_HOT(struct pcpu_hot, pcpu_hot);
> diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
> index 7cb4f64b2e60..044410462d36 100644
> --- a/arch/x86/include/asm/percpu.h
> +++ b/arch/x86/include/asm/percpu.h
> @@ -554,7 +554,7 @@ do { \
> * it is accessed while this_cpu_read_stable() allows the value to be cached.
> * this_cpu_read_stable() is more efficient and can be used if its value
> * is guaranteed to be valid across CPUs. The current users include
> - * pcpu_hot.current_task and pcpu_hot.top_of_stack, both of which are
> + * pcpu_hot.current_task and cpu_current_top_of_stack, both of which are
> * actually per-thread variables implemented as per-CPU variables and
> * thus stable for the duration of the respective task.
> */
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 54fce8d7504d..b4d51de071f2 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -427,6 +427,10 @@ DECLARE_PER_CPU_CACHE_HOT(bool, hardirq_stack_inuse);
> DECLARE_PER_CPU_CACHE_HOT(struct irq_stack *, softirq_stack_ptr);
> #endif
>
> +DECLARE_PER_CPU_CACHE_HOT(unsigned long, cpu_current_top_of_stack);
> +/* const-qualified alias provided by the linker. */
> +DECLARE_PER_CPU_CACHE_HOT(const unsigned long __percpu_seg_override, const_cpu_current_top_of_stack);
Please split the above line, like you did with const_current_task declaration.
Uros.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 08/11] x86/percpu: Move top_of_stack to percpu hot section
2025-02-26 20:08 ` Uros Bizjak
@ 2025-02-27 2:10 ` Brian Gerst
0 siblings, 0 replies; 23+ messages in thread
From: Brian Gerst @ 2025-02-27 2:10 UTC (permalink / raw)
To: Uros Bizjak
Cc: linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Thomas Gleixner,
Borislav Petkov, Ard Biesheuvel, Linus Torvalds, Andy Lutomirski,
Peter Zijlstra, Andrew Morton
On Wed, Feb 26, 2025 at 3:08 PM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Wed, Feb 26, 2025 at 7:06 PM Brian Gerst <brgerst@gmail.com> wrote:
> >
> > No functional change.
> >
> > Signed-off-by: Brian Gerst <brgerst@gmail.com>
> > ---
> > arch/x86/entry/entry_32.S | 4 ++--
> > arch/x86/entry/entry_64.S | 6 +++---
> > arch/x86/entry/entry_64_compat.S | 4 ++--
> > arch/x86/include/asm/current.h | 1 -
> > arch/x86/include/asm/percpu.h | 2 +-
> > arch/x86/include/asm/processor.h | 8 ++++++--
> > arch/x86/kernel/asm-offsets.c | 1 -
> > arch/x86/kernel/cpu/common.c | 3 ++-
> > arch/x86/kernel/process_32.c | 4 ++--
> > arch/x86/kernel/process_64.c | 2 +-
> > arch/x86/kernel/smpboot.c | 2 +-
> > arch/x86/kernel/vmlinux.lds.S | 1 +
> > 12 files changed, 21 insertions(+), 17 deletions(-)
> >
> > diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
> > index 20be5758c2d2..92c0b4a94e0a 100644
> > --- a/arch/x86/entry/entry_32.S
> > +++ b/arch/x86/entry/entry_32.S
> > @@ -1153,7 +1153,7 @@ SYM_CODE_START(asm_exc_nmi)
> > * is using the thread stack right now, so it's safe for us to use it.
> > */
> > movl %esp, %ebx
> > - movl PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %esp
> > + movl PER_CPU_VAR(cpu_current_top_of_stack), %esp
> > call exc_nmi
> > movl %ebx, %esp
> >
> > @@ -1217,7 +1217,7 @@ SYM_CODE_START(rewind_stack_and_make_dead)
> > /* Prevent any naive code from trying to unwind to our caller. */
> > xorl %ebp, %ebp
> >
> > - movl PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %esi
> > + movl PER_CPU_VAR(cpu_current_top_of_stack), %esi
> > leal -TOP_OF_KERNEL_STACK_PADDING-PTREGS_SIZE(%esi), %esp
> >
> > call make_task_dead
> > diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> > index 33a955aa01d8..9baf32a7a118 100644
> > --- a/arch/x86/entry/entry_64.S
> > +++ b/arch/x86/entry/entry_64.S
> > @@ -92,7 +92,7 @@ SYM_CODE_START(entry_SYSCALL_64)
> > /* tss.sp2 is scratch space. */
> > movq %rsp, PER_CPU_VAR(cpu_tss_rw + TSS_sp2)
> > SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
> > - movq PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %rsp
> > + movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
> >
> > SYM_INNER_LABEL(entry_SYSCALL_64_safe_stack, SYM_L_GLOBAL)
> > ANNOTATE_NOENDBR
> > @@ -1166,7 +1166,7 @@ SYM_CODE_START(asm_exc_nmi)
> > FENCE_SWAPGS_USER_ENTRY
> > SWITCH_TO_KERNEL_CR3 scratch_reg=%rdx
> > movq %rsp, %rdx
> > - movq PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %rsp
> > + movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
> > UNWIND_HINT_IRET_REGS base=%rdx offset=8
> > pushq 5*8(%rdx) /* pt_regs->ss */
> > pushq 4*8(%rdx) /* pt_regs->rsp */
> > @@ -1484,7 +1484,7 @@ SYM_CODE_START_NOALIGN(rewind_stack_and_make_dead)
> > /* Prevent any naive code from trying to unwind to our caller. */
> > xorl %ebp, %ebp
> >
> > - movq PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %rax
> > + movq PER_CPU_VAR(cpu_current_top_of_stack), %rax
> > leaq -PTREGS_SIZE(%rax), %rsp
> > UNWIND_HINT_REGS
> >
> > diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
> > index ed0a5f2dc129..a45e1125fc6c 100644
> > --- a/arch/x86/entry/entry_64_compat.S
> > +++ b/arch/x86/entry/entry_64_compat.S
> > @@ -57,7 +57,7 @@ SYM_CODE_START(entry_SYSENTER_compat)
> > SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
> > popq %rax
> >
> > - movq PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %rsp
> > + movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
> >
> > /* Construct struct pt_regs on stack */
> > pushq $__USER_DS /* pt_regs->ss */
> > @@ -193,7 +193,7 @@ SYM_CODE_START(entry_SYSCALL_compat)
> > SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
> >
> > /* Switch to the kernel stack */
> > - movq PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %rsp
> > + movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
> >
> > SYM_INNER_LABEL(entry_SYSCALL_compat_safe_stack, SYM_L_GLOBAL)
> > ANNOTATE_NOENDBR
> > diff --git a/arch/x86/include/asm/current.h b/arch/x86/include/asm/current.h
> > index 6fad5a4c21d7..3d1b123c2ee3 100644
> > --- a/arch/x86/include/asm/current.h
> > +++ b/arch/x86/include/asm/current.h
> > @@ -14,7 +14,6 @@ struct task_struct;
> >
> > struct pcpu_hot {
> > struct task_struct *current_task;
> > - unsigned long top_of_stack;
> > };
> >
> > DECLARE_PER_CPU_CACHE_HOT(struct pcpu_hot, pcpu_hot);
> > diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
> > index 7cb4f64b2e60..044410462d36 100644
> > --- a/arch/x86/include/asm/percpu.h
> > +++ b/arch/x86/include/asm/percpu.h
> > @@ -554,7 +554,7 @@ do { \
> > * it is accessed while this_cpu_read_stable() allows the value to be cached.
> > * this_cpu_read_stable() is more efficient and can be used if its value
> > * is guaranteed to be valid across CPUs. The current users include
> > - * pcpu_hot.current_task and pcpu_hot.top_of_stack, both of which are
> > + * pcpu_hot.current_task and cpu_current_top_of_stack, both of which are
> > * actually per-thread variables implemented as per-CPU variables and
> > * thus stable for the duration of the respective task.
> > */
> > diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> > index 54fce8d7504d..b4d51de071f2 100644
> > --- a/arch/x86/include/asm/processor.h
> > +++ b/arch/x86/include/asm/processor.h
> > @@ -427,6 +427,10 @@ DECLARE_PER_CPU_CACHE_HOT(bool, hardirq_stack_inuse);
> > DECLARE_PER_CPU_CACHE_HOT(struct irq_stack *, softirq_stack_ptr);
> > #endif
> >
> > +DECLARE_PER_CPU_CACHE_HOT(unsigned long, cpu_current_top_of_stack);
> > +/* const-qualified alias provided by the linker. */
> > +DECLARE_PER_CPU_CACHE_HOT(const unsigned long __percpu_seg_override, const_cpu_current_top_of_stack);
>
> Please split the above line, like you did with const_current_task declaration.
Fixed in the next version.
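For reference, the split form would just mirror the const_current_task
declaration, roughly (a sketch; exact indentation may differ):

DECLARE_PER_CPU_CACHE_HOT(const unsigned long __percpu_seg_override,
			  const_cpu_current_top_of_stack);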
Brian Gerst
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2 09/11] x86/percpu: Move current_task to percpu hot section
2025-02-26 18:05 [PATCH v2 00/11] Add a percpu subsection for cache hot data Brian Gerst
` (7 preceding siblings ...)
2025-02-26 18:05 ` [PATCH v2 08/11] x86/percpu: Move top_of_stack " Brian Gerst
@ 2025-02-26 18:05 ` Brian Gerst
2025-02-26 18:05 ` [PATCH v2 10/11] x86/stackprotector: Move __stack_chk_guard " Brian Gerst
` (2 subsequent siblings)
11 siblings, 0 replies; 23+ messages in thread
From: Brian Gerst @ 2025-02-26 18:05 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Ingo Molnar, H . Peter Anvin, Thomas Gleixner, Borislav Petkov,
Ard Biesheuvel, Uros Bizjak, Linus Torvalds, Andy Lutomirski,
Peter Zijlstra, Andrew Morton, Brian Gerst
No functional change.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
---
arch/x86/include/asm/current.h | 17 ++++++-----------
arch/x86/include/asm/percpu.h | 2 +-
arch/x86/kernel/asm-offsets.c | 1 -
arch/x86/kernel/cpu/common.c | 8 +++-----
arch/x86/kernel/head_64.S | 4 ++--
arch/x86/kernel/process_32.c | 2 +-
arch/x86/kernel/process_64.c | 2 +-
arch/x86/kernel/smpboot.c | 2 +-
arch/x86/kernel/vmlinux.lds.S | 2 +-
scripts/gdb/linux/cpus.py | 2 +-
10 files changed, 17 insertions(+), 25 deletions(-)
diff --git a/arch/x86/include/asm/current.h b/arch/x86/include/asm/current.h
index 3d1b123c2ee3..dea7d8b854f0 100644
--- a/arch/x86/include/asm/current.h
+++ b/arch/x86/include/asm/current.h
@@ -12,22 +12,17 @@
struct task_struct;
-struct pcpu_hot {
- struct task_struct *current_task;
-};
-
-DECLARE_PER_CPU_CACHE_HOT(struct pcpu_hot, pcpu_hot);
-
-/* const-qualified alias to pcpu_hot, aliased by linker. */
-DECLARE_PER_CPU_CACHE_HOT(const struct pcpu_hot __percpu_seg_override,
- const_pcpu_hot);
+DECLARE_PER_CPU_CACHE_HOT(struct task_struct *, current_task);
+/* const-qualified alias provided by the linker. */
+DECLARE_PER_CPU_CACHE_HOT(struct task_struct * const __percpu_seg_override,
+ const_current_task);
static __always_inline struct task_struct *get_current(void)
{
if (IS_ENABLED(CONFIG_USE_X86_SEG_SUPPORT))
- return this_cpu_read_const(const_pcpu_hot.current_task);
+ return this_cpu_read_const(const_current_task);
- return this_cpu_read_stable(pcpu_hot.current_task);
+ return this_cpu_read_stable(current_task);
}
#define current get_current()
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 044410462d36..e347c6656ce8 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -554,7 +554,7 @@ do { \
* it is accessed while this_cpu_read_stable() allows the value to be cached.
* this_cpu_read_stable() is more efficient and can be used if its value
* is guaranteed to be valid across CPUs. The current users include
- * pcpu_hot.current_task and cpu_current_top_of_stack, both of which are
+ * current_task and cpu_current_top_of_stack, both of which are
* actually per-thread variables implemented as per-CPU variables and
* thus stable for the duration of the respective task.
*/
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 54ace808defd..ad4ea6fb3b6c 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -107,7 +107,6 @@ static void __used common(void)
OFFSET(TSS_sp0, tss_struct, x86_tss.sp0);
OFFSET(TSS_sp1, tss_struct, x86_tss.sp1);
OFFSET(TSS_sp2, tss_struct, x86_tss.sp2);
- OFFSET(X86_current_task, pcpu_hot, current_task);
#if IS_ENABLED(CONFIG_CRYPTO_ARIA_AESNI_AVX_X86_64)
/* Offset for fields in aria_ctx */
BLANK();
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index fc059e9c8867..ac8721a0eb3a 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2014,11 +2014,9 @@ static __init int setup_clearcpuid(char *arg)
}
__setup("clearcpuid=", setup_clearcpuid);
-DEFINE_PER_CPU_CACHE_HOT(struct pcpu_hot, pcpu_hot) = {
- .current_task = &init_task,
-};
-EXPORT_PER_CPU_SYMBOL(pcpu_hot);
-EXPORT_PER_CPU_SYMBOL(const_pcpu_hot);
+DEFINE_PER_CPU_CACHE_HOT(struct task_struct *, current_task) = &init_task;
+EXPORT_PER_CPU_SYMBOL(current_task);
+EXPORT_PER_CPU_SYMBOL(const_current_task);
DEFINE_PER_CPU_CACHE_HOT(int, __preempt_count) = INIT_PREEMPT_COUNT;
EXPORT_PER_CPU_SYMBOL(__preempt_count);
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 2843b0a56198..fefe2a25cf02 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -322,7 +322,7 @@ SYM_INNER_LABEL(common_startup_64, SYM_L_LOCAL)
*
* RDX contains the per-cpu offset
*/
- movq pcpu_hot + X86_current_task(%rdx), %rax
+ movq current_task(%rdx), %rax
movq TASK_threadsp(%rax), %rsp
/*
@@ -433,7 +433,7 @@ SYM_CODE_START(soft_restart_cpu)
UNWIND_HINT_END_OF_STACK
/* Find the idle task stack */
- movq PER_CPU_VAR(pcpu_hot + X86_current_task), %rcx
+ movq PER_CPU_VAR(current_task), %rcx
movq TASK_threadsp(%rcx), %rsp
jmp .Ljump_to_C_code
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 3afb2428bedb..c276dfda387f 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -206,7 +206,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
if (prev->gs | next->gs)
loadsegment(gs, next->gs);
- raw_cpu_write(pcpu_hot.current_task, next_p);
+ raw_cpu_write(current_task, next_p);
switch_fpu_finish(next_p);
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index f68da7b7e50c..13893ec03d85 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -668,7 +668,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
/*
* Switch the PDA and FPU contexts.
*/
- raw_cpu_write(pcpu_hot.current_task, next_p);
+ raw_cpu_write(current_task, next_p);
raw_cpu_write(cpu_current_top_of_stack, task_top_of_stack(next_p));
switch_fpu_finish(next_p);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 15e054f4cbf6..c89545a61d08 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -841,7 +841,7 @@ int common_cpu_up(unsigned int cpu, struct task_struct *idle)
/* Just in case we booted with a single CPU. */
alternatives_enable_smp();
- per_cpu(pcpu_hot.current_task, cpu) = idle;
+ per_cpu(current_task, cpu) = idle;
cpu_init_stack_canary(cpu, idle);
/* Initialize the interrupt stack(s) */
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 85032c085af2..9ac6b42701fa 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -43,7 +43,7 @@ ENTRY(phys_startup_64)
#endif
jiffies = jiffies_64;
-const_pcpu_hot = pcpu_hot;
+const_current_task = current_task;
const_cpu_current_top_of_stack = cpu_current_top_of_stack;
#if defined(CONFIG_X86_64)
diff --git a/scripts/gdb/linux/cpus.py b/scripts/gdb/linux/cpus.py
index 13eb8b3901b8..8f7c4fb78c2c 100644
--- a/scripts/gdb/linux/cpus.py
+++ b/scripts/gdb/linux/cpus.py
@@ -164,7 +164,7 @@ def get_current_task(cpu):
var_ptr = gdb.parse_and_eval("(struct task_struct *)cpu_tasks[0].task")
return var_ptr.dereference()
else:
- var_ptr = gdb.parse_and_eval("&pcpu_hot.current_task")
+ var_ptr = gdb.parse_and_eval("&current_task")
return per_cpu(var_ptr, cpu).dereference()
elif utils.is_target_arch("aarch64"):
current_task_addr = gdb.parse_and_eval("(unsigned long)$SP_EL0")
--
2.48.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 10/11] x86/stackprotector: Move __stack_chk_guard to percpu hot section
2025-02-26 18:05 [PATCH v2 00/11] Add a percpu subsection for cache hot data Brian Gerst
` (8 preceding siblings ...)
2025-02-26 18:05 ` [PATCH v2 09/11] x86/percpu: Move current_task " Brian Gerst
@ 2025-02-26 18:05 ` Brian Gerst
2025-02-26 18:05 ` [PATCH v2 11/11] x86/smp: Move this_cpu_off " Brian Gerst
2025-02-26 20:23 ` [PATCH v2 00/11] Add a percpu subsection for cache hot data Peter Zijlstra
11 siblings, 0 replies; 23+ messages in thread
From: Brian Gerst @ 2025-02-26 18:05 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Ingo Molnar, H . Peter Anvin, Thomas Gleixner, Borislav Petkov,
Ard Biesheuvel, Uros Bizjak, Linus Torvalds, Andy Lutomirski,
Peter Zijlstra, Andrew Morton, Brian Gerst
No functional change.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
---
arch/x86/include/asm/stackprotector.h | 2 +-
arch/x86/kernel/cpu/common.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/stackprotector.h b/arch/x86/include/asm/stackprotector.h
index d43fb589fcf6..cd761b14eb02 100644
--- a/arch/x86/include/asm/stackprotector.h
+++ b/arch/x86/include/asm/stackprotector.h
@@ -20,7 +20,7 @@
#include <linux/sched.h>
-DECLARE_PER_CPU(unsigned long, __stack_chk_guard);
+DECLARE_PER_CPU_CACHE_HOT(unsigned long, __stack_chk_guard);
/*
* Initialize the stackprotector canary value.
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index ac8721a0eb3a..62472b8f798a 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2097,7 +2097,7 @@ void syscall_init(void)
#endif /* CONFIG_X86_64 */
#ifdef CONFIG_STACKPROTECTOR
-DEFINE_PER_CPU(unsigned long, __stack_chk_guard);
+DEFINE_PER_CPU_CACHE_HOT(unsigned long, __stack_chk_guard);
#ifndef CONFIG_SMP
EXPORT_PER_CPU_SYMBOL(__stack_chk_guard);
#endif
--
2.48.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 11/11] x86/smp: Move this_cpu_off to percpu hot section
2025-02-26 18:05 [PATCH v2 00/11] Add a percpu subsection for cache hot data Brian Gerst
` (9 preceding siblings ...)
2025-02-26 18:05 ` [PATCH v2 10/11] x86/stackprotector: Move __stack_chk_guard " Brian Gerst
@ 2025-02-26 18:05 ` Brian Gerst
2025-02-26 20:23 ` [PATCH v2 00/11] Add a percpu subsection for cache hot data Peter Zijlstra
11 siblings, 0 replies; 23+ messages in thread
From: Brian Gerst @ 2025-02-26 18:05 UTC (permalink / raw)
To: linux-kernel, x86
Cc: Ingo Molnar, H . Peter Anvin, Thomas Gleixner, Borislav Petkov,
Ard Biesheuvel, Uros Bizjak, Linus Torvalds, Andy Lutomirski,
Peter Zijlstra, Andrew Morton, Brian Gerst
No functional change.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
---
arch/x86/include/asm/percpu.h | 2 +-
arch/x86/kernel/setup_percpu.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index e347c6656ce8..f29c85a0abf4 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -589,7 +589,7 @@ do { \
#include <asm-generic/percpu.h>
/* We can use this directly for local CPU (faster). */
-DECLARE_PER_CPU_READ_MOSTLY(unsigned long, this_cpu_off);
+DECLARE_PER_CPU_CACHE_HOT(unsigned long, this_cpu_off);
#endif /* !__ASSEMBLY__ */
diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c
index 175afc3ffb12..bfa48e7a32a2 100644
--- a/arch/x86/kernel/setup_percpu.c
+++ b/arch/x86/kernel/setup_percpu.c
@@ -26,7 +26,7 @@
DEFINE_PER_CPU_CACHE_HOT(int, cpu_number);
EXPORT_PER_CPU_SYMBOL(cpu_number);
-DEFINE_PER_CPU_READ_MOSTLY(unsigned long, this_cpu_off);
+DEFINE_PER_CPU_CACHE_HOT(unsigned long, this_cpu_off);
EXPORT_PER_CPU_SYMBOL(this_cpu_off);
unsigned long __per_cpu_offset[NR_CPUS] __ro_after_init;
--
2.48.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH v2 00/11] Add a percpu subsection for cache hot data
2025-02-26 18:05 [PATCH v2 00/11] Add a percpu subsection for cache hot data Brian Gerst
` (10 preceding siblings ...)
2025-02-26 18:05 ` [PATCH v2 11/11] x86/smp: Move this_cpu_off " Brian Gerst
@ 2025-02-26 20:23 ` Peter Zijlstra
2025-02-27 1:29 ` Brian Gerst
11 siblings, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2025-02-26 20:23 UTC (permalink / raw)
To: Brian Gerst
Cc: linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Thomas Gleixner,
Borislav Petkov, Ard Biesheuvel, Uros Bizjak, Linus Torvalds,
Andy Lutomirski, Andrew Morton
On Wed, Feb 26, 2025 at 01:05:19PM -0500, Brian Gerst wrote:
> Add a new percpu subsection for data that is frequently accessed and
> exclusive to each processor. This replaces the pcpu_hot struct on x86,
> and is available to all architectures and the core kernel.
>
> ffffffff842fa000 D __per_cpu_hot_start
> ffffffff842fa000 D hardirq_stack_ptr
> ffffffff842fa008 D __ref_stack_chk_guard
> ffffffff842fa008 D __stack_chk_guard
> ffffffff842fa010 D const_cpu_current_top_of_stack
> ffffffff842fa010 D cpu_current_top_of_stack
> ffffffff842fa018 D const_current_task
> ffffffff842fa018 D current_task
> ffffffff842fa020 D __x86_call_depth
> ffffffff842fa028 D this_cpu_off
> ffffffff842fa030 D __preempt_count
> ffffffff842fa034 D cpu_number
> ffffffff842fa038 D __softirq_pending
> ffffffff842fa03a D hardirq_stack_inuse
> ffffffff842fa040 D __per_cpu_hot_end
The above is useful, but not quite as useful as looking at:
$ pahole -C pcpu_hot defconfig-build/vmlinux.o
struct pcpu_hot {
union {
struct {
struct task_struct * current_task; /* 0 8 */
int preempt_count; /* 8 4 */
int cpu_number; /* 12 4 */
u64 call_depth; /* 16 8 */
long unsigned int top_of_stack; /* 24 8 */
void * hardirq_stack_ptr; /* 32 8 */
u16 softirq_pending; /* 40 2 */
bool hardirq_stack_inuse; /* 42 1 */
}; /* 0 48 */
u8 pad[64]; /* 0 64 */
}; /* 0 64 */
/* size: 64, cachelines: 1, members: 1 */
};
A slightly more useful variant of your listing would be:
$ readelf -Ws defconfig-build/vmlinux | sort -k 2 | awk 'BEGIN {p=0} /__per_cpu_hot_start/ {p=1} { if (p) print $2 " " $3 " " $8 } /__per_cpu_hot_end/ {p=0}'
ffffffff834f5000 0 __per_cpu_hot_start
ffffffff834f5000 8 hardirq_stack_ptr
ffffffff834f5008 0 __ref_stack_chk_guard
ffffffff834f5008 8 __stack_chk_guard
ffffffff834f5010 0 const_cpu_current_top_of_stack
ffffffff834f5010 8 cpu_current_top_of_stack
ffffffff834f5018 0 const_current_task
ffffffff834f5018 8 current_task
ffffffff834f5020 8 __x86_call_depth
ffffffff834f5028 8 this_cpu_off
ffffffff834f5030 4 __preempt_count
ffffffff834f5034 4 cpu_number
ffffffff834f5038 2 __softirq_pending
ffffffff834f503a 1 hardirq_stack_inuse
ffffffff834f5040 0 __per_cpu_hot_end
as it also gets the size for each symbol. Allowing us to compute the
hole as 0x40-0x3b, or 5 bytes.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 00/11] Add a percpu subsection for cache hot data
2025-02-26 20:23 ` [PATCH v2 00/11] Add a percpu subsection for cache hot data Peter Zijlstra
@ 2025-02-27 1:29 ` Brian Gerst
2025-02-27 1:52 ` Brian Gerst
0 siblings, 1 reply; 23+ messages in thread
From: Brian Gerst @ 2025-02-27 1:29 UTC (permalink / raw)
To: Peter Zijlstra
Cc: linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Thomas Gleixner,
Borislav Petkov, Ard Biesheuvel, Uros Bizjak, Linus Torvalds,
Andy Lutomirski, Andrew Morton
On Wed, Feb 26, 2025 at 3:23 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Feb 26, 2025 at 01:05:19PM -0500, Brian Gerst wrote:
> > Add a new percpu subsection for data that is frequently accessed and
> > exclusive to each processor. This replaces the pcpu_hot struct on x86,
> > and is available to all architectures and the core kernel.
> >
> > ffffffff842fa000 D __per_cpu_hot_start
> > ffffffff842fa000 D hardirq_stack_ptr
> > ffffffff842fa008 D __ref_stack_chk_guard
> > ffffffff842fa008 D __stack_chk_guard
> > ffffffff842fa010 D const_cpu_current_top_of_stack
> > ffffffff842fa010 D cpu_current_top_of_stack
> > ffffffff842fa018 D const_current_task
> > ffffffff842fa018 D current_task
> > ffffffff842fa020 D __x86_call_depth
> > ffffffff842fa028 D this_cpu_off
> > ffffffff842fa030 D __preempt_count
> > ffffffff842fa034 D cpu_number
> > ffffffff842fa038 D __softirq_pending
> > ffffffff842fa03a D hardirq_stack_inuse
> > ffffffff842fa040 D __per_cpu_hot_end
>
> The above is useful, but not quite as useful as looking at:
>
> $ pahole -C pcpu_hot defconfig-build/vmlinux.o
> struct pcpu_hot {
> union {
> struct {
> struct task_struct * current_task; /* 0 8 */
> int preempt_count; /* 8 4 */
> int cpu_number; /* 12 4 */
> u64 call_depth; /* 16 8 */
> long unsigned int top_of_stack; /* 24 8 */
> void * hardirq_stack_ptr; /* 32 8 */
> u16 softirq_pending; /* 40 2 */
> bool hardirq_stack_inuse; /* 42 1 */
> }; /* 0 48 */
> u8 pad[64]; /* 0 64 */
> }; /* 0 64 */
>
> /* size: 64, cachelines: 1, members: 1 */
> };
>
> A slightly more useful variant of your listing would be:
>
> $ readelf -Ws defconfig-build/vmlinux | sort -k 2 | awk 'BEGIN {p=0} /__per_cpu_hot_start/ {p=1} { if (p) print $2 " " $3 " " $8 } /__per_cpu_hot_end/ {p=0}'
> ffffffff834f5000 0 __per_cpu_hot_start
> ffffffff834f5000 8 hardirq_stack_ptr
> ffffffff834f5008 0 __ref_stack_chk_guard
> ffffffff834f5008 8 __stack_chk_guard
> ffffffff834f5010 0 const_cpu_current_top_of_stack
> ffffffff834f5010 8 cpu_current_top_of_stack
> ffffffff834f5018 0 const_current_task
> ffffffff834f5018 8 current_task
> ffffffff834f5020 8 __x86_call_depth
> ffffffff834f5028 8 this_cpu_off
> ffffffff834f5030 4 __preempt_count
> ffffffff834f5034 4 cpu_number
> ffffffff834f5038 2 __softirq_pending
> ffffffff834f503a 1 hardirq_stack_inuse
> ffffffff834f5040 0 __per_cpu_hot_end
>
> as it also gets the size for each symbol. Allowing us to compute the
> hole as 0x40-0x3b, or 5 bytes.
If all the variables in this section are scalar or pointer types,
SORT_BY_ALIGNMENT() should result in no padding between them. I can
add a __per_cpu_hot_pad symbol to show the actual end of the data
(not aligned to the next cacheline like __per_cpu_hot_end).
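Roughly, the linker script fragment would then look like this (a sketch;
the __per_cpu_hot_pad name and the exact input-section wildcard are
tentative):

	__per_cpu_hot_start = .;
	*(SORT_BY_ALIGNMENT(.data..percpu..hot.*))
	__per_cpu_hot_pad = .;		/* actual end of the hot data */
	. = ALIGN(cacheline);
	__per_cpu_hot_end = .;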
Brian Gerst
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 00/11] Add a percpu subsection for cache hot data
2025-02-27 1:29 ` Brian Gerst
@ 2025-02-27 1:52 ` Brian Gerst
0 siblings, 0 replies; 23+ messages in thread
From: Brian Gerst @ 2025-02-27 1:52 UTC (permalink / raw)
To: Peter Zijlstra
Cc: linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Thomas Gleixner,
Borislav Petkov, Ard Biesheuvel, Uros Bizjak, Linus Torvalds,
Andy Lutomirski, Andrew Morton
On Wed, Feb 26, 2025 at 8:29 PM Brian Gerst <brgerst@gmail.com> wrote:
>
> On Wed, Feb 26, 2025 at 3:23 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Wed, Feb 26, 2025 at 01:05:19PM -0500, Brian Gerst wrote:
> > > Add a new percpu subsection for data that is frequently accessed and
> > > exclusive to each processor. This replaces the pcpu_hot struct on x86,
> > > and is available to all architectures and the core kernel.
> > >
> > > ffffffff842fa000 D __per_cpu_hot_start
> > > ffffffff842fa000 D hardirq_stack_ptr
> > > ffffffff842fa008 D __ref_stack_chk_guard
> > > ffffffff842fa008 D __stack_chk_guard
> > > ffffffff842fa010 D const_cpu_current_top_of_stack
> > > ffffffff842fa010 D cpu_current_top_of_stack
> > > ffffffff842fa018 D const_current_task
> > > ffffffff842fa018 D current_task
> > > ffffffff842fa020 D __x86_call_depth
> > > ffffffff842fa028 D this_cpu_off
> > > ffffffff842fa030 D __preempt_count
> > > ffffffff842fa034 D cpu_number
> > > ffffffff842fa038 D __softirq_pending
> > > ffffffff842fa03a D hardirq_stack_inuse
> > > ffffffff842fa040 D __per_cpu_hot_end
> >
> > The above is useful, but not quite as useful as looking at:
> >
> > $ pahole -C pcpu_hot defconfig-build/vmlinux.o
> > struct pcpu_hot {
> > union {
> > struct {
> > struct task_struct * current_task; /* 0 8 */
> > int preempt_count; /* 8 4 */
> > int cpu_number; /* 12 4 */
> > u64 call_depth; /* 16 8 */
> > long unsigned int top_of_stack; /* 24 8 */
> > void * hardirq_stack_ptr; /* 32 8 */
> > u16 softirq_pending; /* 40 2 */
> > bool hardirq_stack_inuse; /* 42 1 */
> > }; /* 0 48 */
> > u8 pad[64]; /* 0 64 */
> > }; /* 0 64 */
> >
> > /* size: 64, cachelines: 1, members: 1 */
> > };
> >
> > A slightly more useful variant of your listing would be:
> >
> > $ readelf -Ws defconfig-build/vmlinux | sort -k 2 | awk 'BEGIN {p=0} /__per_cpu_hot_start/ {p=1} { if (p) print $2 " " $3 " " $8 } /__per_cpu_hot_end/ {p=0}'
> > ffffffff834f5000 0 __per_cpu_hot_start
> > ffffffff834f5000 8 hardirq_stack_ptr
> > ffffffff834f5008 0 __ref_stack_chk_guard
> > ffffffff834f5008 8 __stack_chk_guard
> > ffffffff834f5010 0 const_cpu_current_top_of_stack
> > ffffffff834f5010 8 cpu_current_top_of_stack
> > ffffffff834f5018 0 const_current_task
> > ffffffff834f5018 8 current_task
> > ffffffff834f5020 8 __x86_call_depth
> > ffffffff834f5028 8 this_cpu_off
> > ffffffff834f5030 4 __preempt_count
> > ffffffff834f5034 4 cpu_number
> > ffffffff834f5038 2 __softirq_pending
> > ffffffff834f503a 1 hardirq_stack_inuse
> > ffffffff834f5040 0 __per_cpu_hot_end
> >
> > as it also gets the size for each symbol. Allowing us to compute the
> > hole as 0x40-0x3b, or 5 bytes.
>
> If all the variables in this section are scalar or pointer types,
> SORT_BY_ALIGNMENT() should result in no padding between them. I can
> add a __per_cpu_hot_pad symbol to show the actual end of the data
> (not aligned to the next cacheline like __per_cpu_hot_end).
Is this better? (from System.map)
ffffffff834f5000 D __per_cpu_hot_start
ffffffff834f5000 D hardirq_stack_ptr
ffffffff834f5008 D __ref_stack_chk_guard
ffffffff834f5008 D __stack_chk_guard
ffffffff834f5010 D const_cpu_current_top_of_stack
ffffffff834f5010 D cpu_current_top_of_stack
ffffffff834f5018 D const_current_task
ffffffff834f5018 D current_task
ffffffff834f5020 D __x86_call_depth
ffffffff834f5028 D this_cpu_off
ffffffff834f5030 D __preempt_count
ffffffff834f5034 D cpu_number
ffffffff834f5038 D __softirq_pending
ffffffff834f503a D hardirq_stack_inuse
ffffffff834f503b D __per_cpu_hot_pad
ffffffff834f5040 D __per_cpu_hot_end
Brian Gerst
^ permalink raw reply [flat|nested] 23+ messages in thread