* [PATCH v2 0/3] sched: make migrate_enable/migrate_disable inline
From: Menglong Dong @ 2025-08-19 1:58 UTC
To: peterz
Cc: mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, ast, daniel, john.fastabend, andrii,
martin.lau, eddyz87, song, yonghong.song, kpsingh, sdf, haoluo,
jolsa, simona.vetter, tzimmermann, jani.nikula, linux-kernel, bpf
In this series, we make migrate_enable/migrate_disable inline to obtain
better performance in some cases.
In the first patch, we add the macro "COMPILE_OFFSETS" to all the
asm-offsets.c files to avoid a circular dependency in the 2nd patch.
In the 2nd patch, we generate the offset of nr_pinned in "struct rq" with
rq-offsets.c, as "struct rq" is defined internally in kernel/sched/sched.h
and we need to access the "nr_pinned" field in migrate_enable and
migrate_disable. Then, we move the definition of
migrate_enable/migrate_disable from kernel/sched/core.c to
include/linux/sched.h.
In the 3rd patch, we fix some typos in include/linux/preempt.h.
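For context, a minimal sketch of the usage pattern these helpers serve
(the guard form comes from the DEFINE_LOCK_GUARD_0(migrate, ...) kept by
the 2nd patch):

  migrate_disable();
  /* code that must stay on this CPU, but remains preemptible */
  migrate_enable();

  /* or, equivalently, scope-based: */
  guard(migrate)();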
One of the beneficiaries of this series is the BPF trampoline. Without
this series, migrate_enable/migrate_disable are hot when we run the
benchmark for FENTRY, FEXIT, MODIFY_RETURN, etc.:
54.63% bpf_prog_2dcccf652aac1793_bench_trigger_fentry [k]
bpf_prog_2dcccf652aac1793_bench_trigger_fentry
10.43% [kernel] [k] migrate_enable
10.07% bpf_trampoline_6442517037 [k] bpf_trampoline_6442517037
8.06% [kernel] [k] __bpf_prog_exit_recur
4.11% libc.so.6 [.] syscall
2.15% [kernel] [k] entry_SYSCALL_64
1.48% [kernel] [k] memchr_inv
1.32% [kernel] [k] fput
1.16% [kernel] [k] _copy_to_user
0.73% [kernel] [k] bpf_prog_test_run_raw_tp
Before this series, the performance of BPF FENTRY is:
fentry : 113.030 ± 0.149M/s
fentry : 112.501 ± 0.187M/s
fentry : 112.828 ± 0.267M/s
fentry : 115.287 ± 0.241M/s
After this series, the performance of BPF FENTRY increases to:
fentry : 143.644 ± 0.670M/s
fentry : 149.764 ± 0.362M/s
fentry : 149.642 ± 0.156M/s
fentry : 145.263 ± 0.221M/s
Changes since V1:
* use PERCPU_PTR() for this_rq_raw() if !CONFIG_SMP in the 2nd patch
Menglong Dong (3):
arch: add the macro COMPILE_OFFSETS to all the asm-offsets.c
sched: make migrate_enable/migrate_disable inline
sched: fix some typos in include/linux/preempt.h
Kbuild | 13 ++++-
arch/alpha/kernel/asm-offsets.c | 1 +
arch/arc/kernel/asm-offsets.c | 1 +
arch/arm/kernel/asm-offsets.c | 2 +
arch/arm64/kernel/asm-offsets.c | 1 +
arch/csky/kernel/asm-offsets.c | 1 +
arch/hexagon/kernel/asm-offsets.c | 1 +
arch/loongarch/kernel/asm-offsets.c | 2 +
arch/m68k/kernel/asm-offsets.c | 1 +
arch/microblaze/kernel/asm-offsets.c | 1 +
arch/mips/kernel/asm-offsets.c | 2 +
arch/nios2/kernel/asm-offsets.c | 1 +
arch/openrisc/kernel/asm-offsets.c | 1 +
arch/parisc/kernel/asm-offsets.c | 1 +
arch/powerpc/kernel/asm-offsets.c | 1 +
arch/riscv/kernel/asm-offsets.c | 1 +
arch/s390/kernel/asm-offsets.c | 1 +
arch/sh/kernel/asm-offsets.c | 1 +
arch/sparc/kernel/asm-offsets.c | 1 +
arch/um/kernel/asm-offsets.c | 2 +
arch/xtensa/kernel/asm-offsets.c | 1 +
include/linux/preempt.h | 11 ++--
include/linux/sched.h | 77 ++++++++++++++++++++++++++++
kernel/bpf/verifier.c | 3 +-
kernel/sched/core.c | 56 ++------------------
kernel/sched/rq-offsets.c | 12 +++++
26 files changed, 134 insertions(+), 62 deletions(-)
create mode 100644 kernel/sched/rq-offsets.c
--
2.50.1
* [PATCH v2 1/3] arch: add the macro COMPILE_OFFSETS to all the asm-offsets.c
From: Menglong Dong @ 2025-08-19 1:58 UTC
To: peterz
Cc: mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, ast, daniel, john.fastabend, andrii,
martin.lau, eddyz87, song, yonghong.song, kpsingh, sdf, haoluo,
jolsa, simona.vetter, tzimmermann, jani.nikula, linux-kernel, bpf
include/generated/asm-offsets.h is generated by Kbuild during the build
from arch/$(SRCARCH)/kernel/asm-offsets.c. When we want to generate
another similar offset header file, a circular dependency can arise.

For example, suppose we want to generate an offset header file
include/generated/test.h, which is included by include/linux/sched.h. If
we generate asm-offsets.h first, the build fails, as include/linux/sched.h
is included by asm-offsets.c and include/generated/test.h doesn't exist
yet; if we generate test.h first, it can't succeed either, as
include/generated/asm-offsets.h is included by it.

On x86_64, the macro COMPILE_OFFSETS is already used to avoid such a
circular dependency: we generate asm-offsets.h first, and skip including
"generated/test.h" when COMPILE_OFFSETS is defined.

So we define the macro COMPILE_OFFSETS in all the asm-offsets.c files for
this purpose.
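For illustration, the guard pattern this enables looks like the following
sketch, with test.h being the hypothetical header from the example above:

  #ifndef COMPILE_OFFSETS
  #include <generated/test.h>
  #endif

Since every asm-offsets.c now defines COMPILE_OFFSETS before its includes,
the not-yet-generated header is skipped while asm-offsets.h itself is
being produced, which breaks the cycle.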
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
arch/alpha/kernel/asm-offsets.c | 1 +
arch/arc/kernel/asm-offsets.c | 1 +
arch/arm/kernel/asm-offsets.c | 2 ++
arch/arm64/kernel/asm-offsets.c | 1 +
arch/csky/kernel/asm-offsets.c | 1 +
arch/hexagon/kernel/asm-offsets.c | 1 +
arch/loongarch/kernel/asm-offsets.c | 2 ++
arch/m68k/kernel/asm-offsets.c | 1 +
arch/microblaze/kernel/asm-offsets.c | 1 +
arch/mips/kernel/asm-offsets.c | 2 ++
arch/nios2/kernel/asm-offsets.c | 1 +
arch/openrisc/kernel/asm-offsets.c | 1 +
arch/parisc/kernel/asm-offsets.c | 1 +
arch/powerpc/kernel/asm-offsets.c | 1 +
arch/riscv/kernel/asm-offsets.c | 1 +
arch/s390/kernel/asm-offsets.c | 1 +
arch/sh/kernel/asm-offsets.c | 1 +
arch/sparc/kernel/asm-offsets.c | 1 +
arch/um/kernel/asm-offsets.c | 2 ++
arch/xtensa/kernel/asm-offsets.c | 1 +
20 files changed, 24 insertions(+)
diff --git a/arch/alpha/kernel/asm-offsets.c b/arch/alpha/kernel/asm-offsets.c
index e9dad60b147f..1ebb05890499 100644
--- a/arch/alpha/kernel/asm-offsets.c
+++ b/arch/alpha/kernel/asm-offsets.c
@@ -4,6 +4,7 @@
* This code generates raw asm output which is post-processed to extract
* and format the required data.
*/
+#define COMPILE_OFFSETS
#include <linux/types.h>
#include <linux/stddef.h>
diff --git a/arch/arc/kernel/asm-offsets.c b/arch/arc/kernel/asm-offsets.c
index f77deb799175..2978da85fcb6 100644
--- a/arch/arc/kernel/asm-offsets.c
+++ b/arch/arc/kernel/asm-offsets.c
@@ -2,6 +2,7 @@
/*
* Copyright (C) 2004, 2007-2010, 2011-2012 Synopsys, Inc. (www.synopsys.com)
*/
+#define COMPILE_OFFSETS
#include <linux/sched.h>
#include <linux/mm.h>
diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index 123f4a8ef446..2101938d27fc 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -7,6 +7,8 @@
* This code generates raw asm output which is post-processed to extract
* and format the required data.
*/
+#define COMPILE_OFFSETS
+
#include <linux/compiler.h>
#include <linux/sched.h>
#include <linux/mm.h>
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 30d4bbe68661..b6367ff3a49c 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -6,6 +6,7 @@
* 2001-2002 Keith Owens
* Copyright (C) 2012 ARM Ltd.
*/
+#define COMPILE_OFFSETS
#include <linux/arm_sdei.h>
#include <linux/sched.h>
diff --git a/arch/csky/kernel/asm-offsets.c b/arch/csky/kernel/asm-offsets.c
index d1e903579473..5525c8e7e1d9 100644
--- a/arch/csky/kernel/asm-offsets.c
+++ b/arch/csky/kernel/asm-offsets.c
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: GPL-2.0
// Copyright (C) 2018 Hangzhou C-SKY Microsystems co.,ltd.
+#define COMPILE_OFFSETS
#include <linux/sched.h>
#include <linux/kernel_stat.h>
diff --git a/arch/hexagon/kernel/asm-offsets.c b/arch/hexagon/kernel/asm-offsets.c
index 03a7063f9456..50eea9fa6f13 100644
--- a/arch/hexagon/kernel/asm-offsets.c
+++ b/arch/hexagon/kernel/asm-offsets.c
@@ -8,6 +8,7 @@
*
* Copyright (c) 2010-2012, The Linux Foundation. All rights reserved.
*/
+#define COMPILE_OFFSETS
#include <linux/compat.h>
#include <linux/types.h>
diff --git a/arch/loongarch/kernel/asm-offsets.c b/arch/loongarch/kernel/asm-offsets.c
index db1e4bb26b6a..3017c7157600 100644
--- a/arch/loongarch/kernel/asm-offsets.c
+++ b/arch/loongarch/kernel/asm-offsets.c
@@ -4,6 +4,8 @@
*
* Copyright (C) 2020-2022 Loongson Technology Corporation Limited
*/
+#define COMPILE_OFFSETS
+
#include <linux/types.h>
#include <linux/sched.h>
#include <linux/mm.h>
diff --git a/arch/m68k/kernel/asm-offsets.c b/arch/m68k/kernel/asm-offsets.c
index 906d73230537..67a1990f9d74 100644
--- a/arch/m68k/kernel/asm-offsets.c
+++ b/arch/m68k/kernel/asm-offsets.c
@@ -9,6 +9,7 @@
* #defines from the assembly-language output.
*/
+#define COMPILE_OFFSETS
#define ASM_OFFSETS_C
#include <linux/stddef.h>
diff --git a/arch/microblaze/kernel/asm-offsets.c b/arch/microblaze/kernel/asm-offsets.c
index 104c3ac5f30c..b4b67d58e7f6 100644
--- a/arch/microblaze/kernel/asm-offsets.c
+++ b/arch/microblaze/kernel/asm-offsets.c
@@ -7,6 +7,7 @@
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*/
+#define COMPILE_OFFSETS
#include <linux/init.h>
#include <linux/stddef.h>
diff --git a/arch/mips/kernel/asm-offsets.c b/arch/mips/kernel/asm-offsets.c
index 1e29efcba46e..5debd9a3854a 100644
--- a/arch/mips/kernel/asm-offsets.c
+++ b/arch/mips/kernel/asm-offsets.c
@@ -9,6 +9,8 @@
* Kevin Kissell, kevink@mips.com and Carsten Langgaard, carstenl@mips.com
* Copyright (C) 2000 MIPS Technologies, Inc.
*/
+#define COMPILE_OFFSETS
+
#include <linux/compat.h>
#include <linux/types.h>
#include <linux/sched.h>
diff --git a/arch/nios2/kernel/asm-offsets.c b/arch/nios2/kernel/asm-offsets.c
index e3d9b7b6fb48..88190b503ce5 100644
--- a/arch/nios2/kernel/asm-offsets.c
+++ b/arch/nios2/kernel/asm-offsets.c
@@ -2,6 +2,7 @@
/*
* Copyright (C) 2011 Tobias Klauser <tklauser@distanz.ch>
*/
+#define COMPILE_OFFSETS
#include <linux/stddef.h>
#include <linux/sched.h>
diff --git a/arch/openrisc/kernel/asm-offsets.c b/arch/openrisc/kernel/asm-offsets.c
index 710651d5aaae..3cc826f2216b 100644
--- a/arch/openrisc/kernel/asm-offsets.c
+++ b/arch/openrisc/kernel/asm-offsets.c
@@ -18,6 +18,7 @@
* compile this file to assembler, and then extract the
* #defines from the assembly-language output.
*/
+#define COMPILE_OFFSETS
#include <linux/signal.h>
#include <linux/sched.h>
diff --git a/arch/parisc/kernel/asm-offsets.c b/arch/parisc/kernel/asm-offsets.c
index 757816a7bd4b..9abfe65492c6 100644
--- a/arch/parisc/kernel/asm-offsets.c
+++ b/arch/parisc/kernel/asm-offsets.c
@@ -13,6 +13,7 @@
* Copyright (C) 2002 Randolph Chung <tausq with parisc-linux.org>
* Copyright (C) 2003 James Bottomley <jejb at parisc-linux.org>
*/
+#define COMPILE_OFFSETS
#include <linux/types.h>
#include <linux/sched.h>
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index b3048f6d3822..a4bc80b30410 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -8,6 +8,7 @@
* compile this file to assembler, and then extract the
* #defines from the assembly-language output.
*/
+#define COMPILE_OFFSETS
#include <linux/compat.h>
#include <linux/signal.h>
diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
index 6e8c0d6feae9..7d42d3b8a32a 100644
--- a/arch/riscv/kernel/asm-offsets.c
+++ b/arch/riscv/kernel/asm-offsets.c
@@ -3,6 +3,7 @@
* Copyright (C) 2012 Regents of the University of California
* Copyright (C) 2017 SiFive
*/
+#define COMPILE_OFFSETS
#include <linux/kbuild.h>
#include <linux/mm.h>
diff --git a/arch/s390/kernel/asm-offsets.c b/arch/s390/kernel/asm-offsets.c
index 95ecad9c7d7d..a8915663e917 100644
--- a/arch/s390/kernel/asm-offsets.c
+++ b/arch/s390/kernel/asm-offsets.c
@@ -4,6 +4,7 @@
* This code generates raw asm output which is post-processed to extract
* and format the required data.
*/
+#define COMPILE_OFFSETS
#include <linux/kbuild.h>
#include <linux/sched.h>
diff --git a/arch/sh/kernel/asm-offsets.c b/arch/sh/kernel/asm-offsets.c
index a0322e832845..429b6a763146 100644
--- a/arch/sh/kernel/asm-offsets.c
+++ b/arch/sh/kernel/asm-offsets.c
@@ -8,6 +8,7 @@
* compile this file to assembler, and then extract the
* #defines from the assembly-language output.
*/
+#define COMPILE_OFFSETS
#include <linux/stddef.h>
#include <linux/types.h>
diff --git a/arch/sparc/kernel/asm-offsets.c b/arch/sparc/kernel/asm-offsets.c
index 3d9b9855dce9..6e660bde48dd 100644
--- a/arch/sparc/kernel/asm-offsets.c
+++ b/arch/sparc/kernel/asm-offsets.c
@@ -10,6 +10,7 @@
*
* On sparc, thread_info data is static and TI_XXX offsets are computed by hand.
*/
+#define COMPILE_OFFSETS
#include <linux/sched.h>
#include <linux/mm_types.h>
diff --git a/arch/um/kernel/asm-offsets.c b/arch/um/kernel/asm-offsets.c
index 1fb12235ab9c..a69873aa697f 100644
--- a/arch/um/kernel/asm-offsets.c
+++ b/arch/um/kernel/asm-offsets.c
@@ -1 +1,3 @@
+#define COMPILE_OFFSETS
+
#include <sysdep/kernel-offsets.h>
diff --git a/arch/xtensa/kernel/asm-offsets.c b/arch/xtensa/kernel/asm-offsets.c
index da38de20ae59..cfbced95e944 100644
--- a/arch/xtensa/kernel/asm-offsets.c
+++ b/arch/xtensa/kernel/asm-offsets.c
@@ -11,6 +11,7 @@
*
* Chris Zankel <chris@zankel.net>
*/
+#define COMPILE_OFFSETS
#include <asm/processor.h>
#include <asm/coprocessor.h>
--
2.50.1
* [PATCH v2 2/3] sched: make migrate_enable/migrate_disable inline
From: Menglong Dong @ 2025-08-19 1:58 UTC
To: peterz
Cc: mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, ast, daniel, john.fastabend, andrii,
martin.lau, eddyz87, song, yonghong.song, kpsingh, sdf, haoluo,
jolsa, simona.vetter, tzimmermann, jani.nikula, linux-kernel, bpf
For now, migrate_enable and migrate_disable are global functions, which
makes them hot spots in some cases. Take BPF for example: the calls to
migrate_enable and migrate_disable in the BPF trampoline can introduce
significant overhead. The following is the 'perf top' output of FENTRY's
benchmark (./tools/testing/selftests/bpf/bench trig-fentry):
54.63% bpf_prog_2dcccf652aac1793_bench_trigger_fentry [k]
bpf_prog_2dcccf652aac1793_bench_trigger_fentry
10.43% [kernel] [k] migrate_enable
10.07% bpf_trampoline_6442517037 [k] bpf_trampoline_6442517037
8.06% [kernel] [k] __bpf_prog_exit_recur
4.11% libc.so.6 [.] syscall
2.15% [kernel] [k] entry_SYSCALL_64
1.48% [kernel] [k] memchr_inv
1.32% [kernel] [k] fput
1.16% [kernel] [k] _copy_to_user
0.73% [kernel] [k] bpf_prog_test_run_raw_tp
So in this commit, we make migrate_enable/migrate_disable inline to
obtain better performance. The struct rq is defined internally in
kernel/sched/sched.h, and its field "nr_pinned" is accessed in
migrate_enable/migrate_disable, which makes it hard to inline them.

Alexei Starovoitov suggested generating the offset of "nr_pinned" in [1],
so we can define migrate_enable/migrate_disable in include/linux/sched.h
and access "this_rq()->nr_pinned" as "(void *)this_rq() + RQ_nr_pinned".
The offset of "nr_pinned" is generated in include/generated/rq-offsets.h
by kernel/sched/rq-offsets.c.
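For illustration, the generated header is just a guarded list of #define
lines produced by the filechk offsets rule; the offset value below is
hypothetical and depends on the kernel configuration:

  /* include/generated/rq-offsets.h -- illustrative output only */
  #ifndef __RQ_OFFSETS_H__
  #define __RQ_OFFSETS_H__
  #define RQ_nr_pinned 2344 /* offsetof(struct rq, nr_pinned) */
  #endif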
Generally speaking, we move the definitions of migrate_enable and
migrate_disable from kernel/sched/core.c to include/linux/sched.h. The
call to __set_cpus_allowed_ptr() is left in __migrate_enable().

The full definition of "struct rq" is not available in
include/linux/sched.h, so we can't access "runqueues" with this_cpu_ptr():
the compilation fails in this_cpu_ptr() -> raw_cpu_ptr() ->
__verify_pcpu_ptr() on the expression:

  typeof((ptr) + 0)

So we introduce this_rq_raw() and access the runqueues with
arch_raw_cpu_ptr() directly.
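To see why, here is the type check from include/linux/percpu-defs.h
(simplified): pointer arithmetic on a pointer to the incomplete type
"struct rq" is invalid C, so typeof((ptr) + 0) cannot compile outside
kernel/sched/:

  #define __verify_pcpu_ptr(ptr)					\
  do {									\
  	const void __percpu *__vpp_verify = (typeof((ptr) + 0))NULL;	\
  	(void)__vpp_verify;						\
  } while (0)

arch_raw_cpu_ptr() skips this check, so this_rq_raw() works with only a
forward declaration of "struct rq".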
Before this patch, the performance of BPF FENTRY is:
fentry : 113.030 ± 0.149M/s
fentry : 112.501 ± 0.187M/s
fentry : 112.828 ± 0.267M/s
fentry : 115.287 ± 0.241M/s
After this patch, the performance of BPF FENTRY increases to:
fentry : 143.644 ± 0.670M/s
fentry : 149.764 ± 0.362M/s
fentry : 149.642 ± 0.156M/s
fentry : 145.263 ± 0.221M/s
Link: https://lore.kernel.org/bpf/CAADnVQ+5sEDKHdsJY5ZsfGDO_1SEhhQWHrt2SMBG5SYyQ+jt7w@mail.gmail.com/ [1]
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
v2:
- use PERCPU_PTR() for this_rq_raw() if !CONFIG_SMP
---
Kbuild | 13 ++++++-
include/linux/preempt.h | 3 --
include/linux/sched.h | 77 +++++++++++++++++++++++++++++++++++++++
kernel/bpf/verifier.c | 3 +-
kernel/sched/core.c | 56 ++--------------------------
kernel/sched/rq-offsets.c | 12 ++++++
6 files changed, 106 insertions(+), 58 deletions(-)
create mode 100644 kernel/sched/rq-offsets.c
diff --git a/Kbuild b/Kbuild
index f327ca86990c..13324b4bbe23 100644
--- a/Kbuild
+++ b/Kbuild
@@ -34,13 +34,24 @@ arch/$(SRCARCH)/kernel/asm-offsets.s: $(timeconst-file) $(bounds-file)
$(offsets-file): arch/$(SRCARCH)/kernel/asm-offsets.s FORCE
$(call filechk,offsets,__ASM_OFFSETS_H__)
+# Generate rq-offsets.h
+
+rq-offsets-file := include/generated/rq-offsets.h
+
+targets += kernel/sched/rq-offsets.s
+
+kernel/sched/rq-offsets.s: $(offsets-file)
+
+$(rq-offsets-file): kernel/sched/rq-offsets.s FORCE
+ $(call filechk,offsets,__RQ_OFFSETS_H__)
+
# Check for missing system calls
quiet_cmd_syscalls = CALL $<
cmd_syscalls = $(CONFIG_SHELL) $< $(CC) $(c_flags) $(missing_syscalls_flags)
PHONY += missing-syscalls
-missing-syscalls: scripts/checksyscalls.sh $(offsets-file)
+missing-syscalls: scripts/checksyscalls.sh $(rq-offsets-file)
$(call cmd,syscalls)
# Check the manual modification of atomic headers
diff --git a/include/linux/preempt.h b/include/linux/preempt.h
index 1fad1c8a4c76..92237c319035 100644
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -424,8 +424,6 @@ static inline void preempt_notifier_init(struct preempt_notifier *notifier,
* work-conserving schedulers.
*
*/
-extern void migrate_disable(void);
-extern void migrate_enable(void);
/**
* preempt_disable_nested - Disable preemption inside a normally preempt disabled section
@@ -471,7 +469,6 @@ static __always_inline void preempt_enable_nested(void)
DEFINE_LOCK_GUARD_0(preempt, preempt_disable(), preempt_enable())
DEFINE_LOCK_GUARD_0(preempt_notrace, preempt_disable_notrace(), preempt_enable_notrace())
-DEFINE_LOCK_GUARD_0(migrate, migrate_disable(), migrate_enable())
#ifdef CONFIG_PREEMPT_DYNAMIC
diff --git a/include/linux/sched.h b/include/linux/sched.h
index f8188b833350..b554a1e65e3e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -49,6 +49,9 @@
#include <linux/tracepoint-defs.h>
#include <linux/unwind_deferred_types.h>
#include <asm/kmap_size.h>
+#ifndef COMPILE_OFFSETS
+#include <generated/rq-offsets.h>
+#endif
/* task_struct member predeclarations (sorted alphabetically): */
struct audit_context;
@@ -2312,4 +2315,78 @@ static __always_inline void alloc_tag_restore(struct alloc_tag *tag, struct allo
#define alloc_tag_restore(_tag, _old) do {} while (0)
#endif
+#ifndef COMPILE_OFFSETS
+
+extern void __migrate_enable(void);
+
+struct rq;
+DECLARE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
+
+#ifdef CONFIG_SMP
+#define this_rq_raw() arch_raw_cpu_ptr(&runqueues)
+#else
+#define this_rq_raw() PERCPU_PTR(&runqueues)
+#endif
+
+static inline void migrate_enable(void)
+{
+ struct task_struct *p = current;
+
+#ifdef CONFIG_DEBUG_PREEMPT
+ /*
+ * Check both overflow from migrate_disable() and superfluous
+ * migrate_enable().
+ */
+ if (WARN_ON_ONCE((s16)p->migration_disabled <= 0))
+ return;
+#endif
+
+ if (p->migration_disabled > 1) {
+ p->migration_disabled--;
+ return;
+ }
+
+ /*
+ * Ensure stop_task runs either before or after this, and that
+ * __set_cpus_allowed_ptr(SCA_MIGRATE_ENABLE) doesn't schedule().
+ */
+ guard(preempt)();
+ if (unlikely(p->cpus_ptr != &p->cpus_mask))
+ __migrate_enable();
+ /*
+ * Mustn't clear migration_disabled() until cpus_ptr points back at the
+ * regular cpus_mask, otherwise things that race (eg.
+ * select_fallback_rq) get confused.
+ */
+ barrier();
+ p->migration_disabled = 0;
+ (*(unsigned int *)((void *)this_rq_raw() + RQ_nr_pinned))--;
+}
+
+static inline void migrate_disable(void)
+{
+ struct task_struct *p = current;
+
+ if (p->migration_disabled) {
+#ifdef CONFIG_DEBUG_PREEMPT
+ /*
+	 * Warn about overflow half-way through the range.
+ */
+ WARN_ON_ONCE((s16)p->migration_disabled < 0);
+#endif
+ p->migration_disabled++;
+ return;
+ }
+
+ guard(preempt)();
+ (*(unsigned int *)((void *)this_rq_raw() + RQ_nr_pinned))++;
+ p->migration_disabled = 1;
+}
+#else
+static inline void migrate_disable(void) { }
+static inline void migrate_enable(void) { }
+#endif
+
+DEFINE_LOCK_GUARD_0(migrate, migrate_disable(), migrate_enable())
+
#endif
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index c4f69a9e9af6..88bf2ef3e60c 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -23855,8 +23855,7 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
BTF_SET_START(btf_id_deny)
BTF_ID_UNUSED
#ifdef CONFIG_SMP
-BTF_ID(func, migrate_disable)
-BTF_ID(func, migrate_enable)
+BTF_ID(func, __migrate_enable)
#endif
#if !defined CONFIG_PREEMPT_RCU && !defined CONFIG_TINY_RCU
BTF_ID(func, rcu_read_unlock_strict)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index be00629f0ba4..00383fed9f63 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -119,6 +119,7 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(sched_update_nr_running_tp);
EXPORT_TRACEPOINT_SYMBOL_GPL(sched_compute_energy_tp);
DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
+EXPORT_SYMBOL_GPL(runqueues);
#ifdef CONFIG_SCHED_PROXY_EXEC
DEFINE_STATIC_KEY_TRUE(__sched_proxy_exec);
@@ -2381,28 +2382,7 @@ static void migrate_disable_switch(struct rq *rq, struct task_struct *p)
__do_set_cpus_allowed(p, &ac);
}
-void migrate_disable(void)
-{
- struct task_struct *p = current;
-
- if (p->migration_disabled) {
-#ifdef CONFIG_DEBUG_PREEMPT
- /*
- *Warn about overflow half-way through the range.
- */
- WARN_ON_ONCE((s16)p->migration_disabled < 0);
-#endif
- p->migration_disabled++;
- return;
- }
-
- guard(preempt)();
- this_rq()->nr_pinned++;
- p->migration_disabled = 1;
-}
-EXPORT_SYMBOL_GPL(migrate_disable);
-
-void migrate_enable(void)
+void __migrate_enable(void)
{
struct task_struct *p = current;
struct affinity_context ac = {
@@ -2410,37 +2390,9 @@ void migrate_enable(void)
.flags = SCA_MIGRATE_ENABLE,
};
-#ifdef CONFIG_DEBUG_PREEMPT
- /*
- * Check both overflow from migrate_disable() and superfluous
- * migrate_enable().
- */
- if (WARN_ON_ONCE((s16)p->migration_disabled <= 0))
- return;
-#endif
-
- if (p->migration_disabled > 1) {
- p->migration_disabled--;
- return;
- }
-
- /*
- * Ensure stop_task runs either before or after this, and that
- * __set_cpus_allowed_ptr(SCA_MIGRATE_ENABLE) doesn't schedule().
- */
- guard(preempt)();
- if (p->cpus_ptr != &p->cpus_mask)
- __set_cpus_allowed_ptr(p, &ac);
- /*
- * Mustn't clear migration_disabled() until cpus_ptr points back at the
- * regular cpus_mask, otherwise things that race (eg.
- * select_fallback_rq) get confused.
- */
- barrier();
- p->migration_disabled = 0;
- this_rq()->nr_pinned--;
+ __set_cpus_allowed_ptr(p, &ac);
}
-EXPORT_SYMBOL_GPL(migrate_enable);
+EXPORT_SYMBOL_GPL(__migrate_enable);
static inline bool rq_has_pinned_tasks(struct rq *rq)
{
diff --git a/kernel/sched/rq-offsets.c b/kernel/sched/rq-offsets.c
new file mode 100644
index 000000000000..a23747bbe25b
--- /dev/null
+++ b/kernel/sched/rq-offsets.c
@@ -0,0 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
+#define COMPILE_OFFSETS
+#include <linux/kbuild.h>
+#include <linux/types.h>
+#include "sched.h"
+
+int main(void)
+{
+ DEFINE(RQ_nr_pinned, offsetof(struct rq, nr_pinned));
+
+ return 0;
+}
--
2.50.1
* [PATCH v2 3/3] sched: fix some typos in include/linux/preempt.h
From: Menglong Dong @ 2025-08-19 1:58 UTC
To: peterz
Cc: mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, ast, daniel, john.fastabend, andrii,
martin.lau, eddyz87, song, yonghong.song, kpsingh, sdf, haoluo,
jolsa, simona.vetter, tzimmermann, jani.nikula, linux-kernel, bpf
There are some typos in the migration-related comments in
include/linux/preempt.h:
elegible -> eligible
it's -> its
migirate_disable -> migrate_disable
abritrary -> arbitrary
Just fix them.
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
include/linux/preempt.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/linux/preempt.h b/include/linux/preempt.h
index 92237c319035..102202185d7a 100644
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -372,7 +372,7 @@ static inline void preempt_notifier_init(struct preempt_notifier *notifier,
/*
* Migrate-Disable and why it is undesired.
*
- * When a preempted task becomes elegible to run under the ideal model (IOW it
+ * When a preempted task becomes eligible to run under the ideal model (IOW it
* becomes one of the M highest priority tasks), it might still have to wait
* for the preemptee's migrate_disable() section to complete. Thereby suffering
* a reduction in bandwidth in the exact duration of the migrate_disable()
@@ -387,7 +387,7 @@ static inline void preempt_notifier_init(struct preempt_notifier *notifier,
* - a lower priority tasks; which under preempt_disable() could've instantly
* migrated away when another CPU becomes available, is now constrained
* by the ability to push the higher priority task away, which might itself be
- * in a migrate_disable() section, reducing it's available bandwidth.
+ * in a migrate_disable() section, reducing its available bandwidth.
*
* IOW it trades latency / moves the interference term, but it stays in the
* system, and as long as it remains unbounded, the system is not fully
@@ -399,7 +399,7 @@ static inline void preempt_notifier_init(struct preempt_notifier *notifier,
* PREEMPT_RT breaks a number of assumptions traditionally held. By forcing a
* number of primitives into becoming preemptible, they would also allow
* migration. This turns out to break a bunch of per-cpu usage. To this end,
- * all these primitives employ migirate_disable() to restore this implicit
+ * all these primitives employ migrate_disable() to restore this implicit
* assumption.
*
* This is a 'temporary' work-around at best. The correct solution is getting
@@ -407,7 +407,7 @@ static inline void preempt_notifier_init(struct preempt_notifier *notifier,
* per-cpu locking or short preempt-disable regions.
*
* The end goal must be to get rid of migrate_disable(), alternatively we need
- * a schedulability theory that does not depend on abritrary migration.
+ * a schedulability theory that does not depend on arbitrary migration.
*
*
* Notes on the implementation.
--
2.50.1
* Re: [PATCH v2 2/3] sched: make migrate_enable/migrate_disable inline
From: Peter Zijlstra @ 2025-08-19 12:32 UTC
To: Menglong Dong
Cc: mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, ast, daniel, john.fastabend, andrii,
martin.lau, eddyz87, song, yonghong.song, kpsingh, sdf, haoluo,
jolsa, simona.vetter, tzimmermann, jani.nikula, linux-kernel, bpf
On Tue, Aug 19, 2025 at 09:58:31AM +0800, Menglong Dong wrote:
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index be00629f0ba4..00383fed9f63 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -119,6 +119,7 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(sched_update_nr_running_tp);
> EXPORT_TRACEPOINT_SYMBOL_GPL(sched_compute_energy_tp);
>
> DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
> +EXPORT_SYMBOL_GPL(runqueues);
Oh no, absolutely not.
You never, ever, export a variable, and certainly not this one.
How about something like so?
I tried 'clever' things with export inline, but the compiler hates me,
so the below is the best I could make work.
---
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2315,6 +2315,7 @@ static __always_inline void alloc_tag_re
#define alloc_tag_restore(_tag, _old) do {} while (0)
#endif
+#ifndef MODULE
#ifndef COMPILE_OFFSETS
extern void __migrate_enable(void);
@@ -2328,7 +2329,7 @@ DECLARE_PER_CPU_SHARED_ALIGNED(struct rq
#define this_rq_raw() PERCPU_PTR(&runqueues)
#endif
-static inline void migrate_enable(void)
+static inline void _migrate_enable(void)
{
struct task_struct *p = current;
@@ -2363,7 +2364,7 @@ static inline void migrate_enable(void)
(*(unsigned int *)((void *)this_rq_raw() + RQ_nr_pinned))--;
}
-static inline void migrate_disable(void)
+static inline void _migrate_disable(void)
{
struct task_struct *p = current;
@@ -2382,10 +2383,30 @@ static inline void migrate_disable(void)
(*(unsigned int *)((void *)this_rq_raw() + RQ_nr_pinned))++;
p->migration_disabled = 1;
}
-#else
-static inline void migrate_disable(void) { }
-static inline void migrate_enable(void) { }
-#endif
+#else /* !COMPILE_OFFSETS */
+static inline void _migrate_disable(void) { }
+static inline void _migrate_enable(void) { }
+#endif /* !COMPILE_OFFSETS */
+
+#ifndef CREATE_MIGRATE_DISABLE
+static inline void migrate_disable(void)
+{
+ _migrate_disable();
+}
+
+static inline void migrate_enable(void)
+{
+ _migrate_enable();
+}
+#else /* CREATE_MIGRATE_DISABLE */
+extern void migrate_disable(void);
+extern void migrate_enable(void);
+#endif /* CREATE_MIGRATE_DISABLE */
+
+#else /* !MODULE */
+extern void migrate_disable(void);
+extern void migrate_enable(void);
+#endif /* !MODULE */
DEFINE_LOCK_GUARD_0(migrate, migrate_disable(), migrate_enable())
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7,6 +7,9 @@
* Copyright (C) 1991-2002 Linus Torvalds
* Copyright (C) 1998-2024 Ingo Molnar, Red Hat
*/
+#define CREATE_MIGRATE_DISABLE
+#include <linux/sched.h>
+
#include <linux/highmem.h>
#include <linux/hrtimer_api.h>
#include <linux/ktime_api.h>
@@ -119,7 +122,6 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(sched_updat
EXPORT_TRACEPOINT_SYMBOL_GPL(sched_compute_energy_tp);
DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
-EXPORT_SYMBOL_GPL(runqueues);
#ifdef CONFIG_SCHED_PROXY_EXEC
DEFINE_STATIC_KEY_TRUE(__sched_proxy_exec);
@@ -2382,6 +2384,11 @@ static void migrate_disable_switch(struc
__do_set_cpus_allowed(p, &ac);
}
+void migrate_disable(void)
+{
+ _migrate_disable();
+}
+
void __migrate_enable(void)
{
struct task_struct *p = current;
@@ -2392,7 +2399,11 @@ void __migrate_enable(void)
__set_cpus_allowed_ptr(p, &ac);
}
-EXPORT_SYMBOL_GPL(__migrate_enable);
+
+void migrate_enable(void)
+{
+ _migrate_enable();
+}
static inline bool rq_has_pinned_tasks(struct rq *rq)
{
* Re: [PATCH v2 2/3] sched: make migrate_enable/migrate_disable inline
From: Peter Zijlstra @ 2025-08-19 12:40 UTC
To: Menglong Dong
Cc: mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, ast, daniel, john.fastabend, andrii,
martin.lau, eddyz87, song, yonghong.song, kpsingh, sdf, haoluo,
jolsa, simona.vetter, tzimmermann, jani.nikula, linux-kernel, bpf
On Tue, Aug 19, 2025 at 09:58:31AM +0800, Menglong Dong wrote:
> The "struct rq" is not available in include/linux/sched.h, so we can't
> access the "runqueues" with this_cpu_ptr(), as the compilation will fail
> in this_cpu_ptr() -> raw_cpu_ptr() -> __verify_pcpu_ptr():
> typeof((ptr) + 0)
>
> So we introduce the this_rq_raw() and access the runqueues with
> arch_raw_cpu_ptr() directly.
^ That, wants to be a comment near here:
> @@ -2312,4 +2315,78 @@ static __always_inline void alloc_tag_restore(struct alloc_tag *tag, struct allo
> #define alloc_tag_restore(_tag, _old) do {} while (0)
> #endif
>
> +#ifndef COMPILE_OFFSETS
> +
> +extern void __migrate_enable(void);
> +
> +struct rq;
> +DECLARE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
> +
> +#ifdef CONFIG_SMP
> +#define this_rq_raw() arch_raw_cpu_ptr(&runqueues)
> +#else
> +#define this_rq_raw() PERCPU_PTR(&runqueues)
> +#endif
Because that arch_ thing really is weird.
> + (*(unsigned int *)((void *)this_rq_raw() + RQ_nr_pinned))--;
> + (*(unsigned int *)((void *)this_rq_raw() + RQ_nr_pinned))++;
And since you did a macro anyway, why not fold that magic in there,
instead of duplicating it?
#define __this_rq_raw() ((void *)arch_raw_cpu_ptr(&runqueues))
#define this_rq_pinned() (*(unsigned int *)(__this_rq_raw() + RQ_nr_pinned))
this_rq_pinned()--;
this_rq_pinned()++;
is nicer, no?
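For illustration, a consolidated sketch of that suggestion, folding in
the !CONFIG_SMP case from the v2 patch (hypothetical, not a merged
version):

  #ifdef CONFIG_SMP
  #define __this_rq_raw()	((void *)arch_raw_cpu_ptr(&runqueues))
  #else
  #define __this_rq_raw()	((void *)PERCPU_PTR(&runqueues))
  #endif
  #define this_rq_pinned()	(*(unsigned int *)(__this_rq_raw() + RQ_nr_pinned))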
* Re: [PATCH v2 2/3] sched: make migrate_enable/migrate_disable inline
From: Peter Zijlstra @ 2025-08-19 12:45 UTC
To: Menglong Dong
Cc: mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, ast, daniel, john.fastabend, andrii,
martin.lau, eddyz87, song, yonghong.song, kpsingh, sdf, haoluo,
jolsa, simona.vetter, tzimmermann, jani.nikula, linux-kernel, bpf
On Tue, Aug 19, 2025 at 02:32:14PM +0200, Peter Zijlstra wrote:
> On Tue, Aug 19, 2025 at 09:58:31AM +0800, Menglong Dong wrote:
>
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index be00629f0ba4..00383fed9f63 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -119,6 +119,7 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(sched_update_nr_running_tp);
> > EXPORT_TRACEPOINT_SYMBOL_GPL(sched_compute_energy_tp);
> >
> > DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
> > +EXPORT_SYMBOL_GPL(runqueues);
>
> Oh no, absolutely not.
>
> You never, ever, export a variable, and certainly not this one.
>
> How about something like so?
>
> I tried 'clever' things with export inline, but the compiler hates me,
> so the below is the best I could make work.
extern inline, that is, obviously...
* Re: [PATCH v2 2/3] sched: make migrate_enable/migrate_disable inline
From: Jani Nikula @ 2025-08-19 12:49 UTC
To: Peter Zijlstra, Menglong Dong
Cc: mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, ast, daniel, john.fastabend, andrii,
martin.lau, eddyz87, song, yonghong.song, kpsingh, sdf, haoluo,
jolsa, simona.vetter, tzimmermann, linux-kernel, bpf
On Tue, 19 Aug 2025, Peter Zijlstra <peterz@infradead.org> wrote:
>> +EXPORT_SYMBOL_GPL(runqueues);
>
> Oh no, absolutely not.
>
> You never, ever, export a variable, and certainly not this one.
Tangential thought:
I think it would be possible to warn about non-function exports at build
time, and maybe plug it in W=1 builds.
BR,
Jani.
--
Jani Nikula, Intel
* Re: [PATCH v2 2/3] sched: make migrate_enable/migrate_disable inline
From: Peter Zijlstra @ 2025-08-19 13:08 UTC
To: Jani Nikula
Cc: Menglong Dong, mingo, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, vschneid, ast,
daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
yonghong.song, kpsingh, sdf, haoluo, jolsa, simona.vetter,
tzimmermann, linux-kernel, bpf
On Tue, Aug 19, 2025 at 03:49:54PM +0300, Jani Nikula wrote:
> On Tue, 19 Aug 2025, Peter Zijlstra <peterz@infradead.org> wrote:
> >> +EXPORT_SYMBOL_GPL(runqueues);
> >
> > Oh no, absolutely not.
> >
> > You never, ever, export a variable, and certainly not this one.
>
> Tangential thought:
>
> I think it would be possible to warn about non-function exports at build
> time, and maybe plug it in W=1 builds.
>
Too much noise, there's a metric ton of variables exported. Sometimes
it's unavoidable.
I just try and avoid it wherever possible.
* Re: [PATCH v2 2/3] sched: make migrate_enable/migrate_disable inline
From: Menglong Dong @ 2025-08-20 2:32 UTC
To: Peter Zijlstra
Cc: mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, ast, daniel, john.fastabend, andrii,
martin.lau, eddyz87, song, yonghong.song, kpsingh, sdf, haoluo,
jolsa, simona.vetter, tzimmermann, jani.nikula, linux-kernel, bpf
On Tue, Aug 19, 2025 at 8:32 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Aug 19, 2025 at 09:58:31AM +0800, Menglong Dong wrote:
>
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index be00629f0ba4..00383fed9f63 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -119,6 +119,7 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(sched_update_nr_running_tp);
> > EXPORT_TRACEPOINT_SYMBOL_GPL(sched_compute_energy_tp);
> >
> > DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
> > +EXPORT_SYMBOL_GPL(runqueues);
>
> Oh no, absolutely not.
>
> You never, ever, export a variable, and certainly not this one.
>
> How about something like so?
>
> I tried 'clever' things with export inline, but the compiler hates me,
> so the below is the best I could make work.
I see. You mean that we don't export the variable, and use the inlined
version in vmlinux and the extern version in modules, which I think is
nice ;)
(I wasn't aware that we shouldn't export variables :/)
I'll try your advice.
Thanks!
Menglong Dong
>
> ---
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2315,6 +2315,7 @@ static __always_inline void alloc_tag_re
> #define alloc_tag_restore(_tag, _old) do {} while (0)
> #endif
>
> +#ifndef MODULE
> #ifndef COMPILE_OFFSETS
>
> extern void __migrate_enable(void);
> @@ -2328,7 +2329,7 @@ DECLARE_PER_CPU_SHARED_ALIGNED(struct rq
> #define this_rq_raw() PERCPU_PTR(&runqueues)
> #endif
>
> -static inline void migrate_enable(void)
> +static inline void _migrate_enable(void)
> {
> struct task_struct *p = current;
>
> @@ -2363,7 +2364,7 @@ static inline void migrate_enable(void)
> (*(unsigned int *)((void *)this_rq_raw() + RQ_nr_pinned))--;
> }
>
> -static inline void migrate_disable(void)
> +static inline void _migrate_disable(void)
> {
> struct task_struct *p = current;
>
> @@ -2382,10 +2383,30 @@ static inline void migrate_disable(void)
> (*(unsigned int *)((void *)this_rq_raw() + RQ_nr_pinned))++;
> p->migration_disabled = 1;
> }
> -#else
> -static inline void migrate_disable(void) { }
> -static inline void migrate_enable(void) { }
> -#endif
> +#else /* !COMPILE_OFFSETS */
> +static inline void _migrate_disable(void) { }
> +static inline void _migrate_enable(void) { }
> +#endif /* !COMPILE_OFFSETS */
> +
> +#ifndef CREATE_MIGRATE_DISABLE
> +static inline void migrate_disable(void)
> +{
> + _migrate_disable();
> +}
> +
> +static inline void migrate_enable(void)
> +{
> + _migrate_enable();
> +}
> +#else /* CREATE_MIGRATE_DISABLE */
> +extern void migrate_disable(void);
> +extern void migrate_enable(void);
> +#endif /* CREATE_MIGRATE_DISABLE */
> +
> +#else /* !MODULE */
> +extern void migrate_disable(void);
> +extern void migrate_enable(void);
> +#endif /* !MODULE */
>
> DEFINE_LOCK_GUARD_0(migrate, migrate_disable(), migrate_enable())
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7,6 +7,9 @@
> * Copyright (C) 1991-2002 Linus Torvalds
> * Copyright (C) 1998-2024 Ingo Molnar, Red Hat
> */
> +#define CREATE_MIGRATE_DISABLE
> +#include <linux/sched.h>
> +
> #include <linux/highmem.h>
> #include <linux/hrtimer_api.h>
> #include <linux/ktime_api.h>
> @@ -119,7 +122,6 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(sched_updat
> EXPORT_TRACEPOINT_SYMBOL_GPL(sched_compute_energy_tp);
>
> DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
> -EXPORT_SYMBOL_GPL(runqueues);
>
> #ifdef CONFIG_SCHED_PROXY_EXEC
> DEFINE_STATIC_KEY_TRUE(__sched_proxy_exec);
> @@ -2382,6 +2384,11 @@ static void migrate_disable_switch(struc
> __do_set_cpus_allowed(p, &ac);
> }
>
> +void migrate_disable(void)
> +{
> + _migrate_disable();
> +}
> +
> void __migrate_enable(void)
> {
> struct task_struct *p = current;
> @@ -2392,7 +2399,11 @@ void __migrate_enable(void)
>
> __set_cpus_allowed_ptr(p, &ac);
> }
> -EXPORT_SYMBOL_GPL(__migrate_enable);
> +
> +void migrate_enable(void)
> +{
> + _migrate_enable();
> +}
>
> static inline bool rq_has_pinned_tasks(struct rq *rq)
> {
* Re: [PATCH v2 2/3] sched: make migrate_enable/migrate_disable inline
From: Menglong Dong @ 2025-08-20 2:34 UTC
To: Peter Zijlstra
Cc: mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, ast, daniel, john.fastabend, andrii,
martin.lau, eddyz87, song, yonghong.song, kpsingh, sdf, haoluo,
jolsa, simona.vetter, tzimmermann, jani.nikula, linux-kernel, bpf
On Tue, Aug 19, 2025 at 8:40 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Aug 19, 2025 at 09:58:31AM +0800, Menglong Dong wrote:
>
> > The "struct rq" is not available in include/linux/sched.h, so we can't
> > access the "runqueues" with this_cpu_ptr(), as the compilation will fail
> > in this_cpu_ptr() -> raw_cpu_ptr() -> __verify_pcpu_ptr():
> > typeof((ptr) + 0)
> >
> > So we introduce the this_rq_raw() and access the runqueues with
> > arch_raw_cpu_ptr() directly.
>
> ^ That, wants to be a comment near here:
>
> > @@ -2312,4 +2315,78 @@ static __always_inline void alloc_tag_restore(struct alloc_tag *tag, struct allo
> > #define alloc_tag_restore(_tag, _old) do {} while (0)
> > #endif
> >
> > +#ifndef COMPILE_OFFSETS
> > +
> > +extern void __migrate_enable(void);
> > +
> > +struct rq;
> > +DECLARE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
> > +
> > +#ifdef CONFIG_SMP
> > +#define this_rq_raw() arch_raw_cpu_ptr(&runqueues)
> > +#else
> > +#define this_rq_raw() PERCPU_PTR(&runqueues)
> > +#endif
>
> Because that arch_ thing really is weird.
OK! I'll add a comment for this part.
>
> > + (*(unsigned int *)((void *)this_rq_raw() + RQ_nr_pinned))--;
> > + (*(unsigned int *)((void *)this_rq_raw() + RQ_nr_pinned))++;
>
> And since you did a macro anyway, why not fold that magic in there,
> instead of duplicating it?
>
> #define __this_rq_raw() ((void *)arch_raw_cpu_ptr(&runqueues))
> #define this_rq_pinned() (*(unsigned int *)(__this_rq_raw() + RQ_nr_pinned))
>
> this_rq_pinned()--;
> this_rq_pinned()++;
>
> is nicer, no?
Yeah, much better!