linux-arch.vger.kernel.org archive mirror
* [PATCH v2 0/8] dce, riscv: Unused syscall trimming with PUSHSECTION and conditional KEEP()
@ 2025-10-15  6:16 ` Yuan Tan
  2025-10-15  6:38   ` [PATCH v2/resend " Yuan Tan
  2025-10-15  7:47   ` [PATCH v2 " Arnd Bergmann
  0 siblings, 2 replies; 15+ messages in thread
From: Yuan Tan @ 2025-10-15  6:16 UTC (permalink / raw)
  To: arnd, masahiroy, nathan, palmer, linux-kbuild, linux-riscv
  Cc: linux-arch, linux-kernel, i, tanyuan, falcon, ronbogo,
	z1652074432, lx24

Hi all,

This series aims to introduce syscall trimming support based on dead code
and data elimination (DCE). This can reduce the final image size, which is
particularly useful for embedded devices, while also reducing the attack
surface. It might further benefit specialized scenarios such as unikernels
or LTO builds, and could potentially help shrink the instruction cache
footprint.

Besides that, this series also introduces a new PUSHSECTION macro. This
wrapper allows sections created by .pushsection to have a proper reference
relationship with their callers, so that --gc-sections can safely work
without requiring unconditional KEEP() entries in linker scripts.

Since the new syscalltbl.sh infrastructure has been merged, I think it’s a
good time to push this patchset forward.

Patches 1–3 introduce the infrastructure for TRIM_UNUSED_SYSCALLS, mainly
allowing syscalltbl.sh to decide which syscalls to keep according to
USED_SYSCALLS.
Patch 4 enables TRIM_UNUSED_SYSCALLS for the RISC-V architecture. With
syscalltbl.sh now available, this feature should be applicable to all
architectures that support LD_DEAD_CODE_DATA_ELIMINATION and use
syscalltbl.sh, but let’s focus on RISC-V first.
Patches 5–8 address the dependency inversion problem caused by sections
created with .pushsection that are forcibly retained by KEEP() in linker
scripts.

Here is an example to illustrate the problem:

void fun2(void);

void fun1(void) {
	asm volatile (
		".pushsection .text.pushed,\"ax\"\n\t"
		"call fun2\n\t"
		".popsection\n\t"
	);
}

If fun1() is used, .text.fun1 is kept alive, but nothing in .text.fun1
references .text.pushed, so --gc-sections may incorrectly discard
.text.pushed. To avoid this, the kernel traditionally wraps such sections
with KEEP() in the linker script. However, KEEP() introduces a dependency
inversion: if fun1() and fun2() are unused, .text.fun1, .text.fun2 and
.text.pushed should all be removed, but KEEP() forces .text.pushed to stay,
and its call to fun2() in turn keeps .text.fun2 alive. As a result, sections
that should be eliminated are retained unnecessarily.

In Linux, sections such as ex_table, jump_table, bug_table, and alternative
are created by .pushsection and suffer from this issue. They prevent some
syscalls from being trimmed.

Ideally, .text.fun1 and .text.pushed should share the same fate: if fun1()
is not referenced, .text.pushed should be discarded as well. To achieve
this, we can use the .reloc directive to emit a relocation from the caller
to a label in the section created by .pushsection:

.section .text.fun1,"ax"
.reloc ., BFD_RELOC_NONE, pushedlabel
.pushsection .text.pushed,"ax"
pushedlabel:
	call fun2
.popsection

Based on this idea, we introduce the PUSHSECTION macro. This macro emits a
relocation directive and a new label automatically, while remaining fully
compatible with all existing .pushsection parameters. With this macro, all
current uses of .pushsection (and even .section) in the kernel can be
replaced, significantly reducing the number of KEEP() in linker scripts and
enabling --gc-sections to work more effectively.

Without PUSHSECTION, 56 syscalls cannot be trimmed when building defconfig
with TRIM_UNUSED_SYSCALLS enabled. With PUSHSECTION, all syscalls can be
properly trimmed.

We have tested enabling TRIM_UNUSED_SYSCALLS while keeping all syscalls
listed in USED_SYSCALLS and successfully booted Ubuntu on a configuration
based on v6.18-rc1 defconfig. The detailed configuration is provided in
[1]. This confirms that the trimming mechanism functions correctly under a
standard kernel setup.

The vmlinux size with tinyconfig is as follows:

|                                 | syscalls remaining | vmlinux size   | vmlinux after strip |
| ------------------------------- | ------------------ | -------------- | ------------------- |
| enable DCE                      | 188                | 1437008        | 915160              |
| enable DCE and syscall trimming | 3                  | 1263528 (-12%) | 800472 (-13%)       |


Changes in v2:
- Rebased on the unified syscalltbl.sh infrastructure for syscall trimming.
  USED_SYSCALLS now accepts only syscall names to avoid confusion, whereas v1
  also allowed entry point symbols.
- Uses the .reloc directive to establish dependencies.
  Compared with previous proposals using SHF_LINK_ORDER or SHF_GROUP, this
  approach provides a generic, parameter-compatible macro for all
  .pushsection usages without side effects.


Previous versions:
- RFC: https://lore.kernel.org/lkml/cover.1676594211.git.falcon@tinylab.org/
- v1 part 1: https://lore.kernel.org/lkml/cover.1695679700.git.falcon@tinylab.org/
- v1 part 2: https://lore.kernel.org/lkml/cover.1699025537.git.tanyuan@tinylab.org/

Links:
[1] https://pastebin.com/St51bk2K


Yuan Tan (4):
  kconfig: add CONFIG_PUSHSECTION_WITH_RELOC for relocation support
  compiler.h: introduce PUSHSECTION macro to establish proper references
  vmlinux.lds.h: support conditional KEEP() in linker script
  riscv: use PUSHSECTION in ex_table, jump_table, bug_table and
    alternatives

Yuhang Zheng (4):
  init/Kconfig: add CONFIG_TRIM_UNUSED_SYSCALLS and related options
  scripts/syscalltbl.sh: add optional --used-syscalls argument for
    syscall trimming
  scripts/Makefile.asm-headers: pass USED_SYSCALLS to syscalltbl.sh
  riscv: enable HAVE_TRIM_UNUSED_SYSCALLS when toolchain supports DCE

 arch/riscv/Kconfig                          |  1 +
 arch/riscv/include/asm/alternative-macros.h |  8 ++--
 arch/riscv/include/asm/asm-extable.h        | 10 +++--
 arch/riscv/include/asm/bug.h                |  2 +-
 arch/riscv/include/asm/jump_label.h         |  3 +-
 arch/riscv/kernel/vmlinux.lds.S             |  9 +++-
 include/asm-generic/vmlinux.lds.h           | 12 ++++-
 include/linux/compiler.h                    | 43 +++++++++++++++++-
 include/linux/compiler_types.h              |  8 ++--
 init/Kconfig                                | 49 +++++++++++++++++++++
 scripts/Makefile.asm-headers                |  4 ++
 scripts/syscalltbl.sh                       | 19 +++++++-
 12 files changed, 150 insertions(+), 18 deletions(-)


base-commit: 3a8660878839faadb4f1a6dd72c3179c1df56787
prerequisite-patch-id: 7af3175326df94637f04a050dee7356416eb1edd
-- 
2.43.0



* [PATCH v2 1/8] init/Kconfig: add CONFIG_TRIM_UNUSED_SYSCALLS and related options
       [not found] <cover.1760463245.git.tanyuan@tinylab.org>
  2025-10-15  6:16 ` [PATCH v2 0/8] dce, riscv: Unused syscall trimming with PUSHSECTION and conditional KEEP() Yuan Tan
@ 2025-10-15  6:17 ` Yuan Tan
  2025-10-15  6:17 ` [PATCH v2 2/8] scripts/syscalltbl.sh: add optional --used-syscalls argument for syscall trimming Yuan Tan
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Yuan Tan @ 2025-10-15  6:17 UTC (permalink / raw)
  To: arnd, masahiroy, nathan, palmer, linux-kbuild, linux-riscv
  Cc: linux-arch, linux-kernel, i, tanyuan, falcon, ronbogo,
	z1652074432, lx24

From: Yuhang Zheng <z1652074432@gmail.com>

Introduce configuration options to enable dead code/data elimination for
unused system calls.

This change adds the following Kconfig symbols:
 - TRIM_UNUSED_SYSCALLS: user option to enable trimming of unused system
   calls
 - HAVE_TRIM_UNUSED_SYSCALLS: architecture capability symbol
 - USED_SYSCALLS: string list of system calls to keep (separated by spaces)

These options integrate with scripts/Makefile.asm-headers and syscalltbl.sh
to generate syscall tables only for selected entries. The feature depends
on LD_DEAD_CODE_DATA_ELIMINATION and on architectures that use
syscalltbl.sh to generate their syscall tables.

Signed-off-by: Yuhang Zheng <z1652074432@gmail.com>
Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
---
 init/Kconfig | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/init/Kconfig b/init/Kconfig
index cab3ad28ca49..2c6f86c44d96 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1628,6 +1628,9 @@ config SYSFS_SYSCALL
 config HAVE_PCSPKR_PLATFORM
 	bool
 
+config HAVE_TRIM_UNUSED_SYSCALLS
+	bool
+
 menuconfig EXPERT
 	bool "Configure standard kernel features (expert users)"
 	# Unhide debug options, to make the on-by-default options visible
@@ -1932,6 +1935,36 @@ config CACHESTAT_SYSCALL
 
 	  If unsure say Y here.
 
+config TRIM_UNUSED_SYSCALLS
+	bool "Trim unused syscalls" if EXPERT
+	default n
+	depends on HAVE_TRIM_UNUSED_SYSCALLS
+	depends on LD_DEAD_CODE_DATA_ELIMINATION
+	help
+	  Enable this option to trim unused system calls from the final kernel
+	  image. Only the syscalls explicitly listed in CONFIG_USED_SYSCALLS
+	  will be kept.
+
+	  Note that some unused syscalls may still be retained if their sections
+	  are forcibly kept by other sections created with .pushsection and
+	  preserved via KEEP() in the linker script.
+
+	  If unsure, say N.
+
+config USED_SYSCALLS
+	string "Configure used syscalls" if EXPERT
+	depends on TRIM_UNUSED_SYSCALLS
+	default ""
+	help
+	  Specify a list of system calls that should be kept when
+	  TRIM_UNUSED_SYSCALLS is enabled.
+
+	  The system calls should be listed one by one, separated by spaces.
+	  For example, set CONFIG_USED_SYSCALLS="write exit reboot". If left
+	  empty, all syscalls will be trimmed.
+
+	  If unsure, please disable TRIM_UNUSED_SYSCALLS.
+
 config KALLSYMS
 	bool "Load all symbols for debugging/ksymoops" if EXPERT
 	default y
-- 
2.43.0



* [PATCH v2 2/8] scripts/syscalltbl.sh: add optional --used-syscalls argument for syscall trimming
       [not found] <cover.1760463245.git.tanyuan@tinylab.org>
  2025-10-15  6:16 ` [PATCH v2 0/8] dce, riscv: Unused syscall trimming with PUSHSECTION and conditional KEEP() Yuan Tan
  2025-10-15  6:17 ` [PATCH v2 1/8] init/Kconfig: add CONFIG_TRIM_UNUSED_SYSCALLS and related options Yuan Tan
@ 2025-10-15  6:17 ` Yuan Tan
  2025-10-15  6:18 ` [PATCH v2 3/8] scripts/Makefile.asm-headers: pass USED_SYSCALLS to syscalltbl.sh Yuan Tan
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Yuan Tan @ 2025-10-15  6:17 UTC (permalink / raw)
  To: arnd, masahiroy, nathan, palmer, linux-kbuild, linux-riscv
  Cc: linux-arch, linux-kernel, i, tanyuan, falcon, ronbogo,
	z1652074432, lx24

From: Yuhang Zheng <z1652074432@gmail.com>

Add support for an optional `--used-syscalls` argument to
scripts/syscalltbl.sh. When provided, the argument takes a comma-separated
list of syscall names that should remain enabled in the generated syscall
table. Any syscall not present in the list will be replaced with a
`__SYSCALL(nr, sys_ni_syscall)` entry.

This enables selective system call table generation when
CONFIG_TRIM_UNUSED_SYSCALLS is set.
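
As a rough illustration of the behaviour described above, here is a minimal
standalone sketch. The filter_syscalls helper and the three-line input table
are hypothetical and are not part of scripts/syscalltbl.sh. The sketch also
matches names exactly with a case pattern, whereas the patch itself builds a
regular expression and uses grep -qwE.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only (not in scripts/syscalltbl.sh).
# Reads "nr abi name entry" lines on stdin and prints one __SYSCALL() line
# per input line. Any syscall whose name is not in the comma-separated list
# passed as $1 is replaced by a sys_ni_syscall stub.
filter_syscalls() {
	used=$1
	while read -r nr abi name entry; do
		case ",$used," in
		*",$name,"*)
			echo "__SYSCALL($nr, $entry)" ;;
		*)
			echo "__SYSCALL($nr, sys_ni_syscall)" ;;
		esac
	done
}

# Illustrative input in the column order the script reads.
printf '%s\n' \
	'63 common read sys_read' \
	'64 common write sys_write' \
	'93 common exit sys_exit' |
	filter_syscalls 'write,exit'
```

With the list "write,exit", only the read entry is stubbed out. With an empty
list, every entry becomes a sys_ni_syscall stub, which matches the Kconfig
help text for an empty USED_SYSCALLS.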

Signed-off-by: Yuhang Zheng <z1652074432@gmail.com>
Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
---
 scripts/syscalltbl.sh | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/scripts/syscalltbl.sh b/scripts/syscalltbl.sh
index 6a903b87a7c2..27d8dfce5748 100755
--- a/scripts/syscalltbl.sh
+++ b/scripts/syscalltbl.sh
@@ -22,12 +22,14 @@ usage() {
 	echo >&2 "  OUTFILE   output header file"
 	echo >&2
 	echo >&2 "options:"
-	echo >&2 "  --abis ABIS        ABI(s) to handle (By default, all lines are handled)"
+	echo >&2 "  --abis ABIS                ABI(s) to handle (By default, all lines are handled)"
+	echo >&2 "  --used-syscalls SYSCALLS   Keep only the specified syscall; trim others"
 	exit 1
 }
 
 # default unless specified by options
 abis=
+used_syscalls=
 
 while [ $# -gt 0 ]
 do
@@ -35,6 +37,14 @@ do
 	--abis)
 		abis=$(echo "($2)" | tr ',' '|')
 		shift 2;;
+    --used-syscalls=*)
+        used_syscalls_raw=${1#--used-syscalls=}
+        if [ -z "$used_syscalls_raw" ]; then
+            used_syscalls='^$'
+        else
+            used_syscalls=$(echo "$used_syscalls_raw" | tr ',' '|')
+        fi
+        shift;;
 	-*)
 		echo "$1: unknown option" >&2
 		usage;;
@@ -52,6 +62,7 @@ outfile="$2"
 
 nxt=0
 
+
 grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | {
 
 	while read nr abi name native compat noreturn; do
@@ -66,6 +77,12 @@ grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | {
 			nxt=$((nxt + 1))
 		done
 
+		if [ -n "$used_syscalls" ] && ! echo "$name" | grep -qwE "$used_syscalls"; then
+			echo "__SYSCALL($nr, sys_ni_syscall)"
+			nxt=$((nr + 1))
+			continue
+		fi
+
 		if [ "$compat" = "-" ]; then
 			unset compat
 		fi
-- 
2.43.0



* [PATCH v2 3/8] scripts/Makefile.asm-headers: pass USED_SYSCALLS to syscalltbl.sh
       [not found] <cover.1760463245.git.tanyuan@tinylab.org>
                   ` (2 preceding siblings ...)
  2025-10-15  6:17 ` [PATCH v2 2/8] scripts/syscalltbl.sh: add optional --used-syscalls argument for syscall trimming Yuan Tan
@ 2025-10-15  6:18 ` Yuan Tan
  2025-10-15  6:18 ` [PATCH v2 4/8] riscv: enable HAVE_TRIM_UNUSED_SYSCALLS when toolchain supports DCE Yuan Tan
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Yuan Tan @ 2025-10-15  6:18 UTC (permalink / raw)
  To: arnd, masahiroy, nathan, palmer, linux-kbuild, linux-riscv
  Cc: linux-arch, linux-kernel, i, tanyuan, falcon, ronbogo,
	z1652074432, lx24

From: Yuhang Zheng <z1652074432@gmail.com>

Include auto.conf in asm-headers and pass CONFIG_USED_SYSCALLS to
syscalltbl.sh when CONFIG_TRIM_UNUSED_SYSCALLS is enabled.
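
The conversion from the space-separated Kconfig string to the comma-separated
script argument can be sketched in plain shell. The to_used_syscalls_arg
helper below is hypothetical; it only approximates what the Makefile does
with $(strip) and $(subst), and assumes the names contain no glob characters.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: turns a space-separated list
# into the "--used-syscalls=a,b,c" form passed to syscalltbl.sh.
to_used_syscalls_arg() {
	# Unquoted expansion splits on whitespace, which drops leading,
	# trailing and repeated blanks, roughly like Make's $(strip).
	set -- $1
	# Joining "$*" with IFS=, plays the role of $(subst $(space),$(comma),...).
	( IFS=,; echo "--used-syscalls=$*" )
}

to_used_syscalls_arg '  write   exit reboot '
```

An empty CONFIG_USED_SYSCALLS yields a bare "--used-syscalls=", which is the
case syscalltbl.sh maps to the '^$' pattern so that no syscall name matches.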

Signed-off-by: Yuhang Zheng <z1652074432@gmail.com>
Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
---
 scripts/Makefile.asm-headers | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/scripts/Makefile.asm-headers b/scripts/Makefile.asm-headers
index 8a4856e74180..0ae82c6a2a15 100644
--- a/scripts/Makefile.asm-headers
+++ b/scripts/Makefile.asm-headers
@@ -13,6 +13,8 @@
 PHONY := all
 all:
 
+include $(objtree)/include/config/auto.conf
+
 src := $(srctree)/$(subst /generated,,$(obj))
 
 syscall_abis_32  += common,32
@@ -68,6 +70,8 @@ quiet_cmd_systbl = SYSTBL  $@
       cmd_systbl = $(CONFIG_SHELL) $(systbl) \
 		   $(if $(systbl-args-$*),$(systbl-args-$*),$(systbl-args)) \
 		   --abis $(subst $(space),$(comma),$(strip $(syscall_abis_$*))) \
+		   $(if $(CONFIG_TRIM_UNUSED_SYSCALLS), \
+		   --used-syscalls=$(subst $(space),$(comma),$(strip $(CONFIG_USED_SYSCALLS)))) \
 		   $< $@
 
 all: $(generic-y) $(syscall-y)
-- 
2.43.0



* [PATCH v2 4/8] riscv: enable HAVE_TRIM_UNUSED_SYSCALLS when toolchain supports DCE
       [not found] <cover.1760463245.git.tanyuan@tinylab.org>
                   ` (3 preceding siblings ...)
  2025-10-15  6:18 ` [PATCH v2 3/8] scripts/Makefile.asm-headers: pass USED_SYSCALLS to syscalltbl.sh Yuan Tan
@ 2025-10-15  6:18 ` Yuan Tan
  2025-10-15  6:18 ` [PATCH v2 5/8] kconfig: add CONFIG_PUSHSECTION_WITH_RELOC for relocation support Yuan Tan
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Yuan Tan @ 2025-10-15  6:18 UTC (permalink / raw)
  To: arnd, masahiroy, nathan, palmer, linux-kbuild, linux-riscv
  Cc: linux-arch, linux-kernel, i, tanyuan, falcon, ronbogo,
	z1652074432, lx24

From: Yuhang Zheng <z1652074432@gmail.com>

Enable HAVE_TRIM_UNUSED_SYSCALLS on RISC-V when the toolchain supports dead
code and data elimination.

This allows the unused syscalls and related code to be trimmed at link time
based on the list of actually used syscalls.

While this enables per-syscall elimination, some syscall entries still
cannot be fully discarded due to sections that are force-kept by the
linker, such as ex_table, bug_table, jump_table, and .alternative.

Signed-off-by: Yuhang Zheng <z1652074432@gmail.com>
Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
---
 arch/riscv/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 0c6038dc5dfd..697050ac8c62 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -198,6 +198,7 @@ config RISCV
 	select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
 	select HAVE_STACKPROTECTOR
 	select HAVE_SYSCALL_TRACEPOINTS
+	select HAVE_TRIM_UNUSED_SYSCALLS
 	select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU
 	select IRQ_DOMAIN
 	select IRQ_FORCED_THREADING
-- 
2.43.0



* [PATCH v2 5/8] kconfig: add CONFIG_PUSHSECTION_WITH_RELOC for relocation support
       [not found] <cover.1760463245.git.tanyuan@tinylab.org>
                   ` (4 preceding siblings ...)
  2025-10-15  6:18 ` [PATCH v2 4/8] riscv: enable HAVE_TRIM_UNUSED_SYSCALLS when toolchain supports DCE Yuan Tan
@ 2025-10-15  6:18 ` Yuan Tan
  2025-10-15  6:19 ` [PATCH v2 6/8] compiler.h: introduce PUSHSECTION macro to establish proper references Yuan Tan
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Yuan Tan @ 2025-10-15  6:18 UTC (permalink / raw)
  To: arnd, masahiroy, nathan, palmer, linux-kbuild, linux-riscv
  Cc: linux-arch, linux-kernel, i, tanyuan, falcon, ronbogo,
	z1652074432, lx24

If the assembler supports the '.reloc' directive with 'BFD_RELOC_NONE', we
can establish a reference between a section created by '.pushsection' and
its caller function by emitting a relocation in the caller.

Known toolchain minimums:
- GNU binutils (gas) >= 2.26
- LLVM integrated assembler (IAS) >= 13.0.0

All assemblers meeting the kernel's minimum toolchain requirements already
support it.

Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Peihan Liu <ronbogo@outlook.com>
---
 init/Kconfig | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/init/Kconfig b/init/Kconfig
index 2c6f86c44d96..3d1cf32d5407 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1631,6 +1631,10 @@ config HAVE_PCSPKR_PLATFORM
 config HAVE_TRIM_UNUSED_SYSCALLS
 	bool
 
+config AS_HAS_BFD_RELOC_NONE
+	bool
+	def_bool $(as-instr,.reloc .$(comma) BFD_RELOC_NONE$(comma))
+
 menuconfig EXPERT
 	bool "Configure standard kernel features (expert users)"
 	# Unhide debug options, to make the on-by-default options visible
@@ -1965,6 +1969,18 @@ config USED_SYSCALLS
 
 	  If unsure, please disable TRIM_UNUSED_SYSCALLS.
 
+config PUSHSECTION_WITH_RELOC
+	bool "Trim more syscalls"
+	depends on TRIM_UNUSED_SYSCALLS && AS_HAS_BFD_RELOC_NONE
+	default y
+	help
+	  Enable building relocation-based references between sections created
+	  by '.pushsection' and their caller functions when the assembler
+	  supports the '.reloc' directive.
+
+	  This allows the linker to establish proper dependencies, remove the
+	  need for KEEP().
+
 config KALLSYMS
 	bool "Load all symbols for debugging/ksymoops" if EXPERT
 	default y
-- 
2.43.0



* [PATCH v2 6/8] compiler.h: introduce PUSHSECTION macro to establish proper references
       [not found] <cover.1760463245.git.tanyuan@tinylab.org>
                   ` (5 preceding siblings ...)
  2025-10-15  6:18 ` [PATCH v2 5/8] kconfig: add CONFIG_PUSHSECTION_WITH_RELOC for relocation support Yuan Tan
@ 2025-10-15  6:19 ` Yuan Tan
  2025-10-16  3:14   ` kernel test robot
  2025-10-16  6:03   ` kernel test robot
  2025-10-15  6:22 ` [PATCH v2 7/8] vmlinux.lds.h: support conditional KEEP() in linker script Yuan Tan
  2025-10-15  6:22 ` [PATCH v2 8/8] riscv: use PUSHSECTION in ex_table, jump_table, bug_table and alternatives Yuan Tan
  8 siblings, 2 replies; 15+ messages in thread
From: Yuan Tan @ 2025-10-15  6:19 UTC (permalink / raw)
  To: arnd, masahiroy, nathan, palmer, linux-kbuild, linux-riscv
  Cc: linux-arch, linux-kernel, i, tanyuan, falcon, ronbogo,
	z1652074432, lx24

When a section is created by .pushsection in assembly, there is no
reference between the caller function and the newly created section. As a
result, --gc-sections may incorrectly discard the newly created section.

To prevent such incorrect garbage collection, kernel code often wraps these
sections with KEEP() in linker scripts. While this guarantees that the
sections are retained, it introduces a dependency inversion: unused
sections are kept unnecessarily, and any sections they reference are also
forcibly retained. This prevents the linker from eliminating truly unused
code or data.

Introduce a new PUSHSECTION macro in include/linux/compiler.h to create a
proper reference between the .pushsection caller and the generated section.
The macro is fully compatible with all existing .pushsection parameters and
has no side effects, making it safe to replace all current .pushsection
usages with this version.

PUSHSECTION works by emitting a unique label inside the new section, and
adding a relocation from the caller function to that label. This ensures
the linker recognizes the dependency and keeps both sections alive
together. The section therefore no longer needs to be wrapped with KEEP()
in the linker script.

To guarantee uniqueness of the section and label names, both __COUNTER__
and %= are used, because either alone is insufficient:
  - __COUNTER__ alone fails when the function containing PUSHSECTION is
    inlined multiple times, causing duplicate labels.
  - %= alone fails when multiple PUSHSECTION directives appear within a
    single inline assembly block.

In assembly code, a separate definition is provided because the C macro
cannot ensure unique section/label names when expanded inside an assembler
macro (.macro).

Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Xiao Liu <lx24@stu.ynu.edu.cn>
Signed-off-by: Peihan Liu <ronbogo@outlook.com>
---
 include/linux/compiler.h | 43 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 5b45ea7dff3e..bba79cedbe24 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -3,6 +3,7 @@
 #define __LINUX_COMPILER_H
 
 #include <linux/compiler_types.h>
+#include <linux/stringify.h>
 
 #ifndef __ASSEMBLY__
 
@@ -267,7 +268,47 @@ static inline void *offset_to_ptr(const int *off)
 	return (void *)((unsigned long)off + *off);
 }
 
-#endif /* __ASSEMBLY__ */
+#ifdef CONFIG_PUSHSECTION_WITH_RELOC
+#define __PUSHSECTION_RELOC(lbl) ".reloc ., BFD_RELOC_NONE, " lbl "\n\t"
+#define __PUSHSECTION_HELPER(prefix) __stringify(prefix.%=) "_" __stringify(__COUNTER__)
+#define __PUSHSECTION_LABEL(lbl) lbl ":\n\t"
+#else
+#define __PUSHSECTION_RELOC(lbl)
+#define __PUSHSECTION_HELPER(prefix) __stringify(prefix)
+#define __PUSHSECTION_LABEL(lbl)
+#endif
+
+#define _PUSHSECTION(lbl, sec, ...)					\
+	__PUSHSECTION_RELOC(lbl)					\
+	".pushsection " sec ", " #__VA_ARGS__ "\n\t" __PUSHSECTION_LABEL(lbl)
+
+#define PUSHSECTION(sec, ...)						\
+	_PUSHSECTION(__PUSHSECTION_HELPER(.Lsec), __PUSHSECTION_HELPER(sec), __VA_ARGS__)
+
+#else /* __ASSEMBLY__ */
+
+#ifdef CONFIG_PUSHSECTION_WITH_RELOC
+#define __PUSHSECTION_RELOC .reloc ., BFD_RELOC_NONE, \label
+#define __PUSHSECTION_HELPER(prefix) prefix\().\@
+#define __PUSHSECTION_LABEL \label:
+#else
+#define __PUSHSECTION_RELOC
+#define __PUSHSECTION_HELPER(prefix) prefix
+#define __PUSHSECTION_LABEL
+#endif
+
+.macro  _PUSHSECTION label:req, section:req, args:vararg
+	__PUSHSECTION_RELOC
+	.pushsection __PUSHSECTION_HELPER(\section), \args
+	__PUSHSECTION_LABEL
+.endm
+
+.macro  PUSHSECTION section:req, args:vararg
+	_PUSHSECTION .Lsec\@, \section, \args
+.endm
+
+#endif /* !__ASSEMBLY__ */
+
 
 #ifdef CONFIG_64BIT
 #define ARCH_SEL(a,b) a
-- 
2.43.0



* [PATCH v2 7/8] vmlinux.lds.h: support conditional KEEP() in linker script
       [not found] <cover.1760463245.git.tanyuan@tinylab.org>
                   ` (6 preceding siblings ...)
  2025-10-15  6:19 ` [PATCH v2 6/8] compiler.h: introduce PUSHSECTION macro to establish proper references Yuan Tan
@ 2025-10-15  6:22 ` Yuan Tan
  2025-10-15  6:22 ` [PATCH v2 8/8] riscv: use PUSHSECTION in ex_table, jump_table, bug_table and alternatives Yuan Tan
  8 siblings, 0 replies; 15+ messages in thread
From: Yuan Tan @ 2025-10-15  6:22 UTC (permalink / raw)
  To: arnd, masahiroy, nathan, palmer, linux-kbuild, linux-riscv
  Cc: linux-arch, linux-kernel, i, tanyuan, falcon, ronbogo,
	z1652074432, lx24

Introduce a conditional KEEP() helper, COND_KEEP(), that allows a
section to be kept only if a corresponding NOKEEP_<sec> macro is not
defined. This provides a finer-grained mechanism to control which
sections are protected from garbage collection.

Traditionally, many sections — for example, the exception table and jump
table — are created by .pushsection and wrapped with KEEP() in
vmlinux.lds.h to prevent them from being discarded by the linker, even when
they are not actually referenced. This can block dead code and data
elimination (DCE) when the section is known to be safe to drop.

With COND_KEEP(), architectures or subsystems can drop the KEEP() for
sections that are known to be safe to garbage-collect.

The implementation adds:
  - __KEEP_ACT_0() / __KEEP_ACT_1() helpers for macro expansion
  - BSEC_MAIN() to handle possible sub-section patterns, such as
    __ex_table.18
  - COND_KEEP() macro, which wraps KEEP() conditionally based on
    __is_defined(NOKEEP_<sec>)

Example usage:

COND_KEEP(alternative, *(.alternative*))

Additionally, move the ___PASTE()/__PASTE() definitions in
include/linux/compiler_types.h out from under the '#ifndef __ASSEMBLY__'
guard so that they are visible to assembly.

No functional change unless NOKEEP_<sec> is defined.

Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Peihan Liu <ronbogo@outlook.com>
---
 include/asm-generic/vmlinux.lds.h | 12 ++++++++++--
 include/linux/compiler_types.h    |  8 ++++----
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 8a9a2e732a65..8bb411ace863 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -51,6 +51,7 @@
  */
 
 #include <asm-generic/codetag.lds.h>
+#include <linux/compiler_types.h>
 
 #ifndef LOAD_OFFSET
 #define LOAD_OFFSET 0
@@ -113,14 +114,21 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
 #define RODATA_MAIN .rodata .rodata.[0-9a-zA-Z_]* .rodata..L*
 #define BSS_MAIN .bss .bss.[0-9a-zA-Z_]* .bss..L* .bss..compoundliteral*
 #define SBSS_MAIN .sbss .sbss.[0-9a-zA-Z_]*
+#define BSEC_MAIN(sec) sec sec##.[0-9a-zA-Z_]*
 #else
 #define DATA_MAIN .data .data.rel .data.rel.local
 #define SDATA_MAIN .sdata
 #define RODATA_MAIN .rodata
 #define BSS_MAIN .bss
 #define SBSS_MAIN .sbss
+#define BSEC_MAIN(sec) sec
 #endif
 
+#define __KEEP_ACT_0(sec) KEEP(sec)
+#define __KEEP_ACT_1(sec) sec
+
+#define COND_KEEP(sec, list) __PASTE(__KEEP_ACT_, __is_defined(NOKEEP_##sec))(list)
+
 /*
  * GCC 4.5 and later have a 32 bytes section alignment for structures.
  * Except GCC 4.9, that feels the need to align on 64 bytes.
@@ -196,12 +204,12 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
 
 #define BOUNDED_SECTION_PRE_LABEL(_sec_, _label_, _BEGIN_, _END_)	\
 	_BEGIN_##_label_ = .;						\
-	KEEP(*(_sec_))							\
+	COND_KEEP(_sec_, *(BSEC_MAIN(_sec_)))				\
 	_END_##_label_ = .;
 
 #define BOUNDED_SECTION_POST_LABEL(_sec_, _label_, _BEGIN_, _END_)	\
 	_label_##_BEGIN_ = .;						\
-	KEEP(*(_sec_))							\
+	COND_KEEP(_sec_, *(BSEC_MAIN(_sec_)))				\
 	_label_##_END_ = .;
 
 #define BOUNDED_SECTION_BY(_sec_, _label_)				\
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 59288a2c1ad2..680ba4afbe7d 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -79,10 +79,6 @@ static inline void __chk_io_ptr(const volatile void __iomem *ptr) { }
 # define __builtin_warning(x, y...) (1)
 #endif /* __CHECKER__ */
 
-/* Indirect macros required for expanded argument pasting, eg. __LINE__. */
-#define ___PASTE(a,b) a##b
-#define __PASTE(a,b) ___PASTE(a,b)
-
 #ifdef __KERNEL__
 
 /* Attributes */
@@ -425,6 +421,10 @@ struct ftrace_likely_data {
 
 #endif /* __ASSEMBLY__ */
 
+/* Indirect macros required for expanded argument pasting, eg. __LINE__. */
+#define ___PASTE(a, b) a##b
+#define __PASTE(a, b) ___PASTE(a, b)
+
 /*
  * The below symbols may be defined for one or more, but not ALL, of the above
  * compilers. We don't consider that to be an error, so set them to nothing.
-- 
2.43.0



* [PATCH v2 8/8] riscv: use PUSHSECTION in ex_table, jump_table, bug_table and alternatives
       [not found] <cover.1760463245.git.tanyuan@tinylab.org>
                   ` (7 preceding siblings ...)
  2025-10-15  6:22 ` [PATCH v2 7/8] vmlinux.lds.h: support conditional KEEP() in linker script Yuan Tan
@ 2025-10-15  6:22 ` Yuan Tan
  8 siblings, 0 replies; 15+ messages in thread
From: Yuan Tan @ 2025-10-15  6:22 UTC (permalink / raw)
  To: arnd, masahiroy, nathan, palmer, linux-kbuild, linux-riscv
  Cc: linux-arch, linux-kernel, i, tanyuan, falcon, ronbogo,
	z1652074432, lx24

Replace plain .pushsection with the new PUSHSECTION macro for __ex_table,
__bug_table, __jump_table, and .alternative on RISC-V.

PUSHSECTION establishes proper references between the caller and the
generated sections, allowing --gc-sections to recognize their dependencies
correctly. This avoids the need for KEEP() and prevents dependency
inversion where unused sections keep others alive.

With this change, CONFIG_TRIM_UNUSED_SYSCALLS can correctly discard unused
syscalls together with their exception tables.

This update takes effect only when built with an assembler that supports
BFD_RELOC_NONE, and falls back to the existing behavior otherwise.

Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Peihan Liu <ronbogo@outlook.com>
---
 arch/riscv/include/asm/alternative-macros.h |  8 +++++---
 arch/riscv/include/asm/asm-extable.h        | 10 ++++++----
 arch/riscv/include/asm/bug.h                |  2 +-
 arch/riscv/include/asm/jump_label.h         |  3 ++-
 arch/riscv/kernel/vmlinux.lds.S             |  9 ++++++++-
 5 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/arch/riscv/include/asm/alternative-macros.h b/arch/riscv/include/asm/alternative-macros.h
index 9619bd5c8eba..dd24c3e1117b 100644
--- a/arch/riscv/include/asm/alternative-macros.h
+++ b/arch/riscv/include/asm/alternative-macros.h
@@ -2,9 +2,11 @@
 #ifndef __ASM_ALTERNATIVE_MACROS_H
 #define __ASM_ALTERNATIVE_MACROS_H
 
+#include <linux/compiler.h>
+
 #ifdef CONFIG_RISCV_ALTERNATIVE
 
-#ifdef __ASSEMBLER__
+#ifdef __ASSEMBLY__
 
 .macro ALT_ENTRY oldptr newptr vendor_id patch_id new_len
 	.4byte \oldptr - .
@@ -16,7 +18,7 @@
 
 .macro ALT_NEW_CONTENT vendor_id, patch_id, enable = 1, new_c
 	.if \enable
-	.pushsection .alternative, "a"
+	PUSHSECTION .alternative, "a"
 	ALT_ENTRY 886b, 888f, \vendor_id, \patch_id, 889f - 888f
 	.popsection
 	.subsection 1
@@ -67,7 +69,7 @@
 
 #define ALT_NEW_CONTENT(vendor_id, patch_id, enable, new_c)		\
 	".if " __stringify(enable) " == 1\n"				\
-	".pushsection .alternative, \"a\"\n"				\
+	PUSHSECTION(.alternative, "a")					\
 	ALT_ENTRY("886b", "888f", __stringify(vendor_id), __stringify(patch_id), "889f - 888f") \
 	".popsection\n"							\
 	".subsection 1\n"						\
diff --git a/arch/riscv/include/asm/asm-extable.h b/arch/riscv/include/asm/asm-extable.h
index 37d425d7a762..24eb29f2ef82 100644
--- a/arch/riscv/include/asm/asm-extable.h
+++ b/arch/riscv/include/asm/asm-extable.h
@@ -2,6 +2,8 @@
 #ifndef __ASM_ASM_EXTABLE_H
 #define __ASM_ASM_EXTABLE_H
 
+#include <linux/compiler.h>
+
 #define EX_TYPE_NONE			0
 #define EX_TYPE_FIXUP			1
 #define EX_TYPE_BPF			2
@@ -10,10 +12,10 @@
 
 #ifdef CONFIG_MMU
 
-#ifdef __ASSEMBLER__
+#ifdef __ASSEMBLY__
 
 #define __ASM_EXTABLE_RAW(insn, fixup, type, data)	\
-	.pushsection	__ex_table, "a";		\
+	PUSHSECTION __ex_table, "a";			\
 	.balign		4;				\
 	.long		((insn) - .);			\
 	.long		((fixup) - .);			\
@@ -31,8 +33,8 @@
 #include <linux/stringify.h>
 #include <asm/gpr-num.h>
 
-#define __ASM_EXTABLE_RAW(insn, fixup, type, data)	\
-	".pushsection	__ex_table, \"a\"\n"		\
+#define __ASM_EXTABLE_RAW(insn, fixup, type, data)      \
+	PUSHSECTION(__ex_table, "a")			\
 	".balign	4\n"				\
 	".long		((" insn ") - .)\n"		\
 	".long		((" fixup ") - .)\n"		\
diff --git a/arch/riscv/include/asm/bug.h b/arch/riscv/include/asm/bug.h
index 4c03e20ad11f..855860c34209 100644
--- a/arch/riscv/include/asm/bug.h
+++ b/arch/riscv/include/asm/bug.h
@@ -54,7 +54,7 @@ typedef u32 bug_insn_t;
 #define ARCH_WARN_ASM(file, line, flags, size)			\
 		"1:\n\t"					\
 			"ebreak\n"				\
-			".pushsection __bug_table,\"aw\"\n\t"	\
+			PUSHSECTION(__bug_table, "aw")          \
 		"2:\n\t"					\
 		__BUG_ENTRY(file, line, flags) "\n\t"		\
 			".org 2b + " size "\n\t"                \
diff --git a/arch/riscv/include/asm/jump_label.h b/arch/riscv/include/asm/jump_label.h
index 3ab5f2e3212b..1134a9bc95a7 100644
--- a/arch/riscv/include/asm/jump_label.h
+++ b/arch/riscv/include/asm/jump_label.h
@@ -11,13 +11,14 @@
 
 #include <linux/types.h>
 #include <asm/asm.h>
+#include <linux/compiler.h>
 
 #define HAVE_JUMP_LABEL_BATCH
 
 #define JUMP_LABEL_NOP_SIZE 4
 
 #define JUMP_TABLE_ENTRY(key, label)			\
-	".pushsection	__jump_table, \"aw\"	\n\t"	\
+	PUSHSECTION(__jump_table, "aw")	                \
 	".align		" RISCV_LGPTR "		\n\t"	\
 	".long		1b - ., " label " - .	\n\t"	\
 	"" RISCV_PTR "	" key " - .		\n\t"	\
diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
index 61bd5ba6680a..e6d117047226 100644
--- a/arch/riscv/kernel/vmlinux.lds.S
+++ b/arch/riscv/kernel/vmlinux.lds.S
@@ -7,6 +7,13 @@
 #define RO_EXCEPTION_TABLE_ALIGN	4
 #define RUNTIME_DISCARD_EXIT
 
+#ifdef CONFIG_PUSHSECTION_WITH_RELOC
+#define NOKEEP___jump_table 1
+#define NOKEEP___ex_table 1
+#define NOKEEP___bug_table 1
+#define NOKEEP_alternative 1
+#endif
+
 #ifdef CONFIG_XIP_KERNEL
 #include "vmlinux-xip.lds.S"
 #else
@@ -117,7 +124,7 @@ SECTIONS
 	. = ALIGN(8);
 	.alternative : {
 		__alt_start = .;
-		KEEP(*(.alternative))
+		COND_KEEP(alternative, *(.alternative*))
 		__alt_end = .;
 	}
 	__init_end = .;
-- 
2.43.0



* [PATCH v2/resend 0/8] dce, riscv: Unused syscall trimming with PUSHSECTION and conditional KEEP()
  2025-10-15  6:16 ` [PATCH v2 0/8] dce, riscv: Unused syscall trimming with PUSHSECTION and conditional KEEP() Yuan Tan
@ 2025-10-15  6:38   ` Yuan Tan
  2025-10-15  7:47   ` [PATCH v2 " Arnd Bergmann
  1 sibling, 0 replies; 15+ messages in thread
From: Yuan Tan @ 2025-10-15  6:38 UTC (permalink / raw)
  To: arnd, masahiroy, nathan, palmer, linux-kbuild, linux-riscv
  Cc: linux-arch, linux-kernel, i, tanyuan, falcon, ronbogo,
	z1652074432, lx24

Hi all,

Sorry for the noise — it looks like my mail provider rewrote the Message-ID
of the cover letter, which broke the thread. I'm resending the cover letter
to make the series appear correctly threaded on lore.

This series aims to introduce syscall trimming support based on dead code
and data elimination (DCE). This can reduce the final image size, which is
particularly useful for embedded devices, while also reducing the attack
surface. It might further benefit specialized scenarios such as unikernels
or LTO builds, and could potentially help shrink the instruction cache
footprint.

Besides that, this series also introduces a new PUSHSECTION macro. This
wrapper allows sections created by .pushsection to have a proper reference
relationship with their callers, so that --gc-sections can safely work
without requiring unconditional KEEP() entries in linker scripts.

Since the new syscalltbl.sh infrastructure has been merged, I think it’s a
good time to push this patchset forward.

Patch 1–3 introduce the infrastructure for TRIM_UNUSED_SYSCALLS, mainly
allowing syscalltbl.sh to decide which syscalls to keep according to
USED_SYSCALLS.
Patch 4 enables TRIM_UNUSED_SYSCALLS for the RISC-V architecture. With
syscalltbl.sh now available, this feature should be applicable to all
architectures that support LD_DEAD_CODE_DATA_ELIMINATION and use
syscalltbl.sh, but let’s focus on RISC-V first.
Patch 5–8 address the dependency inversion problem caused by sections
created with .pushsection that are forcibly retained by KEEP() in linker
scripts.

Here is an example to illustrate the problem:

void fun2(void);

void fun1(void) {
	asm volatile (
		".pushsection .text.pushed,\"ax\"\n\t"
		"call fun2\n\t"
		".popsection\n\t"
	);
}

If fun1() is used, .text.fun1 is kept alive, but nothing in .text.fun1
references .text.pushed, so --gc-sections may discard .text.pushed even
though it is still needed. To avoid this, the kernel traditionally wraps
such sections with KEEP() in the linker script. However, KEEP() introduces
a dependency inversion: if fun1() and fun2() are both unused, .text.fun1,
.text.fun2 and .text.pushed should all be removed, but KEEP() forces
.text.pushed to stay, and its call to fun2() in turn keeps .text.fun2
alive. As a result, sections that should be eliminated are retained
unnecessarily.

In Linux, sections such as ex_table, jump_table, bug_table, and alternative
are created by .pushsection and suffer from this issue. They prevent some
syscalls from being trimmed.

Ideally, .text.fun1 and .text.pushed should share the same fate: if fun1()
is not referenced, .text.pushed should be discarded as well. To achieve
this, we can use the .reloc directive to emit a relocation from the caller
to the section created by .pushsection:

.section .text.fun1,"ax"
.reloc ., BFD_RELOC_NONE, pushedlabel
.pushsection .text.pushed,"ax"
pushedlabel:
	call fun2
.popsection

Based on this idea, we introduce the PUSHSECTION macro. This macro emits a
relocation directive and a new label automatically, while remaining fully
compatible with all existing .pushsection parameters. With this macro, all
current uses of .pushsection (and even .section) in the kernel can be
replaced, significantly reducing the number of KEEP() in linker scripts and
enabling --gc-sections to work more effectively.
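
The difference between KEEP() and a relocation-based reference can be
sketched with a toy reachability model of the fun1/fun2 example. This is
only an illustration of the mark phase of section garbage collection, not
the linker's actual implementation; the names link_model, gc_mark and
live_sections are hypothetical, and the model assumes nothing else
references fun2().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: the three sections from the fun1/fun2 example. */
enum section { SEC_FUN1, SEC_FUN2, SEC_PUSHED, NSECT };

struct link_model {
	bool edge[NSECT][NSECT];	/* edge[a][b]: a has a relocation against b */
	bool root[NSECT];		/* referenced from outside, or KEEP()ed */
};

/* Mark phase: a section is live if it is reachable from any root. */
static void gc_mark(const struct link_model *m, bool live[NSECT])
{
	int stack[NSECT];
	int top = 0;

	for (int s = 0; s < NSECT; s++) {
		live[s] = m->root[s];
		if (live[s])
			stack[top++] = s;
	}
	while (top > 0) {
		int s = stack[--top];

		for (int t = 0; t < NSECT; t++) {
			if (m->edge[s][t] && !live[t]) {
				assert(top < NSECT);	/* each section is pushed at most once */
				live[t] = true;
				stack[top++] = t;
			}
		}
	}
}

/*
 * fun1_used: something outside the model references fun1().
 * keep:      the linker script wraps .text.pushed in KEEP().
 * reloc:     .text.fun1 carries a no-op relocation against .text.pushed.
 * Returns the number of sections that survive garbage collection.
 */
static int live_sections(bool fun1_used, bool keep, bool reloc)
{
	struct link_model m = { 0 };
	bool live[NSECT];
	int n = 0;

	m.edge[SEC_PUSHED][SEC_FUN2] = true;	/* "call fun2" in .text.pushed */
	m.edge[SEC_FUN1][SEC_PUSHED] = reloc;
	m.root[SEC_FUN1] = fun1_used;
	m.root[SEC_PUSHED] = keep;

	gc_mark(&m, live);
	for (int s = 0; s < NSECT; s++)
		n += live[s];
	return n;
}
```

In this model, live_sections(true, false, false) leaves only .text.fun1
alive (the pushed section is lost), live_sections(false, true, false)
keeps .text.pushed and .text.fun2 although fun1() is unused (the
dependency inversion), and with the relocation edge the three sections
live or die together.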

Without PUSHSECTION, 56 syscalls cannot be trimmed when building defconfig
with TRIM_UNUSED_SYSCALLS enabled. With PUSHSECTION, all syscalls can be
properly trimmed.

We have tested enabling TRIM_UNUSED_SYSCALLS while keeping all syscalls
listed in USED_SYSCALLS and successfully booted Ubuntu on a configuration
based on v6.18-rc1 defconfig. The detailed configuration is provided in
[1]. This confirms that the trimming mechanism functions correctly under a
standard kernel setup.

The vmlinux size with tinyconfig is as follows:

| Configuration                   | Syscalls remaining | vmlinux size   | vmlinux size after strip |
| ------------------------------- | ------------------ | -------------- | ------------------------ |
| enable DCE                      | 188                | 1437008        | 915160                   |
| enable DCE and syscall trimming | 3                  | 1263528 (-12%) | 800472 (-13%)            |


Changes in v2:
- Rebased on the unified syscalltbl.sh infrastructure for syscall trimming.
  USED_SYSCALLS now accepts only syscall names to avoid confusion, whereas v1
  also allowed entry point symbols.
- Uses the .reloc directive to establish dependencies.
  Compared with previous proposals using SHF_LINK_ORDER or SHF_GROUP, this
  approach provides a generic, parameter-compatible macro for all
  .pushsection usages without side effects.


Previous versions:
- RFC: https://lore.kernel.org/lkml/cover.1676594211.git.falcon@tinylab.org/
- v1 part 1: https://lore.kernel.org/lkml/cover.1695679700.git.falcon@tinylab.org/
- v1 part 2: https://lore.kernel.org/lkml/cover.1699025537.git.tanyuan@tinylab.org/

Links:
[1] https://pastebin.com/St51bk2K


Yuan Tan (4):
  kconfig: add CONFIG_PUSHSECTION_WITH_RELOC for relocation support
  compiler.h: introduce PUSHSECTION macro to establish proper references
  vmlinux.lds.h: support conditional KEEP() in linker script
  riscv: use PUSHSECTION in ex_table, jump_table, bug_table and
    alternatives

Yuhang Zheng (4):
  init/Kconfig: add CONFIG_TRIM_UNUSED_SYSCALLS and related options
  scripts/syscalltbl.sh: add optional --used-syscalls argument for
    syscall trimming
  scripts/Makefile.asm-headers: pass USED_SYSCALLS to syscalltbl.sh
  riscv: enable HAVE_TRIM_UNUSED_SYSCALLS when toolchain supports DCE

 arch/riscv/Kconfig                          |  1 +
 arch/riscv/include/asm/alternative-macros.h |  8 ++--
 arch/riscv/include/asm/asm-extable.h        | 10 +++--
 arch/riscv/include/asm/bug.h                |  2 +-
 arch/riscv/include/asm/jump_label.h         |  3 +-
 arch/riscv/kernel/vmlinux.lds.S             |  9 +++-
 include/asm-generic/vmlinux.lds.h           | 12 ++++-
 include/linux/compiler.h                    | 43 +++++++++++++++++-
 include/linux/compiler_types.h              |  8 ++--
 init/Kconfig                                | 49 +++++++++++++++++++++
 scripts/Makefile.asm-headers                |  4 ++
 scripts/syscalltbl.sh                       | 19 +++++++-
 12 files changed, 150 insertions(+), 18 deletions(-)


base-commit: 3a8660878839faadb4f1a6dd72c3179c1df56787
prerequisite-patch-id: 7af3175326df94637f04a050dee7356416eb1edd
-- 
2.43.0



* Re: [PATCH v2 0/8] dce, riscv: Unused syscall trimming with PUSHSECTION and conditional KEEP()
  2025-10-15  6:16 ` [PATCH v2 0/8] dce, riscv: Unused syscall trimming with PUSHSECTION and conditional KEEP() Yuan Tan
  2025-10-15  6:38   ` [PATCH v2/resend " Yuan Tan
@ 2025-10-15  7:47   ` Arnd Bergmann
  2025-11-04  2:21     ` Yuan Tan
  1 sibling, 1 reply; 15+ messages in thread
From: Arnd Bergmann @ 2025-10-15  7:47 UTC (permalink / raw)
  To: Yuan Tan, Masahiro Yamada, Nathan Chancellor, Palmer Dabbelt,
	linux-kbuild, linux-riscv
  Cc: Linux-Arch, linux-kernel, i, Zhangjin Wu, ronbogo, z1652074432,
	lx24

On Wed, Oct 15, 2025, at 08:16, Yuan Tan wrote:
> Hi all,
>
> This series aims to introduce syscall trimming support based on dead code
> and data elimination (DCE). This can reduce the final image size, which is
> particularly useful for embedded devices, while also reducing the attack
> surface. It might further benefit specialized scenarios such as unikernels
> or LTO builds, and could potentially help shrink the instruction cache
> footprint.
>
> Besides that, this series also introduces a new PUSHSECTION macro. This
> wrapper allows sections created by .pushsection to have a proper reference
> relationship with their callers, so that --gc-sections can safely work
> without requiring unconditional KEEP() entries in linker scripts.
>
> Since the new syscalltbl.sh infrastructure has been merged, I think it’s a
> good time to push this patchset forward.
>
> Patch 1–3 introduce the infrastructure for TRIM_UNUSED_SYSCALLS, mainly
> allowing syscalltbl.sh to decide which syscalls to keep according to
> USED_SYSCALLS.
> Patch 4 enables TRIM_UNUSED_SYSCALLS for the RISC-V architecture. With
> syscalltbl.sh now available, this feature should be applicable to all
> architectures that support LD_DEAD_CODE_DATA_ELIMINATION and use
> syscalltbl.sh, but let’s focus on RISC-V first.
> Patch 5–8 address the dependency inversion problem caused by sections
> created with .pushsection that are forcibly retained by KEEP() in linker
> scripts.

Thanks a lot for your work on this. I think it is indeed valuable to
be able to optimize kernels with a smaller subset of system calls for
known workloads, and have as much dead code elimination as possible.

However, I continue to think that the added scripting with a known
set of syscall names is fundamentally the wrong approach to get to
this list: This adds complexity to the build process in one of
the areas that is already too complicated, and it duplicates what
we can already do with Kconfig for a subset of the system calls.

I think the way we should configure the set of syscalls instead is
to add more Kconfig symbols guarded by CONFIG_EXPERT that turn
classes of syscalls on or off. You have obviously done the research
to come up with a list of used/unused entry points for one or more
workloads. Can you share those lists?

      Arnd


* Re: [PATCH v2 6/8] compiler.h: introduce PUSHSECTION macro to establish proper references
  2025-10-15  6:19 ` [PATCH v2 6/8] compiler.h: introduce PUSHSECTION macro to establish proper references Yuan Tan
@ 2025-10-16  3:14   ` kernel test robot
  2025-10-16  6:03   ` kernel test robot
  1 sibling, 0 replies; 15+ messages in thread
From: kernel test robot @ 2025-10-16  3:14 UTC (permalink / raw)
  To: Yuan Tan, arnd, masahiroy, nathan, palmer, linux-kbuild,
	linux-riscv
  Cc: llvm, oe-kbuild-all, linux-arch, linux-kernel, i, tanyuan, falcon,
	ronbogo, z1652074432, lx24

Hi Yuan,

kernel test robot noticed the following build errors:

[auto build test ERROR on soc/for-next]
[also build test ERROR on arnd-asm-generic/master linus/master v6.18-rc1 next-20251015]
[cannot apply to masahiroy-kbuild/for-next masahiroy-kbuild/fixes]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Yuan-Tan/scripts-syscalltbl-sh-add-optional-used-syscalls-argument-for-syscall-trimming/20251015-181934
base:   https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git for-next
patch link:    https://lore.kernel.org/r/40854460DE090346%2Bc30007da67d26ae0e8651732f32a8ede4926db14.1760463245.git.tanyuan%40tinylab.org
patch subject: [PATCH v2 6/8] compiler.h: introduce PUSHSECTION macro to establish proper references
config: arm-randconfig-001-20251016 (https://download.01.org/0day-ci/archive/20251016/202510161119.Qau82x7Z-lkp@intel.com/config)
compiler: clang version 22.0.0git (https://github.com/llvm/llvm-project 39f292ffa13d7ca0d1edff27ac8fd55024bb4d19)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251016/202510161119.Qau82x7Z-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510161119.Qau82x7Z-lkp@intel.com/

All errors (new ones prefixed by >>):

>> ld.lld: error: ./arch/arm/kernel/vmlinux.lds:1: unknown directive: .macro
   >>> .macro _PUSHSECTION label:req, section:req, args:vararg
   >>> ^

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH v2 6/8] compiler.h: introduce PUSHSECTION macro to establish proper references
  2025-10-15  6:19 ` [PATCH v2 6/8] compiler.h: introduce PUSHSECTION macro to establish proper references Yuan Tan
  2025-10-16  3:14   ` kernel test robot
@ 2025-10-16  6:03   ` kernel test robot
  1 sibling, 0 replies; 15+ messages in thread
From: kernel test robot @ 2025-10-16  6:03 UTC (permalink / raw)
  To: Yuan Tan, arnd, masahiroy, nathan, palmer, linux-kbuild,
	linux-riscv
  Cc: oe-kbuild-all, linux-arch, linux-kernel, i, tanyuan, falcon,
	ronbogo, z1652074432, lx24

Hi Yuan,

kernel test robot noticed the following build errors:

[auto build test ERROR on soc/for-next]
[also build test ERROR on arnd-asm-generic/master linus/master v6.18-rc1 next-20251015]
[cannot apply to masahiroy-kbuild/for-next masahiroy-kbuild/fixes]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Yuan-Tan/scripts-syscalltbl-sh-add-optional-used-syscalls-argument-for-syscall-trimming/20251015-181934
base:   https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git for-next
patch link:    https://lore.kernel.org/r/40854460DE090346%2Bc30007da67d26ae0e8651732f32a8ede4926db14.1760463245.git.tanyuan%40tinylab.org
patch subject: [PATCH v2 6/8] compiler.h: introduce PUSHSECTION macro to establish proper references
config: sh-randconfig-002-20251016 (https://download.01.org/0day-ci/archive/20251016/202510161337.2Vym0Pmn-lkp@intel.com/config)
compiler: sh4-linux-gcc (GCC) 14.3.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251016/202510161337.2Vym0Pmn-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510161337.2Vym0Pmn-lkp@intel.com/

All errors (new ones prefixed by >>):

>> sh4-linux-ld:./arch/sh/kernel/vmlinux.lds:2: syntax error

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH v2 0/8] dce, riscv: Unused syscall trimming with PUSHSECTION and conditional KEEP()
  2025-10-15  7:47   ` [PATCH v2 " Arnd Bergmann
@ 2025-11-04  2:21     ` Yuan Tan
  2025-11-07 13:33       ` Arnd Bergmann
  0 siblings, 1 reply; 15+ messages in thread
From: Yuan Tan @ 2025-11-04  2:21 UTC (permalink / raw)
  To: Arnd Bergmann, Masahiro Yamada, Nathan Chancellor, Palmer Dabbelt,
	linux-kbuild, linux-riscv
  Cc: Linux-Arch, linux-kernel, i, Zhangjin Wu, ronbogo, z1652074432,
	lx24


On 10/15/2025 12:47 AM, Arnd Bergmann wrote:
> On Wed, Oct 15, 2025, at 08:16, Yuan Tan wrote:
>> Hi all,
>>
>> This series aims to introduce syscall trimming support based on dead code
>> and data elimination (DCE). This can reduce the final image size, which is
>> particularly useful for embedded devices, while also reducing the attack
>> surface. It might further benefit specialized scenarios such as unikernels
>> or LTO builds, and could potentially help shrink the instruction cache
>> footprint.
>>
>> Besides that, this series also introduces a new PUSHSECTION macro. This
>> wrapper allows sections created by .pushsection to have a proper reference
>> relationship with their callers, so that --gc-sections can safely work
>> without requiring unconditional KEEP() entries in linker scripts.
>>
>> Since the new syscalltbl.sh infrastructure has been merged, I think it’s a
>> good time to push this patchset forward.
>>
>> Patch 1–3 introduce the infrastructure for TRIM_UNUSED_SYSCALLS, mainly
>> allowing syscalltbl.sh to decide which syscalls to keep according to
>> USED_SYSCALLS.
>> Patch 4 enables TRIM_UNUSED_SYSCALLS for the RISC-V architecture. With
>> syscalltbl.sh now available, this feature should be applicable to all
>> architectures that support LD_DEAD_CODE_DATA_ELIMINATION and use
>> syscalltbl.sh, but let’s focus on RISC-V first.
>> Patch 5–8 address the dependency inversion problem caused by sections
>> created with .pushsection that are forcibly retained by KEEP() in linker
>> scripts.
> Thanks a lot for your work on this. I think it is indeed valuable to
> be able to optimize kernels with a smaller subset of system calls for
> known workloads, and have as much dead code elimination as possible.
>
> However, I continue to think that the added scripting with a known
> set of syscall names is fundamentally the wrong approach to get to
> this list: This adds complexity to the build process in one of
> the areas that is already too complicated, and it duplicates what
> we can already do with Kconfig for a subset of the system calls.
>
> I think the way we should configure the set of syscalls instead is
> to add more Kconfig symbols guarded by CONFIG_EXPERT that turn
> classes of syscalls on or off. You have obviously done the research
> to come up with a list of used/unused entry points for one or more
> workloads. Can you share those lists?
>
>       Arnd


Hi Arnd,

Sorry for the late reply — this patchset really wore me out, and I only just
recovered.  Thank you very much for your feedback!

Regarding your suggestion to use Kconfig to control which system calls are
included or excluded, perhaps we could take inspiration from systemd's
classification approach. For example, systemd groups syscalls into categories
like[1]:

@aio @basic-io @chown @clock @cpu-emulation @debug @file-system

and so on.

However, if we go down this route, we would need to continuously maintain and
update these categories whenever Linux introduces new system calls. I'm not
sure whether that would be an ideal long-term approach.

For reference, here is the list of syscalls required to run Lighttpd.

execve set_tid_address mount write brk mmap munmap getuid getgid getpid
clock_gettime getcwd fcntl fstat read dup3 socket setsockopt bind listen
rt_sigaction rt_sigprocmask newfstatat prlimit64 epoll_create1 epoll_ctl pipe2
epoll_pwait accept4 getsockopt recvfrom shutdown writev getdents64 openat close

We've tested it successfully on QEMU + initramfs, and I can share the
deployment script if anyone would like to reproduce the setup.

Also, I noticed that there haven't been any comments so far on the later
patches introducing the PUSHSECTION macro.  I'm a bit concerned about how
people perceive this part.

[1] https://github.com/systemd/systemd/blob/main/src/shared/seccomp-util.c





* Re: [PATCH v2 0/8] dce, riscv: Unused syscall trimming with PUSHSECTION and conditional KEEP()
  2025-11-04  2:21     ` Yuan Tan
@ 2025-11-07 13:33       ` Arnd Bergmann
  0 siblings, 0 replies; 15+ messages in thread
From: Arnd Bergmann @ 2025-11-07 13:33 UTC (permalink / raw)
  To: Yuan Tan, Masahiro Yamada, Nathan Chancellor, Palmer Dabbelt,
	linux-kbuild, linux-riscv
  Cc: Linux-Arch, linux-kernel, i, Zhangjin Wu, ronbogo, z1652074432,
	lx24

On Tue, Nov 4, 2025, at 03:21, Yuan Tan wrote:

>> Sorry for the late reply — this patchset really wore me out, and I only just
>> recovered.  Thank you very much for your feedback!

Sorry to hear this has been stressful for you. It's an unfortunate
aspect of the way we work that sometimes 

> On 10/15/2025 12:47 AM, Arnd Bergmann wrote:
>> On Wed, Oct 15, 2025, at 08:16, Yuan Tan wrote:
>> Thanks a lot for your work on this. I think it is indeed valuable to
>> be able to optimize kernels with a smaller subset of system calls for
>> known workloads, and have as much dead code elimination as possible.
>>
>> However, I continue to think that the added scripting with a known
>> set of syscall names is fundamentally the wrong approach to get to
>> this list: This adds complexity to the build process in one of
>> the areas that is already too complicated, and it duplicates what
>> we can already do with Kconfig for a subset of the system calls.
>>
>> I think the way we should configure the set of syscalls instead is
>> to add more Kconfig symbols guarded by CONFIG_EXPERT that turn
>> classes of syscalls on or off. You have obviously done the research
>> to come up with a list of used/unused entry points for one or more
>> workloads. Can you share those lists?
>
> Regarding your suggestion to use Kconfig to control which system calls are
> included or excluded, perhaps we could take inspiration from systemd's
> classification approach. For example, systemd groups syscalls into categories
> like[1]:
>
> @aio @basic-io @chown @clock @cpu-emulation @debug @file-system
>
> and so on.

I think many of the categories already naturally align with the
structure of the kernel source code, so maintaining them naturally comes
out of the build system.

More importantly, turning off parts of the kernel on a per-file
basis tends to work better for eliminating the entire block
of code because only removing the syscall entry still leaves
references to functions and global data structures from initcalls
and exported functions.

> However, if we go down this route, we would need to continuously maintain and
> update these categories whenever Linux introduces new system calls. I'm not
> sure whether that would be an ideal long-term approach.

If we can (at least roughly) align the categories between the kernel and the
systemd classification, that would at least make it easier to maintain
the systemd ones.

> For reference, here is the list of syscalls required to run Lighttpd.
>
> execve set_tid_address mount write brk mmap munmap getuid getgid getpid
> clock_gettime getcwd fcntl fstat read dup3 socket setsockopt bind listen
> rt_sigaction rt_sigprocmask newfstatat prlimit64 epoll_create1 epoll_ctl pipe2
> epoll_pwait accept4 getsockopt recvfrom shutdown writev getdents64 openat close
>
> We've tested it successfully on QEMU + initramfs, and I can share the
> deployment script if anyone would like to reproduce the setup.

Thanks for the list! Is this a workload you are interested in actually
optimizing for deployment, or just something you used as a simple test
environment?

I see three types of syscalls in your list above:

1. essential ones that are basically always needed
2. socket interfaces (already optional)
3. epoll (already optional)

The first two sets are clearly going to have more syscalls in
them that are usually used in combination with the others:
If we provide read, write and writev, we should also provide readv,
and if we provide socket/bind/listen/recvfrom, we also likely want
accept/connect/sendto and probably recvmsg/sendmsg.

Starting with your set of syscalls and those closely related
ones, as well as the set of syscalls that already have a
Kconfig option, we should be able to find the set of syscalls
that are unconditionally enabled but could be optional.
If you have the chance, could you compile that list?
I might also have a list, but probably not in the next week.
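
As a starting point for such a list, the Lighttpd syscalls quoted above can
be partitioned along the three groups named in this message. The sketch
below is only an illustration: the enum, the function names and the
classification rule (an "epoll_" prefix test plus a hand-written socket
list covering only the names that appear in the quoted set) are
assumptions, not an existing kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical grouping following the three types listed in the message. */
enum sc_class { SC_ALWAYS, SC_NET, SC_EPOLL, SC_NCLASS };

/* The Lighttpd syscall list exactly as quoted in the thread. */
static const char *const lighttpd_syscalls[] = {
	"execve", "set_tid_address", "mount", "write", "brk", "mmap",
	"munmap", "getuid", "getgid", "getpid", "clock_gettime", "getcwd",
	"fcntl", "fstat", "read", "dup3", "socket", "setsockopt", "bind",
	"listen", "rt_sigaction", "rt_sigprocmask", "newfstatat",
	"prlimit64", "epoll_create1", "epoll_ctl", "pipe2", "epoll_pwait",
	"accept4", "getsockopt", "recvfrom", "shutdown", "writev",
	"getdents64", "openat", "close",
};

/* Assumption: only the socket calls that occur in the list above. */
static const char *const socket_syscalls[] = {
	"socket", "setsockopt", "bind", "listen",
	"accept4", "getsockopt", "recvfrom", "shutdown",
};

/* Classify one syscall name into one of the three groups. */
static enum sc_class classify(const char *name)
{
	if (strncmp(name, "epoll_", 6) == 0)
		return SC_EPOLL;
	for (size_t i = 0; i < ARRAY_LEN(socket_syscalls); i++)
		if (strcmp(name, socket_syscalls[i]) == 0)
			return SC_NET;
	return SC_ALWAYS;	/* candidate for a new optional class */
}

/* Count how many of the quoted syscalls fall into each group. */
static void count_classes(size_t counts[SC_NCLASS])
{
	memset(counts, 0, SC_NCLASS * sizeof(counts[0]));
	for (size_t i = 0; i < ARRAY_LEN(lighttpd_syscalls); i++) {
		enum sc_class c = classify(lighttpd_syscalls[i]);

		assert(c < SC_NCLASS);
		counts[c]++;
	}
}
```

Calling count_classes() on a size_t array of SC_NCLASS entries gives the
split of the 36 quoted syscalls; the entries that land in SC_ALWAYS are the
ones that would still need a decision on whether they can become optional.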

The next step after that I think is to measure the impact
of turning off those remaining ones in a configuration that
has the existing symbols (e.g. sysvipc, futex, compat_32bit_time,
...) disabled already.

Side note: I'm a bit surprised to see fstat() in the list, since riscv
should only really support newfstat().

> Also, I noticed that there haven't been any comments so far on the later
> patches introducing the PUSHSECTION macro.  I'm a bit concerned about how
> people perceive this part.

I don't have a strong opinion on this part.

     Arnd


end of thread, other threads:[~2025-11-07 13:33 UTC | newest]

Thread overview: 15+ messages
     [not found] <cover.1760463245.git.tanyuan@tinylab.org>
2025-10-15  6:16 ` [PATCH v2 0/8] dce, riscv: Unused syscall trimming with PUSHSECTION and conditional KEEP() Yuan Tan
2025-10-15  6:38   ` [PATCH v2/resend " Yuan Tan
2025-10-15  7:47   ` [PATCH v2 " Arnd Bergmann
2025-11-04  2:21     ` Yuan Tan
2025-11-07 13:33       ` Arnd Bergmann
2025-10-15  6:17 ` [PATCH v2 1/8] init/Kconfig: add CONFIG_TRIM_UNUSED_SYSCALLS and related options Yuan Tan
2025-10-15  6:17 ` [PATCH v2 2/8] scripts/syscalltbl.sh: add optional --used-syscalls argument for syscall trimming Yuan Tan
2025-10-15  6:18 ` [PATCH v2 3/8] scripts/Makefile.asm-headers: pass USED_SYSCALLS to syscalltbl.sh Yuan Tan
2025-10-15  6:18 ` [PATCH v2 4/8] riscv: enable HAVE_TRIM_UNUSED_SYSCALLS when toolchain supports DCE Yuan Tan
2025-10-15  6:18 ` [PATCH v2 5/8] kconfig: add CONFIG_PUSHSECTION_WITH_RELOC for relocation support Yuan Tan
2025-10-15  6:19 ` [PATCH v2 6/8] compiler.h: introduce PUSHSECTION macro to establish proper references Yuan Tan
2025-10-16  3:14   ` kernel test robot
2025-10-16  6:03   ` kernel test robot
2025-10-15  6:22 ` [PATCH v2 7/8] vmlinux.lds.h: support conditional KEEP() in linker script Yuan Tan
2025-10-15  6:22 ` [PATCH v2 8/8] riscv: use PUSHSECTION in ex_table, jump_table, bug_table and alternatives Yuan Tan
