From: haoran.jiang@linux.dev
To: loongarch@lists.linux.dev
Cc: linux-kernel@vger.kernel.org, chenhuacai@kernel.org,
kernel@xen0n.name, akpm@linux-foundation.org, jbohac@suse.cz,
kees@kernel.org, yangtiezhu@loongson.cn,
Haoran Jiang <jianghaoran@kylinos.cn>
Subject: [PATCH v4] LoongArch: Enable STRICT_MODULE_RWX for stricter modules memory permissions
Date: Sat, 13 Jun 2026 16:41:47 +0800
Message-ID: <20260613084147.449502-1-haoran.jiang@linux.dev>
From: Haoran Jiang <jianghaoran@kylinos.cn>
Enable STRICT_MODULE_RWX to enforce strict memory permissions on
modules: the code region becomes non-writable, the data region
non-executable, and the read-only data region both non-writable
and non-executable. Code patching temporarily makes the affected
code page writable via the set_memory() API and restores its
permissions afterwards.
Signed-off-by: Haoran Jiang <jianghaoran@kylinos.cn>
---
v2:
Modify page table permissions with the set_memory() API instead of patch_map.
v3:
Modify commit description.
v4:
Take text_mutex in the larch_insn_write() call paths and
enable CONFIG_STRICT_MODULE_RWX by default.
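
The set_memory()-based patching described in this changelog can be
sketched in userspace, purely as an illustration. The sketch below is
not kernel code: patch_word() and INSN_SIZE are hypothetical names,
mprotect() stands in for set_memory_rw()/set_memory_rox(), and the
final protection is PROT_READ rather than read-only+executable because
nothing is executed here. It assumes the caller has already mapped the
target page read-only.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define INSN_SIZE 4 /* hypothetical stand-in for LOONGARCH_INSN_SIZE */

/*
 * Hypothetical userspace analogue of larch_insn_write(): the page that
 * holds addr is assumed to be mapped read-only by the caller.
 */
static int patch_word(void *addr, uint32_t insn)
{
	long page_size = sysconf(_SC_PAGESIZE);
	uintptr_t start;
	int err = 0;

	assert(page_size > 0);

	/* Instructions are 4-byte aligned; reject anything else. */
	if ((uintptr_t)addr & 3)
		return -EINVAL;

	/* Round down to the start of the containing page. */
	start = (uintptr_t)addr & ~((uintptr_t)page_size - 1);

	/* Step 1: open a write window (kernel: set_memory_rw(start, 1)). */
	if (mprotect((void *)start, (size_t)page_size, PROT_READ | PROT_WRITE))
		return -errno;

	/* Step 2: store the new instruction word. */
	memcpy(addr, &insn, INSN_SIZE);

	/*
	 * Step 3: close the window again. PROT_READ stands in for the
	 * kernel's set_memory_rox(start, 1).
	 */
	if (mprotect((void *)start, (size_t)page_size, PROT_READ))
		err = -errno;

	return err;
}
```

In the kernel, callers serialize this sequence with text_mutex; the
sketch is single-threaded and omits any locking.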
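UnixBench (UB) results on a 3C6000 server show no significant performance
impact: the overall index score moves from 1492.2 to 1465.3 with one copy
and from 8765.2 to 8759.8 with 128 parallel copies.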
Before patch:
========================================================================
BYTE UNIX Benchmarks (Version 5.1.6)
System: localhost.localdomain: GNU/Linux
OS: GNU/Linux -- 7.1.0-rc6 -- #1 SMP PREEMPT Wed Jun 10 21:07:41 CST 2026
Machine: loongarch64 (loongarch64)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
21:19:51 up 1 min, 2 users, load average: 0.71, 0.38, 0.14; runlevel 2026-06-10
------------------------------------------------------------------------
128 CPUs in system; running 1 parallel copy of tests
Dhrystone 2 using register variables 35205725.4 lps (10.0 s, 7 samples)
Double-Precision Whetstone 4244.9 MWIPS (10.0 s, 7 samples)
Execl Throughput 6717.7 lps (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 1213873.8 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 350740.5 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 3275103.0 KBps (30.0 s, 2 samples)
Pipe Throughput 1981993.9 lps (10.0 s, 7 samples)
Pipe-based Context Switching 55287.7 lps (10.0 s, 7 samples)
Process Creation 9056.8 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 6736.5 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 2109.8 lpm (60.0 s, 2 samples)
System Call Overhead 1549110.9 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 35205725.4 3016.8
Double-Precision Whetstone 55.0 4244.9 771.8
Execl Throughput 43.0 6717.7 1562.3
File Copy 1024 bufsize 2000 maxblocks 3960.0 1213873.8 3065.3
File Copy 256 bufsize 500 maxblocks 1655.0 350740.5 2119.3
File Copy 4096 bufsize 8000 maxblocks 5800.0 3275103.0 5646.7
Pipe Throughput 12440.0 1981993.9 1593.2
Pipe-based Context Switching 4000.0 55287.7 138.2
Process Creation 126.0 9056.8 718.8
Shell Scripts (1 concurrent) 42.4 6736.5 1588.8
Shell Scripts (8 concurrent) 6.0 2109.8 3516.4
System Call Overhead 15000.0 1549110.9 1032.7
========
System Benchmarks Index Score 1492.2
------------------------------------------------------------------------
128 CPUs in system; running 128 parallel copies of tests
Dhrystone 2 using register variables 2901925470.7 lps (10.0 s, 7 samples)
Double-Precision Whetstone 503614.9 MWIPS (10.1 s, 7 samples)
Execl Throughput 34080.1 lps (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 309291.4 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 75115.3 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 1018101.3 KBps (30.0 s, 2 samples)
Pipe Throughput 180702003.1 lps (10.0 s, 7 samples)
Pipe-based Context Switching 2426596.4 lps (10.0 s, 7 samples)
Process Creation 37282.5 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 50813.7 lpm (60.1 s, 2 samples)
Shell Scripts (8 concurrent) 5835.8 lpm (60.4 s, 2 samples)
System Call Overhead 9039181.1 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 2901925470.7 248665.4
Double-Precision Whetstone 55.0 503614.9 91566.3
Execl Throughput 43.0 34080.1 7925.6
File Copy 1024 bufsize 2000 maxblocks 3960.0 309291.4 781.0
File Copy 256 bufsize 500 maxblocks 1655.0 75115.3 453.9
File Copy 4096 bufsize 8000 maxblocks 5800.0 1018101.3 1755.3
Pipe Throughput 12440.0 180702003.1 145258.8
Pipe-based Context Switching 4000.0 2426596.4 6066.5
Process Creation 126.0 37282.5 2958.9
Shell Scripts (1 concurrent) 42.4 50813.7 11984.4
Shell Scripts (8 concurrent) 6.0 5835.8 9726.4
System Call Overhead 15000.0 9039181.1 6026.1
========
System Benchmarks Index Score 8765.2
After patch:
128 CPUs in system; running 1 parallel copy of tests
Dhrystone 2 using register variables 35438193.7 lps (10.0 s, 7 samples)
Double-Precision Whetstone 4245.7 MWIPS (10.0 s, 7 samples)
Execl Throughput 5293.7 lps (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 1233323.4 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 355264.5 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 3333631.6 KBps (30.0 s, 2 samples)
Pipe Throughput 1979613.2 lps (10.0 s, 7 samples)
Pipe-based Context Switching 55675.2 lps (10.0 s, 7 samples)
Process Creation 8528.1 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 6870.0 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 2115.5 lpm (60.0 s, 2 samples)
System Call Overhead 1546959.4 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 35438193.7 3036.7
Double-Precision Whetstone 55.0 4245.7 772.0
Execl Throughput 43.0 5293.7 1231.1
File Copy 1024 bufsize 2000 maxblocks 3960.0 1233323.4 3114.5
File Copy 256 bufsize 500 maxblocks 1655.0 355264.5 2146.6
File Copy 4096 bufsize 8000 maxblocks 5800.0 3333631.6 5747.6
Pipe Throughput 12440.0 1979613.2 1591.3
Pipe-based Context Switching 4000.0 55675.2 139.2
Process Creation 126.0 8528.1 676.8
Shell Scripts (1 concurrent) 42.4 6870.0 1620.3
Shell Scripts (8 concurrent) 6.0 2115.5 3525.8
System Call Overhead 15000.0 1546959.4 1031.3
========
System Benchmarks Index Score 1465.3
------------------------------------------------------------------------
128 CPUs in system; running 128 parallel copies of tests
Dhrystone 2 using register variables 2903340286.5 lps (10.0 s, 7 samples)
Double-Precision Whetstone 504137.7 MWIPS (10.1 s, 7 samples)
Execl Throughput 34332.8 lps (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 311391.2 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 72503.3 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 1000861.7 KBps (30.0 s, 2 samples)
Pipe Throughput 179382076.6 lps (10.0 s, 7 samples)
Pipe-based Context Switching 2415716.6 lps (10.0 s, 7 samples)
Process Creation 36873.1 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 51464.1 lpm (60.1 s, 2 samples)
Shell Scripts (8 concurrent) 5976.3 lpm (60.4 s, 2 samples)
System Call Overhead 9182389.5 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 2903340286.5 248786.7
Double-Precision Whetstone 55.0 504137.7 91661.4
Execl Throughput 43.0 34332.8 7984.4
File Copy 1024 bufsize 2000 maxblocks 3960.0 311391.2 786.3
File Copy 256 bufsize 500 maxblocks 1655.0 72503.3 438.1
File Copy 4096 bufsize 8000 maxblocks 5800.0 1000861.7 1725.6
Pipe Throughput 12440.0 179382076.6 144197.8
Pipe-based Context Switching 4000.0 2415716.6 6039.3
Process Creation 126.0 36873.1 2926.4
Shell Scripts (1 concurrent) 42.4 51464.1 12137.8
Shell Scripts (8 concurrent) 6.0 5976.3 9960.6
System Call Overhead 15000.0 9182389.5 6121.6
========
System Benchmarks Index Score 8759.8
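
For readers who want to compare individual rows, the arithmetic behind
the tables above can be sketched as follows. The helper names are
hypothetical, and the index formula (result divided by baseline, times
10) is an assumption that matches the rows shown, for example
35205725.4 / 116700.0 * 10 is about 3016.8.

```c
/*
 * Per-test index as shown in the INDEX column: the result relative to
 * the baseline, scaled by 10 (assumed formula, see the note above).
 */
static double ub_index(double baseline, double result)
{
	return result / baseline * 10.0;
}

/* Relative change from 'before' to 'after', in percent. */
static double pct_change(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

With the overall scores from this mail, pct_change(1492.2, 1465.3) is
about -1.8 and pct_change(8765.2, 8759.8) is about -0.06. Individual
tests can be compared the same way, keeping in mind that several of
them were measured with only 2 samples.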
---
arch/loongarch/Kconfig | 1 +
arch/loongarch/kernel/ftrace_dyn.c | 7 ++++++-
arch/loongarch/kernel/inst.c | 25 +++++++++++++++++++++----
arch/loongarch/kernel/jump_label.c | 3 +++
4 files changed, 31 insertions(+), 5 deletions(-)
diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 606597da46b8..c751d714c287 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -27,6 +27,7 @@ config LOONGARCH
select ARCH_HAS_PTE_SPECIAL if 64BIT
select ARCH_HAS_SET_MEMORY
select ARCH_HAS_SET_DIRECT_MAP
+ select ARCH_HAS_STRICT_MODULE_RWX
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAS_UBSAN
select ARCH_HAS_VDSO_ARCH_DATA
diff --git a/arch/loongarch/kernel/ftrace_dyn.c b/arch/loongarch/kernel/ftrace_dyn.c
index d5d81d74034c..598dc6434cc4 100644
--- a/arch/loongarch/kernel/ftrace_dyn.c
+++ b/arch/loongarch/kernel/ftrace_dyn.c
@@ -8,6 +8,7 @@
#include <linux/ftrace.h>
#include <linux/kprobes.h>
#include <linux/uaccess.h>
+#include <linux/memory.h>
#include <asm/inst.h>
#include <asm/module.h>
@@ -24,8 +25,12 @@ static int ftrace_modify_code(unsigned long pc, u32 old, u32 new, bool validate)
return -EINVAL;
}
- if (larch_insn_patch_text((void *)pc, new))
+ mutex_lock(&text_mutex);
+ if (larch_insn_patch_text((void *)pc, new)) {
+ mutex_unlock(&text_mutex);
return -EPERM;
+ }
+ mutex_unlock(&text_mutex);
return 0;
}
diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
index 0b9228b7c13a..3de94d465c3c 100644
--- a/arch/loongarch/kernel/inst.c
+++ b/arch/loongarch/kernel/inst.c
@@ -6,12 +6,11 @@
#include <linux/uaccess.h>
#include <linux/set_memory.h>
#include <linux/stop_machine.h>
+#include <linux/memory.h>
#include <asm/cacheflush.h>
#include <asm/inst.h>
-static DEFINE_RAW_SPINLOCK(patch_lock);
-
void simu_pc(struct pt_regs *regs, union loongarch_instruction insn)
{
unsigned long pc = regs->csr_era;
@@ -207,14 +206,32 @@ int larch_insn_read(void *addr, u32 *insnp)
int larch_insn_write(void *addr, u32 insn)
{
int ret;
+ int err = 0;
+ size_t start;
unsigned long flags = 0;
if ((unsigned long)addr & 3)
return -EINVAL;
- raw_spin_lock_irqsave(&patch_lock, flags);
+ start = round_down((size_t)addr, PAGE_SIZE);
+
+ lockdep_assert_held(&text_mutex);
+
+ err = set_memory_rw(start, 1);
+ if (err) {
+ pr_info("%s: set_memory_rw() failed\n", __func__);
+ return err;
+ }
+
+ local_irq_save(flags);
ret = copy_to_kernel_nofault(addr, &insn, LOONGARCH_INSN_SIZE);
- raw_spin_unlock_irqrestore(&patch_lock, flags);
+ local_irq_restore(flags);
+
+ err = set_memory_rox(start, 1);
+ if (err) {
+ pr_info("%s: set_memory_rox() failed\n", __func__);
+ return err;
+ }
return ret;
}
diff --git a/arch/loongarch/kernel/jump_label.c b/arch/loongarch/kernel/jump_label.c
index 24a3f4d8540c..e6bb040fe4c5 100644
--- a/arch/loongarch/kernel/jump_label.c
+++ b/arch/loongarch/kernel/jump_label.c
@@ -6,6 +6,7 @@
*/
#include <linux/kernel.h>
#include <linux/jump_label.h>
+#include <linux/memory.h>
#include <asm/cacheflush.h>
#include <asm/inst.h>
@@ -19,7 +20,9 @@ bool arch_jump_label_transform_queue(struct jump_entry *entry, enum jump_label_t
else
insn = larch_insn_gen_nop();
+ mutex_lock(&text_mutex);
larch_insn_write(addr, insn);
+ mutex_unlock(&text_mutex);
return true;
}
--
2.43.0