* [PATCH v3 0/9] s390/string: Convert various functions to C
@ 2026-06-09 10:33 Heiko Carstens
2026-06-09 10:33 ` [PATCH v3 1/9] s390/purgatory: Enforce z10 minimum architecture level Heiko Carstens
` (8 more replies)
0 siblings, 9 replies; 12+ messages in thread
From: Heiko Carstens @ 2026-06-09 10:33 UTC (permalink / raw)
To: Alexander Gordeev, Sven Schnelle, Vasily Gorbik,
Christian Borntraeger, Juergen Christ
Cc: linux-s390
v3:
- Add a patch which adds the -ffreestanding compile option for
string.o [Sashiko [3]]
v2:
Address various Sashiko findings [2]:
- Off-by-one comparison bug, which led to the slow path being taken in memmove()
- Incorrect ifdef around __memset()
- Incorrect commit description for memmove() conversion
- Missing noinstr for tishift functions
- Missing header guard for tishift header file
v1:
While working on something else I stumbled again across the various mem*()
helper functions, which were implemented in assembler to avoid the
recursive calls [1] that can result from using the compiler's builtin
functions.
Convert the functions back to C using inline assemblies, which hopefully
makes them a bit more readable and maintainable. Also improve the memmove()
implementation by using the mvcrl instruction for the backward copy case.
Thanks,
Heiko
[1] commit 535c611ddd3e ("s390/string: provide asm lib functions for memcpy and memcmp")
[2] https://sashiko.dev/#/patchset/20260607162937.2927356-1-hca%40linux.ibm.com
[3] https://sashiko.dev/#/patchset/20260608053754.571282-1-hca%40linux.ibm.com
Heiko Carstens (9):
s390/purgatory: Enforce z10 minimum architecture level
s390: Add .noinstr.text to boot and purgatory linker scripts
s390/string: Add -ffreestanding compile option to string.o
s390/string: Convert memmove() to C
s390/string: Convert memset() to C
s390/string: Convert memcpy() to C
s390/string: Convert memset(16|32|64)() to C
s390/memmove: Optimize backward copy case
s390/tishift: Convert __ashlti3(), __ashrti3(), __lshrti3() to C
arch/s390/boot/Makefile | 7 +-
arch/s390/boot/mem.S | 2 -
arch/s390/boot/string.c | 6 +-
arch/s390/boot/vmlinux.lds.S | 1 +
arch/s390/include/asm/asm-prototypes.h | 4 -
arch/s390/lib/Makefile | 6 +-
arch/s390/lib/mem.S | 192 ----------------------
arch/s390/lib/string.c | 210 +++++++++++++++++++++++++
arch/s390/lib/tishift.S | 63 --------
arch/s390/lib/tishift.c | 64 ++++++++
arch/s390/lib/tishift.h | 9 ++
arch/s390/purgatory/Makefile | 9 +-
arch/s390/purgatory/purgatory.lds.S | 1 +
13 files changed, 301 insertions(+), 273 deletions(-)
delete mode 100644 arch/s390/boot/mem.S
delete mode 100644 arch/s390/lib/mem.S
delete mode 100644 arch/s390/lib/tishift.S
create mode 100644 arch/s390/lib/tishift.c
create mode 100644 arch/s390/lib/tishift.h
--
2.53.0
* [PATCH v3 1/9] s390/purgatory: Enforce z10 minimum architecture level
From: Heiko Carstens @ 2026-06-09 10:33 UTC (permalink / raw)
To: Alexander Gordeev, Sven Schnelle, Vasily Gorbik,
Christian Borntraeger, Juergen Christ
Cc: linux-s390
The purgatory code is compiled without the -march option. This means the
default architecture level of the compiler is used. This can cause
problems, e.g. if instructions used in inline assemblies are for a higher
architecture level than the default architecture level of the compiler.
Use z10 as minimum architecture level, similar to the boot code, to enforce
a defined architecture level set.
Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
---
arch/s390/purgatory/Makefile | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/s390/purgatory/Makefile b/arch/s390/purgatory/Makefile
index 95a8ac45b67e..f55764d0c49e 100644
--- a/arch/s390/purgatory/Makefile
+++ b/arch/s390/purgatory/Makefile
@@ -13,18 +13,20 @@ CFLAGS_sha256.o := -D__NO_FORTIFY
$(obj)/mem.o: $(srctree)/arch/s390/lib/mem.S FORCE
$(call if_changed_rule,as_o_S)
+CC_FLAGS_MARCH_MINIMUM := -march=z10
+
KBUILD_CFLAGS := $(CC_FLAGS_DIALECT) -fno-strict-aliasing -Wall -Wstrict-prototypes
KBUILD_CFLAGS += -Wno-pointer-sign -Wno-sign-compare
KBUILD_CFLAGS += -fno-zero-initialized-in-bss -fno-builtin -ffreestanding
KBUILD_CFLAGS += -Os -m64 -msoft-float -fno-common
KBUILD_CFLAGS += -fno-stack-protector
KBUILD_CFLAGS += -DDISABLE_BRANCH_PROFILING
-KBUILD_CFLAGS += -D__DISABLE_EXPORTS
+KBUILD_CFLAGS += $(CC_FLAGS_MARCH_MINIMUM) -D__DISABLE_EXPORTS
KBUILD_CFLAGS += $(CLANG_FLAGS)
KBUILD_CFLAGS += $(call cc-option,-fno-PIE)
KBUILD_CFLAGS += $(call cc-option, -Wno-default-const-init-unsafe)
KBUILD_AFLAGS := $(filter-out -DCC_USING_EXPOLINE,$(KBUILD_AFLAGS))
-KBUILD_AFLAGS += -D__DISABLE_EXPORTS
+KBUILD_AFLAGS += $(CC_FLAGS_MARCH_MINIMUM) -D__DISABLE_EXPORTS
# Since we link purgatory with -r unresolved symbols are not checked, so we
# also link a purgatory.chk binary without -r to check for unresolved symbols.
--
2.53.0
* [PATCH v3 2/9] s390: Add .noinstr.text to boot and purgatory linker scripts
From: Heiko Carstens @ 2026-06-09 10:33 UTC (permalink / raw)
To: Alexander Gordeev, Sven Schnelle, Vasily Gorbik,
Christian Borntraeger, Juergen Christ
Cc: linux-s390
Upcoming changes will result in a .noinstr.text section within the
boot and purgatory string.o binaries. Explicitly add the new section to
avoid orphan section warnings from the linker.
Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
---
arch/s390/boot/vmlinux.lds.S | 1 +
arch/s390/purgatory/purgatory.lds.S | 1 +
2 files changed, 2 insertions(+)
diff --git a/arch/s390/boot/vmlinux.lds.S b/arch/s390/boot/vmlinux.lds.S
index 070bc18babd0..d44964592541 100644
--- a/arch/s390/boot/vmlinux.lds.S
+++ b/arch/s390/boot/vmlinux.lds.S
@@ -31,6 +31,7 @@ SECTIONS
_text = .; /* Text */
*(.text)
*(.text.*)
+ *(.noinstr.text)
INIT_TEXT
_etext = . ;
}
diff --git a/arch/s390/purgatory/purgatory.lds.S b/arch/s390/purgatory/purgatory.lds.S
index 482eb4fbcef1..387d0db4085f 100644
--- a/arch/s390/purgatory/purgatory.lds.S
+++ b/arch/s390/purgatory/purgatory.lds.S
@@ -19,6 +19,7 @@ SECTIONS
_text = .; /* Text */
*(.text)
*(.text.*)
+ *(.noinstr.text)
_etext = . ;
}
.rodata : {
--
2.53.0
* [PATCH v3 3/9] s390/string: Add -ffreestanding compile option to string.o
From: Heiko Carstens @ 2026-06-09 10:33 UTC (permalink / raw)
To: Alexander Gordeev, Sven Schnelle, Vasily Gorbik,
Christian Borntraeger, Juergen Christ
Cc: linux-s390
Use -ffreestanding for string.o to prevent the compiler from turning the
implementations of standard library functions like memset() into calls to
themselves.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
---
arch/s390/boot/Makefile | 5 +++++
arch/s390/lib/Makefile | 5 +++++
2 files changed, 10 insertions(+)
diff --git a/arch/s390/boot/Makefile b/arch/s390/boot/Makefile
index a1e719a79d38..e1f82d118bc9 100644
--- a/arch/s390/boot/Makefile
+++ b/arch/s390/boot/Makefile
@@ -25,6 +25,11 @@ KBUILD_CFLAGS += $(call cc-option, -Wno-default-const-init-unsafe)
CFLAGS_sclp_early_core.o += -I$(srctree)/drivers/s390/char
+# string.o implements standard library functions like memset/memcpy etc.
+# Use -ffreestanding to ensure that the compiler does not try to "optimize"
+# them into calls to themselves.
+CFLAGS_string.o = -ffreestanding
+
obj-y := head.o als.o startup.o physmem_info.o ipl_parm.o ipl_report.o vmem.o
obj-y += string.o ebcdic.o sclp_early_core.o mem.o ipl_vmparm.o cmdline.o
obj-y += version.o pgm_check.o ctype.o ipl_data.o relocs.o alternative.o
diff --git a/arch/s390/lib/Makefile b/arch/s390/lib/Makefile
index 2bf47204f6ab..c82aedef0272 100644
--- a/arch/s390/lib/Makefile
+++ b/arch/s390/lib/Makefile
@@ -3,6 +3,11 @@
# Makefile for s390-specific library files..
#
+# string.o implements standard library functions like memset/memcpy etc.
+# Use -ffreestanding to ensure that the compiler does not try to "optimize"
+# them into calls to themselves.
+CFLAGS_string.o = -ffreestanding
+
lib-y += delay.o string.o uaccess.o find.o spinlock.o tishift.o
lib-y += csum-partial.o
obj-y += mem.o
--
2.53.0
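For illustration, a minimal sketch of the kind of code this flag is meant to protect. The function name sketch_memset() is hypothetical and not part of the patch. An optimizing compiler may recognize the loop as a memset idiom and replace it with a call to memset(); if the function itself were named memset(), that would be a call to itself.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical byte-wise fill, not taken from the patch.
 * A compiler that recognizes this loop as a "memset idiom" may emit a
 * call to memset() instead of the loop. Inside the real memset() that
 * would recurse forever, which is what the flag is meant to prevent.
 */
void *sketch_memset(void *s, int c, size_t n)
{
	unsigned char *p = s;

	assert(n == 0 || s);	/* sanity check for the sketch only */
	while (n--)
		*p++ = (unsigned char)c;
	return s;
}
```

Whether a given compiler performs this transformation depends on its version and optimization level, so this is only meant to show the shape of the problem.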
* [PATCH v3 4/9] s390/string: Convert memmove() to C
From: Heiko Carstens @ 2026-06-09 10:33 UTC (permalink / raw)
To: Alexander Gordeev, Sven Schnelle, Vasily Gorbik,
Christian Borntraeger, Juergen Christ
Cc: linux-s390
Convert memmove() from assembler to C, which should make it easier to
read and, if required, to change. It also allows the compiler to optimize
the code and to choose different instructions, except within the inline
assemblies.
Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
---
arch/s390/lib/mem.S | 41 -------------------------------------
arch/s390/lib/string.c | 46 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 46 insertions(+), 41 deletions(-)
diff --git a/arch/s390/lib/mem.S b/arch/s390/lib/mem.S
index d026debf250c..712b955ea9b4 100644
--- a/arch/s390/lib/mem.S
+++ b/arch/s390/lib/mem.S
@@ -11,47 +11,6 @@
GEN_BR_THUNK %r14
-/*
- * void *memmove(void *dest, const void *src, size_t n)
- */
-SYM_FUNC_START(__memmove)
- ltgr %r4,%r4
- lgr %r1,%r2
- jz .Lmemmove_exit
- aghi %r4,-1
- clgr %r2,%r3
- jnh .Lmemmove_forward
- la %r5,1(%r4,%r3)
- clgr %r2,%r5
- jl .Lmemmove_reverse
-.Lmemmove_forward:
- srlg %r0,%r4,8
- ltgr %r0,%r0
- jz .Lmemmove_forward_remainder
-.Lmemmove_forward_loop:
- mvc 0(256,%r1),0(%r3)
- la %r1,256(%r1)
- la %r3,256(%r3)
- brctg %r0,.Lmemmove_forward_loop
-.Lmemmove_forward_remainder:
- exrl %r4,.Lmemmove_mvc
-.Lmemmove_exit:
- BR_EX %r14
-.Lmemmove_reverse:
- ic %r0,0(%r4,%r3)
- stc %r0,0(%r4,%r1)
- brctg %r4,.Lmemmove_reverse
- ic %r0,0(%r4,%r3)
- stc %r0,0(%r4,%r1)
- BR_EX %r14
-.Lmemmove_mvc:
- mvc 0(1,%r1),0(%r3)
-SYM_FUNC_END(__memmove)
-EXPORT_SYMBOL(__memmove)
-
-SYM_FUNC_ALIAS(memmove, __memmove)
-EXPORT_SYMBOL(memmove)
-
/*
* memset implementation
*
diff --git a/arch/s390/lib/string.c b/arch/s390/lib/string.c
index 757f58960198..66286d486ef8 100644
--- a/arch/s390/lib/string.c
+++ b/arch/s390/lib/string.c
@@ -17,6 +17,52 @@
#include <linux/export.h>
#include <asm/asm.h>
+#define SYMBOL_FUNCTION_ALIAS(alias, name) \
+asm(".globl " __stringify(alias) "\n\t" \
+ ".set " __stringify(alias) "," __stringify(name))
+
+#ifdef __HAVE_ARCH_MEMMOVE
+noinstr void *__memmove(void *dest, const void *src, size_t n)
+{
+ const char *s = src;
+ char *d = dest;
+
+ if (!n)
+ return dest;
+ if ((d <= s || d >= s + n)) {
+ /* Forward copy */
+ while (n >= 256) {
+ asm volatile(
+ " mvc 0(256,%[d]),0(%[s])\n"
+ :
+ : [d] "a" (d), [s] "a" (s)
+ : "memory");
+ d += 256;
+ s += 256;
+ n -= 256;
+ }
+ if (n) {
+ asm volatile(
+ " exrl %[n],0f\n"
+ " j 1f\n"
+ "0: mvc 0(1,%[d]),0(%[s])\n"
+ "1:"
+ :
+ : [d] "a" (d), [s] "a" (s), [n] "a" (n - 1)
+ : "memory");
+ }
+ } else {
+ /* Backward copy */
+ while (n--)
+ d[n] = s[n];
+ }
+ return dest;
+}
+SYMBOL_FUNCTION_ALIAS(memmove, __memmove);
+EXPORT_SYMBOL(__memmove);
+EXPORT_SYMBOL(memmove);
+#endif
+
/*
* Helper functions to find the end of a string
*/
--
2.53.0
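The direction check in __memmove() above can be sketched in portable C as follows. This is only an illustration: sketch_memmove() is a hypothetical name, plain byte loops stand in for the mvc and exrl inline assemblies, and the addresses are compared as integers to keep the sketch within standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical portable sketch of the forward/backward decision. */
void *sketch_memmove(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;
	uintptr_t da = (uintptr_t)dest;
	uintptr_t sa = (uintptr_t)src;
	size_t i;

	assert(n == 0 || (dest && src));	/* sketch-only sanity check */
	/*
	 * A forward copy is safe if dest starts at or below src, or if
	 * dest starts at or beyond the end of src (d >= s + n). The
	 * second test is written as a difference to avoid overflow.
	 */
	if (da <= sa || da - sa >= n) {
		for (i = 0; i < n; i++)
			d[i] = s[i];
	} else {
		/* dest overlaps the tail of src: copy highest byte first */
		while (n--)
			d[n] = s[n];
	}
	return dest;
}
```

Note that a destination which starts exactly at s + n takes the forward path, which is the case the ">=" comparison in the patch covers.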
* [PATCH v3 5/9] s390/string: Convert memset() to C
From: Heiko Carstens @ 2026-06-09 10:33 UTC (permalink / raw)
To: Alexander Gordeev, Sven Schnelle, Vasily Gorbik,
Christian Borntraeger, Juergen Christ
Cc: linux-s390
Convert memset() from assembler to C, which should make it easier to
read and, if required, to change. It also allows the compiler to optimize
the code and to choose different instructions, except within the inline
assemblies.
Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
---
arch/s390/lib/mem.S | 63 ------------------------------------------
arch/s390/lib/string.c | 61 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 61 insertions(+), 63 deletions(-)
diff --git a/arch/s390/lib/mem.S b/arch/s390/lib/mem.S
index 712b955ea9b4..a27b103d7450 100644
--- a/arch/s390/lib/mem.S
+++ b/arch/s390/lib/mem.S
@@ -11,69 +11,6 @@
GEN_BR_THUNK %r14
-/*
- * memset implementation
- *
- * This code corresponds to the C construct below. We do distinguish
- * between clearing (c == 0) and setting a memory array (c != 0) simply
- * because nearly all memset invocations in the kernel clear memory and
- * the xc instruction is preferred in such cases.
- *
- * void *memset(void *s, int c, size_t n)
- * {
- * if (likely(c == 0))
- * return __builtin_memset(s, 0, n);
- * return __builtin_memset(s, c, n);
- * }
- */
-SYM_FUNC_START(__memset)
- ltgr %r4,%r4
- jz .Lmemset_exit
- ltgr %r3,%r3
- jnz .Lmemset_fill
- aghi %r4,-1
- srlg %r3,%r4,8
- ltgr %r3,%r3
- lgr %r1,%r2
- jz .Lmemset_clear_remainder
-.Lmemset_clear_loop:
- xc 0(256,%r1),0(%r1)
- la %r1,256(%r1)
- brctg %r3,.Lmemset_clear_loop
-.Lmemset_clear_remainder:
- exrl %r4,.Lmemset_xc
-.Lmemset_exit:
- BR_EX %r14
-.Lmemset_fill:
- cghi %r4,1
- lgr %r1,%r2
- je .Lmemset_fill_exit
- aghi %r4,-2
- srlg %r5,%r4,8
- ltgr %r5,%r5
- jz .Lmemset_fill_remainder
-.Lmemset_fill_loop:
- stc %r3,0(%r1)
- mvc 1(255,%r1),0(%r1)
- la %r1,256(%r1)
- brctg %r5,.Lmemset_fill_loop
-.Lmemset_fill_remainder:
- stc %r3,0(%r1)
- exrl %r4,.Lmemset_mvc
- BR_EX %r14
-.Lmemset_fill_exit:
- stc %r3,0(%r1)
- BR_EX %r14
-.Lmemset_xc:
- xc 0(1,%r1),0(%r1)
-.Lmemset_mvc:
- mvc 1(1,%r1),0(%r1)
-SYM_FUNC_END(__memset)
-EXPORT_SYMBOL(__memset)
-
-SYM_FUNC_ALIAS(memset, __memset)
-EXPORT_SYMBOL(memset)
-
/*
* memcpy implementation
*
diff --git a/arch/s390/lib/string.c b/arch/s390/lib/string.c
index 66286d486ef8..ff9c4b57b6f1 100644
--- a/arch/s390/lib/string.c
+++ b/arch/s390/lib/string.c
@@ -63,6 +63,67 @@ EXPORT_SYMBOL(__memmove);
EXPORT_SYMBOL(memmove);
#endif
+#ifdef __HAVE_ARCH_MEMSET
+noinstr void *__memset(void *s, int c, size_t n)
+{
+ char *xs = s;
+
+ if (!n)
+ return s;
+ if (!c) {
+ /* Clear memory */
+ while (n >= 256) {
+ asm volatile(
+ " xc 0(256,%[xs]),0(%[xs])"
+ :
+ : [xs] "a" (xs)
+ : "cc", "memory");
+ xs += 256;
+ n -= 256;
+ }
+ if (!n)
+ return s;
+ asm volatile(
+ " exrl %[n],0f\n"
+ " j 1f\n"
+ "0: xc 0(1,%[xs]),0(%[xs])\n"
+ "1:"
+ :
+ : [xs] "a" (xs), [n] "a" (n - 1)
+ : "cc", "memory");
+ } else {
+ /* Fill memory */
+ while (n >= 256) {
+ *xs = c;
+ asm volatile(
+ " mvc 1(255,%[xs]),0(%[xs])"
+ :
+ : [xs] "a" (xs)
+ : "memory");
+ xs += 256;
+ n -= 256;
+ }
+ if (!n)
+ return s;
+ *xs = c;
+ if (n == 1)
+ return s;
+ asm volatile(
+ " exrl %[n],0f\n"
+ " j 1f\n"
+ "0: mvc 1(1,%[xs]),0(%[xs])\n"
+ "1:"
+ :
+ : [xs] "a" (xs), [n] "a" (n - 2)
+ : "memory");
+ }
+ return s;
+}
+SYMBOL_FUNCTION_ALIAS(memset, __memset);
+EXPORT_SYMBOL(__memset);
+EXPORT_SYMBOL(memset);
+#endif
+
/*
* Helper functions to find the end of a string
*/
--
2.53.0
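The fill path above stores one byte and then uses mvc with a destination that starts one byte after the source. Since mvc processes its operands byte by byte from left to right, the first byte is propagated through the whole block. A portable sketch of that effect, with the hypothetical name sketch_fill_block() and a plain loop instead of the mvc instruction:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: emulate "*xs = c; mvc 1(len-1,xs),0(xs)".
 * Each byte is copied from its left neighbour, so the value stored
 * in xs[0] ripples through the block.
 */
void sketch_fill_block(unsigned char *xs, int c, size_t len)
{
	size_t i;

	assert(len >= 1 && len <= 256);	/* one mvc handles at most 256 bytes */
	xs[0] = (unsigned char)c;
	for (i = 1; i < len; i++)
		xs[i] = xs[i - 1];
}
```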
* [PATCH v3 6/9] s390/string: Convert memcpy() to C
From: Heiko Carstens @ 2026-06-09 10:33 UTC (permalink / raw)
To: Alexander Gordeev, Sven Schnelle, Vasily Gorbik,
Christian Borntraeger, Juergen Christ
Cc: linux-s390
Convert memcpy() from assembler to C, which should make it easier to
read and, if required, to change. It also allows the compiler to optimize
the code and to choose different instructions, except within the inline
assemblies.
Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
---
arch/s390/lib/mem.S | 31 -------------------------------
arch/s390/lib/string.c | 34 ++++++++++++++++++++++++++++++++++
2 files changed, 34 insertions(+), 31 deletions(-)
diff --git a/arch/s390/lib/mem.S b/arch/s390/lib/mem.S
index a27b103d7450..d2e1ca87a568 100644
--- a/arch/s390/lib/mem.S
+++ b/arch/s390/lib/mem.S
@@ -11,37 +11,6 @@
GEN_BR_THUNK %r14
-/*
- * memcpy implementation
- *
- * void *memcpy(void *dest, const void *src, size_t n)
- */
-SYM_FUNC_START(__memcpy)
- ltgr %r4,%r4
- jz .Lmemcpy_exit
- aghi %r4,-1
- srlg %r5,%r4,8
- ltgr %r5,%r5
- lgr %r1,%r2
- jnz .Lmemcpy_loop
-.Lmemcpy_remainder:
- exrl %r4,.Lmemcpy_mvc
-.Lmemcpy_exit:
- BR_EX %r14
-.Lmemcpy_loop:
- mvc 0(256,%r1),0(%r3)
- la %r1,256(%r1)
- la %r3,256(%r3)
- brctg %r5,.Lmemcpy_loop
- j .Lmemcpy_remainder
-.Lmemcpy_mvc:
- mvc 0(1,%r1),0(%r3)
-SYM_FUNC_END(__memcpy)
-EXPORT_SYMBOL(__memcpy)
-
-SYM_FUNC_ALIAS(memcpy, __memcpy)
-EXPORT_SYMBOL(memcpy)
-
/*
* __memset16/32/64
*
diff --git a/arch/s390/lib/string.c b/arch/s390/lib/string.c
index ff9c4b57b6f1..4dd524cdef5f 100644
--- a/arch/s390/lib/string.c
+++ b/arch/s390/lib/string.c
@@ -124,6 +124,40 @@ EXPORT_SYMBOL(__memset);
EXPORT_SYMBOL(memset);
#endif
+#ifdef __HAVE_ARCH_MEMCPY
+noinstr void *__memcpy(void *dest, const void *src, size_t n)
+{
+ void *d = dest;
+
+ if (!n)
+ return d;
+ while (n >= 256) {
+ asm volatile(
+ " mvc 0(256,%[dest]),0(%[src])"
+ :
+ : [dest] "a" (dest), [src] "a" (src)
+ : "memory");
+ dest += 256;
+ src += 256;
+ n -= 256;
+ }
+ if (!n)
+ return d;
+ asm volatile(
+ " exrl %[n],1f\n"
+ " j 2f\n"
+ "1: mvc 0(1,%[dest]),0(%[src])\n"
+ "2:"
+ :
+ : [dest] "a" (dest), [src] "a" (src), [n] "a" (n - 1)
+ : "memory");
+ return d;
+}
+SYMBOL_FUNCTION_ALIAS(memcpy, __memcpy);
+EXPORT_SYMBOL(__memcpy);
+EXPORT_SYMBOL(memcpy);
+#endif
+
/*
* Helper functions to find the end of a string
*/
--
2.53.0
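The way __memcpy() above splits a request into full 256 byte mvc blocks plus one executed mvc for the remainder can be sketched like this. The length field of mvc holds the byte count minus one, and exrl ORs the low byte of its register operand into that field, which is why the patch passes n - 1. The struct and function names below are hypothetical and only illustrate the arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of how a copy of n bytes is carried out. */
struct copy_plan {
	size_t blocks;	/* number of full 256 byte mvc instructions */
	int rem_code;	/* length code for the executed mvc, -1 if none */
};

struct copy_plan sketch_copy_plan(size_t n)
{
	struct copy_plan p;
	size_t rem = n % 256;

	p.blocks = n / 256;
	/* mvc encodes "bytes - 1", so 1..255 bytes map to codes 0..254 */
	p.rem_code = rem ? (int)rem - 1 : -1;
	assert(p.rem_code >= -1 && p.rem_code <= 254);
	return p;
}
```

For example, a 513 byte copy would be two full blocks followed by an executed mvc with length code 0, that is, one byte.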
* [PATCH v3 7/9] s390/string: Convert memset(16|32|64)() to C
From: Heiko Carstens @ 2026-06-09 10:33 UTC (permalink / raw)
To: Alexander Gordeev, Sven Schnelle, Vasily Gorbik,
Christian Borntraeger, Juergen Christ
Cc: linux-s390
Convert memset(16|32|64)() from assembler to C, which should make it
easier to read and, if required, to change. It also allows the compiler
to optimize the code and to choose different instructions, except within
the inline assemblies.
Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
---
arch/s390/boot/Makefile | 2 +-
arch/s390/boot/mem.S | 2 --
arch/s390/boot/string.c | 6 +---
arch/s390/lib/Makefile | 1 -
arch/s390/lib/mem.S | 57 ------------------------------------
arch/s390/lib/string.c | 47 +++++++++++++++++++++++++++++
arch/s390/purgatory/Makefile | 5 +---
7 files changed, 50 insertions(+), 70 deletions(-)
delete mode 100644 arch/s390/boot/mem.S
delete mode 100644 arch/s390/lib/mem.S
diff --git a/arch/s390/boot/Makefile b/arch/s390/boot/Makefile
index e1f82d118bc9..10b75e053a6f 100644
--- a/arch/s390/boot/Makefile
+++ b/arch/s390/boot/Makefile
@@ -31,7 +31,7 @@ CFLAGS_sclp_early_core.o += -I$(srctree)/drivers/s390/char
CFLAGS_string.o = -ffreestanding
obj-y := head.o als.o startup.o physmem_info.o ipl_parm.o ipl_report.o vmem.o
-obj-y += string.o ebcdic.o sclp_early_core.o mem.o ipl_vmparm.o cmdline.o
+obj-y += string.o ebcdic.o sclp_early_core.o ipl_vmparm.o cmdline.o
obj-y += version.o pgm_check.o ctype.o ipl_data.o relocs.o alternative.o
obj-y += uv.o printk.o trampoline.o
obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
diff --git a/arch/s390/boot/mem.S b/arch/s390/boot/mem.S
deleted file mode 100644
index b33463633f03..000000000000
--- a/arch/s390/boot/mem.S
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include "../lib/mem.S"
diff --git a/arch/s390/boot/string.c b/arch/s390/boot/string.c
index bd68161434a6..e4ad196cb720 100644
--- a/arch/s390/boot/string.c
+++ b/arch/s390/boot/string.c
@@ -43,11 +43,7 @@ ssize_t sized_strscpy(char *dst, const char *src, size_t count)
void *memset64(uint64_t *s, uint64_t v, size_t count)
{
- uint64_t *xs = s;
-
- while (count--)
- *xs++ = v;
- return s;
+ return __memset64(s, v, count * sizeof(v));
}
char *skip_spaces(const char *str)
diff --git a/arch/s390/lib/Makefile b/arch/s390/lib/Makefile
index c82aedef0272..aa6cc6a1fe88 100644
--- a/arch/s390/lib/Makefile
+++ b/arch/s390/lib/Makefile
@@ -10,7 +10,6 @@ CFLAGS_string.o = -ffreestanding
lib-y += delay.o string.o uaccess.o find.o spinlock.o tishift.o
lib-y += csum-partial.o
-obj-y += mem.o
lib-$(CONFIG_KPROBES) += probes.o
lib-$(CONFIG_UPROBES) += probes.o
obj-$(CONFIG_S390_KPROBES_SANITY_TEST) += test_kprobes_s390.o
diff --git a/arch/s390/lib/mem.S b/arch/s390/lib/mem.S
deleted file mode 100644
index d2e1ca87a568..000000000000
--- a/arch/s390/lib/mem.S
+++ /dev/null
@@ -1,57 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * String handling functions.
- *
- * Copyright IBM Corp. 2012
- */
-
-#include <linux/export.h>
-#include <linux/linkage.h>
-#include <asm/nospec-insn.h>
-
- GEN_BR_THUNK %r14
-
-/*
- * __memset16/32/64
- *
- * void *__memset16(uint16_t *s, uint16_t v, size_t count)
- * void *__memset32(uint32_t *s, uint32_t v, size_t count)
- * void *__memset64(uint64_t *s, uint64_t v, size_t count)
- */
-.macro __MEMSET bits,bytes,insn
-SYM_FUNC_START(__memset\bits)
- ltgr %r4,%r4
- jz .L__memset_exit\bits
- cghi %r4,\bytes
- je .L__memset_store\bits
- aghi %r4,-(\bytes+1)
- srlg %r5,%r4,8
- ltgr %r5,%r5
- lgr %r1,%r2
- jz .L__memset_remainder\bits
-.L__memset_loop\bits:
- \insn %r3,0(%r1)
- mvc \bytes(256-\bytes,%r1),0(%r1)
- la %r1,256(%r1)
- brctg %r5,.L__memset_loop\bits
-.L__memset_remainder\bits:
- \insn %r3,0(%r1)
- exrl %r4,.L__memset_mvc\bits
- BR_EX %r14
-.L__memset_store\bits:
- \insn %r3,0(%r2)
-.L__memset_exit\bits:
- BR_EX %r14
-.L__memset_mvc\bits:
- mvc \bytes(1,%r1),0(%r1)
-SYM_FUNC_END(__memset\bits)
-.endm
-
-__MEMSET 16,2,sth
-EXPORT_SYMBOL(__memset16)
-
-__MEMSET 32,4,st
-EXPORT_SYMBOL(__memset32)
-
-__MEMSET 64,8,stg
-EXPORT_SYMBOL(__memset64)
diff --git a/arch/s390/lib/string.c b/arch/s390/lib/string.c
index 4dd524cdef5f..2f9e9e886016 100644
--- a/arch/s390/lib/string.c
+++ b/arch/s390/lib/string.c
@@ -158,6 +158,53 @@ EXPORT_SYMBOL(__memcpy);
EXPORT_SYMBOL(memcpy);
#endif
+#define DEFINE_MEMSET(_bits, _bytes, _type) \
+void *__memset##_bits(_type *s, _type v, size_t n) \
+{ \
+ _type *xs = s; \
+ \
+ if (!n) \
+ return s; \
+ while (n >= 256) { \
+ *xs = v; \
+ asm volatile( \
+ " mvc %[_b](256-%[_b],%[xs]),0(%[xs])\n" \
+ : \
+ : [xs] "a" (xs), [_b] "i" (_bytes) \
+ : "memory"); \
+ xs = (_type *)((char *)xs + 256); \
+ n -= 256; \
+ } \
+ if (!n) \
+ return s; \
+ *xs = v; \
+ if (n == _bytes) \
+ return s; \
+ n -= _bytes + 1; \
+ asm volatile( \
+ " exrl %[n],1f\n" \
+ " j 2f\n" \
+ "1: mvc %[_b](1,%[xs]),0(%[xs])\n" \
+ "2:" \
+ : \
+ : [n] "a" (n), [xs] "a" (xs), [_b] "i" (_bytes) \
+ : "memory"); \
+ return s; \
+} \
+EXPORT_SYMBOL(__memset##_bits)
+
+#ifdef __HAVE_ARCH_MEMSET16
+DEFINE_MEMSET(16, 2, uint16_t);
+#endif
+
+#ifdef __HAVE_ARCH_MEMSET32
+DEFINE_MEMSET(32, 4, uint32_t);
+#endif
+
+#ifdef __HAVE_ARCH_MEMSET64
+DEFINE_MEMSET(64, 8, uint64_t);
+#endif
+
/*
* Helper functions to find the end of a string
*/
diff --git a/arch/s390/purgatory/Makefile b/arch/s390/purgatory/Makefile
index f55764d0c49e..e74410bb1b88 100644
--- a/arch/s390/purgatory/Makefile
+++ b/arch/s390/purgatory/Makefile
@@ -1,6 +1,6 @@
# SPDX-License-Identifier: GPL-2.0
-purgatory-y := head.o purgatory.o string.o sha256.o mem.o
+purgatory-y := head.o purgatory.o string.o sha256.o
targets += $(purgatory-y) purgatory.lds purgatory purgatory.chk purgatory.ro
PURGATORY_OBJS = $(addprefix $(obj)/,$(purgatory-y))
@@ -10,9 +10,6 @@ $(obj)/sha256.o: $(srctree)/lib/crypto/sha256.c FORCE
CFLAGS_sha256.o := -D__NO_FORTIFY
-$(obj)/mem.o: $(srctree)/arch/s390/lib/mem.S FORCE
- $(call if_changed_rule,as_o_S)
-
CC_FLAGS_MARCH_MINIMUM := -march=z10
KBUILD_CFLAGS := $(CC_FLAGS_DIALECT) -fno-strict-aliasing -Wall -Wstrict-prototypes
--
2.53.0
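A portable sketch of what DEFINE_MEMSET() above does for the 64 bit case: store the pattern once, then let an overlapping left to right copy with an offset of sizeof(v) repeat it. As in the patch, n is a byte count rather than an element count, which is why the boot code wrapper passes count * sizeof(v). The name sketch_memset64() is hypothetical, the 256 byte chunking is collapsed into a single loop, and the requirement that n is a multiple of the element size is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: n is the size in bytes, not the element count. */
void *sketch_memset64(uint64_t *s, uint64_t v, size_t n)
{
	unsigned char *xs = (unsigned char *)s;
	size_t i;

	assert(n % sizeof(v) == 0);	/* assumption made by this sketch */
	if (!n)
		return s;
	*s = v;				/* store the pattern once */
	/* emulate the overlapping mvc: copy from sizeof(v) bytes back */
	for (i = sizeof(v); i < n; i++)
		xs[i] = xs[i - sizeof(v)];
	return s;
}
```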
* [PATCH v3 8/9] s390/memmove: Optimize backward copy case
From: Heiko Carstens @ 2026-06-09 10:33 UTC (permalink / raw)
To: Alexander Gordeev, Sven Schnelle, Vasily Gorbik,
Christian Borntraeger, Juergen Christ
Cc: linux-s390
memmove() copies bytewise in the backward copy case, where the mvc
instruction cannot be used. This is quite slow, but can be optimized
with the mvcrl instruction, which is available since z15.
Some numbers (measured on a shared z16 LPAR) show that the new
implementation is nearly always faster, except for the unrealistic
one and two byte cases:
size    old     new
   1    2ns     3ns
   2    4ns     5ns
   4    5ns     5ns
   8    8ns     5ns
  16   12ns     6ns
  32    8ns     7ns
  64   15ns     7ns
 128   31ns     9ns
 256   64ns    10ns
 512  129ns    18ns
1024  250ns    19ns
2048  498ns    38ns
Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
---
arch/s390/lib/string.c | 24 +++++++++++++++++++++++-
1 file changed, 23 insertions(+), 1 deletion(-)
diff --git a/arch/s390/lib/string.c b/arch/s390/lib/string.c
index 2f9e9e886016..e93e1acd2ade 100644
--- a/arch/s390/lib/string.c
+++ b/arch/s390/lib/string.c
@@ -15,6 +15,7 @@
#include <linux/types.h>
#include <linux/string.h>
#include <linux/export.h>
+#include <asm/facility.h>
#include <asm/asm.h>
#define SYMBOL_FUNCTION_ALIAS(alias, name) \
@@ -51,8 +52,29 @@ noinstr void *__memmove(void *dest, const void *src, size_t n)
: [d] "a" (d), [s] "a" (s), [n] "a" (n - 1)
: "memory");
}
+ return dest;
+ }
+ /* Backward copy */
+ if (test_facility(61)) {
+ /* Use mvcrl instruction if available */
+ while (n >= 256) {
+ asm volatile(
+ " lghi %%r0,255\n"
+ " .insn sse,0xe50a00000000,%[d],%[s]\n"
+ : [d] "=Q" (*(d + n - 256))
+ : [s] "Q" (*(s + n - 256))
+ : "0", "memory");
+ n -= 256;
+ }
+ if (n) {
+ asm volatile(
+ " lgr %%r0,%[n]\n"
+ " .insn sse,0xe50a00000000,%[d],%[s]\n"
+ : [d] "=Q" (*d)
+ : [s] "Q" (*s), [n] "d" (n - 1)
+ : "0", "memory");
+ }
} else {
- /* Backward copy */
while (n--)
d[n] = s[n];
}
--
2.53.0
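The chunk order used by the mvcrl path above can be sketched in portable C. The copy starts with the highest 256 byte chunk and works its way down, and each chunk is itself copied from right to left, so an overlapping destination above the source never overwrites source bytes that are still needed. The function names are hypothetical, and a byte loop stands in for the mvcrl instruction, which takes its length code (byte count minus one) from %r0.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_CHUNK 256

/* Stand-in for one mvcrl: copy len bytes, highest address first. */
static void sketch_copy_right_to_left(unsigned char *d,
				      const unsigned char *s, size_t len)
{
	assert(len >= 1 && len <= SKETCH_CHUNK);
	while (len--)
		d[len] = s[len];
}

/* Hypothetical sketch of the backward copy loop structure. */
void *sketch_backward_copy(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	/* full chunks, starting with the one at the end of the buffers */
	while (n >= SKETCH_CHUNK) {
		sketch_copy_right_to_left(d + n - SKETCH_CHUNK,
					  s + n - SKETCH_CHUNK, SKETCH_CHUNK);
		n -= SKETCH_CHUNK;
	}
	/* the remaining bytes at the start of the buffers */
	if (n)
		sketch_copy_right_to_left(d, s, n);
	return dest;
}
```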
* [PATCH v3 9/9] s390/tishift: Convert __ashlti3(), __ashrti3(), __lshrti3() to C
From: Heiko Carstens @ 2026-06-09 10:33 UTC (permalink / raw)
To: Alexander Gordeev, Sven Schnelle, Vasily Gorbik,
Christian Borntraeger, Juergen Christ
Cc: linux-s390
There is no reason to have __ashlti3(), __ashrti3(), and __lshrti3()
implemented in assembler. Convert them all to C, which allows the compiler
to optimize the code if newer instructions allow that.
Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
---
arch/s390/include/asm/asm-prototypes.h | 4 --
arch/s390/lib/tishift.S | 63 -------------------------
arch/s390/lib/tishift.c | 64 ++++++++++++++++++++++++++
arch/s390/lib/tishift.h | 9 ++++
4 files changed, 73 insertions(+), 67 deletions(-)
delete mode 100644 arch/s390/lib/tishift.S
create mode 100644 arch/s390/lib/tishift.c
create mode 100644 arch/s390/lib/tishift.h
diff --git a/arch/s390/include/asm/asm-prototypes.h b/arch/s390/include/asm/asm-prototypes.h
index 7bd1801cf241..d4da4436d02b 100644
--- a/arch/s390/include/asm/asm-prototypes.h
+++ b/arch/s390/include/asm/asm-prototypes.h
@@ -8,8 +8,4 @@
#include <asm/nospec-branch.h>
#include <asm-generic/asm-prototypes.h>
-__int128_t __ashlti3(__int128_t a, int b);
-__int128_t __ashrti3(__int128_t a, int b);
-__int128_t __lshrti3(__int128_t a, int b);
-
#endif /* _ASM_S390_PROTOTYPES_H */
diff --git a/arch/s390/lib/tishift.S b/arch/s390/lib/tishift.S
deleted file mode 100644
index 96214f51f49b..000000000000
--- a/arch/s390/lib/tishift.S
+++ /dev/null
@@ -1,63 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-
-#include <linux/export.h>
-#include <linux/linkage.h>
-#include <asm/nospec-insn.h>
-
- .section .noinstr.text, "ax"
-
- GEN_BR_THUNK %r14
-
-SYM_FUNC_START(__ashlti3)
- lmg %r0,%r1,0(%r3)
- cije %r4,0,1f
- lhi %r3,64
- sr %r3,%r4
- jnh 0f
- srlg %r3,%r1,0(%r3)
- sllg %r0,%r0,0(%r4)
- sllg %r1,%r1,0(%r4)
- ogr %r0,%r3
- j 1f
-0: sllg %r0,%r1,-64(%r4)
- lghi %r1,0
-1: stmg %r0,%r1,0(%r2)
- BR_EX %r14
-SYM_FUNC_END(__ashlti3)
-EXPORT_SYMBOL(__ashlti3)
-
-SYM_FUNC_START(__ashrti3)
- lmg %r0,%r1,0(%r3)
- cije %r4,0,1f
- lhi %r3,64
- sr %r3,%r4
- jnh 0f
- sllg %r3,%r0,0(%r3)
- srlg %r1,%r1,0(%r4)
- srag %r0,%r0,0(%r4)
- ogr %r1,%r3
- j 1f
-0: srag %r1,%r0,-64(%r4)
- srag %r0,%r0,63
-1: stmg %r0,%r1,0(%r2)
- BR_EX %r14
-SYM_FUNC_END(__ashrti3)
-EXPORT_SYMBOL(__ashrti3)
-
-SYM_FUNC_START(__lshrti3)
- lmg %r0,%r1,0(%r3)
- cije %r4,0,1f
- lhi %r3,64
- sr %r3,%r4
- jnh 0f
- sllg %r3,%r0,0(%r3)
- srlg %r1,%r1,0(%r4)
- srlg %r0,%r0,0(%r4)
- ogr %r1,%r3
- j 1f
-0: srlg %r1,%r0,-64(%r4)
- lghi %r0,0
-1: stmg %r0,%r1,0(%r2)
- BR_EX %r14
-SYM_FUNC_END(__lshrti3)
-EXPORT_SYMBOL(__lshrti3)
diff --git a/arch/s390/lib/tishift.c b/arch/s390/lib/tishift.c
new file mode 100644
index 000000000000..bb16cf639af3
--- /dev/null
+++ b/arch/s390/lib/tishift.c
@@ -0,0 +1,64 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/export.h>
+#include <linux/types.h>
+#include "tishift.h"
+
+union ti {
+ __int128_t val;
+ struct {
+ u64 high;
+ u64 low;
+ };
+};
+
+noinstr __int128_t __ashlti3(__int128_t a, int shift)
+{
+ union ti ti = { .val = a };
+
+ if (!shift)
+ return ti.val;
+ if (shift < 64) {
+ ti.high = (ti.high << shift) | (ti.low >> (64 - shift));
+ ti.low = ti.low << shift;
+ } else {
+ ti.high = ti.low << (shift - 64);
+ ti.low = 0;
+ }
+ return ti.val;
+}
+EXPORT_SYMBOL(__ashlti3);
+
+noinstr __int128_t __ashrti3(__int128_t a, int shift)
+{
+ union ti ti = { .val = a };
+
+ if (!shift)
+ return ti.val;
+ if (shift < 64) {
+ ti.low = (ti.low >> shift) | (ti.high << (64 - shift));
+ ti.high = (int64_t)ti.high >> shift;
+ } else {
+ ti.low = (int64_t)ti.high >> (shift - 64);
+ ti.high = (int64_t)ti.high >> 63;
+ }
+ return ti.val;
+}
+EXPORT_SYMBOL(__ashrti3);
+
+noinstr __int128_t __lshrti3(__int128_t a, int shift)
+{
+ union ti ti = { .val = a };
+
+ if (!shift)
+ return ti.val;
+ if (shift < 64) {
+ ti.low = (ti.low >> shift) | (ti.high << (64 - shift));
+ ti.high = ti.high >> shift;
+ } else {
+ ti.low = ti.high >> (shift - 64);
+ ti.high = 0;
+ }
+ return ti.val;
+}
+EXPORT_SYMBOL(__lshrti3);
diff --git a/arch/s390/lib/tishift.h b/arch/s390/lib/tishift.h
new file mode 100644
index 000000000000..43a9b8c8e545
--- /dev/null
+++ b/arch/s390/lib/tishift.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _S390_LIB_TISHIFT_H
+#define _S390_LIB_TISHIFT_H
+
+__int128_t __ashlti3(__int128_t a, int b);
+__int128_t __ashrti3(__int128_t a, int b);
+__int128_t __lshrti3(__int128_t a, int b);
+
+#endif /* _S390_LIB_TISHIFT_H */
--
2.53.0
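
For readers who want to sanity-check the case split used by the new C
helpers outside the kernel, here is a minimal host-side sketch. It is not
part of the patch. The union ti in the patch relies on s390 being
big-endian, so that "high" overlays the most significant half; the sketch
instead splits the value arithmetically so it also runs on little-endian
hosts. The names ti_parts, ti_split(), ti_join(), model_*() and
tishift_count_mismatches() are made up for illustration, and the sketch
assumes GCC or Clang with __int128 support and an arithmetic right shift
for signed values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: explicit halves, independent of host byte order. */
struct ti_parts {
	uint64_t high;
	uint64_t low;
};

static struct ti_parts ti_split(unsigned __int128 v)
{
	struct ti_parts p = { .high = (uint64_t)(v >> 64), .low = (uint64_t)v };

	return p;
}

static unsigned __int128 ti_join(struct ti_parts p)
{
	return ((unsigned __int128)p.high << 64) | p.low;
}

/* Same case split as __ashlti3() in the patch; valid for 0 <= shift <= 127. */
static unsigned __int128 model_ashl(unsigned __int128 a, int shift)
{
	struct ti_parts p = ti_split(a);

	if (!shift)
		return a;
	if (shift < 64) {
		p.high = (p.high << shift) | (p.low >> (64 - shift));
		p.low <<= shift;
	} else {
		p.high = p.low << (shift - 64);
		p.low = 0;
	}
	return ti_join(p);
}

/* Same case split as __lshrti3(): zeros are shifted in from the top. */
static unsigned __int128 model_lshr(unsigned __int128 a, int shift)
{
	struct ti_parts p = ti_split(a);

	if (!shift)
		return a;
	if (shift < 64) {
		p.low = (p.low >> shift) | (p.high << (64 - shift));
		p.high >>= shift;
	} else {
		p.low = p.high >> (shift - 64);
		p.high = 0;
	}
	return ti_join(p);
}

/* Same case split as __ashrti3(): the sign bit is replicated from the top. */
static unsigned __int128 model_ashr(unsigned __int128 a, int shift)
{
	struct ti_parts p = ti_split(a);

	if (!shift)
		return a;
	if (shift < 64) {
		p.low = (p.low >> shift) | (p.high << (64 - shift));
		p.high = (uint64_t)((int64_t)p.high >> shift);
	} else {
		/* low must be computed before high is overwritten */
		p.low = (uint64_t)((int64_t)p.high >> (shift - 64));
		p.high = (uint64_t)((int64_t)p.high >> 63);
	}
	return ti_join(p);
}

/* Compare the models against the compiler's native 128-bit shifts. */
int tishift_count_mismatches(void)
{
	const unsigned __int128 samples[] = {
		0,
		1,
		((unsigned __int128)0x8000000000000000ULL << 64) | 1,
		((unsigned __int128)0x0123456789abcdefULL << 64) | 0xfedcba9876543210ULL,
		~(unsigned __int128)0,
	};
	int mismatches = 0;

	for (unsigned int i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		for (int shift = 0; shift < 128; shift++) {
			unsigned __int128 a = samples[i];

			mismatches += model_ashl(a, shift) != (a << shift);
			mismatches += model_lshr(a, shift) != (a >> shift);
			mismatches += model_ashr(a, shift) !=
				      (unsigned __int128)((__int128)a >> shift);
		}
	}
	return mismatches;
}

/* Convenience wrapper: aborts if any sample/shift combination disagrees. */
void tishift_self_check(void)
{
	assert(tishift_count_mismatches() == 0);
}
```

Calling tishift_self_check() from a small test program exercises every
shift count from 0 to 127 on a handful of sample values. Shift counts
outside that range are deliberately not covered, since the native
__int128 shift is undefined there.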
* Re: [PATCH v3 3/9] s390/string: Add -ffreestanding compile option to string.o
2026-06-09 10:33 ` [PATCH v3 3/9] s390/string: Add -ffreestanding compile option to string.o Heiko Carstens
@ 2026-06-09 12:23 ` Juergen Christ
2026-06-09 13:39 ` Heiko Carstens
0 siblings, 1 reply; 12+ messages in thread
From: Juergen Christ @ 2026-06-09 12:23 UTC (permalink / raw)
To: Heiko Carstens
Cc: Alexander Gordeev, Sven Schnelle, Vasily Gorbik,
Christian Borntraeger, linux-s390
> Use -ffreestanding for string.o to avoid that the compiler generates
> calls into themselves for standard library functions like memset().
>
> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
> ---
> arch/s390/boot/Makefile | 5 +++++
> arch/s390/lib/Makefile | 5 +++++
> 2 files changed, 10 insertions(+)
>
> diff --git a/arch/s390/boot/Makefile b/arch/s390/boot/Makefile
> index a1e719a79d38..e1f82d118bc9 100644
> --- a/arch/s390/boot/Makefile
> +++ b/arch/s390/boot/Makefile
> @@ -25,6 +25,11 @@ KBUILD_CFLAGS += $(call cc-option, -Wno-default-const-init-unsafe)
>
> CFLAGS_sclp_early_core.o += -I$(srctree)/drivers/s390/char
>
> +# string.o implements standard library functions like memset/memcpy etc.
> +# Use -ffreestanding to ensure that the compiler does not try to "optimize"
> +# them into calls to themselves.
> +CFLAGS_string.o = -ffreestanding
> +
Other places use simply expanded variables instead of recursively
expanded variables for CFLAGS or LDFLAGS. Is this an issue here?

Otherwise, -ffreestanding turns on -fno-builtin which then turns off
-ftree-loop-distribute-patterns which would detect the
memset/memcpy/memmove loops in GCC. So that is one way to make sure
this does not happen.
Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
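
To illustrate the loop idiom discussed above, here is a minimal sketch of
a byte-wise fill loop of the kind the compiler may recognize. It is not
taken from the patch. The function is deliberately named my_memset(), a
made-up name, so that it can be built as an ordinary hosted program
without clashing with the C library.

```c
#include <stddef.h>

/*
 * Hypothetical byte-wise fill. When built with optimization and without
 * -ffreestanding (or -fno-tree-loop-distribute-patterns), GCC may
 * recognize the loop below as a memset() idiom and emit a call to
 * memset() in its place. If this function were itself named memset(),
 * that call would recurse into the function being compiled.
 */
void *my_memset(void *dst, int c, size_t n)
{
	unsigned char *p = dst;

	while (n--)
		*p++ = (unsigned char)c;
	return dst;
}
```

Whether a given compiler version actually performs the transformation can
be checked by compiling with and without the option and looking at the
generated assembly (for example with gcc -O2 -S) for a call to memset.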
* Re: [PATCH v3 3/9] s390/string: Add -ffreestanding compile option to string.o
2026-06-09 12:23 ` Juergen Christ
@ 2026-06-09 13:39 ` Heiko Carstens
0 siblings, 0 replies; 12+ messages in thread
From: Heiko Carstens @ 2026-06-09 13:39 UTC (permalink / raw)
To: Juergen Christ
Cc: Alexander Gordeev, Sven Schnelle, Vasily Gorbik,
Christian Borntraeger, linux-s390
On Tue, Jun 09, 2026 at 02:23:23PM +0200, Juergen Christ wrote:
> > Use -ffreestanding for string.o to avoid that the compiler generates
> > calls into themselves for standard library functions like memset().
> >
> > Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
> > ---
> > arch/s390/boot/Makefile | 5 +++++
> > arch/s390/lib/Makefile | 5 +++++
> > 2 files changed, 10 insertions(+)
> >
> > diff --git a/arch/s390/boot/Makefile b/arch/s390/boot/Makefile
> > index a1e719a79d38..e1f82d118bc9 100644
> > --- a/arch/s390/boot/Makefile
> > +++ b/arch/s390/boot/Makefile
> > @@ -25,6 +25,11 @@ KBUILD_CFLAGS += $(call cc-option, -Wno-default-const-init-unsafe)
> >
> > CFLAGS_sclp_early_core.o += -I$(srctree)/drivers/s390/char
> >
> > +# string.o implements standard library functions like memset/memcpy etc.
> > +# Use -ffreestanding to ensure that the compiler does not try to "optimize"
> > +# them into calls to themselves.
> > +CFLAGS_string.o = -ffreestanding
> > +
>
> Other places use simply expanded variables instead of recursively
> expanded variables for CFLAGS or LDFLAGS. Is this an issue here?
I just copied the above from the generic lib/Makefile, since I didn't
want to make our logic more special.
> Otherwise, -ffreestanding turns on -fno-builtin which then turns off
> -ftree-loop-distribute-patterns which would detect the
> memset/memcpy/memmove loops in GCC. So that is one way to make sure
> this does not happen.
>
> Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Thanks!