public inbox for linux-s390@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/4] s390: compile relocatable kernel with/without fPIE
@ 2024-02-19 13:27 Sumanth Korikkar
  2024-02-19 13:27 ` [PATCH v2 1/4] s390/vdso64: filter out munaligned-symbols flag for vdso Sumanth Korikkar
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Sumanth Korikkar @ 2024-02-19 13:27 UTC (permalink / raw)
  To: linux-s390, hca, jpoimboe, joe.lawrence, gor; +Cc: iii, agordeev, sumanthk

Hi All,

This is a rebased version of Josh's patch series with a few fixups.
https://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git/log/?h=s390

This introduces the capability to compile the s390 relocatable kernel
with and without the -fPIE option.

When utilizing the kpatch functionality, it is advisable to compile the
kernel without the -fPIE option. This is particularly important if the
kernel is built with the -ffunction-sections and -fdata-sections flags.
The linker imposes a restriction on the number of sections (limited to
64k), necessitating the omission of -fPIE.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622872.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/625986.html

Gcc recently implemented an optimization [1] for loading symbols without
explicit alignment, aligning with the IBM Z ELF ABI. This ABI mandates
symbols to reside on a 2-byte boundary, enabling the use of the larl
instruction. However, kernel linker scripts may still generate unaligned
symbols. To address this, a new -munaligned-symbols option has been
introduced [2] in recent gcc versions. This option has to be used with
future gcc versions.

Older Clang lacks support for handling unaligned symbols generated
by kernel linker scripts when the kernel is built without -fPIE. However,
future versions of Clang will include support for the -munaligned-symbols
option. When the support is unavailable, compile the kernel with -fPIE
to maintain the existing behavior.

Patch 1 filters out -munaligned-symbol flag for vdso code. This is beneficial
when compiling kernel with -fno-PIE and -munaligned-symbols combination.

Patch 2 introduces the 'relocs' tool, which reads the vmlinux file and
generates a vmlinux.relocs_64 section, containing offsets for all
R_390_64 relocations.

Patch 3 enables the compilation of a relocatable kernel with or without
the -fPIE option. It  allows for building the relocatable kernel without
-fPIE.  However, if compiler cannot handle unaligned symbols, the kernel
is built with -fPIE.

Patch 4 handles orphan .rela sections when kernel is built with
-fno-PIE.

kpatch tools changes:
* -mno-pic-data-is-text-relative prevents relative addressing between
  code and data. This is needed to avoid relocation error when klp text
  and data are too far apart. kpatch already includes this flag.
  However, with these changes, ARCH_KFLAGS+="-fPIC" should be added to
  s390 kpatch tools, As -mno-pic-data-is-text-relative can be used only
  with -fPIC. The corresponding pull request will be sent to kpatch
  tools.

v2:
* Add Acked-by 
* Add my signed-off by:
* Rebase it to features branch

Thank you,
Sumanth

Josh Poimboeuf (2):
  s390: Add relocs tool
  s390: Compile relocatable kernel without -fPIE

Sumanth Korikkar (2):
  s390/vdso64: filter out munaligned-symbols flag for vdso
  s390/kernel: vmlinux.lds.S: handle orphan .rela sections

 arch/s390/Kconfig                    |  15 +-
 arch/s390/Makefile                   |   8 +-
 arch/s390/boot/.gitignore            |   1 +
 arch/s390/boot/Makefile              |  14 +-
 arch/s390/boot/boot.h                |   6 +
 arch/s390/boot/startup.c             |  80 +++++-
 arch/s390/boot/vmlinux.lds.S         |  18 ++
 arch/s390/include/asm/physmem_info.h |   1 +
 arch/s390/kernel/vdso64/Makefile     |   1 +
 arch/s390/kernel/vmlinux.lds.S       |  15 ++
 arch/s390/tools/.gitignore           |   1 +
 arch/s390/tools/Makefile             |   5 +
 arch/s390/tools/relocs.c             | 390 +++++++++++++++++++++++++++
 13 files changed, 542 insertions(+), 13 deletions(-)
 create mode 100644 arch/s390/tools/relocs.c

-- 
2.40.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2 1/4] s390/vdso64: filter out munaligned-symbols flag for vdso
  2024-02-19 13:27 [PATCH v2 0/4] s390: compile relocatable kernel with/without fPIE Sumanth Korikkar
@ 2024-02-19 13:27 ` Sumanth Korikkar
  2024-02-19 13:27 ` [PATCH v2 2/4] s390: Add relocs tool Sumanth Korikkar
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Sumanth Korikkar @ 2024-02-19 13:27 UTC (permalink / raw)
  To: linux-s390, hca, jpoimboe, joe.lawrence, gor; +Cc: iii, agordeev, sumanthk

Gcc recently implemented an optimization [1] for loading symbols without
explicit alignment, aligning with the IBM Z ELF ABI. This ABI mandates
symbols to reside on a 2-byte boundary, enabling the use of the larl
instruction. However, kernel linker scripts may still generate unaligned
symbols. To address this, a new -munaligned-symbols option has been
introduced [2] in recent gcc versions.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622872.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/625986.html

However, when -munaligned-symbols  is used in vdso code, it leads to the
following compilation error:
`.data.rel.ro.local' referenced in section `.text' of
arch/s390/kernel/vdso64/vdso64_generic.o: defined in discarded section
`.data.rel.ro.local' of arch/s390/kernel/vdso64/vdso64_generic.o

vdso linker script discards .data section to make it lightweight.
However, -munaligned-symbols in vdso object files references literal
pool and accesses _vdso_data. Hence, compile vdso code without
-munaligned-symbols.  This means in the future, vdso code should deal
with alignment of newly introduced unaligned linker symbols.

Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
---
 arch/s390/kernel/vdso64/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/s390/kernel/vdso64/Makefile b/arch/s390/kernel/vdso64/Makefile
index caa4ebff8a19..ef9832726097 100644
--- a/arch/s390/kernel/vdso64/Makefile
+++ b/arch/s390/kernel/vdso64/Makefile
@@ -25,6 +25,7 @@ KBUILD_AFLAGS_64 += -m64
 
 KBUILD_CFLAGS_64 := $(filter-out -m64,$(KBUILD_CFLAGS))
 KBUILD_CFLAGS_64 := $(filter-out -mno-pic-data-is-text-relative,$(KBUILD_CFLAGS_64))
+KBUILD_CFLAGS_64 := $(filter-out -munaligned-symbols,$(KBUILD_CFLAGS_64))
 KBUILD_CFLAGS_64 += -m64 -fPIC -fno-common -fno-builtin
 ldflags-y := -shared -soname=linux-vdso64.so.1 \
 	     --hash-style=both --build-id=sha1 -T
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 2/4] s390: Add relocs tool
  2024-02-19 13:27 [PATCH v2 0/4] s390: compile relocatable kernel with/without fPIE Sumanth Korikkar
  2024-02-19 13:27 ` [PATCH v2 1/4] s390/vdso64: filter out munaligned-symbols flag for vdso Sumanth Korikkar
@ 2024-02-19 13:27 ` Sumanth Korikkar
  2024-02-19 16:11   ` Heiko Carstens
  2024-02-19 13:27 ` [PATCH v2 3/4] s390: Compile relocatable kernel without -fPIE Sumanth Korikkar
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Sumanth Korikkar @ 2024-02-19 13:27 UTC (permalink / raw)
  To: linux-s390, hca, jpoimboe, joe.lawrence, gor; +Cc: iii, agordeev, sumanthk

From: Josh Poimboeuf <jpoimboe@kernel.org>

This 'relocs' tool is copied from the x86 version, ported for s390, and
greatly simplified to remove unnecessary features.

It reads vmlinux and outputs assembly to create a .vmlinux.relocs_64
section which contains the offsets of all R_390_64 relocations which
apply to allocatable sections.

Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
 arch/s390/tools/.gitignore |   1 +
 arch/s390/tools/Makefile   |   5 +
 arch/s390/tools/relocs.c   | 390 +++++++++++++++++++++++++++++++++++++
 3 files changed, 396 insertions(+)
 create mode 100644 arch/s390/tools/relocs.c

diff --git a/arch/s390/tools/.gitignore b/arch/s390/tools/.gitignore
index ea62f37b79ef..e6af51d9d183 100644
--- a/arch/s390/tools/.gitignore
+++ b/arch/s390/tools/.gitignore
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
 gen_facilities
 gen_opcode_table
+relocs
diff --git a/arch/s390/tools/Makefile b/arch/s390/tools/Makefile
index f9dd47ff9ac4..f2862364fb42 100644
--- a/arch/s390/tools/Makefile
+++ b/arch/s390/tools/Makefile
@@ -25,3 +25,8 @@ $(kapi)/facility-defs.h: $(obj)/gen_facilities FORCE
 
 $(kapi)/dis-defs.h: $(obj)/gen_opcode_table FORCE
 	$(call filechk,dis-defs.h)
+
+hostprogs	+= relocs
+PHONY		+= relocs
+relocs: $(obj)/relocs
+	@:
diff --git a/arch/s390/tools/relocs.c b/arch/s390/tools/relocs.c
new file mode 100644
index 000000000000..d3ae25e3c3a4
--- /dev/null
+++ b/arch/s390/tools/relocs.c
@@ -0,0 +1,390 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <stdio.h>
+#include <stdarg.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <string.h>
+#include <errno.h>
+#include <unistd.h>
+#include <elf.h>
+#include <byteswap.h>
+#define USE_BSD
+#include <endian.h>
+
+#define ELF_BITS 64
+
+#define ELF_MACHINE             EM_S390
+#define ELF_MACHINE_NAME        "IBM S/390"
+#define SHT_REL_TYPE            SHT_RELA
+#define Elf_Rel                 Elf64_Rela
+
+#define ELF_CLASS               ELFCLASS64
+#define ELF_ENDIAN		ELFDATA2MSB
+#define ELF_R_SYM(val)          ELF64_R_SYM(val)
+#define ELF_R_TYPE(val)         ELF64_R_TYPE(val)
+#define ELF_ST_TYPE(o)          ELF64_ST_TYPE(o)
+#define ELF_ST_BIND(o)          ELF64_ST_BIND(o)
+#define ELF_ST_VISIBILITY(o)    ELF64_ST_VISIBILITY(o)
+
+#define ElfW(type)		_ElfW(ELF_BITS, type)
+#define _ElfW(bits, type)	__ElfW(bits, type)
+#define __ElfW(bits, type)	Elf##bits##_##type
+
+#define Elf_Addr		ElfW(Addr)
+#define Elf_Ehdr		ElfW(Ehdr)
+#define Elf_Phdr		ElfW(Phdr)
+#define Elf_Shdr		ElfW(Shdr)
+#define Elf_Sym			ElfW(Sym)
+
+static Elf_Ehdr		ehdr;
+static unsigned long	shnum;
+static unsigned int	shstrndx;
+
+struct relocs {
+	uint32_t	*offset;
+	unsigned long	count;
+	unsigned long	size;
+};
+
+static struct relocs relocs64;
+#define FMT PRIu64
+
+struct section {
+	Elf_Shdr       shdr;
+	struct section *link;
+	Elf_Rel        *reltab;
+};
+static struct section *secs;
+
+#if BYTE_ORDER == LITTLE_ENDIAN
+#define le16_to_cpu(val) (val)
+#define le32_to_cpu(val) (val)
+#define le64_to_cpu(val) (val)
+#define be16_to_cpu(val) bswap_16(val)
+#define be32_to_cpu(val) bswap_32(val)
+#define be64_to_cpu(val) bswap_64(val)
+#endif
+
+#if BYTE_ORDER == BIG_ENDIAN
+#define le16_to_cpu(val) bswap_16(val)
+#define le32_to_cpu(val) bswap_32(val)
+#define le64_to_cpu(val) bswap_64(val)
+#define be16_to_cpu(val) (val)
+#define be32_to_cpu(val) (val)
+#define be64_to_cpu(val) (val)
+#endif
+
+static uint16_t elf16_to_cpu(uint16_t val)
+{
+	if (ehdr.e_ident[EI_DATA] == ELFDATA2LSB)
+		return le16_to_cpu(val);
+	else
+		return be16_to_cpu(val);
+}
+
+static uint32_t elf32_to_cpu(uint32_t val)
+{
+	if (ehdr.e_ident[EI_DATA] == ELFDATA2LSB)
+		return le32_to_cpu(val);
+	else
+		return be32_to_cpu(val);
+}
+
+#define elf_half_to_cpu(x)	elf16_to_cpu(x)
+#define elf_word_to_cpu(x)	elf32_to_cpu(x)
+
+#if ELF_BITS == 64
+static uint64_t elf64_to_cpu(uint64_t val)
+{
+	return be64_to_cpu(val);
+}
+#define elf_addr_to_cpu(x)	elf64_to_cpu(x)
+#define elf_off_to_cpu(x)	elf64_to_cpu(x)
+#define elf_xword_to_cpu(x)	elf64_to_cpu(x)
+#else
+#define elf_addr_to_cpu(x)	elf32_to_cpu(x)
+#define elf_off_to_cpu(x)	elf32_to_cpu(x)
+#define elf_xword_to_cpu(x)	elf32_to_cpu(x)
+#endif
+
+static void die(char *fmt, ...)
+{
+	va_list ap;
+
+	va_start(ap, fmt);
+	vfprintf(stderr, fmt, ap);
+	va_end(ap);
+	exit(1);
+}
+
+static void read_ehdr(FILE *fp)
+{
+	if (fread(&ehdr, sizeof(ehdr), 1, fp) != 1) {
+		die("Cannot read ELF header: %s\n",
+			strerror(errno));
+	}
+	if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0)
+		die("No ELF magic\n");
+	if (ehdr.e_ident[EI_CLASS] != ELF_CLASS)
+		die("Not a %d bit executable\n", ELF_BITS);
+	if (ehdr.e_ident[EI_DATA] != ELF_ENDIAN)
+		die("ELF endian mismatch\n");
+	if (ehdr.e_ident[EI_VERSION] != EV_CURRENT)
+		die("Unknown ELF version\n");
+
+	/* Convert the fields to native endian */
+	ehdr.e_type      = elf_half_to_cpu(ehdr.e_type);
+	ehdr.e_machine   = elf_half_to_cpu(ehdr.e_machine);
+	ehdr.e_version   = elf_word_to_cpu(ehdr.e_version);
+	ehdr.e_entry     = elf_addr_to_cpu(ehdr.e_entry);
+	ehdr.e_phoff     = elf_off_to_cpu(ehdr.e_phoff);
+	ehdr.e_shoff     = elf_off_to_cpu(ehdr.e_shoff);
+	ehdr.e_flags     = elf_word_to_cpu(ehdr.e_flags);
+	ehdr.e_ehsize    = elf_half_to_cpu(ehdr.e_ehsize);
+	ehdr.e_phentsize = elf_half_to_cpu(ehdr.e_phentsize);
+	ehdr.e_phnum     = elf_half_to_cpu(ehdr.e_phnum);
+	ehdr.e_shentsize = elf_half_to_cpu(ehdr.e_shentsize);
+	ehdr.e_shnum     = elf_half_to_cpu(ehdr.e_shnum);
+	ehdr.e_shstrndx  = elf_half_to_cpu(ehdr.e_shstrndx);
+
+	shnum = ehdr.e_shnum;
+	shstrndx = ehdr.e_shstrndx;
+
+	if ((ehdr.e_type != ET_EXEC) && (ehdr.e_type != ET_DYN))
+		die("Unsupported ELF header type\n");
+	if (ehdr.e_machine != ELF_MACHINE)
+		die("Not for %s\n", ELF_MACHINE_NAME);
+	if (ehdr.e_version != EV_CURRENT)
+		die("Unknown ELF version\n");
+	if (ehdr.e_ehsize != sizeof(Elf_Ehdr))
+		die("Bad Elf header size\n");
+	if (ehdr.e_phentsize != sizeof(Elf_Phdr))
+		die("Bad program header entry\n");
+	if (ehdr.e_shentsize != sizeof(Elf_Shdr))
+		die("Bad section header entry\n");
+
+	if (shnum == SHN_UNDEF || shstrndx == SHN_XINDEX) {
+		Elf_Shdr shdr;
+
+		if (fseek(fp, ehdr.e_shoff, SEEK_SET) < 0)
+			die("Seek to %" FMT " failed: %s\n", ehdr.e_shoff, strerror(errno));
+
+		if (fread(&shdr, sizeof(shdr), 1, fp) != 1)
+			die("Cannot read initial ELF section header: %s\n", strerror(errno));
+
+		if (shnum == SHN_UNDEF)
+			shnum = elf_xword_to_cpu(shdr.sh_size);
+
+		if (shstrndx == SHN_XINDEX)
+			shstrndx = elf_word_to_cpu(shdr.sh_link);
+	}
+
+	if (shstrndx >= shnum)
+		die("String table index out of bounds\n");
+}
+
+static void read_shdrs(FILE *fp)
+{
+	int i;
+	Elf_Shdr shdr;
+
+	secs = calloc(shnum, sizeof(struct section));
+	if (!secs)
+		die("Unable to allocate %ld section headers\n", shnum);
+
+	if (fseek(fp, ehdr.e_shoff, SEEK_SET) < 0)
+		die("Seek to %" FMT " failed: %s\n", ehdr.e_shoff, strerror(errno));
+
+	for (i = 0; i < shnum; i++) {
+		struct section *sec = &secs[i];
+
+		if (fread(&shdr, sizeof(shdr), 1, fp) != 1)
+			die("Cannot read ELF section headers %d/%ld: %s\n",
+					i, shnum, strerror(errno));
+
+		sec->shdr.sh_name      = elf_word_to_cpu(shdr.sh_name);
+		sec->shdr.sh_type      = elf_word_to_cpu(shdr.sh_type);
+		sec->shdr.sh_flags     = elf_xword_to_cpu(shdr.sh_flags);
+		sec->shdr.sh_addr      = elf_addr_to_cpu(shdr.sh_addr);
+		sec->shdr.sh_offset    = elf_off_to_cpu(shdr.sh_offset);
+		sec->shdr.sh_size      = elf_xword_to_cpu(shdr.sh_size);
+		sec->shdr.sh_link      = elf_word_to_cpu(shdr.sh_link);
+		sec->shdr.sh_info      = elf_word_to_cpu(shdr.sh_info);
+		sec->shdr.sh_addralign = elf_xword_to_cpu(shdr.sh_addralign);
+		sec->shdr.sh_entsize   = elf_xword_to_cpu(shdr.sh_entsize);
+
+		if (sec->shdr.sh_link < shnum)
+			sec->link = &secs[sec->shdr.sh_link];
+	}
+
+}
+
+static void read_relocs(FILE *fp)
+{
+	int i, j;
+
+	for (i = 0; i < shnum; i++) {
+		struct section *sec = &secs[i];
+
+		if (sec->shdr.sh_type != SHT_REL_TYPE)
+			continue;
+
+		sec->reltab = malloc(sec->shdr.sh_size);
+		if (!sec->reltab)
+			die("malloc of %" FMT " bytes for relocs failed\n", sec->shdr.sh_size);
+
+		if (fseek(fp, sec->shdr.sh_offset, SEEK_SET) < 0)
+			die("Seek to %" FMT " failed: %s\n", sec->shdr.sh_offset, strerror(errno));
+
+		if (fread(sec->reltab, 1, sec->shdr.sh_size, fp) != sec->shdr.sh_size)
+			die("Cannot read symbol table: %s\n", strerror(errno));
+
+		for (j = 0; j < sec->shdr.sh_size/sizeof(Elf_Rel); j++) {
+			Elf_Rel *rel = &sec->reltab[j];
+
+			rel->r_offset = elf_addr_to_cpu(rel->r_offset);
+			rel->r_info   = elf_xword_to_cpu(rel->r_info);
+#if (SHT_REL_TYPE == SHT_RELA)
+			rel->r_addend = elf_xword_to_cpu(rel->r_addend);
+#endif
+		}
+	}
+}
+
+static void add_reloc(struct relocs *r, uint32_t offset)
+{
+	if (r->count == r->size) {
+		unsigned long newsize = r->size + 50000;
+		void *mem = realloc(r->offset, newsize * sizeof(r->offset[0]));
+
+		if (!mem)
+			die("realloc of %ld entries for relocs failed\n", newsize);
+
+		r->offset = mem;
+		r->size = newsize;
+	}
+	r->offset[r->count++] = offset;
+}
+
+static int do_reloc(struct section *sec, Elf_Rel *rel)
+{
+	unsigned int r_type = ELF64_R_TYPE(rel->r_info);
+
+	ElfW(Addr) offset = rel->r_offset;
+
+	switch (r_type) {
+	case R_390_NONE:
+	case R_390_PC32:
+	case R_390_PC64:
+	case R_390_PC16DBL:
+	case R_390_PC32DBL:
+	case R_390_PLT32DBL:
+	case R_390_GOTENT:
+		break;
+	case R_390_64:
+		add_reloc(&relocs64, offset);
+		break;
+	default:
+		die("Unsupported relocation type: %d\n", r_type);
+		break;
+	}
+
+	return 0;
+}
+
+static void walk_relocs(void)
+{
+	int i;
+	/* Walk through the relocations */
+	for (i = 0; i < shnum; i++) {
+		struct section *sec_applies;
+		int j;
+		struct section *sec = &secs[i];
+
+		if (sec->shdr.sh_type != SHT_REL_TYPE)
+			continue;
+
+		sec_applies = &secs[sec->shdr.sh_info];
+		if (!(sec_applies->shdr.sh_flags & SHF_ALLOC))
+			continue;
+
+		for (j = 0; j < sec->shdr.sh_size/sizeof(Elf_Rel); j++) {
+			Elf_Rel *rel = &sec->reltab[j];
+
+			do_reloc(sec, rel);
+		}
+	}
+}
+
+static int cmp_relocs(const void *va, const void *vb)
+{
+	const uint32_t *a, *b;
+
+	a = va; b = vb;
+	return (*a == *b) ? 0 : (*a > *b) ? 1 : -1;
+}
+
+static void sort_relocs(struct relocs *r)
+{
+	qsort(r->offset, r->count, sizeof(r->offset[0]), cmp_relocs);
+}
+
+static int print_reloc(uint32_t v)
+{
+	return fprintf(stdout, "\t.long 0x%08"PRIx32"\n", v) > 0 ? 0 : -1;
+}
+
+static void emit_relocs(void)
+{
+	int i;
+
+	walk_relocs();
+	sort_relocs(&relocs64);
+
+	printf(".section \".vmlinux.relocs_64\",\"a\"\n");
+	for (i = 0; i < relocs64.count; i++)
+		print_reloc(relocs64.offset[i]);
+}
+
+static void process(FILE *fp)
+{
+	read_ehdr(fp);
+	read_shdrs(fp);
+	read_relocs(fp);
+	emit_relocs();
+}
+
+static void usage(void)
+{
+	die("relocs vmlinux\n");
+}
+
+int main(int argc, char **argv)
+{
+	const char *fname;
+	FILE *fp;
+	unsigned char e_ident[EI_NIDENT];
+
+	fname = NULL;
+
+	if (argc != 2)
+		usage();
+
+	fname = argv[1];
+
+	fp = fopen(fname, "r");
+	if (!fp)
+		die("Cannot open %s: %s\n", fname, strerror(errno));
+
+	if (fread(&e_ident, 1, EI_NIDENT, fp) != EI_NIDENT)
+		die("Cannot read %s: %s", fname, strerror(errno));
+
+	rewind(fp);
+
+	process(fp);
+
+	fclose(fp);
+	return 0;
+}
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 3/4] s390: Compile relocatable kernel without -fPIE
  2024-02-19 13:27 [PATCH v2 0/4] s390: compile relocatable kernel with/without fPIE Sumanth Korikkar
  2024-02-19 13:27 ` [PATCH v2 1/4] s390/vdso64: filter out munaligned-symbols flag for vdso Sumanth Korikkar
  2024-02-19 13:27 ` [PATCH v2 2/4] s390: Add relocs tool Sumanth Korikkar
@ 2024-02-19 13:27 ` Sumanth Korikkar
  2024-02-19 13:27 ` [PATCH v2 4/4] s390/kernel: vmlinux.lds.S: handle orphan .rela sections Sumanth Korikkar
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Sumanth Korikkar @ 2024-02-19 13:27 UTC (permalink / raw)
  To: linux-s390, hca, jpoimboe, joe.lawrence, gor; +Cc: iii, agordeev, sumanthk

From: Josh Poimboeuf <jpoimboe@kernel.org>

On s390, currently kernel uses the '-fPIE' compiler flag for compiling
vmlinux.  This has a few problems:

  - It uses dynamic symbols (.dynsym), for which the linker refuses to
    allow more than 64k sections.  This can break features which use
    '-ffunction-sections' and '-fdata-sections', including kpatch-build
    [1] and Function Granular KASLR.

  - It unnecessarily uses GOT relocations, adding an extra layer of
    indirection for many memory accesses.

Instead of using '-fPIE', resolve all the relocations at link time and
then manually adjust any absolute relocations (R_390_64) during boot.

This is done by first telling the linker to preserve all relocations
during the vmlinux link.  (Note this is harmless: they are later
stripped in the vmlinux.bin link.)

Then use the 'relocs' tool to find all absolute relocations (R_390_64)
which apply to allocatable sections.  The offsets of those relocations
are saved in a special section which is then used to adjust the
relocations during boot.

(Note: For some reason, Clang occasionally creates a GOT reference, even
without '-fPIE'.  So Clang-compiled kernels have a GOT, which needs to
be adjusted.)

On my mostly-defconfig kernel, this reduces kernel text size by ~1.3%.

[1] https://github.com/dynup/kpatch/issues/1284
[2] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622872.html
[3] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/625986.html

Compiler consideration:

Gcc recently implemented an optimization [2] for loading symbols without
explicit alignment, aligning with the IBM Z ELF ABI. This ABI mandates
symbols to reside on a 2-byte boundary, enabling the use of the larl
instruction. However, kernel linker scripts may still generate unaligned
symbols. To address this, a new -munaligned-symbols option has been
introduced [3] in recent gcc versions. This option has to be used with
future gcc versions.

Older Clang lacks support for handling unaligned symbols generated
by kernel linker scripts when the kernel is built without -fPIE. However,
future versions of Clang will include support for the -munaligned-symbols
option. When the support is unavailable, compile the kernel with -fPIE
to maintain the existing behavior.

In addition to it:
move vmlinux.relocs to safe relocation

When the kernel is built with CONFIG_KERNEL_UNCOMPRESSED, the entire
uncompressed vmlinux.bin is positioned in the bzImage decompressor
image at the default kernel LMA of 0x100000, enabling it to be executed
in-place. However, the size of .vmlinux.relocs could be large enough to
cause an overlap with the uncompressed kernel at the address 0x100000.
To address this issue, .vmlinux.relocs is positioned after the
.rodata.compressed in the bzImage. Nevertheless, in this configuration,
vmlinux.relocs will overlap with the .bss section of vmlinux.bin. To
overcome that, move vmlinux.relocs to a safe location before clearing
.bss and handling relocs.

[Rebased Josh Poimboeuf patches and move vmlinux.relocs
 to safe location: Sumanth Korikkar <sumanthk@linux.ibm.com>]
Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
 arch/s390/Kconfig                    | 15 ++++--
 arch/s390/Makefile                   |  8 ++-
 arch/s390/boot/.gitignore            |  1 +
 arch/s390/boot/Makefile              | 14 ++++-
 arch/s390/boot/boot.h                |  6 +++
 arch/s390/boot/startup.c             | 80 +++++++++++++++++++++++++---
 arch/s390/boot/vmlinux.lds.S         | 18 +++++++
 arch/s390/include/asm/physmem_info.h |  1 +
 arch/s390/kernel/vmlinux.lds.S       |  9 ++++
 9 files changed, 139 insertions(+), 13 deletions(-)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 39e937309fc2..2dc722fb31b4 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -583,14 +583,23 @@ config RELOCATABLE
 	help
 	  This builds a kernel image that retains relocation information
 	  so it can be loaded at an arbitrary address.
-	  The kernel is linked as a position-independent executable (PIE)
-	  and contains dynamic relocations which are processed early in the
-	  bootup process.
 	  The relocations make the kernel image about 15% larger (compressed
 	  10%), but are discarded at runtime.
 	  Note: this option exists only for documentation purposes, please do
 	  not remove it.
 
+config PIE_BUILD
+	def_bool CC_IS_CLANG && !$(cc-option,-munaligned-symbols)
+	help
+	  If the compiler is unable to generate code that can manage unaligned
+	  symbols, the kernel is linked as a position-independent executable
+	  (PIE) and includes dynamic relocations that are processed early
+	  during bootup.
+
+	  For kpatch functionality, it is recommended to build the kernel
+	  without the PIE_BUILD option. PIE_BUILD is only enabled when the
+	  compiler lacks proper support for handling unaligned symbols.
+
 config RANDOMIZE_BASE
 	bool "Randomize the address of the kernel image (KASLR)"
 	default y
diff --git a/arch/s390/Makefile b/arch/s390/Makefile
index 994f9b3d575f..2a58e1864931 100644
--- a/arch/s390/Makefile
+++ b/arch/s390/Makefile
@@ -14,8 +14,14 @@ KBUILD_AFLAGS_MODULE += -fPIC
 KBUILD_CFLAGS_MODULE += -fPIC
 KBUILD_AFLAGS	+= -m64
 KBUILD_CFLAGS	+= -m64
+ifdef CONFIG_PIE_BUILD
 KBUILD_CFLAGS	+= -fPIE
 LDFLAGS_vmlinux	:= -pie -z notext
+else
+KBUILD_CFLAGS	+= $(call cc-option,-munaligned-symbols,)
+LDFLAGS_vmlinux	:= --emit-relocs --discard-none
+extra_tools	:= relocs
+endif
 aflags_dwarf	:= -Wa,-gdwarf-2
 KBUILD_AFLAGS_DECOMPRESSOR := $(CLANG_FLAGS) -m64 -D__ASSEMBLY__
 ifndef CONFIG_AS_IS_LLVM
@@ -143,7 +149,7 @@ archheaders:
 
 archprepare:
 	$(Q)$(MAKE) $(build)=$(syscalls) kapi
-	$(Q)$(MAKE) $(build)=$(tools) kapi
+	$(Q)$(MAKE) $(build)=$(tools) kapi $(extra_tools)
 ifeq ($(KBUILD_EXTMOD),)
 # We need to generate vdso-offsets.h before compiling certain files in kernel/.
 # In order to do that, we should use the archprepare target, but we can't since
diff --git a/arch/s390/boot/.gitignore b/arch/s390/boot/.gitignore
index f56591bc0897..f5ef099e2fd3 100644
--- a/arch/s390/boot/.gitignore
+++ b/arch/s390/boot/.gitignore
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 image
 bzImage
+relocs.S
 section_cmp.*
 vmlinux
 vmlinux.lds
diff --git a/arch/s390/boot/Makefile b/arch/s390/boot/Makefile
index ace0bda1ad24..8cea8c7f82cf 100644
--- a/arch/s390/boot/Makefile
+++ b/arch/s390/boot/Makefile
@@ -37,7 +37,8 @@ CFLAGS_sclp_early_core.o += -I$(srctree)/drivers/s390/char
 
 obj-y	:= head.o als.o startup.o physmem_info.o ipl_parm.o ipl_report.o vmem.o
 obj-y	+= string.o ebcdic.o sclp_early_core.o mem.o ipl_vmparm.o cmdline.o
-obj-y	+= version.o pgm_check_info.o ctype.o ipl_data.o machine_kexec_reloc.o
+obj-y	+= version.o pgm_check_info.o ctype.o ipl_data.o
+obj-y	+= $(if $(CONFIG_PIE_BUILD),machine_kexec_reloc.o,relocs.o)
 obj-$(findstring y, $(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) $(CONFIG_PGSTE))	+= uv.o
 obj-$(CONFIG_RANDOMIZE_BASE)	+= kaslr.o
 obj-y	+= $(if $(CONFIG_KERNEL_UNCOMPRESSED),,decompressor.o) info.o
@@ -48,6 +49,9 @@ targets	:= bzImage section_cmp.boot.data section_cmp.boot.preserved.data $(obj-y
 targets	+= vmlinux.lds vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2
 targets += vmlinux.bin.xz vmlinux.bin.lzma vmlinux.bin.lzo vmlinux.bin.lz4
 targets += vmlinux.bin.zst info.bin syms.bin vmlinux.syms $(obj-all)
+ifndef CONFIG_PIE_BUILD
+targets += relocs.S
+endif
 
 OBJECTS := $(addprefix $(obj)/,$(obj-y))
 OBJECTS_ALL := $(addprefix $(obj)/,$(obj-all))
@@ -106,6 +110,14 @@ OBJCOPYFLAGS_vmlinux.bin := -O binary --remove-section=.comment --remove-section
 $(obj)/vmlinux.bin: vmlinux FORCE
 	$(call if_changed,objcopy)
 
+ifndef CONFIG_PIE_BUILD
+CMD_RELOCS=arch/s390/tools/relocs
+quiet_cmd_relocs = RELOCS $@
+	cmd_relocs = $(CMD_RELOCS) $< > $@
+$(obj)/relocs.S: vmlinux FORCE
+	$(call if_changed,relocs)
+endif
+
 suffix-$(CONFIG_KERNEL_GZIP)  := .gz
 suffix-$(CONFIG_KERNEL_BZIP2) := .bz2
 suffix-$(CONFIG_KERNEL_LZ4)  := .lz4
diff --git a/arch/s390/boot/boot.h b/arch/s390/boot/boot.h
index 222c6886acf6..8943d5be7ca2 100644
--- a/arch/s390/boot/boot.h
+++ b/arch/s390/boot/boot.h
@@ -25,9 +25,14 @@ struct vmlinux_info {
 	unsigned long bootdata_size;
 	unsigned long bootdata_preserved_off;
 	unsigned long bootdata_preserved_size;
+#ifdef CONFIG_PIE_BUILD
 	unsigned long dynsym_start;
 	unsigned long rela_dyn_start;
 	unsigned long rela_dyn_end;
+#else
+	unsigned long got_off;
+	unsigned long got_size;
+#endif
 	unsigned long amode31_size;
 	unsigned long init_mm_off;
 	unsigned long swapper_pg_dir_off;
@@ -83,6 +88,7 @@ extern unsigned long vmalloc_size;
 extern int vmalloc_size_set;
 extern char __boot_data_start[], __boot_data_end[];
 extern char __boot_data_preserved_start[], __boot_data_preserved_end[];
+extern int __vmlinux_relocs_64_start[], __vmlinux_relocs_64_end[];
 extern char _decompressor_syms_start[], _decompressor_syms_end[];
 extern char _stack_start[], _stack_end[];
 extern char _end[], _decompressor_end[];
diff --git a/arch/s390/boot/startup.c b/arch/s390/boot/startup.c
index 9cc76e631759..cb0d89801c43 100644
--- a/arch/s390/boot/startup.c
+++ b/arch/s390/boot/startup.c
@@ -141,7 +141,8 @@ static void copy_bootdata(void)
 	memcpy((void *)vmlinux.bootdata_preserved_off, __boot_data_preserved_start, vmlinux.bootdata_preserved_size);
 }
 
-static void handle_relocs(unsigned long offset)
+#ifdef CONFIG_PIE_BUILD
+static void kaslr_adjust_relocs(unsigned long min_addr, unsigned long offset)
 {
 	Elf64_Rela *rela_start, *rela_end, *rela;
 	int r_type, r_sym, rc;
@@ -172,6 +173,62 @@ static void handle_relocs(unsigned long offset)
 	}
 }
 
+static void kaslr_adjust_got(unsigned long offset) {}
+static void rescue_relocs(void) {}
+static void free_relocs(void) {}
+#else
+int *vmlinux_relocs_64_start;
+int *vmlinux_relocs_64_end;
+
+static void rescue_relocs(void)
+{
+	unsigned long size, nrelocs;
+
+	nrelocs = __vmlinux_relocs_64_end - __vmlinux_relocs_64_start;
+	size = nrelocs * sizeof(uint32_t);
+	vmlinux_relocs_64_start = (void *)physmem_alloc_top_down(RR_RELOC, size, 0);
+	memmove(vmlinux_relocs_64_start, (void *)__vmlinux_relocs_64_start, size);
+	vmlinux_relocs_64_end = vmlinux_relocs_64_start + nrelocs;
+}
+
+static void free_relocs(void)
+{
+	physmem_free(RR_RELOC);
+}
+
+static void kaslr_adjust_relocs(unsigned long min_addr, unsigned long offset)
+{
+	int *reloc;
+	unsigned long max_addr = min_addr + vmlinux.image_size;
+	long loc;
+
+	/* Adjust R_390_64 relocations */
+	for (reloc = vmlinux_relocs_64_start;
+		reloc < vmlinux_relocs_64_end && *reloc;
+		reloc++) {
+		loc = (long)*reloc + offset;
+		if (loc < min_addr || loc > max_addr)
+			error("64-bit relocation outside of kernel!\n");
+		*(u64 *)loc += offset;
+	}
+}
+
+static void kaslr_adjust_got(unsigned long offset)
+{
+	u64 *entry;
+
+	/*
+	 * Even without -fPIE, Clang still uses a global offset table for some
+	 * reason.  Adjust the GOT entries.
+	 */
+	for (entry = (u64 *)vmlinux.got_off;
+	     entry < (u64 *)(vmlinux.got_off + vmlinux.got_size);
+	     entry++) {
+		*entry += offset;
+	}
+}
+#endif
+
 /*
  * Merge information from several sources into a single ident_map_size value.
  * "ident_map_size" represents the upper limit of physical memory we may ever
@@ -299,14 +356,18 @@ static void setup_vmalloc_size(void)
 	vmalloc_size = max(size, vmalloc_size);
 }
 
-static void offset_vmlinux_info(unsigned long offset)
+static void kaslr_adjust_vmlinux_info(unsigned long offset)
 {
 	*(unsigned long *)(&vmlinux.entry) += offset;
 	vmlinux.bootdata_off += offset;
 	vmlinux.bootdata_preserved_off += offset;
+#ifdef CONFIG_PIE_BUILD
 	vmlinux.rela_dyn_start += offset;
 	vmlinux.rela_dyn_end += offset;
 	vmlinux.dynsym_start += offset;
+#else
+	vmlinux.got_off += offset;
+#endif
 	vmlinux.init_mm_off += offset;
 	vmlinux.swapper_pg_dir_off += offset;
 	vmlinux.invalid_pg_dir_off += offset;
@@ -361,6 +422,7 @@ void startup_kernel(void)
 	detect_physmem_online_ranges(max_physmem_end);
 	save_ipl_cert_comp_list();
 	rescue_initrd(safe_addr, ident_map_size);
+	rescue_relocs();
 
 	if (kaslr_enabled()) {
 		vmlinux_lma = randomize_within_range(vmlinux.image_size + vmlinux.bss_size,
@@ -368,7 +430,7 @@ void startup_kernel(void)
 						     ident_map_size);
 		if (vmlinux_lma) {
 			__kaslr_offset = vmlinux_lma - vmlinux.default_lma;
-			offset_vmlinux_info(__kaslr_offset);
+			kaslr_adjust_vmlinux_info(__kaslr_offset);
 		}
 	}
 	vmlinux_lma = vmlinux_lma ?: vmlinux.default_lma;
@@ -393,18 +455,20 @@ void startup_kernel(void)
 	/*
 	 * The order of the following operations is important:
 	 *
-	 * - handle_relocs() must follow clear_bss_section() to establish static
+	 * - kaslr_adjust_relocs() must follow clear_bss_section() to establish static
 	 *   memory references to data in .bss to be used by setup_vmem()
 	 *   (i.e init_mm.pgd)
 	 *
-	 * - setup_vmem() must follow handle_relocs() to be able using
+	 * - setup_vmem() must follow kaslr_adjust_relocs() to be able using
 	 *   static memory references to data in .bss (i.e init_mm.pgd)
 	 *
-	 * - copy_bootdata() must follow setup_vmem() to propagate changes to
-	 *   bootdata made by setup_vmem()
+	 * - copy_bootdata() must follow setup_vmem() to propagate changes
+	 *   to bootdata made by setup_vmem()
 	 */
 	clear_bss_section(vmlinux_lma);
-	handle_relocs(__kaslr_offset);
+	kaslr_adjust_relocs(vmlinux_lma, __kaslr_offset);
+	kaslr_adjust_got(__kaslr_offset);
+	free_relocs();
 	setup_vmem(asce_limit);
 	copy_bootdata();
 
diff --git a/arch/s390/boot/vmlinux.lds.S b/arch/s390/boot/vmlinux.lds.S
index 4e3fb2b7abb6..e3208893ba6b 100644
--- a/arch/s390/boot/vmlinux.lds.S
+++ b/arch/s390/boot/vmlinux.lds.S
@@ -110,6 +110,24 @@ SECTIONS
 		_compressed_end = .;
 	}
 
+#ifndef CONFIG_PIE_BUILD
+	/*
+	 * When the kernel is built with CONFIG_KERNEL_UNCOMPRESSED, the entire
+	 * uncompressed vmlinux.bin is positioned in the bzImage decompressor
+	 * image at the default kernel LMA of 0x100000, enabling it to be
+	 * executed in-place. However, the size of .vmlinux.relocs could be
+	 * large enough to cause an overlap with the uncompressed kernel at the
+	 * address 0x100000. To address this issue, .vmlinux.relocs is
+	 * positioned after the .rodata.compressed.
+	 */
+	. = ALIGN(4);
+	.vmlinux.relocs : {
+		__vmlinux_relocs_64_start = .;
+		*(.vmlinux.relocs_64)
+		__vmlinux_relocs_64_end = .;
+	}
+#endif
+
 #define SB_TRAILER_SIZE 32
 	/* Trailer needed for Secure Boot */
 	. += SB_TRAILER_SIZE; /* make sure .sb.trailer does not overwrite the previous section */
diff --git a/arch/s390/include/asm/physmem_info.h b/arch/s390/include/asm/physmem_info.h
index 9e41a74fce9a..e747b067f8db 100644
--- a/arch/s390/include/asm/physmem_info.h
+++ b/arch/s390/include/asm/physmem_info.h
@@ -22,6 +22,7 @@ enum reserved_range_type {
 	RR_DECOMPRESSOR,
 	RR_INITRD,
 	RR_VMLINUX,
+	RR_RELOC,
 	RR_AMODE31,
 	RR_IPLREPORT,
 	RR_CERT_COMP_LIST,
diff --git a/arch/s390/kernel/vmlinux.lds.S b/arch/s390/kernel/vmlinux.lds.S
index 63cb403b3344..222b06662e19 100644
--- a/arch/s390/kernel/vmlinux.lds.S
+++ b/arch/s390/kernel/vmlinux.lds.S
@@ -63,7 +63,9 @@ SECTIONS
 		*(.data.rel.ro .data.rel.ro.*)
 	}
 	.got : {
+		__got_start = .;
 		*(.got)
+		__got_end = .;
 	}
 
 	. = ALIGN(PAGE_SIZE);
@@ -190,6 +192,7 @@ SECTIONS
 
 	PERCPU_SECTION(0x100)
 
+#ifdef CONFIG_PIE_BUILD
 	.dynsym ALIGN(8) : {
 		__dynsym_start = .;
 		*(.dynsym)
@@ -206,6 +209,7 @@ SECTIONS
 	.dynstr ALIGN(8) : {
 		*(.dynstr)
 	}
+#endif
 	.hash ALIGN(8) : {
 		*(.hash)
 	}
@@ -235,9 +239,14 @@ SECTIONS
 		QUAD(__boot_data_preserved_start)		/* bootdata_preserved_off */
 		QUAD(__boot_data_preserved_end -
 		     __boot_data_preserved_start)		/* bootdata_preserved_size */
+#ifdef CONFIG_PIE_BUILD
 		QUAD(__dynsym_start)				/* dynsym_start */
 		QUAD(__rela_dyn_start)				/* rela_dyn_start */
 		QUAD(__rela_dyn_end)				/* rela_dyn_end */
+#else
+		QUAD(__got_start)				/* got_off */
+		QUAD(__got_end - __got_start)			/* got_size */
+#endif
 		QUAD(_eamode31 - _samode31)			/* amode31_size */
 		QUAD(init_mm)
 		QUAD(swapper_pg_dir)
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 4/4] s390/kernel: vmlinux.lds.S: handle orphan .rela sections
  2024-02-19 13:27 [PATCH v2 0/4] s390: compile relocatable kernel with/without fPIE Sumanth Korikkar
                   ` (2 preceding siblings ...)
  2024-02-19 13:27 ` [PATCH v2 3/4] s390: Compile relocatable kernel without -fPIE Sumanth Korikkar
@ 2024-02-19 13:27 ` Sumanth Korikkar
  2024-02-19 16:12 ` [PATCH v2 0/4] s390: compile relocatable kernel with/without fPIE Heiko Carstens
  2024-06-18 20:37 ` Joe Lawrence
  5 siblings, 0 replies; 14+ messages in thread
From: Sumanth Korikkar @ 2024-02-19 13:27 UTC (permalink / raw)
  To: linux-s390, hca, jpoimboe, joe.lawrence, gor; +Cc: iii, agordeev, sumanthk

When kernel is built with CONFIG_LD_ORPHAN_WARN and -fno-PIE, there are
several warnings:

ld: warning: orphan section `.rela.iplt' from
`arch/s390/kernel/head64.o' being placed in section `.rela.dyn'
ld: warning: orphan section `.rela.head.text' from
`arch/s390/kernel/head64.o' being placed in section `.rela.dyn'
ld: warning: orphan section `.rela.init.text' from
`arch/s390/kernel/head64.o' being placed in section `.rela.dyn'
ld: warning: orphan section `.rela.rodata.cst8' from
`arch/s390/kernel/head64.o' being placed in section `.rela.dyn'

Orphan sections are sections that exist in an object file but don't have
a corresponding output section in the final executable. ld raises a
warning when it identifies such sections.

Eliminate the warning by placing all .rela orphan sections in .rela.dyn
and raise an error when size of .rela.dyn is greater than zero. i.e.
Dont just neglect orphan sections.

This is similar to adjustment performed in x86, where kernel is built
with -fno-PIE.
commit 5354e84598f2 ("x86/build: Add asserts for unwanted sections")

Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
---
 arch/s390/kernel/vmlinux.lds.S | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/s390/kernel/vmlinux.lds.S b/arch/s390/kernel/vmlinux.lds.S
index 222b06662e19..404883b1b023 100644
--- a/arch/s390/kernel/vmlinux.lds.S
+++ b/arch/s390/kernel/vmlinux.lds.S
@@ -277,6 +277,12 @@ SECTIONS
 		*(.plt) *(.plt.*) *(.iplt) *(.igot .igot.plt)
 	}
 	ASSERT(SIZEOF(.plt) == 0, "Unexpected run-time procedure linkages detected!")
+#ifndef CONFIG_PIE_BUILD
+	.rela.dyn : {
+		*(.rela.*) *(.rela_*)
+	}
+	ASSERT(SIZEOF(.rela.dyn) == 0, "Unexpected run-time relocations (.rela) detected!")
+#endif
 
 	/* Sections to be discarded */
 	DISCARDS
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/4] s390: Add relocs tool
  2024-02-19 13:27 ` [PATCH v2 2/4] s390: Add relocs tool Sumanth Korikkar
@ 2024-02-19 16:11   ` Heiko Carstens
  0 siblings, 0 replies; 14+ messages in thread
From: Heiko Carstens @ 2024-02-19 16:11 UTC (permalink / raw)
  To: Sumanth Korikkar; +Cc: linux-s390, jpoimboe, joe.lawrence, gor, iii, agordeev

On Mon, Feb 19, 2024 at 02:27:32PM +0100, Sumanth Korikkar wrote:
> From: Josh Poimboeuf <jpoimboe@kernel.org>
> 
> This 'relocs' tool is copied from the x86 version, ported for s390, and
> greatly simplified to remove unnecessary features.
> 
> It reads vmlinux and outputs assembly to create a .vmlinux.relocs_64
> section which contains the offsets of all R_390_64 relocations which
> apply to allocatable sections.
> 
> Acked-by: Vasily Gorbik <gor@linux.ibm.com>
> Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> ---

Sumanth, your Signed-off-by line should come last, since it should
reflect the route of a patch. I'll fix it up :)

> diff --git a/arch/s390/tools/relocs.c b/arch/s390/tools/relocs.c
> new file mode 100644
> index 000000000000..d3ae25e3c3a4
> --- /dev/null
> +++ b/arch/s390/tools/relocs.c
> @@ -0,0 +1,390 @@
> +#define ELF_BITS 64

With this...

> +#if ELF_BITS == 64
> +static uint64_t elf64_to_cpu(uint64_t val)
> +{
> +	return be64_to_cpu(val);
> +}
> +#define elf_addr_to_cpu(x)	elf64_to_cpu(x)
> +#define elf_off_to_cpu(x)	elf64_to_cpu(x)
> +#define elf_xword_to_cpu(x)	elf64_to_cpu(x)
> +#else
> +#define elf_addr_to_cpu(x)	elf32_to_cpu(x)
> +#define elf_off_to_cpu(x)	elf32_to_cpu(x)
> +#define elf_xword_to_cpu(x)	elf32_to_cpu(x)
> +#endif

...this is dead code. I'll remove it.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/4] s390: compile relocatable kernel with/without fPIE
  2024-02-19 13:27 [PATCH v2 0/4] s390: compile relocatable kernel with/without fPIE Sumanth Korikkar
                   ` (3 preceding siblings ...)
  2024-02-19 13:27 ` [PATCH v2 4/4] s390/kernel: vmlinux.lds.S: handle orphan .rela sections Sumanth Korikkar
@ 2024-02-19 16:12 ` Heiko Carstens
  2024-06-18 20:37 ` Joe Lawrence
  5 siblings, 0 replies; 14+ messages in thread
From: Heiko Carstens @ 2024-02-19 16:12 UTC (permalink / raw)
  To: Sumanth Korikkar; +Cc: linux-s390, jpoimboe, joe.lawrence, gor, iii, agordeev

On Mon, Feb 19, 2024 at 02:27:30PM +0100, Sumanth Korikkar wrote:
> Hi All,
> 
> This is a rebased version of Josh's patch series with a few fixups.
> https://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git/log/?h=s390
> 
> This introduces the capability to compile the s390 relocatable kernel
> with and without the -fPIE option.
...
> Josh Poimboeuf (2):
>   s390: Add relocs tool
>   s390: Compile relocatable kernel without -fPIE
> 
> Sumanth Korikkar (2):
>   s390/vdso64: filter out munaligned-symbols flag for vdso
>   s390/kernel: vmlinux.lds.S: handle orphan .rela sections
> 
>  arch/s390/Kconfig                    |  15 +-
>  arch/s390/Makefile                   |   8 +-
>  arch/s390/boot/.gitignore            |   1 +
>  arch/s390/boot/Makefile              |  14 +-
>  arch/s390/boot/boot.h                |   6 +
>  arch/s390/boot/startup.c             |  80 +++++-
>  arch/s390/boot/vmlinux.lds.S         |  18 ++
>  arch/s390/include/asm/physmem_info.h |   1 +
>  arch/s390/kernel/vdso64/Makefile     |   1 +
>  arch/s390/kernel/vmlinux.lds.S       |  15 ++
>  arch/s390/tools/.gitignore           |   1 +
>  arch/s390/tools/Makefile             |   5 +
>  arch/s390/tools/relocs.c             | 390 +++++++++++++++++++++++++++
>  13 files changed, 542 insertions(+), 13 deletions(-)
>  create mode 100644 arch/s390/tools/relocs.c

Thanks a lot for providing this!

Series applied.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/4] s390: compile relocatable kernel with/without fPIE
  2024-02-19 13:27 [PATCH v2 0/4] s390: compile relocatable kernel with/without fPIE Sumanth Korikkar
                   ` (4 preceding siblings ...)
  2024-02-19 16:12 ` [PATCH v2 0/4] s390: compile relocatable kernel with/without fPIE Heiko Carstens
@ 2024-06-18 20:37 ` Joe Lawrence
  2024-06-19 17:01   ` Sumanth Korikkar
  5 siblings, 1 reply; 14+ messages in thread
From: Joe Lawrence @ 2024-06-18 20:37 UTC (permalink / raw)
  To: Sumanth Korikkar; +Cc: linux-s390, hca, jpoimboe, gor, iii, agordeev

On Mon, Feb 19, 2024 at 02:27:30PM +0100, Sumanth Korikkar wrote:
> Hi All,
> 
> This is a rebased version of Josh's patch series with a few fixups.
> https://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git/log/?h=s390
> 
> This introduces the capability to compile the s390 relocatable kernel
> with and without the -fPIE option.
> 
> When utilizing the kpatch functionality, it is advisable to compile the
> kernel without the -fPIE option. This is particularly important if the
> kernel is built with the -ffunction-sections and -fdata-sections flags.
> The linker imposes a restriction on the number of sections (limited to
> 64k), necessitating the omission of -fPIE.
> 
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622872.html
> [2] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/625986.html
> 
> Gcc recently implemented an optimization [1] for loading symbols without
> explicit alignment, aligning with the IBM Z ELF ABI. This ABI mandates
> symbols to reside on a 2-byte boundary, enabling the use of the larl
> instruction. However, kernel linker scripts may still generate unaligned
> symbols. To address this, a new -munaligned-symbols option has been
> introduced [2] in recent gcc versions. This option has to be used with
> future gcc versions.
> 
> Older Clang lacks support for handling unaligned symbols generated
> by kernel linker scripts when the kernel is built without -fPIE. However,
> future versions of Clang will include support for the -munaligned-symbols
> option. When the support is unavailable, compile the kernel with -fPIE
> to maintain the existing behavior.
> 
> Patch 1 filters out -munaligned-symbol flag for vdso code. This is beneficial
> when compiling kernel with -fno-PIE and -munaligned-symbols combination.
> 
> Patch 2 introduces the 'relocs' tool, which reads the vmlinux file and
> generates a vmlinux.relocs_64 section, containing offsets for all
> R_390_64 relocations.
> 
> Patch 3 enables the compilation of a relocatable kernel with or without
> the -fPIE option. It  allows for building the relocatable kernel without
> -fPIE.  However, if compiler cannot handle unaligned symbols, the kernel
> is built with -fPIE.
> 
> Patch 4 handles orphan .rela sections when kernel is built with
> -fno-PIE.
> 
> kpatch tools changes:
> * -mno-pic-data-is-text-relative prevents relative addressing between
>   code and data. This is needed to avoid relocation error when klp text
>   and data are too far apart. kpatch already includes this flag.
>   However, with these changes, ARCH_KFLAGS+="-fPIC" should be added to
>   s390 kpatch tools, As -mno-pic-data-is-text-relative can be used only
>   with -fPIC. The corresponding pull request will be sent to kpatch
>   tools.

Hi Sumanth,

I noticed interesting compiler differences when adding -fPIC build
option and not.  The difference in resulting output can confuse
kpatch-build when it tries to verify that its reference build (with the
mentioned options, plus --ffunction-sections and -fdata-sections),
doesn't line up closely enough with the original vmlinux source (sans
all these options).

I don't think a kpatch-build PR was ever opened to add "-fPIC", but the
compiler now warns that its required.  Have you ever seen optimization /
build output changes when adding this option and if so, were there
additional kpatch-build implications?

Here is a quick example that I stumbled on while investigating the
kpatch-build shadow-newpid.patch integration test that modifies
kernel/fork.c.  I couldn't reproduce with the s390 defconfig, but it
shows up when building with RHEL-10 config.  Reproducer steps and
disassembly examples follows.

Regards,

-- Joe

-->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8--

# Setup - v6.9 with a RHEL-10 config

  $ git clone --depth=1 --branch v6.9 git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
  $ cd linux

  $ curl -o .config https://file.rdu.redhat.com/~jolawren/kernel-s390x-rhel.config
  $ make olddefconfig


# Build and disassemble kernel/fork.c :: mmput()

  $ make -s -j$(nproc) kernel/fork.o
  $ objdump -Dr -j.text --disassemble=mmput kernel/fork.o

  kernel/fork.o:     file format elf64-s390


  Disassembly of section .text:

  0000000000001090 <mmput>:
      1090:       c0 04 00 00 00 00       jgnop   1090 <mmput>
      1096:       a7 18 ff ff             lhi     %r1,-1
      109a:       eb 01 21 3c 00 f8       laa     %r0,%r1,316(%r2)
      10a0:       07 e0                   bnor    %r0
      10a2:       ec 08 00 06 01 7e       cije    %r0,1,10ae <mmput+0x1e>
      10a8:       c0 f4 00 00 00 00       jg      10a8 <mmput+0x18>
                          10aa: R_390_PC32DBL     __s390_indirect_jump_r14+0x2
      10ae:       c0 f4 00 00 00 00       jg      10ae <mmput+0x1e>
                          10b0: R_390_PLT32DBL    __mmput+0x2


# Build with -fPIC and disassemble kernel/fork.c :: mmput()

  $ make -s clean
  $ KCFLAGS="-fPIC -mno-pic-data-is-text-relative -fno-section-anchors" make -s -j$(nproc) kernel/fork.o
  $ objdump -Dr -j.text --disassemble=mmput kernel/fork.o

  kernel/fork.o:     file format elf64-s390


  Disassembly of section .text:

  0000000000002430 <mmput>:
      2430:       c0 04 00 00 00 00       jgnop   2430 <mmput>
      2436:       a7 18 ff ff             lhi     %r1,-1
      243a:       eb 01 21 3c 00 f8       laa     %r0,%r1,316(%r2)
      2440:       07 e0                   bnor    %r0
      2442:       ec 08 00 06 01 7e       cije    %r0,1,244e <mmput+0x1e>
      2448:       c0 f4 00 00 00 00       jg      2448 <mmput+0x18>
                          244a: R_390_PC32DBL     __s390_indirect_jump_r14+0x2
      244e:       eb 9f f0 60 00 24       stmg    %r9,%r15,96(%r15)
      2454:       b9 04 00 ef             lgr     %r14,%r15
      2458:       e3 f0 ff c0 ff 71       lay     %r15,-64(%r15)
      245e:       e3 e0 f0 98 00 24       stg     %r14,152(%r15)
      2464:       b9 04 00 b2             lgr     %r11,%r2
      2468:       c0 e5 00 00 00 00       brasl   %r14,2468 <mmput+0x38>
                          246a: R_390_PLT32DBL    uprobe_clear_state+0x2
      246e:       b9 04 00 2b             lgr     %r2,%r11
      2472:       c0 e5 00 00 00 00       brasl   %r14,2472 <mmput+0x42>
                          2474: R_390_PLT32DBL    exit_aio+0x2
      2478:       e3 20 b4 f8 00 04       lg      %r2,1272(%r11)
      247e:       a7 20 00 01             tmlh    %r2,1
      2482:       a7 74 00 6f             jne     2560 <mmput+0x130>
      2486:       e3 30 b4 f8 00 04       lg      %r3,1272(%r11)
      248c:       a7 30 00 02             tmlh    %r3,2
      2490:       a7 74 00 61             jne     2552 <mmput+0x122>
      2494:       b9 04 00 2b             lgr     %r2,%r11
      2498:       41 90 b1 88             la      %r9,392(%r11)
      249c:       c0 e5 00 00 00 00       brasl   %r14,249c <mmput+0x6c>
                          249e: R_390_PLT32DBL    exit_mmap+0x2
      24a2:       b9 04 00 2b             lgr     %r2,%r11
      24a6:       c0 e5 00 00 00 00       brasl   %r14,24a6 <mmput+0x76>
                          24a8: R_390_PLT32DBL    mm_put_huge_zero_page+0x2
      24ac:       a7 39 00 00             lghi    %r3,0
      24b0:       b9 04 00 2b             lgr     %r2,%r11
      24b4:       c0 e5 00 00 00 00       brasl   %r14,24b4 <mmput+0x84>
                          24b6: R_390_PLT32DBL    set_mm_exe_file+0x2
      24ba:       e3 40 b1 88 00 04       lg      %r4,392(%r11)
      24c0:       ec 94 00 2f 80 64       cgrje   %r9,%r4,251e <mmput+0xee>
      24c6:       c4 a8 00 00 00 00       lgrl    %r10,24c6 <mmput+0x96>
                          24c8: R_390_GOTENT      mmlist_lock+0x2
      24cc:       58 10 03 ac             l       %r1,940
      24d0:       a7 58 00 00             lhi     %r5,0
      24d4:       ba 51 a0 00             cs      %r5,%r1,0(%r10)
      24d8:       ec 56 00 53 00 7e       cijne   %r5,0,257e <mmput+0x14e>
      24de:       b9 04 00 29             lgr     %r2,%r9
      24e2:       c0 e5 00 00 00 00       brasl   %r14,24e2 <mmput+0xb2>
                          24e4: R_390_PLT32DBL    __list_del_entry_valid_or_report+0x2
      24e8:       ec 28 00 0f 00 7c       cgije   %r2,0,2506 <mmput+0xd6>
      24ee:       e3 e0 b1 90 00 04       lg      %r14,400(%r11)
      24f4:       e3 20 b1 88 00 04       lg      %r2,392(%r11)
      24fa:       e3 e0 20 08 00 24       stg     %r14,8(%r2)
      2500:       e3 20 e0 00 00 24       stg     %r2,0(%r14)
      2506:       e5 48 b1 88 01 00       mvghi   392(%r11),256
      250c:       e5 48 b1 90 01 22       mvghi   400(%r11),290
      2512:       a7 08 00 00             lhi     %r0,0
      2516:       47 00 00 00             nop     0
      251a:       40 00 a0 02             sth     %r0,2(%r10)
      251e:       e3 30 b4 68 00 02       ltg     %r3,1128(%r11)
      2524:       a7 84 00 08             je      2534 <mmput+0x104>
      2528:       e3 20 30 10 00 04       lg      %r2,16(%r3)
      252e:       c0 e5 00 00 00 00       brasl   %r14,252e <mmput+0xfe>
                          2530: R_390_PLT32DBL    module_put+0x2
      2534:       a7 48 ff ff             lhi     %r4,-1
      2538:       eb 94 b0 00 00 f8       laa     %r9,%r4,0(%r11)
      253e:       07 e0                   bnor    %r0
      2540:       ec 98 00 17 01 7e       cije    %r9,1,256e <mmput+0x13e>
      2546:       eb 9f f0 a0 00 04       lmg     %r9,%r15,160(%r15)
      254c:       c0 f4 00 00 00 00       jg      254c <mmput+0x11c>
                          254e: R_390_PC32DBL     __s390_indirect_jump_r14+0x2
      2552:       b9 04 00 2b             lgr     %r2,%r11
      2556:       c0 e5 00 00 00 00       brasl   %r14,2556 <mmput+0x126>
                          2558: R_390_PLT32DBL    __khugepaged_exit+0x2
      255c:       a7 f4 ff 9c             j       2494 <mmput+0x64>
      2560:       b9 04 00 2b             lgr     %r2,%r11
      2564:       c0 e5 00 00 00 00       brasl   %r14,2564 <mmput+0x134>
                          2566: R_390_PLT32DBL    __ksm_exit+0x2
      256a:       a7 f4 ff 8e             j       2486 <mmput+0x56>
      256e:       b9 04 00 2b             lgr     %r2,%r11
      2572:       eb 9f f0 a0 00 04       lmg     %r9,%r15,160(%r15)
      2578:       c0 f4 00 00 00 00       jg      2578 <mmput+0x148>
                          257a: R_390_PLT32DBL    __mmdrop+0x2
      257e:       b9 04 00 2a             lgr     %r2,%r10
      2582:       c0 e5 00 00 00 00       brasl   %r14,2582 <mmput+0x152>
                          2584: R_390_PLT32DBL    arch_spin_lock_wait+0x2
      2588:       a7 f4 ff ab             j       24de <mmput+0xae>


# GCC information

  $ gcc -v
  Using built-in specs.
  COLLECT_GCC=gcc
  COLLECT_LTO_WRAPPER=/usr/libexec/gcc/s390x-redhat-linux/14/lto-wrapper
  Target: s390x-redhat-linux
  Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --disable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --enable-libstdcxx-backtrace --with-libstdcxx-zoneinfo=/usr/share/zoneinfo --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --without-isl --enable-gnu-indirect-function --with-long-double-128 --with-arch=z14 --with-tune=z15 --enable-decimal-float --build=s390x-redhat-linux --with-build-config=bootstrap-lto --enable-link-serialization=1 --enable-host-pie --enable-host-bind-now
  Thread model: posix
  Supported LTO compression algorithms: zlib zstd
  gcc version 14.1.1 20240507 (Red Hat 14.1.1-1) (GCC)


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/4] s390: compile relocatable kernel with/without fPIE
  2024-06-18 20:37 ` Joe Lawrence
@ 2024-06-19 17:01   ` Sumanth Korikkar
  2024-06-19 18:23     ` Joe Lawrence
  0 siblings, 1 reply; 14+ messages in thread
From: Sumanth Korikkar @ 2024-06-19 17:01 UTC (permalink / raw)
  To: Joe Lawrence; +Cc: linux-s390, hca, jpoimboe, gor, iii, agordeev

On Tue, Jun 18, 2024 at 04:37:16PM -0400, Joe Lawrence wrote:
> On Mon, Feb 19, 2024 at 02:27:30PM +0100, Sumanth Korikkar wrote:
> > Hi All,
> > 
> > This is a rebased version of Josh's patch series with a few fixups.
> > https://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git/log/?h=s390
> > 
> > This introduces the capability to compile the s390 relocatable kernel
> > with and without the -fPIE option.
> > 
> > When utilizing the kpatch functionality, it is advisable to compile the
> > kernel without the -fPIE option. This is particularly important if the
> > kernel is built with the -ffunction-sections and -fdata-sections flags.
> > The linker imposes a restriction on the number of sections (limited to
> > 64k), necessitating the omission of -fPIE.
> > 
> > [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622872.html
> > [2] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/625986.html
> > 
> > Gcc recently implemented an optimization [1] for loading symbols without
> > explicit alignment, aligning with the IBM Z ELF ABI. This ABI mandates
> > symbols to reside on a 2-byte boundary, enabling the use of the larl
> > instruction. However, kernel linker scripts may still generate unaligned
> > symbols. To address this, a new -munaligned-symbols option has been
> > introduced [2] in recent gcc versions. This option has to be used with
> > future gcc versions.
> > 
> > Older Clang lacks support for handling unaligned symbols generated
> > by kernel linker scripts when the kernel is built without -fPIE. However,
> > future versions of Clang will include support for the -munaligned-symbols
> > option. When the support is unavailable, compile the kernel with -fPIE
> > to maintain the existing behavior.
> > 
> > Patch 1 filters out -munaligned-symbol flag for vdso code. This is beneficial
> > when compiling kernel with -fno-PIE and -munaligned-symbols combination.
> > 
> > Patch 2 introduces the 'relocs' tool, which reads the vmlinux file and
> > generates a vmlinux.relocs_64 section, containing offsets for all
> > R_390_64 relocations.
> > 
> > Patch 3 enables the compilation of a relocatable kernel with or without
> > the -fPIE option. It  allows for building the relocatable kernel without
> > -fPIE.  However, if compiler cannot handle unaligned symbols, the kernel
> > is built with -fPIE.
> > 
> > Patch 4 handles orphan .rela sections when kernel is built with
> > -fno-PIE.
> > 
> > kpatch tools changes:
> > * -mno-pic-data-is-text-relative prevents relative addressing between
> >   code and data. This is needed to avoid relocation error when klp text
> >   and data are too far apart. kpatch already includes this flag.
> >   However, with these changes, ARCH_KFLAGS+="-fPIC" should be added to
> >   s390 kpatch tools, As -mno-pic-data-is-text-relative can be used only
> >   with -fPIC. The corresponding pull request will be sent to kpatch
> >   tools.
> 
> Hi Sumanth,
> 
> I noticed interesting compiler differences when adding -fPIC build
> option and not.  The difference in resulting output can confuse
> kpatch-build when it tries to verify that its reference build (with the
> mentioned options, plus --ffunction-sections and -fdata-sections),
> doesn't line up closely enough with the original vmlinux source (sans
> all these options).

Hi Joe,

kpatch for s390 already uses extra compiler flag -mno-pic-data-is-text-relative
inorder to prevent relative addressing between code and data. Also,
includes -ffunction-sections and -fdata-sections along with it to identify
modified functions and its relocations.

Both the source code and modified code are built with the same
options during kpatch-build (-fPIC added to
-mno-pic-data-is-text-relative). kpatch-build was able to identify
modified functions and its associated relocations and include these
changes in the final kpatch module.

May be I am missing some info: Does this deviation cause confusion to kpatch?

Adding -fPIC flag would end up with extra indirections via GOT for
eg: when accessing global vars. But, this is similar in case of previous
kpatch build as well (-fPIE + -mno-pic-data-is-text-relative).

Rela comparision which I performed on .rela.text.mmput:
1. with -fno-PIE kernel:
.rela.text.mmput:
0000000000000022  000001a800000014 R_390_PLT32DBL         0000000000000000 __cond_resched + 2
0000000000000040  0000017800000013 R_390_PC32DBL          0000000000000000 __s390_indirect_jump_r14 + 2
000000000000004a  000001a000000014 R_390_PLT32DBL         0000000000000000 uprobe_clear_state + 2
0000000000000054  000001a100000014 R_390_PLT32DBL         0000000000000000 exit_aio + 2
000000000000006c  000001a200000014 R_390_PLT32DBL         0000000000000000 exit_mmap + 2
000000000000009a  000001a300000014 R_390_PLT32DBL         0000000000000000 fput + 2
00000000000000b8  000001a400000013 R_390_PC32DBL          0000000000000000 mmlist_lock + 2  <<<
00000000000000cc  000001a500000014 R_390_PLT32DBL         0000000000000000 __list_del_entry_valid_or_report + 2
0000000000000100  000001a400000013 R_390_PC32DBL          0000000000000000 mmlist_lock + 4
000000000000011e  000001a600000014 R_390_PLT32DBL         0000000000000000 module_put + 2
0000000000000140  0000016f00000014 R_390_PLT32DBL         0000000000000000 __mmdrop + 2
000000000000014a  000001a700000014 R_390_PLT32DBL         0000000000000000 __ksm_exit + 2
0000000000000154  0000019700000014 R_390_PLT32DBL         0000000000000000 arch_spin_lock_wait + 2

2. With -fPIC flag added:
With Relocation section '.rela.text.mmput' at offset 0x84c80 contains 12 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000022  000001d500000014 R_390_PLT32DBL         0000000000000000 __cond_resched + 2
0000000000000040  0000017800000013 R_390_PC32DBL          0000000000000000 __s390_indirect_jump_r14 + 2
000000000000004a  000001d600000014 R_390_PLT32DBL         0000000000000000 uprobe_clear_state + 2
0000000000000054  000001d700000014 R_390_PLT32DBL         0000000000000000 exit_aio + 2
0000000000000070  000001d800000014 R_390_PLT32DBL         0000000000000000 exit_mmap + 2
000000000000007e  000001d300000014 R_390_PLT32DBL         0000000000000000 set_mm_exe_file + 2
0000000000000090  000001d90000001a R_390_GOTENT           0000000000000000 mmlist_lock + 2     <<< Difference GOTENT
00000000000000ac  000001da00000014 R_390_PLT32DBL         0000000000000000 __list_del_entry_valid_or_report + 2
00000000000000f8  000001db00000014 R_390_PLT32DBL         0000000000000000 module_put + 2
000000000000011a  0000016f00000014 R_390_PLT32DBL         0000000000000000 __mmdrop + 2
0000000000000124  000001dc00000014 R_390_PLT32DBL         0000000000000000 __ksm_exit + 2
0000000000000132  0000019500000014 R_390_PLT32DBL         0000000000000000 arch_spin_lock_wait + 2

3. built with -fPIE commit + -mno-pic-data-is-text-relative (Previous kpatch version):
Relocation section '.rela.text.__mmput' at offset 0x95bc8 contains 13 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
000000000000001c  000001c000000014 R_390_PLT32DBL         0000000000000000 uprobe_clear_state + 2 
0000000000000026  000001c100000014 R_390_PLT32DBL         0000000000000000 exit_aio + 2 
000000000000004c  000001c200000014 R_390_PLT32DBL         0000000000000000 exit_mmap + 2
0000000000000056  000001c300000014 R_390_PLT32DBL         0000000000000000 mm_put_huge_zero_page + 2
0000000000000084  000001c400000014 R_390_PLT32DBL         0000000000000000 fput + 2
000000000000009a  000001c50000001a R_390_GOTENT           0000000000000000 mmlist_lock + 2   <<<<
0000000000000106  000001c600000014 R_390_PLT32DBL         0000000000000000 module_put + 2
0000000000000124  0000019000000013 R_390_PC32DBL          0000000000000000 __s390_indirect_jump_r14 + 2
000000000000012e  000001c700000014 R_390_PLT32DBL         0000000000000000 __khugepaged_exit + 2
000000000000013c  000001c800000014 R_390_PLT32DBL         0000000000000000 __ksm_exit + 2 
000000000000014a  000001c900000014 R_390_PLT32DBL         0000000000000000 __list_del_entry_valid_or_report + 2
0000000000000158  000001a200000014 R_390_PLT32DBL         0000000000000000 arch_spin_lock_wait + 2
000000000000016c  0000018700000014 R_390_PLT32DBL         0000000000000000 __mmdrop + 2 


> 
> I don't think a kpatch-build PR was ever opened to add "-fPIC", but the
> compiler now warns that its required.  Have you ever seen optimization /
> build output changes when adding this option and if so, were there
> additional kpatch-build implications?

I missed adding "-fPIC" KCFLAGS in kpatch tool. I will send a PR request.

I expect and assume similar implications with (-fPIC added)
vs (-fPIE + -mno-pic-data-is-text-relative) build, as shown above in the rela
comparisions.

Other Note:
The latest kernel is built with -fPIC and linked with -no-pie (reference
commit: ca888b17da9b ("s390: Compile kernel with -fPIC and link with
-no-pie")) which also avoids generation of dynamic symbols and helps
kpatch usecases (when num of sections >=64k sections).  Also the build
options would be similar (-fPIC in kernel and -fPIC in kpatch-build)

For latest kernel, there is no need to add explicit -fPIC again
in kpatch tool.

But for the intermediate commits, yes, makes sense to add
it in kpatch-build tools and will create one PR.

> Here is a quick example that I stumbled on while investigating the
> kpatch-build shadow-newpid.patch integration test that modifies
> kernel/fork.c.  I couldn't reproduce with the s390 defconfig, but it
> shows up when building with RHEL-10 config.  Reproducer steps and
> disassembly examples follows.

Sorry Joe, I didnt get this question. "I couldnt reproduce with the s390
defconfig". Could you please clarify it ?

Thank you,
Sumanth

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/4] s390: compile relocatable kernel with/without fPIE
  2024-06-19 17:01   ` Sumanth Korikkar
@ 2024-06-19 18:23     ` Joe Lawrence
  2024-06-21  7:14       ` Sumanth Korikkar
  2024-06-21 16:59       ` Joe Lawrence
  0 siblings, 2 replies; 14+ messages in thread
From: Joe Lawrence @ 2024-06-19 18:23 UTC (permalink / raw)
  To: Sumanth Korikkar; +Cc: linux-s390, hca, jpoimboe, gor, iii, agordeev

On 6/19/24 13:01, Sumanth Korikkar wrote:
> On Tue, Jun 18, 2024 at 04:37:16PM -0400, Joe Lawrence wrote:
>> On Mon, Feb 19, 2024 at 02:27:30PM +0100, Sumanth Korikkar wrote:
>>> Hi All,
>>>
>>> This is a rebased version of Josh's patch series with a few fixups.
>>> https://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git/log/?h=s390
>>>
>>> This introduces the capability to compile the s390 relocatable kernel
>>> with and without the -fPIE option.
>>>
>>> When utilizing the kpatch functionality, it is advisable to compile the
>>> kernel without the -fPIE option. This is particularly important if the
>>> kernel is built with the -ffunction-sections and -fdata-sections flags.
>>> The linker imposes a restriction on the number of sections (limited to
>>> 64k), necessitating the omission of -fPIE.
>>>
>>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622872.html
>>> [2] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/625986.html
>>>
>>> Gcc recently implemented an optimization [1] for loading symbols without
>>> explicit alignment, aligning with the IBM Z ELF ABI. This ABI mandates
>>> symbols to reside on a 2-byte boundary, enabling the use of the larl
>>> instruction. However, kernel linker scripts may still generate unaligned
>>> symbols. To address this, a new -munaligned-symbols option has been
>>> introduced [2] in recent gcc versions. This option has to be used with
>>> future gcc versions.
>>>
>>> Older Clang lacks support for handling unaligned symbols generated
>>> by kernel linker scripts when the kernel is built without -fPIE. However,
>>> future versions of Clang will include support for the -munaligned-symbols
>>> option. When the support is unavailable, compile the kernel with -fPIE
>>> to maintain the existing behavior.
>>>
>>> Patch 1 filters out -munaligned-symbol flag for vdso code. This is beneficial
>>> when compiling kernel with -fno-PIE and -munaligned-symbols combination.
>>>
>>> Patch 2 introduces the 'relocs' tool, which reads the vmlinux file and
>>> generates a vmlinux.relocs_64 section, containing offsets for all
>>> R_390_64 relocations.
>>>
>>> Patch 3 enables the compilation of a relocatable kernel with or without
>>> the -fPIE option. It  allows for building the relocatable kernel without
>>> -fPIE.  However, if compiler cannot handle unaligned symbols, the kernel
>>> is built with -fPIE.
>>>
>>> Patch 4 handles orphan .rela sections when kernel is built with
>>> -fno-PIE.
>>>
>>> kpatch tools changes:
>>> * -mno-pic-data-is-text-relative prevents relative addressing between
>>>   code and data. This is needed to avoid relocation error when klp text
>>>   and data are too far apart. kpatch already includes this flag.
>>>   However, with these changes, ARCH_KFLAGS+="-fPIC" should be added to
>>>   s390 kpatch tools, As -mno-pic-data-is-text-relative can be used only
>>>   with -fPIC. The corresponding pull request will be sent to kpatch
>>>   tools.
>>
>> Hi Sumanth,
>>
>> I noticed interesting compiler differences when adding -fPIC build
>> option and not.  The difference in resulting output can confuse
>> kpatch-build when it tries to verify that its reference build (with the
>> mentioned options, plus --ffunction-sections and -fdata-sections),
>> doesn't line up closely enough with the original vmlinux source (sans
>> all these options).
> 
> Hi Joe,
> 
> kpatch for s390 already uses extra compiler flag -mno-pic-data-is-text-relative
> inorder to prevent relative addressing between code and data. Also,
> includes -ffunction-sections and -fdata-sections along with it to identify
> modified functions and its relocations.
> 
> Both the source code and modified code are built with the same
> options during kpatch-build (-fPIC added to
> -mno-pic-data-is-text-relative). kpatch-build was able to identify
> modified functions and its associated relocations and include these
> changes in the final kpatch module.
> 
> May be I am missing some info: Does this deviation cause confusion to kpatch?
> 

Hi Sumanth,

Yes, in the example I provided, the __mmput() function is only inlined
in kpatch builds, but not the builds that create the target vmlinux.
Here is a reproducer tarball that you can try against a local
create-diff-object binary:

https://file.rdu.redhat.com/~jolawren/repro-s390x-shadow-newpid.tar.gz

create-diff-object: ERROR: fork.ORIG.o: find_local_syms: 222: couldn't
find matching fork.c local symbols in vmlinux symbol table.

> Adding -fPIC flag would end up with extra indirections via GOT for
> eg: when accessing global vars. But, this is similar in case of previous
> kpatch build as well (-fPIE + -mno-pic-data-is-text-relative).
> 
> Rela comparision which I performed on .rela.text.mmput:
> 1. with -fno-PIE kernel:
> .rela.text.mmput:
> 0000000000000022  000001a800000014 R_390_PLT32DBL         0000000000000000 __cond_resched + 2
> 0000000000000040  0000017800000013 R_390_PC32DBL          0000000000000000 __s390_indirect_jump_r14 + 2
> 000000000000004a  000001a000000014 R_390_PLT32DBL         0000000000000000 uprobe_clear_state + 2
> 0000000000000054  000001a100000014 R_390_PLT32DBL         0000000000000000 exit_aio + 2
> 000000000000006c  000001a200000014 R_390_PLT32DBL         0000000000000000 exit_mmap + 2
> 000000000000009a  000001a300000014 R_390_PLT32DBL         0000000000000000 fput + 2
> 00000000000000b8  000001a400000013 R_390_PC32DBL          0000000000000000 mmlist_lock + 2  <<<
> 00000000000000cc  000001a500000014 R_390_PLT32DBL         0000000000000000 __list_del_entry_valid_or_report + 2
> 0000000000000100  000001a400000013 R_390_PC32DBL          0000000000000000 mmlist_lock + 4
> 000000000000011e  000001a600000014 R_390_PLT32DBL         0000000000000000 module_put + 2
> 0000000000000140  0000016f00000014 R_390_PLT32DBL         0000000000000000 __mmdrop + 2
> 000000000000014a  000001a700000014 R_390_PLT32DBL         0000000000000000 __ksm_exit + 2
> 0000000000000154  0000019700000014 R_390_PLT32DBL         0000000000000000 arch_spin_lock_wait + 2
> 
> 2. With -fPIC flag added:
> With Relocation section '.rela.text.mmput' at offset 0x84c80 contains 12 entries:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 0000000000000022  000001d500000014 R_390_PLT32DBL         0000000000000000 __cond_resched + 2
> 0000000000000040  0000017800000013 R_390_PC32DBL          0000000000000000 __s390_indirect_jump_r14 + 2
> 000000000000004a  000001d600000014 R_390_PLT32DBL         0000000000000000 uprobe_clear_state + 2
> 0000000000000054  000001d700000014 R_390_PLT32DBL         0000000000000000 exit_aio + 2
> 0000000000000070  000001d800000014 R_390_PLT32DBL         0000000000000000 exit_mmap + 2
> 000000000000007e  000001d300000014 R_390_PLT32DBL         0000000000000000 set_mm_exe_file + 2
> 0000000000000090  000001d90000001a R_390_GOTENT           0000000000000000 mmlist_lock + 2     <<< Difference GOTENT
> 00000000000000ac  000001da00000014 R_390_PLT32DBL         0000000000000000 __list_del_entry_valid_or_report + 2
> 00000000000000f8  000001db00000014 R_390_PLT32DBL         0000000000000000 module_put + 2
> 000000000000011a  0000016f00000014 R_390_PLT32DBL         0000000000000000 __mmdrop + 2
> 0000000000000124  000001dc00000014 R_390_PLT32DBL         0000000000000000 __ksm_exit + 2
> 0000000000000132  0000019500000014 R_390_PLT32DBL         0000000000000000 arch_spin_lock_wait + 2
> 
> 3. built with -fPIE commit + -mno-pic-data-is-text-relative (Previous kpatch version):
> Relocation section '.rela.text.__mmput' at offset 0x95bc8 contains 13 entries:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 000000000000001c  000001c000000014 R_390_PLT32DBL         0000000000000000 uprobe_clear_state + 2 
> 0000000000000026  000001c100000014 R_390_PLT32DBL         0000000000000000 exit_aio + 2 
> 000000000000004c  000001c200000014 R_390_PLT32DBL         0000000000000000 exit_mmap + 2
> 0000000000000056  000001c300000014 R_390_PLT32DBL         0000000000000000 mm_put_huge_zero_page + 2
> 0000000000000084  000001c400000014 R_390_PLT32DBL         0000000000000000 fput + 2
> 000000000000009a  000001c50000001a R_390_GOTENT           0000000000000000 mmlist_lock + 2   <<<<
> 0000000000000106  000001c600000014 R_390_PLT32DBL         0000000000000000 module_put + 2
> 0000000000000124  0000019000000013 R_390_PC32DBL          0000000000000000 __s390_indirect_jump_r14 + 2
> 000000000000012e  000001c700000014 R_390_PLT32DBL         0000000000000000 __khugepaged_exit + 2
> 000000000000013c  000001c800000014 R_390_PLT32DBL         0000000000000000 __ksm_exit + 2 
> 000000000000014a  000001c900000014 R_390_PLT32DBL         0000000000000000 __list_del_entry_valid_or_report + 2
> 0000000000000158  000001a200000014 R_390_PLT32DBL         0000000000000000 arch_spin_lock_wait + 2
> 000000000000016c  0000018700000014 R_390_PLT32DBL         0000000000000000 __mmdrop + 2 
> 
> 
>>
>> I don't think a kpatch-build PR was ever opened to add "-fPIC", but the
>> compiler now warns that its required.  Have you ever seen optimization /
>> build output changes when adding this option and if so, were there
>> additional kpatch-build implications?
> 
> I missed adding "-fPIC" KCFLAGS in kpatch tool. I will send a PR request.
> 
> I expect and assume similar implications with (-fPIC added)
> vs (-fPIE + -mno-pic-data-is-text-relative) build, as shown above in the rela
> comparisions.
> 

I don't have much personal mileage with kpatch on s390x, so you may be
right that this is not unique to -fPIC.  I suppose a workaround could be
to force (non)inline in the input kpatch to match the target vmlinux.

> Other Note:
> The latest kernel is built with -fPIC and linked with -no-pie (reference
> commit: ca888b17da9b ("s390: Compile kernel with -fPIC and link with
> -no-pie")) which also avoids generation of dynamic symbols and helps
> kpatch usecases (when num of sections >=64k sections).  Also the build
> options would be similar (-fPIC in kernel and -fPIC in kpatch-build)
> 
> For latest kernel, there is no need to add explicit -fPIC again
> in kpatch tool.
> 
> But for the intermediate commits, yes, makes sense to add
> it in kpatch-build tools and will create one PR.
> 

Interesting!  With 00cda11d3b2e ("s390: Compile kernel with -fPIC and
link with -no-pie") it sounds like the original vmlinux would be built
with -fPIC as well, so the optimization decisions re: __mmput() would
likely be the same.  I can retry the tests with v6.10-rcX to verify.

>> Here is a quick example that I stumbled on while investigating the
>> kpatch-build shadow-newpid.patch integration test that modifies
>> kernel/fork.c.  I couldn't reproduce with the s390 defconfig, but it
>> shows up when building with RHEL-10 config.  Reproducer steps and
>> disassembly examples follows.
> 
> Sorry Joe, I didnt get this question. "I couldnt reproduce with the s390
> defconfig". Could you please clarify it ?
> 

In the email reproducer steps, if I ran `make defconfig` and tried the
two builds of fork.o, __mmput() inlining didn't change.  Only when I
added the RHEL-10 s390x config could I reproduce.  I didn't want to
spend time hunting down the Red Hat specific CONFIG setting(s) that
induced the build differences as I assumed there may be an obvious
reason why this was happening.

-- 
Joe


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/4] s390: compile relocatable kernel with/without fPIE
  2024-06-19 18:23     ` Joe Lawrence
@ 2024-06-21  7:14       ` Sumanth Korikkar
  2024-06-21 11:32         ` Joe Lawrence
  2024-06-21 16:59       ` Joe Lawrence
  1 sibling, 1 reply; 14+ messages in thread
From: Sumanth Korikkar @ 2024-06-21  7:14 UTC (permalink / raw)
  To: Joe Lawrence; +Cc: linux-s390, hca, jpoimboe, gor, iii, agordeev

On Wed, Jun 19, 2024 at 02:23:49PM -0400, Joe Lawrence wrote:
> On 6/19/24 13:01, Sumanth Korikkar wrote:
> > On Tue, Jun 18, 2024 at 04:37:16PM -0400, Joe Lawrence wrote:
> >> On Mon, Feb 19, 2024 at 02:27:30PM +0100, Sumanth Korikkar wrote:
> >>> Hi All,
> >>>
> >>> This is a rebased version of Josh's patch series with a few fixups.
> >>> https://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git/log/?h=s390
> >>>
> >>> This introduces the capability to compile the s390 relocatable kernel
> >>> with and without the -fPIE option.
> >>>
> >>> When utilizing the kpatch functionality, it is advisable to compile the
> >>> kernel without the -fPIE option. This is particularly important if the
> >>> kernel is built with the -ffunction-sections and -fdata-sections flags.
> >>> The linker imposes a restriction on the number of sections (limited to
> >>> 64k), necessitating the omission of -fPIE.
> >>>
> >>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622872.html
> >>> [2] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/625986.html
> >>>
> >>> Gcc recently implemented an optimization [1] for loading symbols without
> >>> explicit alignment, aligning with the IBM Z ELF ABI. This ABI mandates
> >>> symbols to reside on a 2-byte boundary, enabling the use of the larl
> >>> instruction. However, kernel linker scripts may still generate unaligned
> >>> symbols. To address this, a new -munaligned-symbols option has been
> >>> introduced [2] in recent gcc versions. This option has to be used with
> >>> future gcc versions.
> >>>
> >>> Older Clang lacks support for handling unaligned symbols generated
> >>> by kernel linker scripts when the kernel is built without -fPIE. However,
> >>> future versions of Clang will include support for the -munaligned-symbols
> >>> option. When the support is unavailable, compile the kernel with -fPIE
> >>> to maintain the existing behavior.
> >>>
> >>> Patch 1 filters out -munaligned-symbol flag for vdso code. This is beneficial
> >>> when compiling kernel with -fno-PIE and -munaligned-symbols combination.
> >>>
> >>> Patch 2 introduces the 'relocs' tool, which reads the vmlinux file and
> >>> generates a vmlinux.relocs_64 section, containing offsets for all
> >>> R_390_64 relocations.
> >>>
> >>> Patch 3 enables the compilation of a relocatable kernel with or without
> >>> the -fPIE option. It  allows for building the relocatable kernel without
> >>> -fPIE.  However, if compiler cannot handle unaligned symbols, the kernel
> >>> is built with -fPIE.
> >>>
> >>> Patch 4 handles orphan .rela sections when kernel is built with
> >>> -fno-PIE.
> >>>
> >>> kpatch tools changes:
> >>> * -mno-pic-data-is-text-relative prevents relative addressing between
> >>>   code and data. This is needed to avoid relocation error when klp text
> >>>   and data are too far apart. kpatch already includes this flag.
> >>>   However, with these changes, ARCH_KFLAGS+="-fPIC" should be added to
> >>>   s390 kpatch tools, As -mno-pic-data-is-text-relative can be used only
> >>>   with -fPIC. The corresponding pull request will be sent to kpatch
> >>>   tools.
> >>
> >> Hi Sumanth,
> >>
> >> I noticed interesting compiler differences when adding -fPIC build
> >> option and not.  The difference in resulting output can confuse
> >> kpatch-build when it tries to verify that its reference build (with the
> >> mentioned options, plus --ffunction-sections and -fdata-sections),
> >> doesn't line up closely enough with the original vmlinux source (sans
> >> all these options).
> > 
> > Hi Joe,
> > 
> > kpatch for s390 already uses extra compiler flag -mno-pic-data-is-text-relative
> > inorder to prevent relative addressing between code and data. Also,
> > includes -ffunction-sections and -fdata-sections along with it to identify
> > modified functions and its relocations.
> > 
> > Both the source code and modified code are built with the same
> > options during kpatch-build (-fPIC added to
> > -mno-pic-data-is-text-relative). kpatch-build was able to identify
> > modified functions and its associated relocations and include these
> > changes in the final kpatch module.
> > 
> > May be I am missing some info: Does this deviation cause confusion to kpatch?
> > 
> 
> Hi Sumanth,
> 
> Yes, in the example I provided, the __mmput() function is only inlined
> in kpatch builds, but not the builds that create the target vmlinux.
> Here is a reproducer tarball that you can try against a local
> create-diff-object binary:
> 
> https://file.rdu.redhat.com/~jolawren/repro-s390x-shadow-newpid.tar.gz
> 
> create-diff-object: ERROR: fork.ORIG.o: find_local_syms: 222: couldn't
> find matching fork.c local symbols in vmlinux symbol table.

Hi Joe,

I tried to download the tarball and rhel config. Both are unreachable.
Failed to resolve 'file.rdu.redhat.com' (Name or service not known)

Could you please provide alternative link to it?

Thank you,
Sumanth

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/4] s390: compile relocatable kernel with/without fPIE
  2024-06-21  7:14       ` Sumanth Korikkar
@ 2024-06-21 11:32         ` Joe Lawrence
  0 siblings, 0 replies; 14+ messages in thread
From: Joe Lawrence @ 2024-06-21 11:32 UTC (permalink / raw)
  To: Sumanth Korikkar
  Cc: linux-s390, hca, jpoimboe, gor, iii, agordeev, Joe Lawrence

On 6/21/24 03:14, Sumanth Korikkar wrote:
> On Wed, Jun 19, 2024 at 02:23:49PM -0400, Joe Lawrence wrote:
>> On 6/19/24 13:01, Sumanth Korikkar wrote:
>>> On Tue, Jun 18, 2024 at 04:37:16PM -0400, Joe Lawrence wrote:
>>>> On Mon, Feb 19, 2024 at 02:27:30PM +0100, Sumanth Korikkar wrote:
>>>>> Hi All,
>>>>>
>>>>> This is a rebased version of Josh's patch series with a few fixups.
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git/log/?h=s390
>>>>>
>>>>> This introduces the capability to compile the s390 relocatable kernel
>>>>> with and without the -fPIE option.
>>>>>
>>>>> When utilizing the kpatch functionality, it is advisable to compile the
>>>>> kernel without the -fPIE option. This is particularly important if the
>>>>> kernel is built with the -ffunction-sections and -fdata-sections flags.
>>>>> The linker imposes a restriction on the number of sections (limited to
>>>>> 64k), necessitating the omission of -fPIE.
>>>>>
>>>>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622872.html
>>>>> [2] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/625986.html
>>>>>
>>>>> Gcc recently implemented an optimization [1] for loading symbols without
>>>>> explicit alignment, aligning with the IBM Z ELF ABI. This ABI mandates
>>>>> symbols to reside on a 2-byte boundary, enabling the use of the larl
>>>>> instruction. However, kernel linker scripts may still generate unaligned
>>>>> symbols. To address this, a new -munaligned-symbols option has been
>>>>> introduced [2] in recent gcc versions. This option has to be used with
>>>>> future gcc versions.
>>>>>
>>>>> Older Clang lacks support for handling unaligned symbols generated
>>>>> by kernel linker scripts when the kernel is built without -fPIE. However,
>>>>> future versions of Clang will include support for the -munaligned-symbols
>>>>> option. When the support is unavailable, compile the kernel with -fPIE
>>>>> to maintain the existing behavior.
>>>>>
>>>>> Patch 1 filters out -munaligned-symbol flag for vdso code. This is beneficial
>>>>> when compiling kernel with -fno-PIE and -munaligned-symbols combination.
>>>>>
>>>>> Patch 2 introduces the 'relocs' tool, which reads the vmlinux file and
>>>>> generates a vmlinux.relocs_64 section, containing offsets for all
>>>>> R_390_64 relocations.
>>>>>
>>>>> Patch 3 enables the compilation of a relocatable kernel with or without
>>>>> the -fPIE option. It  allows for building the relocatable kernel without
>>>>> -fPIE.  However, if compiler cannot handle unaligned symbols, the kernel
>>>>> is built with -fPIE.
>>>>>
>>>>> Patch 4 handles orphan .rela sections when kernel is built with
>>>>> -fno-PIE.
>>>>>
>>>>> kpatch tools changes:
>>>>> * -mno-pic-data-is-text-relative prevents relative addressing between
>>>>>   code and data. This is needed to avoid relocation error when klp text
>>>>>   and data are too far apart. kpatch already includes this flag.
>>>>>   However, with these changes, ARCH_KFLAGS+="-fPIC" should be added to
>>>>>   s390 kpatch tools, As -mno-pic-data-is-text-relative can be used only
>>>>>   with -fPIC. The corresponding pull request will be sent to kpatch
>>>>>   tools.
>>>>
>>>> Hi Sumanth,
>>>>
>>>> I noticed interesting compiler differences when adding -fPIC build
>>>> option and not.  The difference in resulting output can confuse
>>>> kpatch-build when it tries to verify that its reference build (with the
>>>> mentioned options, plus --ffunction-sections and -fdata-sections),
>>>> doesn't line up closely enough with the original vmlinux source (sans
>>>> all these options).
>>>
>>> Hi Joe,
>>>
>>> kpatch for s390 already uses extra compiler flag -mno-pic-data-is-text-relative
>>> inorder to prevent relative addressing between code and data. Also,
>>> includes -ffunction-sections and -fdata-sections along with it to identify
>>> modified functions and its relocations.
>>>
>>> Both the source code and modified code are built with the same
>>> options during kpatch-build (-fPIC added to
>>> -mno-pic-data-is-text-relative). kpatch-build was able to identify
>>> modified functions and its associated relocations and include these
>>> changes in the final kpatch module.
>>>
>>> May be I am missing some info: Does this deviation cause confusion to kpatch?
>>>
>>
>> Hi Sumanth,
>>
>> Yes, in the example I provided, the __mmput() function is only inlined
>> in kpatch builds, but not the builds that create the target vmlinux.
>> Here is a reproducer tarball that you can try against a local
>> create-diff-object binary:
>>
>> https://file.rdu.redhat.com/~jolawren/repro-s390x-shadow-newpid.tar.gz
>>
>> create-diff-object: ERROR: fork.ORIG.o: find_local_syms: 222: couldn't
>> find matching fork.c local symbols in vmlinux symbol table.
> 
> Hi Joe,
> 
> I tried to download the tarball and rhel config. Both are unreachable.
> Failed to resolve 'file.rdu.redhat.com' (Name or service not known)
> 
> Could you please provide alternative link to it?
> 

D'oh, I uploaded them to our internal filespace.  Try these links, they
should be visible to everyone:

https://people.redhat.com/~jolawren/kernel-s390x-rhel.config
https://people.redhat.com/~jolawren/repro-s390x-special-static.tar.gz

-- 
Joe


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/4] s390: compile relocatable kernel with/without fPIE
  2024-06-19 18:23     ` Joe Lawrence
  2024-06-21  7:14       ` Sumanth Korikkar
@ 2024-06-21 16:59       ` Joe Lawrence
  2024-06-24  9:59         ` Sumanth Korikkar
  1 sibling, 1 reply; 14+ messages in thread
From: Joe Lawrence @ 2024-06-21 16:59 UTC (permalink / raw)
  To: Sumanth Korikkar; +Cc: linux-s390, hca, jpoimboe, gor, iii, agordeev

On 6/19/24 14:23, Joe Lawrence wrote:
> On 6/19/24 13:01, Sumanth Korikkar wrote:
>> Other Note:
>> The latest kernel is built with -fPIC and linked with -no-pie (reference
>> commit: ca888b17da9b ("s390: Compile kernel with -fPIC and link with
>> -no-pie")) which also avoids generation of dynamic symbols and helps
>> kpatch usecases (when num of sections >=64k sections).  Also the build
>> options would be similar (-fPIC in kernel and -fPIC in kpatch-build)
>>
>> For latest kernel, there is no need to add explicit -fPIC again
>> in kpatch tool.
>>
>> But for the intermediate commits, yes, makes sense to add
>> it in kpatch-build tools and will create one PR.
>>
> 
> Interesting!  With 00cda11d3b2e ("s390: Compile kernel with -fPIC and
> link with -no-pie") it sounds like the original vmlinux would be built
> with -fPIC as well, so the optimization decisions re: __mmput() would
> likely be the same.  I can retry the tests with v6.10-rcX to verify.
> 

To follow up, all of the kpatch-build integration tests work with
v6.10.0-rc4 :) as the kernel is built with -fPIC and so are the kpatch
reference and patched builds.  For pre-v6.10 kernels, I think there may
be some instances where a patch author may need to account for slight
build differences to appease kpatch-build expectations as I noticed here.

-- 
Joe


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/4] s390: compile relocatable kernel with/without fPIE
  2024-06-21 16:59       ` Joe Lawrence
@ 2024-06-24  9:59         ` Sumanth Korikkar
  0 siblings, 0 replies; 14+ messages in thread
From: Sumanth Korikkar @ 2024-06-24  9:59 UTC (permalink / raw)
  To: Joe Lawrence; +Cc: linux-s390, hca, jpoimboe, gor, iii, agordeev

On Fri, Jun 21, 2024 at 12:59:21PM -0400, Joe Lawrence wrote:
> On 6/19/24 14:23, Joe Lawrence wrote:
> > On 6/19/24 13:01, Sumanth Korikkar wrote:
> >> Other Note:
> >> The latest kernel is built with -fPIC and linked with -no-pie (reference
> >> commit: ca888b17da9b ("s390: Compile kernel with -fPIC and link with
> >> -no-pie")) which also avoids generation of dynamic symbols and helps
> >> kpatch usecases (when num of sections >=64k sections).  Also the build
> >> options would be similar (-fPIC in kernel and -fPIC in kpatch-build)
> >>
> >> For latest kernel, there is no need to add explicit -fPIC again
> >> in kpatch tool.
> >>
> >> But for the intermediate commits, yes, makes sense to add
> >> it in kpatch-build tools and will create one PR.
> >>
> > 
> > Interesting!  With 00cda11d3b2e ("s390: Compile kernel with -fPIC and
> > link with -no-pie") it sounds like the original vmlinux would be built
> > with -fPIC as well, so the optimization decisions re: __mmput() would
> > likely be the same.  I can retry the tests with v6.10-rcX to verify.
> > 
> 
> To follow up, all of the kpatch-build integration tests work with
> v6.10.0-rc4 :) as the kernel is built with -fPIC and so are the kpatch
> reference and patched builds.  For pre-v6.10 kernels, I think there may
> be some instances where a patch author may need to account for slight
> build differences to appease kpatch-build expectations as I noticed here.
> 
> -- 
> Joe
>
Hi Joe,

I tried bisecting provided rhel config with defconfig, but didnt
succeed yet.

However, I tried using KCFLAGS="-fPIE -mno-pic-data-is-text-relative"
instead of KCFLAGS="-fPIC -mno-pic-data-is-text-relative" and it
generated similar output.

* objdump -Dr -j.text --disassemble=mmput kernel/fork.o

Disassembly of section .text:

00000000000011d0 <mmput>:
    11d0:       c0 04 00 00 00 00       jgnop   11d0 <mmput>
    11d6:       a7 18 ff ff             lhi     %r1,-1
    11da:       eb 01 21 3c 00 f8       laa     %r0,%r1,316(%r2)
    11e0:       07 e0                   bnor    %r0
    11e2:       ec 08 00 06 01 7e       cije    %r0,1,11ee <mmput+0x1e>
    11e8:       c0 f4 00 00 00 00       jg      11e8 <mmput+0x18>
                        11ea: R_390_PC32DBL     __s390_indirect_jump_r14+0x2
    11ee:       c0 f4 00 00 00 00       jg      11ee <mmput+0x1e>
                        11f0: R_390_PLT32DBL    __mmput+0x2

* KCFLAGS="-fPIE -mno-pic-data-is-text-relative -fno-section-anchors" \ 
  make -s -j$(nproc) kernel/fork.o ; 
  objdump -Dr -j.text --disassemble=mmput kernel/fork.o

kernel/fork.o:     file format elf64-s390


Disassembly of section .text:

0000000000001230 <mmput>:
    1230:       c0 04 00 00 00 00       jgnop   1230 <mmput>
    1236:       a7 18 ff ff             lhi     %r1,-1
    123a:       eb 01 21 3c 00 f8       laa     %r0,%r1,316(%r2)
    1240:       07 e0                   bnor    %r0
    1242:       ec 08 00 06 01 7e       cije    %r0,1,124e <mmput+0x1e>
    1248:       c0 f4 00 00 00 00       jg      1248 <mmput+0x18>
                        124a: R_390_PC32DBL     __s390_indirect_jump_r14+0x2
    124e:       c0 f4 00 00 00 00       jg      124e <mmput+0x1e>
                        1250: R_390_PLT32DBL    __mmput+0x2

* For commit 778666df60f0 ("s390: compile relocatable kernel without -fPIE"),
  "-fPIE -mno-pic-data-is-text-relative" could be used to prevent
  confusion to the kpatch tool. I will make changes in the kpatch pull request.
* For latest kernel, both vmlinux and kpatch-build uses -fPIC and
  hence no changes are necessary in kpatch tool.

Thank you,
Sumanth

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-06-24  9:59 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-02-19 13:27 [PATCH v2 0/4] s390: compile relocatable kernel with/without fPIE Sumanth Korikkar
2024-02-19 13:27 ` [PATCH v2 1/4] s390/vdso64: filter out munaligned-symbols flag for vdso Sumanth Korikkar
2024-02-19 13:27 ` [PATCH v2 2/4] s390: Add relocs tool Sumanth Korikkar
2024-02-19 16:11   ` Heiko Carstens
2024-02-19 13:27 ` [PATCH v2 3/4] s390: Compile relocatable kernel without -fPIE Sumanth Korikkar
2024-02-19 13:27 ` [PATCH v2 4/4] s390/kernel: vmlinux.lds.S: handle orphan .rela sections Sumanth Korikkar
2024-02-19 16:12 ` [PATCH v2 0/4] s390: compile relocatable kernel with/without fPIE Heiko Carstens
2024-06-18 20:37 ` Joe Lawrence
2024-06-19 17:01   ` Sumanth Korikkar
2024-06-19 18:23     ` Joe Lawrence
2024-06-21  7:14       ` Sumanth Korikkar
2024-06-21 11:32         ` Joe Lawrence
2024-06-21 16:59       ` Joe Lawrence
2024-06-24  9:59         ` Sumanth Korikkar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox