* [PATCH v3 0/5] Reuse 32 bit C code more safely
@ 2024-10-11 8:52 Frediano Ziglio
2024-10-11 8:52 ` [PATCH v3 1/5] x86/boot: create a C bundle for 32 bit boot code and use it Frediano Ziglio
` (4 more replies)
0 siblings, 5 replies; 23+ messages in thread
From: Frediano Ziglio @ 2024-10-11 8:52 UTC (permalink / raw)
To: xen-devel
Cc: Frediano Ziglio, Jan Beulich, Andrew Cooper, Roger Pau Monné,
Julien Grall, Stefano Stabellini, Daniel P. Smith,
Marek Marczykowski-Górecki
This series attempt to:
- use more C code, that is replace some assembly code with C;
- avoid some code duplication between C and assembly;
- prevent some issues having relocations in C code.
The idea is extending the current C to binary code conversion
done for 32 bit C code called from head.S making sure relocations
are safe and allowing external symbols usage from C code.
Note that, as an addition, scripts generating code check for no
data to allow code and data separation.
More details of the implementation are in commit message 1/5,
which is the largest patch.
Patch 2/5 reuses code to relocate the trampoline between 32 and 64 bit.
Patches 3/5 and 4/5 move some code from assembly to C.
Other RFC commits were excluded, I didn't manage to entangle the issues
sharing headers between 32 and 64 bits.
Since RFC code was more tested, also with CI and some incompatibility
were fixed. On that it's weird that we need Python 3.8 for Qemu but we
still use Python 2 for Xen. Shouldn't we bump requirement to Python 3
even for Xen?
Code boots successfully using:
- BIOS boot;
- EFI boot with Grub2 and ELF file;
- direct EFI boot without Grub.
Code is currently based on "master" branch, currently commit
2b49ef4503e2c549c166f439cf0dd331d9a8874c.
Changes since v1:
- 2 preliminary commits for adjust .gitignore;
- last commit split (one variable at a time);
- lot of style and names changes;
- first commit, now 3/6 had some build changes, more details on
commit message.
Changes since v2:
- removed merged commits;
- reverted order of 2 commits;
- remove some useless casts;
- added change to comment.
Frediano Ziglio (5):
x86/boot: create a C bundle for 32 bit boot code and use it
x86/boot: Reuse code to relocate trampoline
x86/boot: Use boot_vid_info variable directly from C code
x86/boot: Use trampoline_phys variable directly from C code
x86/boot: Clarify comment
xen/arch/x86/boot/.gitignore | 5 +-
xen/arch/x86/boot/Makefile | 62 ++++--
.../x86/boot/{build32.lds => build32.lds.S} | 49 ++++-
xen/arch/x86/boot/cmdline.c | 12 --
xen/arch/x86/boot/head.S | 49 +----
xen/arch/x86/boot/reloc-trampoline.c | 36 ++++
xen/arch/x86/boot/reloc.c | 35 +---
xen/arch/x86/efi/efi-boot.h | 15 +-
xen/tools/combine_two_binaries | 198 ++++++++++++++++++
9 files changed, 344 insertions(+), 117 deletions(-)
rename xen/arch/x86/boot/{build32.lds => build32.lds.S} (57%)
create mode 100644 xen/arch/x86/boot/reloc-trampoline.c
create mode 100755 xen/tools/combine_two_binaries
--
2.34.1
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v3 1/5] x86/boot: create a C bundle for 32 bit boot code and use it
2024-10-11 8:52 [PATCH v3 0/5] Reuse 32 bit C code more safely Frediano Ziglio
@ 2024-10-11 8:52 ` Frediano Ziglio
2024-10-11 10:57 ` [Makefile only] " Andrew Cooper
` (2 more replies)
2024-10-11 8:52 ` [PATCH v3 2/5] x86/boot: Reuse code to relocate trampoline Frediano Ziglio
` (3 subsequent siblings)
4 siblings, 3 replies; 23+ messages in thread
From: Frediano Ziglio @ 2024-10-11 8:52 UTC (permalink / raw)
To: xen-devel
Cc: Frediano Ziglio, Jan Beulich, Andrew Cooper, Roger Pau Monné,
Julien Grall, Stefano Stabellini
The current method to include 32 bit C boot code is:
- compile each function we want to use into a separate object file;
- each function is compiled with -fpic option;
- convert these object files to binary files. This operation removes GOP
which we don't want in the executable;
- a small assembly part in each file add the entry point;
- code can't have external references, all possible variables are passed
by value or pointer;
- include these binary files in head.S.
There are currently some limitations:
- code is compiled separately, it's not possible to share a function
(like memcpy) between different functions to use;
- although code is compiled with -fpic there's no certainty there are
no relocations, specifically data ones. This can lead into hard to
find bugs;
- it's hard to add a simple function;
- having to pass external variables makes hard to do multiple things
otherwise functions would require a lot of parameters so code would
have to be split into multiple functions which is not easy;
- we generate a single text section containing data and code, not a
problem at the moment but if we want to add W^X protection it's
not helpful.
Current change extends the current process:
- all object files are linked together before getting converted making
possible to share code between the function we want to call;
- a single object file is generated with all functions to use and
exported symbols to easily call;
- variables to use are declared in linker script and easily used inside
C code. Declaring them manually could be annoying but makes also
easier to check them. Using external pointers can be still an issue if
they are not fixed. If an external symbol is not declared this gives a
link error;
- linker script put data (bss and data) into a separate section and
check that that section is empty making sure code is W^X compatible;
Some details of the implementation:
- C code is compiled with -fpic flags (as before);
- object files from C code are linked together;
- the single bundled object file is linked with 2 slightly different
script files to generate 2 bundled object files;
- the 2 bundled object files are converted to binary removing the need
for global offset tables;
- a Python script is used to generate assembly source from the 2
binaries;
- the single assembly file is compiled to generate final bundled object
file;
- to detect possible unwanted relocation in data/code code is generated
with different addresses. This is enforced starting .text section at
different positions and adding a fixed "gap" at the beginning.
This makes sure code and data is position independent;
- to detect used symbols in data/code symbols are placed in .text
section at different offsets (based on the line in the linker script).
This is needed as potentially a reference to a symbol is converted to
a reference to the containing section so multiple symbols could be
converted to reference to same symbol (section name) and we need to
distinguish them;
- to avoid relocations
- --orphan-handling=error option to linker is used to make sure we
account for all possible sections from C code;
Current limitations:
- the main one is the lack of support for 64 bit code. It would make
sure that even the code used for 64 bit (at the moment EFI code) is
code and data position independent. We cannot assume that code that
came from code compiled for 32 bit and compiled for 64 bit is code and
data position independent, different compiler options lead to
different code/data.
Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
---
Changes since v1:
- separate lines adding files in Makefile;
- remove unneeded "#undef ENTRY" in build32.lds.S;
- print some information from combine_two_binaries only passing --verbose;
- detect --orphan-handling=error option dynamically;
- define and use a LD32;
- more intermediate targets to build more in parallel;
- use obj32 in Makefile to keep list of 32 bit object files;
- 32 bit object files are now named XXX.32.o;
- rename "cbundle" to "built_in_32".
---
xen/arch/x86/boot/.gitignore | 5 +-
xen/arch/x86/boot/Makefile | 58 +++--
.../x86/boot/{build32.lds => build32.lds.S} | 43 +++-
xen/arch/x86/boot/cmdline.c | 12 --
xen/arch/x86/boot/head.S | 12 --
xen/arch/x86/boot/reloc.c | 14 --
xen/tools/combine_two_binaries | 198 ++++++++++++++++++
7 files changed, 283 insertions(+), 59 deletions(-)
rename xen/arch/x86/boot/{build32.lds => build32.lds.S} (63%)
create mode 100755 xen/tools/combine_two_binaries
diff --git a/xen/arch/x86/boot/.gitignore b/xen/arch/x86/boot/.gitignore
index a379db7988..ebad650e5c 100644
--- a/xen/arch/x86/boot/.gitignore
+++ b/xen/arch/x86/boot/.gitignore
@@ -1,3 +1,4 @@
/mkelf32
-/*.bin
-/*.lnk
+/build32.*.lds
+/built_in_32.*.bin
+/built_in_32.*.map
diff --git a/xen/arch/x86/boot/Makefile b/xen/arch/x86/boot/Makefile
index ff0f965876..4cf0d7e140 100644
--- a/xen/arch/x86/boot/Makefile
+++ b/xen/arch/x86/boot/Makefile
@@ -1,15 +1,18 @@
+obj32 := cmdline.o
+obj32 += reloc.o
+
obj-bin-y += head.o
+obj-bin-y += built_in_32.o
-head-bin-objs := cmdline.o reloc.o
+obj32 := $(patsubst %.o,%.32.o,$(obj32))
-nocov-y += $(head-bin-objs)
-noubsan-y += $(head-bin-objs)
-targets += $(head-bin-objs)
+nocov-y += $(obj32)
+noubsan-y += $(obj32)
+targets += $(obj32)
-head-bin-objs := $(addprefix $(obj)/,$(head-bin-objs))
+obj32 := $(addprefix $(obj)/,$(obj32))
$(obj)/head.o: AFLAGS-y += -Wa$(comma)-I$(obj)
-$(obj)/head.o: $(head-bin-objs:.o=.bin)
CFLAGS_x86_32 := $(subst -m64,-m32 -march=i686,$(XEN_TREEWIDE_CFLAGS))
$(call cc-options-add,CFLAGS_x86_32,CC,$(EMBEDDED_EXTRA_CFLAGS))
@@ -17,17 +20,46 @@ CFLAGS_x86_32 += -Werror -fno-builtin -g0 -msoft-float -mregparm=3
CFLAGS_x86_32 += -nostdinc -include $(filter %/include/xen/config.h,$(XEN_CFLAGS))
CFLAGS_x86_32 += $(filter -I% -O%,$(XEN_CFLAGS)) -D__XEN__
+LD32 := $(LD) $(subst x86_64,i386,$(LDFLAGS_DIRECT))
+
# override for 32bit binaries
-$(head-bin-objs): CFLAGS_stack_boundary :=
-$(head-bin-objs): XEN_CFLAGS := $(CFLAGS_x86_32) -fpic
+$(obj32): CFLAGS_stack_boundary :=
+$(obj32): XEN_CFLAGS := $(CFLAGS_x86_32) -fpic
LDFLAGS_DIRECT-$(call ld-option,--warn-rwx-segments) := --no-warn-rwx-segments
LDFLAGS_DIRECT += $(LDFLAGS_DIRECT-y)
-%.bin: %.lnk
- $(OBJCOPY) -j .text -O binary $< $@
+$(obj)/%.32.o: $(src)/%.c FORCE
+ $(call if_changed_rule,cc_o_c)
+
+$(obj)/build32.final.lds: AFLAGS-y += -DFINAL
+$(obj)/build32.other.lds $(obj)/build32.final.lds: $(src)/build32.lds.S
+ $(call if_changed_dep,cpp_lds_S)
+
+orphan-handling-$(call ld-option,--orphan-handling=error) := --orphan-handling=error
+
+# link all object files together
+$(obj)/built_in_32.tmp.o: $(obj32)
+ $(LD32) -r -o $@ $(obj32)
+
+$(obj)/built_in_32.%.bin: $(obj)/build32.%.lds $(obj)/built_in_32.tmp.o
+## link bundle with a given layout
+ $(LD32) $(orphan-handling-y) -N -T $< -Map $(obj)/built_in_32.$(*F).map -o $(obj)/built_in_32.$(*F).o $(obj)/built_in_32.tmp.o
+## extract binaries from object
+ $(OBJCOPY) -j .text -O binary $(obj)/built_in_32.$(*F).o $@
+ rm -f $(obj)/built_in_32.$(*F).o
-%.lnk: %.o $(src)/build32.lds
- $(LD) $(subst x86_64,i386,$(LDFLAGS_DIRECT)) -N -T $(filter %.lds,$^) -o $@ $<
+# generate final object file combining and checking above binaries
+$(obj)/built_in_32.o: $(obj)/built_in_32.other.bin $(obj)/built_in_32.final.bin
+ $(PYTHON) $(srctree)/tools/combine_two_binaries \
+ --script $(obj)/build32.final.lds \
+ --bin1 $(obj)/built_in_32.other.bin --bin2 $(obj)/built_in_32.final.bin \
+ --map $(obj)/built_in_32.final.map \
+ --exports cmdline_parse_early,reloc \
+ --section-header '.section .init.text, "ax", @progbits' \
+ --output $(obj)/built_in_32.s
+ $(CC) -c $(obj)/built_in_32.s -o $@.tmp
+ rm -f $(obj)/built_in_32.s $@
+ mv $@.tmp $@
-clean-files := *.lnk *.bin
+clean-files := built_in_32.*.bin built_in_32.*.map build32.*.lds
diff --git a/xen/arch/x86/boot/build32.lds b/xen/arch/x86/boot/build32.lds.S
similarity index 63%
rename from xen/arch/x86/boot/build32.lds
rename to xen/arch/x86/boot/build32.lds.S
index 56edaa727b..72a4c5c614 100644
--- a/xen/arch/x86/boot/build32.lds
+++ b/xen/arch/x86/boot/build32.lds.S
@@ -15,22 +15,52 @@
* with this program. If not, see <http://www.gnu.org/licenses/>.
*/
-ENTRY(_start)
+#ifdef FINAL
+# define GAP 0
+# define MULT 0
+# define TEXT_START
+#else
+# define GAP 0x010200
+# define MULT 1
+# define TEXT_START 0x408020
+#endif
+# define DECLARE_IMPORT(name) name = . + (__LINE__ * MULT)
+
+ENTRY(dummy_start)
SECTIONS
{
- /* Merge code and data into one section. */
- .text : {
+ /* Merge code and read-only data into one section. */
+ .text TEXT_START : {
+ /* Silence linker warning, we are not going to use it */
+ dummy_start = .;
+
+ /* Declare below any symbol name needed.
+ * Each symbol should be on its own line.
+ * It looks like a tedious work but we make sure the things we use.
+ * Potentially they should be all variables. */
+ DECLARE_IMPORT(__base_relocs_start);
+ DECLARE_IMPORT(__base_relocs_end);
+ . = . + GAP;
*(.text)
*(.text.*)
- *(.data)
- *(.data.*)
*(.rodata)
*(.rodata.*)
+ }
+
+ /* Writeable data sections. Check empty.
+ * We collapse all into code section and we don't want it to be writeable. */
+ .data : {
+ *(.data)
+ *(.data.*)
*(.bss)
*(.bss.*)
}
-
+ /DISCARD/ : {
+ *(.comment)
+ *(.comment.*)
+ *(.note.*)
+ }
/* Dynamic linkage sections. Collected simply so we can check they're empty. */
.got : {
*(.got)
@@ -64,3 +94,4 @@ ASSERT(SIZEOF(.igot.plt) == 0, ".igot.plt non-empty")
ASSERT(SIZEOF(.iplt) == 0, ".iplt non-empty")
ASSERT(SIZEOF(.plt) == 0, ".plt non-empty")
ASSERT(SIZEOF(.rel) == 0, "leftover relocations")
+ASSERT(SIZEOF(.data) == 0, "we don't want data")
diff --git a/xen/arch/x86/boot/cmdline.c b/xen/arch/x86/boot/cmdline.c
index fc9241ede9..196c580e91 100644
--- a/xen/arch/x86/boot/cmdline.c
+++ b/xen/arch/x86/boot/cmdline.c
@@ -18,18 +18,6 @@
* Linux kernel source (linux/lib/string.c).
*/
-/*
- * This entry point is entered from xen/arch/x86/boot/head.S with:
- * - %eax = &cmdline,
- * - %edx = &early_boot_opts.
- */
-asm (
- " .text \n"
- " .globl _start \n"
- "_start: \n"
- " jmp cmdline_parse_early \n"
- );
-
#include <xen/compiler.h>
#include <xen/kconfig.h>
#include <xen/macros.h>
diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
index c4de1dfab5..e0776e3896 100644
--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -759,18 +759,6 @@ trampoline_setup:
/* Jump into the relocated trampoline. */
lret
- /*
- * cmdline and reloc are written in C, and linked to be 32bit PIC with
- * entrypoints at 0 and using the fastcall convention.
- */
-FUNC_LOCAL(cmdline_parse_early)
- .incbin "cmdline.bin"
-END(cmdline_parse_early)
-
-FUNC_LOCAL(reloc)
- .incbin "reloc.bin"
-END(reloc)
-
ENTRY(trampoline_start)
#include "trampoline.S"
ENTRY(trampoline_end)
diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c
index 8c58affcd9..94b078d7b1 100644
--- a/xen/arch/x86/boot/reloc.c
+++ b/xen/arch/x86/boot/reloc.c
@@ -12,20 +12,6 @@
* Daniel Kiper <daniel.kiper@oracle.com>
*/
-/*
- * This entry point is entered from xen/arch/x86/boot/head.S with:
- * - %eax = MAGIC,
- * - %edx = INFORMATION_ADDRESS,
- * - %ecx = TOPMOST_LOW_MEMORY_STACK_ADDRESS.
- * - 0x04(%esp) = BOOT_VIDEO_INFO_ADDRESS.
- */
-asm (
- " .text \n"
- " .globl _start \n"
- "_start: \n"
- " jmp reloc \n"
- );
-
#include <xen/compiler.h>
#include <xen/macros.h>
#include <xen/types.h>
diff --git a/xen/tools/combine_two_binaries b/xen/tools/combine_two_binaries
new file mode 100755
index 0000000000..ea2d6ddc4e
--- /dev/null
+++ b/xen/tools/combine_two_binaries
@@ -0,0 +1,198 @@
+#!/usr/bin/env python3
+
+from __future__ import print_function
+import argparse
+import re
+import struct
+import sys
+
+parser = argparse.ArgumentParser(description='Generate assembly file to merge into other code.')
+parser.add_argument('--script', dest='script',
+ required=True,
+ help='Linker script for extracting symbols')
+parser.add_argument('--bin1', dest='bin1',
+ required=True,
+ help='First binary')
+parser.add_argument('--bin2', dest='bin2',
+ required=True,
+ help='Second binary')
+parser.add_argument('--output', dest='output',
+ help='Output file')
+parser.add_argument('--map', dest='mapfile',
+ help='Map file to read for symbols to export')
+parser.add_argument('--exports', dest='exports',
+ help='Symbols to export')
+parser.add_argument('--section-header', dest='section_header',
+ default='.text',
+ help='Section header declaration')
+parser.add_argument('-v', '--verbose',
+ action='store_true')
+args = parser.parse_args()
+
+gap = 0x010200
+text_diff = 0x408020
+
+# Parse linker script for external symbols to use.
+symbol_re = re.compile(r'\s+(\S+)\s*=\s*\.\s*\+\s*\((\d+)\s*\*\s*0\s*\)\s*;')
+symbols = {}
+lines = {}
+for line in open(args.script):
+ m = symbol_re.match(line)
+ if not m:
+ continue
+ (name, line_num) = (m.group(1), int(m.group(2)))
+ if line_num == 0:
+ raise Exception("Invalid line number found:\n\t" + line)
+ if line_num in symbols:
+ raise Exception("Symbol with this line already present:\n\t" + line)
+ if name in lines:
+ raise Exception("Symbol with this name already present:\n\t" + name)
+ symbols[line_num] = name
+ lines[name] = line_num
+
+exports = []
+if args.exports is not None:
+ exports = dict([(name, None) for name in args.exports.split(',')])
+
+# Parse mapfile, look for ther symbols we want to export.
+if args.mapfile is not None:
+ symbol_re = re.compile(r'\s{15,}0x([0-9a-f]+)\s+(\S+)\n')
+ for line in open(args.mapfile):
+ m = symbol_re.match(line)
+ if not m or m.group(2) not in exports:
+ continue
+ addr = int(m.group(1), 16)
+ exports[m.group(2)] = addr
+for (name, addr) in exports.items():
+ if addr is None:
+ raise Exception("Required export symbols %s not found" % name)
+
+file1 = open(args.bin1, 'rb')
+file2 = open(args.bin2, 'rb')
+file1.seek(0, 2)
+size1 = file1.tell()
+file2.seek(0, 2)
+size2 = file2.tell()
+if size1 > size2:
+ file1, file2 = file2, file1
+ size1, size2 = size2, size1
+if size2 != size1 + gap:
+ raise Exception('File sizes do not match')
+
+file1.seek(0, 0)
+data1 = file1.read(size1)
+file2.seek(gap, 0)
+data2 = file2.read(size1)
+
+max_line = max(symbols.keys())
+
+def to_int32(n):
+ '''Convert a number to signed 32 bit integer truncating if needed'''
+ mask = (1 << 32) - 1
+ h = 1 << 31
+ return (n & mask) ^ h - h
+
+i = 0
+references = {}
+internals = 0
+while i <= size1 - 4:
+ n1 = struct.unpack('<I', data1[i:i+4])[0]
+ n2 = struct.unpack('<I', data2[i:i+4])[0]
+ i += 1
+ # The two numbers are the same, no problems
+ if n1 == n2:
+ continue
+ # Try to understand why they are different
+ diff = to_int32(n1 - n2)
+ if diff == -gap: # this is an internal relocation
+ pos = i - 1
+ print(("Internal relocation found at position %#x "
+ "n1=%#x n2=%#x diff=%#x") % (pos, n1, n2, diff),
+ file=sys.stderr)
+ i += 3
+ internals += 1
+ if internals >= 10:
+ break
+ continue
+ # This is a relative relocation to a symbol, accepted, code/data is
+ # relocatable.
+ if diff < gap and diff >= gap - max_line:
+ n = gap - diff
+ symbol = symbols.get(n)
+ # check we have a symbol
+ if symbol is None:
+ raise Exception("Cannot find symbol for line %d" % n)
+ pos = i - 1
+ if args.verbose:
+ print('Position %#x %d %s' % (pos, n, symbol), file=sys.stderr)
+ i += 3
+ references[pos] = symbol
+ continue
+ # First byte is the same, move to next byte
+ if diff & 0xff == 0 and i <= size1 - 4:
+ continue
+ # Probably a type of relocation we don't want or support
+ pos = i - 1
+ suggestion = ''
+ symbol = symbols.get(-diff - text_diff)
+ if symbol is not None:
+ suggestion = " Maybe %s is not defined as hidden?" % symbol
+ raise Exception(("Unexpected difference found at %#x "
+ "n1=%#x n2=%#x diff=%#x gap=%#x.%s") % \
+ (pos, n1, n2, diff, gap, suggestion))
+if internals != 0:
+ raise Exception("Previous relocations found")
+
+def line_bytes(buf, out):
+ '''Output an assembly line with all bytes in "buf"'''
+ if type(buf) == str:
+ print("\t.byte " + ','.join([str(ord(c)) for c in buf]), file=out)
+ else:
+ print("\t.byte " + ','.join([str(n) for n in buf]), file=out)
+
+def part(start, end, out):
+ '''Output bytes of "data" from "start" to "end"'''
+ while start < end:
+ e = min(start + 16, end)
+ line_bytes(data1[start:e], out)
+ start = e
+
+def reference(pos, out):
+ name = references[pos]
+ n = struct.unpack('<I', data1[pos:pos+4])[0]
+ sign = '+'
+ if n >= (1 << 31):
+ n -= (1 << 32)
+ n += pos
+ if n < 0:
+ n = -n
+ sign = '-'
+ print("\t.hidden %s\n\t.long %s %s %#x - ." % (name, name, sign, n),
+ file=out)
+
+def output(out):
+ prev = 0
+ exports_by_addr = {}
+ for (sym, addr) in exports.items():
+ exports_by_addr.setdefault(addr, []).append(sym)
+ positions = list(references.keys())
+ positions += list(exports_by_addr.keys())
+ for pos in sorted(positions):
+ part(prev, pos, out)
+ prev = pos
+ if pos in references:
+ reference(pos, out)
+ prev = pos + 4
+ if pos in exports_by_addr:
+ for sym in exports_by_addr[pos]:
+ print("\t.global %s\n\t.hidden %s\n%s:" % (sym, sym, sym),
+ file=out)
+ part(prev, size1, out)
+
+out = sys.stdout
+if args.output is not None:
+ out = open(args.output, 'w')
+print('\t' + args.section_header, file=out)
+output(out)
+print('\n\t.section\t.note.GNU-stack,"",@progbits', file=out)
+out.flush()
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v3 2/5] x86/boot: Reuse code to relocate trampoline
2024-10-11 8:52 [PATCH v3 0/5] Reuse 32 bit C code more safely Frediano Ziglio
2024-10-11 8:52 ` [PATCH v3 1/5] x86/boot: create a C bundle for 32 bit boot code and use it Frediano Ziglio
@ 2024-10-11 8:52 ` Frediano Ziglio
2024-10-11 12:02 ` Andrew Cooper
2024-10-11 12:50 ` Jan Beulich
2024-10-11 8:52 ` [PATCH v3 3/5] x86/boot: Use boot_vid_info variable directly from C code Frediano Ziglio
` (2 subsequent siblings)
4 siblings, 2 replies; 23+ messages in thread
From: Frediano Ziglio @ 2024-10-11 8:52 UTC (permalink / raw)
To: xen-devel
Cc: Frediano Ziglio, Jan Beulich, Andrew Cooper, Roger Pau Monné,
Daniel P. Smith, Marek Marczykowski-Górecki
Move code from efi-boot.h to a separate, new, reloc-trampoline.c file.
Reuse this new code to replace assembly code in head.S doing the
same thing.
Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
---
xen/arch/x86/boot/Makefile | 12 ++++++----
xen/arch/x86/boot/build32.lds.S | 5 ++++
xen/arch/x86/boot/head.S | 23 +-----------------
xen/arch/x86/boot/reloc-trampoline.c | 36 ++++++++++++++++++++++++++++
xen/arch/x86/efi/efi-boot.h | 15 ++----------
5 files changed, 52 insertions(+), 39 deletions(-)
create mode 100644 xen/arch/x86/boot/reloc-trampoline.c
diff --git a/xen/arch/x86/boot/Makefile b/xen/arch/x86/boot/Makefile
index 4cf0d7e140..24b4b0cb17 100644
--- a/xen/arch/x86/boot/Makefile
+++ b/xen/arch/x86/boot/Makefile
@@ -1,14 +1,18 @@
obj32 := cmdline.o
obj32 += reloc.o
+obj32 += reloc-trampoline.o
+
+obj64 := reloc-trampoline.o
obj-bin-y += head.o
obj-bin-y += built_in_32.o
+obj-bin-y += $(obj64)
obj32 := $(patsubst %.o,%.32.o,$(obj32))
-nocov-y += $(obj32)
-noubsan-y += $(obj32)
-targets += $(obj32)
+nocov-y += $(obj32) $(obj64)
+noubsan-y += $(obj32) $(obj64)
+targets += $(obj32) $(obj64)
obj32 := $(addprefix $(obj)/,$(obj32))
@@ -55,7 +59,7 @@ $(obj)/built_in_32.o: $(obj)/built_in_32.other.bin $(obj)/built_in_32.final.bin
--script $(obj)/build32.final.lds \
--bin1 $(obj)/built_in_32.other.bin --bin2 $(obj)/built_in_32.final.bin \
--map $(obj)/built_in_32.final.map \
- --exports cmdline_parse_early,reloc \
+ --exports cmdline_parse_early,reloc,reloc_trampoline32 \
--section-header '.section .init.text, "ax", @progbits' \
--output $(obj)/built_in_32.s
$(CC) -c $(obj)/built_in_32.s -o $@.tmp
diff --git a/xen/arch/x86/boot/build32.lds.S b/xen/arch/x86/boot/build32.lds.S
index 72a4c5c614..9d09e3adbd 100644
--- a/xen/arch/x86/boot/build32.lds.S
+++ b/xen/arch/x86/boot/build32.lds.S
@@ -41,6 +41,11 @@ SECTIONS
* Potentially they should be all variables. */
DECLARE_IMPORT(__base_relocs_start);
DECLARE_IMPORT(__base_relocs_end);
+ DECLARE_IMPORT(__trampoline_rel_start);
+ DECLARE_IMPORT(__trampoline_rel_stop);
+ DECLARE_IMPORT(__trampoline_seg_start);
+ DECLARE_IMPORT(__trampoline_seg_stop);
+ DECLARE_IMPORT(trampoline_phys);
. = . + GAP;
*(.text)
*(.text.*)
diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
index e0776e3896..ade2c5c43d 100644
--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -706,28 +706,7 @@ trampoline_setup:
mov %edx, sym_offs(l1_bootmap)(%esi, %ecx, 8)
/* Apply relocations to bootstrap trampoline. */
- mov sym_esi(trampoline_phys), %edx
- lea sym_esi(__trampoline_rel_start), %edi
- lea sym_esi(__trampoline_rel_stop), %ecx
-1:
- mov (%edi), %eax
- add %edx, (%edi, %eax)
- add $4,%edi
-
- cmp %ecx, %edi
- jb 1b
-
- /* Patch in the trampoline segment. */
- shr $4,%edx
- lea sym_esi(__trampoline_seg_start), %edi
- lea sym_esi(__trampoline_seg_stop), %ecx
-1:
- mov (%edi), %eax
- mov %dx, (%edi, %eax)
- add $4,%edi
-
- cmp %ecx, %edi
- jb 1b
+ call reloc_trampoline32
/* Do not parse command line on EFI platform here. */
cmpb $0, sym_esi(efi_platform)
diff --git a/xen/arch/x86/boot/reloc-trampoline.c b/xen/arch/x86/boot/reloc-trampoline.c
new file mode 100644
index 0000000000..c899e99b08
--- /dev/null
+++ b/xen/arch/x86/boot/reloc-trampoline.c
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <xen/compiler.h>
+#include <xen/stdint.h>
+#include <asm/trampoline.h>
+
+extern const int32_t __trampoline_rel_start[], __trampoline_rel_stop[];
+extern const int32_t __trampoline_seg_start[], __trampoline_seg_stop[];
+
+#if defined(__i386__)
+void reloc_trampoline32(void)
+#elif defined (__x86_64__)
+void reloc_trampoline64(void)
+#else
+#error Unknow architecture
+#endif
+{
+ unsigned long phys = trampoline_phys;
+ const int32_t *trampoline_ptr;
+
+ /*
+ * Apply relocations to trampoline.
+ *
+ * This modifies the trampoline in place within Xen, so that it will
+ * operate correctly when copied into place.
+ */
+ for ( trampoline_ptr = __trampoline_rel_start;
+ trampoline_ptr < __trampoline_rel_stop;
+ ++trampoline_ptr )
+ *(uint32_t *)(*trampoline_ptr + (long)trampoline_ptr) += phys;
+
+ for ( trampoline_ptr = __trampoline_seg_start;
+ trampoline_ptr < __trampoline_seg_stop;
+ ++trampoline_ptr )
+ *(uint16_t *)(*trampoline_ptr + (long)trampoline_ptr) = phys >> 4;
+}
diff --git a/xen/arch/x86/efi/efi-boot.h b/xen/arch/x86/efi/efi-boot.h
index 94f3443364..1acceec471 100644
--- a/xen/arch/x86/efi/efi-boot.h
+++ b/xen/arch/x86/efi/efi-boot.h
@@ -103,27 +103,16 @@ static void __init efi_arch_relocate_image(unsigned long delta)
}
}
-extern const int32_t __trampoline_rel_start[], __trampoline_rel_stop[];
-extern const int32_t __trampoline_seg_start[], __trampoline_seg_stop[];
+void reloc_trampoline64(void);
static void __init relocate_trampoline(unsigned long phys)
{
- const int32_t *trampoline_ptr;
-
trampoline_phys = phys;
if ( !efi_enabled(EFI_LOADER) )
return;
- /* Apply relocations to trampoline. */
- for ( trampoline_ptr = __trampoline_rel_start;
- trampoline_ptr < __trampoline_rel_stop;
- ++trampoline_ptr )
- *(u32 *)(*trampoline_ptr + (long)trampoline_ptr) += phys;
- for ( trampoline_ptr = __trampoline_seg_start;
- trampoline_ptr < __trampoline_seg_stop;
- ++trampoline_ptr )
- *(u16 *)(*trampoline_ptr + (long)trampoline_ptr) = phys >> 4;
+ reloc_trampoline64();
}
static void __init place_string(u32 *addr, const char *s)
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v3 3/5] x86/boot: Use boot_vid_info variable directly from C code
2024-10-11 8:52 [PATCH v3 0/5] Reuse 32 bit C code more safely Frediano Ziglio
2024-10-11 8:52 ` [PATCH v3 1/5] x86/boot: create a C bundle for 32 bit boot code and use it Frediano Ziglio
2024-10-11 8:52 ` [PATCH v3 2/5] x86/boot: Reuse code to relocate trampoline Frediano Ziglio
@ 2024-10-11 8:52 ` Frediano Ziglio
2024-10-11 12:07 ` Andrew Cooper
2024-10-11 8:52 ` [PATCH v3 4/5] x86/boot: Use trampoline_phys " Frediano Ziglio
2024-10-11 8:52 ` [PATCH v3 5/5] x86/boot: Clarify comment Frediano Ziglio
4 siblings, 1 reply; 23+ messages in thread
From: Frediano Ziglio @ 2024-10-11 8:52 UTC (permalink / raw)
To: xen-devel
Cc: Frediano Ziglio, Jan Beulich, Andrew Cooper, Roger Pau Monné
No more need to pass from assembly code.
Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
---
Changes since v1:
- split the 2 variable changes into 2 commits.
Changes since v2:
- revert commit order.
---
xen/arch/x86/boot/build32.lds.S | 1 +
xen/arch/x86/boot/head.S | 10 +---------
xen/arch/x86/boot/reloc.c | 13 +++++--------
3 files changed, 7 insertions(+), 17 deletions(-)
diff --git a/xen/arch/x86/boot/build32.lds.S b/xen/arch/x86/boot/build32.lds.S
index 9d09e3adbd..26ef609523 100644
--- a/xen/arch/x86/boot/build32.lds.S
+++ b/xen/arch/x86/boot/build32.lds.S
@@ -46,6 +46,7 @@ SECTIONS
DECLARE_IMPORT(__trampoline_seg_start);
DECLARE_IMPORT(__trampoline_seg_stop);
DECLARE_IMPORT(trampoline_phys);
+ DECLARE_IMPORT(boot_vid_info);
. = . + GAP;
*(.text)
*(.text.*)
diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
index ade2c5c43d..5da7ac138f 100644
--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -514,18 +514,10 @@ trampoline_setup:
mov sym_esi(trampoline_phys), %ecx
add $TRAMPOLINE_SPACE,%ecx
-#ifdef CONFIG_VIDEO
- lea sym_esi(boot_vid_info), %edx
-#else
- xor %edx, %edx
-#endif
-
/* Save Multiboot / PVH info struct (after relocation) for later use. */
- push %edx /* Boot video info to be filled from MB2. */
mov %ebx, %edx /* Multiboot / PVH information address. */
- /* reloc(magic/eax, info/edx, trampoline/ecx, video/stk) using fastcall. */
+ /* reloc(magic/eax, info/edx, trampoline/ecx) using fastcall. */
call reloc
- add $4, %esp
#ifdef CONFIG_PVH_GUEST
cmpb $0, sym_esi(pvh_boot)
diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c
index 94b078d7b1..707d9c5f15 100644
--- a/xen/arch/x86/boot/reloc.c
+++ b/xen/arch/x86/boot/reloc.c
@@ -176,7 +176,7 @@ static multiboot_info_t *mbi_reloc(uint32_t mbi_in, memctx *ctx)
return mbi_out;
}
-static multiboot_info_t *mbi2_reloc(uint32_t mbi_in, uint32_t video_out, memctx *ctx)
+static multiboot_info_t *mbi2_reloc(uint32_t mbi_in, memctx *ctx)
{
const multiboot2_fixed_t *mbi_fix = _p(mbi_in);
const multiboot2_memory_map_t *mmap_src;
@@ -185,7 +185,7 @@ static multiboot_info_t *mbi2_reloc(uint32_t mbi_in, uint32_t video_out, memctx
memory_map_t *mmap_dst;
multiboot_info_t *mbi_out;
#ifdef CONFIG_VIDEO
- struct boot_video_info *video = NULL;
+ struct boot_video_info *video = &boot_vid_info;
#endif
uint32_t ptr;
unsigned int i, mod_idx = 0;
@@ -290,12 +290,11 @@ static multiboot_info_t *mbi2_reloc(uint32_t mbi_in, uint32_t video_out, memctx
#ifdef CONFIG_VIDEO
case MULTIBOOT2_TAG_TYPE_VBE:
- if ( video_out )
+ if ( video )
{
const struct vesa_ctrl_info *ci;
const struct vesa_mode_info *mi;
- video = _p(video_out);
ci = (const void *)get_mb2_data(tag, vbe, vbe_control_info);
mi = (const void *)get_mb2_data(tag, vbe, vbe_mode_info);
@@ -321,7 +320,6 @@ static multiboot_info_t *mbi2_reloc(uint32_t mbi_in, uint32_t video_out, memctx
if ( (get_mb2_data(tag, framebuffer, framebuffer_type) !=
MULTIBOOT2_FRAMEBUFFER_TYPE_RGB) )
{
- video_out = 0;
video = NULL;
}
break;
@@ -346,8 +344,7 @@ static multiboot_info_t *mbi2_reloc(uint32_t mbi_in, uint32_t video_out, memctx
}
/* SAF-1-safe */
-void *reloc(uint32_t magic, uint32_t in, uint32_t trampoline,
- uint32_t video_info)
+void *reloc(uint32_t magic, uint32_t in, uint32_t trampoline)
{
memctx ctx = { trampoline };
@@ -357,7 +354,7 @@ void *reloc(uint32_t magic, uint32_t in, uint32_t trampoline,
return mbi_reloc(in, &ctx);
case MULTIBOOT2_BOOTLOADER_MAGIC:
- return mbi2_reloc(in, video_info, &ctx);
+ return mbi2_reloc(in, &ctx);
case XEN_HVM_START_MAGIC_VALUE:
if ( IS_ENABLED(CONFIG_PVH_GUEST) )
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v3 4/5] x86/boot: Use trampoline_phys variable directly from C code
2024-10-11 8:52 [PATCH v3 0/5] Reuse 32 bit C code more safely Frediano Ziglio
` (2 preceding siblings ...)
2024-10-11 8:52 ` [PATCH v3 3/5] x86/boot: Use boot_vid_info variable directly from C code Frediano Ziglio
@ 2024-10-11 8:52 ` Frediano Ziglio
2024-10-11 12:05 ` Andrew Cooper
2024-10-11 8:52 ` [PATCH v3 5/5] x86/boot: Clarify comment Frediano Ziglio
4 siblings, 1 reply; 23+ messages in thread
From: Frediano Ziglio @ 2024-10-11 8:52 UTC (permalink / raw)
To: xen-devel
Cc: Frediano Ziglio, Jan Beulich, Andrew Cooper, Roger Pau Monné
No more need to pass from assembly code.
Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
---
Changes since v1:
- split the 2 variable changes into 2 commits.
Changes since v2:
- revert commit order;
- avoid useless casts.
---
xen/arch/x86/boot/head.S | 6 +-----
xen/arch/x86/boot/reloc.c | 8 ++++++--
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
index 5da7ac138f..dcda91cfda 100644
--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -510,13 +510,9 @@ trampoline_setup:
mov %esi, sym_esi(xen_phys_start)
mov %esi, sym_esi(trampoline_xen_phys_start)
- /* Get bottom-most low-memory stack address. */
- mov sym_esi(trampoline_phys), %ecx
- add $TRAMPOLINE_SPACE,%ecx
-
/* Save Multiboot / PVH info struct (after relocation) for later use. */
mov %ebx, %edx /* Multiboot / PVH information address. */
- /* reloc(magic/eax, info/edx, trampoline/ecx) using fastcall. */
+ /* reloc(magic/eax, info/edx) using fastcall. */
call reloc
#ifdef CONFIG_PVH_GUEST
diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c
index 707d9c5f15..e50e161b27 100644
--- a/xen/arch/x86/boot/reloc.c
+++ b/xen/arch/x86/boot/reloc.c
@@ -19,6 +19,9 @@
#include <xen/kconfig.h>
#include <xen/multiboot.h>
#include <xen/multiboot2.h>
+#include <xen/page-size.h>
+
+#include <asm/trampoline.h>
#include <public/arch-x86/hvm/start_info.h>
@@ -344,9 +347,10 @@ static multiboot_info_t *mbi2_reloc(uint32_t mbi_in, memctx *ctx)
}
/* SAF-1-safe */
-void *reloc(uint32_t magic, uint32_t in, uint32_t trampoline)
+void *reloc(uint32_t magic, uint32_t in)
{
- memctx ctx = { trampoline };
+ /* Get bottom-most low-memory stack address. */
+ memctx ctx = { trampoline_phys + TRAMPOLINE_SPACE };
switch ( magic )
{
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v3 5/5] x86/boot: Clarify comment
2024-10-11 8:52 [PATCH v3 0/5] Reuse 32 bit C code more safely Frediano Ziglio
` (3 preceding siblings ...)
2024-10-11 8:52 ` [PATCH v3 4/5] x86/boot: Use trampoline_phys " Frediano Ziglio
@ 2024-10-11 8:52 ` Frediano Ziglio
2024-10-11 12:56 ` Alejandro Vallejo
4 siblings, 1 reply; 23+ messages in thread
From: Frediano Ziglio @ 2024-10-11 8:52 UTC (permalink / raw)
To: xen-devel
Cc: Frediano Ziglio, Jan Beulich, Andrew Cooper, Roger Pau Monné
Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
---
xen/arch/x86/boot/reloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c
index e50e161b27..e725cfb6eb 100644
--- a/xen/arch/x86/boot/reloc.c
+++ b/xen/arch/x86/boot/reloc.c
@@ -65,7 +65,7 @@ typedef struct memctx {
/*
* Simple bump allocator.
*
- * It starts from the base of the trampoline and allocates downwards.
+ * It starts on top of space reserved for the trampoline and allocates downwards.
*/
uint32_t ptr;
} memctx;
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [Makefile only] [PATCH v3 1/5] x86/boot: create a C bundle for 32 bit boot code and use it
2024-10-11 8:52 ` [PATCH v3 1/5] x86/boot: create a C bundle for 32 bit boot code and use it Frediano Ziglio
@ 2024-10-11 10:57 ` Andrew Cooper
2024-10-11 13:17 ` [python] " Andrew Cooper
2024-10-11 13:20 ` Jan Beulich
2 siblings, 0 replies; 23+ messages in thread
From: Andrew Cooper @ 2024-10-11 10:57 UTC (permalink / raw)
To: Frediano Ziglio, xen-devel
Cc: Jan Beulich, Roger Pau Monné, Julien Grall,
Stefano Stabellini
On 11/10/2024 9:52 am, Frediano Ziglio wrote:
> <snip>
>
> Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
The makefile changes here are not the easiest to follow, because there
are two related things being done.
I experimented, and came up with the following:
https://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/fz-32b-v3.1
which splits the obj32 transformation out of the "totally change how we
link things" which is the purpose of this patch. The end result is much
clearer to follow IMO.
I've got some other comments, which I've included in the branch, but are
presented here for posterity.
> diff --git a/xen/arch/x86/boot/Makefile b/xen/arch/x86/boot/Makefile
> index ff0f965876..4cf0d7e140 100644
> --- a/xen/arch/x86/boot/Makefile
> +++ b/xen/arch/x86/boot/Makefile
> @@ -1,15 +1,18 @@
> +obj32 := cmdline.o
> +obj32 += reloc.o
> +
> obj-bin-y += head.o
> +obj-bin-y += built_in_32.o
>
> -head-bin-objs := cmdline.o reloc.o
> +obj32 := $(patsubst %.o,%.32.o,$(obj32))
We're already used to writing out foo.init.o, so just have the .32.o in
list.
This makes patch 3 slightly more obvious, where you're not listing the
same .o in two different obj lists.
>
> -nocov-y += $(head-bin-objs)
> -noubsan-y += $(head-bin-objs)
> -targets += $(head-bin-objs)
> +nocov-y += $(obj32)
> +noubsan-y += $(obj32)
> +targets += $(obj32)
>
> -head-bin-objs := $(addprefix $(obj)/,$(head-bin-objs))
> +obj32 := $(addprefix $(obj)/,$(obj32))
>
> $(obj)/head.o: AFLAGS-y += -Wa$(comma)-I$(obj)
> -$(obj)/head.o: $(head-bin-objs:.o=.bin)
The AFLAGS-y was only for the .incbin's, so can go away here too.
>
> CFLAGS_x86_32 := $(subst -m64,-m32 -march=i686,$(XEN_TREEWIDE_CFLAGS))
> $(call cc-options-add,CFLAGS_x86_32,CC,$(EMBEDDED_EXTRA_CFLAGS))
> @@ -17,17 +20,46 @@ CFLAGS_x86_32 += -Werror -fno-builtin -g0 -msoft-float -mregparm=3
> CFLAGS_x86_32 += -nostdinc -include $(filter %/include/xen/config.h,$(XEN_CFLAGS))
> CFLAGS_x86_32 += $(filter -I% -O%,$(XEN_CFLAGS)) -D__XEN__
>
> +LD32 := $(LD) $(subst x86_64,i386,$(LDFLAGS_DIRECT))
Because this is :=, it needs to be after ...
> +
> # override for 32bit binaries
> -$(head-bin-objs): CFLAGS_stack_boundary :=
> -$(head-bin-objs): XEN_CFLAGS := $(CFLAGS_x86_32) -fpic
> +$(obj32): CFLAGS_stack_boundary :=
> +$(obj32): XEN_CFLAGS := $(CFLAGS_x86_32) -fpic
>
> LDFLAGS_DIRECT-$(call ld-option,--warn-rwx-segments) := --no-warn-rwx-segments
> LDFLAGS_DIRECT += $(LDFLAGS_DIRECT-y)
... this.
>
> -%.bin: %.lnk
> - $(OBJCOPY) -j .text -O binary $< $@
> +$(obj)/%.32.o: $(src)/%.c FORCE
> + $(call if_changed_rule,cc_o_c)
> +
> +$(obj)/build32.final.lds: AFLAGS-y += -DFINAL
> +$(obj)/build32.other.lds $(obj)/build32.final.lds: $(src)/build32.lds.S
> + $(call if_changed_dep,cpp_lds_S)
FORCE needs to be a prerequisite to use $(call if_changed_dep) (Sorry -
I missed this one in the branch I pushed.)
> +
> +orphan-handling-$(call ld-option,--orphan-handling=error) := --orphan-handling=error
> +
> +# link all object files together
All 32bit objects. It isn't even all objects today (head.o isn't included).
> +$(obj)/built_in_32.tmp.o: $(obj32)
> + $(LD32) -r -o $@ $(obj32)
$^
> +
> +$(obj)/built_in_32.%.bin: $(obj)/build32.%.lds $(obj)/built_in_32.tmp.o
> +## link bundle with a given layout
> + $(LD32) $(orphan-handling-y) -N -T $< -Map $(obj)/built_in_32.$(*F).map -o $(obj)/built_in_32.$(*F).o $(obj)/built_in_32.tmp.o
> +## extract binaries from object
> + $(OBJCOPY) -j .text -O binary $(obj)/built_in_32.$(*F).o $@
> + rm -f $(obj)/built_in_32.$(*F).o
>
> -%.lnk: %.o $(src)/build32.lds
> - $(LD) $(subst x86_64,i386,$(LDFLAGS_DIRECT)) -N -T $(filter %.lds,$^) -o $@ $<
> +# generate final object file combining and checking above binaries
> +$(obj)/built_in_32.o: $(obj)/built_in_32.other.bin $(obj)/built_in_32.final.bin
> + $(PYTHON) $(srctree)/tools/combine_two_binaries \
> + --script $(obj)/build32.final.lds \
> + --bin1 $(obj)/built_in_32.other.bin --bin2 $(obj)/built_in_32.final.bin \
> + --map $(obj)/built_in_32.final.map \
> + --exports cmdline_parse_early,reloc \
> + --section-header '.section .init.text, "ax", @progbits' \
> + --output $(obj)/built_in_32.s
> + $(CC) -c $(obj)/built_in_32.s -o $@.tmp
> + rm -f $(obj)/built_in_32.s $@
> + mv $@.tmp $@
$(obj)/built_in_32.S: and --output $@
The build system already knows how to turn a .S into a .o. This form
removed an opencoded $(CC), and leaves built_in_32.S around as an
intermediate to be inspected.
~Andrew
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3 2/5] x86/boot: Reuse code to relocate trampoline
2024-10-11 8:52 ` [PATCH v3 2/5] x86/boot: Reuse code to relocate trampoline Frediano Ziglio
@ 2024-10-11 12:02 ` Andrew Cooper
2024-10-11 12:50 ` Jan Beulich
1 sibling, 0 replies; 23+ messages in thread
From: Andrew Cooper @ 2024-10-11 12:02 UTC (permalink / raw)
To: Frediano Ziglio, xen-devel
Cc: Jan Beulich, Roger Pau Monné, Daniel P. Smith,
Marek Marczykowski-Górecki
On 11/10/2024 9:52 am, Frediano Ziglio wrote:
> Move code from efi-boot.h to a separate, new, reloc-trampoline.c file.
> Reuse this new code to replace assembly code in head.S doing the
> same thing.
>
> Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
I'd be tempted to say "Reuse this new code, compiling it for 32bit as
well, to replace ..."
Either way, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3 4/5] x86/boot: Use trampoline_phys variable directly from C code
2024-10-11 8:52 ` [PATCH v3 4/5] x86/boot: Use trampoline_phys " Frediano Ziglio
@ 2024-10-11 12:05 ` Andrew Cooper
0 siblings, 0 replies; 23+ messages in thread
From: Andrew Cooper @ 2024-10-11 12:05 UTC (permalink / raw)
To: Frediano Ziglio, xen-devel; +Cc: Jan Beulich, Roger Pau Monné
On 11/10/2024 9:52 am, Frediano Ziglio wrote:
> No more need to pass from assembly code.
>
> Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3 3/5] x86/boot: Use boot_vid_info variable directly from C code
2024-10-11 8:52 ` [PATCH v3 3/5] x86/boot: Use boot_vid_info variable directly from C code Frediano Ziglio
@ 2024-10-11 12:07 ` Andrew Cooper
0 siblings, 0 replies; 23+ messages in thread
From: Andrew Cooper @ 2024-10-11 12:07 UTC (permalink / raw)
To: Frediano Ziglio, xen-devel; +Cc: Jan Beulich, Roger Pau Monné
On 11/10/2024 9:52 am, Frediano Ziglio wrote:
> No more need to pass from assembly code.
>
> Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Yes, much less churn.
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3 2/5] x86/boot: Reuse code to relocate trampoline
2024-10-11 8:52 ` [PATCH v3 2/5] x86/boot: Reuse code to relocate trampoline Frediano Ziglio
2024-10-11 12:02 ` Andrew Cooper
@ 2024-10-11 12:50 ` Jan Beulich
1 sibling, 0 replies; 23+ messages in thread
From: Jan Beulich @ 2024-10-11 12:50 UTC (permalink / raw)
To: Frediano Ziglio
Cc: Andrew Cooper, Roger Pau Monné, Daniel P. Smith,
Marek Marczykowski-Górecki, xen-devel
On 11.10.2024 10:52, Frediano Ziglio wrote:
> --- /dev/null
> +++ b/xen/arch/x86/boot/reloc-trampoline.c
> @@ -0,0 +1,36 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <xen/compiler.h>
> +#include <xen/stdint.h>
> +#include <asm/trampoline.h>
> +
> +extern const int32_t __trampoline_rel_start[], __trampoline_rel_stop[];
> +extern const int32_t __trampoline_seg_start[], __trampoline_seg_stop[];
> +
> +#if defined(__i386__)
> +void reloc_trampoline32(void)
> +#elif defined (__x86_64__)
> +void reloc_trampoline64(void)
> +#else
> +#error Unknow architecture
Nit: Unknown
Can likely be adjusted while committing.
Jan
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3 5/5] x86/boot: Clarify comment
2024-10-11 8:52 ` [PATCH v3 5/5] x86/boot: Clarify comment Frediano Ziglio
@ 2024-10-11 12:56 ` Alejandro Vallejo
2024-10-11 13:08 ` Frediano Ziglio
0 siblings, 1 reply; 23+ messages in thread
From: Alejandro Vallejo @ 2024-10-11 12:56 UTC (permalink / raw)
To: Frediano Ziglio
Cc: xen-devel, Jan Beulich, Andrew Cooper, Roger Pau Monné
On Fri, Oct 11, 2024 at 09:52:44AM +0100, Frediano Ziglio wrote:
> Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
> ---
> xen/arch/x86/boot/reloc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c
> index e50e161b27..e725cfb6eb 100644
> --- a/xen/arch/x86/boot/reloc.c
> +++ b/xen/arch/x86/boot/reloc.c
> @@ -65,7 +65,7 @@ typedef struct memctx {
> /*
> * Simple bump allocator.
> *
> - * It starts from the base of the trampoline and allocates downwards.
> + * It starts on top of space reserved for the trampoline and allocates downwards.
nit: Not sure this is much clearer. The trampoline is not a stack (and even if
it was, I personally find "top" and "bottom" quite ambiguous when it grows
backwards), so calling top to its lowest address seems more confusing than not.
If anything clarification ought to go in the which direction it takes. Leaving
"base" instead of "top" and replacing "downwards" by "backwards" to make it
crystal clear that it's a pointer that starts where the trampoline starts, but
moves in the opposite direction.
My .02 anyway.
> */
> uint32_t ptr;
> } memctx;
> --
> 2.34.1
>
>
Cheers,
Alejandro
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3 5/5] x86/boot: Clarify comment
2024-10-11 12:56 ` Alejandro Vallejo
@ 2024-10-11 13:08 ` Frediano Ziglio
2024-10-11 13:28 ` Alejandro Vallejo
0 siblings, 1 reply; 23+ messages in thread
From: Frediano Ziglio @ 2024-10-11 13:08 UTC (permalink / raw)
To: Alejandro Vallejo
Cc: xen-devel, Jan Beulich, Andrew Cooper, Roger Pau Monné
On Fri, Oct 11, 2024 at 1:56 PM Alejandro Vallejo
<alejandro.vallejo@cloud.com> wrote:
>
> On Fri, Oct 11, 2024 at 09:52:44AM +0100, Frediano Ziglio wrote:
> > Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
> > ---
> > xen/arch/x86/boot/reloc.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c
> > index e50e161b27..e725cfb6eb 100644
> > --- a/xen/arch/x86/boot/reloc.c
> > +++ b/xen/arch/x86/boot/reloc.c
> > @@ -65,7 +65,7 @@ typedef struct memctx {
> > /*
> > * Simple bump allocator.
> > *
> > - * It starts from the base of the trampoline and allocates downwards.
> > + * It starts on top of space reserved for the trampoline and allocates downwards.
>
> nit: Not sure this is much clearer. The trampoline is not a stack (and even if
> it was, I personally find "top" and "bottom" quite ambiguous when it grows
> backwards), so calling top to its lowest address seems more confusing than not.
>
> If anything clarification ought to go in the which direction it takes. Leaving
> "base" instead of "top" and replacing "downwards" by "backwards" to make it
> crystal clear that it's a pointer that starts where the trampoline starts, but
> moves in the opposite direction.
>
Base looks confusing to me, but surely that comment could be confusing.
For the trampoline 64 KB are reserved. Last 4 KB are used as a normal
stack (push/pop/call/whatever), first part gets a copy of the
trampoline code/data (about 6 Kb) the rest (so 64 - 4 - ~6 = ~54 kb)
is used for the copy of MBI information. That "rest" is what we are
talking about here.
> My .02 anyway.
>
> > */
> > uint32_t ptr;
> > } memctx;
> > --
> > 2.34.1
> >
> >
>
> Cheers,
> Alejandro
Frediano
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [python] [PATCH v3 1/5] x86/boot: create a C bundle for 32 bit boot code and use it
2024-10-11 8:52 ` [PATCH v3 1/5] x86/boot: create a C bundle for 32 bit boot code and use it Frediano Ziglio
2024-10-11 10:57 ` [Makefile only] " Andrew Cooper
@ 2024-10-11 13:17 ` Andrew Cooper
2024-10-11 13:53 ` Frediano Ziglio
2024-10-11 13:20 ` Jan Beulich
2 siblings, 1 reply; 23+ messages in thread
From: Andrew Cooper @ 2024-10-11 13:17 UTC (permalink / raw)
To: Frediano Ziglio, xen-devel
Cc: Jan Beulich, Roger Pau Monné, Julien Grall,
Stefano Stabellini
On 11/10/2024 9:52 am, Frediano Ziglio wrote:
> diff --git a/xen/arch/x86/boot/Makefile b/xen/arch/x86/boot/Makefile
> index ff0f965876..4cf0d7e140 100644
> --- a/xen/arch/x86/boot/Makefile
> +++ b/xen/arch/x86/boot/Makefile
> ...
> +$(obj)/built_in_32.o: $(obj)/built_in_32.other.bin $(obj)/built_in_32.final.bin
> + $(PYTHON) $(srctree)/tools/combine_two_binaries \
> + --script $(obj)/build32.final.lds \
> + --bin1 $(obj)/built_in_32.other.bin --bin2 $(obj)/built_in_32.final.bin \
> + --map $(obj)/built_in_32.final.map \
> + --exports cmdline_parse_early,reloc \
> + --section-header '.section .init.text, "ax", @progbits' \
> + --output $(obj)/built_in_32.s
I can't see a case where we'd want this anywhere other than .init.text,
so I'd drop the --section-header and just write it out unconditionally.
However, looking at the output:
xen$ head arch/x86/boot/built_in_32.S
.section .init.text, "ax", @progbits
.byte 137,194,128,56,0,116,6,64,128,56,0,117,250,41,208,195
.byte 133,201,116,42,86,83,141,52,8,64,15,182,72,255,66,15
.byte 182,90,255,56,217,117,15,132,201,116,25,57,198,117,234,184
This wants to start with a comment saying that it was automatically
generated by combine_two_binaries.
Next, we need some kind of symbol at the start. Right now, the
disassembly reads:
arch/x86/boot/built_in_32.o: file format elf64-x86-64
Disassembly of section .init.text:
0000000000000000 <cmdline_parse_early-0x391>:
0: 89 c2 mov %eax,%edx
2: 80 38 00 cmpb $0x0,(%eax)
because most metadata is lost by this transformation, and it doesn't
know that this is in fact strlen(). I'd suggest suggest obj32_start: or
equivalent.
> diff --git a/xen/arch/x86/boot/build32.lds b/xen/arch/x86/boot/build32.lds.S
> similarity index 63%
> rename from xen/arch/x86/boot/build32.lds
> rename to xen/arch/x86/boot/build32.lds.S
> index 56edaa727b..72a4c5c614 100644
> --- a/xen/arch/x86/boot/build32.lds
> +++ b/xen/arch/x86/boot/build32.lds.S
> @@ -15,22 +15,52 @@
> * with this program. If not, see <http://www.gnu.org/licenses/>.
> */
>
> -ENTRY(_start)
> +#ifdef FINAL
> +# define GAP 0
> +# define MULT 0
> +# define TEXT_START
> +#else
> +# define GAP 0x010200
> +# define MULT 1
> +# define TEXT_START 0x408020
> +#endif
> +# define DECLARE_IMPORT(name) name = . + (__LINE__ * MULT)
> +
> +ENTRY(dummy_start)
>
> SECTIONS
> {
> - /* Merge code and data into one section. */
> - .text : {
> + /* Merge code and read-only data into one section. */
> + .text TEXT_START : {
> + /* Silence linker warning, we are not going to use it */
> + dummy_start = .;
> +
> + /* Declare below any symbol name needed.
> + * Each symbol should be on its own line.
> + * It looks like a tedious work but we make sure the things we use.
> + * Potentially they should be all variables. */
> + DECLARE_IMPORT(__base_relocs_start);
> + DECLARE_IMPORT(__base_relocs_end);
> + . = . + GAP;
> *(.text)
> *(.text.*)
> - *(.data)
> - *(.data.*)
> *(.rodata)
> *(.rodata.*)
> + }
> +
> + /* Writeable data sections. Check empty.
> + * We collapse all into code section and we don't want it to be writeable. */
> + .data : {
> + *(.data)
> + *(.data.*)
> *(.bss)
> *(.bss.*)
> }
> -
> + /DISCARD/ : {
> + *(.comment)
> + *(.comment.*)
> + *(.note.*)
> + }
> /* Dynamic linkage sections. Collected simply so we can check they're empty. */
> .got : {
> *(.got)
> @@ -64,3 +94,4 @@ ASSERT(SIZEOF(.igot.plt) == 0, ".igot.plt non-empty")
> ASSERT(SIZEOF(.iplt) == 0, ".iplt non-empty")
> ASSERT(SIZEOF(.plt) == 0, ".plt non-empty")
> ASSERT(SIZEOF(.rel) == 0, "leftover relocations")
> +ASSERT(SIZEOF(.data) == 0, "we don't want data")
3mdeb are getting around to rebasing/resubmitting the Trenchboot work
(Intel TXT and AMD SKINIT) backing QubeOS Anti-Evil-Maid.
While most of the cleanup is proving very helpful (i.e. reducing their
work), the lack of .data was seen as likely to be a blocker. Thinking
about this more, I'm now fairly certain we do not need to exclude data.
This object executes during boot in 32bit flat unpaged mode (i.e. there
are no actual restrictions during execution), and because it's all
wrapped in .init.text, its "just code" to the outside world. This means
it does not impact R^X that we're trying to arrange for the EFI section
headers.
Therefore the data arrangements should stay as they were before, and I
think the result will be fine. We obviously don't want gratuitous use
of .data, but we don't need to prohibit it either.
I've got various other suggestions for improvements of the result, but
they can all be deferred until later. This is complicated enough.
> diff --git a/xen/tools/combine_two_binaries b/xen/tools/combine_two_binaries
> new file mode 100755
> index 0000000000..ea2d6ddc4e
> --- /dev/null
> +++ b/xen/tools/combine_two_binaries
> @@ -0,0 +1,198 @@
> +#!/usr/bin/env python3
> +
> +from __future__ import print_function
> +import argparse
> +import re
> +import struct
> +import sys
> +
> +parser = argparse.ArgumentParser(description='Generate assembly file to merge into other code.')
> +parser.add_argument('--script', dest='script',
> + required=True,
> + help='Linker script for extracting symbols')
> +parser.add_argument('--bin1', dest='bin1',
> + required=True,
> + help='First binary')
> +parser.add_argument('--bin2', dest='bin2',
> + required=True,
> + help='Second binary')
> +parser.add_argument('--output', dest='output',
> + help='Output file')
> +parser.add_argument('--map', dest='mapfile',
> + help='Map file to read for symbols to export')
> +parser.add_argument('--exports', dest='exports',
> + help='Symbols to export')
> +parser.add_argument('--section-header', dest='section_header',
> + default='.text',
> + help='Section header declaration')
> +parser.add_argument('-v', '--verbose',
> + action='store_true')
> +args = parser.parse_args()
> +
> +gap = 0x010200
> +text_diff = 0x408020
> +
> +# Parse linker script for external symbols to use.
> +symbol_re = re.compile(r'\s+(\S+)\s*=\s*\.\s*\+\s*\((\d+)\s*\*\s*0\s*\)\s*;')
What is this looking for? I'd expect the DECLARE_IMPORT() lines, but
this is as clear as regexes...
> +symbols = {}
> +lines = {}
> +for line in open(args.script):
> + m = symbol_re.match(line)
> + if not m:
> + continue
> + (name, line_num) = (m.group(1), int(m.group(2)))
> + if line_num == 0:
> + raise Exception("Invalid line number found:\n\t" + line)
> + if line_num in symbols:
> + raise Exception("Symbol with this line already present:\n\t" + line)
> + if name in lines:
> + raise Exception("Symbol with this name already present:\n\t" + name)
> + symbols[line_num] = name
> + lines[name] = line_num
> +
> +exports = []
> +if args.exports is not None:
> + exports = dict([(name, None) for name in args.exports.split(',')])
> +
> +# Parse mapfile, look for ther symbols we want to export.
> +if args.mapfile is not None:
> + symbol_re = re.compile(r'\s{15,}0x([0-9a-f]+)\s+(\S+)\n')
> + for line in open(args.mapfile):
> + m = symbol_re.match(line)
> + if not m or m.group(2) not in exports:
> + continue
> + addr = int(m.group(1), 16)
> + exports[m.group(2)] = addr
> +for (name, addr) in exports.items():
> + if addr is None:
> + raise Exception("Required export symbols %s not found" % name)
> +
> +file1 = open(args.bin1, 'rb')
> +file2 = open(args.bin2, 'rb')
> +file1.seek(0, 2)
> +size1 = file1.tell()
> +file2.seek(0, 2)
> +size2 = file2.tell()
> +if size1 > size2:
> + file1, file2 = file2, file1
> + size1, size2 = size2, size1
> +if size2 != size1 + gap:
> + raise Exception('File sizes do not match')
> +
> +file1.seek(0, 0)
> +data1 = file1.read(size1)
> +file2.seek(gap, 0)
> +data2 = file2.read(size1)
> +
> +max_line = max(symbols.keys())
> +
> +def to_int32(n):
> + '''Convert a number to signed 32 bit integer truncating if needed'''
> + mask = (1 << 32) - 1
> + h = 1 << 31
> + return (n & mask) ^ h - h
> +
> +i = 0
> +references = {}
> +internals = 0
> +while i <= size1 - 4:
> + n1 = struct.unpack('<I', data1[i:i+4])[0]
> + n2 = struct.unpack('<I', data2[i:i+4])[0]
> + i += 1
> + # The two numbers are the same, no problems
> + if n1 == n2:
> + continue
> + # Try to understand why they are different
> + diff = to_int32(n1 - n2)
> + if diff == -gap: # this is an internal relocation
> + pos = i - 1
> + print(("Internal relocation found at position %#x "
> + "n1=%#x n2=%#x diff=%#x") % (pos, n1, n2, diff),
Here and elsewhere, you don't need brackets around around the string
itself. Python strings are like C strings and will auto-concatenate.
> + file=sys.stderr)
> + i += 3
> + internals += 1
> + if internals >= 10:
> + break
> + continue
> + # This is a relative relocation to a symbol, accepted, code/data is
> + # relocatable.
> + if diff < gap and diff >= gap - max_line:
> + n = gap - diff
> + symbol = symbols.get(n)
> + # check we have a symbol
> + if symbol is None:
> + raise Exception("Cannot find symbol for line %d" % n)
> + pos = i - 1
> + if args.verbose:
> + print('Position %#x %d %s' % (pos, n, symbol), file=sys.stderr)
> + i += 3
> + references[pos] = symbol
> + continue
> + # First byte is the same, move to next byte
> + if diff & 0xff == 0 and i <= size1 - 4:
> + continue
> + # Probably a type of relocation we don't want or support
> + pos = i - 1
> + suggestion = ''
> + symbol = symbols.get(-diff - text_diff)
> + if symbol is not None:
> + suggestion = " Maybe %s is not defined as hidden?" % symbol
> + raise Exception(("Unexpected difference found at %#x "
> + "n1=%#x n2=%#x diff=%#x gap=%#x.%s") % \
> + (pos, n1, n2, diff, gap, suggestion))
> +if internals != 0:
> + raise Exception("Previous relocations found")
> +
> +def line_bytes(buf, out):
> + '''Output an assembly line with all bytes in "buf"'''
> + if type(buf) == str:
> + print("\t.byte " + ','.join([str(ord(c)) for c in buf]), file=out)
> + else:
> + print("\t.byte " + ','.join([str(n) for n in buf]), file=out)
I'm guessing this is Py2/3 compatibility?
> +
> +def part(start, end, out):
> + '''Output bytes of "data" from "start" to "end"'''
> + while start < end:
> + e = min(start + 16, end)
> + line_bytes(data1[start:e], out)
> + start = e
> +
> +def reference(pos, out):
> + name = references[pos]
> + n = struct.unpack('<I', data1[pos:pos+4])[0]
> + sign = '+'
> + if n >= (1 << 31):
> + n -= (1 << 32)
> + n += pos
> + if n < 0:
> + n = -n
> + sign = '-'
> + print("\t.hidden %s\n\t.long %s %s %#x - ." % (name, name, sign, n),
Personally, I think this is easier to read as:
print("\t.hidden %s\n"
"\t.long %s %s %#x - ." % (name, name, sign, n),
file=out)
so it visually matches the output too. Same for .globl/hidden lower.
~Andrew
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3 1/5] x86/boot: create a C bundle for 32 bit boot code and use it
2024-10-11 8:52 ` [PATCH v3 1/5] x86/boot: create a C bundle for 32 bit boot code and use it Frediano Ziglio
2024-10-11 10:57 ` [Makefile only] " Andrew Cooper
2024-10-11 13:17 ` [python] " Andrew Cooper
@ 2024-10-11 13:20 ` Jan Beulich
2024-10-11 13:21 ` Andrew Cooper
2 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2024-10-11 13:20 UTC (permalink / raw)
To: Frediano Ziglio
Cc: Andrew Cooper, Roger Pau Monné, Julien Grall,
Stefano Stabellini, xen-devel
On 11.10.2024 10:52, Frediano Ziglio wrote:
> diff --git a/xen/tools/combine_two_binaries b/xen/tools/combine_two_binaries
> new file mode 100755
> index 0000000000..ea2d6ddc4e
> --- /dev/null
> +++ b/xen/tools/combine_two_binaries
> @@ -0,0 +1,198 @@
> +#!/usr/bin/env python3
Nit: Such files, by convention, are named *.py, I think.
Jan
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3 1/5] x86/boot: create a C bundle for 32 bit boot code and use it
2024-10-11 13:20 ` Jan Beulich
@ 2024-10-11 13:21 ` Andrew Cooper
0 siblings, 0 replies; 23+ messages in thread
From: Andrew Cooper @ 2024-10-11 13:21 UTC (permalink / raw)
To: Jan Beulich, Frediano Ziglio
Cc: Roger Pau Monné, Julien Grall, Stefano Stabellini, xen-devel
On 11/10/2024 2:20 pm, Jan Beulich wrote:
> On 11.10.2024 10:52, Frediano Ziglio wrote:
>> diff --git a/xen/tools/combine_two_binaries b/xen/tools/combine_two_binaries
>> new file mode 100755
>> index 0000000000..ea2d6ddc4e
>> --- /dev/null
>> +++ b/xen/tools/combine_two_binaries
>> @@ -0,0 +1,198 @@
>> +#!/usr/bin/env python3
> Nit: Such files, by convention, are named *.py, I think.
Yes please, and I forgot it on my python review.
~Andrew
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3 5/5] x86/boot: Clarify comment
2024-10-11 13:08 ` Frediano Ziglio
@ 2024-10-11 13:28 ` Alejandro Vallejo
2024-10-11 13:38 ` Andrew Cooper
0 siblings, 1 reply; 23+ messages in thread
From: Alejandro Vallejo @ 2024-10-11 13:28 UTC (permalink / raw)
To: Frediano Ziglio
Cc: xen-devel, Jan Beulich, Andrew Cooper, Roger Pau Monné
On Fri, Oct 11, 2024 at 02:08:37PM +0100, Frediano Ziglio wrote:
> On Fri, Oct 11, 2024 at 1:56 PM Alejandro Vallejo
> <alejandro.vallejo@cloud.com> wrote:
> >
> > On Fri, Oct 11, 2024 at 09:52:44AM +0100, Frediano Ziglio wrote:
> > > Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
> > > ---
> > > xen/arch/x86/boot/reloc.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c
> > > index e50e161b27..e725cfb6eb 100644
> > > --- a/xen/arch/x86/boot/reloc.c
> > > +++ b/xen/arch/x86/boot/reloc.c
> > > @@ -65,7 +65,7 @@ typedef struct memctx {
> > > /*
> > > * Simple bump allocator.
> > > *
> > > - * It starts from the base of the trampoline and allocates downwards.
> > > + * It starts on top of space reserved for the trampoline and allocates downwards.
> >
> > nit: Not sure this is much clearer. The trampoline is not a stack (and even if
> > it was, I personally find "top" and "bottom" quite ambiguous when it grows
> > backwards), so calling top to its lowest address seems more confusing than not.
> >
> > If anything clarification ought to go in the which direction it takes. Leaving
> > "base" instead of "top" and replacing "downwards" by "backwards" to make it
> > crystal clear that it's a pointer that starts where the trampoline starts, but
> > moves in the opposite direction.
> >
>
> Base looks confusing to me, but surely that comment could be confusing.
> For the trampoline 64 KB are reserved. Last 4 KB are used as a normal
> stack (push/pop/call/whatever), first part gets a copy of the
> trampoline code/data (about 6 Kb) the rest (so 64 - 4 - ~6 = ~54 kb)
> is used for the copy of MBI information. That "rest" is what we are
> talking about here.
Last? From what I looked at it seems to be the first 12K.
#define TRAMPOLINE_STACK_SPACE PAGE_SIZE
#define TRAMPOLINE_SPACE (KB(64) - TRAMPOLINE_STACK_SPACE)
To put it another way, with left=lo-addr and right=hi-addr. The code seems to
do this...
|<--------------64K-------------->|
|<-----12K--->| |
+-------------+-----+-------------+
| stack-space | mbi | trampoline |
+-------------+-----+-------------+
^ ^
| |
| +-- copied Multiboot info + modules
+----- initial memctx.ptr
... with the stack growing backwards to avoid overflowing onto mbi.
Or am I missing something?
Cheers,
Alejandro
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3 5/5] x86/boot: Clarify comment
2024-10-11 13:28 ` Alejandro Vallejo
@ 2024-10-11 13:38 ` Andrew Cooper
2024-10-11 13:58 ` Frediano Ziglio
2024-10-11 14:06 ` Andrew Cooper
0 siblings, 2 replies; 23+ messages in thread
From: Andrew Cooper @ 2024-10-11 13:38 UTC (permalink / raw)
To: Alejandro Vallejo, Frediano Ziglio
Cc: xen-devel, Jan Beulich, Roger Pau Monné
On 11/10/2024 2:28 pm, Alejandro Vallejo wrote:
> On Fri, Oct 11, 2024 at 02:08:37PM +0100, Frediano Ziglio wrote:
>> On Fri, Oct 11, 2024 at 1:56 PM Alejandro Vallejo
>> <alejandro.vallejo@cloud.com> wrote:
>>> On Fri, Oct 11, 2024 at 09:52:44AM +0100, Frediano Ziglio wrote:
>>>> Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
>>>> ---
>>>> xen/arch/x86/boot/reloc.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c
>>>> index e50e161b27..e725cfb6eb 100644
>>>> --- a/xen/arch/x86/boot/reloc.c
>>>> +++ b/xen/arch/x86/boot/reloc.c
>>>> @@ -65,7 +65,7 @@ typedef struct memctx {
>>>> /*
>>>> * Simple bump allocator.
>>>> *
>>>> - * It starts from the base of the trampoline and allocates downwards.
>>>> + * It starts on top of space reserved for the trampoline and allocates downwards.
>>> nit: Not sure this is much clearer. The trampoline is not a stack (and even if
>>> it was, I personally find "top" and "bottom" quite ambiguous when it grows
>>> backwards), so calling top to its lowest address seems more confusing than not.
>>>
>>> If anything clarification ought to go in the which direction it takes. Leaving
>>> "base" instead of "top" and replacing "downwards" by "backwards" to make it
>>> crystal clear that it's a pointer that starts where the trampoline starts, but
>>> moves in the opposite direction.
>>>
>> Base looks confusing to me, but surely that comment could be confusing.
>> For the trampoline 64 KB are reserved. Last 4 KB are used as a normal
>> stack (push/pop/call/whatever), first part gets a copy of the
>> trampoline code/data (about 6 Kb) the rest (so 64 - 4 - ~6 = ~54 kb)
>> is used for the copy of MBI information. That "rest" is what we are
>> talking about here.
> Last? From what I looked at it seems to be the first 12K.
>
> #define TRAMPOLINE_STACK_SPACE PAGE_SIZE
> #define TRAMPOLINE_SPACE (KB(64) - TRAMPOLINE_STACK_SPACE)
>
> To put it another way, with left=lo-addr and right=hi-addr. The code seems to
> do this...
>
> |<--------------64K-------------->|
> |<-----12K--->| |
> +-------------+-----+-------------+
> | stack-space | mbi | trampoline |
> +-------------+-----+-------------+
> ^ ^
> | |
> | +-- copied Multiboot info + modules
> +----- initial memctx.ptr
>
> ... with the stack growing backwards to avoid overflowing onto mbi.
>
> Or am I missing something?
So I was hoping for some kind of diagram like this, to live in
arch/x86/include/asm/trampoline.h with the other notes about the trampoline.
But, is that diagram accurate? Looking at
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [python] [PATCH v3 1/5] x86/boot: create a C bundle for 32 bit boot code and use it
2024-10-11 13:17 ` [python] " Andrew Cooper
@ 2024-10-11 13:53 ` Frediano Ziglio
0 siblings, 0 replies; 23+ messages in thread
From: Frediano Ziglio @ 2024-10-11 13:53 UTC (permalink / raw)
To: Andrew Cooper
Cc: xen-devel, Jan Beulich, Roger Pau Monné, Julien Grall,
Stefano Stabellini
On Fri, Oct 11, 2024 at 2:17 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>
> On 11/10/2024 9:52 am, Frediano Ziglio wrote:
> > diff --git a/xen/arch/x86/boot/Makefile b/xen/arch/x86/boot/Makefile
> > index ff0f965876..4cf0d7e140 100644
> > --- a/xen/arch/x86/boot/Makefile
> > +++ b/xen/arch/x86/boot/Makefile
> > ...
> > +$(obj)/built_in_32.o: $(obj)/built_in_32.other.bin $(obj)/built_in_32.final.bin
> > + $(PYTHON) $(srctree)/tools/combine_two_binaries \
> > + --script $(obj)/build32.final.lds \
> > + --bin1 $(obj)/built_in_32.other.bin --bin2 $(obj)/built_in_32.final.bin \
> > + --map $(obj)/built_in_32.final.map \
> > + --exports cmdline_parse_early,reloc \
> > + --section-header '.section .init.text, "ax", @progbits' \
> > + --output $(obj)/built_in_32.s
>
> I can't see a case where we'd want this anywhere other than .init.text,
> so I'd drop the --section-header and just write it out unconditionally.
Could we just change the default in Python code and remove the option
calling the script?
> However, looking at the output:
>
> xen$ head arch/x86/boot/built_in_32.S
> .section .init.text, "ax", @progbits
> .byte 137,194,128,56,0,116,6,64,128,56,0,117,250,41,208,195
> .byte 133,201,116,42,86,83,141,52,8,64,15,182,72,255,66,15
> .byte 182,90,255,56,217,117,15,132,201,116,25,57,198,117,234,184
>
> This wants to start with a comment saying that it was automatically
> generated by combine_two_binaries.
>
Added a
print('''/*
* File autogenerated by combine_two_binaries.py DO NOT EDIT
*/''', file=out)
statement
And renamed the script file adding the ".py".
> Next, we need some kind of symbol at the start. Right now, the
> disassembly reads:
>
> arch/x86/boot/built_in_32.o: file format elf64-x86-64
> Disassembly of section .init.text:
> 0000000000000000 <cmdline_parse_early-0x391>:
> 0: 89 c2 mov %eax,%edx
> 2: 80 38 00 cmpb $0x0,(%eax)
>
>
> because most metadata is lost by this transformation, and it doesn't
> know that this is in fact strlen(). I'd suggest suggest obj32_start: or
> equivalent.
>
Would "obj_start" fine too? Maybe in the future it could be used for
64 bit too (to avoid mistakes like the relocation of error strings we
had).
> > diff --git a/xen/arch/x86/boot/build32.lds b/xen/arch/x86/boot/build32.lds.S
> > similarity index 63%
> > rename from xen/arch/x86/boot/build32.lds
> > rename to xen/arch/x86/boot/build32.lds.S
> > index 56edaa727b..72a4c5c614 100644
> > --- a/xen/arch/x86/boot/build32.lds
> > +++ b/xen/arch/x86/boot/build32.lds.S
> > @@ -15,22 +15,52 @@
> > * with this program. If not, see <http://www.gnu.org/licenses/>.
> > */
> >
> > -ENTRY(_start)
> > +#ifdef FINAL
> > +# define GAP 0
> > +# define MULT 0
> > +# define TEXT_START
> > +#else
> > +# define GAP 0x010200
> > +# define MULT 1
> > +# define TEXT_START 0x408020
> > +#endif
> > +# define DECLARE_IMPORT(name) name = . + (__LINE__ * MULT)
> > +
> > +ENTRY(dummy_start)
> >
> > SECTIONS
> > {
> > - /* Merge code and data into one section. */
> > - .text : {
> > + /* Merge code and read-only data into one section. */
> > + .text TEXT_START : {
> > + /* Silence linker warning, we are not going to use it */
> > + dummy_start = .;
> > +
> > + /* Declare below any symbol name needed.
> > + * Each symbol should be on its own line.
> > + * It looks like a tedious work but we make sure the things we use.
> > + * Potentially they should be all variables. */
> > + DECLARE_IMPORT(__base_relocs_start);
> > + DECLARE_IMPORT(__base_relocs_end);
> > + . = . + GAP;
> > *(.text)
> > *(.text.*)
> > - *(.data)
> > - *(.data.*)
> > *(.rodata)
> > *(.rodata.*)
> > + }
> > +
> > + /* Writeable data sections. Check empty.
> > + * We collapse all into code section and we don't want it to be writeable. */
> > + .data : {
> > + *(.data)
> > + *(.data.*)
> > *(.bss)
> > *(.bss.*)
> > }
> > -
> > + /DISCARD/ : {
> > + *(.comment)
> > + *(.comment.*)
> > + *(.note.*)
> > + }
> > /* Dynamic linkage sections. Collected simply so we can check they're empty. */
> > .got : {
> > *(.got)
> > @@ -64,3 +94,4 @@ ASSERT(SIZEOF(.igot.plt) == 0, ".igot.plt non-empty")
> > ASSERT(SIZEOF(.iplt) == 0, ".iplt non-empty")
> > ASSERT(SIZEOF(.plt) == 0, ".plt non-empty")
> > ASSERT(SIZEOF(.rel) == 0, "leftover relocations")
> > +ASSERT(SIZEOF(.data) == 0, "we don't want data")
>
> 3mdeb are getting around to rebasing/resubmitting the Trenchboot work
> (Intel TXT and AMD SKINIT) backing QubeOS Anti-Evil-Maid.
>
> While most of the cleanup is proving very helpful (i.e. reducing their
> work), the lack of .data was seen as likely to be a blocker. Thinking
> about this more, I'm now fairly certain we do not need to exclude data.
>
We could change if needed in the future.
Can I order:
- .text
- .rodata
- .data
- .bss
instead of old
- .text
- .data
- .rodata
- .bss
So all readonly code/data are more close?
> This object executes during boot in 32bit flat unpaged mode (i.e. there
> are no actual restrictions during execution), and because it's all
> wrapped in .init.text, its "just code" to the outside world. This means
> it does not impact R^X that we're trying to arrange for the EFI section
> headers.
>
> Therefore the data arrangements should stay as they were before, and I
> think the result will be fine. We obviously don't want gratuitous use
> of .data, but we don't need to prohibit it either.
>
> I've got various other suggestions for improvements of the result, but
> they can all be deferred until later. This is complicated enough.
>
> > diff --git a/xen/tools/combine_two_binaries b/xen/tools/combine_two_binaries
> > new file mode 100755
> > index 0000000000..ea2d6ddc4e
> > --- /dev/null
> > +++ b/xen/tools/combine_two_binaries
> > @@ -0,0 +1,198 @@
> > +#!/usr/bin/env python3
> > +
> > +from __future__ import print_function
> > +import argparse
> > +import re
> > +import struct
> > +import sys
> > +
> > +parser = argparse.ArgumentParser(description='Generate assembly file to merge into other code.')
> > +parser.add_argument('--script', dest='script',
> > + required=True,
> > + help='Linker script for extracting symbols')
> > +parser.add_argument('--bin1', dest='bin1',
> > + required=True,
> > + help='First binary')
> > +parser.add_argument('--bin2', dest='bin2',
> > + required=True,
> > + help='Second binary')
> > +parser.add_argument('--output', dest='output',
> > + help='Output file')
> > +parser.add_argument('--map', dest='mapfile',
> > + help='Map file to read for symbols to export')
> > +parser.add_argument('--exports', dest='exports',
> > + help='Symbols to export')
> > +parser.add_argument('--section-header', dest='section_header',
> > + default='.text',
> > + help='Section header declaration')
> > +parser.add_argument('-v', '--verbose',
> > + action='store_true')
> > +args = parser.parse_args()
> > +
> > +gap = 0x010200
> > +text_diff = 0x408020
> > +
> > +# Parse linker script for external symbols to use.
> > +symbol_re = re.compile(r'\s+(\S+)\s*=\s*\.\s*\+\s*\((\d+)\s*\*\s*0\s*\)\s*;')
>
> What is this looking for? I'd expect the DECLARE_IMPORT() lines, but
> this is as clear as regexes...
>
Added a
# Next regex matches expanded DECLARE_IMPORT lines in linker script.
comment
> > +symbols = {}
> > +lines = {}
> > +for line in open(args.script):
> > + m = symbol_re.match(line)
> > + if not m:
> > + continue
> > + (name, line_num) = (m.group(1), int(m.group(2)))
> > + if line_num == 0:
> > + raise Exception("Invalid line number found:\n\t" + line)
> > + if line_num in symbols:
> > + raise Exception("Symbol with this line already present:\n\t" + line)
> > + if name in lines:
> > + raise Exception("Symbol with this name already present:\n\t" + name)
> > + symbols[line_num] = name
> > + lines[name] = line_num
> > +
> > +exports = []
> > +if args.exports is not None:
> > + exports = dict([(name, None) for name in args.exports.split(',')])
> > +
> > +# Parse mapfile, look for ther symbols we want to export.
> > +if args.mapfile is not None:
> > + symbol_re = re.compile(r'\s{15,}0x([0-9a-f]+)\s+(\S+)\n')
> > + for line in open(args.mapfile):
> > + m = symbol_re.match(line)
> > + if not m or m.group(2) not in exports:
> > + continue
> > + addr = int(m.group(1), 16)
> > + exports[m.group(2)] = addr
> > +for (name, addr) in exports.items():
> > + if addr is None:
> > + raise Exception("Required export symbols %s not found" % name)
> > +
> > +file1 = open(args.bin1, 'rb')
> > +file2 = open(args.bin2, 'rb')
> > +file1.seek(0, 2)
> > +size1 = file1.tell()
> > +file2.seek(0, 2)
> > +size2 = file2.tell()
> > +if size1 > size2:
> > + file1, file2 = file2, file1
> > + size1, size2 = size2, size1
> > +if size2 != size1 + gap:
> > + raise Exception('File sizes do not match')
> > +
> > +file1.seek(0, 0)
> > +data1 = file1.read(size1)
> > +file2.seek(gap, 0)
> > +data2 = file2.read(size1)
> > +
> > +max_line = max(symbols.keys())
> > +
> > +def to_int32(n):
> > + '''Convert a number to signed 32 bit integer truncating if needed'''
> > + mask = (1 << 32) - 1
> > + h = 1 << 31
> > + return (n & mask) ^ h - h
> > +
> > +i = 0
> > +references = {}
> > +internals = 0
> > +while i <= size1 - 4:
> > + n1 = struct.unpack('<I', data1[i:i+4])[0]
> > + n2 = struct.unpack('<I', data2[i:i+4])[0]
> > + i += 1
> > + # The two numbers are the same, no problems
> > + if n1 == n2:
> > + continue
> > + # Try to understand why they are different
> > + diff = to_int32(n1 - n2)
> > + if diff == -gap: # this is an internal relocation
> > + pos = i - 1
> > + print(("Internal relocation found at position %#x "
> > + "n1=%#x n2=%#x diff=%#x") % (pos, n1, n2, diff),
>
> Here and elsewhere, you don't need brackets around around the string
> itself. Python strings are like C strings and will auto-concatenate.
>
Done (also another similar occurency below).
> > + file=sys.stderr)
> > + i += 3
> > + internals += 1
> > + if internals >= 10:
> > + break
> > + continue
> > + # This is a relative relocation to a symbol, accepted, code/data is
> > + # relocatable.
> > + if diff < gap and diff >= gap - max_line:
> > + n = gap - diff
> > + symbol = symbols.get(n)
> > + # check we have a symbol
> > + if symbol is None:
> > + raise Exception("Cannot find symbol for line %d" % n)
> > + pos = i - 1
> > + if args.verbose:
> > + print('Position %#x %d %s' % (pos, n, symbol), file=sys.stderr)
> > + i += 3
> > + references[pos] = symbol
> > + continue
> > + # First byte is the same, move to next byte
> > + if diff & 0xff == 0 and i <= size1 - 4:
> > + continue
> > + # Probably a type of relocation we don't want or support
> > + pos = i - 1
> > + suggestion = ''
> > + symbol = symbols.get(-diff - text_diff)
> > + if symbol is not None:
> > + suggestion = " Maybe %s is not defined as hidden?" % symbol
> > + raise Exception(("Unexpected difference found at %#x "
> > + "n1=%#x n2=%#x diff=%#x gap=%#x.%s") % \
> > + (pos, n1, n2, diff, gap, suggestion))
Here removed other parenthesis
> > +if internals != 0:
> > + raise Exception("Previous relocations found")
> > +
> > +def line_bytes(buf, out):
> > + '''Output an assembly line with all bytes in "buf"'''
> > + if type(buf) == str:
> > + print("\t.byte " + ','.join([str(ord(c)) for c in buf]), file=out)
> > + else:
> > + print("\t.byte " + ','.join([str(n) for n in buf]), file=out)
>
> I'm guessing this is Py2/3 compatibility?
>
Yes, as added
# Python 2 compatibility
will state
> > +
> > +def part(start, end, out):
> > + '''Output bytes of "data" from "start" to "end"'''
> > + while start < end:
> > + e = min(start + 16, end)
> > + line_bytes(data1[start:e], out)
> > + start = e
> > +
> > +def reference(pos, out):
> > + name = references[pos]
> > + n = struct.unpack('<I', data1[pos:pos+4])[0]
> > + sign = '+'
> > + if n >= (1 << 31):
> > + n -= (1 << 32)
> > + n += pos
> > + if n < 0:
> > + n = -n
> > + sign = '-'
> > + print("\t.hidden %s\n\t.long %s %s %#x - ." % (name, name, sign, n),
>
> Personally, I think this is easier to read as:
>
> print("\t.hidden %s\n"
> "\t.long %s %s %#x - ." % (name, name, sign, n),
> file=out)
>
> so it visually matches the output too. Same for .globl/hidden lower.
>
Done
> ~Andrew
Frediano
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3 5/5] x86/boot: Clarify comment
2024-10-11 13:38 ` Andrew Cooper
@ 2024-10-11 13:58 ` Frediano Ziglio
2024-10-11 15:06 ` Alejandro Vallejo
2024-10-11 14:06 ` Andrew Cooper
1 sibling, 1 reply; 23+ messages in thread
From: Frediano Ziglio @ 2024-10-11 13:58 UTC (permalink / raw)
To: Andrew Cooper
Cc: Alejandro Vallejo, xen-devel, Jan Beulich, Roger Pau Monné
On Fri, Oct 11, 2024 at 2:38 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>
> On 11/10/2024 2:28 pm, Alejandro Vallejo wrote:
> > On Fri, Oct 11, 2024 at 02:08:37PM +0100, Frediano Ziglio wrote:
> >> On Fri, Oct 11, 2024 at 1:56 PM Alejandro Vallejo
> >> <alejandro.vallejo@cloud.com> wrote:
> >>> On Fri, Oct 11, 2024 at 09:52:44AM +0100, Frediano Ziglio wrote:
> >>>> Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
> >>>> ---
> >>>> xen/arch/x86/boot/reloc.c | 2 +-
> >>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c
> >>>> index e50e161b27..e725cfb6eb 100644
> >>>> --- a/xen/arch/x86/boot/reloc.c
> >>>> +++ b/xen/arch/x86/boot/reloc.c
> >>>> @@ -65,7 +65,7 @@ typedef struct memctx {
> >>>> /*
> >>>> * Simple bump allocator.
> >>>> *
> >>>> - * It starts from the base of the trampoline and allocates downwards.
> >>>> + * It starts on top of space reserved for the trampoline and allocates downwards.
> >>> nit: Not sure this is much clearer. The trampoline is not a stack (and even if
> >>> it was, I personally find "top" and "bottom" quite ambiguous when it grows
> >>> backwards), so calling top to its lowest address seems more confusing than not.
> >>>
> >>> If anything clarification ought to go in the which direction it takes. Leaving
> >>> "base" instead of "top" and replacing "downwards" by "backwards" to make it
> >>> crystal clear that it's a pointer that starts where the trampoline starts, but
> >>> moves in the opposite direction.
> >>>
> >> Base looks confusing to me, but surely that comment could be confusing.
> >> For the trampoline 64 KB are reserved. Last 4 KB are used as a normal
> >> stack (push/pop/call/whatever), first part gets a copy of the
> >> trampoline code/data (about 6 Kb) the rest (so 64 - 4 - ~6 = ~54 kb)
> >> is used for the copy of MBI information. That "rest" is what we are
> >> talking about here.
> > Last? From what I looked at it seems to be the first 12K.
> >
> > #define TRAMPOLINE_STACK_SPACE PAGE_SIZE
> > #define TRAMPOLINE_SPACE (KB(64) - TRAMPOLINE_STACK_SPACE)
> >
> > To put it another way, with left=lo-addr and right=hi-addr. The code seems to
> > do this...
> >
> > |<--------------64K-------------->|
> > |<-----12K--->| |
> > +-------------+-----+-------------+
> > | stack-space | mbi | trampoline |
> > +-------------+-----+-------------+
> > ^ ^
> > | |
> > | +-- copied Multiboot info + modules
> > +----- initial memctx.ptr
> >
> > ... with the stack growing backwards to avoid overflowing onto mbi.
> >
> > Or am I missing something?
>
> So I was hoping for some kind of diagram like this, to live in
> arch/x86/include/asm/trampoline.h with the other notes about the trampoline.
>
> But, is that diagram accurate? Looking at
/* Switch to low-memory stack which lives at the end of
trampoline region. */
mov sym_esi(trampoline_phys), %edi
lea TRAMPOLINE_SPACE+TRAMPOLINE_STACK_SPACE(%edi),%esp
lea trampoline_boot_cpu_entry-trampoline_start(%edi),%eax
pushl $BOOT_CS32
push %eax
/* Copy bootstrap trampoline to low memory, below 1MB. */
lea sym_esi(trampoline_start), %esi
mov $((trampoline_end - trampoline_start) / 4),%ecx
rep movsl
So, from low to high
- trampoline code/data (%edi at beginning of copy is trampoline_phys,
%esi is trampoline_start)
- space (used for MBI copy)
- stack (%esp is set to trampoline_phys + TRAMPOLINE_SPACE +
TRAMPOLINE_STACK_SPACE)
Frediano
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3 5/5] x86/boot: Clarify comment
2024-10-11 13:38 ` Andrew Cooper
2024-10-11 13:58 ` Frediano Ziglio
@ 2024-10-11 14:06 ` Andrew Cooper
1 sibling, 0 replies; 23+ messages in thread
From: Andrew Cooper @ 2024-10-11 14:06 UTC (permalink / raw)
To: Alejandro Vallejo, Frediano Ziglio
Cc: xen-devel, Jan Beulich, Roger Pau Monné
On 11/10/2024 2:38 pm, Andrew Cooper wrote:
> On 11/10/2024 2:28 pm, Alejandro Vallejo wrote:
>> On Fri, Oct 11, 2024 at 02:08:37PM +0100, Frediano Ziglio wrote:
>>> On Fri, Oct 11, 2024 at 1:56 PM Alejandro Vallejo
>>> <alejandro.vallejo@cloud.com> wrote:
>>>> On Fri, Oct 11, 2024 at 09:52:44AM +0100, Frediano Ziglio wrote:
>>>>> Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
>>>>> ---
>>>>> xen/arch/x86/boot/reloc.c | 2 +-
>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c
>>>>> index e50e161b27..e725cfb6eb 100644
>>>>> --- a/xen/arch/x86/boot/reloc.c
>>>>> +++ b/xen/arch/x86/boot/reloc.c
>>>>> @@ -65,7 +65,7 @@ typedef struct memctx {
>>>>> /*
>>>>> * Simple bump allocator.
>>>>> *
>>>>> - * It starts from the base of the trampoline and allocates downwards.
>>>>> + * It starts on top of space reserved for the trampoline and allocates downwards.
>>>> nit: Not sure this is much clearer. The trampoline is not a stack (and even if
>>>> it was, I personally find "top" and "bottom" quite ambiguous when it grows
>>>> backwards), so calling top to its lowest address seems more confusing than not.
>>>>
>>>> If anything clarification ought to go in the which direction it takes. Leaving
>>>> "base" instead of "top" and replacing "downwards" by "backwards" to make it
>>>> crystal clear that it's a pointer that starts where the trampoline starts, but
>>>> moves in the opposite direction.
>>>>
>>> Base looks confusing to me, but surely that comment could be confusing.
>>> For the trampoline 64 KB are reserved. Last 4 KB are used as a normal
>>> stack (push/pop/call/whatever), first part gets a copy of the
>>> trampoline code/data (about 6 Kb) the rest (so 64 - 4 - ~6 = ~54 kb)
>>> is used for the copy of MBI information. That "rest" is what we are
>>> talking about here.
>> Last? From what I looked at it seems to be the first 12K.
>>
>> #define TRAMPOLINE_STACK_SPACE PAGE_SIZE
>> #define TRAMPOLINE_SPACE (KB(64) - TRAMPOLINE_STACK_SPACE)
>>
>> To put it another way, with left=lo-addr and right=hi-addr. The code seems to
>> do this...
>>
>> |<--------------64K-------------->|
>> |<-----12K--->| |
>> +-------------+-----+-------------+
>> | stack-space | mbi | trampoline |
>> +-------------+-----+-------------+
>> ^ ^
>> | |
>> | +-- copied Multiboot info + modules
>> +----- initial memctx.ptr
>>
>> ... with the stack growing backwards to avoid overflowing onto mbi.
>>
>> Or am I missing something?
> So I was hoping for some kind of diagram like this, to live in
> arch/x86/include/asm/trampoline.h with the other notes about the trampoline.
>
> But, is that diagram accurate? Looking at
Sorry, sent too early.
GDB says that trampoline_end-trampoline_start is 6512, so one and a half
pages.
TRAMPOLINE_STACK_SPACE is 4k, and subtracted from 64k to make
TRAMPOLINE_SPACE
So this is why code uses TRAMPOLINE_SPACE+TRAMPOLINE_STACK_SPACE (==
64k) for it's size calculation.
Within that, we've got MBI_SPACE_MIN (8k) used in the linker assertion,
for SPACE (60k) - (end - start)(~7k).
What we're really saying is that the total size is 64k, with the top 4k
being stack, the bottom 7k being .text(ish), and the middle 53k being
the MBI relocation area.
And memctx is a backwards bump allocator within the middle 53k.
Maybe we should defer this until after the Hyperlaunch BootInfo series
has come in. Later parts have a major change on how we handle the MBI
block, and I can see some substantial pruning opportunities.
~Andrew
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3 5/5] x86/boot: Clarify comment
2024-10-11 13:58 ` Frediano Ziglio
@ 2024-10-11 15:06 ` Alejandro Vallejo
2024-10-11 16:07 ` Andrew Cooper
0 siblings, 1 reply; 23+ messages in thread
From: Alejandro Vallejo @ 2024-10-11 15:06 UTC (permalink / raw)
To: Frediano Ziglio, Andrew Cooper
Cc: xen-devel, Jan Beulich, Roger Pau Monné
On Fri Oct 11, 2024 at 2:58 PM BST, Frediano Ziglio wrote:
> On Fri, Oct 11, 2024 at 2:38 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> >
> > On 11/10/2024 2:28 pm, Alejandro Vallejo wrote:
> > > On Fri, Oct 11, 2024 at 02:08:37PM +0100, Frediano Ziglio wrote:
> > >> On Fri, Oct 11, 2024 at 1:56 PM Alejandro Vallejo
> > >> <alejandro.vallejo@cloud.com> wrote:
> > >>> On Fri, Oct 11, 2024 at 09:52:44AM +0100, Frediano Ziglio wrote:
> > >>>> Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
> > >>>> ---
> > >>>> xen/arch/x86/boot/reloc.c | 2 +-
> > >>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> > >>>>
> > >>>> diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c
> > >>>> index e50e161b27..e725cfb6eb 100644
> > >>>> --- a/xen/arch/x86/boot/reloc.c
> > >>>> +++ b/xen/arch/x86/boot/reloc.c
> > >>>> @@ -65,7 +65,7 @@ typedef struct memctx {
> > >>>> /*
> > >>>> * Simple bump allocator.
> > >>>> *
> > >>>> - * It starts from the base of the trampoline and allocates downwards.
> > >>>> + * It starts on top of space reserved for the trampoline and allocates downwards.
> > >>> nit: Not sure this is much clearer. The trampoline is not a stack (and even if
> > >>> it was, I personally find "top" and "bottom" quite ambiguous when it grows
> > >>> backwards), so calling top to its lowest address seems more confusing than not.
> > >>>
> > >>> If anything clarification ought to go in the which direction it takes. Leaving
> > >>> "base" instead of "top" and replacing "downwards" by "backwards" to make it
> > >>> crystal clear that it's a pointer that starts where the trampoline starts, but
> > >>> moves in the opposite direction.
> > >>>
> > >> Base looks confusing to me, but surely that comment could be confusing.
> > >> For the trampoline 64 KB are reserved. Last 4 KB are used as a normal
> > >> stack (push/pop/call/whatever), first part gets a copy of the
> > >> trampoline code/data (about 6 Kb) the rest (so 64 - 4 - ~6 = ~54 kb)
> > >> is used for the copy of MBI information. That "rest" is what we are
> > >> talking about here.
> > > Last? From what I looked at it seems to be the first 12K.
> > >
> > > #define TRAMPOLINE_STACK_SPACE PAGE_SIZE
> > > #define TRAMPOLINE_SPACE (KB(64) - TRAMPOLINE_STACK_SPACE)
> > >
> > > To put it another way, with left=lo-addr and right=hi-addr. The code seems to
> > > do this...
> > >
> > > |<--------------64K-------------->|
> > > |<-----12K--->| |
s/12K/4K/
My brain merged the 12bits in the wrong place. Too much bit twiddling.
> > > +-------------+-----+-------------+
> > > | stack-space | mbi | trampoline |
> > > +-------------+-----+-------------+
> > > ^ ^
> > > | |
> > > | +-- copied Multiboot info + modules
> > > +----- initial memctx.ptr
> > >
> > > ... with the stack growing backwards to avoid overflowing onto mbi.
> > >
> > > Or am I missing something?
> >
> > So I was hoping for some kind of diagram like this, to live in
> > arch/x86/include/asm/trampoline.h with the other notes about the trampoline.
> >
> > But, is that diagram accurate? Looking at
>
> /* Switch to low-memory stack which lives at the end of
> trampoline region. */
> mov sym_esi(trampoline_phys), %edi
> lea TRAMPOLINE_SPACE+TRAMPOLINE_STACK_SPACE(%edi),%esp
> lea trampoline_boot_cpu_entry-trampoline_start(%edi),%eax
> pushl $BOOT_CS32
> push %eax
>
> /* Copy bootstrap trampoline to low memory, below 1MB. */
> lea sym_esi(trampoline_start), %esi
> mov $((trampoline_end - trampoline_start) / 4),%ecx
> rep movsl
>
> So, from low to high
> - trampoline code/data (%edi at beginning of copy is trampoline_phys,
> %esi is trampoline_start)
> - space (used for MBI copy)
> - stack (%esp is set to trampoline_phys + TRAMPOLINE_SPACE +
> TRAMPOLINE_STACK_SPACE)
>
> Frediano
So it's reversed from what I thought
|<--------------64K-------------->|
| |<-----4K---->|
+-------------+-----+-------------+
| text-(ish) | mbi | stack-space |
+-------------+-----+-------------+
^ ^
| |
| +-- initial memctx.ptr
+------------------- copied Multiboot info + modules
Your version of the comment is a definite improvement over the nonsense that
was there before. Sorry for the noise :)
Cheers,
Alejandro
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3 5/5] x86/boot: Clarify comment
2024-10-11 15:06 ` Alejandro Vallejo
@ 2024-10-11 16:07 ` Andrew Cooper
0 siblings, 0 replies; 23+ messages in thread
From: Andrew Cooper @ 2024-10-11 16:07 UTC (permalink / raw)
To: Alejandro Vallejo, Frediano Ziglio
Cc: xen-devel, Jan Beulich, Roger Pau Monné
On 11/10/2024 4:06 pm, Alejandro Vallejo wrote:
> On Fri Oct 11, 2024 at 2:58 PM BST, Frediano Ziglio wrote:
>> On Fri, Oct 11, 2024 at 2:38 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>> On 11/10/2024 2:28 pm, Alejandro Vallejo wrote:
>>>> On Fri, Oct 11, 2024 at 02:08:37PM +0100, Frediano Ziglio wrote:
>>>>> On Fri, Oct 11, 2024 at 1:56 PM Alejandro Vallejo
>>>>> <alejandro.vallejo@cloud.com> wrote:
>>>>>> On Fri, Oct 11, 2024 at 09:52:44AM +0100, Frediano Ziglio wrote:
>>>>>>> Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
>>>>>>> ---
>>>>>>> xen/arch/x86/boot/reloc.c | 2 +-
>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c
>>>>>>> index e50e161b27..e725cfb6eb 100644
>>>>>>> --- a/xen/arch/x86/boot/reloc.c
>>>>>>> +++ b/xen/arch/x86/boot/reloc.c
>>>>>>> @@ -65,7 +65,7 @@ typedef struct memctx {
>>>>>>> /*
>>>>>>> * Simple bump allocator.
>>>>>>> *
>>>>>>> - * It starts from the base of the trampoline and allocates downwards.
>>>>>>> + * It starts on top of space reserved for the trampoline and allocates downwards.
>>>>>> nit: Not sure this is much clearer. The trampoline is not a stack (and even if
>>>>>> it was, I personally find "top" and "bottom" quite ambiguous when it grows
>>>>>> backwards), so calling top to its lowest address seems more confusing than not.
>>>>>>
>>>>>> If anything clarification ought to go in the which direction it takes. Leaving
>>>>>> "base" instead of "top" and replacing "downwards" by "backwards" to make it
>>>>>> crystal clear that it's a pointer that starts where the trampoline starts, but
>>>>>> moves in the opposite direction.
>>>>>>
>>>>> Base looks confusing to me, but surely that comment could be confusing.
>>>>> For the trampoline 64 KB are reserved. Last 4 KB are used as a normal
>>>>> stack (push/pop/call/whatever), first part gets a copy of the
>>>>> trampoline code/data (about 6 Kb) the rest (so 64 - 4 - ~6 = ~54 kb)
>>>>> is used for the copy of MBI information. That "rest" is what we are
>>>>> talking about here.
>>>> Last? From what I looked at it seems to be the first 12K.
>>>>
>>>> #define TRAMPOLINE_STACK_SPACE PAGE_SIZE
>>>> #define TRAMPOLINE_SPACE (KB(64) - TRAMPOLINE_STACK_SPACE)
>>>>
>>>> To put it another way, with left=lo-addr and right=hi-addr. The code seems to
>>>> do this...
>>>>
>>>> |<--------------64K-------------->|
>>>> |<-----12K--->| |
> s/12K/4K/
>
> My brain merged the 12bits in the wrong place. Too much bit twiddling.
>
>>>> +-------------+-----+-------------+
>>>> | stack-space | mbi | trampoline |
>>>> +-------------+-----+-------------+
>>>> ^ ^
>>>> | |
>>>> | +-- copied Multiboot info + modules
>>>> +----- initial memctx.ptr
>>>>
>>>> ... with the stack growing backwards to avoid overflowing onto mbi.
>>>>
>>>> Or am I missing something?
>>> So I was hoping for some kind of diagram like this, to live in
>>> arch/x86/include/asm/trampoline.h with the other notes about the trampoline.
>>>
>>> But, is that diagram accurate? Looking at
>> /* Switch to low-memory stack which lives at the end of
>> trampoline region. */
>> mov sym_esi(trampoline_phys), %edi
>> lea TRAMPOLINE_SPACE+TRAMPOLINE_STACK_SPACE(%edi),%esp
>> lea trampoline_boot_cpu_entry-trampoline_start(%edi),%eax
>> pushl $BOOT_CS32
>> push %eax
>>
>> /* Copy bootstrap trampoline to low memory, below 1MB. */
>> lea sym_esi(trampoline_start), %esi
>> mov $((trampoline_end - trampoline_start) / 4),%ecx
>> rep movsl
>>
>> So, from low to high
>> - trampoline code/data (%edi at beginning of copy is trampoline_phys,
>> %esi is trampoline_start)
>> - space (used for MBI copy)
>> - stack (%esp is set to trampoline_phys + TRAMPOLINE_SPACE +
>> TRAMPOLINE_STACK_SPACE)
>>
>> Frediano
> So it's reversed from what I thought
>
> |<--------------64K-------------->|
> | |<-----4K---->|
> +-------------+-----+-------------+
> | text-(ish) | mbi | stack-space |
> +-------------+-----+-------------+
> ^ ^
> | |
> | +-- initial memctx.ptr
> +------------------- copied Multiboot info + modules
>
>
> Your version of the comment is a definite improvement over the nonsense that
> was there before. Sorry for the noise :)
Today, the pointer that becomes memctx.ptr is phys+SPACE, which does not
include the stack.
So initial memctx.ptr starts immediately below the stack, and the bump
allocator goes backwards (leftwards).
~Andrew
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2024-10-11 16:07 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-11 8:52 [PATCH v3 0/5] Reuse 32 bit C code more safely Frediano Ziglio
2024-10-11 8:52 ` [PATCH v3 1/5] x86/boot: create a C bundle for 32 bit boot code and use it Frediano Ziglio
2024-10-11 10:57 ` [Makefile only] " Andrew Cooper
2024-10-11 13:17 ` [python] " Andrew Cooper
2024-10-11 13:53 ` Frediano Ziglio
2024-10-11 13:20 ` Jan Beulich
2024-10-11 13:21 ` Andrew Cooper
2024-10-11 8:52 ` [PATCH v3 2/5] x86/boot: Reuse code to relocate trampoline Frediano Ziglio
2024-10-11 12:02 ` Andrew Cooper
2024-10-11 12:50 ` Jan Beulich
2024-10-11 8:52 ` [PATCH v3 3/5] x86/boot: Use boot_vid_info variable directly from C code Frediano Ziglio
2024-10-11 12:07 ` Andrew Cooper
2024-10-11 8:52 ` [PATCH v3 4/5] x86/boot: Use trampoline_phys " Frediano Ziglio
2024-10-11 12:05 ` Andrew Cooper
2024-10-11 8:52 ` [PATCH v3 5/5] x86/boot: Clarify comment Frediano Ziglio
2024-10-11 12:56 ` Alejandro Vallejo
2024-10-11 13:08 ` Frediano Ziglio
2024-10-11 13:28 ` Alejandro Vallejo
2024-10-11 13:38 ` Andrew Cooper
2024-10-11 13:58 ` Frediano Ziglio
2024-10-11 15:06 ` Alejandro Vallejo
2024-10-11 16:07 ` Andrew Cooper
2024-10-11 14:06 ` Andrew Cooper
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.