* [PATCH v2 0/5] symbols: assorted adjustments
@ 2025-04-02 13:56 Jan Beulich
2025-04-02 13:57 ` [PATCH v2 1/5] symbols: add minimal self-test Jan Beulich
` (4 more replies)
0 siblings, 5 replies; 21+ messages in thread
From: Jan Beulich @ 2025-04-02 13:56 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné
The main goal was what patch 3 does, but on the way there various other
things became noticeable, and some preparation was necessary, too.
1: add minimal self-test
2: split symbols_num_syms
3: arrange to know where functions end
4: centralize and re-arrange $(all_symbols) calculation
5: prefer symbols which have a type
Jan
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 1/5] symbols: add minimal self-test
2025-04-02 13:56 [PATCH v2 0/5] symbols: assorted adjustments Jan Beulich
@ 2025-04-02 13:57 ` Jan Beulich
2025-08-27 22:20 ` Jason Andryuk
2025-08-29 14:24 ` Roger Pau Monné
2025-04-02 13:58 ` [PATCH v2 2/5] symbols: split symbols_num_syms Jan Beulich
` (3 subsequent siblings)
4 siblings, 2 replies; 21+ messages in thread
From: Jan Beulich @ 2025-04-02 13:57 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné
... before making changes to the involved logic.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
While Andrew validly suggests cf_check isn't a requirement for selecting
which function(s) to use (with the non-upstream gcc patch that we're
using in CI), that's only because of how the non-upstream patch works.
Going function-pointer -> unsigned long -> function-pointer without it
being diagnosed that the cf_check is missing is a shortcoming there, and
might conceivably be fixed at some point. (Imo any address-taking on a
function should require it to be cf_check.) Hence I'd like to stick to
using cf_check functions only for passing to test_lookup().
With this in place, it may make sense to permit enabling FAST_SYMBOL_LOOKUP
even when LIVEPATCH=n. Thoughts? (In that case "symbols: centralize and
re-arrange $(all_symbols) calculation" would want pulling ahead.)
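For illustration only (not part of the patch): the name comparison in
test_lookup() has to cope with static symbols, whose recorded name may carry
a file name/path in front of a '#'. A minimal stand-alone sketch of that
stripping step follows; the helper name strip_file_prefix() is hypothetical
and does not exist in the tree.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: return the bare symbol name, skipping an optional
 * "file#" prefix as may be carried by static symbols.
 */
static const char *strip_file_prefix(const char *name)
{
    const char *hash = strchr(name, '#');

    return hash ? hash + 1 : name;
}
```

Note that the self-test passes the full (unstripped) name back to
symbols_lookup_by_name(); only the comparison against the expected name
uses the stripped form.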
--- a/xen/common/symbols.c
+++ b/xen/common/symbols.c
@@ -260,6 +260,41 @@ unsigned long symbols_lookup_by_name(con
return 0;
}
+#ifdef CONFIG_SELF_TESTS
+
+static void __init test_lookup(unsigned long addr, const char *expected)
+{
+ char buf[KSYM_NAME_LEN + 1];
+ const char *name, *symname;
+ unsigned long size, offs;
+
+ name = symbols_lookup(addr, &size, &offs, buf);
+ if ( !name )
+ panic("%s: address not found\n", expected);
+ if ( offs )
+ panic("%s: non-zero offset (%#lx) unexpected\n", expected, offs);
+
+ /* Cope with static symbols, where varying file names/paths may be used. */
+ symname = strchr(name, '#');
+ symname = symname ? symname + 1 : name;
+ if ( strcmp(symname, expected) )
+ panic("%s: unexpected symbol name: '%s'\n", expected, symname);
+
+ offs = symbols_lookup_by_name(name);
+ if ( offs != addr )
+ panic("%s: address %#lx unexpected; wanted %#lx\n",
+ expected, offs, addr);
+}
+
+static void __init __constructor test_symbols(void)
+{
+ /* Be sure to only try this for cf_check functions. */
+ test_lookup((unsigned long)dump_execstate, "dump_execstate");
+ test_lookup((unsigned long)test_symbols, __func__);
+}
+
+#endif /* CONFIG_SELF_TESTS */
+
/*
* Local variables:
* mode: C
* [PATCH v2 2/5] symbols: split symbols_num_syms
2025-04-02 13:56 [PATCH v2 0/5] symbols: assorted adjustments Jan Beulich
2025-04-02 13:57 ` [PATCH v2 1/5] symbols: add minimal self-test Jan Beulich
@ 2025-04-02 13:58 ` Jan Beulich
2025-08-27 22:20 ` Jason Andryuk
2025-04-02 13:58 ` [PATCH v2 3/5] symbols: arrange to know where functions end Jan Beulich
` (2 subsequent siblings)
4 siblings, 1 reply; 21+ messages in thread
From: Jan Beulich @ 2025-04-02 13:58 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné
In preparation for inserting address entries into symbols_addresses[] /
symbols_offsets[] without enlarging symbols_sorted_offsets[], split
symbols_num_syms into symbols_num_addrs (counting entries in the former
plus symbols_names[] as well as, less directly, symbols_markers[]) and
symbols_num_names (counting entries in the latter).
While doing the adjustment move declarations to a new private symbols.h,
to be used by both symbols.c and symbols-dummy.c. Replace u8/u16 while
doing so.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
--- a/xen/common/symbols.c
+++ b/xen/common/symbols.c
@@ -10,7 +10,6 @@
* compression (see tools/symbols.c for a more complete description)
*/
-#include <xen/symbols.h>
#include <xen/kernel.h>
#include <xen/init.h>
#include <xen/lib.h>
@@ -21,22 +20,7 @@
#include <xen/guest_access.h>
#include <xen/errno.h>
-#ifdef SYMBOLS_ORIGIN
-extern const unsigned int symbols_offsets[];
-#define symbols_address(n) (SYMBOLS_ORIGIN + symbols_offsets[n])
-#else
-extern const unsigned long symbols_addresses[];
-#define symbols_address(n) symbols_addresses[n]
-#endif
-extern const unsigned int symbols_num_syms;
-extern const u8 symbols_names[];
-
-extern const struct symbol_offset symbols_sorted_offsets[];
-
-extern const u8 symbols_token_table[];
-extern const u16 symbols_token_index[];
-
-extern const unsigned int symbols_markers[];
+#include "symbols.h"
/* expand a compressed symbol data into the resulting uncompressed string,
given the offset to where the symbol is in the compressed stream */
@@ -124,7 +108,7 @@ const char *symbols_lookup(unsigned long
/* do a binary search on the sorted symbols_addresses array */
low = 0;
- high = symbols_num_syms;
+ high = symbols_num_addrs;
while (high-low > 1) {
mid = (low + high) / 2;
@@ -141,7 +125,7 @@ const char *symbols_lookup(unsigned long
symbols_expand_symbol(get_symbol_offset(low), namebuf);
/* Search for next non-aliased symbol */
- for (i = low + 1; i < symbols_num_syms; i++) {
+ for (i = low + 1; i < symbols_num_addrs; i++) {
if (symbols_address(i) > symbols_address(low)) {
symbol_end = symbols_address(i);
break;
@@ -182,9 +166,9 @@ int xensyms_read(uint32_t *symnum, char
static unsigned int next_symbol, next_offset;
static DEFINE_SPINLOCK(symbols_mutex);
- if ( *symnum > symbols_num_syms )
+ if ( *symnum > symbols_num_addrs )
return -ERANGE;
- if ( *symnum == symbols_num_syms )
+ if ( *symnum == symbols_num_addrs )
{
/* No more symbols */
name[0] = '\0';
@@ -227,7 +211,7 @@ unsigned long symbols_lookup_by_name(con
#ifdef CONFIG_FAST_SYMBOL_LOOKUP
low = 0;
- high = symbols_num_syms;
+ high = symbols_num_names;
while ( low < high )
{
unsigned long mid = low + ((high - low) / 2);
--- /dev/null
+++ b/xen/common/symbols.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#include <xen/stdint.h>
+#include <xen/symbols.h>
+
+#ifdef SYMBOLS_ORIGIN
+extern const unsigned int symbols_offsets[];
+#define symbols_address(n) (SYMBOLS_ORIGIN + symbols_offsets[n])
+#else
+extern const unsigned long symbols_addresses[];
+#define symbols_address(n) symbols_addresses[n]
+#endif
+extern const unsigned int symbols_num_addrs;
+extern const unsigned char symbols_names[];
+
+extern const unsigned int symbols_num_names;
+extern const struct symbol_offset symbols_sorted_offsets[];
+
+extern const uint8_t symbols_token_table[];
+extern const uint16_t symbols_token_index[];
+
+extern const unsigned int symbols_markers[];
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
--- a/xen/common/symbols-dummy.c
+++ b/xen/common/symbols-dummy.c
@@ -3,22 +3,22 @@
* link of the hypervisor image.
*/
-#include <xen/types.h>
-#include <xen/symbols.h>
+#include "symbols.h"
#ifdef SYMBOLS_ORIGIN
const unsigned int symbols_offsets[1];
#else
const unsigned long symbols_addresses[1];
#endif
-const unsigned int symbols_num_syms;
-const u8 symbols_names[1];
+const unsigned int symbols_num_addrs;
+const unsigned char symbols_names[1];
#ifdef CONFIG_FAST_SYMBOL_LOOKUP
+const unsigned int symbols_num_names;
const struct symbol_offset symbols_sorted_offsets[1];
#endif
-const u8 symbols_token_table[1];
-const u16 symbols_token_index[1];
+const uint8_t symbols_token_table[1];
+const uint16_t symbols_token_index[1];
const unsigned int symbols_markers[1];
--- a/xen/tools/symbols.c
+++ b/xen/tools/symbols.c
@@ -323,7 +323,7 @@ static void write_src(void)
}
printf("\n");
- output_label("symbols_num_syms");
+ output_label("symbols_num_addrs");
printf("\t.long\t%d\n", table_cnt);
printf("\n");
@@ -373,6 +373,10 @@ static void write_src(void)
return;
}
+ output_label("symbols_num_names");
+ printf("\t.long\t%d\n", table_cnt);
+ printf("\n");
+
/* Sorted by original symbol names and type. */
qsort(table, table_cnt, sizeof(*table), compare_name_orig);
* [PATCH v2 3/5] symbols: arrange to know where functions end
2025-04-02 13:56 [PATCH v2 0/5] symbols: assorted adjustments Jan Beulich
2025-04-02 13:57 ` [PATCH v2 1/5] symbols: add minimal self-test Jan Beulich
2025-04-02 13:58 ` [PATCH v2 2/5] symbols: split symbols_num_syms Jan Beulich
@ 2025-04-02 13:58 ` Jan Beulich
2025-04-02 14:08 ` Jan Beulich
2025-08-28 1:03 ` Jason Andryuk
2025-04-02 13:59 ` [PATCH v2 4/5] symbols: centralize and re-arrange $(all_symbols) calculation Jan Beulich
2025-04-02 14:00 ` [PATCH v2 5/5] symbols: prefer symbols which have a type Jan Beulich
4 siblings, 2 replies; 21+ messages in thread
From: Jan Beulich @ 2025-04-02 13:58 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné
When determining the symbol for a given address (e.g. for the %pS
logging format specifier), so far the size of a symbol (function) was
assumed to be everything until the next symbol. There may be gaps
though, which would better be recognizable in output (often suggesting
something odd is going on).
Insert "fake" end symbols in the address table, accompanied by zero-
length type/name entries (to keep lookup reasonably close to how it
was).
Note however that this, with present GNU binutils, won't work for
xen.efi: The linker loses function sizes (they're not part of a normal
symbol table entry), and hence nm has no way of reporting them.
The address table growth is quite significant on x86 release builds (due
to functions being aligned to 16-byte boundaries), though: Its size
almost doubles.
Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Note: Style-wise this is a horrible mix. I'm trying to match styles with
what's used in the respective functions.
Older GNU ld retains section symbols, which nm then also lists. Should
we perhaps strip those as we read in nm's output? They don't provide any
useful extra information, as our linker scripts add section start
symbols anyway. (For the purposes here, luckily such section symbols are
at least emitted without size.)
Even for section start symbols there is the question of whether they
really need retaining (except perhaps when producing a map file). The
main question here likely is whether livepatch may have a need to look
them up by name. (Section end symbols may actually be slightly more
useful to keep, but that may also want considering more closely.)
---
v2: Deal with multiple symbols at the same address, but only some having
a size specified.
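For illustration only (not code from this patch): a minimal stand-alone
sketch of the scheme, assuming a tiny in-memory table instead of the
generated assembly. The names struct sym, struct slot, emit(), and lookup()
are hypothetical; want_end() mirrors the condition of want_symbol_end(), and
a NULL name stands in for the zero-length type/name entry of an END symbol.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the tool's input, sorted by address. */
struct sym {
    unsigned long addr, size;   /* size 0: nm reported none */
    const char *name;
};

/* One emitted address-table slot; name == NULL marks a fake END entry. */
struct slot {
    unsigned long addr;
    const char *name;
};

/* Emit an END entry only if the symbol has a size and stops short of the
 * next symbol's address (or is the last symbol). */
static int want_end(const struct sym *t, size_t n, size_t i)
{
    return t[i].size &&
           (i + 1 == n || t[i].addr + t[i].size < t[i + 1].addr);
}

/* Fill out[] (room for 2 * n slots); return the number of slots used. */
static size_t emit(struct sym *t, size_t n, struct slot *out)
{
    size_t i, cnt = 0;

    for (i = 0; i < n; i++) {
        out[cnt++] = (struct slot){ t[i].addr, t[i].name };
        if (want_end(t, n, i)) {
            out[cnt++] = (struct slot){ t[i].addr + t[i].size, NULL };
            continue;
        }
        /* Alias at the same address: hand the size on if the next symbol
         * has none, or a larger one. */
        if (t[i].size && i + 1 < n && t[i + 1].addr == t[i].addr &&
            (!t[i + 1].size || t[i + 1].size > t[i].size))
            t[i + 1].size = t[i].size;
    }
    return cnt;
}

/* Look up addr (which must not be below s[0].addr). Returns the name of the
 * first alias, with [*start, *end) being its extent; *end is 0 if unknown. */
static const char *lookup(const struct slot *s, size_t cnt,
                          unsigned long addr,
                          unsigned long *start, unsigned long *end)
{
    size_t low = 0, high = cnt, i;

    *end = 0;
    while (high - low > 1) {            /* last slot at or below addr */
        size_t mid = (low + high) / 2;

        if (s[mid].addr <= addr)
            low = mid;
        else
            high = mid;
    }
    if (!s[low].name) {                 /* hit an END entry: step back */
        *end = s[low].addr;
        --low;
    }
    while (low && s[low - 1].addr == s[low].addr)
        --low;                          /* move to the first alias */
    for (i = low + 1; !*end && i < cnt; i++)
        if (s[i].addr > s[low].addr)    /* no END: next higher address */
            *end = s[i].addr;
    *start = s[low].addr;
    return s[low].name;
}

static struct sym input[] = {
    { 0x1000, 0x30, "foo" },       /* ends at 0x1030, gap up to 0x1040 */
    { 0x1040, 0x20, "bar" },       /* sized ... */
    { 0x1040, 0,    "bar_alias" }, /* ... and its alias without a size */
    { 0x1080, 0x20, "baz" },       /* last symbol */
};
```

With this input, emit() yields seven slots (four symbols plus three END
entries). A caller can tell that an address falls into a gap by checking
whether it is at or beyond the returned end address.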
--- a/xen/common/symbols.c
+++ b/xen/common/symbols.c
@@ -116,6 +116,13 @@ const char *symbols_lookup(unsigned long
else high = mid;
}
+ /* If we hit an END symbol, move to the previous (real) one. */
+ if (!symbols_names[get_symbol_offset(low)]) {
+ ASSERT(low);
+ symbol_end = symbols_address(low);
+ --low;
+ }
+
/* search for the first aliased symbol. Aliased symbols are
symbols with the same address */
while (low && symbols_address(low - 1) == symbols_address(low))
@@ -124,11 +131,13 @@ const char *symbols_lookup(unsigned long
/* Grab name */
symbols_expand_symbol(get_symbol_offset(low), namebuf);
- /* Search for next non-aliased symbol */
- for (i = low + 1; i < symbols_num_addrs; i++) {
- if (symbols_address(i) > symbols_address(low)) {
- symbol_end = symbols_address(i);
- break;
+ if (!symbol_end) {
+ /* Search for next non-aliased symbol */
+ for (i = low + 1; i < symbols_num_addrs; i++) {
+ if (symbols_address(i) > symbols_address(low)) {
+ symbol_end = symbols_address(i);
+ break;
+ }
}
}
@@ -170,6 +179,7 @@ int xensyms_read(uint32_t *symnum, char
return -ERANGE;
if ( *symnum == symbols_num_addrs )
{
+ no_symbol:
/* No more symbols */
name[0] = '\0';
return 0;
@@ -183,10 +193,31 @@ int xensyms_read(uint32_t *symnum, char
/* Non-sequential access */
next_offset = get_symbol_offset(*symnum);
+ /*
+ * If we're at an END symbol, skip to the next (real) one. This can
+ * happen if the caller ignores the *symnum output from an earlier
+ * iteration (Linux's /proc/xen/xensyms handling does as of 6.14-rc).
+ */
+ if ( !symbols_names[next_offset] )
+ {
+ ++next_offset;
+ if ( ++*symnum == symbols_num_addrs )
+ goto no_symbol;
+ }
+
*type = symbols_get_symbol_type(next_offset);
next_offset = symbols_expand_symbol(next_offset, name);
*address = symbols_address(*symnum);
+ /* If next one is an END symbol, skip it. */
+ if ( !symbols_names[next_offset] )
+ {
+ ++next_offset;
+ /* Make sure not to increment past symbols_num_addrs below. */
+ if ( *symnum + 1 < symbols_num_addrs )
+ ++*symnum;
+ }
+
next_symbol = ++*symnum;
spin_unlock(&symbols_mutex);
--- a/xen/tools/symbols.c
+++ b/xen/tools/symbols.c
@@ -38,6 +38,7 @@
struct sym_entry {
unsigned long long addr;
+ unsigned long size;
unsigned int len;
unsigned char *sym;
char *orig_symbol;
@@ -87,6 +88,8 @@ static int read_symbol(FILE *in, struct
static char *filename;
int rc = -1;
+ s->size = 0;
+
switch (input_format) {
case fmt_bsd:
rc = fscanf(in, "%llx %c %499s\n", &s->addr, &stype, str);
@@ -96,8 +99,12 @@ static int read_symbol(FILE *in, struct
/* nothing */;
rc = fscanf(in, "%499[^ |] |%llx | %c |",
str, &s->addr, &stype);
- if (rc == 3 && fscanf(in, " %19[^ |] |", type) != 1)
- *type = '\0';
+ if (rc == 3) {
+ if (fscanf(in, " %19[^ |] |", type) != 1)
+ *type = '\0';
+ else if (fscanf(in, "%lx |", &s->size) != 1)
+ s->size = 0;
+ }
break;
}
if (rc != 3) {
@@ -287,9 +294,18 @@ static int compare_name_orig(const void
return rc;
}
+/* Determine whether the symbol at address table @idx wants a fake END
+ * symbol (address only) emitted as well. */
+static bool want_symbol_end(unsigned int idx)
+{
+ return table[idx].size &&
+ (idx + 1 == table_cnt ||
+ table[idx].addr + table[idx].size < table[idx + 1].addr);
+}
+
static void write_src(void)
{
- unsigned int i, k, off;
+ unsigned int i, k, off, ends;
unsigned int best_idx[256];
unsigned int *markers;
char buf[KSYM_NAME_LEN+1];
@@ -318,24 +334,42 @@ static void write_src(void)
printf("#else\n");
output_label("symbols_offsets");
printf("#endif\n");
- for (i = 0; i < table_cnt; i++) {
+ for (i = 0, ends = 0; i < table_cnt; i++) {
printf("\tPTR\t%#llx - SYMBOLS_ORIGIN\n", table[i].addr);
+
+ table[i].addr_idx = i + ends;
+
+ if (!want_symbol_end(i)) {
+ /* If there's another symbol at the same address,
+ * propagate this symbol's size if the next one has
+ * no size, or if the next one's size is larger. */
+ if (table[i].size &&
+ i + 1 < table_cnt &&
+ table[i + 1].addr == table[i].addr &&
+ (!table[i + 1].size ||
+ table[i + 1].size > table[i].size))
+ table[i + 1].size = table[i].size;
+ continue;
+ }
+
+ ++ends;
+ printf("\tPTR\t%#llx - SYMBOLS_ORIGIN\n",
+ table[i].addr + table[i].size);
}
printf("\n");
output_label("symbols_num_addrs");
- printf("\t.long\t%d\n", table_cnt);
+ printf("\t.long\t%d\n", table_cnt + ends);
printf("\n");
/* table of offset markers, that give the offset in the compressed stream
* every 256 symbols */
- markers = (unsigned int *) malloc(sizeof(unsigned int) * ((table_cnt + 255) / 256));
+ markers = malloc(sizeof(*markers) * ((table_cnt + ends + 255) >> 8));
output_label("symbols_names");
- off = 0;
- for (i = 0; i < table_cnt; i++) {
- if ((i & 0xFF) == 0)
- markers[i >> 8] = off;
+ for (i = 0, off = 0, ends = 0; i < table_cnt; i++) {
+ if (((i + ends) & 0xFF) == 0)
+ markers[(i + ends) >> 8] = off;
printf("\t.byte 0x%02x", table[i].len);
for (k = 0; k < table[i].len; k++)
@@ -344,11 +378,22 @@ static void write_src(void)
table[i].stream_offset = off;
off += table[i].len + 1;
+
+ if (!want_symbol_end(i))
+ continue;
+
+ /* END symbols have no name or type. */
+ ++ends;
+ if (((i + ends) & 0xFF) == 0)
+ markers[(i + ends) >> 8] = off;
+
+ printf("\t.byte 0\n");
+ ++off;
}
printf("\n");
output_label("symbols_markers");
- for (i = 0; i < ((table_cnt + 255) >> 8); i++)
+ for (i = 0; i < ((table_cnt + ends + 255) >> 8); i++)
printf("\t.long\t%d\n", markers[i]);
printf("\n");
@@ -450,7 +495,6 @@ static void compress_symbols(unsigned ch
len = table[i].len;
p1 = table[i].sym;
- table[i].addr_idx = i;
/* find the token on the symbol */
p2 = memmem_pvt(p1, len, str, 2);
if (!p2) continue;
* [PATCH v2 4/5] symbols: centralize and re-arrange $(all_symbols) calculation
2025-04-02 13:56 [PATCH v2 0/5] symbols: assorted adjustments Jan Beulich
` (2 preceding siblings ...)
2025-04-02 13:58 ` [PATCH v2 3/5] symbols: arrange to know where functions end Jan Beulich
@ 2025-04-02 13:59 ` Jan Beulich
2025-08-28 1:05 ` Jason Andryuk
2025-04-02 14:00 ` [PATCH v2 5/5] symbols: prefer symbols which have a type Jan Beulich
4 siblings, 1 reply; 21+ messages in thread
From: Jan Beulich @ 2025-04-02 13:59 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné, Bertrand Marquis,
Volodymyr Babchuk
For one there's no need for each architecture to have the same logic.
Move to the root Makefile, also to calculate just once.
And then re-arrange to permit FAST_SYMBOL_LOOKUP to be independent of
LIVEPATCH, which may be useful in (at least) debugging.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Likely syms-warn-dup-y wants to follow suit; it doesn't even have an Arm
counterpart right now.
--- a/xen/Makefile
+++ b/xen/Makefile
@@ -460,6 +460,10 @@ ALL_OBJS-$(CONFIG_CRYPTO) += crypto/buil
ALL_LIBS-y := lib/lib.a
+all-symbols-y :=
+all-symbols-$(CONFIG_LIVEPATCH) += --all-symbols
+all-symbols-$(CONFIG_FAST_SYMBOL_LOOKUP) += --sort-by-name
+
include $(srctree)/arch/$(SRCARCH)/arch.mk
# define new variables to avoid the ones defined in Config.mk
@@ -612,7 +616,8 @@ $(TARGET): outputmakefile asm-generic FO
$(Q)$(MAKE) $(build)=include all
$(Q)$(MAKE) $(build)=arch/$(SRCARCH) include
$(Q)$(MAKE) $(build)=. arch/$(SRCARCH)/include/asm/asm-offsets.h
- $(Q)$(MAKE) $(build)=. MKRELOC=$(MKRELOC) 'ALL_OBJS=$(ALL_OBJS-y)' 'ALL_LIBS=$(ALL_LIBS-y)' $@
+ $(Q)$(MAKE) $(build)=. MKRELOC=$(MKRELOC) 'ALL_OBJS=$(ALL_OBJS-y)' \
+ 'ALL_LIBS=$(ALL_LIBS-y)' 'all_symbols=$(all-symbols-y)' $@
SUBDIRS = xsm arch common crypto drivers lib test
define all_sources
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -81,15 +81,6 @@ ifneq ($(CONFIG_DTB_FILE),"")
obj-y += dtb.o
endif
-ifdef CONFIG_LIVEPATCH
-all_symbols = --all-symbols
-ifdef CONFIG_FAST_SYMBOL_LOOKUP
-all_symbols = --all-symbols --sort-by-name
-endif
-else
-all_symbols =
-endif
-
$(TARGET): $(TARGET)-syms
$(OBJCOPY) -O binary -S $< $@
ifeq ($(CONFIG_ARM_64),y)
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -111,15 +111,6 @@ notes_phdrs = --notes
endif
endif
-ifdef CONFIG_LIVEPATCH
-all_symbols = --all-symbols
-ifdef CONFIG_FAST_SYMBOL_LOOKUP
-all_symbols = --all-symbols --sort-by-name
-endif
-else
-all_symbols =
-endif
-
syms-warn-dup-y := --warn-dup
syms-warn-dup-$(CONFIG_SUPPRESS_DUPLICATE_SYMBOL_WARNINGS) :=
syms-warn-dup-$(CONFIG_ENFORCE_UNIQUE_SYMBOLS) := --error-dup
* [PATCH v2 5/5] symbols: prefer symbols which have a type
2025-04-02 13:56 [PATCH v2 0/5] symbols: assorted adjustments Jan Beulich
` (3 preceding siblings ...)
2025-04-02 13:59 ` [PATCH v2 4/5] symbols: centralize and re-arrange $(all_symbols) calculation Jan Beulich
@ 2025-04-02 14:00 ` Jan Beulich
2025-08-28 1:07 ` Jason Andryuk
4 siblings, 1 reply; 21+ messages in thread
From: Jan Beulich @ 2025-04-02 14:00 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné
... and thus typically also a size. Using global vs local is undesirable
in certain situations, e.g. when a "real" symbol is local and at the
same address as a section start/end one (which are all global).
Note that for xen.efi the checking for "Function" is only forward-
looking at this point: The function-ness of symbols (much like their
size) is lost when linking PE/COFF binaries from ELF objects with GNU ld
up to at least 2.44.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
I didn't see much reason to also check for "Pointer" and "Array" or any
of the basic types. While nm reports pointers and arrays (but not the
basic types) for PE/COFF, making those up when linker input is ELF would
be impossible without further auxiliary (and non-standard) data in the
ELF symbol table. Transforming STT_FUNC, otoh, is in principle possible.
Implicit from the above: Until GNU ld properly transforms STT_FUNC,
symbol conflicts will be resolved differently for functions. Symbol
conflicts will always be resolved differently for data. xen.efi stack
traces may therefore be less informative than xen-syms ones.
---
v2: New.
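For illustration only (not code from this patch): a minimal stand-alone
sketch of the resulting ordering, assuming a trimmed-down stand-in for
struct sym_entry. Unlike the tool's compare_value(), the global-vs-local
tie-break here is written symmetrically; that is a simplification of this
sketch, not a description of the existing code.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for the tool's struct sym_entry. */
struct sym {
    unsigned long long addr;
    char type;          /* nm type letter; upper case means global */
    bool typed;         /* nm reported FUNC, OBJECT, or Function */
    const char *name;
};

static int compare_value(const void *p1, const void *p2)
{
    const struct sym *s1 = p1, *s2 = p2;
    bool g1, g2;

    if (s1->addr != s2->addr)
        return s1->addr < s2->addr ? -1 : +1;

    /* Prefer symbols which have a type. */
    if (s1->typed != s2->typed)
        return s1->typed ? -1 : +1;

    /* Only then prefer global symbols. */
    g1 = isupper((unsigned char)s1->type);
    g2 = isupper((unsigned char)s2->type);
    if (g1 != g2)
        return g1 ? -1 : +1;

    return 0;
}

static struct sym syms[] = {
    { 0x2000, 'T', false, "__section_start" },  /* global, untyped */
    { 0x2000, 't', true,  "local_func" },       /* local, typed */
    { 0x1000, 'T', true,  "entry" },
};
```

Sorting syms[] with qsort() and this comparator places the typed local
function ahead of the untyped global section start symbol at the same
address, which is the case the description above is concerned with.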
--- a/xen/tools/symbols.c
+++ b/xen/tools/symbols.c
@@ -45,6 +45,7 @@ struct sym_entry {
unsigned int addr_idx;
unsigned int stream_offset;
unsigned char type;
+ bool typed;
};
#define SYMBOL_NAME(s) ((char *)(s)->sym + 1)
@@ -180,6 +181,9 @@ static int read_symbol(FILE *in, struct
s->type = stype; /* As s->sym[0] ends mangled. */
}
s->sym[0] = stype;
+ s->typed = strcmp(type, "FUNC") == 0 ||
+ strcmp(type, "OBJECT") == 0 ||
+ strcmp(type, "Function") == 0;
rc = 0;
skip_tail:
@@ -613,6 +617,13 @@ static int compare_value(const void *p1,
return -1;
if (sym1->addr > sym2->addr)
return +1;
+
+ /* Prefer symbols which have a type. */
+ if (sym1->typed && !sym2->typed)
+ return -1;
+ if (sym2->typed && !sym1->typed)
+ return +1;
+
/* Prefer global symbols. */
if (isupper(*sym1->sym))
return -1;
* Re: [PATCH v2 3/5] symbols: arrange to know where functions end
2025-04-02 13:58 ` [PATCH v2 3/5] symbols: arrange to know where functions end Jan Beulich
@ 2025-04-02 14:08 ` Jan Beulich
2025-08-28 1:03 ` Jason Andryuk
1 sibling, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2025-04-02 14:08 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné
On 02.04.2025 15:58, Jan Beulich wrote:
> When determining the symbol for a given address (e.g. for the %pS
> logging format specifier), so far the size of a symbol (function) was
> assumed to be everything until the next symbol. There may be gaps
> though, which would better be recognizable in output (often suggesting
> something odd is going on).
>
> Insert "fake" end symbols in the address table, accompanied by zero-
> length type/name entries (to keep lookup reasonably close to how it
> was).
>
> Note however that this, with present GNU binutils, won't work for
> xen.efi: The linker loses function sizes (they're not part of a normal
> symbol table entry), and hence nm has no way of reporting them.
And, just for reference:
https://sourceware.org/pipermail/binutils/2025-March/140252.html
Jan
* Re: [PATCH v2 1/5] symbols: add minimal self-test
2025-04-02 13:57 ` [PATCH v2 1/5] symbols: add minimal self-test Jan Beulich
@ 2025-08-27 22:20 ` Jason Andryuk
2025-08-29 14:24 ` Roger Pau Monné
1 sibling, 0 replies; 21+ messages in thread
From: Jason Andryuk @ 2025-08-27 22:20 UTC (permalink / raw)
To: Jan Beulich, xen-devel@lists.xenproject.org
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné
On 2025-04-02 09:57, Jan Beulich wrote:
> ... before making changes to the involved logic.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
* Re: [PATCH v2 2/5] symbols: split symbols_num_syms
2025-04-02 13:58 ` [PATCH v2 2/5] symbols: split symbols_num_syms Jan Beulich
@ 2025-08-27 22:20 ` Jason Andryuk
0 siblings, 0 replies; 21+ messages in thread
From: Jason Andryuk @ 2025-08-27 22:20 UTC (permalink / raw)
To: Jan Beulich, xen-devel@lists.xenproject.org
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné
On 2025-04-02 09:58, Jan Beulich wrote:
> In preparation for inserting address entries into symbols_addresses[] /
> symbols_offsets[] without enlarging symbols_sorted_offsets[], split
> symbols_num_syms into symbols_num_addrs (counting entries in the former
> plus symbols_names[] as well as, less directly, symbols_markers[]) and
> symbols_num_names (counting entries in the latter).
>
> While doing the adjustment move declarations to a new private symbols.h,
> to be used by both symbols.c and symbols-dummy.c. Replace u8/u16 while
> doing so.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
* Re: [PATCH v2 3/5] symbols: arrange to know where functions end
2025-04-02 13:58 ` [PATCH v2 3/5] symbols: arrange to know where functions end Jan Beulich
2025-04-02 14:08 ` Jan Beulich
@ 2025-08-28 1:03 ` Jason Andryuk
2025-08-28 7:28 ` Jan Beulich
1 sibling, 1 reply; 21+ messages in thread
From: Jason Andryuk @ 2025-08-28 1:03 UTC (permalink / raw)
To: Jan Beulich, xen-devel@lists.xenproject.org
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné
On 2025-04-02 09:58, Jan Beulich wrote:
> When determining the symbol for a given address (e.g. for the %pS
> logging format specifier), so far the size of a symbol (function) was
> assumed to be everything until the next symbol. There may be gaps
> though, which would better be recognizable in output (often suggesting
> something odd is going on).
>
> Insert "fake" end symbols in the address table, accompanied by zero-
> length type/name entries (to keep lookup reasonably close to how it
> was).
>
> Note however that this, with present GNU binutils, won't work for
> xen.efi: The linker loses function sizes (they're not part of a normal
> symbol table entry), and hence nm has no way of reporting them.
>
> The address table growth is quite significant on x86 release builds (due
> to functions being aligned to 16-byte boundaries), though: Its size
> almost doubles.
>
> Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> Note: Style-wise this is a horrible mix. I'm trying to match styles with
> what's used in the respective functions.
>
> Older GNU ld retains section symbols, which nm then also lists. Should
> we perhaps strip those as we read in nm's output? They don't provide any
> useful extra information, as our linker scripts add section start
> symbols anyway. (For the purposes here, luckily such section symbols are
> at least emitted without size.)
>
> Even for section start symbols there is the question of whether they
> really need retaining (except perhaps when producing a map file). The
> main question here likely is whether livepatch may have a need to look
> them up by name. (Section end symbols may actually be slightly more
> useful to keep, but that may also want considering more closely.)
> ---
> --- a/xen/tools/symbols.c
> +++ b/xen/tools/symbols.c
> @@ -318,24 +334,42 @@ static void write_src(void)
> printf("#else\n");
> output_label("symbols_offsets");
> printf("#endif\n");
> - for (i = 0; i < table_cnt; i++) {
> + for (i = 0, ends = 0; i < table_cnt; i++) {
> printf("\tPTR\t%#llx - SYMBOLS_ORIGIN\n", table[i].addr);
> +
> + table[i].addr_idx = i + ends;
> +
> + if (!want_symbol_end(i)) {
> + /* If there's another symbol at the same address,
> + * propagate this symbol's size if the next one has
> + * no size, or if the next one's size is larger. */
Why do we want to shrink the next symbol's size?
The code looks good - I just don't understand this condition.
Thanks,
Jason
> + if (table[i].size &&
> + i + 1 < table_cnt &&
> + table[i + 1].addr == table[i].addr &&
> + (!table[i + 1].size ||
> + table[i + 1].size > table[i].size))
> + table[i + 1].size = table[i].size;
> + continue;
> + }
> +
> + ++ends;
> + printf("\tPTR\t%#llx - SYMBOLS_ORIGIN\n",
> + table[i].addr + table[i].size);
> }
> printf("\n");
>
> output_label("symbols_num_addrs");
> - printf("\t.long\t%d\n", table_cnt);
> + printf("\t.long\t%d\n", table_cnt + ends);
> printf("\n");
>
> /* table of offset markers, that give the offset in the compressed stream
> * every 256 symbols */
> - markers = (unsigned int *) malloc(sizeof(unsigned int) * ((table_cnt + 255) / 256));
> + markers = malloc(sizeof(*markers) * ((table_cnt + ends + 255) >> 8));
>
> output_label("symbols_names");
> - off = 0;
> - for (i = 0; i < table_cnt; i++) {
> - if ((i & 0xFF) == 0)
> - markers[i >> 8] = off;
> + for (i = 0, off = 0, ends = 0; i < table_cnt; i++) {
> + if (((i + ends) & 0xFF) == 0)
> + markers[(i + ends) >> 8] = off;
>
> printf("\t.byte 0x%02x", table[i].len);
> for (k = 0; k < table[i].len; k++)
* Re: [PATCH v2 4/5] symbols: centralize and re-arrange $(all_symbols) calculation
2025-04-02 13:59 ` [PATCH v2 4/5] symbols: centralize and re-arrange $(all_symbols) calculation Jan Beulich
@ 2025-08-28 1:05 ` Jason Andryuk
2025-08-29 15:02 ` Roger Pau Monné
0 siblings, 1 reply; 21+ messages in thread
From: Jason Andryuk @ 2025-08-28 1:05 UTC (permalink / raw)
To: Jan Beulich, xen-devel@lists.xenproject.org
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné, Bertrand Marquis,
Volodymyr Babchuk
On 2025-04-02 09:59, Jan Beulich wrote:
> For one there's no need for each architecture to have the same logic.
> Move to the root Makefile, also to calculate just once.
>
> And then re-arrange to permit FAST_SYMBOL_LOOKUP to be independent of
> LIVEPATCH, which may be useful in (at least) debugging.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
* Re: [PATCH v2 5/5] symbols: prefer symbols which have a type
2025-04-02 14:00 ` [PATCH v2 5/5] symbols: prefer symbols which have a type Jan Beulich
@ 2025-08-28 1:07 ` Jason Andryuk
2025-08-29 15:13 ` Roger Pau Monné
0 siblings, 1 reply; 21+ messages in thread
From: Jason Andryuk @ 2025-08-28 1:07 UTC (permalink / raw)
To: Jan Beulich, xen-devel@lists.xenproject.org
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné
On 2025-04-02 10:00, Jan Beulich wrote:
> ... and thus typically also a size. Using global vs local is undesirable
> in certain situations, e.g. when a "real" symbol is local and at the
> same address as a section start/end one (which are all global).
>
> Note that for xen.efi the checking for "Function" is only forward-
> looking at this point: The function-ness of symbols (much like their
> size) is lost when linking PE/COFF binaries from ELF objects with GNU ld
> up to at least 2.44.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/5] symbols: arrange to know where functions end
2025-08-28 1:03 ` Jason Andryuk
@ 2025-08-28 7:28 ` Jan Beulich
2025-08-28 16:11 ` Jan Beulich
0 siblings, 1 reply; 21+ messages in thread
From: Jan Beulich @ 2025-08-28 7:28 UTC (permalink / raw)
To: Jason Andryuk
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné,
xen-devel@lists.xenproject.org
On 28.08.2025 03:03, Jason Andryuk wrote:
> On 2025-04-02 09:58, Jan Beulich wrote:
>> When determining the symbol for a given address (e.g. for the %pS
>> logging format specifier), so far the size of a symbol (function) was
>> assumed to be everything until the next symbol. There may be gaps
>> though, which would better be recognizable in output (often suggesting
>> something odd is going on).
>>
>> Insert "fake" end symbols in the address table, accompanied by zero-
>> length type/name entries (to keep lookup reasonably close to how it
>> was).
>>
>> Note however that this, with present GNU binutils, won't work for
Btw, I've updated this to say "with GNU binutils prior to 2.45".
>> --- a/xen/tools/symbols.c
>> +++ b/xen/tools/symbols.c
>
>> @@ -318,24 +334,42 @@ static void write_src(void)
>> printf("#else\n");
>> output_label("symbols_offsets");
>> printf("#endif\n");
>> - for (i = 0; i < table_cnt; i++) {
>> + for (i = 0, ends = 0; i < table_cnt; i++) {
>> printf("\tPTR\t%#llx - SYMBOLS_ORIGIN\n", table[i].addr);
>> +
>> + table[i].addr_idx = i + ends;
>> +
>> + if (!want_symbol_end(i)) {
>> + /* If there's another symbol at the same address,
>> + * propagate this symbol's size if the next one has
>> + * no size, or if the next one's size is larger. */
>
> Why do we want to shrink the next symbol's size?
First (see related post-commit-message remarks): In principle section symbols
could come with a size, too. That would break everything as long as we don't
strip those.
The main reason though is that imo smallest granularity is what we want here,
together with predictability. One symbol with a huge size could cover
multiple other symbols with smaller sizes. We could omit that part of the
change here, but then the processing in the hypervisor would need to change,
to fish out the "best suitable" symbol when dealing with multiple ones at the
same address. Other changes may then also be needed to the tool, to have such
symbols come in a well-defined order (to keep the then-new code in the
hypervisor as simple as possible). Look for "aliased symbol" in
common/symbols.c to see how simplistic respective code is right now.
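To make the "smallest granularity" rule concrete, here is a rough sketch of the size propagation described in the quoted comment. It is not the code from xen/tools/symbols.c: the struct, the field names and the function name propagate_sizes() are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified table entry; not the struct used by the tool. */
struct sym {
    unsigned long long addr;
    unsigned long long size;   /* 0 means "no size known" */
    const char *name;
};

/*
 * Walk a table sorted by address.  Within a run of symbols sharing an
 * address, carry a non-zero size forward whenever the next symbol has no
 * size or a larger one.  The last entry of each run thus ends up with the
 * smallest non-zero size seen in that run.
 */
static void propagate_sizes(struct sym *tab, size_t cnt)
{
    assert(tab || !cnt);

    for (size_t i = 0; i + 1 < cnt; i++) {
        /* Only act on same-address neighbours, and only with a known size. */
        if (tab[i].addr != tab[i + 1].addr || !tab[i].size)
            continue;

        if (!tab[i + 1].size || tab[i + 1].size > tab[i].size)
            tab[i + 1].size = tab[i].size;
    }
}
```

With entries a (0x100, size 0x10) and b (0x100, size 0x100) in that order, b's size becomes 0x10 after the pass, so a single end marker at 0x110 would serve both.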
Jan
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/5] symbols: arrange to know where functions end
2025-08-28 7:28 ` Jan Beulich
@ 2025-08-28 16:11 ` Jan Beulich
2025-08-28 17:16 ` Jason Andryuk
0 siblings, 1 reply; 21+ messages in thread
From: Jan Beulich @ 2025-08-28 16:11 UTC (permalink / raw)
To: Jason Andryuk
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné,
xen-devel@lists.xenproject.org
On 28.08.2025 09:28, Jan Beulich wrote:
> On 28.08.2025 03:03, Jason Andryuk wrote:
>> On 2025-04-02 09:58, Jan Beulich wrote:
>>> --- a/xen/tools/symbols.c
>>> +++ b/xen/tools/symbols.c
>>
>>> @@ -318,24 +334,42 @@ static void write_src(void)
>>> printf("#else\n");
>>> output_label("symbols_offsets");
>>> printf("#endif\n");
>>> - for (i = 0; i < table_cnt; i++) {
>>> + for (i = 0, ends = 0; i < table_cnt; i++) {
>>> printf("\tPTR\t%#llx - SYMBOLS_ORIGIN\n", table[i].addr);
>>> +
>>> + table[i].addr_idx = i + ends;
>>> +
>>> + if (!want_symbol_end(i)) {
>>> + /* If there's another symbol at the same address,
>>> + * propagate this symbol's size if the next one has
>>> + * no size, or if the next one's size is larger. */
>>
>> Why do we want to shrink the next symbol's size?
>
> First (see related post-commit-message remarks): In principle section symbols
> could come with a size, too. That would break everything as long as we don't
> strip those.
>
> The main reason though is that imo smallest granularity is what we want here,
> together with predictability. One symbol with a huge size could cover
> multiple other symbols with smaller sizes. We could omit that part of the
> change here, but then the processing in the hypervisor would need to change,
> to fish out the "best suitable" symbol when dealing with multiple ones at the
> same address. Other changes may then also be needed to the tool, to have such
> symbols come in a well-defined order (to keep the then-new code in the
> hypervisor as simple as possible). Look for "aliased symbol" in
> common/symbols.c to see how simplistic respective code is right now.
Furthermore remember that we can't record sizes, but instead we insert fake
symbols. Obviously there can be only one (at least in the present scheme).
If we used too large a size, chances would increase that the end symbol (in
the sorted table) would have to live past some other symbol, thus becoming
that one's "end".
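As a rough illustration of how fake end entries let a lookup recognize gaps, consider the sketch below. It is not the code in common/symbols.c: struct entry, the convention of an empty name marking an end, and gap_aware_lookup() are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified table entry; an empty name marks a fake "end". */
struct entry {
    unsigned long addr;
    const char *name;
};

/*
 * Look up addr in a table sorted by address.  Returns the symbol name and
 * stores the offset into it, or returns NULL if addr precedes the table or
 * falls into a gap (i.e. the closest preceding entry is an end marker).
 */
static const char *gap_aware_lookup(const struct entry *tab, size_t cnt,
                                    unsigned long addr, unsigned long *offs)
{
    size_t lo = 0, hi = cnt;

    assert(offs);

    /* Binary search for the first entry with an address above addr. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tab[mid].addr <= addr)
            lo = mid + 1;
        else
            hi = mid;
    }

    /* Nothing at or below addr, or the closest entry is an end marker. */
    if (!lo || !tab[lo - 1].name[0])
        return NULL;

    *offs = addr - tab[lo - 1].addr;
    return tab[lo - 1].name;
}
```

Because each position in the sorted table holds exactly one entry, an end marker placed too far out would sit behind some later symbol and be taken for that symbol's end.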
Jan
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/5] symbols: arrange to know where functions end
2025-08-28 16:11 ` Jan Beulich
@ 2025-08-28 17:16 ` Jason Andryuk
2025-08-29 6:59 ` Jan Beulich
0 siblings, 1 reply; 21+ messages in thread
From: Jason Andryuk @ 2025-08-28 17:16 UTC (permalink / raw)
To: Jan Beulich
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné,
xen-devel@lists.xenproject.org
On 2025-08-28 12:11, Jan Beulich wrote:
> On 28.08.2025 09:28, Jan Beulich wrote:
>> On 28.08.2025 03:03, Jason Andryuk wrote:
>>> On 2025-04-02 09:58, Jan Beulich wrote:
>>>> --- a/xen/tools/symbols.c
>>>> +++ b/xen/tools/symbols.c
>>>
>>>> @@ -318,24 +334,42 @@ static void write_src(void)
>>>> printf("#else\n");
>>>> output_label("symbols_offsets");
>>>> printf("#endif\n");
>>>> - for (i = 0; i < table_cnt; i++) {
>>>> + for (i = 0, ends = 0; i < table_cnt; i++) {
>>>> printf("\tPTR\t%#llx - SYMBOLS_ORIGIN\n", table[i].addr);
>>>> +
>>>> + table[i].addr_idx = i + ends;
>>>> +
>>>> + if (!want_symbol_end(i)) {
>>>> + /* If there's another symbol at the same address,
>>>> + * propagate this symbol's size if the next one has
>>>> + * no size, or if the next one's size is larger. */
>>>
>>> Why do we want to shrink the next symbol's size?
>>
>> First (see related post-commit-message remarks): In principle section symbols
>> could come with a size, too. That would break everything as long as we don't
>> strip those.
>>
>> The main reason though is that imo smallest granularity is what we want here,
>> together with predictability. One symbol with a huge size could cover
>> multiple other symbols with smaller sizes. We could omit that part of the
>> change here, but then the processing in the hypervisor would need to change,
>> to fish out the "best suitable" symbol when dealing with multiple ones at the
>> same address. Other changes may then also be needed to the tool, to have such
>> symbols come in a well-defined order (to keep the then-new code in the
>> hypervisor as simple as possible). Look for "aliased symbol" in
>> common/symbols.c to see how simplistic respective code is right now.
>
> Furthermore remember that we can't record sizes, but instead we insert fake
> symbols. Obviously there can be only one (at least in the present scheme).
> If we used too large a size, chances would increase that the end symbol (in
> the sorted table) would have to live past some other symbol, thus becoming
> that one's "end".
The scenario I thought about is something like:
a 0x100-0x10f
b 0x100-0x1ff
c 0x200-0x2ff
If you shrink b, you are creating a hole that would otherwise be
assigned to b.
But I agree that huge sizes covering multiple small variables would
better be avoided.
Do you have concrete examples to help illustrate the problem?
Regards,
Jason
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/5] symbols: arrange to know where functions end
2025-08-28 17:16 ` Jason Andryuk
@ 2025-08-29 6:59 ` Jan Beulich
2025-08-31 14:50 ` Jason Andryuk
0 siblings, 1 reply; 21+ messages in thread
From: Jan Beulich @ 2025-08-29 6:59 UTC (permalink / raw)
To: Jason Andryuk
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné,
xen-devel@lists.xenproject.org
On 28.08.2025 19:16, Jason Andryuk wrote:
> On 2025-08-28 12:11, Jan Beulich wrote:
>> On 28.08.2025 09:28, Jan Beulich wrote:
>>> On 28.08.2025 03:03, Jason Andryuk wrote:
>>>> On 2025-04-02 09:58, Jan Beulich wrote:
>>>>> --- a/xen/tools/symbols.c
>>>>> +++ b/xen/tools/symbols.c
>>>>
>>>>> @@ -318,24 +334,42 @@ static void write_src(void)
>>>>> printf("#else\n");
>>>>> output_label("symbols_offsets");
>>>>> printf("#endif\n");
>>>>> - for (i = 0; i < table_cnt; i++) {
>>>>> + for (i = 0, ends = 0; i < table_cnt; i++) {
>>>>> printf("\tPTR\t%#llx - SYMBOLS_ORIGIN\n", table[i].addr);
>>>>> +
>>>>> + table[i].addr_idx = i + ends;
>>>>> +
>>>>> + if (!want_symbol_end(i)) {
>>>>> + /* If there's another symbol at the same address,
>>>>> + * propagate this symbol's size if the next one has
>>>>> + * no size, or if the next one's size is larger. */
>>>>
>>>> Why do we want to shrink the next symbol's size?
>>>
>>> First (see related post-commit-message remarks): In principle section symbols
>>> could come with a size, too. That would break everything as long as we don't
>>> strip those.
>>>
>>> The main reason though is that imo smallest granularity is what we want here,
>>> together with predictability. One symbol with a huge size could cover
>>> multiple other symbols with smaller sizes. We could omit that part of the
>>> change here, but then the processing in the hypervisor would need to change,
>>> to fish out the "best suitable" symbol when dealing with multiple ones at the
>>> same address. Other changes may then also be needed to the tool, to have such
>>> symbols come in a well-defined order (to keep the then-new code in the
>>> hypervisor as simple as possible). Look for "aliased symbol" in
>>> common/symbols.c to see how simplistic respective code is right now.
>>
>> Furthermore remember that we can't record sizes, but instead we insert fake
>> symbols. Obviously there can be only one (at least in the present scheme).
>> If we used too large a size, chances would increase that the end symbol (in
>> the sorted table) would have to live past some other symbol, thus becoming
>> that one's "end".
>
> The scenario I thought about is something like:
>
> a 0x100-0x10f
> b 0x100-0x1ff
> c 0x200-0x2ff
>
> If you shrink b, you are creating a hole that would otherwise be
> assigned to b.
>
> But I agree avoiding huge sizes covering multiple small variables would
> better be avoided.
>
> Do you have concrete examples to help illustrate the problem?
a 0x100-0x1ff
b 0x100-0x10f
c 0x110-0x11f
If we inserted an "end" label based on a's size, that would effectively be
c's 2nd end symbol (and there may not be two "end" symbols in a row, unless
we want to further complicate the symbol lookup logic).
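Spelled out as data, the example looks like the sketch below. The table layout (a NULL name standing for an end marker), the two hand-built tables and has_adjacent_ends() are illustrative assumptions, not anything the tool emits in this form. Ends are exclusive here, so a range of 0x100-0x1ff ends at 0x200; b's end at 0x110 coincides with c's start and hence needs no marker of its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sorted table entry; a NULL name stands for an end marker. */
struct mark {
    unsigned long addr;
    const char *name;
};

/* End marker derived from a's own (large) size: lands after c's end. */
static const struct mark with_large_size[] = {
    { 0x100, "a" }, { 0x100, "b" }, { 0x110, "c" },
    { 0x120, NULL },   /* c's end */
    { 0x200, NULL },   /* a's end, now directly following c's end */
};

/* a's size shrunk to b's: a and b share the boundary at c's start. */
static const struct mark with_shrunk_size[] = {
    { 0x100, "a" }, { 0x100, "b" }, { 0x110, "c" },
    { 0x120, NULL },   /* c's end */
};

/* Returns 1 if two end markers follow one another directly, else 0. */
static int has_adjacent_ends(const struct mark *m, size_t cnt)
{
    assert(m || !cnt);

    for (size_t i = 1; i < cnt; i++)
        if (!m[i - 1].name && !m[i].name)
            return 1;

    return 0;
}
```

Only the first table contains two end markers in a row, which is the situation the size shrinking is meant to rule out.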
Jan
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/5] symbols: add minimal self-test
2025-04-02 13:57 ` [PATCH v2 1/5] symbols: add minimal self-test Jan Beulich
2025-08-27 22:20 ` Jason Andryuk
@ 2025-08-29 14:24 ` Roger Pau Monné
2025-09-01 6:38 ` Jan Beulich
1 sibling, 1 reply; 21+ messages in thread
From: Roger Pau Monné @ 2025-08-29 14:24 UTC (permalink / raw)
To: Jan Beulich
Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Julien Grall,
Stefano Stabellini, Anthony PERARD, Michal Orzel
On Wed, Apr 02, 2025 at 03:57:57PM +0200, Jan Beulich wrote:
> ... before making changes to the involved logic.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> While Andrew validly suggests cf_check isn't a requirement for selecting
> which function(s) to use (with the non-upstream gcc patch that we're
> using in CI), that's only because of how the non-upstream patch works.
> Going function-pointer -> unsigned long -> function-pointer without it
> being diagnosed that the cf_check is missing is a shortcoming there, and
> might conceivably be fixed at some point. (Imo any address-taking on a
> function should require it to be cf_check.) Hence I'd like to stick to
> using cf_check functions only for passing to test_lookup().
>
> With this FAST_SYMBOL_LOOKUP may make sense to permit enabling even
> when LIVEPATCH=n. Thoughts? (In this case "symbols: centralize and re-
> arrange $(all_symbols) calculation" would want pulling ahead.)
>
> --- a/xen/common/symbols.c
> +++ b/xen/common/symbols.c
> @@ -260,6 +260,41 @@ unsigned long symbols_lookup_by_name(con
> return 0;
> }
>
> +#ifdef CONFIG_SELF_TESTS
> +
> +static void __init test_lookup(unsigned long addr, const char *expected)
> +{
> + char buf[KSYM_NAME_LEN + 1];
> + const char *name, *symname;
> + unsigned long size, offs;
> +
> + name = symbols_lookup(addr, &size, &offs, buf);
> + if ( !name )
> + panic("%s: address not found\n", expected);
> + if ( offs )
> + panic("%s: non-zero offset (%#lx) unexpected\n", expected, offs);
If there's a non-zero offset returned, could you also print the
returned name (so use %s+%#lx)? There's a chance the returned name
doesn't match what we expect if there's a non-zero offset.
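For illustration only, the suggested format could look like the sketch below. snprintf() stands in for Xen's panic() so that the snippet builds outside the hypervisor; the helper name format_offset_msg() and the exact message wording are assumptions, not code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>   /* for strcmp() in callers checking the result */

/*
 * Format the diagnostic for a lookup which returned a non-zero offset,
 * including the name that was actually found.  Returns the number of
 * characters snprintf() would have written.
 */
static int format_offset_msg(char *buf, size_t len, const char *expected,
                             const char *name, unsigned long offs)
{
    assert(buf && len);

    return snprintf(buf, len,
                    "%s: non-zero offset unexpected (got %s+%#lx)\n",
                    expected, name, offs);
}
```

Printing the returned name next to the offset shows right away whether the lookup landed inside the expected symbol or inside a different one.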
The rest LGTM:
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Thanks, Roger.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 4/5] symbols: centralize and re-arrange $(all_symbols) calculation
2025-08-28 1:05 ` Jason Andryuk
@ 2025-08-29 15:02 ` Roger Pau Monné
0 siblings, 0 replies; 21+ messages in thread
From: Roger Pau Monné @ 2025-08-29 15:02 UTC (permalink / raw)
To: Jason Andryuk
Cc: Jan Beulich, xen-devel@lists.xenproject.org, Andrew Cooper,
Julien Grall, Stefano Stabellini, Anthony PERARD, Michal Orzel,
Bertrand Marquis, Volodymyr Babchuk
On Wed, Aug 27, 2025 at 09:05:59PM -0400, Jason Andryuk wrote:
> On 2025-04-02 09:59, Jan Beulich wrote:
> > For one there's no need for each architecture to have the same logic.
> > Move to the root Makefile, also to calculate just once.
> >
> > And then re-arrange to permit FAST_SYMBOL_LOOKUP to be independent of
> > LIVEPATCH, which may be useful in (at least) debugging.
> >
> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Thanks.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 5/5] symbols: prefer symbols which have a type
2025-08-28 1:07 ` Jason Andryuk
@ 2025-08-29 15:13 ` Roger Pau Monné
0 siblings, 0 replies; 21+ messages in thread
From: Roger Pau Monné @ 2025-08-29 15:13 UTC (permalink / raw)
To: Jason Andryuk
Cc: Jan Beulich, xen-devel@lists.xenproject.org, Andrew Cooper,
Julien Grall, Stefano Stabellini, Anthony PERARD, Michal Orzel
On Wed, Aug 27, 2025 at 09:07:50PM -0400, Jason Andryuk wrote:
> On 2025-04-02 10:00, Jan Beulich wrote:
> > ... and thus typically also a size. Using global vs local is undesirable
> > in certain situations, e.g. when a "real" symbol is local and at the
> > same address as a section start/end one (which are all global).
> >
> > Note that for xen.efi the checking for "Function" is only forward-
> > looking at this point: The function-ness of symbols (much like their
> > size) is lost when linking PE/COFF binaries from ELF objects with GNU ld
> > up to at least 2.44.
> >
> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Thanks, Roger.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/5] symbols: arrange to know where functions end
2025-08-29 6:59 ` Jan Beulich
@ 2025-08-31 14:50 ` Jason Andryuk
0 siblings, 0 replies; 21+ messages in thread
From: Jason Andryuk @ 2025-08-31 14:50 UTC (permalink / raw)
To: Jan Beulich
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné,
xen-devel@lists.xenproject.org
On 2025-08-29 02:59, Jan Beulich wrote:
> On 28.08.2025 19:16, Jason Andryuk wrote:
>> On 2025-08-28 12:11, Jan Beulich wrote:
>>> On 28.08.2025 09:28, Jan Beulich wrote:
>>>> On 28.08.2025 03:03, Jason Andryuk wrote:
>>>>> On 2025-04-02 09:58, Jan Beulich wrote:
>>>>>> --- a/xen/tools/symbols.c
>>>>>> +++ b/xen/tools/symbols.c
>>>>>
>>>>>> @@ -318,24 +334,42 @@ static void write_src(void)
>>>>>> printf("#else\n");
>>>>>> output_label("symbols_offsets");
>>>>>> printf("#endif\n");
>>>>>> - for (i = 0; i < table_cnt; i++) {
>>>>>> + for (i = 0, ends = 0; i < table_cnt; i++) {
>>>>>> printf("\tPTR\t%#llx - SYMBOLS_ORIGIN\n", table[i].addr);
>>>>>> +
>>>>>> + table[i].addr_idx = i + ends;
>>>>>> +
>>>>>> + if (!want_symbol_end(i)) {
>>>>>> + /* If there's another symbol at the same address,
>>>>>> + * propagate this symbol's size if the next one has
>>>>>> + * no size, or if the next one's size is larger. */
>>>>>
>>>>> Why do we want to shrink the next symbol's size?
>>>>
>>>> First (see related post-commit-message remarks): In principle section symbols
>>>> could come with a size, too. That would break everything as long as we don't
>>>> strip those.
>>>>
>>>> The main reason though is that imo smallest granularity is what we want here,
>>>> together with predictability. One symbol with a huge size could cover
>>>> multiple other symbols with smaller sizes. We could omit that part of the
>>>> change here, but then the processing in the hypervisor would need to change,
>>>> to fish out the "best suitable" symbol when dealing with multiple ones at the
>>>> same address. Other changes may then also be needed to the tool, to have such
>>>> symbols come in a well-defined order (to keep the then-new code in the
>>>> hypervisor as simple as possible). Look for "aliased symbol" in
>>>> common/symbols.c to see how simplistic respective code is right now.
>>>
>>> Furthermore remember that we can't record sizes, but instead we insert fake
>>> symbols. Obviously there can be only one (at least in the present scheme).
>>> If we used too large a size, chances would increase that the end symbol (in
>>> the sorted table) would have to live past some other symbol, thus becoming
>>> that one's "end".
>>
>> The scenario I thought about is something like:
>>
>> a 0x100-0x10f
>> b 0x100-0x1ff
>> c 0x200-0x2ff
>>
>> If you shrink b, you are creating a hole that would otherwise be
>> assigned to b.
>>
>> But I agree avoiding huge sizes covering multiple small variables would
>> better be avoided.
>>
>> Do you have concrete examples to help illustrate the problem?
>
> a 0x100-0x1ff
> b 0x100-0x10f
> c 0x110-0x11f
>
> If we inserted an "end" label based on a's size, that would effectively be
> c's 2nd end symbol (and there may not be two "end" symbols in a row, unless
> we want to further complicate the symbol lookup logic).
Ok.
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Thanks,
Jason
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/5] symbols: add minimal self-test
2025-08-29 14:24 ` Roger Pau Monné
@ 2025-09-01 6:38 ` Jan Beulich
0 siblings, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2025-09-01 6:38 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Julien Grall,
Stefano Stabellini, Anthony PERARD, Michal Orzel
On 29.08.2025 16:24, Roger Pau Monné wrote:
> On Wed, Apr 02, 2025 at 03:57:57PM +0200, Jan Beulich wrote:
>> ... before making changes to the involved logic.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> While Andrew validly suggests cf_check isn't a requirement for selecting
>> which function(s) to use (with the non-upstream gcc patch that we're
>> using in CI), that's only because of how the non-upstream patch works.
>> Going function-pointer -> unsigned long -> function-pointer without it
>> being diagnosed that the cf_check is missing is a shortcoming there, and
>> might conceivably be fixed at some point. (Imo any address-taking on a
>> function should require it to be cf_check.) Hence I'd like to stick to
>> using cf_check functions only for passing to test_lookup().
>>
>> With this FAST_SYMBOL_LOOKUP may make sense to permit enabling even
>> when LIVEPATCH=n. Thoughts? (In this case "symbols: centralize and re-
>> arrange $(all_symbols) calculation" would want pulling ahead.)
>>
>> --- a/xen/common/symbols.c
>> +++ b/xen/common/symbols.c
>> @@ -260,6 +260,41 @@ unsigned long symbols_lookup_by_name(con
>> return 0;
>> }
>>
>> +#ifdef CONFIG_SELF_TESTS
>> +
>> +static void __init test_lookup(unsigned long addr, const char *expected)
>> +{
>> + char buf[KSYM_NAME_LEN + 1];
>> + const char *name, *symname;
>> + unsigned long size, offs;
>> +
>> + name = symbols_lookup(addr, &size, &offs, buf);
>> + if ( !name )
>> + panic("%s: address not found\n", expected);
>> + if ( offs )
>> + panic("%s: non-zero offset (%#lx) unexpected\n", expected, offs);
>
> If there's a non-zero offset returned, could you also print the
> returned name? (so use %s+%#lx) there's a change the returned name
> doesn't match what we expect if there's a non-zero offset.
Hmm, perhaps we could, even if that's not the main goal of the test. Note
though that the patch has gone in already, with Jason's R-b.
> The rest LGTM:
>
> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Nevertheless, thanks.
Jan
^ permalink raw reply [flat|nested] 21+ messages in thread
Thread overview: 21+ messages
2025-04-02 13:56 [PATCH v2 0/5] symbols: assorted adjustments Jan Beulich
2025-04-02 13:57 ` [PATCH v2 1/5] symbols: add minimal self-test Jan Beulich
2025-08-27 22:20 ` Jason Andryuk
2025-08-29 14:24 ` Roger Pau Monné
2025-09-01 6:38 ` Jan Beulich
2025-04-02 13:58 ` [PATCH v2 2/5] symbols: split symbols_num_syms Jan Beulich
2025-08-27 22:20 ` Jason Andryuk
2025-04-02 13:58 ` [PATCH v2 3/5] symbols: arrange to know where functions end Jan Beulich
2025-04-02 14:08 ` Jan Beulich
2025-08-28 1:03 ` Jason Andryuk
2025-08-28 7:28 ` Jan Beulich
2025-08-28 16:11 ` Jan Beulich
2025-08-28 17:16 ` Jason Andryuk
2025-08-29 6:59 ` Jan Beulich
2025-08-31 14:50 ` Jason Andryuk
2025-04-02 13:59 ` [PATCH v2 4/5] symbols: centralize and re-arrange $(all_symbols) calculation Jan Beulich
2025-08-28 1:05 ` Jason Andryuk
2025-08-29 15:02 ` Roger Pau Monné
2025-04-02 14:00 ` [PATCH v2 5/5] symbols: prefer symbols which have a type Jan Beulich
2025-08-28 1:07 ` Jason Andryuk
2025-08-29 15:13 ` Roger Pau Monné