* [PATCH] symbols: discard stray file symbols
@ 2025-04-16 9:00 Jan Beulich
2025-09-04 21:53 ` Jason Andryuk
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Jan Beulich @ 2025-04-16 9:00 UTC (permalink / raw)
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné
By observation GNU ld 2.25 may emit file symbols for .data.read_mostly
when linking xen.efi. Due to the nature of file symbols in COFF symbol
tables (see the code comment) the symbols_offsets[] entries for such
symbols would cause assembler warnings regarding value truncation. Of
course the resulting entries would also be both meaningless and useless.
Add a heuristic to get rid of them, really taking effect only when
--all-symbols is specified (otherwise these symbols are discarded
anyway).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Factor 2 may in principle still be too small: We zap what looks like
real file symbols already in read_symbol(), so table_cnt doesn't really
reflect the number of symbol table entries encountered. It has proven to
work for me in practice though, with still some leeway left.
--- a/xen/tools/symbols.c
+++ b/xen/tools/symbols.c
@@ -213,6 +213,16 @@ static int symbol_valid(struct sym_entry
 	if (strstr((char *)s->sym + offset, "_compiled."))
 		return 0;
 
+	/* At least GNU ld 2.25 may emit bogus file symbols referencing a
+	 * section name while linking xen.efi. In COFF symbol tables the
+	 * "value" of file symbols is a link (symbol table index) to the next
+	 * file symbol. Since file (and other) symbols (can) come with one
+	 * (or in principle more) auxiliary symbol table entries, the value in
+	 * this heuristic is bounded to twice the number of symbols we have
+	 * found. See also read_symbol() as to the '?' checked for here. */
+	if (s->sym[0] == '?' && s->sym[1] == '.' && s->addr < table_cnt * 2)
+		return 0;
+
 	return 1;
 }
* Re: [PATCH] symbols: discard stray file symbols
2025-04-16 9:00 [PATCH] symbols: discard stray file symbols Jan Beulich
@ 2025-09-04 21:53 ` Jason Andryuk
2025-09-05 6:23 ` Jan Beulich
2025-09-25 7:36 ` Ping: " Jan Beulich
2025-10-21 9:56 ` Roger Pau Monné
2 siblings, 1 reply; 7+ messages in thread
From: Jason Andryuk @ 2025-09-04 21:53 UTC (permalink / raw)
To: Jan Beulich, xen-devel@lists.xenproject.org
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné
On 2025-04-16 05:00, Jan Beulich wrote:
> By observation GNU ld 2.25 may emit file symbols for .data.read_mostly
> when linking xen.efi. Due to the nature of file symbols in COFF symbol
> tables (see the code comment) the symbols_offsets[] entries for such
> symbols would cause assembler warnings regarding value truncation. Of
> course the resulting entries would also be both meaningless and useless.
> Add a heuristic to get rid of them, really taking effect only when
> --all-symbols is specified (otherwise these symbols are discarded
> anyway).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> Factor 2 may in principle still be too small: We zap what looks like
> real file symbols already in read_symbol(), so table_cnt doesn't really
> reflect the number of symbol table entries encountered. It has proven to
> work for me in practice though, with still some leeway left.
>
> --- a/xen/tools/symbols.c
> +++ b/xen/tools/symbols.c
> @@ -213,6 +213,16 @@ static int symbol_valid(struct sym_entry
>  	if (strstr((char *)s->sym + offset, "_compiled."))
>  		return 0;
>
> +	/* At least GNU ld 2.25 may emit bogus file symbols referencing a
> +	 * section name while linking xen.efi. In COFF symbol tables the
> +	 * "value" of file symbols is a link (symbol table index) to the next
> +	 * file symbol. Since file (and other) symbols (can) come with one
> +	 * (or in principle more) auxiliary symbol table entries, the value in
> +	 * this heuristic is bounded to twice the number of symbols we have
> +	 * found. See also read_symbol() as to the '?' checked for here. */
> +	if (s->sym[0] == '?' && s->sym[1] == '.' && s->addr < table_cnt * 2)
> +		return 0;
> +
>  	return 1;
>  }
I looked at this. It'll drop symbols, but I don't know enough to give
an R-b. I can't give an actionable A-b either. Maybe someone else can
chime in.
Maybe this is just showing my lack of knowledge, but could any symbol
starting "?." be considered invalid? I don't think I've ever seen any
like that.
Regards,
Jason
* Re: [PATCH] symbols: discard stray file symbols
2025-09-04 21:53 ` Jason Andryuk
@ 2025-09-05 6:23 ` Jan Beulich
0 siblings, 0 replies; 7+ messages in thread
From: Jan Beulich @ 2025-09-05 6:23 UTC (permalink / raw)
To: Jason Andryuk
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné,
xen-devel@lists.xenproject.org
On 04.09.2025 23:53, Jason Andryuk wrote:
> On 2025-04-16 05:00, Jan Beulich wrote:
>> By observation GNU ld 2.25 may emit file symbols for .data.read_mostly
>> when linking xen.efi. Due to the nature of file symbols in COFF symbol
>> tables (see the code comment) the symbols_offsets[] entries for such
>> symbols would cause assembler warnings regarding value truncation. Of
>> course the resulting entries would also be both meaningless and useless.
>> Add a heuristic to get rid of them, really taking effect only when
>> --all-symbols is specified (otherwise these symbols are discarded
>> anyway).
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> Factor 2 may in principle still be too small: We zap what looks like
>> real file symbols already in read_symbol(), so table_cnt doesn't really
>> reflect the number of symbol table entries encountered. It has proven to
>> work for me in practice though, with still some leeway left.
>>
>> --- a/xen/tools/symbols.c
>> +++ b/xen/tools/symbols.c
>> @@ -213,6 +213,16 @@ static int symbol_valid(struct sym_entry
>>  	if (strstr((char *)s->sym + offset, "_compiled."))
>>  		return 0;
>>
>> +	/* At least GNU ld 2.25 may emit bogus file symbols referencing a
>> +	 * section name while linking xen.efi. In COFF symbol tables the
>> +	 * "value" of file symbols is a link (symbol table index) to the next
>> +	 * file symbol. Since file (and other) symbols (can) come with one
>> +	 * (or in principle more) auxiliary symbol table entries, the value in
>> +	 * this heuristic is bounded to twice the number of symbols we have
>> +	 * found. See also read_symbol() as to the '?' checked for here. */
>> +	if (s->sym[0] == '?' && s->sym[1] == '.' && s->addr < table_cnt * 2)
>> +		return 0;
>> +
>>  	return 1;
>>  }
>
> I looked at this. It'll drop symbols, but I don't know enough to give
> an R-b. I can't give an actionable A-b either. Maybe someone else can
> chime in.
>
> Maybe this is just showing my lack of knowledge, but could any symbol
> starting "?." be considered invalid? I don't think I've ever seen any
> like that.
With quotation, almost any symbol name can appear in principle. I wouldn't
want to judge symbol validity by its name. What's more important here,
though, is that sym[0] isn't part of the name; it's the symbol's type as
taken from nm's output. We're therefore heuristically looking at symbols
of unknown type with a dot as the first character (as section names would
conventionally have it).
Jan
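To illustrate the point about sym[0]: the following is only a minimal sketch, not the code in xen/tools/symbols.c. It assumes nm lines of the form "value type name"; struct sym_sketch, parse_nm_line() and looks_like_stray_file_symbol() are hypothetical names made up for this example. The type character lands in sym[0] and the name starts at sym[1], which is why the check for "?." tests a type plus the first character of the name rather than a name prefix.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for the tool's struct sym_entry. */
struct sym_sketch {
    unsigned long long addr; /* value column of the nm line */
    char sym[128];           /* sym[0] = nm type character, sym + 1 = name */
};

/* Parse one "value type name" line as printed by nm; returns 0 on success. */
static int parse_nm_line(const char *line, struct sym_sketch *s)
{
    char type;

    assert(line != NULL && s != NULL);
    /* %126s leaves room for the type byte in front and the trailing NUL. */
    if (sscanf(line, "%llx %c %126s", &s->addr, &type, s->sym + 1) != 3)
        return -1;
    s->sym[0] = type;
    return 0;
}

/* The heuristic from the patch: unknown type ('?'), a name starting with a
 * dot, and a value small enough to be a symbol table index rather than an
 * address. table_cnt is the number of symbols collected so far. */
static int looks_like_stray_file_symbol(const struct sym_sketch *s,
                                        unsigned long long table_cnt)
{
    return s->sym[0] == '?' && s->sym[1] == '.' &&
           s->addr < table_cnt * 2;
}
```

With table_cnt at 1000, a line such as "0000000000000123 ? .data.read_mostly" would be flagged, while the same name with a value in the hypervisor's address range, or with a known type such as 'T', would be kept.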
* Ping: [PATCH] symbols: discard stray file symbols
2025-04-16 9:00 [PATCH] symbols: discard stray file symbols Jan Beulich
2025-09-04 21:53 ` Jason Andryuk
@ 2025-09-25 7:36 ` Jan Beulich
2025-09-25 20:39 ` Oleksii Kurochko
2025-10-21 9:56 ` Roger Pau Monné
2 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2025-09-25 7:36 UTC (permalink / raw)
To: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD,
Michal Orzel, Roger Pau Monné
Cc: xen-devel@lists.xenproject.org, Oleksii Kurochko
On 16.04.2025 11:00, Jan Beulich wrote:
> By observation GNU ld 2.25 may emit file symbols for .data.read_mostly
> when linking xen.efi. Due to the nature of file symbols in COFF symbol
> tables (see the code comment) the symbols_offsets[] entries for such
> symbols would cause assembler warnings regarding value truncation. Of
> course the resulting entries would also be both meaningless and useless.
> Add a heuristic to get rid of them, really taking effect only when
> --all-symbols is specified (otherwise these symbols are discarded
> anyway).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
May I please ask for feedback here, so that hopefully we can have this
sorted in 4.21?
Jan
> ---
> Factor 2 may in principle still be too small: We zap what looks like
> real file symbols already in read_symbol(), so table_cnt doesn't really
> reflect the number of symbol table entries encountered. It has proven to
> work for me in practice though, with still some leeway left.
>
> --- a/xen/tools/symbols.c
> +++ b/xen/tools/symbols.c
> @@ -213,6 +213,16 @@ static int symbol_valid(struct sym_entry
>  	if (strstr((char *)s->sym + offset, "_compiled."))
>  		return 0;
>
> +	/* At least GNU ld 2.25 may emit bogus file symbols referencing a
> +	 * section name while linking xen.efi. In COFF symbol tables the
> +	 * "value" of file symbols is a link (symbol table index) to the next
> +	 * file symbol. Since file (and other) symbols (can) come with one
> +	 * (or in principle more) auxiliary symbol table entries, the value in
> +	 * this heuristic is bounded to twice the number of symbols we have
> +	 * found. See also read_symbol() as to the '?' checked for here. */
> +	if (s->sym[0] == '?' && s->sym[1] == '.' && s->addr < table_cnt * 2)
> +		return 0;
> +
>  	return 1;
>  }
>
* Re: Ping: [PATCH] symbols: discard stray file symbols
2025-09-25 7:36 ` Ping: " Jan Beulich
@ 2025-09-25 20:39 ` Oleksii Kurochko
0 siblings, 0 replies; 7+ messages in thread
From: Oleksii Kurochko @ 2025-09-25 20:39 UTC (permalink / raw)
To: Jan Beulich, Andrew Cooper, Julien Grall, Stefano Stabellini,
Anthony PERARD, Michal Orzel, Roger Pau Monné
Cc: xen-devel@lists.xenproject.org
On 9/25/25 9:36 AM, Jan Beulich wrote:
> On 16.04.2025 11:00, Jan Beulich wrote:
>> By observation GNU ld 2.25 may emit file symbols for .data.read_mostly
>> when linking xen.efi. Due to the nature of file symbols in COFF symbol
>> tables (see the code comment) the symbols_offsets[] entries for such
>> symbols would cause assembler warnings regarding value truncation. Of
>> course the resulting entries would also be both meaningless and useless.
>> Add a heuristic to get rid of them, really taking effect only when
>> --all-symbols is specified (otherwise these symbols are discarded
>> anyway).
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> May I please ask for feedback here, so that hopefully we can have this
> sorted in 4.21?
It is okay for me to have this change in 4.21:
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
~ Oleksii
>
> Jan
>
>> ---
>> Factor 2 may in principle still be too small: We zap what looks like
>> real file symbols already in read_symbol(), so table_cnt doesn't really
>> reflect the number of symbol table entries encountered. It has proven to
>> work for me in practice though, with still some leeway left.
>>
>> --- a/xen/tools/symbols.c
>> +++ b/xen/tools/symbols.c
>> @@ -213,6 +213,16 @@ static int symbol_valid(struct sym_entry
>>  	if (strstr((char *)s->sym + offset, "_compiled."))
>>  		return 0;
>>
>> +	/* At least GNU ld 2.25 may emit bogus file symbols referencing a
>> +	 * section name while linking xen.efi. In COFF symbol tables the
>> +	 * "value" of file symbols is a link (symbol table index) to the next
>> +	 * file symbol. Since file (and other) symbols (can) come with one
>> +	 * (or in principle more) auxiliary symbol table entries, the value in
>> +	 * this heuristic is bounded to twice the number of symbols we have
>> +	 * found. See also read_symbol() as to the '?' checked for here. */
>> +	if (s->sym[0] == '?' && s->sym[1] == '.' && s->addr < table_cnt * 2)
>> +		return 0;
>> +
>>  	return 1;
>>  }
>>
* Re: [PATCH] symbols: discard stray file symbols
2025-04-16 9:00 [PATCH] symbols: discard stray file symbols Jan Beulich
2025-09-04 21:53 ` Jason Andryuk
2025-09-25 7:36 ` Ping: " Jan Beulich
@ 2025-10-21 9:56 ` Roger Pau Monné
2025-10-21 10:13 ` Jan Beulich
2 siblings, 1 reply; 7+ messages in thread
From: Roger Pau Monné @ 2025-10-21 9:56 UTC (permalink / raw)
To: Jan Beulich
Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Julien Grall,
Stefano Stabellini, Anthony PERARD, Michal Orzel
On Wed, Apr 16, 2025 at 11:00:57AM +0200, Jan Beulich wrote:
> By observation GNU ld 2.25 may emit file symbols for .data.read_mostly
> when linking xen.efi. Due to the nature of file symbols in COFF symbol
> tables (see the code comment) the symbols_offsets[] entries for such
> symbols would cause assembler warnings regarding value truncation. Of
> course the resulting entries would also be both meaningless and useless.
> Add a heuristic to get rid of them, really taking effect only when
> --all-symbols is specified (otherwise these symbols are discarded
> anyway).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Factor 2 may in principle still be too small: We zap what looks like
> real file symbols already in read_symbol(), so table_cnt doesn't really
> reflect the number of symbol table entries encountered. It has proven to
> work for me in practice though, with still some leeway left.
>
> --- a/xen/tools/symbols.c
> +++ b/xen/tools/symbols.c
> @@ -213,6 +213,16 @@ static int symbol_valid(struct sym_entry
>  	if (strstr((char *)s->sym + offset, "_compiled."))
>  		return 0;
>
> +	/* At least GNU ld 2.25 may emit bogus file symbols referencing a
> +	 * section name while linking xen.efi. In COFF symbol tables the
> +	 * "value" of file symbols is a link (symbol table index) to the next
> +	 * file symbol. Since file (and other) symbols (can) come with one
> +	 * (or in principle more) auxiliary symbol table entries, the value in
> +	 * this heuristic is bounded to twice the number of symbols we have
> +	 * found. See also read_symbol() as to the '?' checked for here. */
> +	if (s->sym[0] == '?' && s->sym[1] == '.' && s->addr < table_cnt * 2)
Maybe a naive question, but couldn't you drop everything below
__XEN_VIRT_START, as we shouldn't have any symbols below that
address?
Thanks, Roger.
* Re: [PATCH] symbols: discard stray file symbols
2025-10-21 9:56 ` Roger Pau Monné
@ 2025-10-21 10:13 ` Jan Beulich
0 siblings, 0 replies; 7+ messages in thread
From: Jan Beulich @ 2025-10-21 10:13 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Julien Grall,
Stefano Stabellini, Anthony PERARD, Michal Orzel
On 21.10.2025 11:56, Roger Pau Monné wrote:
> On Wed, Apr 16, 2025 at 11:00:57AM +0200, Jan Beulich wrote:
>> By observation GNU ld 2.25 may emit file symbols for .data.read_mostly
>> when linking xen.efi. Due to the nature of file symbols in COFF symbol
>> tables (see the code comment) the symbols_offsets[] entries for such
>> symbols would cause assembler warnings regarding value truncation. Of
>> course the resulting entries would also be both meaningless and useless.
>> Add a heuristic to get rid of them, really taking effect only when
>> --all-symbols is specified (otherwise these symbols are discarded
>> anyway).
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Thanks.
>> --- a/xen/tools/symbols.c
>> +++ b/xen/tools/symbols.c
>> @@ -213,6 +213,16 @@ static int symbol_valid(struct sym_entry
>>  	if (strstr((char *)s->sym + offset, "_compiled."))
>>  		return 0;
>>
>> +	/* At least GNU ld 2.25 may emit bogus file symbols referencing a
>> +	 * section name while linking xen.efi. In COFF symbol tables the
>> +	 * "value" of file symbols is a link (symbol table index) to the next
>> +	 * file symbol. Since file (and other) symbols (can) come with one
>> +	 * (or in principle more) auxiliary symbol table entries, the value in
>> +	 * this heuristic is bounded to twice the number of symbols we have
>> +	 * found. See also read_symbol() as to the '?' checked for here. */
>> +	if (s->sym[0] == '?' && s->sym[1] == '.' && s->addr < table_cnt * 2)
>
> Maybe a naive question, but couldn't you drop everything below
> __XEN_VIRT_START, as we shouldn't have any symbols below that
> address?
If we assumed so, then that might be an option. Such an assumption doesn't
look safe to me, though. See how e.g. hv_hcall_page is outside of the Xen
image (albeit still within __XEN_VIRT_{START,END}). I wouldn't want to
preclude architectures playing "interesting" games with symbols, in
particular when - unlike x86 - they have the entire VA space for their use.
Jan