* Re: [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs
2024-10-11 14:38 [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs Zheng Yejian
@ 2024-10-11 14:08 ` Gary Guo
2024-10-11 22:01 ` Matthew Wilcox
` (2 subsequent siblings)
3 siblings, 0 replies; 9+ messages in thread
From: Gary Guo @ 2024-10-11 14:08 UTC (permalink / raw)
To: Zheng Yejian
Cc: arnd, kees, mcgrof, masahiroy, ndesaulniers, ardb, jannh, song,
willy, boqun.feng, gregkh, wedsonaf, linux-kernel, yeweihua4
On Fri, 11 Oct 2024 22:38:53 +0800
Zheng Yejian <zhengyejian@huaweicloud.com> wrote:
> Currently when the length of a symbol is longer than 0x7f characters,
> its type shown in /proc/kallsyms can be incorrect.
>
> I found this issue when reading the code, but it can be reproduced with
> the following steps:
>
> 1. Define a function whose symbol name is 130 characters long:
>
> #define X13(x) x##x##x##x##x##x##x##x##x##x##x##x##x
> static noinline void X13(x123456789)(void)
> {
> printk("hello world\n");
> }
>
> 2. The type in vmlinux is 't':
>
> $ nm vmlinux | grep x123456
> ffffffff816290f0 t x123456789x123456789x123456789x12[...]
>
> 3. Then boot the kernel, the type shown in /proc/kallsyms becomes 'g'
> instead of the expected 't':
>
> # cat /proc/kallsyms | grep x123456
> ffffffff816290f0 g x123456789x123456789x123456789x12[...]
>
> The root cause is that, after commit 73bbb94466fd ("kallsyms: support
> "big" kernel symbols"), ULEB128 was used to encode symbol name length.
> That is, for "big" kernel symbols of which name length is longer than
> 0x7f characters, the length info is encoded into 2 bytes.
>
> kallsyms_get_symbol_type() expects to read the first char of the
> symbol name which indicates the symbol type. However, due to the
> "big" symbol case not being handled, the symbol type read from
> /proc/kallsyms may be wrong, so handle it properly.
>
> Cc: stable@vger.kernel.org
> Fixes: 73bbb94466fd ("kallsyms: support "big" kernel symbols")
> Signed-off-by: Zheng Yejian <zhengyejian@huaweicloud.com>
Acked-by: Gary Guo <gary@garyguo.net>
> ---
>
> v1 -> v2:
> - Add reproduction info into commit message to make it clearer;
> - Add cc: stable line;
>
> v1: https://lore.kernel.org/all/20240830062935.1187613-1-zhengyejian@huaweicloud.com/
>
> kernel/kallsyms.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> index a9a0ca605d4a..9e4bf061bb83 100644
> --- a/kernel/kallsyms.c
> +++ b/kernel/kallsyms.c
> @@ -103,8 +103,11 @@ static char kallsyms_get_symbol_type(unsigned int off)
> {
> /*
> * Get just the first code, look it up in the token table,
> - * and return the first char from this token.
> + * and return the first char from this token. If MSB of length
> + * is 1, it is a "big" symbol, so needs an additional byte.
> */
> + if (kallsyms_names[off] & 0x80)
> + off++;
> return kallsyms_token_table[kallsyms_token_index[kallsyms_names[off + 1]]];
> }
>
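To make the failure mode concrete, here is a minimal userspace sketch of the two lookups. It is a toy model under stated assumptions: `names` stands in for kallsyms_names, both helper names are hypothetical, and the token table lookup is left out so that only the byte chosen as the token index is visible.

```c
#include <stdint.h>

/*
 * Toy model, not kernel code: return the byte that would be used as
 * the index into the token table.
 */

/* Pre-patch behaviour: always treat names[off + 1] as the first token. */
static uint8_t first_token_old(const uint8_t *names, unsigned int off)
{
	return names[off + 1];
}

/* Patched behaviour: step over the second length byte of a "big" symbol. */
static uint8_t first_token_fixed(const uint8_t *names, unsigned int off)
{
	if (names[off] & 0x80)
		off++;
	return names[off + 1];
}
```

For a record that starts with the bytes 0x82 0x01 (length 130), the old lookup returns 0x01, which is the second length byte rather than the first token byte. For a record with a one-byte length both helpers agree.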
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs
2024-10-11 14:38 [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs Zheng Yejian
2024-10-11 14:08 ` Gary Guo
@ 2024-10-11 22:01 ` Matthew Wilcox
2024-10-12 1:36 ` Zheng Yejian
2024-10-12 1:47 ` Gary Guo
2025-11-11 21:13 ` Miguel Ojeda
2025-11-16 22:30 ` Miguel Ojeda
3 siblings, 2 replies; 9+ messages in thread
From: Matthew Wilcox @ 2024-10-11 22:01 UTC (permalink / raw)
To: Zheng Yejian
Cc: arnd, kees, mcgrof, masahiroy, ndesaulniers, ardb, jannh, song,
boqun.feng, gregkh, gary, wedsonaf, linux-kernel, yeweihua4
On Fri, Oct 11, 2024 at 10:38:53PM +0800, Zheng Yejian wrote:
> The root cause is that, after commit 73bbb94466fd ("kallsyms: support
> "big" kernel symbols"), ULEB128 was used to encode symbol name length.
> That is, for "big" kernel symbols of which name length is longer than
> 0x7f characters, the length info is encoded into 2 bytes.
Technically, at least two. If we ever have a symbol larger than
16kB, we'll use three bytes.
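The byte counts follow from how ULEB128 works: each byte carries 7 bits of payload and a set MSB says another byte follows. A minimal userspace sketch, assuming a 32-bit value (the helper name `uleb128_encode` is made up for illustration and is not something in the tree):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative only: encode `value` as ULEB128 into `out` and return
 * the number of bytes written. `out` needs room for 5 bytes to hold
 * any 32-bit value.
 */
static size_t uleb128_encode(uint32_t value, uint8_t *out)
{
	size_t n = 0;

	do {
		uint8_t byte = value & 0x7f;	/* low 7 bits of payload */

		value >>= 7;
		if (value)
			byte |= 0x80;		/* more bytes follow */
		out[n++] = byte;
	} while (value);

	return n;
}
```

Under this encoding 0x7f fits in one byte, anything from 0x80 through 0x3fff takes two, and 0x4000 (16kB) is the first value that needs three.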
> +++ b/kernel/kallsyms.c
> @@ -103,8 +103,11 @@ static char kallsyms_get_symbol_type(unsigned int off)
> {
> /*
> * Get just the first code, look it up in the token table,
> - * and return the first char from this token.
> + * and return the first char from this token. If MSB of length
> + * is 1, it is a "big" symbol, so needs an additional byte.
> */
> + if (kallsyms_names[off] & 0x80)
> + off++;
So this "if" should be a "while" for maximum future proofing against the
day that we have a 16kB function ...
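A sketch of what the "while" form could look like, written as a standalone helper so it can be tried in userspace. The name `skip_uleb128_len` is hypothetical, and `names` stands in for kallsyms_names:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel function: given the offset of a
 * ULEB128 length field inside `names`, return the offset of the first
 * byte after that field.
 */
static unsigned int skip_uleb128_len(const uint8_t *names, unsigned int off)
{
	/* Every length byte with the MSB set announces one more byte. */
	while (names[off] & 0x80)
		off++;

	/* `off` now points at the last length byte; the data follows it. */
	return off + 1;
}
```

The loop stops at the first length byte whose MSB is clear, so it never looks at the token bytes that follow, and it handles one, two, or three length bytes in the same way.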
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs
2024-10-11 22:01 ` Matthew Wilcox
@ 2024-10-12 1:36 ` Zheng Yejian
2024-10-12 1:47 ` Gary Guo
1 sibling, 0 replies; 9+ messages in thread
From: Zheng Yejian @ 2024-10-12 1:36 UTC (permalink / raw)
To: Matthew Wilcox
Cc: arnd, kees, mcgrof, masahiroy, ndesaulniers, ardb, jannh, song,
boqun.feng, gregkh, gary, wedsonaf, linux-kernel, yeweihua4
On 2024/10/12 06:01, Matthew Wilcox wrote:
> On Fri, Oct 11, 2024 at 10:38:53PM +0800, Zheng Yejian wrote:
>> The root cause is that, after commit 73bbb94466fd ("kallsyms: support
>> "big" kernel symbols"), ULEB128 was used to encode symbol name length.
>> That is, for "big" kernel symbols of which name length is longer than
>> 0x7f characters, the length info is encoded into 2 bytes.
>
> Technically, at least two. If we ever have a symbol larger than
> 16kB, we'll use three bytes.
>
Well, yes!
>> +++ b/kernel/kallsyms.c
>> @@ -103,8 +103,11 @@ static char kallsyms_get_symbol_type(unsigned int off)
>> {
>> /*
>> * Get just the first code, look it up in the token table,
>> - * and return the first char from this token.
>> + * and return the first char from this token. If MSB of length
>> + * is 1, it is a "big" symbol, so needs an additional byte.
>> */
>> + if (kallsyms_names[off] & 0x80)
>> + off++;
>
> So this "if" should be a "while" for maximum future proofing against the
> day that we have a 16kB function ...
I'll test it and send a v3.
--
Thanks,
Zheng Yejian
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs
2024-10-11 22:01 ` Matthew Wilcox
2024-10-12 1:36 ` Zheng Yejian
@ 2024-10-12 1:47 ` Gary Guo
2024-10-12 2:09 ` Zheng Yejian
1 sibling, 1 reply; 9+ messages in thread
From: Gary Guo @ 2024-10-12 1:47 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Zheng Yejian, arnd, kees, mcgrof, masahiroy, ndesaulniers, ardb,
jannh, song, boqun.feng, gregkh, linux-kernel, yeweihua4
On Fri, 11 Oct 2024 23:01:12 +0100
Matthew Wilcox <willy@infradead.org> wrote:
> On Fri, Oct 11, 2024 at 10:38:53PM +0800, Zheng Yejian wrote:
> > The root cause is that, after commit 73bbb94466fd ("kallsyms: support
> > "big" kernel symbols"), ULEB128 was used to encode symbol name length.
> > That is, for "big" kernel symbols of which name length is longer than
> > 0x7f characters, the length info is encoded into 2 bytes.
>
> Technically, at least two. If we ever have a symbol larger than
> 16kB, we'll use three bytes.
Let's not worry about things that would not happen.
scripts/kallsyms.c has a check to ensure that symbol names don't get
longer than 0x3FFF.
Best,
Gary
>
> > +++ b/kernel/kallsyms.c
> > @@ -103,8 +103,11 @@ static char kallsyms_get_symbol_type(unsigned int off)
> > {
> > /*
> > * Get just the first code, look it up in the token table,
> > - * and return the first char from this token.
> > + * and return the first char from this token. If MSB of length
> > + * is 1, it is a "big" symbol, so needs an additional byte.
> > */
> > + if (kallsyms_names[off] & 0x80)
> > + off++;
>
> So this "if" should be a "while" for maximum future proofing against the
> day that we have a 16kB function ...
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs
2024-10-12 1:47 ` Gary Guo
@ 2024-10-12 2:09 ` Zheng Yejian
0 siblings, 0 replies; 9+ messages in thread
From: Zheng Yejian @ 2024-10-12 2:09 UTC (permalink / raw)
To: Gary Guo, Matthew Wilcox
Cc: arnd, kees, mcgrof, masahiroy, ndesaulniers, ardb, jannh, song,
boqun.feng, gregkh, linux-kernel, yeweihua4
On 2024/10/12 09:47, Gary Guo wrote:
> On Fri, 11 Oct 2024 23:01:12 +0100
> Matthew Wilcox <willy@infradead.org> wrote:
>
>> On Fri, Oct 11, 2024 at 10:38:53PM +0800, Zheng Yejian wrote:
>>> The root cause is that, after commit 73bbb94466fd ("kallsyms: support
>>> "big" kernel symbols"), ULEB128 was used to encode symbol name length.
>>> That is, for "big" kernel symbols of which name length is longer than
>>> 0x7f characters, the length info is encoded into 2 bytes.
>>
>> Technically, at least two. If we ever have a symbol larger than
>> 16kB, we'll use three bytes.
>
> Let's not worry about things that would not happen.
>
> scripts/kallsyms.c has a check to ensure that symbol names don't get
> longer than 0x3FFF.
Yes, so currently kallsyms_expand_symbol() and get_symbol_offset() also
assume that the symbol length is encoded in one or two bytes.
If we want to handle the "longer than 0x3FFF" case, those two functions
would probably need to change as well.
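For illustration, the one-or-two-byte assumption amounts to a decode along these lines. This is a userspace sketch of the pattern, not a copy of either function, and the helper name `read_len_2byte` is made up:

```c
#include <stdint.h>

/*
 * Sketch of a decoder that assumes at most two length bytes. Stores
 * the decoded length in *len and returns the number of bytes consumed
 * (1 or 2). A third length byte, if one existed, would not be handled.
 */
static unsigned int read_len_2byte(const uint8_t *data, unsigned int *len)
{
	unsigned int consumed = 1;

	*len = data[0];
	if (*len & 0x80) {
		/* "Big" symbol: low 7 bits first, then the next byte. */
		*len = (*len & 0x7f) | ((unsigned int)data[1] << 7);
		consumed = 2;
	}
	return consumed;
}
```

This is correct for every length up to 0x3FFF, which is why the limit in scripts/kallsyms.c matters for it.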
>
> Best,
> Gary
>
>>
>>> +++ b/kernel/kallsyms.c
>>> @@ -103,8 +103,11 @@ static char kallsyms_get_symbol_type(unsigned int off)
>>> {
>>> /*
>>> * Get just the first code, look it up in the token table,
>>> - * and return the first char from this token.
>>> + * and return the first char from this token. If MSB of length
>>> + * is 1, it is a "big" symbol, so needs an additional byte.
>>> */
>>> + if (kallsyms_names[off] & 0x80)
>>> + off++;
>>
>> So this "if" should be a "while" for maximum future proofing against the
>> day that we have a 16kB function ...
>>
--
Thanks,
Zheng Yejian
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs
2024-10-11 14:38 [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs Zheng Yejian
2024-10-11 14:08 ` Gary Guo
2024-10-11 22:01 ` Matthew Wilcox
@ 2025-11-11 21:13 ` Miguel Ojeda
2025-11-16 22:30 ` Miguel Ojeda
3 siblings, 0 replies; 9+ messages in thread
From: Miguel Ojeda @ 2025-11-11 21:13 UTC (permalink / raw)
To: zhengyejian
Cc: ardb, arnd, boqun.feng, gary, gregkh, jannh, kees, linux-kernel,
masahiroy, mcgrof, ndesaulniers, song, wedsonaf, willy, yeweihua4,
Arnaldo Carvalho de Melo, stable
On Fri, 11 Oct 2024 22:38:53 +0800 Zheng Yejian <zhengyejian@huaweicloud.com> wrote:
>
> Currently when the length of a symbol is longer than 0x7f characters,
> its type shown in /proc/kallsyms can be incorrect.
>
> I found this issue when reading the code, but it can be reproduced with
> the following steps:
>
> 1. Define a function whose symbol name is 130 characters long:
>
> #define X13(x) x##x##x##x##x##x##x##x##x##x##x##x##x
> static noinline void X13(x123456789)(void)
> {
> printk("hello world\n");
> }
>
> 2. The type in vmlinux is 't':
>
> $ nm vmlinux | grep x123456
> ffffffff816290f0 t x123456789x123456789x123456789x12[...]
>
> 3. Then boot the kernel, the type shown in /proc/kallsyms becomes 'g'
> instead of the expected 't':
>
> # cat /proc/kallsyms | grep x123456
> ffffffff816290f0 g x123456789x123456789x123456789x12[...]
>
> The root cause is that, after commit 73bbb94466fd ("kallsyms: support
> "big" kernel symbols"), ULEB128 was used to encode symbol name length.
> That is, for "big" kernel symbols of which name length is longer than
> 0x7f characters, the length info is encoded into 2 bytes.
>
> kallsyms_get_symbol_type() expects to read the first char of the
> symbol name which indicates the symbol type. However, due to the
> "big" symbol case not being handled, the symbol type read from
> /proc/kallsyms may be wrong, so handle it properly.
>
> Cc: stable@vger.kernel.org
> Fixes: 73bbb94466fd ("kallsyms: support "big" kernel symbols")
> Signed-off-by: Zheng Yejian <zhengyejian@huaweicloud.com>
Gary made me aware of this thread (thanks!) -- we are coming from:
https://lore.kernel.org/all/aQjua6zkEHYNVN3X@x1/
For which I sent this patch without knowing about this one:
https://lore.kernel.org/rust-for-linux/20251107050414.511648-1-ojeda@kernel.org/
This has been seen now by Arnaldo (Cc'ing) in a real system, so I think
we should take this one since it was first, with:
Cc: stable@vger.kernel.org
Thanks!
Cheers,
Miguel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs
2024-10-11 14:38 [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs Zheng Yejian
` (2 preceding siblings ...)
2025-11-11 21:13 ` Miguel Ojeda
@ 2025-11-16 22:30 ` Miguel Ojeda
2025-11-24 16:54 ` Miguel Ojeda
3 siblings, 1 reply; 9+ messages in thread
From: Miguel Ojeda @ 2025-11-16 22:30 UTC (permalink / raw)
To: zhengyejian
Cc: ardb, arnd, boqun.feng, gary, gregkh, jannh, kees, linux-kernel,
masahiroy, mcgrof, ndesaulniers, song, wedsonaf, willy, yeweihua4
On Fri, 11 Oct 2024 22:38:53 +0800 Zheng Yejian <zhengyejian@huaweicloud.com> wrote:
>
> Currently when the length of a symbol is longer than 0x7f characters,
> its type shown in /proc/kallsyms can be incorrect.
>
> I found this issue when reading the code, but it can be reproduced with
> the following steps:
>
> 1. Define a function whose symbol name is 130 characters long:
>
> #define X13(x) x##x##x##x##x##x##x##x##x##x##x##x##x
> static noinline void X13(x123456789)(void)
> {
> printk("hello world\n");
> }
>
> 2. The type in vmlinux is 't':
>
> $ nm vmlinux | grep x123456
> ffffffff816290f0 t x123456789x123456789x123456789x12[...]
>
> 3. Then boot the kernel, the type shown in /proc/kallsyms becomes 'g'
> instead of the expected 't':
>
> # cat /proc/kallsyms | grep x123456
> ffffffff816290f0 g x123456789x123456789x123456789x12[...]
>
> The root cause is that, after commit 73bbb94466fd ("kallsyms: support
> "big" kernel symbols"), ULEB128 was used to encode symbol name length.
> That is, for "big" kernel symbols of which name length is longer than
> 0x7f characters, the length info is encoded into 2 bytes.
>
> kallsyms_get_symbol_type() expects to read the first char of the
> symbol name which indicates the symbol type. However, due to the
> "big" symbol case not being handled, the symbol type read from
> /proc/kallsyms may be wrong, so handle it properly.
>
> Cc: stable@vger.kernel.org
> Fixes: 73bbb94466fd ("kallsyms: support "big" kernel symbols")
> Signed-off-by: Zheng Yejian <zhengyejian@huaweicloud.com>
Applied to `rust-fixes` -- thanks everyone!
Let's get some linux-next testing.
If someone is against this or wants to pick it up, please shout!
Cheers,
Miguel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs
2025-11-16 22:30 ` Miguel Ojeda
@ 2025-11-24 16:54 ` Miguel Ojeda
0 siblings, 0 replies; 9+ messages in thread
From: Miguel Ojeda @ 2025-11-24 16:54 UTC (permalink / raw)
To: ojeda
Cc: ardb, arnd, boqun.feng, gary, gregkh, jannh, kees, linux-kernel,
masahiroy, mcgrof, ndesaulniers, song, wedsonaf, willy, yeweihua4,
zhengyejian
On Sun, 16 Nov 2025 23:30:55 +0100 Miguel Ojeda <ojeda@kernel.org> wrote:
>
> Applied to `rust-fixes` -- thanks everyone!
>
> Let's get some linux-next testing.
>
> If someone is against this or wants to pick it up, please shout!
Moved to `rust-next` which I will send this week.
Cheers,
Miguel
^ permalink raw reply [flat|nested] 9+ messages in thread