public inbox for linux-kernel@vger.kernel.org
* Re: [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs
  2024-10-11 14:38 [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs Zheng Yejian
@ 2024-10-11 14:08 ` Gary Guo
  2024-10-11 22:01 ` Matthew Wilcox
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Gary Guo @ 2024-10-11 14:08 UTC (permalink / raw)
  To: Zheng Yejian
  Cc: arnd, kees, mcgrof, masahiroy, ndesaulniers, ardb, jannh, song,
	willy, boqun.feng, gregkh, wedsonaf, linux-kernel, yeweihua4

On Fri, 11 Oct 2024 22:38:53 +0800
Zheng Yejian <zhengyejian@huaweicloud.com> wrote:

> Currently when the length of a symbol is longer than 0x7f characters,
> its type shown in /proc/kallsyms can be incorrect.
> 
> I found this issue when reading the code, but it can be reproduced by
> following steps:
> 
>   1. Define a function which symbol length is 130 characters:
> 
>     #define X13(x) x##x##x##x##x##x##x##x##x##x##x##x##x
>     static noinline void X13(x123456789)(void)
>     {
>         printk("hello world\n");
>     }
> 
>   2. The type in vmlinux is 't':
> 
>     $ nm vmlinux | grep x123456
>     ffffffff816290f0 t x123456789x123456789x123456789x12[...]
> 
>   3. Then boot the kernel, the type shown in /proc/kallsyms becomes 'g'
>      instead of the expected 't':
> 
>     # cat /proc/kallsyms | grep x123456
>     ffffffff816290f0 g x123456789x123456789x123456789x12[...]
> 
> The root cause is that, after commit 73bbb94466fd ("kallsyms: support
> "big" kernel symbols"), ULEB128 was used to encode symbol name length.
> That is, for "big" kernel symbols of which name length is longer than
> 0x7f characters, the length info is encoded into 2 bytes.
> 
> kallsyms_get_symbol_type() expects to read the first char of the
> symbol name which indicates the symbol type. However, due to the
> "big" symbol case not being handled, the symbol type read from
> /proc/kallsyms may be wrong, so handle it properly.
> 
> Cc: stable@vger.kernel.org
> Fixes: 73bbb94466fd ("kallsyms: support "big" kernel symbols")
> Signed-off-by: Zheng Yejian <zhengyejian@huaweicloud.com>

Acked-by: Gary Guo <gary@garyguo.net>

> ---
> 
> v1 -> v2:
> - Add reproduction info into commit message to make it clearer;
> - Add cc: stable line;
> 
> v1: https://lore.kernel.org/all/20240830062935.1187613-1-zhengyejian@huaweicloud.com/
> 
>  kernel/kallsyms.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> index a9a0ca605d4a..9e4bf061bb83 100644
> --- a/kernel/kallsyms.c
> +++ b/kernel/kallsyms.c
> @@ -103,8 +103,11 @@ static char kallsyms_get_symbol_type(unsigned int off)
>  {
>  	/*
>  	 * Get just the first code, look it up in the token table,
> -	 * and return the first char from this token.
> +	 * and return the first char from this token. If MSB of length
> +	 * is 1, it is a "big" symbol, so needs an additional byte.
>  	 */
> +	if (kallsyms_names[off] & 0x80)
> +		off++;
>  	return kallsyms_token_table[kallsyms_token_index[kallsyms_names[off + 1]]];
>  }
>  



* [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs
@ 2024-10-11 14:38 Zheng Yejian
  2024-10-11 14:08 ` Gary Guo
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Zheng Yejian @ 2024-10-11 14:38 UTC (permalink / raw)
  To: arnd, kees, mcgrof, masahiroy, ndesaulniers, ardb, jannh, song,
	willy, boqun.feng, gregkh, gary, wedsonaf
  Cc: linux-kernel, yeweihua4

Currently, when a symbol name is longer than 0x7f characters,
its type shown in /proc/kallsyms can be incorrect.

I found this issue when reading the code, but it can be reproduced with
the following steps:

  1. Define a function whose symbol name is 130 characters long:

    #define X13(x) x##x##x##x##x##x##x##x##x##x##x##x##x
    static noinline void X13(x123456789)(void)
    {
        printk("hello world\n");
    }

  2. The type in vmlinux is 't':

    $ nm vmlinux | grep x123456
    ffffffff816290f0 t x123456789x123456789x123456789x12[...]

  3. Boot the kernel; the type shown in /proc/kallsyms becomes 'g'
     instead of the expected 't':

    # cat /proc/kallsyms | grep x123456
    ffffffff816290f0 g x123456789x123456789x123456789x12[...]

The root cause is that, after commit 73bbb94466fd ("kallsyms: support
"big" kernel symbols"), ULEB128 is used to encode the symbol name length.
That is, for "big" kernel symbols whose name length exceeds 0x7f
characters, the length is encoded in 2 bytes.

kallsyms_get_symbol_type() expects to read the first char of the
symbol name, which indicates the symbol type. However, because the
"big" symbol case is not handled, it reads the second length byte
instead, so the symbol type shown in /proc/kallsyms may be wrong.
Handle it properly.
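
For readers following along, here is a minimal userspace sketch (not part
of the patch) of how a length above 0x7f becomes two bytes, and why the
pre-patch index lands on the second length byte. The helper names
encode_len(), first_token_old() and first_token_new() and the example
length 131 are assumptions made for illustration; only the 0x80 test
mirrors the diff below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a name length as one or two ULEB128 bytes. */
static size_t encode_len(uint8_t *out, unsigned int len)
{
	assert(len <= 0x3FFF);		/* two bytes carry at most 14 bits */
	if (len <= 0x7F) {
		out[0] = (uint8_t)len;
		return 1;
	}
	out[0] = (uint8_t)((len & 0x7F) | 0x80);	/* low 7 bits, MSB set */
	out[1] = (uint8_t)((len >> 7) & 0x7F);		/* remaining high bits */
	return 2;
}

/* Index of the "first token" byte as computed without the fix. */
static unsigned int first_token_old(const uint8_t *names, unsigned int off)
{
	(void)names;			/* the length byte is never inspected */
	return off + 1;
}

/* Index of the first token byte once the "big" symbol check is applied. */
static unsigned int first_token_new(const uint8_t *names, unsigned int off)
{
	if (names[off] & 0x80)		/* MSB set: length occupies two bytes */
		off++;
	return off + 1;
}
```

With a length of 131, the prefix is 0x83 0x01, so the old computation
returns the index of the 0x01 byte and that value is then looked up as
if it were a token.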

Cc: stable@vger.kernel.org
Fixes: 73bbb94466fd ("kallsyms: support "big" kernel symbols")
Signed-off-by: Zheng Yejian <zhengyejian@huaweicloud.com>
---

v1 -> v2:
- Add reproduction info into commit message to make it clearer;
- Add cc: stable line;

v1: https://lore.kernel.org/all/20240830062935.1187613-1-zhengyejian@huaweicloud.com/

 kernel/kallsyms.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index a9a0ca605d4a..9e4bf061bb83 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -103,8 +103,11 @@ static char kallsyms_get_symbol_type(unsigned int off)
 {
 	/*
 	 * Get just the first code, look it up in the token table,
-	 * and return the first char from this token.
+	 * and return the first char from this token. If MSB of length
+	 * is 1, it is a "big" symbol, so needs an additional byte.
 	 */
+	if (kallsyms_names[off] & 0x80)
+		off++;
 	return kallsyms_token_table[kallsyms_token_index[kallsyms_names[off + 1]]];
 }
 
-- 
2.25.1



* Re: [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs
  2024-10-11 14:38 [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs Zheng Yejian
  2024-10-11 14:08 ` Gary Guo
@ 2024-10-11 22:01 ` Matthew Wilcox
  2024-10-12  1:36   ` Zheng Yejian
  2024-10-12  1:47   ` Gary Guo
  2025-11-11 21:13 ` Miguel Ojeda
  2025-11-16 22:30 ` Miguel Ojeda
  3 siblings, 2 replies; 9+ messages in thread
From: Matthew Wilcox @ 2024-10-11 22:01 UTC (permalink / raw)
  To: Zheng Yejian
  Cc: arnd, kees, mcgrof, masahiroy, ndesaulniers, ardb, jannh, song,
	boqun.feng, gregkh, gary, wedsonaf, linux-kernel, yeweihua4

On Fri, Oct 11, 2024 at 10:38:53PM +0800, Zheng Yejian wrote:
> The root cause is that, after commit 73bbb94466fd ("kallsyms: support
> "big" kernel symbols"), ULEB128 was used to encode symbol name length.
> That is, for "big" kernel symbols of which name length is longer than
> 0x7f characters, the length info is encoded into 2 bytes.

Technically, at least two.  If we ever have a symbol larger than
16kB, we'll use three bytes.

> +++ b/kernel/kallsyms.c
> @@ -103,8 +103,11 @@ static char kallsyms_get_symbol_type(unsigned int off)
>  {
>  	/*
>  	 * Get just the first code, look it up in the token table,
> -	 * and return the first char from this token.
> +	 * and return the first char from this token. If MSB of length
> +	 * is 1, it is a "big" symbol, so needs an additional byte.
>  	 */
> +	if (kallsyms_names[off] & 0x80)
> +		off++;

So this "if" should be a "while" for maximum future proofing against the
day that we have a 16kB function ...
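
A sketch of the "while" form, for illustration only. skip_len_prefix()
is a hypothetical helper name, and it assumes that every length byte
except the last has its MSB set, as in ULEB128:

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the index of the first token byte of the
 * entry starting at 'off', skipping a length prefix of any size.
 */
static unsigned int skip_len_prefix(const uint8_t *names, unsigned int off)
{
	while (names[off] & 0x80)	/* continuation bit: more length bytes */
		off++;
	return off + 1;			/* step over the final length byte */
}
```

For one-byte and two-byte prefixes this gives the same result as the
"if" version; it only differs once a third length byte exists.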



* Re: [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs
  2024-10-11 22:01 ` Matthew Wilcox
@ 2024-10-12  1:36   ` Zheng Yejian
  2024-10-12  1:47   ` Gary Guo
  1 sibling, 0 replies; 9+ messages in thread
From: Zheng Yejian @ 2024-10-12  1:36 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: arnd, kees, mcgrof, masahiroy, ndesaulniers, ardb, jannh, song,
	boqun.feng, gregkh, gary, wedsonaf, linux-kernel, yeweihua4

On 2024/10/12 06:01, Matthew Wilcox wrote:
> On Fri, Oct 11, 2024 at 10:38:53PM +0800, Zheng Yejian wrote:
>> The root cause is that, after commit 73bbb94466fd ("kallsyms: support
>> "big" kernel symbols"), ULEB128 was used to encode symbol name length.
>> That is, for "big" kernel symbols of which name length is longer than
>> 0x7f characters, the length info is encoded into 2 bytes.
> 
> Technically, at least two.  If we ever have a symbol larger than
> 16kB, we'll use three bytes.
> 

Well, yes!

>> +++ b/kernel/kallsyms.c
>> @@ -103,8 +103,11 @@ static char kallsyms_get_symbol_type(unsigned int off)
>>   {
>>   	/*
>>   	 * Get just the first code, look it up in the token table,
>> -	 * and return the first char from this token.
>> +	 * and return the first char from this token. If MSB of length
>> +	 * is 1, it is a "big" symbol, so needs an additional byte.
>>   	 */
>> +	if (kallsyms_names[off] & 0x80)
>> +		off++;
> 
> So this "if" should be a "while" for maximum future proofing against the
> day that we have a 16kB function ...

I'll test it and send a v3.

-- 
Thanks,
Zheng Yejian



* Re: [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs
  2024-10-11 22:01 ` Matthew Wilcox
  2024-10-12  1:36   ` Zheng Yejian
@ 2024-10-12  1:47   ` Gary Guo
  2024-10-12  2:09     ` Zheng Yejian
  1 sibling, 1 reply; 9+ messages in thread
From: Gary Guo @ 2024-10-12  1:47 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Zheng Yejian, arnd, kees, mcgrof, masahiroy, ndesaulniers, ardb,
	jannh, song, boqun.feng, gregkh, linux-kernel, yeweihua4

On Fri, 11 Oct 2024 23:01:12 +0100
Matthew Wilcox <willy@infradead.org> wrote:

> On Fri, Oct 11, 2024 at 10:38:53PM +0800, Zheng Yejian wrote:
> > The root cause is that, after commit 73bbb94466fd ("kallsyms: support
> > "big" kernel symbols"), ULEB128 was used to encode symbol name length.
> > That is, for "big" kernel symbols of which name length is longer than
> > 0x7f characters, the length info is encoded into 2 bytes.  
> 
> Technically, at least two.  If we ever have a symbol larger than
> 16kB, we'll use three bytes.

Let's not worry about things that would not happen.

scripts/kallsyms.c has a check to ensure that symbol names are no
longer than 0x3FFF.

Best,
Gary

> 
> > +++ b/kernel/kallsyms.c
> > @@ -103,8 +103,11 @@ static char kallsyms_get_symbol_type(unsigned int off)
> >  {
> >  	/*
> >  	 * Get just the first code, look it up in the token table,
> > -	 * and return the first char from this token.
> > +	 * and return the first char from this token. If MSB of length
> > +	 * is 1, it is a "big" symbol, so needs an additional byte.
> >  	 */
> > +	if (kallsyms_names[off] & 0x80)
> > +		off++;  
> 
> So this "if" should be a "while" for maximum future proofing against the
> day that we have a 16kB function ...
> 



* Re: [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs
  2024-10-12  1:47   ` Gary Guo
@ 2024-10-12  2:09     ` Zheng Yejian
  0 siblings, 0 replies; 9+ messages in thread
From: Zheng Yejian @ 2024-10-12  2:09 UTC (permalink / raw)
  To: Gary Guo, Matthew Wilcox
  Cc: arnd, kees, mcgrof, masahiroy, ndesaulniers, ardb, jannh, song,
	boqun.feng, gregkh, linux-kernel, yeweihua4

On 2024/10/12 09:47, Gary Guo wrote:
> On Fri, 11 Oct 2024 23:01:12 +0100
> Matthew Wilcox <willy@infradead.org> wrote:
> 
>> On Fri, Oct 11, 2024 at 10:38:53PM +0800, Zheng Yejian wrote:
>>> The root cause is that, after commit 73bbb94466fd ("kallsyms: support
>>> "big" kernel symbols"), ULEB128 was used to encode symbol name length.
>>> That is, for "big" kernel symbols of which name length is longer than
>>> 0x7f characters, the length info is encoded into 2 bytes.
>>
>> Technically, at least two.  If we ever have a symbol larger than
>> 16kB, we'll use three bytes.
> 
> Let's not worry about things that would not happen.
> 
> scripts/kallsyms.c have a check to ensure that symbol names don't get
> longer than 0x3FFF.

Yes, so currently in kallsyms_expand_symbol() and get_symbol_offset(), the
symbol length is also assumed to be encoded in one or two bytes.
If we were to consider the "longer than 0x3FFF" case, those two functions
would also need to be changed.
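
To illustrate the one-or-two-byte assumption, a minimal sketch of such a
decoder follows. decode_len() is a hypothetical name, and the code is a
simplified userspace model, not a copy of kallsyms_expand_symbol() or
get_symbol_offset():

```c
#include <stdint.h>

/*
 * Hypothetical helper: decode a length stored in one or two bytes and
 * report how many bytes the prefix used. A third byte is never read,
 * so lengths above 0x3FFF cannot be represented here.
 */
static unsigned int decode_len(const uint8_t *data, unsigned int *consumed)
{
	unsigned int len = data[0];

	*consumed = 1;
	if (len & 0x80) {		/* MSB set: a second length byte follows */
		len = (len & 0x7F) | ((unsigned int)data[1] << 7);
		*consumed = 2;
	}
	return len;
}
```

The largest value this can return is 0x3FFF (0xFF 0x7F), which matches
the limit mentioned above.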

> 
> Best,
> Gary
> 
>>
>>> +++ b/kernel/kallsyms.c
>>> @@ -103,8 +103,11 @@ static char kallsyms_get_symbol_type(unsigned int off)
>>>   {
>>>   	/*
>>>   	 * Get just the first code, look it up in the token table,
>>> -	 * and return the first char from this token.
>>> +	 * and return the first char from this token. If MSB of length
>>> +	 * is 1, it is a "big" symbol, so needs an additional byte.
>>>   	 */
>>> +	if (kallsyms_names[off] & 0x80)
>>> +		off++;
>>
>> So this "if" should be a "while" for maximum future proofing against the
>> day that we have a 16kB function ...
>>

-- 
Thanks,
Zheng Yejian



* Re: [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs
  2024-10-11 14:38 [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs Zheng Yejian
  2024-10-11 14:08 ` Gary Guo
  2024-10-11 22:01 ` Matthew Wilcox
@ 2025-11-11 21:13 ` Miguel Ojeda
  2025-11-16 22:30 ` Miguel Ojeda
  3 siblings, 0 replies; 9+ messages in thread
From: Miguel Ojeda @ 2025-11-11 21:13 UTC (permalink / raw)
  To: zhengyejian
  Cc: ardb, arnd, boqun.feng, gary, gregkh, jannh, kees, linux-kernel,
	masahiroy, mcgrof, ndesaulniers, song, wedsonaf, willy, yeweihua4,
	Arnaldo Carvalho de Melo, stable

On Fri, 11 Oct 2024 22:38:53 +0800 Zheng Yejian <zhengyejian@huaweicloud.com> wrote:
>
> Currently when the length of a symbol is longer than 0x7f characters,
> its type shown in /proc/kallsyms can be incorrect.
>
> I found this issue when reading the code, but it can be reproduced by
> following steps:
>
>   1. Define a function which symbol length is 130 characters:
>
>     #define X13(x) x##x##x##x##x##x##x##x##x##x##x##x##x
>     static noinline void X13(x123456789)(void)
>     {
>         printk("hello world\n");
>     }
>
>   2. The type in vmlinux is 't':
>
>     $ nm vmlinux | grep x123456
>     ffffffff816290f0 t x123456789x123456789x123456789x12[...]
>
>   3. Then boot the kernel, the type shown in /proc/kallsyms becomes 'g'
>      instead of the expected 't':
>
>     # cat /proc/kallsyms | grep x123456
>     ffffffff816290f0 g x123456789x123456789x123456789x12[...]
>
> The root cause is that, after commit 73bbb94466fd ("kallsyms: support
> "big" kernel symbols"), ULEB128 was used to encode symbol name length.
> That is, for "big" kernel symbols of which name length is longer than
> 0x7f characters, the length info is encoded into 2 bytes.
>
> kallsyms_get_symbol_type() expects to read the first char of the
> symbol name which indicates the symbol type. However, due to the
> "big" symbol case not being handled, the symbol type read from
> /proc/kallsyms may be wrong, so handle it properly.
>
> Cc: stable@vger.kernel.org
> Fixes: 73bbb94466fd ("kallsyms: support "big" kernel symbols")
> Signed-off-by: Zheng Yejian <zhengyejian@huaweicloud.com>

Gary made me aware of this thread (thanks!) -- we are coming from:

    https://lore.kernel.org/all/aQjua6zkEHYNVN3X@x1/

For which I sent this patch without knowing about this one:

    https://lore.kernel.org/rust-for-linux/20251107050414.511648-1-ojeda@kernel.org/

This has been seen now by Arnaldo (Cc'ing) in a real system, so I think
we should take this one since it was first, with:

Cc: stable@vger.kernel.org

Thanks!

Cheers,
Miguel


* Re: [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs
  2024-10-11 14:38 [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs Zheng Yejian
                   ` (2 preceding siblings ...)
  2025-11-11 21:13 ` Miguel Ojeda
@ 2025-11-16 22:30 ` Miguel Ojeda
  2025-11-24 16:54   ` Miguel Ojeda
  3 siblings, 1 reply; 9+ messages in thread
From: Miguel Ojeda @ 2025-11-16 22:30 UTC (permalink / raw)
  To: zhengyejian
  Cc: ardb, arnd, boqun.feng, gary, gregkh, jannh, kees, linux-kernel,
	masahiroy, mcgrof, ndesaulniers, song, wedsonaf, willy, yeweihua4

On Fri, 11 Oct 2024 22:38:53 +0800 Zheng Yejian <zhengyejian@huaweicloud.com> wrote:
>
> Currently when the length of a symbol is longer than 0x7f characters,
> its type shown in /proc/kallsyms can be incorrect.
>
> I found this issue when reading the code, but it can be reproduced by
> following steps:
>
>   1. Define a function which symbol length is 130 characters:
>
>     #define X13(x) x##x##x##x##x##x##x##x##x##x##x##x##x
>     static noinline void X13(x123456789)(void)
>     {
>         printk("hello world\n");
>     }
>
>   2. The type in vmlinux is 't':
>
>     $ nm vmlinux | grep x123456
>     ffffffff816290f0 t x123456789x123456789x123456789x12[...]
>
>   3. Then boot the kernel, the type shown in /proc/kallsyms becomes 'g'
>      instead of the expected 't':
>
>     # cat /proc/kallsyms | grep x123456
>     ffffffff816290f0 g x123456789x123456789x123456789x12[...]
>
> The root cause is that, after commit 73bbb94466fd ("kallsyms: support
> "big" kernel symbols"), ULEB128 was used to encode symbol name length.
> That is, for "big" kernel symbols of which name length is longer than
> 0x7f characters, the length info is encoded into 2 bytes.
>
> kallsyms_get_symbol_type() expects to read the first char of the
> symbol name which indicates the symbol type. However, due to the
> "big" symbol case not being handled, the symbol type read from
> /proc/kallsyms may be wrong, so handle it properly.
>
> Cc: stable@vger.kernel.org
> Fixes: 73bbb94466fd ("kallsyms: support "big" kernel symbols")
> Signed-off-by: Zheng Yejian <zhengyejian@huaweicloud.com>

Applied to `rust-fixes` -- thanks everyone!

Let's get some linux-next testing.

If someone is against this or wants to pick it up, please shout!

Cheers,
Miguel


* Re: [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs
  2025-11-16 22:30 ` Miguel Ojeda
@ 2025-11-24 16:54   ` Miguel Ojeda
  0 siblings, 0 replies; 9+ messages in thread
From: Miguel Ojeda @ 2025-11-24 16:54 UTC (permalink / raw)
  To: ojeda
  Cc: ardb, arnd, boqun.feng, gary, gregkh, jannh, kees, linux-kernel,
	masahiroy, mcgrof, ndesaulniers, song, wedsonaf, willy, yeweihua4,
	zhengyejian

On Sun, 16 Nov 2025 23:30:55 +0100 Miguel Ojeda <ojeda@kernel.org> wrote:
>
> Applied to `rust-fixes` -- thanks everyone!
>
> Let's get some linux-next testing.
>
> If someone is against this or wants to pick it up, please shout!

Moved to `rust-next` which I will send this week.

Cheers,
Miguel


Thread overview: 9+ messages
2024-10-11 14:38 [PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs Zheng Yejian
2024-10-11 14:08 ` Gary Guo
2024-10-11 22:01 ` Matthew Wilcox
2024-10-12  1:36   ` Zheng Yejian
2024-10-12  1:47   ` Gary Guo
2024-10-12  2:09     ` Zheng Yejian
2025-11-11 21:13 ` Miguel Ojeda
2025-11-16 22:30 ` Miguel Ojeda
2025-11-24 16:54   ` Miguel Ojeda
