* Crashes in linux-next on powerpc with CONFIG_PPC_KUAP and CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG
@ 2019-05-07 14:54 Michael Ellerman
2019-05-09 9:29 ` Petr Mladek
0 siblings, 1 reply; 3+ messages in thread
From: Michael Ellerman @ 2019-05-07 14:54 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Stephen Rothwell, Petr Mladek
Hi folks,
Just an FYI in case anyone else is seeing crashes very early in boot in
linux-next with the above config options.
The problem is the combination of some new code called via printk(),
check_pointer() which calls probe_kernel_read(). That then calls
allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
(before we've patched features). With the JUMP_LABEL debug enabled that
causes us to call printk() & dump_stack() and we end up recursing and
overflowing the stack.
Because it happens so early you don't get any output, just an apparently
dead system.
The stack trace (which you don't see) is something like:
...
dump_stack+0xdc
probe_kernel_read+0x1a4
check_pointer+0x58
string+0x3c
vsnprintf+0x1bc
vscnprintf+0x20
printk_safe_log_store+0x7c
printk+0x40
dump_stack_print_info+0xbc
dump_stack+0x8
probe_kernel_read+0x1a4
probe_kernel_read+0x19c
check_pointer+0x58
string+0x3c
vsnprintf+0x1bc
vscnprintf+0x20
vprintk_store+0x6c
vprintk_emit+0xec
vprintk_func+0xd4
printk+0x40
cpufeatures_process_feature+0xc8
scan_cpufeatures_subnodes+0x380
of_scan_flat_dt_subnodes+0xb4
dt_cpu_ftrs_scan_callback+0x158
of_scan_flat_dt+0xf0
dt_cpu_ftrs_scan+0x3c
early_init_devtree+0x360
early_setup+0x9c
The simple fix is to use early_mmu_has_feature() in allow_user_access(),
but we'd rather not do that because it penalises all
copy_to/from_users() for the life of the system with the cost of the
runtime check vs the jump label. The irony is probe_kernel_read()
shouldn't be allowing user access at all, because we're reading the
kernel not userspace.
For now if you're hitting it just turn off
CONFIG_PPC_KUAP and/or CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG.
cheers
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: Crashes in linux-next on powerpc with CONFIG_PPC_KUAP and CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG
2019-05-07 14:54 Crashes in linux-next on powerpc with CONFIG_PPC_KUAP and CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG Michael Ellerman
@ 2019-05-09 9:29 ` Petr Mladek
2019-05-09 14:42 ` Michael Ellerman
0 siblings, 1 reply; 3+ messages in thread
From: Petr Mladek @ 2019-05-09 9:29 UTC (permalink / raw)
To: Michael Ellerman
Cc: linuxppc-dev, linux-kernel, Stephen Rothwell, Linus Torvalds
On Wed 2019-05-08 00:54:51, Michael Ellerman wrote:
> Hi folks,
>
> Just an FYI in case anyone else is seeing crashes very early in boot in
> linux-next with the above config options.
>
> The problem is the combination of some new code called via printk(),
> check_pointer() which calls probe_kernel_read(). That then calls
> allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
> (before we've patched features). With the JUMP_LABEL debug enabled that
> causes us to call printk() & dump_stack() and we end up recursing and
> overflowing the stack.
Sigh, the check_pointer() stuff is in Linus's tree now, see
the commit 3e5903eb9cff707301712 ("vsprintf: Prevent crash when
dereferencing invalid pointers").
> Because it happens so early you don't get any output, just an apparently
> dead system.
>
> The stack trace (which you don't see) is something like:
>
> ...
> dump_stack+0xdc
> probe_kernel_read+0x1a4
> check_pointer+0x58
> string+0x3c
> vsnprintf+0x1bc
> vscnprintf+0x20
> printk_safe_log_store+0x7c
> printk+0x40
> dump_stack_print_info+0xbc
> dump_stack+0x8
> probe_kernel_read+0x1a4
> probe_kernel_read+0x19c
> check_pointer+0x58
> string+0x3c
> vsnprintf+0x1bc
> vscnprintf+0x20
> vprintk_store+0x6c
> vprintk_emit+0xec
> vprintk_func+0xd4
> printk+0x40
> cpufeatures_process_feature+0xc8
> scan_cpufeatures_subnodes+0x380
> of_scan_flat_dt_subnodes+0xb4
> dt_cpu_ftrs_scan_callback+0x158
> of_scan_flat_dt+0xf0
> dt_cpu_ftrs_scan+0x3c
> early_init_devtree+0x360
> early_setup+0x9c
>
>
> The simple fix is to use early_mmu_has_feature() in allow_user_access(),
> but we'd rather not do that because it penalises all
> copy_to/from_users() for the life of the system with the cost of the
> runtime check vs the jump label. The irony is probe_kernel_read()
> shouldn't be allowing user access at all, because we're reading the
> kernel not userspace.
I have tried to find a lightweight way for a safe reading of unknown
kernel pointer. But I have not succeeded so far. I see only variants
with user access. The user access is handled in arch-specific code
and I do not see any variant without it.
I am not sure on which level it should get fixed.
Could you please send it to lkml to get a wider audience?
Best Regards,
Petr
> For now if you're hitting it just turn off
> CONFIG_PPC_KUAP and/or CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG.
>
> cheers
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: Crashes in linux-next on powerpc with CONFIG_PPC_KUAP and CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG
2019-05-09 9:29 ` Petr Mladek
@ 2019-05-09 14:42 ` Michael Ellerman
0 siblings, 0 replies; 3+ messages in thread
From: Michael Ellerman @ 2019-05-09 14:42 UTC (permalink / raw)
To: Petr Mladek; +Cc: linuxppc-dev, linux-kernel, Stephen Rothwell, Linus Torvalds
Petr Mladek <pmladek@suse.com> writes:
> On Wed 2019-05-08 00:54:51, Michael Ellerman wrote:
>> Hi folks,
>>
>> Just an FYI in case anyone else is seeing crashes very early in boot in
>> linux-next with the above config options.
>>
>> The problem is the combination of some new code called via printk(),
>> check_pointer() which calls probe_kernel_read(). That then calls
>> allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
>> (before we've patched features). With the JUMP_LABEL debug enabled that
>> causes us to call printk() & dump_stack() and we end up recursing and
>> overflowing the stack.
>
> Sigh, the check_pointer() stuff is in Linus's tree now, see
> the commit 3e5903eb9cff707301712 ("vsprintf: Prevent crash when
> dereferencing invalid pointers").
No worries.
>> Because it happens so early you don't get any output, just an apparently
>> dead system.
>>
>> The stack trace (which you don't see) is something like:
>>
>> ...
>> dump_stack+0xdc
>> probe_kernel_read+0x1a4
>> check_pointer+0x58
>> string+0x3c
>> vsnprintf+0x1bc
>> vscnprintf+0x20
>> printk_safe_log_store+0x7c
>> printk+0x40
>> dump_stack_print_info+0xbc
>> dump_stack+0x8
>> probe_kernel_read+0x1a4
>> probe_kernel_read+0x19c
>> check_pointer+0x58
>> string+0x3c
>> vsnprintf+0x1bc
>> vscnprintf+0x20
>> vprintk_store+0x6c
>> vprintk_emit+0xec
>> vprintk_func+0xd4
>> printk+0x40
>> cpufeatures_process_feature+0xc8
>> scan_cpufeatures_subnodes+0x380
>> of_scan_flat_dt_subnodes+0xb4
>> dt_cpu_ftrs_scan_callback+0x158
>> of_scan_flat_dt+0xf0
>> dt_cpu_ftrs_scan+0x3c
>> early_init_devtree+0x360
>> early_setup+0x9c
>>
>>
>> The simple fix is to use early_mmu_has_feature() in allow_user_access(),
>> but we'd rather not do that because it penalises all
>> copy_to/from_users() for the life of the system with the cost of the
>> runtime check vs the jump label. The irony is probe_kernel_read()
>> shouldn't be allowing user access at all, because we're reading the
>> kernel not userspace.
>
> I have tried to find a lightweight way for a safe reading of unknown
> kernel pointer. But I have not succeeded so far. I see only variants
> with user access. The user access is handled in arch-specific code
> and I do not see any variant without it.
>
> I am not sure on which level it should get fixed.
I sent a fix in powerpc code (sorry might have forgot to Cc you):
https://patchwork.ozlabs.org/patch/1097015/
I've merged that into the powerpc tree. I think it's too subtle for us
to have an ordering requirement that deep in the user copy code, it was
just a matter of time before it caused a problem, you were just unlucky
it was your patch that did :)
We'll eventually switch it back to using a jump label but make it safe
to call early in boot before we've detected features.
> Could you please send it to lkml to get a wider audience?
I see you also sent a fix, that looks like a safe default to me.
But as I said I'm happy with the powerpc fix, so there's no requirement
from us that your fix get merged.
cheers
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2019-05-09 14:47 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-05-07 14:54 Crashes in linux-next on powerpc with CONFIG_PPC_KUAP and CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG Michael Ellerman
2019-05-09 9:29 ` Petr Mladek
2019-05-09 14:42 ` Michael Ellerman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox