From: Michael Ellerman <mpe@ellerman.id.au>
To: Christophe Leroy <christophe.leroy@csgroup.eu>,
Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Paul Mackerras <paulus@samba.org>,
Dmitry Vyukov <dvyukov@google.com>,
LKML <linux-kernel@vger.kernel.org>,
linuxppc-dev@lists.ozlabs.org,
kasan-dev <kasan-dev@googlegroups.com>
Subject: Re: [RFC PATCH v1] powerpc: Enable KFENCE for PPC32
Date: Tue, 02 Mar 2021 22:40:03 +1100 [thread overview]
Message-ID: <87h7ltss18.fsf@mpe.ellerman.id.au> (raw)
In-Reply-To: <08a96c5d-4ae7-03b4-208f-956226dee6bb@csgroup.eu>
Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> Le 02/03/2021 à 10:53, Marco Elver a écrit :
>> On Tue, 2 Mar 2021 at 10:27, Christophe Leroy
>> <christophe.leroy@csgroup.eu> wrote:
>>> Le 02/03/2021 à 10:21, Alexander Potapenko a écrit :
>>>>> [ 14.998426] BUG: KFENCE: invalid read in finish_task_switch.isra.0+0x54/0x23c
>>>>> [ 14.998426]
>>>>> [ 15.007061] Invalid read at 0x(ptrval):
>>>>> [ 15.010906] finish_task_switch.isra.0+0x54/0x23c
>>>>> [ 15.015633] kunit_try_run_case+0x5c/0xd0
>>>>> [ 15.019682] kunit_generic_run_threadfn_adapter+0x24/0x30
>>>>> [ 15.025099] kthread+0x15c/0x174
>>>>> [ 15.028359] ret_from_kernel_thread+0x14/0x1c
>>>>> [ 15.032747]
>>>>> [ 15.034251] CPU: 0 PID: 111 Comm: kunit_try_catch Tainted: G B
>>>>> 5.12.0-rc1-s3k-dev-01534-g4f14ae75edf0-dirty #4674
>>>>> [ 15.045811] ==================================================================
>>>>> [ 15.053324] # test_invalid_access: EXPECTATION FAILED at mm/kfence/kfence_test.c:636
>>>>> [ 15.053324] Expected report_matches(&expect) to be true, but is false
>>>>> [ 15.068359] not ok 21 - test_invalid_access
>>>>
>>>> The test expects the function name to be test_invalid_access, i. e.
>>>> the first line should be "BUG: KFENCE: invalid read in
>>>> test_invalid_access".
>>>> The error reporting function unwinds the stack, skips a couple of
>>>> "uninteresting" frames
>>>> (https://elixir.bootlin.com/linux/v5.12-rc1/source/mm/kfence/report.c#L43)
>>>> and uses the first "interesting" one frame to print the report header
>>>> (https://elixir.bootlin.com/linux/v5.12-rc1/source/mm/kfence/report.c#L226).
>>>>
>>>> It's strange that test_invalid_access is missing altogether from the
>>>> stack trace - is that expected?
>>>> Can you try printing the whole stacktrace without skipping any frames
>>>> to see if that function is there?
>>>>
>>>
>>> Booting with 'no_hash_pointers" I get the following. Does it helps ?
>>>
>>> [ 16.837198] ==================================================================
>>> [ 16.848521] BUG: KFENCE: invalid read in finish_task_switch.isra.0+0x54/0x23c
>>> [ 16.848521]
>>> [ 16.857158] Invalid read at 0xdf98800a:
>>> [ 16.861004] finish_task_switch.isra.0+0x54/0x23c
>>> [ 16.865731] kunit_try_run_case+0x5c/0xd0
>>> [ 16.869780] kunit_generic_run_threadfn_adapter+0x24/0x30
>>> [ 16.875199] kthread+0x15c/0x174
>>> [ 16.878460] ret_from_kernel_thread+0x14/0x1c
>>> [ 16.882847]
>>> [ 16.884351] CPU: 0 PID: 111 Comm: kunit_try_catch Tainted: G B
>>> 5.12.0-rc1-s3k-dev-01534-g4f14ae75edf0-dirty #4674
>>> [ 16.895908] NIP: c016eb8c LR: c02f50dc CTR: c016eb38
>>> [ 16.900963] REGS: e2449d90 TRAP: 0301 Tainted: G B
>>> (5.12.0-rc1-s3k-dev-01534-g4f14ae75edf0-dirty)
>>> [ 16.911386] MSR: 00009032 <EE,ME,IR,DR,RI> CR: 22000004 XER: 00000000
>>> [ 16.918153] DAR: df98800a DSISR: 20000000
>>> [ 16.918153] GPR00: c02f50dc e2449e50 c1140d00 e100dd24 c084b13c 00000008 c084b32b c016eb38
>>> [ 16.918153] GPR08: c0850000 df988000 c0d10000 e2449eb0 22000288
>>> [ 16.936695] NIP [c016eb8c] test_invalid_access+0x54/0x108
>>> [ 16.942125] LR [c02f50dc] kunit_try_run_case+0x5c/0xd0
>>> [ 16.947292] Call Trace:
>>> [ 16.949746] [e2449e50] [c005a5ec] finish_task_switch.isra.0+0x54/0x23c (unreliable)
>>
>> The "(unreliable)" might be a clue that it's related to ppc32 stack
>> unwinding. Any ppc expert know what this is about?
>>
>>> [ 16.957443] [e2449eb0] [c02f50dc] kunit_try_run_case+0x5c/0xd0
>>> [ 16.963319] [e2449ed0] [c02f63ec] kunit_generic_run_threadfn_adapter+0x24/0x30
>>> [ 16.970574] [e2449ef0] [c004e710] kthread+0x15c/0x174
>>> [ 16.975670] [e2449f30] [c001317c] ret_from_kernel_thread+0x14/0x1c
>>> [ 16.981896] Instruction dump:
>>> [ 16.984879] 8129d608 38e7eb38 81020280 911f004c 39000000 995f0024 907f0028 90ff001c
>>> [ 16.992710] 3949000a 915f0020 3d40c0d1 3d00c085 <8929000a> 3908adb0 812a4b98 3d40c02f
>>> [ 17.000711] ==================================================================
>>> [ 17.008223] # test_invalid_access: EXPECTATION FAILED at mm/kfence/kfence_test.c:636
>>> [ 17.008223] Expected report_matches(&expect) to be true, but is false
>>> [ 17.023243] not ok 21 - test_invalid_access
>>
>> On a fault in test_invalid_access, KFENCE prints the stack trace based
>> on the information in pt_regs. So we do not think there's anything we
>> can do to improve stack printing pe-se.
>
> stack printing, probably not. Would be good anyway to mark the last level [unreliable] as the ppc does.
>
> IIUC, on ppc the address in the stack frame of the caller is written by the caller. In most tests,
> there is some function call being done before the fault, for instance
> test_kmalloc_aligned_oob_read() does a call to kunit_do_assertion which populates the address of the
> call in the stack. However this is fragile.
>
> This works for function calls because in order to call a subfunction, a function has to set up a
> stack frame in order to same the value in the Link Register, which contains the address of the
> function's parent and that will be clobbered by the sub-function call.
>
> However, it cannot be done by exceptions, because exceptions can happen in a function that has no
> stack frame (because that function has no need to call a subfunction and doesn't need to same
> anything on the stack). If the exception handler was writting the caller's address in the stack
> frame, it would in fact write it in the parent's frame, leading to a mess.
>
> But in fact the information is in pt_regs, it is in regs->nip so KFENCE should be able to use that
> instead of the stack.
>
>>
>> What's confusing is that it's only this test, and none of the others.
>> Given that, it might be code-gen related, which results in some subtle
>> issue with stack unwinding. There are a few things to try, if you feel
>> like it:
>>
>> -- Change the unwinder, if it's possible for ppc32.
>
> I don't think it is possible.
I think this actually is the solution.
It seems the good architectures have all added support for
arch_stack_walk(), and we have not.
Looking at some of the implementations of arch_stack_walk() it seems
it's expected that the first entry emitted includes the PC (or NIP on
ppc).
For us stack_trace_save() calls save_stack_trace() which only emits
entries from the stack, which doesn't necessarily include the function
NIP is pointing to.
So I think it's probably on us to update to that new API. Or at least
update our save_stack_trace() to fabricate an entry using the NIP, as it
seems that's what callers expect.
cheers
next prev parent reply other threads:[~2021-03-02 16:07 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-02 8:37 [RFC PATCH v1] powerpc: Enable KFENCE for PPC32 Christophe Leroy
2021-03-02 8:58 ` Marco Elver
2021-03-02 9:05 ` Christophe Leroy
2021-03-02 9:21 ` Alexander Potapenko
2021-03-02 9:27 ` Christophe Leroy
2021-03-02 9:53 ` Marco Elver
2021-03-02 11:21 ` Christophe Leroy
2021-03-02 11:39 ` Marco Elver
2021-03-03 10:38 ` Christophe Leroy
2021-03-03 10:56 ` Marco Elver
2021-03-04 11:23 ` Christophe Leroy
2021-03-04 11:31 ` Marco Elver
2021-03-04 11:48 ` Christophe Leroy
2021-03-04 12:00 ` Christophe Leroy
2021-03-04 12:02 ` Marco Elver
2021-03-04 12:48 ` Marco Elver
2021-03-04 14:08 ` Christophe Leroy
2021-03-04 14:19 ` Marco Elver
2021-03-05 5:01 ` Michael Ellerman
2021-03-05 7:50 ` Marco Elver
2021-03-05 8:23 ` Christophe Leroy
2021-03-05 9:14 ` Marco Elver
2021-03-05 11:49 ` Michael Ellerman
2021-03-05 13:46 ` Marco Elver
2021-03-02 11:40 ` Michael Ellerman [this message]
2021-03-02 18:48 ` Segher Boessenkool
2021-03-03 10:28 ` Christophe Leroy
2021-03-03 10:31 ` Christophe Leroy
2021-03-03 10:39 ` Marco Elver
2021-03-03 10:56 ` Christophe Leroy
[not found] ` <CANpmjNMKEObjf=WyfDQB5vPmR5RuyUMBJyfr6P2ykCd67wyMbA__49537.1361424745$1614767987$gmane$org@mail.gmail.com>
2021-03-03 10:46 ` Andreas Schwab
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87h7ltss18.fsf@mpe.ellerman.id.au \
--to=mpe@ellerman.id.au \
--cc=benh@kernel.crashing.org \
--cc=christophe.leroy@csgroup.eu \
--cc=dvyukov@google.com \
--cc=elver@google.com \
--cc=glider@google.com \
--cc=kasan-dev@googlegroups.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=paulus@samba.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox