From: Yonghong Song <yonghong.song@linux.dev>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>,
bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Kernel Team <kernel-team@fb.com>,
Martin KaFai Lau <martin.lau@kernel.org>
Subject: Re: yet another approach Was: [PATCH bpf-next v3 4/5] bpf, x86: Add jit support for private stack
Date: Fri, 4 Oct 2024 12:27:55 -0700
Message-ID: <a4468429-3b93-49b3-b8e4-122b903c98fb@linux.dev>
In-Reply-To: <d8ff2878-c53b-48d7-b624-93aeb2087113@linux.dev>
On 10/3/24 10:22 PM, Yonghong Song wrote:
>
> On 10/3/24 3:32 PM, Alexei Starovoitov wrote:
>> On Thu, Oct 3, 2024 at 1:44 PM Yonghong Song
>> <yonghong.song@linux.dev> wrote:
>>>> Looks like the idea needs more thought.
>>>>
>>>> in_task_stack() won't recognize the private stack,
>>>> so it will look like stack overflow and double fault.
>>>>
>>>> do you have CONFIG_VMAP_STACK ?
>>> Yes, my above test runs fine with CONFIG_VMAP_STACK. Let me guard
>>> private stack support with
>>> CONFIG_VMAP_STACK for now. Not sure whether distributions enable
>>> CONFIG_VMAP_STACK or not.
>> Good! but I'm surprised it makes a difference.
>
> That was only for the test case I tried. Now I tried the whole bpf selftests
> with CONFIG_VMAP_STACK on. There are still some failures. Some of them
> are due to the stack protector. I disabled the stack protector and then those
> stack protector errors were gone. But some other errors show up like below:
>
> [ 27.186581] kernel tried to execute NX-protected page - exploit
> attempt? (uid: 0)
> [ 27.187480] BUG: unable to handle page fault for address:
> ffff888109572800
> [ 27.188299] #PF: supervisor instruction fetch in kernel mode
> [ 27.189085] #PF: error_code(0x0011) - permissions violation
>
> or
>
> [ 27.736844] BUG: unable to handle page fault for address:
> 0000000080000000
> [ 27.737759] #PF: supervisor instruction fetch in kernel mode
> [ 27.738631] #PF: error_code(0x0010) - not-present page
> [ 27.739455] PGD 0 P4D 0
> [ 27.739818] Oops: Oops: 0010 [#1] PREEMPT SMP PTI
>
> ...
>
> Some further investigations are needed.
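
For reference, the two #PF error codes quoted above can be decoded bit by bit. The following is only a minimal sketch in Python (not kernel code); the table and the helper name decode_pf are illustrative, and the bit names follow the x86 page-fault error code layout as used by the kernel's X86_PF_* flags:

```python
# Low bits of the x86 page-fault error code.
# Names mirror the kernel's X86_PF_* flags; this table is illustrative.
PF_BITS = {
    0: "PROT",   # 0: page not present, 1: protection violation
    1: "WRITE",  # 0: read access, 1: write access
    2: "USER",   # 0: supervisor mode, 1: user mode
    3: "RSVD",   # reserved bit set in a paging structure
    4: "INSTR",  # fault was an instruction fetch
}


def decode_pf(error_code):
    """Return the names of the bits set in a #PF error code."""
    return [name for bit, name in PF_BITS.items() if error_code & (1 << bit)]


for code in (0x0011, 0x0010):
    print(f"{code:#06x}: {decode_pf(code)}")
```

So 0x0011 is an instruction fetch that hit a present but non-executable page (matching the "NX-protected page" message), while 0x0010 is an instruction fetch from a page that is not present.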
I found one failure case (with the stack protector disabled):
[ 20.032611] traps: PANIC: double fault, error_code: 0x0
[ 20.032615] Oops: double fault: 0000 [#1] PREEMPT SMP PTI
[ 20.032619] CPU: 0 UID: 0 PID: 1959 Comm: test_progs Tainted: G OE 6.11.0-10576-g17baa0096769-dirty #1006
[ 20.032623] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[ 20.032624] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 20.032626] RIP: 0010:error_entry+0x17/0x140
[ 20.032633] Code: ff 0f 01 f8 e9 56 fe ff ff 90 90 90 90 90 90 90 90 90 90 56 48 8b 74 24 08 48 89 7c 24 08 52 51 50 41 50 41 51 41 52 49
[ 20.032635] RSP: 0018:ffffe8ffff400000 EFLAGS: 00010093
[ 20.032637] RAX: ffffe8ffff4000a8 RBX: ffffe8ffff4000a8 RCX: ffffffff82201737
[ 20.032639] RDX: 0000000000000000 RSI: ffffffff8220128d RDI: ffffe8ffff4000a8
[ 20.032640] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 20.032641] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 20.032642] R13: 0000000000000000 R14: 000000000002ed80 R15: 0000000000000000
[ 20.032643] FS: 00007f8a3a2006c0(0000) GS:ffff888237c00000(0000) knlGS:ffff888237c00000
[ 20.032645] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 20.032646] CR2: ffffe8ffff3ffff8 CR3: 0000000103580002 CR4: 0000000000370ef0
[ 20.032649] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 20.032650] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 20.032651] Call Trace:
[ 20.032660] <#DF>
[ 20.032664] ? __die_body+0xaf/0xc0
[ 20.032667] ? die+0x2f/0x50
[ 20.032670] ? exc_double_fault+0xbf/0xd0
[ 20.032674] ? asm_exc_double_fault+0x23/0x30
[ 20.032678] ? restore_regs_and_return_to_kernel+0x1b/0x1b
[ 20.032681] ? asm_exc_page_fault+0xd/0x30
[ 20.032684] ? error_entry+0x17/0x140
[ 20.032687] </#DF>
The private stack for cpu 0:
  priv_stack_ptr cpu 0 = [ffffe8ffff434000, ffffe8ffff438000] (total 16KB)
That is, the top of the stack is ffffe8ffff438000 and the bottom of the
stack is ffffe8ffff434000.
During bpf execution, a softirq may happen. At that point, the
stack pointer becomes:
  RSP: 0018:ffffe8ffff400000 (see above)
and there is a read/write (most likely a write) to address
  CR2: ffffe8ffff3ffff8
which is 8 bytes below RSP. This may cause a fault.
After this fault, there are some further accesses and, probably because
of the invalid stack, a double fault happens.
So the question is: why is RSP reset to ffffe8ffff400000?
I have not figured out which code changes it. Maybe somebody can help?
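
To make the address relationships explicit, here is a small sketch of the arithmetic in Python (not kernel code). The addresses are copied from the oops and the priv_stack_ptr line above; the variable names and the helper in_priv_stack are only illustrative, and the page-boundary check assumes 4KB pages:

```python
# Values taken from the report above.
PRIV_STACK_LO = 0xffffe8ffff434000  # bottom of the cpu 0 private stack
PRIV_STACK_HI = 0xffffe8ffff438000  # top of the cpu 0 private stack
RSP = 0xffffe8ffff400000            # stack pointer at the double fault
CR2 = 0xffffe8ffff3ffff8            # faulting address

PAGE_SIZE = 4096                    # assumption: 4KB pages


def in_priv_stack(addr):
    """True if addr lies within [PRIV_STACK_LO, PRIV_STACK_HI]."""
    return PRIV_STACK_LO <= addr <= PRIV_STACK_HI


stack_size = PRIV_STACK_HI - PRIV_STACK_LO  # size of the private stack
rsp_gap = PRIV_STACK_LO - RSP               # how far RSP is below the bottom
fault_offset = RSP - CR2                    # distance of the access below RSP

print(f"private stack size: {stack_size // 1024} KB")
print(f"RSP inside private stack: {in_priv_stack(RSP)}")
print(f"RSP below stack bottom by: {rsp_gap:#x} bytes ({rsp_gap // 1024} KB)")
print(f"CR2 below RSP by: {fault_offset} bytes")
print(f"RSP page aligned: {RSP % PAGE_SIZE == 0}")
```

With these numbers, RSP is not inside the private stack at all: it is 0x34000 bytes (208KB) below the bottom, and it sits exactly on a page boundary, so an 8-byte access just below it lands on the preceding page.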
>
>> Please still root cause the crash without VMAP_STACK.
>
> Sure. Let me investigate cases with VMAP_STACK first and
> then will try to look at it without VMAP_STACK.
>
>>
>> We need to do a lot more homework here before proceeding.
>> Look at arch/x86/kernel/dumpstack_64.c
>> At least we need new stack_type for priv stack.
>> stack_type_unknown doesn't inspire confidence.
>> Need to make sure stack trace is still reliable with priv stack.
>> Though it may look appealing from performance pov.
>> We may need to go back to r9 approach with push/pop around calls,
>> since that is surely keeping unwinder happy
>> while this approach will have to teach unwinder.
>
> Good point.
>
>