BPF List
 help / color / mirror / Atom feed
From: Hou Tao <houtao@huaweicloud.com>
To: Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	Yan Zhai <yan@cloudflare.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Ignat Korchagin <ignat@cloudflare.com>, bpf <bpf@vger.kernel.org>,
	kernel-team <kernel-team@cloudflare.com>,
	Jakub Sitnicki <jakub@cloudflare.com>
Subject: Re: Page faults in tracepoint caused by aliased pointer
Date: Thu, 22 Feb 2024 17:27:01 +0800	[thread overview]
Message-ID: <acfd5d49-7727-e07e-4cc1-76decda3b0e5@huaweicloud.com> (raw)
In-Reply-To: <CAP01T77KHwS8bmcXfYXn1OmdAXdrSz_sXooUZ5jAa7vSk=HmnQ@mail.gmail.com>

Hi,

On 2/13/2024 8:33 AM, Kumar Kartikeya Dwivedi wrote:
> On Tue, 13 Feb 2024 at 01:21, Yan Zhai <yan@cloudflare.com> wrote:
>> On Mon, Feb 12, 2024 at 5:52 PM Alexei Starovoitov
>> <alexei.starovoitov@gmail.com> wrote:
>>> On Mon, Feb 12, 2024 at 3:42 PM Kumar Kartikeya Dwivedi
>>> <memxor@gmail.com> wrote:
>>>> On Tue, 13 Feb 2024 at 00:34, Alexei Starovoitov
>>>> <alexei.starovoitov@gmail.com> wrote:
>>>>> On Mon, Feb 12, 2024 at 3:16 PM Ignat Korchagin <ignat@cloudflare.com> wrote:
>>>>>> [288931.217143][T109754] CPU: 4 PID: 109754 Comm: bpftrace Not tainted
>>>>>> 6.6.16+ #10
>>>>> ...
>>>>>> [288931.217143][T109754]  ? copy_from_kernel_nofault+0x1d/0xe0
>>>>>> [288931.217143][T109754]  bpf_probe_read_compat+0x6a/0x90
>>>>>>
>>>>>> And Jakub CCed here did it for 6.8.0-rc2+
>>>>> I suspect something is broken in your kernels.
>>>>> Above is doing generic copy_from_kernel_nofault(),
>>>>> so one should be able to crash the kernel without any bpf.
>>>>>
>>>>> We have this in selftests/bpf:
>>>>> __weak noinline struct file *bpf_testmod_return_ptr(int arg)
>>>>> {
>>>>>         static struct file f = {};
>>>>>
>>>>>         switch (arg) {
>>>>>         case 1: return (void *)EINVAL;          /* user addr */
>>>>>         case 2: return (void *)0xcafe4a11;      /* user addr */
>>>>>         case 3: return (void *)-EINVAL;         /* canonical, but invalid */
>>>>>         case 4: return (void *)(1ull << 60);    /* non-canonical and invalid */
>>>>>         case 5: return (void *)~(1ull << 30);   /* trigger extable */
>>>>>         case 6: return &f;                      /* valid addr */
>>>>>         case 7: return (void *)((long)&f | 1);  /* kernel tricks */
>>>>>         default: return NULL;
>>>>>         }
>>>>> }
>>>>> where we check that extables setup by JIT for bpf progs are working correctly.
>>>>> You should see the kernel crashing when you just run bpf selftests.
>>>> I agree, this appears unrelated to BPF since it is happening when
>>>> using copy_from_kernel_nofault (which should be jumping to the Efault
>>>> label instead of the oops), but I think it's not specific to some
>>>> custom kernel. I can reproduce it on my dev machine on top of bpf-next
>>>> as well, and another machine with Ubuntu's generic 6.5 kernel for
>>>> 24.04. And I think Ignat tried it on the mainline 6.8-rc2 as well.
>> copy_from_kernel_nofault is called in Jakub's reproducer, but the
>> panic case in our production seems to be direct memory accessing
>> according to bpftool dumped jited code. Will faults from such
>> instructions also be caught correctly?
>>
> Yep, since faults in both cases end up in the page fault handler.
> Once the fix pointed out by Alexei is applied, it should address both scenarios.

I didn't get the idea on how the vsyscall patch [1] will fix the
unhandled page fault caused by BTF pointer dereference. In my
understanding, for BTF pointer dereference, x86 JIT checks whether the
address is a kernel space address or not. If it is the kernel space
address, it will setup an exception fix-up entry for its dereference and
will try to do dereference directly. If the address is vsyscall address,
x86 JIT will consider it as kernel space address and will try to
dereference it directly. The dereference of vsyscall page in kernel will
trigger the page fault, handle_page_fault() will be invoked and it will
invoke do_user_addr_fault() and page_fault_oops() accordingly.

[1]:
https://patchwork.kernel.org/project/netdevbpf/patch/20240202103935.3154011-3-houtao@huaweicloud.com/

>
>> Yan
>>
>>> Then it must be vsyscall address that this series are fixing:
>>> https://patchwork.kernel.org/project/netdevbpf/patch/20240202103935.3154011-3-houtao@huaweicloud.com/
>>>
>>> We're still waiting on x86 maintainers to ack them.
> .


      parent reply	other threads:[~2024-02-22  9:27 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-12 22:14 Page faults in tracepoint caused by aliased pointer Yan Zhai
2024-02-12 22:55 ` Kumar Kartikeya Dwivedi
2024-02-12 23:15   ` Ignat Korchagin
2024-02-12 23:27     ` Yan Zhai
2024-02-12 23:33     ` Alexei Starovoitov
2024-02-12 23:41       ` Kumar Kartikeya Dwivedi
2024-02-12 23:52         ` Alexei Starovoitov
2024-02-13  0:21           ` Yan Zhai
2024-02-13  0:33             ` Kumar Kartikeya Dwivedi
2024-02-21 17:47               ` Ignat Korchagin
2024-02-22  9:27               ` Hou Tao [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=acfd5d49-7727-e07e-4cc1-76decda3b0e5@huaweicloud.com \
    --to=houtao@huaweicloud.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=ignat@cloudflare.com \
    --cc=jakub@cloudflare.com \
    --cc=kernel-team@cloudflare.com \
    --cc=memxor@gmail.com \
    --cc=yan@cloudflare.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox