All of lore.kernel.org
 help / color / mirror / Atom feed
From: Puranjay Mohan <puranjay12@gmail.com>
To: syzbot <syzbot+186522670e6722692d86@syzkaller.appspotmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
	"Russell King (Oracle)" <linux@armlinux.org.uk>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	syzkaller-bugs <syzkaller-bugs@googlegroups.com>,
	bpf <bpf@vger.kernel.org>
Subject: Re: [syzbot] [mm?] BUG: unable to handle kernel paging request in copy_from_kernel_nofault (2)
Date: Tue, 09 Apr 2024 10:03:01 +0000	[thread overview]
Message-ID: <mb61pa5m2yht6.fsf@gmail.com> (raw)
In-Reply-To: <ZhT5NRYnceirmAGz@shell.armlinux.org.uk>

"Russell King (Oracle)" <linux@armlinux.org.uk> writes:

> On Tue, Apr 09, 2024 at 07:45:54AM +0000, Puranjay Mohan wrote:
>> "Russell King (Oracle)" <linux@armlinux.org.uk> writes:
>> 
>> > On Fri, Apr 05, 2024 at 10:50:30AM -0700, Andrii Nakryiko wrote:
>> >> On Fri, Apr 5, 2024 at 9:30 AM Alexei Starovoitov
>> >> <alexei.starovoitov@gmail.com> wrote:
>> >> >
>> >> > On Fri, Apr 5, 2024 at 4:36 AM Russell King (Oracle)
>> >> > <linux@armlinux.org.uk> wrote:
>> >> > >
>> >> > > On Fri, Apr 05, 2024 at 12:02:36PM +0100, Mark Rutland wrote:
>> >> > > > On Thu, Apr 04, 2024 at 03:57:04PM -0700, Alexei Starovoitov wrote:
>> >> > > > > On Wed, Apr 3, 2024 at 6:56 PM Andrew Morton <akpm@linux-foundationorg> wrote:
>> >> > > > > >
>> >> > > > > > On Mon, 01 Apr 2024 22:19:25 -0700 syzbot <syzbot+186522670e6722692d86@syzkaller.appspotmail.com> wrote:
>> >> > > > > >
>> >> > > > > > > Hello,
>> >> > > > > >
>> >> > > > > > Thanks.  Cc: bpf@vger.kernel.org
>> >> > > > >
>> >> > > > > I suspect the issue is not on bpf side.
>> >> > > > > Looks like the bug is somewhere in arm32 bits.
>> >> > > > > copy_from_kernel_nofault() is called from lots of places.
>> >> > > > > bpf is just one user that is easy for syzbot to fuzz.
>> >> > > > > Interestingly arm defines copy_from_kernel_nofault_allowed()
>> >> > > > > that should have filtered out user addresses.
>> >> > > > > In this case ffffffe9 is probably a kernel address?
>> >> > > >
>> >> > > > It's at the end of the kernel range, and it's ERR_PTR(-EINVAL).
>> >> > > >
>> >> > > > 0xffffffe9 is -0x16, which is -22, which is -EINVAL.
>> >> > > >
>> >> > > > > But the kernel is doing a write?
>> >> > > > > Which makes no sense, since copy_from_kernel_nofault is probe reading.
>> >> > > >
>> >> > > > It makes perfect sense; the read from 'src' happened, then the kernel tries to
>> >> > > > write the result to 'dst', and that aligns with the disassembly in the report
>> >> > > > below, which I beleive is:
>> >> > > >
>> >> > > >      8: e4942000        ldr     r2, [r4], #0  <-- Read of 'src', fault fixup is elsewhere
>> >> > > >      c: e3530000        cmp     r3, #0
>> >> > > >   * 10: e5852000        str     r2, [r5]      <-- Write to 'dst'
>> >> > > >
>> >> > > > As above, it looks like 'dst' is ERR_PTR(-EINVAL).
>> >> > > >
>> >> > > > Are you certain that BPF is passing a sane value for 'dst'? Where does that
>> >> > > > come from in the first place?
>> >> > >
>> >> > > It looks to me like it gets passed in from the BPF program, and the
>> >> > > "type" for the argument is set to ARG_PTR_TO_UNINIT_MEM. What that
>> >> > > means for validation purposes, I've no idea, I'm not a BPF hacker.
>> >> > >
>> >> > > Obviously, if BPF is allowing copy_from_kernel_nofault() to be passed
>> >> > > an arbitary destination address, that would be a huge security hole.
>> >> >
>> >> > If that's the case that's indeed a giant security hole,
>> >> > but I doubt it. We would be crashing other archs as well.
>> >> > I cannot really tell whether arm32 JIT is on.
>> >> > If it is, it's likely a bug there.
>> >> > Puranjay,
>> >> > could you please take a look.
>> >> >
>> >> 
>> >> I dumped the BPF program that repro.c is loading, it works on x86-64
>> >> and there is nothing special there. We are probe-reading 5 bytes from
>> >> somewhere into the stack. Everything is unaligned here, but stays
>> >> within a well-defined memory slot.
>> >> 
>> >> Note the r3 = (s8)r1, that's a new-ish thing, maybe bug is somewhere
>> >> there (but then it would be JIT, not verifier itself)
>> >> 
>> >>    0: (7a) *(u64 *)(r10 -8) = 896542069
>> >>    1: (bf) r1 = r10
>> >>    2: (07) r1 += -7
>> >>    3: (b7) r2 = 5
>> >>    4: (bf) r3 = (s8)r1
>> >>    5: (85) call bpf_probe_read_kernel#-72390
>> >
>> 
>> I have started looking into this, the issue only reproduces when the JIT
>> is enabled. With the interpreter, it works fine.
>> 
>> I used GDB to dump the JITed BPF program:
>> 
>>    0xbf00012c:  push    {r4, r5, r6, r7, r8, r9, r11, lr}
>>    0xbf000130:  mov     r11, sp
>>    0xbf000134:  mov     r3, #0
>>    0xbf000138:  sub     r2, sp, #80     @ 0x50
>>    0xbf00013c:  sub     sp, sp, #88     @ 0x58
>>    0xbf000140:  strd    r2, [r11, #-64] @ 0xffffffc0
>>    0xbf000144:  mov     r2, #0
>>    0xbf000148:  strd    r2, [r11, #-72] @ 0xffffffb8
>>    0xbf00014c:  mov     r2, r0
>>    0xbf000150:  movw    r8, #9589       @ 0x2575
>>    0xbf000154:  movt    r8, #13680      @ 0x3570
>>    0xbf000158:  mov     r9, #0
>>    0xbf00015c:  ldr     r6, [r11, #-64] @ 0xffffffc0
>>    0xbf000160:  str     r8, [r6, #-8]
>>    0xbf000164:  str     r9, [r6, #-4]
>>    0xbf000168:  ldrd    r2, [r11, #-64] @ 0xffffffc0
>>    0xbf00016c:  movw    r8, #65529      @ 0xfff9
>>    0xbf000170:  movt    r8, #65535      @ 0xffff
>>    0xbf000174:  movw    r9, #65535      @ 0xffff
>>    0xbf000178:  movt    r9, #65535      @ 0xffff
>>    0xbf00017c:  adds    r2, r2, r8
>>    0xbf000180:  adc     r3, r3, r9
>>    0xbf000184:  mov     r6, #5
>>    0xbf000188:  mov     r7, #0
>>    0xbf00018c:  strd    r6, [r11, #-8]
>>    0xbf000190:  ldrd    r6, [r11, #-16]
>
> Up to this point, it looks correct. r2/r3 contain the stack pointer
> which corresponds to the instruction at "2:"
>
>>    0xbf000194:  lsl     r2, r2, #24
>>    0xbf000198:  asr     r2, r2, #24
>>    0xbf00019c:  str     r2, [r11, #-16]
>
> This then narrows the 64-bit pointer down to just 8!!! bits, but this
> is what the instruction at "4:" is asking for. However, it looks like
> it's happening to BPF's "r1" rather than "r3" and this is probably
> where the problem lies.
>
> I haven't got time to analyse this further this morning - I'm only
> around sporadically today. I'll try to look deeper at this later on.
>
> -- 
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

I found the problem. The implementation of Sign extended move is broken,
it clobbers the source register. I have sent a patch to fix it and also
fixed another issue that I saw:
https://lore.kernel.org/bpf/20240409095038.26356-1-puranjay@kernel.org/

I have manually tested with the reproducer but let's try to rerun the
reproducer through syzbot:

#syz test: https://github.com/puranjaymohan/linux.git arm32_movsx_fix

WARNING: multiple messages have this Message-ID (diff)
From: Puranjay Mohan <puranjay12@gmail.com>
To: syzbot <syzbot+186522670e6722692d86@syzkaller.appspotmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
	"Russell King (Oracle)" <linux@armlinux.org.uk>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	syzkaller-bugs <syzkaller-bugs@googlegroups.com>,
	bpf <bpf@vger.kernel.org>
Subject: Re: [syzbot] [mm?] BUG: unable to handle kernel paging request in copy_from_kernel_nofault (2)
Date: Tue, 09 Apr 2024 10:03:01 +0000	[thread overview]
Message-ID: <mb61pa5m2yht6.fsf@gmail.com> (raw)
In-Reply-To: <ZhT5NRYnceirmAGz@shell.armlinux.org.uk>

"Russell King (Oracle)" <linux@armlinux.org.uk> writes:

> On Tue, Apr 09, 2024 at 07:45:54AM +0000, Puranjay Mohan wrote:
>> "Russell King (Oracle)" <linux@armlinux.org.uk> writes:
>> 
>> > On Fri, Apr 05, 2024 at 10:50:30AM -0700, Andrii Nakryiko wrote:
>> >> On Fri, Apr 5, 2024 at 9:30 AM Alexei Starovoitov
>> >> <alexei.starovoitov@gmail.com> wrote:
>> >> >
>> >> > On Fri, Apr 5, 2024 at 4:36 AM Russell King (Oracle)
>> >> > <linux@armlinux.org.uk> wrote:
>> >> > >
>> >> > > On Fri, Apr 05, 2024 at 12:02:36PM +0100, Mark Rutland wrote:
>> >> > > > On Thu, Apr 04, 2024 at 03:57:04PM -0700, Alexei Starovoitov wrote:
>> >> > > > > On Wed, Apr 3, 2024 at 6:56 PM Andrew Morton <akpm@linux-foundationorg> wrote:
>> >> > > > > >
>> >> > > > > > On Mon, 01 Apr 2024 22:19:25 -0700 syzbot <syzbot+186522670e6722692d86@syzkaller.appspotmail.com> wrote:
>> >> > > > > >
>> >> > > > > > > Hello,
>> >> > > > > >
>> >> > > > > > Thanks.  Cc: bpf@vger.kernel.org
>> >> > > > >
>> >> > > > > I suspect the issue is not on bpf side.
>> >> > > > > Looks like the bug is somewhere in arm32 bits.
>> >> > > > > copy_from_kernel_nofault() is called from lots of places.
>> >> > > > > bpf is just one user that is easy for syzbot to fuzz.
>> >> > > > > Interestingly arm defines copy_from_kernel_nofault_allowed()
>> >> > > > > that should have filtered out user addresses.
>> >> > > > > In this case ffffffe9 is probably a kernel address?
>> >> > > >
>> >> > > > It's at the end of the kernel range, and it's ERR_PTR(-EINVAL).
>> >> > > >
>> >> > > > 0xffffffe9 is -0x16, which is -22, which is -EINVAL.
>> >> > > >
>> >> > > > > But the kernel is doing a write?
>> >> > > > > Which makes no sense, since copy_from_kernel_nofault is probe reading.
>> >> > > >
>> >> > > > It makes perfect sense; the read from 'src' happened, then the kernel tries to
>> >> > > > write the result to 'dst', and that aligns with the disassembly in the report
>> >> > > > below, which I beleive is:
>> >> > > >
>> >> > > >      8: e4942000        ldr     r2, [r4], #0  <-- Read of 'src', fault fixup is elsewhere
>> >> > > >      c: e3530000        cmp     r3, #0
>> >> > > >   * 10: e5852000        str     r2, [r5]      <-- Write to 'dst'
>> >> > > >
>> >> > > > As above, it looks like 'dst' is ERR_PTR(-EINVAL).
>> >> > > >
>> >> > > > Are you certain that BPF is passing a sane value for 'dst'? Where does that
>> >> > > > come from in the first place?
>> >> > >
>> >> > > It looks to me like it gets passed in from the BPF program, and the
>> >> > > "type" for the argument is set to ARG_PTR_TO_UNINIT_MEM. What that
>> >> > > means for validation purposes, I've no idea, I'm not a BPF hacker.
>> >> > >
>> >> > > Obviously, if BPF is allowing copy_from_kernel_nofault() to be passed
>> >> > > an arbitary destination address, that would be a huge security hole.
>> >> >
>> >> > If that's the case that's indeed a giant security hole,
>> >> > but I doubt it. We would be crashing other archs as well.
>> >> > I cannot really tell whether arm32 JIT is on.
>> >> > If it is, it's likely a bug there.
>> >> > Puranjay,
>> >> > could you please take a look.
>> >> >
>> >> 
>> >> I dumped the BPF program that repro.c is loading, it works on x86-64
>> >> and there is nothing special there. We are probe-reading 5 bytes from
>> >> somewhere into the stack. Everything is unaligned here, but stays
>> >> within a well-defined memory slot.
>> >> 
>> >> Note the r3 = (s8)r1, that's a new-ish thing, maybe bug is somewhere
>> >> there (but then it would be JIT, not verifier itself)
>> >> 
>> >>    0: (7a) *(u64 *)(r10 -8) = 896542069
>> >>    1: (bf) r1 = r10
>> >>    2: (07) r1 += -7
>> >>    3: (b7) r2 = 5
>> >>    4: (bf) r3 = (s8)r1
>> >>    5: (85) call bpf_probe_read_kernel#-72390
>> >
>> 
>> I have started looking into this, the issue only reproduces when the JIT
>> is enabled. With the interpreter, it works fine.
>> 
>> I used GDB to dump the JITed BPF program:
>> 
>>    0xbf00012c:  push    {r4, r5, r6, r7, r8, r9, r11, lr}
>>    0xbf000130:  mov     r11, sp
>>    0xbf000134:  mov     r3, #0
>>    0xbf000138:  sub     r2, sp, #80     @ 0x50
>>    0xbf00013c:  sub     sp, sp, #88     @ 0x58
>>    0xbf000140:  strd    r2, [r11, #-64] @ 0xffffffc0
>>    0xbf000144:  mov     r2, #0
>>    0xbf000148:  strd    r2, [r11, #-72] @ 0xffffffb8
>>    0xbf00014c:  mov     r2, r0
>>    0xbf000150:  movw    r8, #9589       @ 0x2575
>>    0xbf000154:  movt    r8, #13680      @ 0x3570
>>    0xbf000158:  mov     r9, #0
>>    0xbf00015c:  ldr     r6, [r11, #-64] @ 0xffffffc0
>>    0xbf000160:  str     r8, [r6, #-8]
>>    0xbf000164:  str     r9, [r6, #-4]
>>    0xbf000168:  ldrd    r2, [r11, #-64] @ 0xffffffc0
>>    0xbf00016c:  movw    r8, #65529      @ 0xfff9
>>    0xbf000170:  movt    r8, #65535      @ 0xffff
>>    0xbf000174:  movw    r9, #65535      @ 0xffff
>>    0xbf000178:  movt    r9, #65535      @ 0xffff
>>    0xbf00017c:  adds    r2, r2, r8
>>    0xbf000180:  adc     r3, r3, r9
>>    0xbf000184:  mov     r6, #5
>>    0xbf000188:  mov     r7, #0
>>    0xbf00018c:  strd    r6, [r11, #-8]
>>    0xbf000190:  ldrd    r6, [r11, #-16]
>
> Up to this point, it looks correct. r2/r3 contain the stack pointer
> which corresponds to the instruction at "2:"
>
>>    0xbf000194:  lsl     r2, r2, #24
>>    0xbf000198:  asr     r2, r2, #24
>>    0xbf00019c:  str     r2, [r11, #-16]
>
> This then narrows the 64-bit pointer down to just 8!!! bits, but this
> is what the instruction at "4:" is asking for. However, it looks like
> it's happening to BPF's "r1" rather than "r3" and this is probably
> where the problem lies.
>
> I haven't got time to analyse this further this morning - I'm only
> around sporadically today. I'll try to look deeper at this later on.
>
> -- 
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

I found the problem. The implementation of Sign extended move is broken,
it clobbers the source register. I have sent a patch to fix it and also
fixed another issue that I saw:
https://lore.kernel.org/bpf/20240409095038.26356-1-puranjay@kernel.org/

I have manually tested with the reproducer but let's try to rerun the
reproducer through syzbot:

#syz test: https://github.com/puranjaymohan/linux.git arm32_movsx_fix

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2024-04-09 10:03 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-02  5:19 [syzbot] [mm?] BUG: unable to handle kernel paging request in copy_from_kernel_nofault (2) syzbot
2024-04-04  1:41 ` Andrew Morton
2024-04-04 22:57   ` Alexei Starovoitov
2024-04-04 22:57     ` Alexei Starovoitov
2024-04-05 11:02     ` Mark Rutland
2024-04-05 11:02       ` Mark Rutland
2024-04-05 11:35       ` Russell King (Oracle)
2024-04-05 11:35         ` Russell King (Oracle)
2024-04-05 16:12         ` Alexei Starovoitov
2024-04-05 16:12           ` Alexei Starovoitov
2024-04-05 17:50           ` Andrii Nakryiko
2024-04-05 17:50             ` Andrii Nakryiko
2024-04-05 18:19             ` Russell King (Oracle)
2024-04-05 18:19               ` Russell King (Oracle)
2024-04-09  7:45               ` Puranjay Mohan
2024-04-09  7:45                 ` Puranjay Mohan
2024-04-09  8:15                 ` Russell King (Oracle)
2024-04-09  8:15                   ` Russell King (Oracle)
2024-04-09 10:03                   ` Puranjay Mohan [this message]
2024-04-09 10:03                     ` Puranjay Mohan
2024-04-09 10:26                     ` syzbot
2024-04-09 10:26                       ` syzbot
2024-04-09 11:07                       ` Puranjay Mohan
2024-04-09 11:07                         ` Puranjay Mohan
2024-04-09 11:23                         ` syzbot
2024-04-09 11:23                           ` syzbot
2024-04-09 14:18                           ` Puranjay Mohan
2024-04-09 14:18                             ` Puranjay Mohan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mb61pa5m2yht6.fsf@gmail.com \
    --to=puranjay12@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux@armlinux.org.uk \
    --cc=mark.rutland@arm.com \
    --cc=syzbot+186522670e6722692d86@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.