Re: BUG: bad usercopy in memdup

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* Re: BUG: bad usercopy in memdup_user
       [not found] <001a113e9ca8a3affd05609d7ccf@google.com>
@ 2017-12-18 14:22 ` Tetsuo Handa
  2017-12-19  0:57   ` Kees Cook
  0 siblings, 1 reply; 25+ messages in thread
From: Tetsuo Handa @ 2017-12-18 14:22 UTC (permalink / raw)
  To: linux-mm
  Cc: syzbot, dave, keescook, keun-o.park, labbott, linux-kernel,
	mark.rutland, mingo, syzkaller-bugs, will.deacon

On 2017/12/18 22:40, syzbot wrote:
> Hello,
> 
> syzkaller hit the following crash on 6084b576dca2e898f5c101baef151f7bfdbb606d
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> 
> Unfortunately, I don't have any reproducer for this bug yet.
> 
> 

This BUG is reporting

[   26.089789] usercopy: kernel memory overwrite attempt detected to 0000000022a5b430 (kmalloc-1024) (1024 bytes)

line. But isn't 0000000022a5b430 strange for kmalloc(1024, GFP_KERNEL)ed kernel address?

> netlink: 1 bytes leftover after parsing attributes in process `syz-executor5'.
> ------------[ cut here ]------------
> kernel BUG at mm/usercopy.c:84!
> invalid opcode: 0000 [#1] SMP
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 0 PID: 3943 Comm: syz-executor0 Not tainted 4.15.0-rc3-next-20171214+ #67
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:report_usercopy mm/usercopy.c:76 [inline]
> RIP: 0010:__check_object_size+0x1e2/0x250 mm/usercopy.c:276
> RSP: 0018:ffffc90000d6fca8 EFLAGS: 00010292
> RAX: 0000000000000062 RBX: ffffffff82e57be7 RCX: ffffffff8123dede
> RDX: 0000000000006340 RSI: ffffc900036c9000 RDI: ffff88021fc136f8
> RBP: ffffc90000d6fce0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801e076dc50
> R13: 0000000000000400 R14: 0000000000000000 R15: ffffffff82edf8a5
> FS:  00007fe747e20700(0000) GS:ffff88021fc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000000002000bffa CR3: 00000001dbd02006 CR4: 00000000001626f0
> DR0: 0000000020000000 DR1: 0000000000010000 DR2: 000000000000f004
> DR3: 0000000000000000 DR6: 00000000fffe0ff3 DR7: 0000000000000600
> Call Trace:
>  check_object_size include/linux/thread_info.h:112 [inline]
>  check_copy_size include/linux/thread_info.h:143 [inline]
>  copy_from_user include/linux/uaccess.h:146 [inline]
>  memdup_user+0x46/0x90 mm/util.c:168
>  kvm_arch_vcpu_ioctl+0xc85/0x1810 arch/x86/kvm/x86.c:3499
>  kvm_vcpu_ioctl+0xf3/0x820 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2715
>  vfs_ioctl fs/ioctl.c:46 [inline]
>  do_vfs_ioctl+0xaf/0x840 fs/ioctl.c:686
>  SYSC_ioctl fs/ioctl.c:701 [inline]
>  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
>  entry_SYSCALL_64_fastpath+0x1f/0x96
> RIP: 0033:0x452a09
> RSP: 002b:00007fe747e1fc58 EFLAGS: 00000212 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 000000000071bea0 RCX: 0000000000452a09
> RDX: 0000000020ebec00 RSI: 000000004400ae8f RDI: 0000000000000019
> RBP: 000000000000023b R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000212 R12: 00000000006f0628
> R13: 00000000ffffffff R14: 00007fe747e206d4 R15: 0000000000000000
> Code: 7b e5 82 48 0f 44 da e8 8d 82 eb ff 48 8b 45 d0 4d 89 e9 4c 89 e1 4c 89 fa 48 89 de 48 c7 c7 a8 51 e6 82 49 89 c0 e8 76 b7 e3 ff <0f> 0b 48 c7 c0 43 51 e6 82 eb a1 48 c7 c0 53 51 e6 82 eb 98 48
> RIP: report_usercopy mm/usercopy.c:76 [inline] RSP: ffffc90000d6fca8
> RIP: __check_object_size+0x1e2/0x250 mm/usercopy.c:276 RSP: ffffc90000d6fca8
> ---[ end trace 189465b430781fff ]---
> Kernel panic - not syncing: Fatal exception
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Kernel Offset: disabled
> Rebooting in 86400 seconds..

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-18 14:22 ` BUG: bad usercopy in memdup_user Tetsuo Handa
@ 2017-12-19  0:57   ` Kees Cook
  2017-12-19  8:12     ` Dmitry Vyukov
  0 siblings, 1 reply; 25+ messages in thread
From: Kees Cook @ 2017-12-19  0:57 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Linux-MM, syzbot, David Windsor, keun-o.park, Laura Abbott, LKML,
	Mark Rutland, Ingo Molnar, syzkaller-bugs, Will Deacon

On Mon, Dec 18, 2017 at 6:22 AM, Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
> On 2017/12/18 22:40, syzbot wrote:
>> Hello,
>>
>> syzkaller hit the following crash on 6084b576dca2e898f5c101baef151f7bfdbb606d
>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>>
>> Unfortunately, I don't have any reproducer for this bug yet.
>>
>>
>
> This BUG is reporting
>
> [   26.089789] usercopy: kernel memory overwrite attempt detected to 0000000022a5b430 (kmalloc-1024) (1024 bytes)
>
> line. But isn't 0000000022a5b430 strange for kmalloc(1024, GFP_KERNEL)ed kernel address?

The address is hashed (see the %p threads for 4.15).

There's something strange going on with the slab allocator (I haven't
tracked it down yet, unfortunately).

-Kees

>
>> netlink: 1 bytes leftover after parsing attributes in process `syz-executor5'.
>> ------------[ cut here ]------------
>> kernel BUG at mm/usercopy.c:84!
>> invalid opcode: 0000 [#1] SMP
>> Dumping ftrace buffer:
>>    (ftrace buffer empty)
>> Modules linked in:
>> CPU: 0 PID: 3943 Comm: syz-executor0 Not tainted 4.15.0-rc3-next-20171214+ #67
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>> RIP: 0010:report_usercopy mm/usercopy.c:76 [inline]
>> RIP: 0010:__check_object_size+0x1e2/0x250 mm/usercopy.c:276
>> RSP: 0018:ffffc90000d6fca8 EFLAGS: 00010292
>> RAX: 0000000000000062 RBX: ffffffff82e57be7 RCX: ffffffff8123dede
>> RDX: 0000000000006340 RSI: ffffc900036c9000 RDI: ffff88021fc136f8
>> RBP: ffffc90000d6fce0 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801e076dc50
>> R13: 0000000000000400 R14: 0000000000000000 R15: ffffffff82edf8a5
>> FS:  00007fe747e20700(0000) GS:ffff88021fc00000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 000000002000bffa CR3: 00000001dbd02006 CR4: 00000000001626f0
>> DR0: 0000000020000000 DR1: 0000000000010000 DR2: 000000000000f004
>> DR3: 0000000000000000 DR6: 00000000fffe0ff3 DR7: 0000000000000600
>> Call Trace:
>>  check_object_size include/linux/thread_info.h:112 [inline]
>>  check_copy_size include/linux/thread_info.h:143 [inline]
>>  copy_from_user include/linux/uaccess.h:146 [inline]
>>  memdup_user+0x46/0x90 mm/util.c:168
>>  kvm_arch_vcpu_ioctl+0xc85/0x1810 arch/x86/kvm/x86.c:3499
>>  kvm_vcpu_ioctl+0xf3/0x820 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2715
>>  vfs_ioctl fs/ioctl.c:46 [inline]
>>  do_vfs_ioctl+0xaf/0x840 fs/ioctl.c:686
>>  SYSC_ioctl fs/ioctl.c:701 [inline]
>>  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
>>  entry_SYSCALL_64_fastpath+0x1f/0x96
>> RIP: 0033:0x452a09
>> RSP: 002b:00007fe747e1fc58 EFLAGS: 00000212 ORIG_RAX: 0000000000000010
>> RAX: ffffffffffffffda RBX: 000000000071bea0 RCX: 0000000000452a09
>> RDX: 0000000020ebec00 RSI: 000000004400ae8f RDI: 0000000000000019
>> RBP: 000000000000023b R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000212 R12: 00000000006f0628
>> R13: 00000000ffffffff R14: 00007fe747e206d4 R15: 0000000000000000
>> Code: 7b e5 82 48 0f 44 da e8 8d 82 eb ff 48 8b 45 d0 4d 89 e9 4c 89 e1 4c 89 fa 48 89 de 48 c7 c7 a8 51 e6 82 49 89 c0 e8 76 b7 e3 ff <0f> 0b 48 c7 c0 43 51 e6 82 eb a1 48 c7 c0 53 51 e6 82 eb 98 48
>> RIP: report_usercopy mm/usercopy.c:76 [inline] RSP: ffffc90000d6fca8
>> RIP: __check_object_size+0x1e2/0x250 mm/usercopy.c:276 RSP: ffffc90000d6fca8
>> ---[ end trace 189465b430781fff ]---
>> Kernel panic - not syncing: Fatal exception
>> Dumping ftrace buffer:
>>    (ftrace buffer empty)
>> Kernel Offset: disabled
>> Rebooting in 86400 seconds..



-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-19  0:57   ` Kees Cook
@ 2017-12-19  8:12     ` Dmitry Vyukov
  2017-12-19  8:37       ` Tobin C. Harding
  0 siblings, 1 reply; 25+ messages in thread
From: Dmitry Vyukov @ 2017-12-19  8:12 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tetsuo Handa, Linux-MM, syzbot, David Windsor, keun-o.park,
	Laura Abbott, LKML, Mark Rutland, Ingo Molnar, syzkaller-bugs,
	Will Deacon, me

On Tue, Dec 19, 2017 at 1:57 AM, Kees Cook <keescook@chromium.org> wrote:
> On Mon, Dec 18, 2017 at 6:22 AM, Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>> On 2017/12/18 22:40, syzbot wrote:
>>> Hello,
>>>
>>> syzkaller hit the following crash on 6084b576dca2e898f5c101baef151f7bfdbb606d
>>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>>> compiler: gcc (GCC) 7.1.1 20170620
>>> .config is attached
>>> Raw console output is attached.
>>>
>>> Unfortunately, I don't have any reproducer for this bug yet.
>>>
>>>
>>
>> This BUG is reporting
>>
>> [   26.089789] usercopy: kernel memory overwrite attempt detected to 0000000022a5b430 (kmalloc-1024) (1024 bytes)
>>
>> line. But isn't 0000000022a5b430 strange for kmalloc(1024, GFP_KERNEL)ed kernel address?
>
> The address is hashed (see the %p threads for 4.15).


+Tobin, is there a way to disable hashing entirely? The only
designation of syzbot is providing crash reports to kernel developers
with as much info as possible. It's fine for it to leak whatever.


> There's something strange going on with the slab allocator (I haven't
> tracked it down yet, unfortunately).
>
> -Kees
>
>>
>>> netlink: 1 bytes leftover after parsing attributes in process `syz-executor5'.
>>> ------------[ cut here ]------------
>>> kernel BUG at mm/usercopy.c:84!
>>> invalid opcode: 0000 [#1] SMP
>>> Dumping ftrace buffer:
>>>    (ftrace buffer empty)
>>> Modules linked in:
>>> CPU: 0 PID: 3943 Comm: syz-executor0 Not tainted 4.15.0-rc3-next-20171214+ #67
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>> RIP: 0010:report_usercopy mm/usercopy.c:76 [inline]
>>> RIP: 0010:__check_object_size+0x1e2/0x250 mm/usercopy.c:276
>>> RSP: 0018:ffffc90000d6fca8 EFLAGS: 00010292
>>> RAX: 0000000000000062 RBX: ffffffff82e57be7 RCX: ffffffff8123dede
>>> RDX: 0000000000006340 RSI: ffffc900036c9000 RDI: ffff88021fc136f8
>>> RBP: ffffc90000d6fce0 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801e076dc50
>>> R13: 0000000000000400 R14: 0000000000000000 R15: ffffffff82edf8a5
>>> FS:  00007fe747e20700(0000) GS:ffff88021fc00000(0000) knlGS:0000000000000000
>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 000000002000bffa CR3: 00000001dbd02006 CR4: 00000000001626f0
>>> DR0: 0000000020000000 DR1: 0000000000010000 DR2: 000000000000f004
>>> DR3: 0000000000000000 DR6: 00000000fffe0ff3 DR7: 0000000000000600
>>> Call Trace:
>>>  check_object_size include/linux/thread_info.h:112 [inline]
>>>  check_copy_size include/linux/thread_info.h:143 [inline]
>>>  copy_from_user include/linux/uaccess.h:146 [inline]
>>>  memdup_user+0x46/0x90 mm/util.c:168
>>>  kvm_arch_vcpu_ioctl+0xc85/0x1810 arch/x86/kvm/x86.c:3499
>>>  kvm_vcpu_ioctl+0xf3/0x820 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2715
>>>  vfs_ioctl fs/ioctl.c:46 [inline]
>>>  do_vfs_ioctl+0xaf/0x840 fs/ioctl.c:686
>>>  SYSC_ioctl fs/ioctl.c:701 [inline]
>>>  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
>>>  entry_SYSCALL_64_fastpath+0x1f/0x96
>>> RIP: 0033:0x452a09
>>> RSP: 002b:00007fe747e1fc58 EFLAGS: 00000212 ORIG_RAX: 0000000000000010
>>> RAX: ffffffffffffffda RBX: 000000000071bea0 RCX: 0000000000452a09
>>> RDX: 0000000020ebec00 RSI: 000000004400ae8f RDI: 0000000000000019
>>> RBP: 000000000000023b R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000000 R11: 0000000000000212 R12: 00000000006f0628
>>> R13: 00000000ffffffff R14: 00007fe747e206d4 R15: 0000000000000000
>>> Code: 7b e5 82 48 0f 44 da e8 8d 82 eb ff 48 8b 45 d0 4d 89 e9 4c 89 e1 4c 89 fa 48 89 de 48 c7 c7 a8 51 e6 82 49 89 c0 e8 76 b7 e3 ff <0f> 0b 48 c7 c0 43 51 e6 82 eb a1 48 c7 c0 53 51 e6 82 eb 98 48
>>> RIP: report_usercopy mm/usercopy.c:76 [inline] RSP: ffffc90000d6fca8
>>> RIP: __check_object_size+0x1e2/0x250 mm/usercopy.c:276 RSP: ffffc90000d6fca8
>>> ---[ end trace 189465b430781fff ]---
>>> Kernel panic - not syncing: Fatal exception
>>> Dumping ftrace buffer:
>>>    (ftrace buffer empty)
>>> Kernel Offset: disabled
>>> Rebooting in 86400 seconds..
>
>
>
> --
> Kees Cook
> Pixel Security
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/CAGXu5jKLBuQ8Ne6BjjPH%2B1SVw-Fj4ko5H04GHn-dxXYwoMEZtw%40mail.gmail.com.
> For more options, visit https://groups.google.com/d/optout.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-19  8:12     ` Dmitry Vyukov
@ 2017-12-19  8:37       ` Tobin C. Harding
  2017-12-19  8:41         ` Dmitry Vyukov
  2017-12-19 13:22         ` Matthew Wilcox
  0 siblings, 2 replies; 25+ messages in thread
From: Tobin C. Harding @ 2017-12-19  8:37 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Kees Cook, Tetsuo Handa, Linux-MM, syzbot, David Windsor,
	keun-o.park, Laura Abbott, LKML, Mark Rutland, Ingo Molnar,
	syzkaller-bugs, Will Deacon

On Tue, Dec 19, 2017 at 09:12:58AM +0100, Dmitry Vyukov wrote:
> On Tue, Dec 19, 2017 at 1:57 AM, Kees Cook <keescook@chromium.org> wrote:
> > On Mon, Dec 18, 2017 at 6:22 AM, Tetsuo Handa
> > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> >> On 2017/12/18 22:40, syzbot wrote:
> >>> Hello,
> >>>
> >>> syzkaller hit the following crash on 6084b576dca2e898f5c101baef151f7bfdbb606d
> >>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> >>> compiler: gcc (GCC) 7.1.1 20170620
> >>> .config is attached
> >>> Raw console output is attached.
> >>>
> >>> Unfortunately, I don't have any reproducer for this bug yet.
> >>>
> >>>
> >>
> >> This BUG is reporting
> >>
> >> [   26.089789] usercopy: kernel memory overwrite attempt detected to 0000000022a5b430 (kmalloc-1024) (1024 bytes)
> >>
> >> line. But isn't 0000000022a5b430 strange for kmalloc(1024, GFP_KERNEL)ed kernel address?
> >
> > The address is hashed (see the %p threads for 4.15).
> 
> 
> +Tobin, is there a way to disable hashing entirely? The only
> designation of syzbot is providing crash reports to kernel developers
> with as much info as possible. It's fine for it to leak whatever.

We have new specifier %px to print addresses in hex if leaking info is
not a worry.

Hope this helps,
Tobin.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-19  8:37       ` Tobin C. Harding
@ 2017-12-19  8:41         ` Dmitry Vyukov
  2017-12-19  9:04           ` Tobin C. Harding
  2017-12-19 13:22         ` Matthew Wilcox
  1 sibling, 1 reply; 25+ messages in thread
From: Dmitry Vyukov @ 2017-12-19  8:41 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: Kees Cook, Tetsuo Handa, Linux-MM, syzbot, David Windsor,
	keun-o.park, Laura Abbott, LKML, Mark Rutland, Ingo Molnar,
	syzkaller-bugs, Will Deacon

On Tue, Dec 19, 2017 at 9:37 AM, Tobin C. Harding <me@tobin.cc> wrote:
>> > <penguin-kernel@i-love.sakura.ne.jp> wrote:
>> >> On 2017/12/18 22:40, syzbot wrote:
>> >>> Hello,
>> >>>
>> >>> syzkaller hit the following crash on 6084b576dca2e898f5c101baef151f7bfdbb606d
>> >>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>> >>> compiler: gcc (GCC) 7.1.1 20170620
>> >>> .config is attached
>> >>> Raw console output is attached.
>> >>>
>> >>> Unfortunately, I don't have any reproducer for this bug yet.
>> >>>
>> >>>
>> >>
>> >> This BUG is reporting
>> >>
>> >> [   26.089789] usercopy: kernel memory overwrite attempt detected to 0000000022a5b430 (kmalloc-1024) (1024 bytes)
>> >>
>> >> line. But isn't 0000000022a5b430 strange for kmalloc(1024, GFP_KERNEL)ed kernel address?
>> >
>> > The address is hashed (see the %p threads for 4.15).
>>
>>
>> +Tobin, is there a way to disable hashing entirely? The only
>> designation of syzbot is providing crash reports to kernel developers
>> with as much info as possible. It's fine for it to leak whatever.
>
> We have new specifier %px to print addresses in hex if leaking info is
> not a worry.

This is not a per-printf-site thing. It's per-machine thing.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-19  8:41         ` Dmitry Vyukov
@ 2017-12-19  9:04           ` Tobin C. Harding
  2017-12-19  9:07             ` Dmitry Vyukov
  0 siblings, 1 reply; 25+ messages in thread
From: Tobin C. Harding @ 2017-12-19  9:04 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Kees Cook, Tetsuo Handa, Linux-MM, syzbot, David Windsor,
	keun-o.park, Laura Abbott, LKML, Mark Rutland, Ingo Molnar,
	syzkaller-bugs, Will Deacon

On Tue, Dec 19, 2017 at 09:41:39AM +0100, Dmitry Vyukov wrote:
> On Tue, Dec 19, 2017 at 9:37 AM, Tobin C. Harding <me@tobin.cc> wrote:
> >> > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> >> >> On 2017/12/18 22:40, syzbot wrote:
> >> >>> Hello,
> >> >>>
> >> >>> syzkaller hit the following crash on 6084b576dca2e898f5c101baef151f7bfdbb606d
> >> >>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> >> >>> compiler: gcc (GCC) 7.1.1 20170620
> >> >>> .config is attached
> >> >>> Raw console output is attached.
> >> >>>
> >> >>> Unfortunately, I don't have any reproducer for this bug yet.
> >> >>>
> >> >>>
> >> >>
> >> >> This BUG is reporting
> >> >>
> >> >> [   26.089789] usercopy: kernel memory overwrite attempt detected to 0000000022a5b430 (kmalloc-1024) (1024 bytes)
> >> >>
> >> >> line. But isn't 0000000022a5b430 strange for kmalloc(1024, GFP_KERNEL)ed kernel address?
> >> >
> >> > The address is hashed (see the %p threads for 4.15).
> >>
> >>
> >> +Tobin, is there a way to disable hashing entirely? The only
> >> designation of syzbot is providing crash reports to kernel developers
> >> with as much info as possible. It's fine for it to leak whatever.
> >
> > We have new specifier %px to print addresses in hex if leaking info is
> > not a worry.
> 
> This is not a per-printf-site thing. It's per-machine thing.

There is no way to disable the hashing currently built into the system.

	Tobin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-19  9:04           ` Tobin C. Harding
@ 2017-12-19  9:07             ` Dmitry Vyukov
  0 siblings, 0 replies; 25+ messages in thread
From: Dmitry Vyukov @ 2017-12-19  9:07 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: Kees Cook, Tetsuo Handa, Linux-MM, syzbot, David Windsor,
	keun-o.park, Laura Abbott, LKML, Mark Rutland, Ingo Molnar,
	syzkaller-bugs, Will Deacon

On Tue, Dec 19, 2017 at 10:04 AM, Tobin C. Harding <me@tobin.cc> wrote:
>> >> > <penguin-kernel@i-love.sakura.ne.jp> wrote:
>> >> >> On 2017/12/18 22:40, syzbot wrote:
>> >> >>> Hello,
>> >> >>>
>> >> >>> syzkaller hit the following crash on 6084b576dca2e898f5c101baef151f7bfdbb606d
>> >> >>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>> >> >>> compiler: gcc (GCC) 7.1.1 20170620
>> >> >>> .config is attached
>> >> >>> Raw console output is attached.
>> >> >>>
>> >> >>> Unfortunately, I don't have any reproducer for this bug yet.
>> >> >>>
>> >> >>>
>> >> >>
>> >> >> This BUG is reporting
>> >> >>
>> >> >> [   26.089789] usercopy: kernel memory overwrite attempt detected to 0000000022a5b430 (kmalloc-1024) (1024 bytes)
>> >> >>
>> >> >> line. But isn't 0000000022a5b430 strange for kmalloc(1024, GFP_KERNEL)ed kernel address?
>> >> >
>> >> > The address is hashed (see the %p threads for 4.15).
>> >>
>> >>
>> >> +Tobin, is there a way to disable hashing entirely? The only
>> >> designation of syzbot is providing crash reports to kernel developers
>> >> with as much info as possible. It's fine for it to leak whatever.
>> >
>> > We have new specifier %px to print addresses in hex if leaking info is
>> > not a worry.
>>
>> This is not a per-printf-site thing. It's per-machine thing.
>
> There is no way to disable the hashing currently built into the system.

Ack.
Any kind of continuous testing systems would be a use case for this.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-19  8:37       ` Tobin C. Harding
  2017-12-19  8:41         ` Dmitry Vyukov
@ 2017-12-19 13:22         ` Matthew Wilcox
  2017-12-19 13:41           ` Dmitry Vyukov
                             ` (2 more replies)
  1 sibling, 3 replies; 25+ messages in thread
From: Matthew Wilcox @ 2017-12-19 13:22 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: Dmitry Vyukov, Kees Cook, Tetsuo Handa, Linux-MM, syzbot,
	David Windsor, keun-o.park, Laura Abbott, LKML, Mark Rutland,
	Ingo Molnar, syzkaller-bugs, Will Deacon

On Tue, Dec 19, 2017 at 07:37:46PM +1100, Tobin C. Harding wrote:
> On Tue, Dec 19, 2017 at 09:12:58AM +0100, Dmitry Vyukov wrote:
> > On Tue, Dec 19, 2017 at 1:57 AM, Kees Cook <keescook@chromium.org> wrote:
> > > On Mon, Dec 18, 2017 at 6:22 AM, Tetsuo Handa
> > >> This BUG is reporting
> > >>
> > >> [   26.089789] usercopy: kernel memory overwrite attempt detected to 0000000022a5b430 (kmalloc-1024) (1024 bytes)
> > >>
> > >> line. But isn't 0000000022a5b430 strange for kmalloc(1024, GFP_KERNEL)ed kernel address?
> > >
> > > The address is hashed (see the %p threads for 4.15).
> > 
> > 
> > +Tobin, is there a way to disable hashing entirely? The only
> > designation of syzbot is providing crash reports to kernel developers
> > with as much info as possible. It's fine for it to leak whatever.
> 
> We have new specifier %px to print addresses in hex if leaking info is
> not a worry.

Could we have a way to know that the printed address is hashed and not just
a pointer getting completely scrogged?  Perhaps prefix it with ... a hash!
So this line would look like:

[   26.089789] usercopy: kernel memory overwrite attempt detected to #0000000022a5b430 (kmalloc-1024) (1024 bytes)

Or does that miss the point of hashing the address, so the attacker
thinks its a real address?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-19 13:22         ` Matthew Wilcox
@ 2017-12-19 13:41           ` Dmitry Vyukov
  2017-12-19 14:08             ` Tetsuo Handa
  2017-12-19 20:33           ` Tobin C. Harding
  2017-12-19 21:36           ` Linus Torvalds
  2 siblings, 1 reply; 25+ messages in thread
From: Dmitry Vyukov @ 2017-12-19 13:41 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Tobin C. Harding, Kees Cook, Tetsuo Handa, Linux-MM, syzbot,
	David Windsor, keun-o.park, Laura Abbott, LKML, Mark Rutland,
	Ingo Molnar, syzkaller-bugs, Will Deacon

On Tue, Dec 19, 2017 at 2:22 PM, Matthew Wilcox <willy@infradead.org> wrote:
>> > >> This BUG is reporting
>> > >>
>> > >> [   26.089789] usercopy: kernel memory overwrite attempt detected to 0000000022a5b430 (kmalloc-1024) (1024 bytes)
>> > >>
>> > >> line. But isn't 0000000022a5b430 strange for kmalloc(1024, GFP_KERNEL)ed kernel address?
>> > >
>> > > The address is hashed (see the %p threads for 4.15).
>> >
>> >
>> > +Tobin, is there a way to disable hashing entirely? The only
>> > designation of syzbot is providing crash reports to kernel developers
>> > with as much info as possible. It's fine for it to leak whatever.
>>
>> We have new specifier %px to print addresses in hex if leaking info is
>> not a worry.
>
> Could we have a way to know that the printed address is hashed and not just
> a pointer getting completely scrogged?  Perhaps prefix it with ... a hash!
> So this line would look like:
>
> [   26.089789] usercopy: kernel memory overwrite attempt detected to #0000000022a5b430 (kmalloc-1024) (1024 bytes)
>
> Or does that miss the point of hashing the address, so the attacker
> thinks its a real address?

If we do something with this, I would suggest that we just disable
hashing. Any of the concerns that lead to hashed pointers are not
applicable in this context, moreover they are harmful, cause confusion
and make it harder to debug these bugs. That perfectly can be an
opt-in CONFIG_DEBUG_INSECURE_BLA_BLA_BLA.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-19 13:41           ` Dmitry Vyukov
@ 2017-12-19 14:08             ` Tetsuo Handa
  2017-12-19 14:12               ` Dmitry Vyukov
  0 siblings, 1 reply; 25+ messages in thread
From: Tetsuo Handa @ 2017-12-19 14:08 UTC (permalink / raw)
  To: dvyukov, willy
  Cc: me, keescook, linux-mm,
	bot+719398b443fd30155f92f2a888e749026c62b427, dave, keun-o.park,
	labbott, linux-kernel, mark.rutland, mingo, syzkaller-bugs,
	will.deacon

Dmitry Vyukov wrote:
> On Tue, Dec 19, 2017 at 2:22 PM, Matthew Wilcox <willy@infradead.org> wrote:
> >> > >> This BUG is reporting
> >> > >>
> >> > >> [   26.089789] usercopy: kernel memory overwrite attempt detected to 0000000022a5b430 (kmalloc-1024) (1024 bytes)
> >> > >>
> >> > >> line. But isn't 0000000022a5b430 strange for kmalloc(1024, GFP_KERNEL)ed kernel address?
> >> > >
> >> > > The address is hashed (see the %p threads for 4.15).
> >> >
> >> >
> >> > +Tobin, is there a way to disable hashing entirely? The only
> >> > designation of syzbot is providing crash reports to kernel developers
> >> > with as much info as possible. It's fine for it to leak whatever.
> >>
> >> We have new specifier %px to print addresses in hex if leaking info is
> >> not a worry.
> >
> > Could we have a way to know that the printed address is hashed and not just
> > a pointer getting completely scrogged?  Perhaps prefix it with ... a hash!
> > So this line would look like:
> >
> > [   26.089789] usercopy: kernel memory overwrite attempt detected to #0000000022a5b430 (kmalloc-1024) (1024 bytes)
> >
> > Or does that miss the point of hashing the address, so the attacker
> > thinks its a real address?
> 
> If we do something with this, I would suggest that we just disable
> hashing. Any of the concerns that lead to hashed pointers are not
> applicable in this context, moreover they are harmful, cause confusion
> and make it harder to debug these bugs. That perfectly can be an
> opt-in CONFIG_DEBUG_INSECURE_BLA_BLA_BLA.
> 
Why not a kernel command line option? Hashing by default.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-19 14:08             ` Tetsuo Handa
@ 2017-12-19 14:12               ` Dmitry Vyukov
  2017-12-19 20:45                 ` Tobin C. Harding
  0 siblings, 1 reply; 25+ messages in thread
From: Dmitry Vyukov @ 2017-12-19 14:12 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Matthew Wilcox, Tobin C. Harding, Kees Cook, Linux-MM, syzbot,
	David Windsor, keun-o.park, Laura Abbott, LKML, Mark Rutland,
	Ingo Molnar, syzkaller-bugs, Will Deacon

On Tue, Dec 19, 2017 at 3:08 PM, Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
> Dmitry Vyukov wrote:
>> On Tue, Dec 19, 2017 at 2:22 PM, Matthew Wilcox <willy@infradead.org> wrote:
>> >> > >> This BUG is reporting
>> >> > >>
>> >> > >> [   26.089789] usercopy: kernel memory overwrite attempt detected to 0000000022a5b430 (kmalloc-1024) (1024 bytes)
>> >> > >>
>> >> > >> line. But isn't 0000000022a5b430 strange for kmalloc(1024, GFP_KERNEL)ed kernel address?
>> >> > >
>> >> > > The address is hashed (see the %p threads for 4.15).
>> >> >
>> >> >
>> >> > +Tobin, is there a way to disable hashing entirely? The only
>> >> > designation of syzbot is providing crash reports to kernel developers
>> >> > with as much info as possible. It's fine for it to leak whatever.
>> >>
>> >> We have new specifier %px to print addresses in hex if leaking info is
>> >> not a worry.
>> >
>> > Could we have a way to know that the printed address is hashed and not just
>> > a pointer getting completely scrogged?  Perhaps prefix it with ... a hash!
>> > So this line would look like:
>> >
>> > [   26.089789] usercopy: kernel memory overwrite attempt detected to #0000000022a5b430 (kmalloc-1024) (1024 bytes)
>> >
>> > Or does that miss the point of hashing the address, so the attacker
>> > thinks its a real address?
>>
>> If we do something with this, I would suggest that we just disable
>> hashing. Any of the concerns that lead to hashed pointers are not
>> applicable in this context, moreover they are harmful, cause confusion
>> and make it harder to debug these bugs. That perfectly can be an
>> opt-in CONFIG_DEBUG_INSECURE_BLA_BLA_BLA.
>>
> Why not a kernel command line option? Hashing by default.


Would work for continuous testing systems too.
I just thought that since it has security implications, a config would
be more reliable. Say if a particular distribution builds kernel
without this config, then there is no way to enable it on the fly,
intentionally or not.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-19 14:12               ` Dmitry Vyukov
@ 2017-12-19 20:45                 ` Tobin C. Harding
  0 siblings, 0 replies; 25+ messages in thread
From: Tobin C. Harding @ 2017-12-19 20:45 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Tetsuo Handa, Matthew Wilcox, Kees Cook, Linux-MM, syzbot,
	David Windsor, keun-o.park, Laura Abbott, LKML, Mark Rutland,
	Ingo Molnar, syzkaller-bugs, Will Deacon, Linus Torvalds

Adding Linus

On Tue, Dec 19, 2017 at 03:12:05PM +0100, Dmitry Vyukov wrote:
> On Tue, Dec 19, 2017 at 3:08 PM, Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > Dmitry Vyukov wrote:
> >> On Tue, Dec 19, 2017 at 2:22 PM, Matthew Wilcox <willy@infradead.org> wrote:
> >> >> > >> This BUG is reporting
> >> >> > >>
> >> >> > >> [   26.089789] usercopy: kernel memory overwrite attempt detected to 0000000022a5b430 (kmalloc-1024) (1024 bytes)
> >> >> > >>
> >> >> > >> line. But isn't 0000000022a5b430 strange for kmalloc(1024, GFP_KERNEL)ed kernel address?
> >> >> > >
> >> >> > > The address is hashed (see the %p threads for 4.15).
> >> >> >
> >> >> >
> >> >> > +Tobin, is there a way to disable hashing entirely? The only
> >> >> > designation of syzbot is providing crash reports to kernel developers
> >> >> > with as much info as possible. It's fine for it to leak whatever.
> >> >>
> >> >> We have new specifier %px to print addresses in hex if leaking info is
> >> >> not a worry.
> >> >
> >> > Could we have a way to know that the printed address is hashed and not just
> >> > a pointer getting completely scrogged?  Perhaps prefix it with ... a hash!
> >> > So this line would look like:
> >> >
> >> > [   26.089789] usercopy: kernel memory overwrite attempt detected to #0000000022a5b430 (kmalloc-1024) (1024 bytes)
> >> >
> >> > Or does that miss the point of hashing the address, so the attacker
> >> > thinks its a real address?
> >>
> >> If we do something with this, I would suggest that we just disable
> >> hashing. Any of the concerns that lead to hashed pointers are not
> >> applicable in this context, moreover they are harmful, cause confusion
> >> and make it harder to debug these bugs. That perfectly can be an
> >> opt-in CONFIG_DEBUG_INSECURE_BLA_BLA_BLA.
> >>
> > Why not a kernel command line option? Hashing by default.
> 
> 
> Would work for continuous testing systems too.
> I just thought that since it has security implications, a config would
> be more reliable. Say if a particular distribution builds kernel
> without this config, then there is no way to enable it on the fly,
> intentionally or not.

I wasn't the architect behind the hashing, I've cc'd Linus in the event
he wants to correct me. I believe that some of the benefit of hashing
was to shake things up and force people to think about this issue. If we
implement a method of disabling hashing (command-line parameter or
CONFIG_) at this stage then we risk loosing this benefit since one has
to assume that people will just take the easy option and disable
it. Though perhaps after things settle a bit we could implement this
without the risk?

thanks,
Tobin.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-19 13:22         ` Matthew Wilcox
  2017-12-19 13:41           ` Dmitry Vyukov
@ 2017-12-19 20:33           ` Tobin C. Harding
  2017-12-19 21:36           ` Linus Torvalds
  2 siblings, 0 replies; 25+ messages in thread
From: Tobin C. Harding @ 2017-12-19 20:33 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Dmitry Vyukov, Kees Cook, Tetsuo Handa, Linux-MM, syzbot,
	David Windsor, keun-o.park, Laura Abbott, LKML, Mark Rutland,
	Ingo Molnar, syzkaller-bugs, Will Deacon

On Tue, Dec 19, 2017 at 05:22:46AM -0800, Matthew Wilcox wrote:
> On Tue, Dec 19, 2017 at 07:37:46PM +1100, Tobin C. Harding wrote:
> > On Tue, Dec 19, 2017 at 09:12:58AM +0100, Dmitry Vyukov wrote:
> > > On Tue, Dec 19, 2017 at 1:57 AM, Kees Cook <keescook@chromium.org> wrote:
> > > > On Mon, Dec 18, 2017 at 6:22 AM, Tetsuo Handa
> > > >> This BUG is reporting
> > > >>
> > > >> [   26.089789] usercopy: kernel memory overwrite attempt detected to 0000000022a5b430 (kmalloc-1024) (1024 bytes)
> > > >>
> > > >> line. But isn't 0000000022a5b430 strange for kmalloc(1024, GFP_KERNEL)ed kernel address?
> > > >
> > > > The address is hashed (see the %p threads for 4.15).
> > > 
> > > 
> > > +Tobin, is there a way to disable hashing entirely? The only
> > > designation of syzbot is providing crash reports to kernel developers
> > > with as much info as possible. It's fine for it to leak whatever.
> > 
> > We have new specifier %px to print addresses in hex if leaking info is
> > not a worry.
> 
> Could we have a way to know that the printed address is hashed and not just
> a pointer getting completely scrogged?  Perhaps prefix it with ... a hash!
> So this line would look like:
> 
> [   26.089789] usercopy: kernel memory overwrite attempt detected to #0000000022a5b430 (kmalloc-1024) (1024 bytes)

This poses the risk of breaking userland tools that parse the
address. The zeroing of the first 32 bits was a design compromise to
keep the address format while making _kind of_ explicit that some funny
business was going on.

> Or does that miss the point of hashing the address, so the attacker
> thinks its a real address?

No subterfuge intended.

Bonus points Wily, I had to go to 'The New Hackers Dictionary' to look
up 'scrogged' :)

thanks,
Tobin.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-19 13:22         ` Matthew Wilcox
  2017-12-19 13:41           ` Dmitry Vyukov
  2017-12-19 20:33           ` Tobin C. Harding
@ 2017-12-19 21:36           ` Linus Torvalds
  2017-12-19 21:48             ` Al Viro
                               ` (3 more replies)
  2 siblings, 4 replies; 25+ messages in thread
From: Linus Torvalds @ 2017-12-19 21:36 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Tobin C. Harding, Dmitry Vyukov, Kees Cook, Tetsuo Handa,
	Linux-MM, syzbot, David Windsor, keun-o.park, Laura Abbott, LKML,
	Mark Rutland, Ingo Molnar, syzkaller-bugs, Will Deacon

On Tue, Dec 19, 2017 at 5:22 AM, Matthew Wilcox <willy@infradead.org> wrote:
>
> Could we have a way to know that the printed address is hashed and not just
> a pointer getting completely scrogged?  Perhaps prefix it with ... a hash!
> So this line would look like:

The problem with that is that it will break tools that parse things.

So no, it won't work.

When we find something like this, we should either remove it, fix the
permissions, or switch to %px.

In this case, there's obviously no permission issue: it's an error
report. So it's either "remove it, or switch to %px".

I'm personally not clear on whether the pointer really makes any sense
at all. But if it does, it should just be changed to %px, since it's a
bug report.

But honestly, what do people expect that the pointer value will
actually tell you if it is unhashed?

I suspect that an "offset and size within the kernel object" value
might make sense.  But what does the _pointer_ tell you?

I've noticed this with pretty much every report. People get upset
about the hashing, but don't seem to actually be able to ever tell
what the f*ck they would use the non-hashed pointer value for.

I've asked for this before: whenever somebody complains about the
hashing, you had better tell exactly what the unhashed value would
have given you, and how it would have helped debug the problem.

Because if you can't tell that, then dammit, then we should just
_remove_ the stupid %p.

Instead, people ask for "can I get everything unhashed" even when they
can't give a reason for it.

            Linus

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-19 21:36           ` Linus Torvalds
@ 2017-12-19 21:48             ` Al Viro
  2017-12-19 22:09               ` Randy Dunlap
                                 ` (3 more replies)
  2017-12-19 21:54             ` Kees Cook
                               ` (2 subsequent siblings)
  3 siblings, 4 replies; 25+ messages in thread
From: Al Viro @ 2017-12-19 21:48 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Matthew Wilcox, Tobin C. Harding, Dmitry Vyukov, Kees Cook,
	Tetsuo Handa, Linux-MM, syzbot, David Windsor, keun-o.park,
	Laura Abbott, LKML, Mark Rutland, Ingo Molnar, syzkaller-bugs,
	Will Deacon

On Tue, Dec 19, 2017 at 01:36:46PM -0800, Linus Torvalds wrote:

> I suspect that an "offset and size within the kernel object" value
> might make sense.  But what does the _pointer_ tell you?

Well, for example seeing a 0xfffffffffffffff4 where a pointer to object
must have been is a pretty strong hint to start looking for a way for
that ERR_PTR(-ENOMEM) having ended up there...  Something like
0x6e69622f7273752f is almost certainly a misplaced "/usr/bin", i.e. a
pathname overwriting whatever it ends up in, etc.  And yes, I have run
into both of those in real life.

Debugging the situation when crap value has ended up in place of a
pointer is certainly a case where you do want to see what exactly has
ended up in there...

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-19 21:48             ` Al Viro
@ 2017-12-19 22:09               ` Randy Dunlap
  2017-12-19 23:24               ` Linus Torvalds
                                 ` (2 subsequent siblings)
  3 siblings, 0 replies; 25+ messages in thread
From: Randy Dunlap @ 2017-12-19 22:09 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds
  Cc: Matthew Wilcox, Tobin C. Harding, Dmitry Vyukov, Kees Cook,
	Tetsuo Handa, Linux-MM, syzbot, David Windsor, keun-o.park,
	Laura Abbott, LKML, Mark Rutland, Ingo Molnar, syzkaller-bugs,
	Will Deacon

On 12/19/2017 01:48 PM, Al Viro wrote:
> On Tue, Dec 19, 2017 at 01:36:46PM -0800, Linus Torvalds wrote:
> 
>> I suspect that an "offset and size within the kernel object" value
>> might make sense.  But what does the _pointer_ tell you?
> 
> Well, for example seeing a 0xfffffffffffffff4 where a pointer to object
> must have been is a pretty strong hint to start looking for a way for
> that ERR_PTR(-ENOMEM) having ended up there...  Something like
> 0x6e69622f7273752f is almost certainly a misplaced "/usr/bin", i.e. a

possibly poison values also?

> pathname overwriting whatever it ends up in, etc.  And yes, I have run
> into both of those in real life.
> 
> Debugging the situation when crap value has ended up in place of a
> pointer is certainly a case where you do want to see what exactly has
> ended up in there...


-- 
~Randy

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-19 21:48             ` Al Viro
  2017-12-19 22:09               ` Randy Dunlap
@ 2017-12-19 23:24               ` Linus Torvalds
  2017-12-20  3:50               ` Matthew Wilcox
  2017-12-20  9:44               ` David Laight
  3 siblings, 0 replies; 25+ messages in thread
From: Linus Torvalds @ 2017-12-19 23:24 UTC (permalink / raw)
  To: Al Viro
  Cc: Matthew Wilcox, Tobin C. Harding, Dmitry Vyukov, Kees Cook,
	Tetsuo Handa, Linux-MM, syzbot, David Windsor, keun-o.park,
	Laura Abbott, LKML, Mark Rutland, Ingo Molnar, syzkaller-bugs,
	Will Deacon

On Tue, Dec 19, 2017 at 1:48 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Tue, Dec 19, 2017 at 01:36:46PM -0800, Linus Torvalds wrote:
>
>> I suspect that an "offset and size within the kernel object" value
>> might make sense.  But what does the _pointer_ tell you?
>
> Well, for example seeing a 0xfffffffffffffff4 where a pointer to object
> must have been is a pretty strong hint to start looking for a way for
> that ERR_PTR(-ENOMEM) having ended up there...  Something like
> 0x6e69622f7273752f is almost certainly a misplaced "/usr/bin", i.e. a
> pathname overwriting whatever it ends up in, etc.  And yes, I have run
> into both of those in real life.

Sure. But that's for a faulting address when you have an invalid pointer.

That's not the case here at all.

Here, we've explicitly checked that it's a kernel pointer of some
particular type (in a slab cache in this case), and the pointer is
valid but shouldn't be copied to/from user space.

So it's not something like 0xfffffffffffffff4 or 0x6e69622f7273752f.
It's something like "in slab cache for size 1024".

So the pointer value isn't interesting. But the offset within the slab could be.

See? This is what I am talking about. People don't actually seem to
*think* about what the %p is. There seems to be very little critical
thinking about what should be printed out, and what is actually
useful.

The most common thing seems to be "I'm confused by a bad value".  But
that should *not* cause a mindless "let's not hash it" reaction.

It should cause actual thinking about the situation! Not about %p in
general, but very much about the situation of THAT PARTICULAR use of
%p.

That's what I'm looking for, and what I'm not seeing in these discussions.

                Linus

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-19 21:48             ` Al Viro
  2017-12-19 22:09               ` Randy Dunlap
  2017-12-19 23:24               ` Linus Torvalds
@ 2017-12-20  3:50               ` Matthew Wilcox
  2017-12-20  4:05                 ` Linus Torvalds
  2017-12-20  9:44               ` David Laight
  3 siblings, 1 reply; 25+ messages in thread
From: Matthew Wilcox @ 2017-12-20  3:50 UTC (permalink / raw)
  To: Al Viro
  Cc: Linus Torvalds, Tobin C. Harding, Dmitry Vyukov, Kees Cook,
	Tetsuo Handa, Linux-MM, syzbot, David Windsor, keun-o.park,
	Laura Abbott, LKML, Mark Rutland, Ingo Molnar, syzkaller-bugs,
	Will Deacon

On Tue, Dec 19, 2017 at 09:48:49PM +0000, Al Viro wrote:
> Well, for example seeing a 0xfffffffffffffff4 where a pointer to object
> must have been is a pretty strong hint to start looking for a way for
> that ERR_PTR(-ENOMEM) having ended up there...  Something like
> 0x6e69622f7273752f is almost certainly a misplaced "/usr/bin", i.e. a
> pathname overwriting whatever it ends up in, etc.  And yes, I have run
> into both of those in real life.
> 
> Debugging the situation when crap value has ended up in place of a
> pointer is certainly a case where you do want to see what exactly has
> ended up in there...

Linus, how would you feel about printing ERR_PTRs without molestation?
It's not going to leak any information about the kernel address space
layout.  I'm a little less certain about trying to detect ASCII strings,
but I think this is an improvement.

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 01c3957b2de6..c80c60b4b3ef 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -1859,6 +1859,9 @@ char *pointer(const char *fmt, char *buf, char *end, void *ptr,
 		return string(buf, end, "(null)", spec);
 	}
 
+	if (IS_ERR(ptr))
+		return pointer_string(buf, end, ptr, spec);
+
 	switch (*fmt) {
 	case 'F':
 	case 'f':

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-20  3:50               ` Matthew Wilcox
@ 2017-12-20  4:05                 ` Linus Torvalds
  2017-12-20  4:36                   ` Linus Torvalds
  0 siblings, 1 reply; 25+ messages in thread
From: Linus Torvalds @ 2017-12-20  4:05 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Al Viro, Tobin C. Harding, Dmitry Vyukov, Kees Cook, Tetsuo Handa,
	Linux-MM, syzbot, David Windsor, keun-o.park, Laura Abbott, LKML,
	Mark Rutland, Ingo Molnar, syzkaller-bugs, Will Deacon

On Tue, Dec 19, 2017 at 7:50 PM, Matthew Wilcox <willy@infradead.org> wrote:
> On Tue, Dec 19, 2017 at 09:48:49PM +0000, Al Viro wrote:
>> Well, for example seeing a 0xfffffffffffffff4 where a pointer to object
>> must have been is a pretty strong hint to start looking for a way for
>> that ERR_PTR(-ENOMEM) having ended up there...  Something like
>> 0x6e69622f7273752f is almost certainly a misplaced "/usr/bin", i.e. a
>> pathname overwriting whatever it ends up in, etc.  And yes, I have run
>> into both of those in real life.
>>
>> Debugging the situation when crap value has ended up in place of a
>> pointer is certainly a case where you do want to see what exactly has
>> ended up in there...
>
> Linus, how would you feel about printing ERR_PTRs without molestation?

Stop this stupidity already.

Guys, seriously. You're making idiotic arguments that have nothing to
do with reality.

If you actually have an invalid pointer that causes an oops (whether
it's an ERR_PTR or something like 0x6e69622f7273752f or something like
the list poison values etc),

  WE ALREADY PRINT OUT THE WHOLE UNHASHED POINTER VALUE

This "but but but some pointers are magic and shouldn't be hashed"
stuff is *garbage*. You're making shit up. You don't have a single
actual real-life example that you can point to that is relevant.

Again, I've seen those bad pointer oopses myself. Yes, the magic
values are relevant, and should be printed out.

BUT THEY ALREADY ARE PRINTED OUT.

Christ.

So let me repeat:

 - don't change %p behavior.

 - don't use "I was confused" as an argument. Yes, things changed, and
yes, it clearly caused confusion, but that is temporary and is not an
argument for making magic changes.

 - don't make up some garbage theoretical example: give a hard example
of output that actually didn't have enough information.

And yes, we'll just replace the %p with %px when that last situation
holds. Really. Really really.

But it needs to be a real example, not a "what if" that doesn't make sense.

Not some pet theory that doesn't hold water.

This whole "what if it was a poison pointer" argument is a _prime_
example of pure and utter garbage.

If we have an oops, and it was due a poison value or an err-pointer
that we dereferenced, we will *see* the poison value.

It will be right there in the register state.

It will be right there in the bad address.

It will be quite visible.

And yes, we had a few cases where the hashing actually did hide the
values, and I've been applying patches to turn those from %p to %px.

But it had better be actual real cases, and real thought out
situations. Not "let's just randomly pick values that we don't hash".

                Linus

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-20  4:05                 ` Linus Torvalds
@ 2017-12-20  4:36                   ` Linus Torvalds
  0 siblings, 0 replies; 25+ messages in thread
From: Linus Torvalds @ 2017-12-20  4:36 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Al Viro, Tobin C. Harding, Dmitry Vyukov, Kees Cook, Tetsuo Handa,
	Linux-MM, syzbot, David Windsor, keun-o.park, Laura Abbott, LKML,
	Mark Rutland, Ingo Molnar, syzkaller-bugs, Will Deacon

On Tue, Dec 19, 2017 at 8:05 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> And yes, we had a few cases where the hashing actually did hide the
> values, and I've been applying patches to turn those from %p to %px.

So far at least:

  10a7e9d84915 Do not hash userspace addresses in fault handlers
  85c3e4a5a185 mm/slab.c: do not hash pointers when debugging slab
  d81041820873 powerpc/xmon: Don't print hashed pointers in xmon
  328b4ed93b69 x86: don't hash faulting address in oops printout
  b7ad7ef742a9 remove task and stack pointer printout from oops dump
  6424f6bb4327 kasan: use %px to print addresses instead of %p

although that next-to-last case is a "remove %p" case rather than
"convert to %px".

And we'll probably hit a few more, I'm not at all claiming that we're
somehow "done". There's bound to be other cases people haven't noticed
yet (or haven't patched yet, like the usercopy case that Kees is
signed up to fix up).

But considering that we had something like 12k of those %p users, I
think a handful now (and maybe a few tens eventually) is worth the
pain and confusion.

I just want to make sure that the ones we _do_ convert we actually
spend the mental effort really looking at, and really asking "does it
make sense to convert this?"

Not just knee-jerking "oh, it's hashed, let's just unhash it".

               Linus

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: BUG: bad usercopy in memdup_user
  2017-12-19 21:48             ` Al Viro
                                 ` (2 preceding siblings ...)
  2017-12-20  3:50               ` Matthew Wilcox
@ 2017-12-20  9:44               ` David Laight
  2017-12-31  8:11                 ` Dmitry Vyukov
  3 siblings, 1 reply; 25+ messages in thread
From: David Laight @ 2017-12-20  9:44 UTC (permalink / raw)
  To: 'Al Viro', Linus Torvalds
  Cc: Matthew Wilcox, Tobin C. Harding, Dmitry Vyukov, Kees Cook,
	Tetsuo Handa, Linux-MM, syzbot, David Windsor,
	keun-o.park@darkmatter.ae, Laura Abbott, LKML, Mark Rutland,
	Ingo Molnar, syzkaller-bugs@googlegroups.com, Will Deacon

From: Al Viro
> Sent: 19 December 2017 21:49
> > I suspect that an "offset and size within the kernel object" value
> > might make sense.  But what does the _pointer_ tell you?
> 
> Well, for example seeing a 0xfffffffffffffff4 where a pointer to object
> must have been is a pretty strong hint to start looking for a way for
> that ERR_PTR(-ENOMEM) having ended up there...  Something like
> 0x6e69622f7273752f is almost certainly a misplaced "/usr/bin", i.e. a
> pathname overwriting whatever it ends up in, etc.  And yes, I have run
> into both of those in real life.
> 
> Debugging the situation when crap value has ended up in place of a
> pointer is certainly a case where you do want to see what exactly has
> ended up in there...

I've certainly seen a lot of ascii in pointers (usually because the
previous item has overrun).
Although I suspect they'd appear in the fault frame - which hopefully
carries real addresses.

A compromise would be to hash the 'page' part of the address.
On 64bit systems this is probably about 32 bits.
It would still show whether pointers are user, kernel, vmalloc (etc)
but without giving away the actual value.
The page offset (12 bits) would show the alignment (etc).

Including a per-boot random number would make it harder to generate
'rainbow tables' to reverse the hash.

	David

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-20  9:44               ` David Laight
@ 2017-12-31  8:11                 ` Dmitry Vyukov
  0 siblings, 0 replies; 25+ messages in thread
From: Dmitry Vyukov @ 2017-12-31  8:11 UTC (permalink / raw)
  To: David Laight
  Cc: Al Viro, Linus Torvalds, Matthew Wilcox, Tobin C. Harding,
	Kees Cook, Tetsuo Handa, Linux-MM, syzbot, David Windsor,
	keun-o.park@darkmatter.ae, Laura Abbott, LKML, Mark Rutland,
	Ingo Molnar, syzkaller-bugs@googlegroups.com, Will Deacon

On Wed, Dec 20, 2017 at 10:44 AM, David Laight <David.Laight@aculab.com> wrote:
> From: Al Viro
>> Sent: 19 December 2017 21:49
>> > I suspect that an "offset and size within the kernel object" value
>> > might make sense.  But what does the _pointer_ tell you?
>>
>> Well, for example seeing a 0xfffffffffffffff4 where a pointer to object
>> must have been is a pretty strong hint to start looking for a way for
>> that ERR_PTR(-ENOMEM) having ended up there...  Something like
>> 0x6e69622f7273752f is almost certainly a misplaced "/usr/bin", i.e. a
>> pathname overwriting whatever it ends up in, etc.  And yes, I have run
>> into both of those in real life.
>>
>> Debugging the situation when crap value has ended up in place of a
>> pointer is certainly a case where you do want to see what exactly has
>> ended up in there...
>
> I've certainly seen a lot of ascii in pointers (usually because the
> previous item has overrun).
> Although I suspect they'd appear in the fault frame - which hopefully
> carries real addresses.
>
> A compromise would be to hash the 'page' part of the address.
> On 64bit systems this is probably about 32 bits.
> It would still show whether pointers are user, kernel, vmalloc (etc)
> but without giving away the actual value.
> The page offset (12 bits) would show the alignment (etc).
>
> Including a per-boot random number would make it harder to generate
> 'rainbow tables' to reverse the hash.


Bad things on kmalloc-1024 are most likely caused by an invalid free
in pcrypt, it freed a pointer into a middle of a 1024 byte heap object
which was undetected by KASAN (now there is a patch for this in mm
tree) and later caused all kinds of bad things:
https://groups.google.com/forum/#!topic/syzkaller-bugs/NKn_ivoPOpk
https://patchwork.kernel.org/patch/10126761/

#syz dup: KASAN: use-after-free Read in __list_del_entry_valid (2)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-19 21:36           ` Linus Torvalds
  2017-12-19 21:48             ` Al Viro
@ 2017-12-19 21:54             ` Kees Cook
  2017-12-19 22:16             ` Matthew Wilcox
  2017-12-19 22:24             ` Laura Abbott
  3 siblings, 0 replies; 25+ messages in thread
From: Kees Cook @ 2017-12-19 21:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Matthew Wilcox, Tobin C. Harding, Dmitry Vyukov, Tetsuo Handa,
	Linux-MM, syzbot, David Windsor, keun-o.park, Laura Abbott, LKML,
	Mark Rutland, Ingo Molnar, syzkaller-bugs, Will Deacon

On Tue, Dec 19, 2017 at 1:36 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> In this case, there's obviously no permission issue: it's an error
> report. So it's either "remove it, or switch to %px".

Yup, my intention was to kill this %p and enhance the report to
actually include the useful information like, "what is the offset from
the start of the object", etc. That's what really matters.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-19 21:36           ` Linus Torvalds
  2017-12-19 21:48             ` Al Viro
  2017-12-19 21:54             ` Kees Cook
@ 2017-12-19 22:16             ` Matthew Wilcox
  2017-12-19 22:24             ` Laura Abbott
  3 siblings, 0 replies; 25+ messages in thread
From: Matthew Wilcox @ 2017-12-19 22:16 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Tobin C. Harding, Dmitry Vyukov, Kees Cook, Tetsuo Handa,
	Linux-MM, syzbot, David Windsor, keun-o.park, Laura Abbott, LKML,
	Mark Rutland, Ingo Molnar, syzkaller-bugs, Will Deacon

On Tue, Dec 19, 2017 at 01:36:46PM -0800, Linus Torvalds wrote:
> On Tue, Dec 19, 2017 at 5:22 AM, Matthew Wilcox <willy@infradead.org> wrote:
> >
> > Could we have a way to know that the printed address is hashed and not just
> > a pointer getting completely scrogged?  Perhaps prefix it with ... a hash!
> > So this line would look like:
> 
> The problem with that is that it will break tools that parse things.

Yeah, but the problem is that until people know to expect hashes, it
breaks people.  I spent most of a day last week puzzling over a value
coming from a VM_BUG_ON that was explicitly tested for and couldn't
happen.

> When we find something like this, we should either remove it, fix the
> permissions, or switch to %px.

Right; I sent a patch to fix VM_BUG_ON earlier today after reading
this thread.

> But honestly, what do people expect that the pointer value will
> actually tell you if it is unhashed?

It would have been meaningful to me.  For a start, I would have seen
that the bottom two bits were clear, so this was actually a pointer and
not something masquerading as a pointer.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: BUG: bad usercopy in memdup_user
  2017-12-19 21:36           ` Linus Torvalds
                               ` (2 preceding siblings ...)
  2017-12-19 22:16             ` Matthew Wilcox
@ 2017-12-19 22:24             ` Laura Abbott
  3 siblings, 0 replies; 25+ messages in thread
From: Laura Abbott @ 2017-12-19 22:24 UTC (permalink / raw)
  To: Linus Torvalds, Matthew Wilcox
  Cc: Tobin C. Harding, Dmitry Vyukov, Kees Cook, Tetsuo Handa,
	Linux-MM, syzbot, David Windsor, keun-o.park, LKML, Mark Rutland,
	Ingo Molnar, syzkaller-bugs, Will Deacon

On 12/19/2017 01:36 PM, Linus Torvalds wrote:
> On Tue, Dec 19, 2017 at 5:22 AM, Matthew Wilcox <willy@infradead.org> wrote:
>>
>> Could we have a way to know that the printed address is hashed and not just
>> a pointer getting completely scrogged?  Perhaps prefix it with ... a hash!
>> So this line would look like:
> 
> The problem with that is that it will break tools that parse things.
> 
> So no, it won't work.
> 
> When we find something like this, we should either remove it, fix the
> permissions, or switch to %px.
> 
> In this case, there's obviously no permission issue: it's an error
> report. So it's either "remove it, or switch to %px".
> 
> I'm personally not clear on whether the pointer really makes any sense
> at all. But if it does, it should just be changed to %px, since it's a
> bug report.
> 
> But honestly, what do people expect that the pointer value will
> actually tell you if it is unhashed?
> 
> I suspect that an "offset and size within the kernel object" value
> might make sense.  But what does the _pointer_ tell you?
> 
> I've noticed this with pretty much every report. People get upset
> about the hashing, but don't seem to actually be able to ever tell
> what the f*ck they would use the non-hashed pointer value for.
> 
> I've asked for this before: whenever somebody complains about the
> hashing, you had better tell exactly what the unhashed value would
> have given you, and how it would have helped debug the problem.
> 
> Because if you can't tell that, then dammit, then we should just
> _remove_ the stupid %p.
> 
> Instead, people ask for "can I get everything unhashed" even when they
> can't give a reason for it.
> 
>              Linus
> 

It's most useful in the "I really have no idea what's going on" case.
Sometimes just narrowing down if a pointer was kernel or vmalloc or
kmap or whatever is the only starting point. I agree that
size plus offset from object would be helpful.

Thanks,
Laura

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2017-12-31  8:11 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <001a113e9ca8a3affd05609d7ccf@google.com>
2017-12-18 14:22 ` BUG: bad usercopy in memdup_user Tetsuo Handa
2017-12-19  0:57   ` Kees Cook
2017-12-19  8:12     ` Dmitry Vyukov
2017-12-19  8:37       ` Tobin C. Harding
2017-12-19  8:41         ` Dmitry Vyukov
2017-12-19  9:04           ` Tobin C. Harding
2017-12-19  9:07             ` Dmitry Vyukov
2017-12-19 13:22         ` Matthew Wilcox
2017-12-19 13:41           ` Dmitry Vyukov
2017-12-19 14:08             ` Tetsuo Handa
2017-12-19 14:12               ` Dmitry Vyukov
2017-12-19 20:45                 ` Tobin C. Harding
2017-12-19 20:33           ` Tobin C. Harding
2017-12-19 21:36           ` Linus Torvalds
2017-12-19 21:48             ` Al Viro
2017-12-19 22:09               ` Randy Dunlap
2017-12-19 23:24               ` Linus Torvalds
2017-12-20  3:50               ` Matthew Wilcox
2017-12-20  4:05                 ` Linus Torvalds
2017-12-20  4:36                   ` Linus Torvalds
2017-12-20  9:44               ` David Laight
2017-12-31  8:11                 ` Dmitry Vyukov
2017-12-19 21:54             ` Kees Cook
2017-12-19 22:16             ` Matthew Wilcox
2017-12-19 22:24             ` Laura Abbott

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox