From: David Ahern
Subject: Re: Repeatable inet6_dump_fib crash in stock 4.12.0-rc4+
Date: Tue, 17 Apr 2018 18:38:34 -0600
Message-ID: <42e545cc-e00e-ee30-c975-1c04e8f8f94f@gmail.com>
To: Ben Greear, Michal Kubecek
Cc: Cong Wang, Eric Dumazet, netdev

On 4/17/18 5:29 PM, Ben Greear wrote:
>
> FYI, problem still happens in 4.16.  I'm going to re-enable my hack below
> for this kernel as well...I had hopes it might be fixed...

Interesting. I was hoping the same.

>
> BUG: unable to handle kernel NULL pointer dereference at 8
> IP: fib6_walk_continue+0x5b/0x140 [ipv6]
> PGD 80000007dfc0c067 P4D 80000007dfc0c067 PUD 7e66ff067 PMD 0
> Oops: 0000 [#1] PREEMPT SMP PTI
> Modules linked in: nf_conntrack_netlink nf_conntrack nfnetlink
> nf_defrag_ipv4 libcrc32c vrf]
> CPU: 3 PID: 15117 Comm: ip Tainted: G           O     4.16.0+ #5
> Hardware name: Iron_Systems,Inc CS-CAD-2U-A02/X10SRL-F, BIOS 2.0b
> 05/02/2017
> RIP: 0010:fib6_walk_continue+0x5b/0x140 [ipv6]
> RSP: 0018:ffffc90008c3bc10 EFLAGS: 00010287
> RAX: ffff88085ac45050 RBX: ffff8807e03008a0 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffc90008c3bc48 RDI: ffffffff8232b240
> RBP: ffff880819167600 R08: 0000000000000008 R09: ffff8807dff10071
> R10: ffffc90008c3bbd0 R11: 0000000000000000 R12: ffff8807e03008a0
> R13: 0000000000000002 R14: ffff8807e05744c8 R15: ffff8807e08ef000
> FS:  00007f2f04342700(0000) GS:ffff88087fcc0000(0000)
> knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000008 CR3: 00000007e0556002 CR4: 00000000003606e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  inet6_dump_fib+0x14b/0x2c0 [ipv6]
>  netlink_dump+0x216/0x2a0
>  netlink_recvmsg+0x254/0x400
>  ? copy_msghdr_from_user+0xb5/0x110
>  ___sys_recvmsg+0xe9/0x230
>  ? find_held_lock+0x3b/0xb0
>  ? __handle_mm_fault+0x617/0x1180
>  ? __audit_syscall_entry+0xb3/0x110
>  ? __sys_recvmsg+0x39/0x70
>  __sys_recvmsg+0x39/0x70
>  do_syscall_64+0x63/0x120
>  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> RIP: 0033:0x7f2f03a72030
> RSP: 002b:00007fffab3de508 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
> RAX: ffffffffffffffda RBX: 00007fffab3e641c RCX: 00007f2f03a72030
> RDX: 0000000000000000 RSI: 00007fffab3de570 RDI: 0000000000000004
> RBP: 0000000000000000 R08: 0000000000007e6c R09: 00007fffab3e63a8
> R10: 00007fffab3de5b0 R11: 0000000000000246 R12: 00007fffab3e6608
> R13: 000000000066b460 R14: 0000000000007e6c R15: 0000000000000000
> Code: 85 d2 74 17 f6 40 2a 04 74 11 8b 53 2c 85 d2 0f 84 d7 00 00 00 83
> ea 01 89 53 2c c7 4
> RIP: fib6_walk_continue+0x5b/0x140 [ipv6] RSP: ffffc90008c3bc10
> CR2: 0000000000000008
> ---[ end trace bd03458864eb266c ]---
> Kernel panic - not syncing: Fatal exception in interrupt
> Kernel Offset: disabled
> Rebooting in 10 seconds..
> ACPI MEMORY or I/O RESET_REG.
>

Since you can reproduce, would you mind trying
    https://github.com/dsahern/linux.git ipv6/fib6-change-v2

Hopefully these will be committed upstream soon. It changes the game a
bit with the FIB walker. Would be interesting to know if this problem
goes away.