From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yonghong Song
Subject: oops with ip6_rt_cache_alloc
Date: Fri, 24 Aug 2018 15:26:55 -0700
Message-ID: <5d3d7d56-ce9f-79c3-04ec-122a2451b580@fb.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
To: David Ahern, netdev, Alexei Starovoitov, Martin Lau, Dave Jones
Return-path:
Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:43654 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726770AbeHYCDi (ORCPT ); Fri, 24 Aug 2018 22:03:38 -0400
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID:

Hi,

We got a kernel oops with the following stack trace:

CPU: 24 PID: 0 Comm: swapper/24 Not tainted 4.16.0-10_fbk1_1183_g7e4ee4c8171c #10
Hardware name: Quanta Leopard-DDR3/Leopard-DDR3, BIOS F06_3A16.DDR3 11/19/2015
RIP: 0010:ip6_rt_get_dev_rcu+0x6/0x60
RSP: 0018:ffff88046fb03c78 EFLAGS: 00010286
RAX: 0000000040000003 RBX: ffff88035a6c1500 RCX: ffffffff81ec5dc0
RDX: ffff88033192a090 RSI: ffff88033192a0a0 RDI: 0000000000000000
RBP: ffff88046fb03cb0 R08: 0000000040000003 R09: ffff8803eb770d00
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88033192a0a0
R13: ffff88033192a090 R14: 0000000000000000 R15: ffff8803d748d700
FS: 0000000000000000(0000) GS:ffff88046fb00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000054 CR3: 000000000220a002 CR4: 00000000001606e0
Call Trace:
 ip6_rt_cache_alloc+0x20/0x100
 __ip6_rt_update_pmtu+0xae/0x180
 ip6_tnl_xmit+0x330/0x970 [ip6_tunnel]
 ? __gre6_xmit+0x2d5/0x540 [ip6_gre]
 ? ip6_forward+0x522/0x7e0
 ? ip6_tnl_parse_tlv_enc_lim+0x59/0x190 [ip6_tunnel]
 ? ip6gre_tunnel_xmit+0xe3/0x320 [ip6_gre]
 ip6gre_tunnel_xmit+0xe3/0x320 [ip6_gre]
 dev_hard_start_xmit+0x9e/0x200
 sch_direct_xmit+0xeb/0x250
 __qdisc_run+0x146/0x510
 net_tx_action+0xde/0x210
 __do_softirq+0xd8/0x2a8
 irq_exit+0xa8/0xb0
 smp_apic_timer_interrupt+0x6c/0x120
 apic_timer_interrupt+0xf/0x20
RIP: 0010:poll_idle+0x31/0x61
RSP: 0018:ffffc9000328fed8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff12
RAX: 0000000000000000 RBX: ffffffff822da9e0 RCX: ffff88046d4e7000
RDX: 0000000000000000 RSI: ffffffff822da9e0 RDI: ffffe8fc00301c00
RBP: ffffe8fc00301c00 R08: 0000000000000f1a R09: 0000000000000001
R10: ffffc9000328fec8 R11: 0000000000000f15 R12: 0000000000000000
R13: ffffffff822da9f8 R14: 0000000000000000 R15: 00002e37d560bb8e
 ? acpi_idle_do_entry+0x40/0x40
 cpuidle_enter_state+0x70/0x2a0
 do_idle+0xdf/0x170
 cpu_startup_entry+0x19/0x20
 secondary_startup_64+0xa5/0xb0
Code: d7 be 01 00 00 00 48 83 e0 fe 48 8b 00 48 89 42 10 ba 0f 00 00 00 e9 7a fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 47 54 00 00 10 80 48 8b 9f a8 00 00 00 74 22 8b 83 0c 02 00
RIP: ip6_rt_get_dev_rcu+0x6/0x60 RSP: ffff88046fb03c78
CR2: 0000000000000054

Our internal experiments showed that an early version of 4.16 works fine; the above problem showed up after we backported some IPv6 route-related changes.

Has anybody seen this issue?

Thanks!