* Repeatable IPv6 crash in 3.19.0-1
From: Brian Rak @ 2015-02-27 21:37 UTC
To: netdev

I've been seeing a crash under 3.19.0 that seems to occur when I put
heavy traffic across a macvtap/veth interface.

We have a KVM guest attached to a veth pair using macvtap.  We're
routing IPv6 traffic into one end of the veth pair using some static
routes.  We do *not* have proxy_ndp enabled (though, we are using some
software to do neighbor proxying - http://priv.nu/projects/ndppd/ ).

I've been able to reproduce this pretty easily by downloading some large
files from the guest.  We see two traces in a row when this occurs:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 6520 at arch/x86/kernel/smp.c:124 native_smp_send_reschedule+0x5f/0x70()
Modules linked in: ip_set netconsole configfs xt_comment ebt_ip6 ip6table_mangle veth xt_physdev br_netfilter ebt_arp ebt_ip ebtable_nat ebtables cls_fw sch_sfq sch_htb vhost_net macvtap macvlan vhost tun kvm_intel kvm 8021q garp nfnetlink_queue nfnetlink_log nfnetlink bluetooth rfkill bridge stp llc xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4 iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables ipv6 joydev iTCO_wdt iTCO_vendor_support 8250_fintek ipmi_devintf ipmi_si ipmi_msghandler microcode pcspkr i2c_i801 sg lpc_ich igb dca ptp pps_core hwmon shpchp xhci_pci xhci_hcd ie31200_edac edac_core ext4 jbd2 mbcache sd_mod ahci libahci video ttm drm_kms_helper sysimgblt sysfillrect syscopyarea dm_mirror dm_region_hash dm_log dm_mod
CPU: 0 PID: 6520 Comm: vhost-6518 Tainted: G D 3.19.0-1.el6.elrepo.x86_64 #1
Hardware name: Supermicro X10SLH-F/X10SLM+-F/X10SLH-F/X10SLM+-F, BIOS 1.1a 12/03/2013
 000000000000007c ffff88041fc035a0 ffffffff816754e2 000000000000007c
 0000000000000000 ffff88041fc035e0 ffffffff81074bc5 ffff88041fc03600
 ffff88041fc53f00 0000000000000001 ffff88041fc13f00 ffff8803f6a11150
Call Trace:
 <IRQ>
 [<ffffffff816754e2>] dump_stack+0x48/0x5e
 [<ffffffff81074bc5>] warn_slowpath_common+0x95/0xe0
 [<ffffffff81074c2a>] warn_slowpath_null+0x1a/0x20
 [<ffffffff8104749f>] native_smp_send_reschedule+0x5f/0x70
 [<ffffffff810a83fa>] trigger_load_balance+0x14a/0x1f0
 [<ffffffff81099a06>] scheduler_tick+0xa6/0xe0
 [<ffffffff810da121>] update_process_times+0x51/0x70
 [<ffffffff810eb919>] tick_sched_handle+0x39/0x80
 [<ffffffff810ebb62>] tick_sched_timer+0x52/0xa0
 [<ffffffff810dc9d3>] __run_hrtimer+0x83/0x1d0
 [<ffffffff810ebb10>] ? tick_nohz_handler+0xc0/0xc0
 [<ffffffff810dcd46>] hrtimer_interrupt+0x106/0x250
 [<ffffffff8104a249>] local_apic_timer_interrupt+0x39/0x60
 [<ffffffff8167c7d5>] smp_apic_timer_interrupt+0x45/0x60
 [<ffffffff8167a87d>] apic_timer_interrupt+0x6d/0x80
 [<ffffffff81675362>] ? panic+0x1c0/0x206
 [<ffffffff8167535b>] ? panic+0x1b9/0x206
 [<ffffffff810185ca>] oops_end+0xea/0xf0
 [<ffffffff810602c5>] no_context+0x125/0x200
 [<ffffffff810604cd>] __bad_area_nosemaphore+0x12d/0x230
 [<ffffffffa02f726c>] ? ip6t_do_table+0x29c/0x6e0 [ip6_tables]
 [<ffffffffa0331ed0>] ? deliver_clone+0x60/0x60 [bridge]
 [<ffffffff810605e3>] bad_area_nosemaphore+0x13/0x20
 [<ffffffff81060b76>] __do_page_fault+0x336/0x520
 [<ffffffffa03320b9>] ? br_dev_queue_push_xmit+0x1e9/0x200 [bridge]
 [<ffffffff81060e6c>] do_page_fault+0x2c/0x40
 [<ffffffff8167b928>] page_fault+0x28/0x30
 [<ffffffffa02836a3>] ? ip6_finish_output2+0x193/0x490 [ipv6]
 [<ffffffff815d9e4d>] ? nf_hook_slow+0x7d/0x150
 [<ffffffffa0283e10>] ? ip6_xmit+0x470/0x470 [ipv6]
 [<ffffffffa0282a00>] ? ip6_forward_proxy_check+0x150/0x150 [ipv6]
 [<ffffffffa0283ea5>] ip6_finish_output+0x95/0xd0 [ipv6]
 [<ffffffffa0283f58>] ip6_output+0x78/0xb0 [ipv6]
 [<ffffffffa0282a16>] ip6_forward_finish+0x16/0x20 [ipv6]
 [<ffffffffa0284548>] ip6_forward+0x5b8/0x7a0 [ipv6]
 [<ffffffffa0290cac>] ? ip6_route_input+0xbc/0xe0 [ipv6]
 [<ffffffffa028590d>] ip6_rcv_finish+0x9d/0xb0 [ipv6]
 [<ffffffffa0285c88>] ipv6_rcv+0x368/0x4d0 [ipv6]
 [<ffffffff815a8274>] __netif_receive_skb_core+0x4b4/0x640
 [<ffffffff815a8427>] __netif_receive_skb+0x27/0x70
 [<ffffffff815a8562>] process_backlog+0xf2/0x1b0
 [<ffffffff815a8de3>] napi_poll+0xd3/0x1c0
 [<ffffffff810e9664>] ? clockevents_program_event+0x74/0x120
 [<ffffffff815a8f60>] net_rx_action+0x90/0x1c0
 [<ffffffff81078b3b>] __do_softirq+0xfb/0x2a0
 [<ffffffff8167b53c>] do_softirq_own_stack+0x1c/0x30
 <EOI>
 [<ffffffff81078645>] do_softirq+0x55/0x60
 [<ffffffff81078728>] __local_bh_enable_ip+0x88/0x90
 [<ffffffff815a9c67>] __dev_queue_xmit+0x227/0x5a0
 [<ffffffff815aa000>] dev_queue_xmit+0x10/0x20
 [<ffffffffa04b4417>] macvtap_get_user+0x437/0x5d0 [macvtap]
 [<ffffffffa04a1172>] ? vhost_get_vq_desc+0x152/0x300 [vhost]
 [<ffffffffa04b45d5>] macvtap_sendmsg+0x25/0x30 [macvtap]
 [<ffffffffa04b9f8b>] handle_tx+0x27b/0x480 [vhost_net]
 [<ffffffffa04ba1c5>] handle_tx_kick+0x15/0x20 [vhost_net]
 [<ffffffffa04a0f6d>] vhost_worker+0x10d/0x1c0 [vhost]
 [<ffffffffa04a0e60>] ? vhost_dev_init+0x1d0/0x1d0 [vhost]
 [<ffffffff8109244e>] kthread+0xce/0xf0
 [<ffffffff81092380>] ? kthread_freezable_should_stop+0x70/0x70
 [<ffffffff816798bc>] ret_from_fork+0x7c/0xb0
 [<ffffffff81092380>] ? kthread_freezable_should_stop+0x70/0x70
---[ end trace eb7c35e4dfea0d83 ]---

BUG: unable to handle kernel paging request at ffff880408812ffe
IP: [<ffffffffa027b6a3>] ip6_finish_output2+0x193/0x490 [ipv6]
PGD 211e067 PUD 2121067 PMD 409339063 PTE 8000000408812161
Oops: 0003 [#1] SMP
Modules linked in: netconsole configfs ip_set xt_comment ebt_ip6 ip6table_mangle veth xt_physdev br_netfilter ebt_arp ebt_ip ebtable_nat ebtables cls_fw sch_sfq sch_htb vhost_net macvtap macvlan vhost tun kvm_intel kvm 8021q garp nfnetlink_queue nfnetlink_log nfnetlink bluetooth rfkill bridge stp llc joydev xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4 iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables ipv6 iTCO_wdt iTCO_vendor_support 8250_fintek ipmi_devintf ipmi_si ipmi_msghandler microcode pcspkr i2c_i801 sg lpc_ich igb dca ptp pps_core hwmon shpchp xhci_pci xhci_hcd ie31200_edac edac_core ext4 jbd2 mbcache sd_mod ahci libahci video ttm drm_kms_helper sysimgblt sysfillrect syscopyarea dm_mirror dm_region_hash dm_log dm_mod
CPU: 7 PID: 8187 Comm: vhost-8184 Not tainted 3.19.0-1.el6.elrepo.x86_64 #1
Hardware name: Supermicro X10SLH-F/X10SLM+-F/X10SLH-F/X10SLM+-F, BIOS 1.1a 12/03/2013
task: ffff8803f391c050 ti: ffff88040c128000 task.ti: ffff88040c128000
RIP: 0010:[<ffffffffa027b6a3>] [<ffffffffa027b6a3>] ip6_finish_output2+0x193/0x490 [ipv6]
RSP: 0018:ffff88041fdc3be8 EFLAGS: 00010283
RAX: ffff88040881300e RBX: ffff8803cfcd3a00 RCX: ffff88040d1c52e4
RDX: 7f813e3323000000 RSI: ffff88040bcee168 RDI: ffff8803f65b55c0
RBP: ffff88041fdc3c38 R08: ffff8803d36283d8 R09: 00000000ff332302
R10: 00000000000080fe R11: 000000007f813efe R12: 000000000000000e
R13: ffff88040d1c5200 R14: ffff88040d1c52f0 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88041fdc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff880408812ffe CR3: 00000000d1613000 CR4: 00000000001427e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
 ffffffffa027be10 ffff880380000000 ffffffffa027aa00 0000000a00000002
 ffffffff81d5e380 ffff8803cfcd3a00 00000000000005dc ffffffff81d25340
 ffff88040881300e ffff880408813000 ffff88041fdc3c58 ffffffffa027bea5
Call Trace:
 <IRQ>
 [<ffffffffa027be10>] ? ip6_xmit+0x470/0x470 [ipv6]
 [<ffffffffa027aa00>] ? ip6_forward_proxy_check+0x150/0x150 [ipv6]
 [<ffffffffa027bea5>] ip6_finish_output+0x95/0xd0 [ipv6]
 [<ffffffffa027bf58>] ip6_output+0x78/0xb0 [ipv6]
 [<ffffffffa027aa16>] ip6_forward_finish+0x16/0x20 [ipv6]
 [<ffffffffa027c548>] ip6_forward+0x5b8/0x7a0 [ipv6]
 [<ffffffffa0288cac>] ? ip6_route_input+0xbc/0xe0 [ipv6]
 [<ffffffffa027d90d>] ip6_rcv_finish+0x9d/0xb0 [ipv6]
 [<ffffffffa027dc88>] ipv6_rcv+0x368/0x4d0 [ipv6]
 [<ffffffff815a8274>] __netif_receive_skb_core+0x4b4/0x640
 [<ffffffff815a8427>] __netif_receive_skb+0x27/0x70
 [<ffffffff815a8562>] process_backlog+0xf2/0x1b0
 [<ffffffff815a8de3>] napi_poll+0xd3/0x1c0
 [<ffffffff815a8f60>] net_rx_action+0x90/0x1c0
 [<ffffffff81078b3b>] __do_softirq+0xfb/0x2a0
 [<ffffffff8167b53c>] do_softirq_own_stack+0x1c/0x30
 <EOI>
 [<ffffffff81078645>] do_softirq+0x55/0x60
 [<ffffffff81078728>] __local_bh_enable_ip+0x88/0x90
 [<ffffffff815a9c67>] __dev_queue_xmit+0x227/0x5a0
 [<ffffffff815aa000>] dev_queue_xmit+0x10/0x20
 [<ffffffffa04b0417>] macvtap_get_user+0x437/0x5d0 [macvtap]
 [<ffffffffa049d172>] ? vhost_get_vq_desc+0x152/0x300 [vhost]
 [<ffffffffa04b05d5>] macvtap_sendmsg+0x25/0x30 [macvtap]
 [<ffffffffa04b5f8b>] handle_tx+0x27b/0x480 [vhost_net]
 [<ffffffffa04b61c5>] handle_tx_kick+0x15/0x20 [vhost_net]
 [<ffffffffa049cf6d>] vhost_worker+0x10d/0x1c0 [vhost]
 [<ffffffffa049ce60>] ? vhost_dev_init+0x1d0/0x1d0 [vhost]
 [<ffffffff8109244e>] kthread+0xce/0xf0
 [<ffffffff81092380>] ? kthread_freezable_should_stop+0x70/0x70
 [<ffffffff816798bc>] ret_from_fork+0x7c/0xb0
 [<ffffffff81092380>] ? kthread_freezable_should_stop+0x70/0x70
Code: 00 00 44 8b 39 41 f6 c7 01 0f 85 8d 02 00 00 45 0f b7 a5 e0 00 00 00 41 83 fc 10 0f 8f 82 02 00 00 49 8b 16 48 8b 83 d8 00 00 00 <48> 89 50 f0 49 8b 56 08 48 89 50 f8 45 3b bd e4 00 00 00 75 c2
RIP [<ffffffffa027b6a3>] ip6_finish_output2+0x193/0x490 [ipv6]
 RSP <ffff88041fdc3be8>
CR2: ffff880408812ffe
---[ end trace d743d347dba40c49 ]---

* Re: Repeatable IPv6 crash in 3.19.0-1
From: Eric Dumazet @ 2015-02-28 0:48 UTC
To: Brian Rak; +Cc: netdev

On Fri, 2015-02-27 at 16:37 -0500, Brian Rak wrote:
> I've been seeing a crash under 3.19.0 that seems to occur when I put
> heavy traffic across a macvtap/veth interface.
>
> We have a KVM guest attached to a veth pair using macvtap.  We're
> routing IPv6 traffic into one end of the veth pair using some static
> routes.  We do *not* have proxy_ndp enabled (though, we are using some
> software to do neighbor proxying - http://priv.nu/projects/ndppd/ ).
>
> I've been able to reproduce this pretty easily by downloading some large
> files from the guest.  We see two traces in a row when this occurs:

Nice !

Crash is in neigh_hh_output()

-> memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD);

And there is only 14 bytes of headroom instead of 16.

Some layer did not align skb_headroom(skb) to HH_DATA_MOD for ethernet
header.

IPv4 has a paranoid section, not IPv6 :

	/* Be paranoid, rather than too clever. */
	if (unlikely(skb_headroom(skb) < hh_len && dev->header_ops)) {
		struct sk_buff *skb2;

		skb2 = skb_realloc_headroom(skb, LL_RESERVED_SPACE(dev));
		if (skb2 == NULL) {
			kfree_skb(skb);
			return -ENOMEM;
		}
		if (skb->sk)
			skb_set_owner_w(skb2, skb->sk);
		consume_skb(skb);
		skb = skb2;
	}

* Re: Repeatable IPv6 crash in 3.19.0-1
From: Eric Dumazet @ 2015-02-28 1:16 UTC
To: Brian Rak; +Cc: netdev

On Fri, 2015-02-27 at 16:48 -0800, Eric Dumazet wrote:
> On Fri, 2015-02-27 at 16:37 -0500, Brian Rak wrote:
> > I've been seeing a crash under 3.19.0 that seems to occur when I put
> > heavy traffic across a macvtap/veth interface.
> >
> > We have a KVM guest attached to a veth pair using macvtap.  We're
> > routing IPv6 traffic into one end of the veth pair using some static
> > routes.  We do *not* have proxy_ndp enabled (though, we are using some
> > software to do neighbor proxying - http://priv.nu/projects/ndppd/ ).
> >
> > I've been able to reproduce this pretty easily by downloading some large
> > files from the guest.  We see two traces in a row when this occurs:
>
>
> Nice !
>
> Crash is in neigh_hh_output()
>
> -> memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD);
>
> And there is only 14 bytes of headroom instead of 16.
>
> Some layer did not align skb_headroom(skb) to HH_DATA_MOD for ethernet
> header.

Could you try following patch ?

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index e40fdfccc9c10df4ea8676a1dd59275d5d9c6b88..27ecc5c4fa2665cd42ac1ca81717255f85507113 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -654,11 +654,14 @@ static void macvtap_skb_to_vnet_hdr(struct macvtap_queue *q,
 	} /* else everything is zero */
 }
 
+/* Neighbour code has some assumptions on HH_DATA_MOD alignment */
+#define MACVTAP_RESERVE HH_DATA_OFF(ETH_HLEN)
+
 /* Get packet from user space buffer */
 static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 				struct iov_iter *from, int noblock)
 {
-	int good_linear = SKB_MAX_HEAD(NET_IP_ALIGN);
+	int good_linear = SKB_MAX_HEAD(MACVTAP_RESERVE);
 	struct sk_buff *skb;
 	struct macvlan_dev *vlan;
 	unsigned long total_len = iov_iter_count(from);
@@ -722,7 +725,7 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 			linear = macvtap16_to_cpu(q, vnet_hdr.hdr_len);
 	}
 
-	skb = macvtap_alloc_skb(&q->sk, NET_IP_ALIGN, copylen,
+	skb = macvtap_alloc_skb(&q->sk, MACVTAP_RESERVE, copylen,
 				linear, noblock, &err);
 	if (!skb)
 		goto err;

* Re: Repeatable IPv6 crash in 3.19.0-1
From: Brian Rak @ 2015-02-28 1:54 UTC
To: Eric Dumazet; +Cc: netdev

On 2/27/2015 8:16 PM, Eric Dumazet wrote:
> On Fri, 2015-02-27 at 16:48 -0800, Eric Dumazet wrote:
>> On Fri, 2015-02-27 at 16:37 -0500, Brian Rak wrote:
>>> I've been seeing a crash under 3.19.0 that seems to occur when I put
>>> heavy traffic across a macvtap/veth interface.
>>>
>>> We have a KVM guest attached to a veth pair using macvtap.  We're
>>> routing IPv6 traffic into one end of the veth pair using some static
>>> routes.  We do *not* have proxy_ndp enabled (though, we are using some
>>> software to do neighbor proxying - http://priv.nu/projects/ndppd/ ).
>>>
>>> I've been able to reproduce this pretty easily by downloading some large
>>> files from the guest.  We see two traces in a row when this occurs:
>>
>>
>> Nice !
>>
>> Crash is in neigh_hh_output()
>>
>> -> memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD);
>>
>> And there is only 14 bytes of headroom instead of 16.
>>
>> Some layer did not align skb_headroom(skb) to HH_DATA_MOD for ethernet
>> header.
>
> Could you try following patch ?
>
> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
> index e40fdfccc9c10df4ea8676a1dd59275d5d9c6b88..27ecc5c4fa2665cd42ac1ca81717255f85507113 100644
> --- a/drivers/net/macvtap.c
> +++ b/drivers/net/macvtap.c
> @@ -654,11 +654,14 @@ static void macvtap_skb_to_vnet_hdr(struct macvtap_queue *q,
>  	} /* else everything is zero */
>  }
>  
> +/* Neighbour code has some assumptions on HH_DATA_MOD alignment */
> +#define MACVTAP_RESERVE HH_DATA_OFF(ETH_HLEN)
> +
>  /* Get packet from user space buffer */
>  static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
>  				struct iov_iter *from, int noblock)
>  {
> -	int good_linear = SKB_MAX_HEAD(NET_IP_ALIGN);
> +	int good_linear = SKB_MAX_HEAD(MACVTAP_RESERVE);
>  	struct sk_buff *skb;
>  	struct macvlan_dev *vlan;
>  	unsigned long total_len = iov_iter_count(from);
> @@ -722,7 +725,7 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
>  			linear = macvtap16_to_cpu(q, vnet_hdr.hdr_len);
>  	}
>  
> -	skb = macvtap_alloc_skb(&q->sk, NET_IP_ALIGN, copylen,
> +	skb = macvtap_alloc_skb(&q->sk, MACVTAP_RESERVE, copylen,
>  				linear, noblock, &err);
>  	if (!skb)
>  		goto err;
>
>

Wow, that was *much* faster than I was expecting, thanks a bunch!

I can confirm that resolves the issue. I've tested this and it fixes
the issue perfectly.  I've been able to put a whole bunch of IPv6
traffic through the interface now, whereas before even a minor amount of
traffic would crash the host.

Thanks again!

* Re: Repeatable IPv6 crash in 3.19.0-1
From: Eric Dumazet @ 2015-02-28 2:01 UTC
To: Brian Rak; +Cc: netdev

On Fri, 2015-02-27 at 20:54 -0500, Brian Rak wrote:

> Wow, that was *much* faster than I was expecting, thanks a bunch!
>
> I can confirm that resolves the issue. I've tested this and it fixes
> the issue perfectly.  I've been able to put a whole bunch of IPv6
> traffic through the interface now, whereas before even a minor amount of
> traffic would crash the host.
>
> Thanks again!

Interesting...

Had a prior version of linux kernel been fine ?

* Re: Repeatable IPv6 crash in 3.19.0-1
From: Eric Dumazet @ 2015-02-28 2:03 UTC
To: Brian Rak; +Cc: netdev

On Fri, 2015-02-27 at 18:01 -0800, Eric Dumazet wrote:
> On Fri, 2015-02-27 at 20:54 -0500, Brian Rak wrote:
>
> > Wow, that was *much* faster than I was expecting, thanks a bunch!
> >
> > I can confirm that resolves the issue. I've tested this and it fixes
> > the issue perfectly.  I've been able to put a whole bunch of IPv6
> > traffic through the interface now, whereas before even a minor amount of
> > traffic would crash the host.
> >
> > Thanks again!
>
> Interesting...
>
> Had a prior version of linux kernel been fine ?

Or maybe you recently switched on this config option ?

CONFIG_DEBUG_PAGEALLOC=y

* Re: Repeatable IPv6 crash in 3.19.0-1
From: Brian Rak @ 2015-02-28 2:11 UTC
To: Eric Dumazet; +Cc: netdev

On 2/27/2015 9:03 PM, Eric Dumazet wrote:
> On Fri, 2015-02-27 at 18:01 -0800, Eric Dumazet wrote:
>> On Fri, 2015-02-27 at 20:54 -0500, Brian Rak wrote:
>>
>>> Wow, that was *much* faster than I was expecting, thanks a bunch!
>>>
>>> I can confirm that resolves the issue. I've tested this and it fixes
>>> the issue perfectly.  I've been able to put a whole bunch of IPv6
>>> traffic through the interface now, whereas before even a minor amount of
>>> traffic would crash the host.
>>>
>>> Thanks again!
>>
>> Interesting...
>>
>> Had a prior version of linux kernel been fine ?
>
> Or maybe you recently switched on this config option ?
>
> CONFIG_DEBUG_PAGEALLOC=y
>
>
>

We've only recently started using this veth/macvtap combo, so it's
possible this has been around for a while and we just hadn't noticed.

I don't have any info on older kernels currently.  I *think* I've seen
crashes on 3.17.1, but I didn't save any stack traces, so I can't be sure.

CONFIG_DEBUG_PAGEALLOC is not set, and never has been.

* Re: Repeatable IPv6 crash in 3.19.0-1
From: Eric Dumazet @ 2015-02-28 2:21 UTC
To: Brian Rak; +Cc: netdev

On Fri, 2015-02-27 at 21:11 -0500, Brian Rak wrote:

> We've only recently started using this veth/macvtap combo, so it's
> possible this has been around for a while and we just hadn't noticed.
>
> I don't have any info on older kernels currently.  I *think* I've seen
> crashes on 3.17.1, but I didn't save any stack traces, so I can't be sure.
>
> CONFIG_DEBUG_PAGEALLOC is not set, and never has been.

OK, thanks for the confirmation.

I'll send an official patch.

(I guess same patch is also needed for drivers/net/tun.c)

* [PATCH net] macvtap: make sure neighbour code can push ethernet header
From: Eric Dumazet @ 2015-02-28 2:35 UTC
To: Brian Rak, David Miller; +Cc: netdev

From: Eric Dumazet <edumazet@google.com>

Brian reported crashes using IPv6 traffic with macvtap/veth combo.

I tracked the crashes in neigh_hh_output()

-> memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD);

Neighbour code assumes headroom to push Ethernet header is
at least 16 bytes.

It appears macvtap has only 14 bytes available on arches
where NET_IP_ALIGN is 0 (like x86)

Effect is a corruption of 2 bytes right before skb->head,
and possible crashes if accessing non existing memory.

This fix should also increase IPv4 performance, as paranoid code
in ip_finish_output2() wont have to call skb_realloc_headroom()

Reported-by: Brian Rak <brak@vultr.com>
Tested-by: Brian Rak <brak@vultr.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 drivers/net/macvtap.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index e40fdfccc9c10df4ea8676a1dd59275d5d9c6b88..27ecc5c4fa2665cd42ac1ca81717255f85507113 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -654,11 +654,14 @@ static void macvtap_skb_to_vnet_hdr(struct macvtap_queue *q,
 	} /* else everything is zero */
 }
 
+/* Neighbour code has some assumptions on HH_DATA_MOD alignment */
+#define MACVTAP_RESERVE HH_DATA_OFF(ETH_HLEN)
+
 /* Get packet from user space buffer */
 static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 				struct iov_iter *from, int noblock)
 {
-	int good_linear = SKB_MAX_HEAD(NET_IP_ALIGN);
+	int good_linear = SKB_MAX_HEAD(MACVTAP_RESERVE);
 	struct sk_buff *skb;
 	struct macvlan_dev *vlan;
 	unsigned long total_len = iov_iter_count(from);
@@ -722,7 +725,7 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 			linear = macvtap16_to_cpu(q, vnet_hdr.hdr_len);
 	}
 
-	skb = macvtap_alloc_skb(&q->sk, NET_IP_ALIGN, copylen,
+	skb = macvtap_alloc_skb(&q->sk, MACVTAP_RESERVE, copylen,
 				linear, noblock, &err);
 	if (!skb)
 		goto err;

* Re: [PATCH net] macvtap: make sure neighbour code can push ethernet header
From: David Miller @ 2015-03-01 5:30 UTC
To: eric.dumazet; +Cc: brak, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 27 Feb 2015 18:35:35 -0800

> From: Eric Dumazet <edumazet@google.com>
>
> Brian reported crashes using IPv6 traffic with macvtap/veth combo.
>
> I tracked the crashes in neigh_hh_output()
>
> -> memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD);
>
> Neighbour code assumes headroom to push Ethernet header is
> at least 16 bytes.
>
> It appears macvtap has only 14 bytes available on arches
> where NET_IP_ALIGN is 0 (like x86)
>
> Effect is a corruption of 2 bytes right before skb->head,
> and possible crashes if accessing non existing memory.
>
> This fix should also increase IPv4 performance, as paranoid code
> in ip_finish_output2() wont have to call skb_realloc_headroom()
>
> Reported-by: Brian Rak <brak@vultr.com>
> Tested-by: Brian Rak <brak@vultr.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied and queued up for -stable, thanks.