* Bpqether broken in 4.1
From: Ralf Baechle @ 2015-07-02 18:38 UTC
To: linux-hams, netdev, Eric W. Biederman
Cc: Steven Whitehouse, Richard Stearn, f6bvp
Eric's Commit 1d5da757da860a6916adbf68b09e868062b4b3b8 (ax25: Stop using
magic neighbour cache operations.) breaks IP traffic over the AX.25 bpqether
driver.
Here's how to reproduce the issue if you don't have an AX.25 setup. The
arp command is there to fudge things if you don't have a peer that would
answer ARP requests.
# modprobe bpqether
# ifconfig bpq0 hw ax25 abcdef-7 172.20.4.1/24
# arp -H ax25 -s 172.20.4.2 uvwxyz-9
# ping 172.20.4.2
This results in one "Dead loop on virtual device bpq0, fix it urgently!" message
per ping packet. With the following little debug patch
diff --git a/net/core/dev.c b/net/core/dev.c
index aa82f9a..5fef868 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3011,6 +3011,7 @@ static int __dev_queue_xmit(struct sk_buff *skb, void *accel_priv)
 recursion_alert:
 			net_crit_ratelimited("Dead loop on virtual device %s, fix it urgently!\n",
 					     dev->name);
+			WARN_ON(1);
 		}
 	}
I get the following backtrace:
[ 33.149171] Dead loop on virtual device bpq0, fix it urgently!
[ 33.149718] ------------[ cut here ]------------
[ 33.149754] WARNING: CPU: 0 PID: 0 at net/core/dev.c:3014 __dev_queue_xmit+0x3f6/0x530()
[ 33.149769] Modules linked in:
[ 33.149789] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.1.0-00010-g21c6d95-dirty #18
[ 33.149799] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
[ 33.149810] 0000000000000000 de52945c8e778a65 ffff88007fc039a8 ffffffff816d2165
[ 33.149823] 0000000000000000 0000000000000000 ffff88007fc039e8 ffffffff810634aa
[ 33.149833] ffff88007fc039c8 0000000000000000 ffff880078f90000 ffff880078f90000
[ 33.149844] Call Trace:
[ 33.149885] <IRQ> [<ffffffff816d2165>] dump_stack+0x45/0x57
[ 33.149927] [<ffffffff810634aa>] warn_slowpath_common+0x8a/0xc0
[ 33.149939] [<ffffffff810635da>] warn_slowpath_null+0x1a/0x20
[ 33.149949] [<ffffffff815c7c06>] __dev_queue_xmit+0x3f6/0x530
[ 33.149967] [<ffffffff8108cbed>] ? ttwu_do_wakeup+0x1d/0xe0
[ 33.149978] [<ffffffff815c7d53>] dev_queue_xmit_sk+0x13/0x20
[ 33.149994] [<ffffffff816b9951>] ax25_queue_xmit+0x61/0x70
[ 33.150005] [<ffffffff816b9476>] ax25_ip_xmit+0xd6/0x2d0
[ 33.150022] [<ffffffff8108fb47>] ? wake_up_process+0x27/0x50
[ 33.150050] [<ffffffff814dda35>] bpq_xmit+0x1d5/0x200
[ 33.150061] [<ffffffff815c7694>] dev_hard_start_xmit+0x264/0x3e0
[ 33.150073] [<ffffffff815c7ccd>] __dev_queue_xmit+0x4bd/0x530
[ 33.150083] [<ffffffff815c7d53>] dev_queue_xmit_sk+0x13/0x20
[ 33.150099] [<ffffffff815d03c2>] neigh_connected_output+0xc2/0x110
[ 33.150110] [<ffffffff815d3483>] neigh_update+0x333/0x770
[ 33.150117] [<ffffffff8162d2a7>] arp_process.isra.15+0x2f7/0x690
[ 33.150117] [<ffffffff8162d736>] arp_rcv+0xe6/0x130
[ 33.150117] [<ffffffff815c5543>] __netif_receive_skb_core+0x693/0x830
[ 33.150117] [<ffffffff815c56f8>] __netif_receive_skb+0x18/0x60
[ 33.150117] [<ffffffff815c6532>] process_backlog+0xb2/0x150
[ 33.150117] [<ffffffff815c5cd2>] net_rx_action+0x212/0x340
[ 33.150117] [<ffffffff81067aeb>] __do_softirq+0x10b/0x2d0
[ 33.150117] [<ffffffff81067f15>] irq_exit+0x145/0x150
[ 33.150117] [<ffffffff816da8a8>] do_IRQ+0x58/0xf0
[ 33.150117] [<ffffffff816d896e>] common_interrupt+0x6e/0x6e
[ 33.150117] <EOI> [<ffffffff8104b236>] ? native_safe_halt+0x6/0x10
[ 33.150117] [<ffffffff810c4d43>] ? rcu_eqs_enter+0xa3/0xb0
[ 33.150117] [<ffffffff8100ddbe>] default_idle+0x1e/0xc0
[ 33.150117] [<ffffffff8100e81f>] arch_cpu_idle+0xf/0x20
[ 33.150117] [<ffffffff810a6f57>] cpu_startup_entry+0x377/0x3f0
[ 33.150117] [<ffffffff816c989c>] rest_init+0x7c/0x80
[ 33.150117] [<ffffffff81d32fe4>] start_kernel+0x484/0x4a5
[ 33.150117] [<ffffffff81d32120>] ? early_idt_handler_array+0x120/0x120
[ 33.150117] [<ffffffff81d32315>] x86_64_start_reservations+0x2a/0x2c
[ 33.150117] [<ffffffff81d3245c>] x86_64_start_kernel+0x145/0x168
[ 33.150117] ---[ end trace ff4df9d904cced48 ]---
Ralf
* Re: Bpqether broken in 4.1
From: Eric W. Biederman @ 2015-07-02 21:03 UTC
To: Ralf Baechle; +Cc: linux-hams, netdev, Steven Whitehouse, Richard Stearn, f6bvp
Ralf Baechle <ralf@linux-mips.org> writes:
> Eric's Commit 1d5da757da860a6916adbf68b09e868062b4b3b8 (ax25: Stop using
> magic neighbour cache operations.) breaks IP traffic over the AX.25 bpqether
> driver.
Sigh. NETIF_F_LLTX is not set so recursion does not work :(
So we can either set NETIF_F_LLTX or just revert the offending commit.
I think either will work. ax25 is so very weird it just abuses the
neighbour table something awful. If ax25 is not caching ip address to
ax25 address translations in there, ax25 should really not be using the
neighbour table. Sigh.
So perhaps something like the below will be good enough.
diff --git a/drivers/net/hamradio/bpqether.c b/drivers/net/hamradio/bpqether.c
index 63ff08a26da8..fc2be36c9425 100644
--- a/drivers/net/hamradio/bpqether.c
+++ b/drivers/net/hamradio/bpqether.c
@@ -483,6 +483,7 @@ static void bpq_setup(struct net_device *dev)
 	memcpy(dev->dev_addr, &ax25_defaddr, AX25_ADDR_LEN);
 
 	dev->flags = 0;
+	dev->features = NETIF_F_LLTX; /* Allow recursion */
 
 #if defined(CONFIG_AX25) || defined(CONFIG_AX25_MODULE)
 	dev->header_ops = &ax25_header_ops;
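
Why the flag matters can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: every model_* name and MODEL_* constant is hypothetical, the per-CPU recursion counter is folded into the device struct, and the transmit path is reduced to the two steps visible in the backtrace (an IP packet enters the device, the driver wraps it and queues the resulting frame on the same device again). It assumes the 4.1 behaviour in which the transmit lock, and with it the recorded lock owner, is skipped for devices that set NETIF_F_LLTX.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_F_LLTX          0x1u  /* stand-in for NETIF_F_LLTX */
#define MODEL_RECURSION_LIMIT 10    /* stand-in for the kernel's recursion cap */

struct model_dev {
	unsigned int features;
	int xmit_lock_owner;  /* CPU holding the tx lock, -1 if none */
	int depth;            /* nesting depth (per-CPU in the real kernel) */
	int dead_loops;       /* "Dead loop on virtual device" events */
	int frames_sent;      /* frames that reached the lower layer */
};

static int model_queue_xmit(struct model_dev *dev, bool is_ip, int cpu);

/* Models the driver transmit hook: IP packets are re-queued on the same
 * device as link-layer frames, anything else goes out to the lower layer. */
static int model_hard_xmit(struct model_dev *dev, bool is_ip, int cpu)
{
	if (is_ip)
		return model_queue_xmit(dev, false, cpu);
	dev->frames_sent++;
	return 0;
}

/* Models the lock-owner check for queueless devices. */
static int model_queue_xmit(struct model_dev *dev, bool is_ip, int cpu)
{
	bool take_lock = !(dev->features & MODEL_F_LLTX);
	int rc;

	/* Re-entry on the CPU that already owns the lock is a dead loop. */
	if (dev->xmit_lock_owner == cpu || dev->depth > MODEL_RECURSION_LIMIT) {
		dev->dead_loops++;
		return -1;
	}

	if (take_lock)
		dev->xmit_lock_owner = cpu;  /* only recorded without LLTX */
	dev->depth++;
	rc = model_hard_xmit(dev, is_ip, cpu);
	dev->depth--;
	if (take_lock)
		dev->xmit_lock_owner = -1;
	return rc;
}

/* Sends one IP packet through a fresh model device and returns its state. */
struct model_dev model_send_ip(unsigned int features)
{
	struct model_dev dev = { .features = features, .xmit_lock_owner = -1 };

	model_queue_xmit(&dev, true, 0);
	assert(dev.depth == 0);  /* nesting is always unwound */
	return dev;
}
```

In this model, model_send_ip(0) records one dead loop and sends nothing, while model_send_ip(MODEL_F_LLTX) sends one frame and records no dead loop, which matches the behaviour reported before and after the patch.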
> Here's how to reproduce the issue if you don't have an AX.25 setup. The
> arp command is there to fudge things if you don't have a peer that would
> answer ARP requests.
>
> # modprobe bpqether
> # ifconfig bpq0 hw ax25 abcdef-7 172.20.4.1/24
> # arp -H ax25 -s 172.20.4.2 uvwxyz-9
> # ping 172.20.4.2
>
> This results in one "Dead loop on virtual device bpq0, fix it urgently!" message
> per ping packet. With the following little debug patch
Eric
> diff --git a/net/core/dev.c b/net/core/dev.c
> index aa82f9a..5fef868 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3011,6 +3011,7 @@ static int __dev_queue_xmit(struct sk_buff *skb, void *accel_priv)
>  recursion_alert:
>  			net_crit_ratelimited("Dead loop on virtual device %s, fix it urgently!\n",
>  					     dev->name);
> +			WARN_ON(1);
>  		}
>  	}
>
> I get the following backtrace:
>
> [ 33.149171] Dead loop on virtual device bpq0, fix it urgently!
> [ 33.149718] ------------[ cut here ]------------
> [ 33.149754] WARNING: CPU: 0 PID: 0 at net/core/dev.c:3014 __dev_queue_xmit+0x3f6/0x530()
> [ 33.149769] Modules linked in:
> [ 33.149789] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.1.0-00010-g21c6d95-dirty #18
> [ 33.149799] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
> [ 33.149810] 0000000000000000 de52945c8e778a65 ffff88007fc039a8 ffffffff816d2165
> [ 33.149823] 0000000000000000 0000000000000000 ffff88007fc039e8 ffffffff810634aa
> [ 33.149833] ffff88007fc039c8 0000000000000000 ffff880078f90000 ffff880078f90000
> [ 33.149844] Call Trace:
> [ 33.149885] <IRQ> [<ffffffff816d2165>] dump_stack+0x45/0x57
> [ 33.149927] [<ffffffff810634aa>] warn_slowpath_common+0x8a/0xc0
> [ 33.149939] [<ffffffff810635da>] warn_slowpath_null+0x1a/0x20
> [ 33.149949] [<ffffffff815c7c06>] __dev_queue_xmit+0x3f6/0x530
> [ 33.149967] [<ffffffff8108cbed>] ? ttwu_do_wakeup+0x1d/0xe0
> [ 33.149978] [<ffffffff815c7d53>] dev_queue_xmit_sk+0x13/0x20
> [ 33.149994] [<ffffffff816b9951>] ax25_queue_xmit+0x61/0x70
> [ 33.150005] [<ffffffff816b9476>] ax25_ip_xmit+0xd6/0x2d0
> [ 33.150022] [<ffffffff8108fb47>] ? wake_up_process+0x27/0x50
> [ 33.150050] [<ffffffff814dda35>] bpq_xmit+0x1d5/0x200
> [ 33.150061] [<ffffffff815c7694>] dev_hard_start_xmit+0x264/0x3e0
> [ 33.150073] [<ffffffff815c7ccd>] __dev_queue_xmit+0x4bd/0x530
> [ 33.150083] [<ffffffff815c7d53>] dev_queue_xmit_sk+0x13/0x20
> [ 33.150099] [<ffffffff815d03c2>] neigh_connected_output+0xc2/0x110
> [ 33.150110] [<ffffffff815d3483>] neigh_update+0x333/0x770
> [ 33.150117] [<ffffffff8162d2a7>] arp_process.isra.15+0x2f7/0x690
> [ 33.150117] [<ffffffff8162d736>] arp_rcv+0xe6/0x130
> [ 33.150117] [<ffffffff815c5543>] __netif_receive_skb_core+0x693/0x830
> [ 33.150117] [<ffffffff815c56f8>] __netif_receive_skb+0x18/0x60
> [ 33.150117] [<ffffffff815c6532>] process_backlog+0xb2/0x150
> [ 33.150117] [<ffffffff815c5cd2>] net_rx_action+0x212/0x340
> [ 33.150117] [<ffffffff81067aeb>] __do_softirq+0x10b/0x2d0
> [ 33.150117] [<ffffffff81067f15>] irq_exit+0x145/0x150
> [ 33.150117] [<ffffffff816da8a8>] do_IRQ+0x58/0xf0
> [ 33.150117] [<ffffffff816d896e>] common_interrupt+0x6e/0x6e
> [ 33.150117] <EOI> [<ffffffff8104b236>] ? native_safe_halt+0x6/0x10
> [ 33.150117] [<ffffffff810c4d43>] ? rcu_eqs_enter+0xa3/0xb0
> [ 33.150117] [<ffffffff8100ddbe>] default_idle+0x1e/0xc0
> [ 33.150117] [<ffffffff8100e81f>] arch_cpu_idle+0xf/0x20
> [ 33.150117] [<ffffffff810a6f57>] cpu_startup_entry+0x377/0x3f0
> [ 33.150117] [<ffffffff816c989c>] rest_init+0x7c/0x80
> [ 33.150117] [<ffffffff81d32fe4>] start_kernel+0x484/0x4a5
> [ 33.150117] [<ffffffff81d32120>] ? early_idt_handler_array+0x120/0x120
> [ 33.150117] [<ffffffff81d32315>] x86_64_start_reservations+0x2a/0x2c
> [ 33.150117] [<ffffffff81d3245c>] x86_64_start_kernel+0x145/0x168
> [ 33.150117] ---[ end trace ff4df9d904cced48 ]---
>
> Ralf
* Re: Bpqether broken in 4.1
From: Ralf Baechle @ 2015-07-02 21:55 UTC
To: Eric W. Biederman
Cc: linux-hams, netdev, Steven Whitehouse, Richard Stearn, f6bvp
On Thu, Jul 02, 2015 at 04:03:07PM -0500, Eric W. Biederman wrote:
> > Eric's Commit 1d5da757da860a6916adbf68b09e868062b4b3b8 (ax25: Stop using
> > magic neighbour cache operations.) breaks IP traffic over the AX.25 bpqether
> > driver.
>
> Sigh. NETIF_F_LLTX is not set so recursion does not work :(
>
> So we can either set NETIF_F_LLTX or just revert the offending commit.
The AX.25 stack has a sufficient number of hacks that attempts to fix
any hack is likely to cause issues somewhere else and the header and
neighbour stuff is the worst minefield. I'm happy that your patch at
least concentrates all those hacks in the AX.25 stack itself removing
the impact from the generic networking code.
> I think either will work. ax25 is so very weird it just abuses the
> neighbour table something awful. If ax25 is not caching ip address to
> ax25 address translations in there, ax25 should really not be using the
> neighbour table. Sigh.
>
> So perhaps something like the below will be good enough.
>
> diff --git a/drivers/net/hamradio/bpqether.c b/drivers/net/hamradio/bpqether.c
> index 63ff08a26da8..fc2be36c9425 100644
> --- a/drivers/net/hamradio/bpqether.c
> +++ b/drivers/net/hamradio/bpqether.c
> @@ -483,6 +483,7 @@ static void bpq_setup(struct net_device *dev)
>  	memcpy(dev->dev_addr, &ax25_defaddr, AX25_ADDR_LEN);
>
>  	dev->flags = 0;
> +	dev->features = NETIF_F_LLTX; /* Allow recursion */
>
>  #if defined(CONFIG_AX25) || defined(CONFIG_AX25_MODULE)
>  	dev->header_ops = &ax25_header_ops;
Thanks, that restored bpqether to work. I will cook up a patch to fix
all other AX.25 drivers.
Thanks!
Ralf