* AX25 mkiss interface not deleted when the serial port is removed @ 2015-09-25 17:21 Jean-Christian de Rivaz 2015-09-28 20:06 ` Jean-Christian de Rivaz 2015-09-29 15:31 ` Ralf Baechle DL5RB 0 siblings, 2 replies; 25+ messages in thread From: Jean-Christian de Rivaz @ 2015-09-25 17:21 UTC (permalink / raw) To: linux-hams Hello, On a embedded system we use AX25 over an USB serial port with the kissattach command. For some hardware reason the microcontroller that act as a TNC and USB CDC device can be rested while the system is running, causing a USB disconnect of the USN CDC device and the removal of the corresponding serial port in the kernel. But the ax0 interface is not removed in this case and after a few seconds the kernel panic with the crash below: [<c0595468>] (skb_panic) from [<c0401f70>] (skb_push+0x4c/0x50) [<c0401f70>] (skb_push) from [<bf0bdad4>] (ax25_hard_header+0x34/0xf4 [ax25]) [<bf0bdad4>] (ax25_hard_header [ax25]) from [<bf0d05d4>] (ax_header+0x38/0x40 [mkiss]) [<bf0d05d4>] (ax_header [mkiss]) from [<c041b584>] (neigh_compat_output+0x8c/0xd8) [<c041b584>] (neigh_compat_output) from [<c043e7a8>] (ip_finish_output+0x2a0/0x914) [<c043e7a8>] (ip_finish_output) from [<c043f948>] (ip_output+0xd8/0xf0) [<c043f948>] (ip_output) from [<c043f04c>] (ip_local_out_sk+0x44/0x48) [<c043f04c>] (ip_local_out_sk) from [<c04721f4>] (igmpv3_sendpack+0x54/0x58) [<c04721f4>] (igmpv3_sendpack) from [<c04734bc>] (igmp_ifc_timer_expire+0x1c0/0x2ac) [<c04734bc>] (igmp_ifc_timer_expire) from [<c00696cc>] (call_timer_fn+0x4c/0x1ac) [<c00696cc>] (call_timer_fn) from [<c0069d20>] (run_timer_softirq+0x21c/0x340) [<c0069d20>] (run_timer_softirq) from [<c0024ea8>] (__do_softirq+0xa4/0x370) [<c0024ea8>] (__do_softirq) from [<c002540c>] (irq_exit+0x88/0xc4) [<c002540c>] (irq_exit) from [<c005b550>] (__handle_domain_irq+0x74/0xdc) [<c005b550>] (__handle_domain_irq) from [<c000872c>] (aic5_handle+0xe8/0xf4) [<c000872c>] (aic5_handle) from [<c059aca8>] (__irq_usr+0x48/0x60) I suspect that some code is missing somewhere in the serial port release part to remove the AX25 interface that is attached to it. But I am not certain about this, and I don't know where to look into the kernel to fix this. The kernel version is 3.19.0 running on a ARMv7 processor in case that matter. Any hint would be appreciate. Best Regards, Jean-Christian de Rivaz ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: AX25 mkiss interface not deleted when the serial port is removed 2015-09-25 17:21 AX25 mkiss interface not deleted when the serial port is removed Jean-Christian de Rivaz @ 2015-09-28 20:06 ` Jean-Christian de Rivaz 2015-09-28 23:13 ` Jean-Christian de Rivaz 2015-09-29 15:31 ` Ralf Baechle DL5RB 1 sibling, 1 reply; 25+ messages in thread From: Jean-Christian de Rivaz @ 2015-09-28 20:06 UTC (permalink / raw) To: linux-hams Le 25. 09. 15 19:21, Jean-Christian de Rivaz a écrit : > Hello, > > On a embedded system we use AX25 over an USB serial port with the > kissattach command. For some hardware reason the microcontroller that > act as a TNC and USB CDC device can be rested while the system is > running, causing a USB disconnect of the USN CDC device and the > removal of the corresponding serial port in the kernel. But the ax0 > interface is not removed in this case and after a few seconds the > kernel panic with the crash below: > > [<c0595468>] (skb_panic) from [<c0401f70>] (skb_push+0x4c/0x50) > [<c0401f70>] (skb_push) from [<bf0bdad4>] (ax25_hard_header+0x34/0xf4 > [ax25]) > [<bf0bdad4>] (ax25_hard_header [ax25]) from [<bf0d05d4>] > (ax_header+0x38/0x40 [mkiss]) > [<bf0d05d4>] (ax_header [mkiss]) from [<c041b584>] > (neigh_compat_output+0x8c/0xd8) > [<c041b584>] (neigh_compat_output) from [<c043e7a8>] > (ip_finish_output+0x2a0/0x914) > [<c043e7a8>] (ip_finish_output) from [<c043f948>] (ip_output+0xd8/0xf0) > [<c043f948>] (ip_output) from [<c043f04c>] (ip_local_out_sk+0x44/0x48) > [<c043f04c>] (ip_local_out_sk) from [<c04721f4>] > (igmpv3_sendpack+0x54/0x58) > [<c04721f4>] (igmpv3_sendpack) from [<c04734bc>] > (igmp_ifc_timer_expire+0x1c0/0x2ac) > [<c04734bc>] (igmp_ifc_timer_expire) from [<c00696cc>] > (call_timer_fn+0x4c/0x1ac) > [<c00696cc>] (call_timer_fn) from [<c0069d20>] > (run_timer_softirq+0x21c/0x340) > [<c0069d20>] (run_timer_softirq) from [<c0024ea8>] > (__do_softirq+0xa4/0x370) > [<c0024ea8>] (__do_softirq) from [<c002540c>] (irq_exit+0x88/0xc4) > [<c002540c>] (irq_exit) from [<c005b550>] (__handle_domain_irq+0x74/0xdc) > [<c005b550>] (__handle_domain_irq) from [<c000872c>] > (aic5_handle+0xe8/0xf4) > [<c000872c>] (aic5_handle) from [<c059aca8>] (__irq_usr+0x48/0x60) > > I suspect that some code is missing somewhere in the serial port > release part to remove the AX25 interface that is attached to it. But > I am not certain about this, and I don't know where to look into the > kernel to fix this. The kernel version is 3.19.0 running on a ARMv7 > processor in case that matter. > I can reproduce this situation when I reset the USB CDC device: # lsusb Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub # ls -al /proc/$(pgrep kissattach)/fd total 0 dr-x------ 2 root root 0 Sep 28 19:00 . dr-xr-xr-x 7 root root 0 Sep 28 19:00 .. lr-x------ 1 root root 64 Sep 28 19:00 3 -> /dev/ttyACM1 (deleted) # ifconfig ax0 ax0 Link encap:AMPR AX.25 HWaddr BROADCAST MULTICAST MTU:236 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:10 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) At this time there is no more USB device attached to the system and kissattach still have a open file descriptor on the deleted serial device since it use an infinite loop wrapping a long sleep. Maybe kissattach should use select/poll/epoll to be notified to read a EOF and terminate, but this is not the problem of the kernel. What look strange to me is the fact that the kernel keep the ax0 interface until the kissattach file descriptor is closed, usually by the termination of kissattach. It that the expected behavior ? I suspect that the response is "no". I have narrowed down the crash cause to this two conditions: 1) ax0 still exists because of kissattach file descriptor after the serial device is deleted. 2) NetworkManager is running. I don't know yet what NM do, but the crash do not take place if it is not running. Any help would be welcome. Jean-Christian -- To unsubscribe from this list: send the line "unsubscribe linux-hams" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: AX25 mkiss interface not deleted when the serial port is removed 2015-09-28 20:06 ` Jean-Christian de Rivaz @ 2015-09-28 23:13 ` Jean-Christian de Rivaz 2015-09-29 2:13 ` David Ranch 0 siblings, 1 reply; 25+ messages in thread From: Jean-Christian de Rivaz @ 2015-09-28 23:13 UTC (permalink / raw) To: linux-hams Le 28. 09. 15 22:06, Jean-Christian de Rivaz a écrit : > Le 25. 09. 15 19:21, Jean-Christian de Rivaz a écrit : >> Hello, >> >> On a embedded system we use AX25 over an USB serial port with the >> kissattach command. For some hardware reason the microcontroller that >> act as a TNC and USB CDC device can be rested while the system is >> running, causing a USB disconnect of the USN CDC device and the >> removal of the corresponding serial port in the kernel. But the ax0 >> interface is not removed in this case and after a few seconds the >> kernel panic with the crash below: >> >> [<c0595468>] (skb_panic) from [<c0401f70>] (skb_push+0x4c/0x50) >> [<c0401f70>] (skb_push) from [<bf0bdad4>] (ax25_hard_header+0x34/0xf4 >> [ax25]) >> [<bf0bdad4>] (ax25_hard_header [ax25]) from [<bf0d05d4>] >> (ax_header+0x38/0x40 [mkiss]) >> [<bf0d05d4>] (ax_header [mkiss]) from [<c041b584>] >> (neigh_compat_output+0x8c/0xd8) >> [<c041b584>] (neigh_compat_output) from [<c043e7a8>] >> (ip_finish_output+0x2a0/0x914) >> [<c043e7a8>] (ip_finish_output) from [<c043f948>] (ip_output+0xd8/0xf0) >> [<c043f948>] (ip_output) from [<c043f04c>] (ip_local_out_sk+0x44/0x48) >> [<c043f04c>] (ip_local_out_sk) from [<c04721f4>] >> (igmpv3_sendpack+0x54/0x58) >> [<c04721f4>] (igmpv3_sendpack) from [<c04734bc>] >> (igmp_ifc_timer_expire+0x1c0/0x2ac) >> [<c04734bc>] (igmp_ifc_timer_expire) from [<c00696cc>] >> (call_timer_fn+0x4c/0x1ac) >> [<c00696cc>] (call_timer_fn) from [<c0069d20>] >> (run_timer_softirq+0x21c/0x340) >> [<c0069d20>] (run_timer_softirq) from [<c0024ea8>] >> (__do_softirq+0xa4/0x370) >> [<c0024ea8>] (__do_softirq) from [<c002540c>] (irq_exit+0x88/0xc4) >> [<c002540c>] (irq_exit) from [<c005b550>] >> (__handle_domain_irq+0x74/0xdc) >> [<c005b550>] (__handle_domain_irq) from [<c000872c>] >> (aic5_handle+0xe8/0xf4) >> [<c000872c>] (aic5_handle) from [<c059aca8>] (__irq_usr+0x48/0x60) >> >> I suspect that some code is missing somewhere in the serial port >> release part to remove the AX25 interface that is attached to it. But >> I am not certain about this, and I don't know where to look into the >> kernel to fix this. The kernel version is 3.19.0 running on a ARMv7 >> processor in case that matter. >> > > I can reproduce this situation when I reset the USB CDC device: > > # lsusb > Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub > Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub > > # ls -al /proc/$(pgrep kissattach)/fd > total 0 > dr-x------ 2 root root 0 Sep 28 19:00 . > dr-xr-xr-x 7 root root 0 Sep 28 19:00 .. > lr-x------ 1 root root 64 Sep 28 19:00 3 -> /dev/ttyACM1 (deleted) > > # ifconfig ax0 > ax0 Link encap:AMPR AX.25 HWaddr > BROADCAST MULTICAST MTU:236 Metric:1 > RX packets:0 errors:0 dropped:0 overruns:0 frame:0 > TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 > collisions:0 txqueuelen:10 > RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) > > At this time there is no more USB device attached to the system and > kissattach still have a open file descriptor on the deleted serial > device since it use an infinite loop wrapping a long sleep. Maybe > kissattach should use select/poll/epoll to be notified to read a EOF > and terminate, but this is not the problem of the kernel. What look > strange to me is the fact that the kernel keep the ax0 interface until > the kissattach file descriptor is closed, usually by the termination > of kissattach. > > It that the expected behavior ? I suspect that the response is "no". I > have narrowed down the crash cause to this two conditions: > > 1) ax0 still exists because of kissattach file descriptor after the > serial device is deleted. > After adding some printk() and dump_stack() I found this path: tty_ldisc_hangup()->tty_ldisc_reinit()->tty_ldisc_close()->mkiss_close()->unregister_netdev() And then: [ 251.460000] drivers/net/hamradio/mkiss.c:738 mkiss_open() [ 251.460000] CPU: 0 PID: 247 Comm: kworker/0:1 Tainted: G W 3.19.0-rc1+ #41 [ 251.500000] Hardware name: Atmel SAMA5 (Device Tree) [ 251.500000] Workqueue: usb_hub_wq hub_event [ 251.500000] [<c00149f4>] (unwind_backtrace) from [<c00124f8>] (show_stack+0x20/0x24) [ 251.550000] [<c00124f8>] (show_stack) from [<c0594700>] (dump_stack+0x20/0x28) [ 251.550000] [<c0594700>] (dump_stack) from [<bf093ed0>] (mkiss_open+0x34/0x2e8 [mkiss]) [ 251.570000] [<bf093ed0>] (mkiss_open [mkiss]) from [<c02d5b3c>] (tty_ldisc_open+0x54/0x94) [ 251.570000] [<c02d5b3c>] (tty_ldisc_open) from [<c02d63a8>] (tty_ldisc_hangup+0x190/0x1bc) [ 251.590000] [<c02d63a8>] (tty_ldisc_hangup) from [<c02cdc10>] (__tty_hangup+0x2b4/0x3ec) [ 251.590000] [<c02cdc10>] (__tty_hangup) from [<c02cdd88>] (tty_vhangup+0x1c/0x20) [ 251.610000] [<c02cdd88>] (tty_vhangup) from [<bf00f628>] (acm_disconnect+0xcc/0x1a8 [cdc_acm]) [ 251.610000] [<bf00f628>] (acm_disconnect [cdc_acm]) from [<c0377938>] (usb_unbind_interface+0x7c/0x288) [ 251.630000] [<c0377938>] (usb_unbind_interface) from [<c02f8548>] (__device_release_driver+0x80/0xd4) [ 251.630000] [<c02f8548>] (__device_release_driver) from [<c02f85c8>] (device_release_driver+0x2c/0x38) [ 251.650000] [<c02f85c8>] (device_release_driver) from [<c02f8148>] (bus_remove_device+0xe4/0x104) [ 251.650000] [<c02f8148>] (bus_remove_device) from [<c02f5830>] (device_del+0x108/0x1f0) [ 251.670000] [<c02f5830>] (device_del) from [<c03753d8>] (usb_disable_device+0xb0/0x1fc) [ 251.670000] [<c03753d8>] (usb_disable_device) from [<c036cb0c>] (usb_disconnect+0x74/0x250) [ 251.680000] [<c036cb0c>] (usb_disconnect) from [<c036e4b4>] (hub_event+0x4ec/0x1110) [ 251.710000] [<c036e4b4>] (hub_event) from [<c00369b0>] (process_one_work+0x120/0x424) [ 251.710000] [<c00369b0>] (process_one_work) from [<c0036e5c>] (worker_thread+0x164/0x4c0) [ 251.730000] [<c0036e5c>] (worker_thread) from [<c003b9f8>] (kthread+0xdc/0xf8) [ 251.730000] [<c003b9f8>] (kthread) from [<c000f038>] (ret_from_fork+0x14/0x3c) So tty_ldisc_hangup() seem to open a new mkiss line discipline that will bring back the ax0 interface on a new deleted serial device. I actually don't understand why tty_ldisc_open() is called. Is there anybody here ? Jean-Christian -- To unsubscribe from this list: send the line "unsubscribe linux-hams" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: AX25 mkiss interface not deleted when the serial port is removed 2015-09-28 23:13 ` Jean-Christian de Rivaz @ 2015-09-29 2:13 ` David Ranch 0 siblings, 0 replies; 25+ messages in thread From: David Ranch @ 2015-09-29 2:13 UTC (permalink / raw) To: Jean-Christian de Rivaz, linux-hams; +Cc: Thomas Osterried, Ralf Baechle Hello Jean-Christian, > Is there anybody here ? Yes.. we're here and good debugging work on your part. I think I've seen this before as well but since I use static udev rules, the ports re-enumerate back to where they were pretty quickly. Anyway, It seems we need to hear back from Ralf or Thomas to see what they think. --David ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: AX25 mkiss interface not deleted when the serial port is removed 2015-09-25 17:21 AX25 mkiss interface not deleted when the serial port is removed Jean-Christian de Rivaz 2015-09-28 20:06 ` Jean-Christian de Rivaz @ 2015-09-29 15:31 ` Ralf Baechle DL5RB 2015-09-29 15:46 ` Jean-Christian de Rivaz ` (2 more replies) 1 sibling, 3 replies; 25+ messages in thread From: Ralf Baechle DL5RB @ 2015-09-29 15:31 UTC (permalink / raw) To: Jean-Christian de Rivaz; +Cc: linux-hams On Fri, Sep 25, 2015 at 07:21:18PM +0200, Jean-Christian de Rivaz wrote: > On a embedded system we use AX25 over an USB serial port with the kissattach > command. For some hardware reason the microcontroller that act as a TNC and > USB CDC device can be rested while the system is running, causing a USB > disconnect of the USN CDC device and the removal of the corresponding serial > port in the kernel. But the ax0 interface is not removed in this case and > after a few seconds the kernel panic with the crash below: > > [<c0595468>] (skb_panic) from [<c0401f70>] (skb_push+0x4c/0x50) > [<c0401f70>] (skb_push) from [<bf0bdad4>] (ax25_hard_header+0x34/0xf4 > [ax25]) > [<bf0bdad4>] (ax25_hard_header [ax25]) from [<bf0d05d4>] > (ax_header+0x38/0x40 [mkiss]) > [<bf0d05d4>] (ax_header [mkiss]) from [<c041b584>] > (neigh_compat_output+0x8c/0xd8) > [<c041b584>] (neigh_compat_output) from [<c043e7a8>] > (ip_finish_output+0x2a0/0x914) > [<c043e7a8>] (ip_finish_output) from [<c043f948>] (ip_output+0xd8/0xf0) > [<c043f948>] (ip_output) from [<c043f04c>] (ip_local_out_sk+0x44/0x48) skb_push is invoked via a code path that was conceptually completly broken and has been rewritten recently. So it would be great if you could retest with a recent release version preferably 4.2. I suspect this is not the cause but I'd like to exclude the possibility. Thanks, Ralf ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: AX25 mkiss interface not deleted when the serial port is removed 2015-09-29 15:31 ` Ralf Baechle DL5RB @ 2015-09-29 15:46 ` Jean-Christian de Rivaz 2015-09-29 16:49 ` David Ranch 2015-09-30 0:29 ` Jean-Christian de Rivaz 2 siblings, 0 replies; 25+ messages in thread From: Jean-Christian de Rivaz @ 2015-09-29 15:46 UTC (permalink / raw) To: Ralf Baechle DL5RB; +Cc: linux-hams Le 29. 09. 15 17:31, Ralf Baechle DL5RB a écrit : > On Fri, Sep 25, 2015 at 07:21:18PM +0200, Jean-Christian de Rivaz wrote: > >> On a embedded system we use AX25 over an USB serial port with the kissattach >> command. For some hardware reason the microcontroller that act as a TNC and >> USB CDC device can be rested while the system is running, causing a USB >> disconnect of the USN CDC device and the removal of the corresponding serial >> port in the kernel. But the ax0 interface is not removed in this case and >> after a few seconds the kernel panic with the crash below: >> >> [<c0595468>] (skb_panic) from [<c0401f70>] (skb_push+0x4c/0x50) >> [<c0401f70>] (skb_push) from [<bf0bdad4>] (ax25_hard_header+0x34/0xf4 >> [ax25]) >> [<bf0bdad4>] (ax25_hard_header [ax25]) from [<bf0d05d4>] >> (ax_header+0x38/0x40 [mkiss]) >> [<bf0d05d4>] (ax_header [mkiss]) from [<c041b584>] >> (neigh_compat_output+0x8c/0xd8) >> [<c041b584>] (neigh_compat_output) from [<c043e7a8>] >> (ip_finish_output+0x2a0/0x914) >> [<c043e7a8>] (ip_finish_output) from [<c043f948>] (ip_output+0xd8/0xf0) >> [<c043f948>] (ip_output) from [<c043f04c>] (ip_local_out_sk+0x44/0x48) > skb_push is invoked via a code path that was conceptually completly broken > and has been rewritten recently. So it would be great if you could retest > with a recent release version preferably 4.2. Hi Ralf, I try since this morning to make the git linux master running on the system, but I hit many problems due to some changes on what is allowed in a device tree. I have solved some of them but now I have a spinlock kernel crash that I am completely unable to understand. I will report it to linux-arm-kernel in a moment. > I suspect this is not the > cause but I'd like to exclude the possibility. > I have read the patches between my kernel and git linux master affecting ax15 and mkiss, and I was unable to identify something related. Please read the previous message from me in the linux-hams mailing list, I suspect it contain a better code path where the problem could be. I copy it below: [ 251.460000] drivers/net/hamradio/mkiss.c:738 mkiss_open() [ 251.460000] CPU: 0 PID: 247 Comm: kworker/0:1 Tainted: G W 3.19.0-rc1+ #41 [ 251.500000] Hardware name: Atmel SAMA5 (Device Tree) [ 251.500000] Workqueue: usb_hub_wq hub_event [ 251.500000] [<c00149f4>] (unwind_backtrace) from [<c00124f8>] (show_stack+0x20/0x24) [ 251.550000] [<c00124f8>] (show_stack) from [<c0594700>] (dump_stack+0x20/0x28) [ 251.550000] [<c0594700>] (dump_stack) from [<bf093ed0>] (mkiss_open+0x34/0x2e8 [mkiss]) [ 251.570000] [<bf093ed0>] (mkiss_open [mkiss]) from [<c02d5b3c>] (tty_ldisc_open+0x54/0x94) [ 251.570000] [<c02d5b3c>] (tty_ldisc_open) from [<c02d63a8>] (tty_ldisc_hangup+0x190/0x1bc) [ 251.590000] [<c02d63a8>] (tty_ldisc_hangup) from [<c02cdc10>] (__tty_hangup+0x2b4/0x3ec) [ 251.590000] [<c02cdc10>] (__tty_hangup) from [<c02cdd88>] (tty_vhangup+0x1c/0x20) [ 251.610000] [<c02cdd88>] (tty_vhangup) from [<bf00f628>] (acm_disconnect+0xcc/0x1a8 [cdc_acm]) [ 251.610000] [<bf00f628>] (acm_disconnect [cdc_acm]) from [<c0377938>] (usb_unbind_interface+0x7c/0x288) [ 251.630000] [<c0377938>] (usb_unbind_interface) from [<c02f8548>] (__device_release_driver+0x80/0xd4) [ 251.630000] [<c02f8548>] (__device_release_driver) from [<c02f85c8>] (device_release_driver+0x2c/0x38) [ 251.650000] [<c02f85c8>] (device_release_driver) from [<c02f8148>] (bus_remove_device+0xe4/0x104) [ 251.650000] [<c02f8148>] (bus_remove_device) from [<c02f5830>] (device_del+0x108/0x1f0) [ 251.670000] [<c02f5830>] (device_del) from [<c03753d8>] (usb_disable_device+0xb0/0x1fc) [ 251.670000] [<c03753d8>] (usb_disable_device) from [<c036cb0c>] (usb_disconnect+0x74/0x250) [ 251.680000] [<c036cb0c>] (usb_disconnect) from [<c036e4b4>] (hub_event+0x4ec/0x1110) [ 251.710000] [<c036e4b4>] (hub_event) from [<c00369b0>] (process_one_work+0x120/0x424) [ 251.710000] [<c00369b0>] (process_one_work) from [<c0036e5c>] (worker_thread+0x164/0x4c0) [ 251.730000] [<c0036e5c>] (worker_thread) from [<c003b9f8>] (kthread+0xdc/0xf8) [ 251.730000] [<c003b9f8>] (kthread) from [<c000f038>] (ret_from_fork+0x14/0x3c) So tty_ldisc_hangup() seem to open a new mkiss line discipline that will bring back the ax0 interface on a new deleted serial device. Regards, Jean-Christian -- To unsubscribe from this list: send the line "unsubscribe linux-hams" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: AX25 mkiss interface not deleted when the serial port is removed 2015-09-29 15:31 ` Ralf Baechle DL5RB 2015-09-29 15:46 ` Jean-Christian de Rivaz @ 2015-09-29 16:49 ` David Ranch 2015-09-29 17:14 ` Ralf Baechle 2015-09-30 0:29 ` Jean-Christian de Rivaz 2 siblings, 1 reply; 25+ messages in thread From: David Ranch @ 2015-09-29 16:49 UTC (permalink / raw) To: Ralf Baechle DL5RB, Jean-Christian de Rivaz; +Cc: linux-hams Hello Ralf, Thanks so much for chiming into this thread. > skb_push is invoked via a code path that was conceptually completly broken > and has been rewritten recently. So it would be great if you could retest > with a recent release version preferably 4.2. I suspect this is not the > cause but I'd like to exclude the possibility. Do you know which commit brought in these specific improvements? I'd like to see if Linux kernel 4.1.8 has it. --David KI6ZHD ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: AX25 mkiss interface not deleted when the serial port is removed 2015-09-29 16:49 ` David Ranch @ 2015-09-29 17:14 ` Ralf Baechle 0 siblings, 0 replies; 25+ messages in thread From: Ralf Baechle @ 2015-09-29 17:14 UTC (permalink / raw) To: David Ranch; +Cc: Jean-Christian de Rivaz, linux-hams On Tue, Sep 29, 2015 at 09:49:27AM -0700, David Ranch wrote: > >skb_push is invoked via a code path that was conceptually completly broken > >and has been rewritten recently. So it would be great if you could retest > >with a recent release version preferably 4.2. I suspect this is not the > >cause but I'd like to exclude the possibility. > > Do you know which commit brought in these specific improvements? I'd like to > see if Linux kernel 4.1.8 has it. It's Eric Biederman's series 1d5da757da860a6916adbf68b09e868062b4b3b8 and preceeding commits. It's in kernel 4.1 and newer. Ralf ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: AX25 mkiss interface not deleted when the serial port is removed 2015-09-29 15:31 ` Ralf Baechle DL5RB 2015-09-29 15:46 ` Jean-Christian de Rivaz 2015-09-29 16:49 ` David Ranch @ 2015-09-30 0:29 ` Jean-Christian de Rivaz 2015-09-30 20:22 ` Thomas Osterried 2 siblings, 1 reply; 25+ messages in thread From: Jean-Christian de Rivaz @ 2015-09-30 0:29 UTC (permalink / raw) To: Ralf Baechle DL5RB, linux-hams; +Cc: linux-hams, thomas Le 29. 09. 15 17:31, Ralf Baechle DL5RB a écrit : > On Fri, Sep 25, 2015 at 07:21:18PM +0200, Jean-Christian de Rivaz wrote: > >> On a embedded system we use AX25 over an USB serial port with the kissattach >> command. For some hardware reason the microcontroller that act as a TNC and >> USB CDC device can be rested while the system is running, causing a USB >> disconnect of the USN CDC device and the removal of the corresponding serial >> port in the kernel. But the ax0 interface is not removed in this case and >> after a few seconds the kernel panic with the crash below: >> >> [<c0595468>] (skb_panic) from [<c0401f70>] (skb_push+0x4c/0x50) >> [<c0401f70>] (skb_push) from [<bf0bdad4>] (ax25_hard_header+0x34/0xf4 >> [ax25]) >> [<bf0bdad4>] (ax25_hard_header [ax25]) from [<bf0d05d4>] >> (ax_header+0x38/0x40 [mkiss]) >> [<bf0d05d4>] (ax_header [mkiss]) from [<c041b584>] >> (neigh_compat_output+0x8c/0xd8) >> [<c041b584>] (neigh_compat_output) from [<c043e7a8>] >> (ip_finish_output+0x2a0/0x914) >> [<c043e7a8>] (ip_finish_output) from [<c043f948>] (ip_output+0xd8/0xf0) >> [<c043f948>] (ip_output) from [<c043f04c>] (ip_local_out_sk+0x44/0x48) > skb_push is invoked via a code path that was conceptually completly broken > and has been rewritten recently. So it would be great if you could retest > with a recent release version preferably 4.2. I suspect this is not the > cause but I'd like to exclude the possibility. Hi Ralf, Unfortunately the kernel 4.3.0-rc3+ crash the same way. I suspect that the real question is: what in ax25/mkiss make tty_ldisc_hangup() calling tty_ldisc_open() on a serial port that no longer exists, and how to avoid this ? Regards, Jean-Christian -- To unsubscribe from this list: send the line "unsubscribe linux-hams" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: AX25 mkiss interface not deleted when the serial port is removed 2015-09-30 0:29 ` Jean-Christian de Rivaz @ 2015-09-30 20:22 ` Thomas Osterried 2015-09-30 23:04 ` Jean-Christian de Rivaz 0 siblings, 1 reply; 25+ messages in thread From: Thomas Osterried @ 2015-09-30 20:22 UTC (permalink / raw) To: Jean-Christian de Rivaz, David Ranch Cc: Ralf Baechle DL5RB, linux-hams, linux-hams Hello, > Unfortunately the kernel 4.3.0-rc3+ crash the same way. I suspect that the real question is: what in ax25/mkiss make tty_ldisc_hangup() calling tty_ldisc_open() on a serial port that no longer exists, and how to avoid this ? that’s a good question ;) We had a long debug sessions today. tty_ldisc_hangup() always calls tty_ldisc_reinit(tty, N_TTY)) and tty_ldisc_open(tty, tty->ldisc)). mkiss_close() is supposed to be called, because the old interface disappeared (we have the same number of ax25-interfaces bevor and after the plug). After the plug, mkiss_open() is called again, because the ldisc defines: static struct tty_ldisc_ops ax_ldisc = { … .open = mkiss_open, …} mkiss_open() pre-initializes the „new“ interface with defaults. It has no callsign, no broadcast address, the netrom-compatible MTU, and the IP address configuration has lost. Also, the interface has the state „DOWN“. Another indicator that the old interface has been removed and the new has disappeared is the interface identifier number which you could see with „ip addr“ (use this command before and after the plug - and mind the gap ;) Then we thought about what happens to slip interfaces if the usbserial-adapter is plugged off? The answer is: it has the same issue. kissattach and slattach do not get notified by the kernel: I strace’d them, but they did not get informed by signal. This would not solve the problem, but if it would have terminated, the bug would not have been triggered. And it’s easier to observe (while true; do kissattach … || echo mailx -s „kissattach has died - please plug in your usb-serial adapter“; sleep 10; done ). In my tests, the kernel did not panic automatically. The new interface had no IP Address. The bug is in the handler where an IP packet (i.E. a broadcast, arp, avahi, etc..) goes to the interface. In our case, the interface has lost the IP configuration and furthermore, it is down. But right after ifconfig ax0 44.128.128.1, my avahid on my test-system greeted the new world and the kernel panic’ed. The bug is in triggered in net/ax25/ax25_ip.c in ax25_hard_header(), where the driver adds the inteeface’s src- and dst/broadcast-address to the ip-packet it should send. It get’s the skb containing the data and reserves with skb_push room to place the ax25-header before. This always works under normal conditions, but it fails if the device is unplugged. The great question is: what happened to the skb? My trace showed me the output of skb_panic(), called by skb_under_panic, called by our skb_push(). skb->head is a normal address (and it changed through my reboot experiments). skb->data was indeed < than skb->head: off-by-one (it’s one byte lower). What causes the off-by-one? I still don’t know. Why it does work under normal conditions? Was the skb large enough and had enough space to the left? Currently, I assume the problem is caused due to the re-initialization. mkiss_setup() was called. It does some basic setup, and initializes addre_len and hard_header_len to 0. But in a normal initialization via kissattach, afterwards ioctl(…SIOCSIFENCAP …) is called. It fine-tunes addr_len and hard_header_len. I’m not sure, but perhaps something that passes the skb with the ip packet along honors the hard_header_len and the skb has enough room. The slip-Test does not cause a kernel panic (I tried sending pings via the re-initialized iface after plug-off the usb-serial adapter). This is, because slip does not have any lower hardware-layer, and thus there’s no code like our ax25_hard_header(). My resume: 1. I’d like to understand why ax25_hard_header()’s skb_push() fails on the new device. This should be fixed, but it does not fix the whole problem. 2. I wonder why spattach and kissattach should not be signaled that there’s a problem with the tty. And I wonder, why nobody implemented it before. 3. Obviously, we find the re-open-Problematic also for slip devices. Who likes to test it with ppp? ;) I argue, it’s a very old bug and nobody has wondered about. The only thing is you have an interface that does not work anymore and that has lost it’s traffic statistic.. PS: my next test on this is 6pack - I accept a bet ;)) vy 73, - Thomas dl9sau -- To unsubscribe from this list: send the line "unsubscribe linux-hams" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: AX25 mkiss interface not deleted when the serial port is removed 2015-09-30 20:22 ` Thomas Osterried @ 2015-09-30 23:04 ` Jean-Christian de Rivaz 2015-10-01 2:56 ` [PATCH 1/1] Force mkiss to reset the line discipline when serial device " jc 0 siblings, 1 reply; 25+ messages in thread From: Jean-Christian de Rivaz @ 2015-09-30 23:04 UTC (permalink / raw) To: Thomas Osterried, David Ranch; +Cc: Ralf Baechle DL5RB, linux-hams, linux-hams Le 30. 09. 15 22:22, Thomas Osterried a écrit : > Hello, > >> Unfortunately the kernel 4.3.0-rc3+ crash the same way. I suspect that the real question is: what in ax25/mkiss make tty_ldisc_hangup() calling tty_ldisc_open() on a serial port that no longer exists, and how to avoid this ? First, many many thanks for your time investigating this bug. > that’s a good question ;) > > We had a long debug sessions today. > > tty_ldisc_hangup() always calls tty_ldisc_reinit(tty, N_TTY)) and tty_ldisc_open(tty, tty->ldisc)). Conceptually, is there any good reason to call tty_ldisc_reinit() on a deleted device ? I would like to propose a check in tty_ldisc_hangup() to test if the underlying device still exists. If not the tty can be deleted. That said, I don't know enough this kernel part to assert that it's a good proposal or not. In case the call tty_ldisc_reinit() is really needed, it seem to be acceptable to use the N_TTY line discipline instead of N_AX25 one. My understanding is that this will prevent calling mkiss_open() right after the mkiss_close(). > mkiss_close() is supposed to be called, because the old interface disappeared (we have the same number of ax25-interfaces bevor and after the plug). I remember having traced this, and yes mkiss_close() is called before. I don't think there is any issue regarding this. > After the plug, mkiss_open() is called again, because the ldisc defines: > static struct tty_ldisc_ops ax_ldisc = { … .open = mkiss_open, …} This is certainly the strangest part. How is that possible since the tty_ldisc_reinit() is explicitly called with the N_TTY line discipline in argument ? I would expect that tty_ldisc_reinit() will call this instead: struct tty_ldisc_ops tty_ldisc_N_TTY = { ... .open = n_tty_open, ...} I was unable to catch a problem in tty_ldisc_reinit(tty, N_TTY) that correctly set tty->ldisc so that the tty_ldisc_open(tty, tty->ldisc) must call n_tty_open(). I now suspect that in tty_ldisc_hangup() the reset variable is zero so that tty_ldisc_reinit(tty, tty->termios.c_line) and tty_ldisc_open(tty, tty->ldisc) are called. At this point I think that the bug is that some code is missing to set tty->driver->flags |= TTY_DRIVER_RESET_TERMIOS before tty_ldisc_hangup() is called. I fully agree with all the rest of your message, but I believe it describes the logical consequences of an earlier problem. Regards, Jean-Christian -- To unsubscribe from this list: send the line "unsubscribe linux-hams" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 1/1] Force mkiss to reset the line discipline when serial device is removed 2015-09-30 23:04 ` Jean-Christian de Rivaz @ 2015-10-01 2:56 ` jc 2015-10-01 7:31 ` Ralf Baechle 0 siblings, 1 reply; 25+ messages in thread From: jc @ 2015-10-01 2:56 UTC (permalink / raw) To: Thomas Osterried, David Ranch; +Cc: Ralf Baechle DL5RB, linux-hams, linux-hams From: Jean-Christian de Rivaz <jc@eclis.ch> Without the TTY_DRIVER_RESET_TERMIOS in the tty driver flags, the ax_open() is called again and a new unconfigured interface replace the one that was just deleted. Since this parasit interface is not linked to any serial device, when a packet is send the kernel will panic. Signed-off-by: Jean-Christian de Rivaz <jc@eclis.ch> --- drivers/net/hamradio/mkiss.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/net/hamradio/mkiss.c b/drivers/net/hamradio/mkiss.c index 216bfd3..788f43e 100644 --- a/drivers/net/hamradio/mkiss.c +++ b/drivers/net/hamradio/mkiss.c @@ -612,6 +612,11 @@ static int ax_open(struct net_device *dev) ax->flags &= (1 << AXF_INUSE); /* Clear ESCAPE & ERROR flags */ + /* Will force N_TTY discipline in tty_ldisc_hangup() to avoid + * tty_ldisc_reinit() call ax_open() again when the serial + * device is removed */ + ax->tty->driver->flags |= TTY_DRIVER_RESET_TERMIOS; + spin_lock_init(&ax->buflock); return 0; -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] Force mkiss to reset the line discipline when serial device is removed 2015-10-01 2:56 ` [PATCH 1/1] Force mkiss to reset the line discipline when serial device " jc @ 2015-10-01 7:31 ` Ralf Baechle 2015-10-01 9:18 ` [PATCH v2 " Jean-Christian de Rivaz 0 siblings, 1 reply; 25+ messages in thread From: Ralf Baechle @ 2015-10-01 7:31 UTC (permalink / raw) To: jc; +Cc: Thomas Osterried, David Ranch, linux-hams, linux-hams On Thu, Oct 01, 2015 at 04:56:20AM +0200, jc@eclis.ch wrote: > From: Jean-Christian de Rivaz <jc@eclis.ch> > > Without the TTY_DRIVER_RESET_TERMIOS in the tty driver flags, the > ax_open() is called again and a new unconfigured interface replace the > one that was just deleted. Since this parasit interface is not linked > to any serial device, when a packet is send the kernel will panic. > > Signed-off-by: Jean-Christian de Rivaz <jc@eclis.ch> > --- > drivers/net/hamradio/mkiss.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/drivers/net/hamradio/mkiss.c b/drivers/net/hamradio/mkiss.c > index 216bfd3..788f43e 100644 > --- a/drivers/net/hamradio/mkiss.c > +++ b/drivers/net/hamradio/mkiss.c > @@ -612,6 +612,11 @@ static int ax_open(struct net_device *dev) > > ax->flags &= (1 << AXF_INUSE); /* Clear ESCAPE & ERROR flags */ > > + /* Will force N_TTY discipline in tty_ldisc_hangup() to avoid > + * tty_ldisc_reinit() call ax_open() again when the serial > + * device is removed */ > + ax->tty->driver->flags |= TTY_DRIVER_RESET_TERMIOS; TTY_DRIVER_RESET_TERMIOS is a flag for use by the device driver so I don't think this is ok. Ralf ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v2 1/1] Force mkiss to reset the line discipline when serial device is removed 2015-10-01 7:31 ` Ralf Baechle @ 2015-10-01 9:18 ` Jean-Christian de Rivaz 2015-10-01 16:56 ` Jean-Christian de Rivaz 0 siblings, 1 reply; 25+ messages in thread From: Jean-Christian de Rivaz @ 2015-10-01 9:18 UTC (permalink / raw) To: Thomas Osterried, David Ranch, Ralf Baechle DL5RB; +Cc: linux-hams, linux-hams Without the TTY_OTHER_CLOSED in the tty flags, the ax_open() is called again and a new unconfigured interface replace the one that was just deleted. Since this parasit interface is not linked to any serial device, when a packet is send the kernel will panic. Signed-off-by: Jean-Christian de Rivaz <jc@eclis.ch> --- drivers/net/hamradio/mkiss.c | 5 +++++ drivers/tty/tty_ldisc.c | 2 ++ 2 files changed, 7 insertions(+) diff --git a/drivers/net/hamradio/mkiss.c b/drivers/net/hamradio/mkiss.c index 216bfd3..fba41e9 100644 --- a/drivers/net/hamradio/mkiss.c +++ b/drivers/net/hamradio/mkiss.c @@ -612,6 +612,11 @@ static int ax_open(struct net_device *dev) ax->flags &= (1 << AXF_INUSE); /* Clear ESCAPE & ERROR flags */ + /* Will force N_TTY discipline in tty_ldisc_hangup() to avoid + * tty_ldisc_reinit() call ax_open() again when the serial + * device is removed */ + set_bit(TTY_OTHER_CLOSED, &ax->tty->flags); + spin_lock_init(&ax->buflock); return 0; diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c index 71750cb..7a24ab6 100644 --- a/drivers/tty/tty_ldisc.c +++ b/drivers/tty/tty_ldisc.c @@ -691,6 +691,8 @@ void tty_ldisc_hangup(struct tty_struct *tty) if (tty->ldisc) { + reset |= test_bit(TTY_OTHER_CLOSED, &tty->flags); + /* At this point we have a halted ldisc; we want to close it and reopen a new ldisc. We could defer the reopen to the next open but it means auditing a lot of other paths so this is -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Force mkiss to reset the line discipline when serial device is removed 2015-10-01 9:18 ` [PATCH v2 " Jean-Christian de Rivaz @ 2015-10-01 16:56 ` Jean-Christian de Rivaz 2015-10-01 22:57 ` Peter Hurley 0 siblings, 1 reply; 25+ messages in thread From: Jean-Christian de Rivaz @ 2015-10-01 16:56 UTC (permalink / raw) To: Greg Kroah-Hartman, Jiri Slaby Cc: Thomas Osterried, David Ranch, Ralf Baechle DL5RB, linux-hams, linux-hams, linux-kernel Hi Greg and Jiri, I try to fix a kernel panic bug related to the AX25 (and probably SLIP) line discipline when the corresponding serial device is removed [1]. I proposed some patches [2] [3] on the linux-hams mailing list but I think there raise more questions about how tty_ldisc_hangup() should work when a serial device is removed [4]. I actually see the following options: a) Let the specific line discipline set the TTY_DRIVER_RESET_TERMIOS flag in tty->driver as in [2] but this is suspected bad practice [5]. b) Let the specific line discipline set the TTY_OTHER_CLOSED flag in tty and check it in tty_ldisc_hangup() as in [3]. c) Let the specific line discipline set the TTY_LDISC_HALTED flag in tty and check it in tty_ldisc_hangup(). d) Let the specific line discipline set a new flag for that purpose, for example TTY_LDISC_RESET, and check it in tty_ldisc_hangup(). e) Close the tty earlier so that tty_ldisc_reinit() is not even called. Need some advise on how this should be done. f) That's all wrong, something other need to be changed. I would appreciate some comments from tty subsystem experts about this issue. [1] http://www.spinics.net/lists/linux-hams/msg03500.html [2] http://www.spinics.net/lists/linux-hams/msg03511.html [3] http://www.spinics.net/lists/linux-hams/msg03513.html [4] http://www.spinics.net/lists/linux-hams/msg03509.html [5] http://www.spinics.net/lists/linux-hams/msg03512.html Best Regards, Jean-Christian de Rivaz ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Force mkiss to reset the line discipline when serial device is removed 2015-10-01 16:56 ` Jean-Christian de Rivaz @ 2015-10-01 22:57 ` Peter Hurley 2015-10-02 8:30 ` Jean-Christian de Rivaz 0 siblings, 1 reply; 25+ messages in thread From: Peter Hurley @ 2015-10-01 22:57 UTC (permalink / raw) To: Jean-Christian de Rivaz Cc: Greg Kroah-Hartman, Jiri Slaby, Thomas Osterried, David Ranch, Ralf Baechle DL5RB, linux-hams, linux-hams, linux-kernel On 10/01/2015 12:56 PM, Jean-Christian de Rivaz wrote: > Hi Greg and Jiri, > > I try to fix a kernel panic bug related to the AX25 (and probably SLIP) line discipline when the corresponding serial device is removed [1]. I proposed some patches [2] [3] on the linux-hams mailing list but I think there raise more questions about how tty_ldisc_hangup() should work when a serial device is removed [4]. > > I actually see the following options: > > a) Let the specific line discipline set the TTY_DRIVER_RESET_TERMIOS flag in tty->driver as in [2] but this is suspected bad practice [5]. > > b) Let the specific line discipline set the TTY_OTHER_CLOSED flag in tty and check it in tty_ldisc_hangup() as in [3]. > > c) Let the specific line discipline set the TTY_LDISC_HALTED flag in tty and check it in tty_ldisc_hangup(). > > d) Let the specific line discipline set a new flag for that purpose, for example TTY_LDISC_RESET, and check it in tty_ldisc_hangup(). > > e) Close the tty earlier so that tty_ldisc_reinit() is not even called. Need some advise on how this should be done. > > f) That's all wrong, something other need to be changed. > > I would appreciate some comments from tty subsystem experts about this issue. > > [1] http://www.spinics.net/lists/linux-hams/msg03500.html The crash reported here appears to be related to how mkiss handles its netdev; maybe prematurely freeing the tx/rx buffers? I'd relook at how slip handles netdev teardown. I don't see a problem with the ACM tty/tty core side of this. At the time the hangup occurs, there is actually still an ACM tty device. The line discipline is reinited as a security precaution to prevent a previous session's data from being visible in the new session. The tty core does not know at the time the vhangup() occurs that the ACM driver plans to unregister the tty device. Don't do any of the things you suggest above. Regards, Peter Hurley ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Force mkiss to reset the line discipline when serial device is removed 2015-10-01 22:57 ` Peter Hurley @ 2015-10-02 8:30 ` Jean-Christian de Rivaz 2015-10-02 10:35 ` Thomas Osterried 0 siblings, 1 reply; 25+ messages in thread From: Jean-Christian de Rivaz @ 2015-10-02 8:30 UTC (permalink / raw) To: Peter Hurley Cc: Greg Kroah-Hartman, Jiri Slaby, Thomas Osterried, David Ranch, Ralf Baechle DL5RB, linux-hams, linux-hams, linux-kernel Le 02. 10. 15 00:57, Peter Hurley a écrit : > On 10/01/2015 12:56 PM, Jean-Christian de Rivaz wrote: >> Hi Greg and Jiri, >> >> I try to fix a kernel panic bug related to the AX25 (and probably SLIP) line discipline when the corresponding serial device is removed [1]. I proposed some patches [2] [3] on the linux-hams mailing list but I think there raise more questions about how tty_ldisc_hangup() should work when a serial device is removed [4]. >> >> I actually see the following options: >> >> a) Let the specific line discipline set the TTY_DRIVER_RESET_TERMIOS flag in tty->driver as in [2] but this is suspected bad practice [5]. >> >> b) Let the specific line discipline set the TTY_OTHER_CLOSED flag in tty and check it in tty_ldisc_hangup() as in [3]. >> >> c) Let the specific line discipline set the TTY_LDISC_HALTED flag in tty and check it in tty_ldisc_hangup(). >> >> d) Let the specific line discipline set a new flag for that purpose, for example TTY_LDISC_RESET, and check it in tty_ldisc_hangup(). >> >> e) Close the tty earlier so that tty_ldisc_reinit() is not even called. Need some advise on how this should be done. >> >> f) That's all wrong, something other need to be changed. >> >> I would appreciate some comments from tty subsystem experts about this issue. >> >> [1] http://www.spinics.net/lists/linux-hams/msg03500.html Hi Peter, thanks for your time, > The crash reported here appears to be related to how mkiss handles its netdev; > maybe prematurely freeing the tx/rx buffers? I'd relook at how slip handles > netdev teardown. Yes but this is a consequence of the fact that the ax0 interface was re-opened uninitialized while the corresponding serial device is no longer connected to the system. I don't see any rational to create this bogus interface: the serial device is gone. > I don't see a problem with the ACM tty/tty core side of this. > > At the time the hangup occurs, there is actually still an ACM tty device. Not physically, sorry. The physical serial device was unplugged front the system (or in hardware forced reset in the case of my test), causing a USB disconnect. It's important to understand that the USB disconnect has already occurred seconds before the crash. The fact that there is still an ACM tty structure in the kernel corresponding to nothing real is the cause of the problem. > The line discipline is reinited as a security precaution to prevent a previous > session's data from being visible in the new session. Pragmatically reinited to N_TTY is ok, this is in fact how my proposed patches work. But reinited to N_AX25 while the serial device is no more have no sense at all and cause the crash when the new uninitialized parasitic interface try to send a packet. > The tty core does not know > at the time the vhangup() occurs that the ACM driver plans to unregister the > tty device. That's the root problem: It must a least known that it must not call mkiss_open(). That's the bug that must be fixed. Or maybe the option e) fix must be developed. > Don't do any of the things you suggest above. > Can I ask what did you suggest to solve the problem ? The bug is real, causing a kernel panic and complete crash of the system, requiring a hardware reset to reboot. Best Regards, Jean-Christian de Rivaz -- To unsubscribe from this list: send the line "unsubscribe linux-hams" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Force mkiss to reset the line discipline when serial device is removed 2015-10-02 8:30 ` Jean-Christian de Rivaz @ 2015-10-02 10:35 ` Thomas Osterried 2015-10-02 13:48 ` Jean-Christian de Rivaz 0 siblings, 1 reply; 25+ messages in thread From: Thomas Osterried @ 2015-10-02 10:35 UTC (permalink / raw) To: Jean-Christian de Rivaz Cc: Peter Hurley, Greg Kroah-Hartman, Jiri Slaby, David Ranch, Ralf Bächle DL5RB, linux-hams, linux-hams, linux-kernel Hello, > Am 02.10.2015 um 10:30 schrieb Jean-Christian de Rivaz <jc@eclis.ch>: > > Le 02. 10. 15 00:57, Peter Hurley a écrit : >> On 10/01/2015 12:56 PM, Jean-Christian de Rivaz wrote: >>> Hi Greg and Jiri, >>> >>> I try to fix a kernel panic bug related to the AX25 (and probably SLIP) line discipline when the corresponding serial device is removed [1]. I proposed some patches [2] [3] on the linux-hams mailing list but I think there raise more questions about how tty_ldisc_hangup() should work when a serial device is removed [4]. >>> >>> I actually see the following options: >>> >>> a) Let the specific line discipline set the TTY_DRIVER_RESET_TERMIOS flag in tty->driver as in [2] but this is suspected bad practice [5]. >>> >>> b) Let the specific line discipline set the TTY_OTHER_CLOSED flag in tty and check it in tty_ldisc_hangup() as in [3]. If I understand correctly, in current kernels TTY_OTHER_DONE is introduced, instead of TTY_OTHER_CLOSED. >>> c) Let the specific line discipline set the TTY_LDISC_HALTED flag in tty and check it in tty_ldisc_hangup(). >>> >>> d) Let the specific line discipline set a new flag for that purpose, for example TTY_LDISC_RESET, and check it in tty_ldisc_hangup(). >>> >>> e) Close the tty earlier so that tty_ldisc_reinit() is not even called. Need some advise on how this should be done. >>> >>> f) That's all wrong, something other need to be changed. >>> >>> I would appreciate some comments from tty subsystem experts about this issue. >>> >>> [1] http://www.spinics.net/lists/linux-hams/msg03500.html > > Hi Peter, thanks for your time, > >> The crash reported here appears to be related to how mkiss handles its netdev; >> maybe prematurely freeing the tx/rx buffers? I'd relook at how slip handles >> netdev teardown. > > Yes but this is a consequence of the fact that the ax0 interface was re-opened uninitialized while the corresponding serial device is no longer connected to the system. I don’t see any rational to create this bogus interface: the serial device is gone. I also tried 6pack, and the traditional slip interface. The same thing happens - device reappears. I don’t think there’s a good reason for this, because after reinitialization, the iface is down, ip address and routes over it have disappeard. Thus it’s even not usable anymore as not-active-dummy-interface for ip/routes. I’ve not tested ppp and possible other line-based protocols - but I assume they’ve all the same issue, and nobody noticed before. Anyone likes to track down ppp’s behavior? Do we have a way to determine if the interface was re-initialized by the ldisc handler? Then we (and any other line based driver) could try to check in the .open call and decide what to do. But imho, it would always may cause problems when people write new drivers and oversee the not obvious situation where devices may reappear. The default should be not to call open again. I also wonder why userspace processes like kissattach do not get a signal by the kernel, indicating that the filedescriptor is not valid anymore. Who’s job would it be to signal, the serial driver’s (slip, ppp, mkiss, ..), or ldisc’s? >> I don't see a problem with the ACM tty/tty core side of this. >> >> At the time the hangup occurs, there is actually still an ACM tty device. > > Not physically, sorry. The physical serial device was unplugged front the system (or in hardware forced reset in the case of my test), causing a USB disconnect. It's important to understand that the USB disconnect has already occurred seconds before the crash. The fact that there is still an ACM tty structure in the kernel corresponding to nothing real is the cause of the problem. > >> The line discipline is reinited as a security precaution to prevent a previous >> session's data from being visible in the new session. > > Pragmatically reinited to N_TTY is ok, this is in fact how my proposed patches work. But reinited to N_AX25 while the serial device is no more have no sense at all and cause the crash when the new uninitialized parasitic interface try to send a packet. > >> The tty core does not know >> at the time the vhangup() occurs that the ACM driver plans to unregister the >> tty device. > > That’s the root problem: It must a least known that it must not call mkiss_open(). Or at least mkiss_open() must have a way to dectect that a re-open was initiated. But as said, I’d prefer it would not happen, because otherwise it depends on every serial protocol driver to implement it correctly. > That's the bug that must be fixed. Or maybe the option e) fix must be developed. > >> Don't do any of the things you suggest above. >> > > Can I ask what did you suggest to solve the problem ? The bug is real, causing a kernel panic and complete crash of the system, requiring a hardware reset to reboot. > > Best Regards, > Jean-Christian de Rivaz vy 73, - Thomas dl9sau-- To unsubscribe from this list: send the line "unsubscribe linux-hams" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Force mkiss to reset the line discipline when serial device is removed 2015-10-02 10:35 ` Thomas Osterried @ 2015-10-02 13:48 ` Jean-Christian de Rivaz 2015-10-02 17:25 ` Jean-Christian de Rivaz 2015-10-02 21:40 ` [PATCH 1/1] Close the file descriptor and exit when the kernel notice hangup Jean-Christian de Rivaz 0 siblings, 2 replies; 25+ messages in thread From: Jean-Christian de Rivaz @ 2015-10-02 13:48 UTC (permalink / raw) To: Thomas Osterried Cc: Peter Hurley, Greg Kroah-Hartman, Jiri Slaby, David Ranch, Ralf Bächle DL5RB, linux-hams, linux-hams, linux-kernel Le 02. 10. 15 12:35, Thomas Osterried a écrit : > Hello, > > >> Am 02.10.2015 um 10:30 schrieb Jean-Christian de Rivaz <jc@eclis.ch>: >> >> Le 02. 10. 15 00:57, Peter Hurley a écrit : >>> On 10/01/2015 12:56 PM, Jean-Christian de Rivaz wrote: >>>> Hi Greg and Jiri, >>>> >>>> I try to fix a kernel panic bug related to the AX25 (and probably SLIP) line discipline when the corresponding serial device is removed [1]. I proposed some patches [2] [3] on the linux-hams mailing list but I think there raise more questions about how tty_ldisc_hangup() should work when a serial device is removed [4]. >>>> >>>> I actually see the following options: >>>> >>>> a) Let the specific line discipline set the TTY_DRIVER_RESET_TERMIOS flag in tty->driver as in [2] but this is suspected bad practice [5]. >>>> >>>> b) Let the specific line discipline set the TTY_OTHER_CLOSED flag in tty and check it in tty_ldisc_hangup() as in [3]. > If I understand correctly, in current kernels TTY_OTHER_DONE is introduced, instead of TTY_OTHER_CLOSED. > >>>> c) Let the specific line discipline set the TTY_LDISC_HALTED flag in tty and check it in tty_ldisc_hangup(). >>>> >>>> d) Let the specific line discipline set a new flag for that purpose, for example TTY_LDISC_RESET, and check it in tty_ldisc_hangup(). >>>> >>>> e) Close the tty earlier so that tty_ldisc_reinit() is not even called. Need some advise on how this should be done. >>>> >>>> f) That's all wrong, something other need to be changed. >>>> >>>> I would appreciate some comments from tty subsystem experts about this issue. >>>> >>>> [1] http://www.spinics.net/lists/linux-hams/msg03500.html >> Hi Peter, thanks for your time, >> >>> The crash reported here appears to be related to how mkiss handles its netdev; >>> maybe prematurely freeing the tx/rx buffers? I'd relook at how slip handles >>> netdev teardown. >> Yes but this is a consequence of the fact that the ax0 interface was re-opened uninitialized while the corresponding serial device is no longer connected to the system. I don’t see any rational to create this bogus interface: the serial device is gone. > I also tried 6pack, and the traditional slip interface. The same thing happens - device reappears. > > I don’t think there’s a good reason for this, because after reinitialization, the iface is down, ip address and routes over it have disappeard. Thus it’s even not usable anymore as not-active-dummy-interface for ip/routes. > > I’ve not tested ppp and possible other line-based protocols - but I assume they’ve all the same issue, and nobody noticed before. Anyone likes to track down ppp’s behavior? > > Do we have a way to determine if the interface was re-initialized by the ldisc handler? Then we (and any other line based driver) could try to check in the .open call and decide what to do. > But imho, it would always may cause problems when people write new drivers and oversee the not obvious situation where devices may reappear. The default should be not to call open again. Fully agree on that. > I also wonder why userspace processes like kissattach do not get a signal by the kernel, indicating that the filedescriptor is not valid anymore. > Who’s job would it be to signal, the serial driver’s (slip, ppp, mkiss, ..), or ldisc’s? It's a complete other problem, not kernel related. The safety of the kernel cannot depend on a user application closing a file descriptor. Even if the user application close his file descriptor, process scheduling can make this delayed long enough to let's a packet reach the parasitic uninitialized interface and completely crash the system. This will at best only reduce the race window but do nothing to fix the real bug. That said, kissattach uses a while (1) { sleep(); } loop that can be cheaply replaced by a single old select() waiting on the file descriptor. My understanding is that after the the AX25 discipline is in place the only event that can happen is that the descriptor is to be closed. I will test a kissattach patch for this. AFAIK tty_ldisc_hangup() already signal EOF to the file descriptor owner with these lines: wake_up_interruptible_poll(&tty->write_wait, POLLOUT); wake_up_interruptible_poll(&tty->read_wait, POLLIN); >>> I don't see a problem with the ACM tty/tty core side of this. >>> >>> At the time the hangup occurs, there is actually still an ACM tty device. >> Not physically, sorry. The physical serial device was unplugged front the system (or in hardware forced reset in the case of my test), causing a USB disconnect. It's important to understand that the USB disconnect has already occurred seconds before the crash. The fact that there is still an ACM tty structure in the kernel corresponding to nothing real is the cause of the problem. >> >>> The line discipline is reinited as a security precaution to prevent a previous >>> session's data from being visible in the new session. >> Pragmatically reinited to N_TTY is ok, this is in fact how my proposed patches work. But reinited to N_AX25 while the serial device is no more have no sense at all and cause the crash when the new uninitialized parasitic interface try to send a packet. >> >>> The tty core does not know >>> at the time the vhangup() occurs that the ACM driver plans to unregister the >>> tty device. >> That’s the root problem: It must a least known that it must not call mkiss_open(). > Or at least mkiss_open() must have a way to dectect that a re-open was initiated. > But as said, I’d prefer it would not happen, because otherwise it depends on every serial protocol driver to implement it correctly. > Fully agree again. There is absolutely no doubt that the N_AX25 line discipline must not be open again when the serial device is removed. The fact is that tty_ldisc_hangup() is actually a mandatory path and that it call tty_ldisc_reinit() if a line discipline exists for the tty (alway true in this case). So there is at least the following options: A) Let's tty_ldisc_hangup() call tty_ldisc_reinit() but with N_TTY since the existing code already make that possible. This is how my patches work. Only have to agree on the condition/flag to be used. My patch rely on code in the line discipline driver, but something more cleaver could maybe done. B) Same as A) but make the corresponding serial driver responsible to set the TTY_DRIVER_RESET_TERMIOS flag in case the device is removed, since the existing code handle that case. Unless some generic code in the serial device layer can do that, this imply to modify all serial drivers. C) Modify tty_ldisc_hangup() to call tty_ldisc_close() in that case but still have to agree on the condition. The raise to me this question: when an application still have an open file descriptor on a removed serial device, how long the kernel tty structure is supposed to live ? 1) Only until the serial device removal, the file descriptor is an other structure. 2) Until the application close the file descriptor, even if it's days after the serial device has been removed. >> That's the bug that must be fixed. Or maybe the option e) fix must be developed. >> >>> Don't do any of the things you suggest above. >>> >> Can I ask what did you suggest to solve the problem ? The bug is real, causing a kernel panic and complete crash of the system, requiring a hardware reset to reboot. >> >> Best Regards, >> Jean-Christian de Rivaz > vy 73, > - Thomas dl9sau Jean-Christian ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Force mkiss to reset the line discipline when serial device is removed 2015-10-02 13:48 ` Jean-Christian de Rivaz @ 2015-10-02 17:25 ` Jean-Christian de Rivaz 2015-10-02 21:46 ` [PATCH 1/1] Add poll method to mkiss let notify hangup to the user process Jean-Christian de Rivaz 2015-10-02 21:40 ` [PATCH 1/1] Close the file descriptor and exit when the kernel notice hangup Jean-Christian de Rivaz 1 sibling, 1 reply; 25+ messages in thread From: Jean-Christian de Rivaz @ 2015-10-02 17:25 UTC (permalink / raw) To: Thomas Osterried, Greg Kroah-Hartman, Jiri Slaby Cc: Peter Hurley, David Ranch, Ralf Bächle DL5RB, linux-hams, linux-hams, linux-kernel Le 02. 10. 15 15:48, Jean-Christian de Rivaz a écrit : > Le 02. 10. 15 12:35, Thomas Osterried a écrit : > >> I also wonder why userspace processes like kissattach do not get a >> signal by the kernel, indicating that the filedescriptor is not valid >> anymore. >> Who’s job would it be to signal, the serial driver’s (slip, ppp, >> mkiss, ..), or ldisc’s? > > It's a complete other problem, not kernel related. The safety of the > kernel cannot depend on a user application closing a file descriptor. > Even if the user application close his file descriptor, process > scheduling can make this delayed long enough to let's a packet reach > the parasitic uninitialized interface and completely crash the system. > This will at best only reduce the race window but do nothing to fix > the real bug. That said, kissattach uses a while (1) { sleep(); } loop > that can be cheaply replaced by a single old select() waiting on the > file descriptor. My understanding is that after the the AX25 > discipline is in place the only event that can happen is that the > descriptor is to be closed. I will test a kissattach patch for this. > > AFAIK tty_ldisc_hangup() already signal EOF to the file descriptor > owner with these lines: > > wake_up_interruptible_poll(&tty->write_wait, POLLOUT); > wake_up_interruptible_poll(&tty->read_wait, POLLIN); > I was completely wrong on this. The kissattach application get no event at all. I tried with select() and poll(). You are right, something is missing in the kernel to notify EOF in the descriptor of a removed serial device when at least the N_AX25 line discipline is used. The EOF is notified correctly at least in the case of the N_TTY line discipline. So your question make sense: who must send the EOF ? Maybe it's the line discipline code. Greg, Jiri, can you give some hint ? Best Regards, Jean-Christian de Rivaz -- To unsubscribe from this list: send the line "unsubscribe linux-hams" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 1/1] Add poll method to mkiss let notify hangup to the user process. 2015-10-02 17:25 ` Jean-Christian de Rivaz @ 2015-10-02 21:46 ` Jean-Christian de Rivaz 2015-10-03 0:29 ` David Ranch 0 siblings, 1 reply; 25+ messages in thread From: Jean-Christian de Rivaz @ 2015-10-02 21:46 UTC (permalink / raw) To: Thomas Osterried, David Ranch, Ralf Baechle DL5RB, Greg Kroah-Hartman, Jiri Slaby, Peter Hurley Cc: linux-hams, linux-hams, linux-kernel Without this the application that use the mkiss line discipline have no way to terminate in case the corresponding serial device is removed, for example when a USB TNC is unplugged. The application must wait on poll(). The kissattach application must be patched to take advantage of this feature. Signed-off-by: Jean-Christian de Rivaz <jc@eclis.ch> --- drivers/net/hamradio/mkiss.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/drivers/net/hamradio/mkiss.c b/drivers/net/hamradio/mkiss.c index fba41e9..50cd36c 100644 --- a/drivers/net/hamradio/mkiss.c +++ b/drivers/net/hamradio/mkiss.c @@ -893,6 +893,20 @@ static long mkiss_compat_ioctl(struct tty_struct *tty, struct file *file, #endif /* + * Waiting until hangup is the only allowed operation. This is used + * to notify the application in case the serial deivce is removed + * (ex. USB). The tty_ldisc_hangup() will wake up the process. + */ +static unsigned int mkiss_poll(struct tty_struct *tty, struct file *file, + poll_table *wait) +{ + poll_wait(file, &tty->read_wait, wait); + poll_wait(file, &tty->write_wait, wait); + + return 0; +} + +/* * Handle the 'receiver data ready' interrupt. * This function is called by the 'tty_io' module in the kernel when * a block of data has been received, which can now be decapsulated @@ -969,6 +983,7 @@ static struct tty_ldisc_ops ax_ldisc = { #ifdef CONFIG_COMPAT .compat_ioctl = mkiss_compat_ioctl, #endif + .poll = mkiss_poll, .receive_buf = mkiss_receive_buf, .write_wakeup = mkiss_write_wakeup }; -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] Add poll method to mkiss let notify hangup to the user process. 2015-10-02 21:46 ` [PATCH 1/1] Add poll method to mkiss let notify hangup to the user process Jean-Christian de Rivaz @ 2015-10-03 0:29 ` David Ranch 2015-10-03 1:02 ` Jean-Christian de Rivaz 2016-02-11 19:14 ` Thomas Osterried 0 siblings, 2 replies; 25+ messages in thread From: David Ranch @ 2015-10-03 0:29 UTC (permalink / raw) To: Jean-Christian de Rivaz, Thomas Osterried, Ralf Baechle DL5RB, Greg Kroah-Hartman, Jiri Slaby, Peter Hurley Cc: linux-hams, linux-hams Hello Jean-Christian, Everyone, Thanks for working on this as I'm pretty sure I've bit hit by this panic as well though I wasn't able to reproduce it more readily. Anyway, if this is the proper fix moving forward, will the kernel not panic even if kissattach is not updated? Can you also include the needed patch for the kissattach program to complete this solution? --David On 10/02/2015 02:46 PM, Jean-Christian de Rivaz wrote: > Without this the application that use the mkiss line discipline have > no way to terminate in case the corresponding serial device is > removed, for example when a USB TNC is unplugged. The application must > wait on poll(). > > The kissattach application must be patched to take advantage of this > feature. > > Signed-off-by: Jean-Christian de Rivaz <jc@eclis.ch> > --- > drivers/net/hamradio/mkiss.c | 15 +++++++++++++++ > 1 file changed, 15 insertions(+) > > diff --git a/drivers/net/hamradio/mkiss.c b/drivers/net/hamradio/mkiss.c > index fba41e9..50cd36c 100644 > --- a/drivers/net/hamradio/mkiss.c > +++ b/drivers/net/hamradio/mkiss.c > @@ -893,6 +893,20 @@ static long mkiss_compat_ioctl(struct tty_struct *tty, struct file *file, > #endif > > /* > + * Waiting until hangup is the only allowed operation. This is used > + * to notify the application in case the serial deivce is removed > + * (ex. USB). The tty_ldisc_hangup() will wake up the process. > + */ > +static unsigned int mkiss_poll(struct tty_struct *tty, struct file *file, > + poll_table *wait) > +{ > + poll_wait(file, &tty->read_wait, wait); > + poll_wait(file, &tty->write_wait, wait); > + > + return 0; > +} > + > +/* > * Handle the 'receiver data ready' interrupt. > * This function is called by the 'tty_io' module in the kernel when > * a block of data has been received, which can now be decapsulated > @@ -969,6 +983,7 @@ static struct tty_ldisc_ops ax_ldisc = { > #ifdef CONFIG_COMPAT > .compat_ioctl = mkiss_compat_ioctl, > #endif > + .poll = mkiss_poll, > .receive_buf = mkiss_receive_buf, > .write_wakeup = mkiss_write_wakeup > }; ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] Add poll method to mkiss let notify hangup to the user process. 2015-10-03 0:29 ` David Ranch @ 2015-10-03 1:02 ` Jean-Christian de Rivaz 2016-02-11 19:14 ` Thomas Osterried 1 sibling, 0 replies; 25+ messages in thread From: Jean-Christian de Rivaz @ 2015-10-03 1:02 UTC (permalink / raw) To: David Ranch, Thomas Osterried, Ralf Baechle DL5RB, Greg Kroah-Hartman, Jiri Slaby, Peter Hurley Cc: linux-hams Le 03. 10. 15 02:29, David Ranch a écrit : > > Hello Jean-Christian, Everyone, > > Thanks for working on this as I'm pretty sure I've bit hit by this > panic as well though I wasn't able to reproduce it more readily. > Anyway, if this is the proper fix moving forward, will the kernel not > panic even if kissattach is not updated? Hello David, The kernel will still panic and crash the system. The patch purpose is only to allow hangup notification to the user space. It has the side effect to reduce the race window of the panic&crash only if the user application get the hangup notification and close the file descriptor. The window is reduced in that case, but the window still exists. Even with this patch and the user space application patch, the kernel is still unsafe, as a packet can still reach the parasitic AX25 interface in the time between the kiss_open() in the kernel and the close() in the user application. More important, as I stated before, safety of the kernel cannot depend on a user space application closing a file descriptor. A fix to prevent mkiss_open() call on a removed serial device is still needed. I proposed patches and options, but to date I only get two critics and no hint on how to solve the problem. The tty maintainers stay silent. > Can you also include the needed patch for the kissattach program to > complete this solution? Yes, it was posted a few minutes before in the linux-hams mailing list only because it don't concern the kernel mailing list: http://www.spinics.net/lists/linux-hams/msg03520.html Jean-Christian -- To unsubscribe from this list: send the line "unsubscribe linux-hams" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] Add poll method to mkiss let notify hangup to the user process. 2015-10-03 0:29 ` David Ranch 2015-10-03 1:02 ` Jean-Christian de Rivaz @ 2016-02-11 19:14 ` Thomas Osterried 1 sibling, 0 replies; 25+ messages in thread From: Thomas Osterried @ 2016-02-11 19:14 UTC (permalink / raw) To: David Ranch Cc: Jean-Christian de Rivaz, Thomas Osterried, Ralf Baechle DL5RB, Greg Kroah-Hartman, Jiri Slaby, Peter Hurley, linux-hams Hello, unfortunately, I cannot get my mail to Jean-Christian's ISP (Deferred: Connection timed out with mail.eclis.ch.). That's why I try it now on this list. In short: - Jean-Christian's patch depends on both: a patched kissattach program AND the mentioned kernel patch below. - Unfotunately, it does not work: kissattach does not terminate (in order to prevent the panic) and the kernel still panics. - If we get it work, 6pack.c sould also be patched the same way as mkiss.c. - 6pack.c does not panic. The reason is the pre-initializiation of some variables (6pack.c usefule values, mkiss.c zero). I tried to use the correct values from 6pack.c in mkiss.c and this fixes the panic. - I'd like Jean's patch to get running, because it's good if kissattach gets notified that the usb device is not present anymore. Only if it's terminating, /dev/ttyUSB0 will be freed (otherwise, someone who plugs off-and-in will have an unsable /dev/ttyUSB0 and a new /dev/ttyUSB1). Here's my mail to Jean-Christian whith my request to verify - now to everyone who likes to test. Hello, has anybody tried it? I've tried it on debian kernel 3.16.0-4-amd64 with a patched+recompiled mkiss.ko and the modified kissattach program. It does not work. kissattach does not get notified in his poll(). Thus still keeps alive after the usb plug-off event. Because it's not terminating, the mkiss race condition is not prevented. Interesting: if I plug usb of and then strace to the pid of kissattach, it terminates. Btw, when pre-initializing dev->hard_header_len = AX25_MAX_HEADER_LEN; dev->addr_len = AX25_ADDR_LEN; in ax_setup() (like sp_setup() in 6pack.c does), the kernel does not panic anymore. Anyway, it would be nice if kissattach could get notice if something with the ax25 interface happened and would to the appropriate action (termination) - something Jean-Christian intended with his patch. Also, I like the idea of the blocking poll() instead of the previous while (1) { sleep(... )} construct. Furthermore, it's still backwards-compatible to an unpatched mkiss.ko and 6pack.ko (tested). If we can manage his patch to work, keep in mind that we have also to patch 6pack.c, because the kissattach program is the same for spattach. Test scenario: plug-in the usb device kissattach /dev/ttyUSB0 ax0 ifconfig ax0 44.128.128.1 plug-off the usb device ps axwww| grep kissattach # -> still running ifconfig ax0 44.128.128.1 up ping 44.128.128.2 PANIC vy 73, - Thomas dl9sau On Fri, Oct 02, 2015 at 05:29:47PM -0700, David Ranch wrote: > > Hello Jean-Christian, Everyone, > > Thanks for working on this as I'm pretty sure I've bit hit by this panic as > well though I wasn't able to reproduce it more readily. Anyway, if this is > the proper fix moving forward, will the kernel not panic even if kissattach > is not updated? Can you also include the needed patch for the kissattach > program to complete this solution? > > --David > > > On 10/02/2015 02:46 PM, Jean-Christian de Rivaz wrote: > >Without this the application that use the mkiss line discipline have > >no way to terminate in case the corresponding serial device is > >removed, for example when a USB TNC is unplugged. The application must > >wait on poll(). > > > >The kissattach application must be patched to take advantage of this > >feature. > > > >Signed-off-by: Jean-Christian de Rivaz <jc@eclis.ch> > >--- > > drivers/net/hamradio/mkiss.c | 15 +++++++++++++++ > > 1 file changed, 15 insertions(+) > > > >diff --git a/drivers/net/hamradio/mkiss.c b/drivers/net/hamradio/mkiss.c > >index fba41e9..50cd36c 100644 > >--- a/drivers/net/hamradio/mkiss.c > >+++ b/drivers/net/hamradio/mkiss.c > >@@ -893,6 +893,20 @@ static long mkiss_compat_ioctl(struct tty_struct *tty, struct file *file, > > #endif > > /* > >+ * Waiting until hangup is the only allowed operation. This is used > >+ * to notify the application in case the serial deivce is removed > >+ * (ex. USB). The tty_ldisc_hangup() will wake up the process. > >+ */ > >+static unsigned int mkiss_poll(struct tty_struct *tty, struct file *file, > >+ poll_table *wait) > >+{ > >+ poll_wait(file, &tty->read_wait, wait); > >+ poll_wait(file, &tty->write_wait, wait); > >+ > >+ return 0; > >+} > >+ > >+/* > > * Handle the 'receiver data ready' interrupt. > > * This function is called by the 'tty_io' module in the kernel when > > * a block of data has been received, which can now be decapsulated > >@@ -969,6 +983,7 @@ static struct tty_ldisc_ops ax_ldisc = { > > #ifdef CONFIG_COMPAT > > .compat_ioctl = mkiss_compat_ioctl, > > #endif > >+ .poll = mkiss_poll, > > .receive_buf = mkiss_receive_buf, > > .write_wakeup = mkiss_write_wakeup > > }; > > -- > To unsubscribe from this list: send the line "unsubscribe linux-hams" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 1/1] Close the file descriptor and exit when the kernel notice hangup. 2015-10-02 13:48 ` Jean-Christian de Rivaz 2015-10-02 17:25 ` Jean-Christian de Rivaz @ 2015-10-02 21:40 ` Jean-Christian de Rivaz 1 sibling, 0 replies; 25+ messages in thread From: Jean-Christian de Rivaz @ 2015-10-02 21:40 UTC (permalink / raw) To: Thomas Osterried; +Cc: Ralf Bächle DL5RB, linux-hams, linux-hams Some serial device used to communicate with the TCN can be removed, for example when a USB TCN is unplugged. If this occure, the kernel will send a hangup event. The file descriptor is then useles and can only be closed. The application can then try to wait on a new serial device or simply terminate. Since kissattach was not designed to wait on a serial device, it safer to simply exit. The kernel mkiss.c must be patched to ad the poll method. This problem was identified when trying to solve a kernel crash: http://www.spinics.net/lists/linux-hams/msg03500.html Signed-off-by: Jean-Christian de Rivaz <jc@eclis.ch> --- kiss/kissattach.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/kiss/kissattach.c b/kiss/kissattach.c index e30ed05..bfcad61 100644 --- a/kiss/kissattach.c +++ b/kiss/kissattach.c @@ -10,6 +10,7 @@ #include <netdb.h> #include <syslog.h> #include <errno.h> +#include <poll.h> #include <config.h> @@ -383,9 +384,9 @@ int main(int argc, char *argv[]) close(1); close(2); - while (1) - sleep(10000); + struct pollfd fds[1] = { [0] = { .fd = fd, .events = POLLHUP } }; + poll(fds, 1, -1); - /* NOT REACHED */ + close(fd); return 0; } -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 25+ messages in thread
end of thread, other threads:[~2016-02-11 19:14 UTC | newest] Thread overview: 25+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-09-25 17:21 AX25 mkiss interface not deleted when the serial port is removed Jean-Christian de Rivaz 2015-09-28 20:06 ` Jean-Christian de Rivaz 2015-09-28 23:13 ` Jean-Christian de Rivaz 2015-09-29 2:13 ` David Ranch 2015-09-29 15:31 ` Ralf Baechle DL5RB 2015-09-29 15:46 ` Jean-Christian de Rivaz 2015-09-29 16:49 ` David Ranch 2015-09-29 17:14 ` Ralf Baechle 2015-09-30 0:29 ` Jean-Christian de Rivaz 2015-09-30 20:22 ` Thomas Osterried 2015-09-30 23:04 ` Jean-Christian de Rivaz 2015-10-01 2:56 ` [PATCH 1/1] Force mkiss to reset the line discipline when serial device " jc 2015-10-01 7:31 ` Ralf Baechle 2015-10-01 9:18 ` [PATCH v2 " Jean-Christian de Rivaz 2015-10-01 16:56 ` Jean-Christian de Rivaz 2015-10-01 22:57 ` Peter Hurley 2015-10-02 8:30 ` Jean-Christian de Rivaz 2015-10-02 10:35 ` Thomas Osterried 2015-10-02 13:48 ` Jean-Christian de Rivaz 2015-10-02 17:25 ` Jean-Christian de Rivaz 2015-10-02 21:46 ` [PATCH 1/1] Add poll method to mkiss let notify hangup to the user process Jean-Christian de Rivaz 2015-10-03 0:29 ` David Ranch 2015-10-03 1:02 ` Jean-Christian de Rivaz 2016-02-11 19:14 ` Thomas Osterried 2015-10-02 21:40 ` [PATCH 1/1] Close the file descriptor and exit when the kernel notice hangup Jean-Christian de Rivaz
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).