* Re: [2.6.20-rt8] "Neighbour table overflow." [not found] ` <Pine.LNX.4.63.0703210849570.534-pjC7uWgG1yLsj6RlE8knwQ@public.gmane.org> @ 2007-03-21 10:51 ` Guennadi Liakhovetski [not found] ` <Pine.LNX.4.63.0703211130140.534-pjC7uWgG1yLsj6RlE8knwQ@public.gmane.org> 0 siblings, 1 reply; 12+ messages in thread From: Guennadi Liakhovetski @ 2007-03-21 10:51 UTC (permalink / raw) To: Samuel Ortiz Cc: irda-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org, davem-fT/PcQaiUtIeIZ0/mPfg9Q, netdev-u79uwXL29TY76Z2rM5mHXA (Short recap for newly added to cc: netdev: I'm seeing an skb leak in 2.6.20 during an IrDA IrNET+ppp UDP test with periodic connection disruptions) On Wed, 21 Mar 2007, Guennadi Liakhovetski wrote: > On Tue, 20 Mar 2007, Guennadi Liakhovetski wrote: > > Ok, looks like all leaked skbuffs come from ip_append_data(), like this: > > (sock_alloc_send_skb+0x2c8/0x2e4) > (ip_append_data+0x7fc/0xa80) > (udp_sendmsg+0x248/0x68c) > (inet_sendmsg+0x60/0x64) > (sock_sendmsg+0xb4/0xe4) > r4 = C3CB4960 > (sys_sendto+0xc8/0xf0) > r4 = 00000000 > (sys_socketcall+0x168/0x1f0) > (ret_fast_syscall+0x0/0x2c) This call to sock_alloc_send_skb() in ip_append_data() is not from the inlined ip_ufo_append_data(), it is here: /* The last fragment gets additional space at tail. * Note, with MSG_MORE we overallocate on fragments, * because we have no idea what fragment will be * the last. */ if (datalen == length + fraggap) alloclen += rt->u.dst.trailer_len; if (transhdrlen) { skb = sock_alloc_send_skb(sk, alloclen + hh_len + 15, (flags & MSG_DONTWAIT), &err); } else { Then, I traced a couple of paths how such a skbuff, coming down from ip_append_data() and allocated above get freed (when they do): [<c0182380>] (__kfree_skb+0x0/0x170) from [<c0182514>] (kfree_skb+0x24/0x50) r5 = C332BC00 r4 = C332BC00 [<c01824f0>] (kfree_skb+0x0/0x50) from [<bf0fac58>] (irlap_update_nr_received+0x94/0xc8 [irda]) [<bf0fabc4>] (irlap_update_nr_received+0x0/0xc8 [irda]) from [<bf0fda98>] (irlap_state_nrm_p+0x530/0x7c0 [irda]) r7 = 00000001 r6 = C0367EC0 r5 = C332BC00 r4 = 00000000 [<bf0fd568>] (irlap_state_nrm_p+0x0/0x7c0 [irda]) from [<bf0fbd90>] (irlap_do_event+0x68/0x18c [irda]) [<bf0fbd28>] (irlap_do_event+0x0/0x18c [irda]) from [<bf1008cc>] (irlap_driver_rcv+0x1f0/0xd38 [irda]) [<bf1006dc>] (irlap_driver_rcv+0x0/0xd38 [irda]) from [<c01892c0>] (netif_receive_skb+0x244/0x338) [<c018907c>] (netif_receive_skb+0x0/0x338) from [<c0189468>] (process_backlog+0xb4/0x194) [<c01893b4>] (process_backlog+0x0/0x194) from [<c01895f8>] (net_rx_action+0xb0/0x210) [<c0189548>] (net_rx_action+0x0/0x210) from [<c0042f7c>] (ksoftirqd+0x108/0x1cc) [<c0042e74>] (ksoftirqd+0x0/0x1cc) from [<c0053614>] (kthread+0x10c/0x138) [<c0053508>] (kthread+0x0/0x138) from [<c003f918>] (do_exit+0x0/0x8b0) r8 = 00000000 r7 = 00000000 r6 = 00000000 r5 = 00000000 r4 = 00000000 and [<c0182380>] (__kfree_skb+0x0/0x170) from [<c0182514>] (kfree_skb+0x24/0x50) r5 = C03909E0 r4 = C1A97400 [<c01824f0>] (kfree_skb+0x0/0x50) from [<c0199bf8>] (pfifo_fast_enqueue+0xb4/0xd0) [<c0199b44>] (pfifo_fast_enqueue+0x0/0xd0) from [<c0188c30>] (dev_queue_xmit+0x17c/0x25c) r8 = C1A2DCE0 r7 = FFFFFFF4 r6 = C3393114 r5 = C03909E0 r4 = C3393000 [<c0188ab4>] (dev_queue_xmit+0x0/0x25c) from [<c01a7c18>] (ip_output+0x150/0x254) r7 = C3717120 r6 = C03909E0 r5 = 00000000 r4 = C1A2DCE0 [<c01a7ac8>] (ip_output+0x0/0x254) from [<c01a93d0>] (ip_push_pending_frames+0x368/0x4d4) [<c01a9068>] (ip_push_pending_frames+0x0/0x4d4) from [<c01c6954>] (udp_push_pending_frames+0x14c/0x310) [<c01c6808>] (udp_push_pending_frames+0x0/0x310) from [<c01c70d8>] (udp_sendmsg+0x5c0/0x690) [<c01c6b18>] (udp_sendmsg+0x0/0x690) from [<c01ceafc>] (inet_sendmsg+0x60/0x64) [<c01cea9c>] (inet_sendmsg+0x0/0x64) from [<c017c970>] (sock_sendmsg+0xb4/0xe4) r7 = C2CEFDF4 r6 = 00000064 r5 = C2CEFEA8 r4 = C3C94080 [<c017c8bc>] (sock_sendmsg+0x0/0xe4) from [<c017dd9c>] (sys_sendto+0xc8/0xf0) r7 = 00000064 r6 = C3571580 r5 = C2CEFEC4 r4 = 00000000 [<c017dcd4>] (sys_sendto+0x0/0xf0) from [<c017e654>] (sys_socketcall+0x168/0x1f0) [<c017e4ec>] (sys_socketcall+0x0/0x1f0) from [<c001ff40>] (ret_fast_syscall+0x0/0x2c) r5 = 00415344 r4 = 00000000 I would be greatful for any hints how I can identify which skbuff's get lost and why, and where and who should free them. I am not subscribed to netdev, please keep in cc. Thanks Guennadi --------------------------------- Guennadi Liakhovetski, Ph.D. DSA Daten- und Systemtechnik GmbH Pascalstr. 28 D-52076 Aachen Germany ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys-and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <Pine.LNX.4.63.0703211130140.534-pjC7uWgG1yLsj6RlE8knwQ@public.gmane.org>]
* Re: [2.6.20-rt8] "Neighbour table overflow." [not found] ` <Pine.LNX.4.63.0703211130140.534-pjC7uWgG1yLsj6RlE8knwQ@public.gmane.org> @ 2007-03-21 11:26 ` Samuel Ortiz 2007-03-21 12:01 ` [irda-users] " Guennadi Liakhovetski 0 siblings, 1 reply; 12+ messages in thread From: Samuel Ortiz @ 2007-03-21 11:26 UTC (permalink / raw) To: gl-o/hVf8ie6tKzQB+pC5nmwQ Cc: irda-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org, davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org, netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org On 3/21/2007, "Guennadi Liakhovetski" <gl-o/hVf8ie6tKzQB+pC5nmwQ@public.gmane.org> wrote: >(Short recap for newly added to cc: netdev: I'm seeing an skb leak in >2.6.20 during an IrDA IrNET+ppp UDP test with periodic connection >disruptions) > >On Wed, 21 Mar 2007, Guennadi Liakhovetski wrote: > >> On Tue, 20 Mar 2007, Guennadi Liakhovetski wrote: >> >> Ok, looks like all leaked skbuffs come from ip_append_data(), like this: >> >> (sock_alloc_send_skb+0x2c8/0x2e4) >> (ip_append_data+0x7fc/0xa80) >> (udp_sendmsg+0x248/0x68c) >> (inet_sendmsg+0x60/0x64) >> (sock_sendmsg+0xb4/0xe4) >> r4 = C3CB4960 >> (sys_sendto+0xc8/0xf0) >> r4 = 00000000 >> (sys_socketcall+0x168/0x1f0) >> (ret_fast_syscall+0x0/0x2c) > >This call to sock_alloc_send_skb() in ip_append_data() is not from the >inlined ip_ufo_append_data(), it is here: > > /* The last fragment gets additional space at tail. > * Note, with MSG_MORE we overallocate on fragments, > * because we have no idea what fragment will be > * the last. > */ > if (datalen == length + fraggap) > alloclen += rt->u.dst.trailer_len; > > if (transhdrlen) { > skb = sock_alloc_send_skb(sk, > alloclen + hh_len + 15, > (flags & MSG_DONTWAIT), &err); > } else { > >Then, I traced a couple of paths how such a skbuff, coming down from >ip_append_data() and allocated above get freed (when they do): > >[<c0182380>] (__kfree_skb+0x0/0x170) from [<c0182514>] (kfree_skb+0x24/0x50) > r5 = C332BC00 r4 = C332BC00 >[<c01824f0>] (kfree_skb+0x0/0x50) from [<bf0fac58>] (irlap_update_nr_received+0x94/0xc8 [irda]) >[<bf0fabc4>] (irlap_update_nr_received+0x0/0xc8 [irda]) from [<bf0fda98>] (irlap_state_nrm_p+0x530/0x7c0 [irda]) > r7 = 00000001 r6 = C0367EC0 r5 = C332BC00 r4 = 00000000 >[<bf0fd568>] (irlap_state_nrm_p+0x0/0x7c0 [irda]) from [<bf0fbd90>] (irlap_do_event+0x68/0x18c [irda]) >[<bf0fbd28>] (irlap_do_event+0x0/0x18c [irda]) from [<bf1008cc>] (irlap_driver_rcv+0x1f0/0xd38 [irda]) >[<bf1006dc>] (irlap_driver_rcv+0x0/0xd38 [irda]) from [<c01892c0>] (netif_receive_skb+0x244/0x338) >[<c018907c>] (netif_receive_skb+0x0/0x338) from [<c0189468>] (process_backlog+0xb4/0x194) >[<c01893b4>] (process_backlog+0x0/0x194) from [<c01895f8>] (net_rx_action+0xb0/0x210) >[<c0189548>] (net_rx_action+0x0/0x210) from [<c0042f7c>] (ksoftirqd+0x108/0x1cc) >[<c0042e74>] (ksoftirqd+0x0/0x1cc) from [<c0053614>] (kthread+0x10c/0x138) >[<c0053508>] (kthread+0x0/0x138) from [<c003f918>] (do_exit+0x0/0x8b0) > r8 = 00000000 r7 = 00000000 r6 = 00000000 r5 = 00000000 > r4 = 00000000 This is the IrDA RX path, so I doubt the corresponding skb ever got through ip_append_data(). The skb was allocated by your HW driver upon packet reception, then queued to the net input queue, and finally passed to the IrDA stack. Are you sure your tracing is correct ? >and > >[<c0182380>] (__kfree_skb+0x0/0x170) from [<c0182514>] (kfree_skb+0x24/0x50) > r5 = C03909E0 r4 = C1A97400 >[<c01824f0>] (kfree_skb+0x0/0x50) from [<c0199bf8>] (pfifo_fast_enqueue+0xb4/0xd0) >[<c0199b44>] (pfifo_fast_enqueue+0x0/0xd0) from [<c0188c30>] (dev_queue_xmit+0x17c/0x25c) > r8 = C1A2DCE0 r7 = FFFFFFF4 r6 = C3393114 r5 = C03909E0 > r4 = C3393000 >[<c0188ab4>] (dev_queue_xmit+0x0/0x25c) from [<c01a7c18>] (ip_output+0x150/0x254) > r7 = C3717120 r6 = C03909E0 r5 = 00000000 r4 = C1A2DCE0 >[<c01a7ac8>] (ip_output+0x0/0x254) from [<c01a93d0>] (ip_push_pending_frames+0x368/0x4d4) >[<c01a9068>] (ip_push_pending_frames+0x0/0x4d4) from [<c01c6954>] (udp_push_pending_frames+0x14c/0x310) >[<c01c6808>] (udp_push_pending_frames+0x0/0x310) from [<c01c70d8>] (udp_sendmsg+0x5c0/0x690) >[<c01c6b18>] (udp_sendmsg+0x0/0x690) from [<c01ceafc>] (inet_sendmsg+0x60/0x64) >[<c01cea9c>] (inet_sendmsg+0x0/0x64) from [<c017c970>] (sock_sendmsg+0xb4/0xe4) > r7 = C2CEFDF4 r6 = 00000064 r5 = C2CEFEA8 r4 = C3C94080 >[<c017c8bc>] (sock_sendmsg+0x0/0xe4) from [<c017dd9c>] (sys_sendto+0xc8/0xf0) > r7 = 00000064 r6 = C3571580 r5 = C2CEFEC4 r4 = 00000000 >[<c017dcd4>] (sys_sendto+0x0/0xf0) from [<c017e654>] (sys_socketcall+0x168/0x1f0) >[<c017e4ec>] (sys_socketcall+0x0/0x1f0) from [<c001ff40>] (ret_fast_syscall+0x0/0x2c) > r5 = 00415344 r4 = 00000000 This one is on the TX path, yes. However it got dropped and freed because your TX queue was full. Any idea in which situation does that happen ? >I would be greatful for any hints how I can identify which skbuff's get >lost and why, and where and who should free them. You're seeing skb leaks when cutting the ppp connection periodically, right ? Do you such leaks when not cutting the ppp connection ? If not, could you send me a kernel trace (with irda debug set to 5) when the ppp connection is shut down ? It would narrow down the problem a bit. I'm quite sure the leak is in the IrDA code rather than in the ppp or ipv4 one, hence the need for full irda debug... Cheers, Samuel. >I am not subscribed to netdev, please keep in cc. > >Thanks >Guennadi >--------------------------------- >Guennadi Liakhovetski, Ph.D. >DSA Daten- und Systemtechnik GmbH >Pascalstr. 28 >D-52076 Aachen >Germany ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys-and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [irda-users] [2.6.20-rt8] "Neighbour table overflow." 2007-03-21 11:26 ` Samuel Ortiz @ 2007-03-21 12:01 ` Guennadi Liakhovetski 2007-03-23 12:14 ` Guennadi Liakhovetski 0 siblings, 1 reply; 12+ messages in thread From: Guennadi Liakhovetski @ 2007-03-21 12:01 UTC (permalink / raw) To: Samuel Ortiz Cc: irda-users@lists.sourceforge.net, davem@davemloft.net, netdev@vger.kernel.org [-- Attachment #1: Type: TEXT/PLAIN, Size: 4848 bytes --] On Wed, 21 Mar 2007, Samuel Ortiz wrote: > On 3/21/2007, "Guennadi Liakhovetski" <gl@dsa-ac.de> wrote: > >> [<c0182380>] (__kfree_skb+0x0/0x170) from [<c0182514>] (kfree_skb+0x24/0x50) >> r5 = C332BC00 r4 = C332BC00 >> [<c01824f0>] (kfree_skb+0x0/0x50) from [<bf0fac58>] (irlap_update_nr_received+0x94/0xc8 [irda]) >> [<bf0fabc4>] (irlap_update_nr_received+0x0/0xc8 [irda]) from [<bf0fda98>] (irlap_state_nrm_p+0x530/0x7c0 [irda]) >> r7 = 00000001 r6 = C0367EC0 r5 = C332BC00 r4 = 00000000 >> [<bf0fd568>] (irlap_state_nrm_p+0x0/0x7c0 [irda]) from [<bf0fbd90>] (irlap_do_event+0x68/0x18c [irda]) >> [<bf0fbd28>] (irlap_do_event+0x0/0x18c [irda]) from [<bf1008cc>] (irlap_driver_rcv+0x1f0/0xd38 [irda]) >> [<bf1006dc>] (irlap_driver_rcv+0x0/0xd38 [irda]) from [<c01892c0>] (netif_receive_skb+0x244/0x338) >> [<c018907c>] (netif_receive_skb+0x0/0x338) from [<c0189468>] (process_backlog+0xb4/0x194) >> [<c01893b4>] (process_backlog+0x0/0x194) from [<c01895f8>] (net_rx_action+0xb0/0x210) >> [<c0189548>] (net_rx_action+0x0/0x210) from [<c0042f7c>] (ksoftirqd+0x108/0x1cc) >> [<c0042e74>] (ksoftirqd+0x0/0x1cc) from [<c0053614>] (kthread+0x10c/0x138) >> [<c0053508>] (kthread+0x0/0x138) from [<c003f918>] (do_exit+0x0/0x8b0) >> r8 = 00000000 r7 = 00000000 r6 = 00000000 r5 = 00000000 >> r4 = 00000000 > This is the IrDA RX path, so I doubt the corresponding skb ever got > through > ip_append_data(). The skb was allocated by your HW driver upon packet > reception, then queued to the net input queue, and finally passed to the > IrDA stack. Are you sure your tracing is correct ? I've added a bitfield to struct sk_buff: __u8 pkt_type:3, fclone:2, - ipvs_property:1; + ipvs_property:1, + trace_dbg:1; and I set itin ip_append_data() before sock_alloc_send_skb() is called. Then I check this bit in __kfree_skb(). The bit is set to 0 in __alloc_skb per memset(skb, 0, offsetof(struct sk_buff, truesize)); So, if it was a freshly allocated skb, the tracing should be correct. >> [<c0182380>] (__kfree_skb+0x0/0x170) from [<c0182514>] (kfree_skb+0x24/0x50) >> r5 = C03909E0 r4 = C1A97400 >> [<c01824f0>] (kfree_skb+0x0/0x50) from [<c0199bf8>] (pfifo_fast_enqueue+0xb4/0xd0) >> [<c0199b44>] (pfifo_fast_enqueue+0x0/0xd0) from [<c0188c30>] (dev_queue_xmit+0x17c/0x25c) >> r8 = C1A2DCE0 r7 = FFFFFFF4 r6 = C3393114 r5 = C03909E0 >> r4 = C3393000 >> [<c0188ab4>] (dev_queue_xmit+0x0/0x25c) from [<c01a7c18>] (ip_output+0x150/0x254) >> r7 = C3717120 r6 = C03909E0 r5 = 00000000 r4 = C1A2DCE0 >> [<c01a7ac8>] (ip_output+0x0/0x254) from [<c01a93d0>] (ip_push_pending_frames+0x368/0x4d4) >> [<c01a9068>] (ip_push_pending_frames+0x0/0x4d4) from [<c01c6954>] (udp_push_pending_frames+0x14c/0x310) >> [<c01c6808>] (udp_push_pending_frames+0x0/0x310) from [<c01c70d8>] (udp_sendmsg+0x5c0/0x690) >> [<c01c6b18>] (udp_sendmsg+0x0/0x690) from [<c01ceafc>] (inet_sendmsg+0x60/0x64) >> [<c01cea9c>] (inet_sendmsg+0x0/0x64) from [<c017c970>] (sock_sendmsg+0xb4/0xe4) >> r7 = C2CEFDF4 r6 = 00000064 r5 = C2CEFEA8 r4 = C3C94080 >> [<c017c8bc>] (sock_sendmsg+0x0/0xe4) from [<c017dd9c>] (sys_sendto+0xc8/0xf0) >> r7 = 00000064 r6 = C3571580 r5 = C2CEFEC4 r4 = 00000000 >> [<c017dcd4>] (sys_sendto+0x0/0xf0) from [<c017e654>] (sys_socketcall+0x168/0x1f0) >> [<c017e4ec>] (sys_socketcall+0x0/0x1f0) from [<c001ff40>] (ret_fast_syscall+0x0/0x2c) >> r5 = 00415344 r4 = 00000000 > This one is on the TX path, yes. However it got dropped and freed because > your TX queue was full. Any idea in which situation does that happen ? No. I can only describe what communication is running while ppp is disrupted - it's just some sort of udp mirror test - udp packets are sent one after another and mirrored back. >> I would be greatful for any hints how I can identify which skbuff's get >> lost and why, and where and who should free them. > You're seeing skb leaks when cutting the ppp connection periodically, > right ? Right > Do you such leaks when not cutting the ppp connection ? Looks like I don't. > If not, could you send me a kernel trace (with irda debug set to 5) when > the ppp connection is shut down ? It would narrow down the problem a bit. Attached bzipped... It's a complete log starting from irda up, running udp packets over the link, closing the link and bringing irda completely down. > I'm quite sure the leak is in the IrDA code rather than in the ppp or > ipv4 one, hence the need for full irda debug... Likely, yes. Why I am asking netdev guys for help is just because I have very little idea about the data flow in the network stack(s). And the more experienced eyes we have on the problem the sooner we might solve it, I hope... Thanks Guennadi --------------------------------- Guennadi Liakhovetski, Ph.D. DSA Daten- und Systemtechnik GmbH Pascalstr. 28 D-52076 Aachen Germany [-- Attachment #2: Type: APPLICATION/octet-stream, Size: 11010 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [irda-users] [2.6.20-rt8] "Neighbour table overflow." 2007-03-21 12:01 ` [irda-users] " Guennadi Liakhovetski @ 2007-03-23 12:14 ` Guennadi Liakhovetski 2007-03-24 0:36 ` Samuel Ortiz ` (2 more replies) 0 siblings, 3 replies; 12+ messages in thread From: Guennadi Liakhovetski @ 2007-03-23 12:14 UTC (permalink / raw) To: Samuel Ortiz Cc: irda-users@lists.sourceforge.net, davem@davemloft.net, netdev@vger.kernel.org, Paul Mackerras On Wed, 21 Mar 2007, Guennadi Liakhovetski wrote: > On Wed, 21 Mar 2007, Samuel Ortiz wrote: > >> I'm quite sure the leak is in the IrDA code rather than in the ppp or >> ipv4 one, hence the need for full irda debug... Well, looks like you were wrong, Samuel. Below is a patch that fixes ONE sk_buff leak (maintainer added to cc: hi, Paul:-)). Still investigating if there are more there. Thanks Guennadi --------------------------------- Guennadi Liakhovetski, Ph.D. DSA Daten- und Systemtechnik GmbH Pascalstr. 28 D-52076 Aachen Germany Don't leak an sk_buff on interface destruction. Signed-off-by: G. Liakhovetski <gl@dsa-ac.de> --- a/drivers/net/ppp_generic.c 2007-03-23 13:04:04.000000000 +0100 +++ b/drivers/net/ppp_generic.c 2007-03-23 13:05:29.000000000 +0100 @@ -2544,6 +2544,9 @@ ppp->active_filter = NULL; #endif /* CONFIG_PPP_FILTER */ + if (ppp->xmit_pending) + kfree_skb(ppp->xmit_pending); + kfree(ppp); } ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [irda-users] [2.6.20-rt8] "Neighbour table overflow." 2007-03-23 12:14 ` Guennadi Liakhovetski @ 2007-03-24 0:36 ` Samuel Ortiz 2007-03-26 8:22 ` Guennadi Liakhovetski 2007-03-25 5:10 ` David Miller 2007-03-26 1:01 ` [irda-users] " Paul Mackerras 2 siblings, 1 reply; 12+ messages in thread From: Samuel Ortiz @ 2007-03-24 0:36 UTC (permalink / raw) To: Guennadi Liakhovetski Cc: irda-users@lists.sourceforge.net, Paul Mackerras, davem@davemloft.net, netdev@vger.kernel.org On Fri, Mar 23, 2007 at 01:14:43PM +0100, Guennadi Liakhovetski wrote: > On Wed, 21 Mar 2007, Guennadi Liakhovetski wrote: > > > On Wed, 21 Mar 2007, Samuel Ortiz wrote: > > > >> I'm quite sure the leak is in the IrDA code rather than in the ppp or > >> ipv4 one, hence the need for full irda debug... > > Well, looks like you were wrong, Samuel. Heh, it's good to be wrong sometimes :-) > Below is a patch that fixes ONE > sk_buff leak (maintainer added to cc: hi, Paul:-)). Still investigating if > there are more there. Are you still seeing the skb cache growing with your fix ? Cheers, Samuel. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [irda-users] [2.6.20-rt8] "Neighbour table overflow." 2007-03-24 0:36 ` Samuel Ortiz @ 2007-03-26 8:22 ` Guennadi Liakhovetski 0 siblings, 0 replies; 12+ messages in thread From: Guennadi Liakhovetski @ 2007-03-26 8:22 UTC (permalink / raw) To: Samuel Ortiz Cc: irda-users@lists.sourceforge.net, Paul Mackerras, davem@davemloft.net, netdev@vger.kernel.org On Sat, 24 Mar 2007, Samuel Ortiz wrote: > On Fri, Mar 23, 2007 at 01:14:43PM +0100, Guennadi Liakhovetski wrote: > >> Below is a patch that fixes ONE >> sk_buff leak (maintainer added to cc: hi, Paul:-)). Still investigating if >> there are more there. > Are you still seeing the skb cache growing with your fix ? No, running stable so far. Thanks Guennadi --------------------------------- Guennadi Liakhovetski, Ph.D. DSA Daten- und Systemtechnik GmbH Pascalstr. 28 D-52076 Aachen Germany ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [irda-users] [2.6.20-rt8] "Neighbour table overflow." 2007-03-23 12:14 ` Guennadi Liakhovetski 2007-03-24 0:36 ` Samuel Ortiz @ 2007-03-25 5:10 ` David Miller [not found] ` <20070324.221034.45741983.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> 2007-03-26 1:02 ` Paul Mackerras 2007-03-26 1:01 ` [irda-users] " Paul Mackerras 2 siblings, 2 replies; 12+ messages in thread From: David Miller @ 2007-03-25 5:10 UTC (permalink / raw) To: gl; +Cc: samuel, irda-users, netdev, paulus From: Guennadi Liakhovetski <gl@dsa-ac.de> Date: Fri, 23 Mar 2007 13:14:43 +0100 (CET) > On Wed, 21 Mar 2007, Guennadi Liakhovetski wrote: > > > On Wed, 21 Mar 2007, Samuel Ortiz wrote: > > > >> I'm quite sure the leak is in the IrDA code rather than in the ppp or > >> ipv4 one, hence the need for full irda debug... > > Well, looks like you were wrong, Samuel. Below is a patch that fixes ONE > sk_buff leak (maintainer added to cc: hi, Paul:-)). Still investigating if > there are more there. This is strange. The PPP generic layer seems to be very careful about it's handling of the ->xmit_pending packet. When a packet is added to ->xmit_pending, immediately ppp_push() is called, and ppp_push() gives the packet to the channels xmit function, and unless the xmit function returns zero the ->xmit_pending is reset to NULL because non-zero return from the channel xmit functions means that the driver took the packet. Now I checked irnet_ppp.c, which is the driver under scrutiny here, and it never ever returns zero, under any circumstance, it always return one. So the ->xmit_pending should always be NULL'd out by ppp_push(). There is some funny BLOCK_WHEN_CONNECT code, which will return 0 in certain cases, but that define it never set during the build. Nevermind... that code does get enabled. :( This looks like it might be a bug, perhaps you can only return zero from the transmit function when your queue really is full and you plan to wake things up properly when space appears (via ppp_output_wakeup()). You can't return 0 because of an event which might never occur, that's what makes ->xmit_pending get stuck and leak. I'm really surprised this leak doesn't trigger already via the ppp_synctty.c and ppp_async.c drivers, perhaps they do something to make sure the transmitter gets purged properly when unregistering and therefore ->xmit_pending does not get left non-NULL at unregister time. ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <20070324.221034.45741983.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>]
* Re: [2.6.20-rt8] "Neighbour table overflow." [not found] ` <20070324.221034.45741983.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> @ 2007-03-25 23:50 ` Samuel Ortiz 2007-03-26 1:36 ` [irda-users] " David Miller 0 siblings, 1 reply; 12+ messages in thread From: Samuel Ortiz @ 2007-03-25 23:50 UTC (permalink / raw) To: David Miller Cc: gl-o/hVf8ie6tKzQB+pC5nmwQ, netdev-u79uwXL29TY76Z2rM5mHXA, paulus-eUNUBHrolfbYtjvyW6yDsg, irda-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f On Sat, Mar 24, 2007 at 10:10:34PM -0700, David Miller wrote: > From: Guennadi Liakhovetski <gl-o/hVf8ie6tKzQB+pC5nmwQ@public.gmane.org> > Date: Fri, 23 Mar 2007 13:14:43 +0100 (CET) > > > On Wed, 21 Mar 2007, Guennadi Liakhovetski wrote: > > > > > On Wed, 21 Mar 2007, Samuel Ortiz wrote: > > > > > >> I'm quite sure the leak is in the IrDA code rather than in the ppp or > > >> ipv4 one, hence the need for full irda debug... > > > > Well, looks like you were wrong, Samuel. Below is a patch that fixes ONE > > sk_buff leak (maintainer added to cc: hi, Paul:-)). Still investigating if > > there are more there. > > This is strange. > > The PPP generic layer seems to be very careful about it's handling of > the ->xmit_pending packet. > > When a packet is added to ->xmit_pending, immediately ppp_push() is > called, and ppp_push() gives the packet to the channels xmit function, > and unless the xmit function returns zero the ->xmit_pending is reset > to NULL because non-zero return from the channel xmit functions means > that the driver took the packet. > > Now I checked irnet_ppp.c, which is the driver under scrutiny here, > and it never ever returns zero, under any circumstance, it always > return one. > > So the ->xmit_pending should always be NULL'd out by ppp_push(). > > There is some funny BLOCK_WHEN_CONNECT code, which will return > 0 in certain cases, but that define it never set during the > build. > > Nevermind... that code does get enabled. :( It is enabled, yes, from net/irda/irnet/irnet.h It seems Jean added this option to make the connectdelay 0 pppd option working. > This looks like it might be a bug, perhaps you can only return zero > from the transmit function when your queue really is full and you plan > to wake things up properly when space appears (via > ppp_output_wakeup()). You can't return 0 because of an event which > might never occur, that's what makes ->xmit_pending get stuck and > leak. I still think Guennadi's fix is correct even if you return 0 from the TX function only when you're running out of space. If we unregister the ppp interface before we get a chance to call ppp_output_wake(), then we'll leak an xmit_pending skb. I guess Paul will decide here. If he rejects Guennadi's patch, I'll come up with some IrDA patch so that we free our pending skb from the irnet code, or try somehow not to block the PPP tx queue while connecting. Cheers, Samuel. ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys-and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [irda-users] [2.6.20-rt8] "Neighbour table overflow." 2007-03-25 23:50 ` Samuel Ortiz @ 2007-03-26 1:36 ` David Miller 0 siblings, 0 replies; 12+ messages in thread From: David Miller @ 2007-03-26 1:36 UTC (permalink / raw) To: samuel; +Cc: gl, irda-users, paulus, netdev From: Samuel Ortiz <samuel@sortiz.org> Date: Mon, 26 Mar 2007 02:50:59 +0300 > I still think Guennadi's fix is correct even if you return 0 from the TX > function only when you're running out of space. If we unregister the ppp > interface before we get a chance to call ppp_output_wake(), then we'll leak > an xmit_pending skb. I agree, especially after thinking about this some more. It's just really weird that we've never hit this before, since this logic has been there forever, that's all :-) ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [irda-users] [2.6.20-rt8] "Neighbour table overflow." 2007-03-25 5:10 ` David Miller [not found] ` <20070324.221034.45741983.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> @ 2007-03-26 1:02 ` Paul Mackerras [not found] ` <17927.7070.803210.600520-UYQwCShxghk5kJ7NmlRacFaTQe2KTcn/@public.gmane.org> 1 sibling, 1 reply; 12+ messages in thread From: Paul Mackerras @ 2007-03-26 1:02 UTC (permalink / raw) To: David Miller; +Cc: gl, samuel, irda-users, netdev David Miller writes: > The PPP generic layer seems to be very careful about it's handling of > the ->xmit_pending packet. Mostly, but I think that this is a genuine leak. > I'm really surprised this leak doesn't trigger already via the > ppp_synctty.c and ppp_async.c drivers, perhaps they do something to > make sure the transmitter gets purged properly when unregistering and > therefore ->xmit_pending does not get left non-NULL at unregister > time. At least for ppp_async, I think what saves us is that pppd does a TCFLSH ioctl when shutting down the link, which calls ppp_async_flush_output, which calls ppp_output_wakeup. Paul. ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <17927.7070.803210.600520-UYQwCShxghk5kJ7NmlRacFaTQe2KTcn/@public.gmane.org>]
* Re: [2.6.20-rt8] "Neighbour table overflow." [not found] ` <17927.7070.803210.600520-UYQwCShxghk5kJ7NmlRacFaTQe2KTcn/@public.gmane.org> @ 2007-03-26 1:37 ` David Miller 0 siblings, 0 replies; 12+ messages in thread From: David Miller @ 2007-03-26 1:37 UTC (permalink / raw) To: paulus-eUNUBHrolfbYtjvyW6yDsg Cc: gl-o/hVf8ie6tKzQB+pC5nmwQ, samuel-jcdQHdrhKHMdnm+yROfE0A, netdev-u79uwXL29TY76Z2rM5mHXA, irda-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f From: Paul Mackerras <paulus-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org> Date: Mon, 26 Mar 2007 11:02:22 +1000 > David Miller writes: > > > The PPP generic layer seems to be very careful about it's handling of > > the ->xmit_pending packet. > > Mostly, but I think that this is a genuine leak. > > > I'm really surprised this leak doesn't trigger already via the > > ppp_synctty.c and ppp_async.c drivers, perhaps they do something to > > make sure the transmitter gets purged properly when unregistering and > > therefore ->xmit_pending does not get left non-NULL at unregister > > time. > > At least for ppp_async, I think what saves us is that pppd does a > TCFLSH ioctl when shutting down the link, which calls > ppp_async_flush_output, which calls ppp_output_wakeup. Ok, that makes sense, I'll apply the leak fix for sure especially since I even have your ACK now :-) Thanks for reviewing Paul. ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys-and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [irda-users] [2.6.20-rt8] "Neighbour table overflow." 2007-03-23 12:14 ` Guennadi Liakhovetski 2007-03-24 0:36 ` Samuel Ortiz 2007-03-25 5:10 ` David Miller @ 2007-03-26 1:01 ` Paul Mackerras 2 siblings, 0 replies; 12+ messages in thread From: Paul Mackerras @ 2007-03-26 1:01 UTC (permalink / raw) To: Guennadi Liakhovetski Cc: Samuel Ortiz, irda-users@lists.sourceforge.net, davem@davemloft.net, netdev@vger.kernel.org, akpm Guennadi Liakhovetski writes: > Don't leak an sk_buff on interface destruction. > > Signed-off-by: G. Liakhovetski <gl@dsa-ac.de> Acked-by: Paul Mackerras <paulus@samba.org> ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2007-03-26 8:22 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <mtHxbRRF.1174407786.3649450.samuel@sortiz.org>
[not found] ` <Pine.LNX.4.63.0703201759390.534@pcgl.dsa-ac.de>
[not found] ` <Pine.LNX.4.63.0703210849570.534@pcgl.dsa-ac.de>
[not found] ` <Pine.LNX.4.63.0703210849570.534-pjC7uWgG1yLsj6RlE8knwQ@public.gmane.org>
2007-03-21 10:51 ` [2.6.20-rt8] "Neighbour table overflow." Guennadi Liakhovetski
[not found] ` <Pine.LNX.4.63.0703211130140.534-pjC7uWgG1yLsj6RlE8knwQ@public.gmane.org>
2007-03-21 11:26 ` Samuel Ortiz
2007-03-21 12:01 ` [irda-users] " Guennadi Liakhovetski
2007-03-23 12:14 ` Guennadi Liakhovetski
2007-03-24 0:36 ` Samuel Ortiz
2007-03-26 8:22 ` Guennadi Liakhovetski
2007-03-25 5:10 ` David Miller
[not found] ` <20070324.221034.45741983.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2007-03-25 23:50 ` Samuel Ortiz
2007-03-26 1:36 ` [irda-users] " David Miller
2007-03-26 1:02 ` Paul Mackerras
[not found] ` <17927.7070.803210.600520-UYQwCShxghk5kJ7NmlRacFaTQe2KTcn/@public.gmane.org>
2007-03-26 1:37 ` David Miller
2007-03-26 1:01 ` [irda-users] " Paul Mackerras
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).