* Re: [Bugme-new] [Bug 11469] New: TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash [not found] <bug-11469-10286@http.bugzilla.kernel.org/> @ 2008-08-31 18:13 ` Andrew Morton 2008-09-05 11:41 ` Evgeniy Polyakov 2008-09-07 18:11 ` Evgeniy Polyakov 0 siblings, 2 replies; 13+ messages in thread From: Andrew Morton @ 2008-08-31 18:13 UTC (permalink / raw) To: netdev; +Cc: bugme-daemon, rdenis (switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface). On Sun, 31 Aug 2008 09:44:36 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=11469 > > Summary: TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash > Product: Networking > Version: 2.5 > KernelVersion: 2.6.26.3 > Platform: All > OS/Version: Linux > Tree: Mainline > Status: NEW > Severity: normal > Priority: P1 > Component: IPV6 > AssignedTo: yoshfuji@linux-ipv6.org > ReportedBy: rdenis@simphalempin.com > > > Latest working kernel version: none known > Earliest failing kernel version: none tested > Distribution: Debian > Hardware Environment: Intel(R) Pentium(R) 4 CPU 2.80GHz, HyperThreaded > Software Environment: SMT kernel, Debian glibc 2.7 > Problem Description: > When an IFF_TUN (/dev/net/tun) device has more than 1023 IPv6 neighbors, a > process context crash occurs. Backtrace follows: > > BUG: unable to handle kernel NULL pointer dereference at 0000001d > IP: [<f8b375bf>] :ipv6:ip6_dst_lookup_tail+0x95/0x15a > *pde = 00000000 > Oops: 0000 [#14] SMP > Modules linked in: ipx p8022 psnap llc p8023 i915 drm tun cpufreq_ondemand > binfmt_misc fuse nf_conntrack_ftp nf_conntrack_ipv6 nf_conntrack_ipv4 > nf_conntrack ipv6 snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm_oss > snd_mixer_oss snd_pcm snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event > snd_seq snd_timer snd_seq_device snd intel_agp psmouse soundcore agpgart button > processor snd_page_alloc parport_pc parport iTCO_wdt evdev pcspkr dm_mirror > dm_log dm_snapshot dm_mod sg sr_mod cdrom e100 mii ehci_hcd uhci_hcd usbcore > unix > > Pid: 9950, comm: tunload Tainted: G D (2.6.26.3 #8) > EIP: 0060:[<f8b375bf>] EFLAGS: 00210246 CPU: 0 > EIP is at ip6_dst_lookup_tail+0x95/0x15a [ipv6] > EAX: 00000000 EBX: 00000000 ECX: ef4abdac EDX: 00000000 > ESI: ef4abd3c EDI: ef64ca00 EBP: ef4abcb8 ESP: ef4abc64 > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 > Process tunload (pid: 9950, ti=ef4aa000 task=f7d45320 task.ti=ef4aa000) > Stack: ef4abd58 ef4abdac f7cc0c00 ef4abc80 f8b36918 00000000 ef673e40 ef4abcc0 > f8b381b2 00000002 f7cc0c00 ef7c3e00 f7cc0e24 00000000 ef4abca8 ef4abca8 > c030bcfa ef4abcc0 00000000 ef4abed4 00000000 ef4abcc0 f8b377d5 ef4abdbc > Call Trace: > [<f8b36918>] ? ip6_cork_release+0x2e/0x52 [ipv6] > [<f8b381b2>] ? ip6_push_pending_frames+0x1c9/0x3d9 [ipv6] > [<c030bcfa>] ? _spin_unlock_bh+0xd/0xf > [<f8b377d5>] ? ip6_dst_lookup+0xe/0x10 [ipv6] > [<f8b4c2b2>] ? rawv6_sendmsg+0x25d/0xc08 [ipv6] > [<c0151022>] ? filemap_fault+0x203/0x3d5 > [<c02e8de0>] ? inet_sendmsg+0x2e/0x50 > [<c02a24b8>] ? sock_sendmsg+0xcc/0xf0 > [<c01365f5>] ? autoremove_wake_function+0x0/0x3a > [<c0136799>] ? remove_wait_queue+0x30/0x34 > [<f8a08fbe>] ? tun_chr_aio_read+0x298/0x31f [tun] > [<c0211d67>] ? copy_from_user+0x2a/0x114 > [<c02a2790>] ? sys_sendto+0xa5/0xc5 > [<c02b3713>] ? neigh_periodic_timer+0x0/0x17a > [<c01365f5>] ? autoremove_wake_function+0x0/0x3a > [<c02a348f>] ? sys_socketcall+0x141/0x262 > [<c0102f99>] ? sysenter_past_esp+0x6a/0x91 > ======================= > Code: 22 83 fb 9b 74 37 8b 4d b0 8b 01 e8 35 96 77 c7 8b 45 b0 c7 00 00 00 00 > 00 89 d8 83 c4 48 5b 5e 5f 5d c3 8b 4d b0 8b 39 8b 47 2c <f6> 40 1d de 74 23 31 > db 89 d8 83 c4 48 5b 5e 5f 5d c3 64 a1 04 > EIP: [<f8b375bf>] ip6_dst_lookup_tail+0x95/0x15a [ipv6] SS:ESP 0068:ef4abc64 > ---[ end trace 1035c8e1d028e84b ]--- > > > Steps to reproduce: > > Test case available at: http://www.remlab.net/files/divers/tunload.c > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 11469] New: TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash 2008-08-31 18:13 ` [Bugme-new] [Bug 11469] New: TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash Andrew Morton @ 2008-09-05 11:41 ` Evgeniy Polyakov 2008-09-05 16:03 ` Rémi Denis-Courmont 2008-09-07 18:11 ` Evgeniy Polyakov 1 sibling, 1 reply; 13+ messages in thread From: Evgeniy Polyakov @ 2008-09-05 11:41 UTC (permalink / raw) To: Andrew Morton; +Cc: netdev, bugme-daemon, rdenis Hi. On Sun, Aug 31, 2008 at 11:13:04AM -0700, Andrew Morton (akpm@linux-foundation.org) wrote: > > When an IFF_TUN (/dev/net/tun) device has more than 1023 IPv6 neighbors, a > > process context crash occurs. Backtrace follows: Does this problem still exist? > > BUG: unable to handle kernel NULL pointer dereference at 0000001d > > IP: [<f8b375bf>] :ipv6:ip6_dst_lookup_tail+0x95/0x15a > > *pde = 00000000 > > Oops: 0000 [#14] SMP > > Modules linked in: ipx p8022 psnap llc p8023 i915 drm tun cpufreq_ondemand > > binfmt_misc fuse nf_conntrack_ftp nf_conntrack_ipv6 nf_conntrack_ipv4 > > nf_conntrack ipv6 snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm_oss > > snd_mixer_oss snd_pcm snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event > > snd_seq snd_timer snd_seq_device snd intel_agp psmouse soundcore agpgart button > > processor snd_page_alloc parport_pc parport iTCO_wdt evdev pcspkr dm_mirror > > dm_log dm_snapshot dm_mod sg sr_mod cdrom e100 mii ehci_hcd uhci_hcd usbcore > > unix > > > > Pid: 9950, comm: tunload Tainted: G D (2.6.26.3 #8) By whom it was tainted? -- Evgeniy Polyakov ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 11469] New: TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash 2008-09-05 11:41 ` Evgeniy Polyakov @ 2008-09-05 16:03 ` Rémi Denis-Courmont 2008-09-05 16:37 ` Evgeniy Polyakov 0 siblings, 1 reply; 13+ messages in thread From: Rémi Denis-Courmont @ 2008-09-05 16:03 UTC (permalink / raw) To: Evgeniy Polyakov; +Cc: Andrew Morton, netdev, bugme-daemon Le vendredi 5 septembre 2008 14:41:50 Evgeniy Polyakov, vous avez écrit : > Hi. > > On Sun, Aug 31, 2008 at 11:13:04AM -0700, Andrew Morton (akpm@linux-foundation.org) wrote: > > > When an IFF_TUN (/dev/net/tun) device has more than 1023 IPv6 > > > neighbors, a process context crash occurs. Backtrace follows: > > Does this problem still exist? Yes. With 2.6.27-rc5-b380b0d4f7dffcc235c0facefa537d4655619101, I get this: tun: Universal TUN/TAP device driver, 1.6 tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com> tun0: Disabled Privacy Extensions BUG: unable to handle kernel NULL pointer dereference at 0000001d IP: [<f8b205a0>] :ipv6:ip6_dst_lookup_tail+0x9e/0x166 *pde = 00000000 Oops: 0000 [#1] SMP Modules linked in: tun fuse nf_conntrack_ftp nf_conntrack_ipv6 nf_conntrack_ipv4 nf_conntrack ipv6 snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_se q_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device snd soundcore intel_agp agpgart snd_page_alloc psmouse iTCO_wdt evdev button parport_pc proce ssor parport pcspkr dm_mirror dm_log dm_snapshot sg sr_mod cdrom e100 ehci_hcd uhci_hcd usbcore unix Pid: 2313, comm: tunload Not tainted (2.6.27-rc5-00132-gb380b0d #13) EIP: 0060:[<f8b205a0>] EFLAGS: 00010246 CPU: 0 EIP is at ip6_dst_lookup_tail+0x9e/0x166 [ipv6] EAX: 00000000 EBX: f7d7fd38 ECX: f678e800 EDX: f7cb9600 ESI: f67aa400 EDI: 00000000 EBP: f7d7fcb4 ESP: f7d7fc5c DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process tunload (pid: 2313, ti=f7d7e000 task=f66a3b60 task.ti=f7d7e000) Stack: 00000000 f7d7fd54 f7d7fda8 f67aa400 f7d7fc7c f8b1f8e8 00000000 f6528e40 f7d7fcbc f8b211b2 f7d7fdb8 f67aa400 f7cb8900 f67aa624 00000000 00000246 f7d7fca4 c03124da f7d7fcbc 00000000 f7d7fed0 00000000 f7d7fcbc f8b207bd Call Trace: [<f8b1f8e8>] ? ip6_cork_release+0x2e/0x52 [ipv6] [<f8b211b2>] ? ip6_push_pending_frames+0x1c9/0x3d9 [ipv6] [<c03124da>] ? _spin_unlock_bh+0xd/0xf [<f8b207bd>] ? ip6_dst_lookup+0xe/0x10 [ipv6] [<f8b353fa>] ? rawv6_sendmsg+0x25d/0xc08 [ipv6] [<c01401e6>] ? clockevents_program_event+0x92/0x119 [<c02eed00>] ? inet_sendmsg+0x2e/0x50 [<c02a7bad>] ? sock_sendmsg+0xcc/0xf0 [<c011d3a9>] ? find_busiest_group+0x160/0x7a0 [<c01372a5>] ? autoremove_wake_function+0x0/0x3a [<c02ad2ba>] ? __kfree_skb+0x31/0x76 [<c02ad2ba>] ? __kfree_skb+0x31/0x76 [<c0137449>] ? remove_wait_queue+0x30/0x34 [<c021ba5f>] ? copy_from_user+0x2a/0x114 [<c02a7e85>] ? sys_sendto+0xa5/0xc5 [<c01401e6>] ? clockevents_program_event+0x92/0x119 [<c01372a5>] ? autoremove_wake_function+0x0/0x3a [<c02a8d31>] ? sys_socketcall+0x176/0x295 [<c0103061>] ? sysenter_do_call+0x12/0x25 ======================= Code: 22 83 ff 9b 74 37 8b 55 b0 8b 02 e8 24 63 79 c7 8b 4d b0 c7 01 00 00 00 00 89 f8 83 c4 4c 5b 5e 5f 5d c3 8b 45 b0 8b 10 8b 42 2c <f6> 40 1d de 74 23 31 ff 89 f8 83 c4 4c 5b 5e 5f 5d c3 64 a1 04 EIP: [<f8b205a0>] ip6_dst_lookup_tail+0x9e/0x166 [ipv6] SS:ESP 0068:f7d7fc5c ---[ end trace fd93373c6fb8880e ]--- > > > Pid: 9950, comm: tunload Tainted: G D (2.6.26.3 #8) > > By whom it was tainted? Its own self. I just failed to copy the very first crash trace. Regards, -- Rémi Denis-Courmont http://www.remlab.net/ ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 11469] New: TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash 2008-09-05 16:03 ` Rémi Denis-Courmont @ 2008-09-05 16:37 ` Evgeniy Polyakov 0 siblings, 0 replies; 13+ messages in thread From: Evgeniy Polyakov @ 2008-09-05 16:37 UTC (permalink / raw) To: Rémi Denis-Courmont; +Cc: Andrew Morton, netdev, bugme-daemon Hi. On Fri, Sep 05, 2008 at 07:03:38PM +0300, Rémi Denis-Courmont (rdenis@simphalempin.com) wrote: > > Does this problem still exist? > > Yes. With 2.6.27-rc5-b380b0d4f7dffcc235c0facefa537d4655619101, I get this: I will try to work it out this weekend, thanks for the info. > > > > Pid: 9950, comm: tunload Tainted: G D (2.6.26.3 #8) > > > > By whom it was tainted? > > Its own self. I just failed to copy the very first crash trace. Likely it will not help, since subsequent crashes are results of the first one and rarely contain good info, but dump you have presented is not tainted, so problem is not in some other place. -- Evgeniy Polyakov ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 11469] New: TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash 2008-08-31 18:13 ` [Bugme-new] [Bug 11469] New: TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash Andrew Morton 2008-09-05 11:41 ` Evgeniy Polyakov @ 2008-09-07 18:11 ` Evgeniy Polyakov 2008-09-07 18:19 ` Rémi Denis-Courmont 1 sibling, 1 reply; 13+ messages in thread From: Evgeniy Polyakov @ 2008-09-07 18:11 UTC (permalink / raw) To: Andrew Morton; +Cc: netdev, bugme-daemon, rdenis Hi. On Sun, Aug 31, 2008 at 11:13:04AM -0700, Andrew Morton (akpm@linux-foundation.org) wrote: > > BUG: unable to handle kernel NULL pointer dereference at 0000001d > > IP: [<f8b375bf>] :ipv6:ip6_dst_lookup_tail+0x95/0x15a > > *pde = 00000000 > > Oops: 0000 [#14] SMP Attached patch fixes the problem. Since dst entry is allowed not to have neighbour entry, flush it just like with incomplete one. This drops performance of your application with more than 1024 neighbours to 1024 messages, to fix it you should tune ipv6 routing parameters (gc intervals, gc threshold, maximum number of entries and so on). There may be another problem with perfomance though, at least I was able to bump it 10 times with different settings, but still two times smaller than with 4k neighbours. Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index 8b67ca0..582dde5 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -934,7 +934,7 @@ static int ip6_dst_lookup_tail(struct sock *sk, * dst entry and replace it instead with the * dst entry of the nexthop router */ - if (!((*dst)->neighbour->nud_state & NUD_VALID)) { + if (!(*dst)->neighbour || !((*dst)->neighbour->nud_state & NUD_VALID)) { struct inet6_ifaddr *ifp; struct flowi fl_gw; int redirect; -- Evgeniy Polyakov ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 11469] New: TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash 2008-09-07 18:11 ` Evgeniy Polyakov @ 2008-09-07 18:19 ` Rémi Denis-Courmont 2008-09-08 20:15 ` David Miller 0 siblings, 1 reply; 13+ messages in thread From: Rémi Denis-Courmont @ 2008-09-07 18:19 UTC (permalink / raw) To: Evgeniy Polyakov; +Cc: Andrew Morton, netdev, bugme-daemon Le dimanche 7 septembre 2008 21:11:09 Evgeniy Polyakov, vous avez écrit : > Since dst entry is allowed not to have neighbour entry, flush it just > like with incomplete one. This drops performance of your application > with more than 1024 neighbours to 1024 messages, to fix it you should > tune ipv6 routing parameters (gc intervals, gc threshold, maximum number > of entries and so on). There may be another problem with perfomance > though, at least I was able to bump it 10 times with different settings, > but still two times smaller than with 4k neighbours. That looks like a trivial local DoS against the IPv6 stack though? Especially in the case that the interface has IFF_NOARP, that seems like a weird limitation. Oh well... -- Rémi Denis-Courmont http://www.remlab.net/ ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 11469] New: TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash 2008-09-07 18:19 ` Rémi Denis-Courmont @ 2008-09-08 20:15 ` David Miller 2008-09-08 20:34 ` Evgeniy Polyakov 2008-09-09 11:32 ` Neil Horman 0 siblings, 2 replies; 13+ messages in thread From: David Miller @ 2008-09-08 20:15 UTC (permalink / raw) To: rdenis; +Cc: johnpol, akpm, netdev, bugme-daemon From: Rémi Denis-Courmont <rdenis@simphalempin.com> Date: Sun, 7 Sep 2008 21:19:49 +0300 > Le dimanche 7 septembre 2008 21:11:09 Evgeniy Polyakov, vous avez écrit : > > Since dst entry is allowed not to have neighbour entry, flush it just > > like with incomplete one. This drops performance of your application > > with more than 1024 neighbours to 1024 messages, to fix it you should > > tune ipv6 routing parameters (gc intervals, gc threshold, maximum number > > of entries and so on). There may be another problem with perfomance > > though, at least I was able to bump it 10 times with different settings, > > but still two times smaller than with 4k neighbours. > > That looks like a trivial local DoS against the IPv6 stack though? > > Especially in the case that the interface has IFF_NOARP, that seems like a > weird limitation. Oh well... It's less of a DoS than the NULL pointer deref we get now, isn't it? :-) But I don't like this patch for several reasons: 1) Slapping on a NULL check in response to a OOPS at that exact location is usually a very big red flag, and deserves high scrutiny instead of blind acceptance. 2) Looking at the indentation of this DAD code block (it's all one tab too much) it's obviously a very shitty cut and paste job. If the coding style was too difficult to get right, what does this say about that change that brought the code here, semantically? :-/ This means we should figure out how this code got to this place, and what kind of invariants existed at the old location that might make this dst->neighbour dereference valid, and what implications there are for the fact that it can now be NULL. Really, we really need to understand much more deeply this situation. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 11469] New: TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash 2008-09-08 20:15 ` David Miller @ 2008-09-08 20:34 ` Evgeniy Polyakov 2008-09-09 10:56 ` Neil Horman 2008-09-09 11:32 ` Neil Horman 1 sibling, 1 reply; 13+ messages in thread From: Evgeniy Polyakov @ 2008-09-08 20:34 UTC (permalink / raw) To: David Miller; +Cc: rdenis, akpm, netdev, bugme-daemon, Neil Horman Hi David. On Mon, Sep 08, 2008 at 01:15:47PM -0700, David Miller (davem@davemloft.net) wrote: > But I don't like this patch for several reasons: > > 1) Slapping on a NULL check in response to a OOPS at that exact > location is usually a very big red flag, and deserves high scrutiny > instead of blind acceptance. > > 2) Looking at the indentation of this DAD code block (it's all one tab > too much) it's obviously a very shitty cut and paste job. If the > coding style was too difficult to get right, what does this say > about that change that brought the code here, semantically? :-/ > > This means we should figure out how this code got to this place, > and what kind of invariants existed at the old location that might > make this dst->neighbour dereference valid, and what implications > there are for the fact that it can now be NULL. > > Really, we really need to understand much more deeply this situation. Well, yes. The whole 'optimistic' dad looks a bit suspicious. I think failed dst entry without neighbour is a result of the 'static' dst entry returned by the above route lookup and previously neighbour was not used at all. This patch fixes the opps, but may be just hiding a problem, but reading how this optimistic duplicate address detection works, I see no strict requirements that returned route entry has to have neighbour, so this check actually can be a right fix. I've added Neil Horman, who created the patch 1.5 years ago, to the copy list. -- Evgeniy Polyakov ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 11469] New: TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash 2008-09-08 20:34 ` Evgeniy Polyakov @ 2008-09-09 10:56 ` Neil Horman 0 siblings, 0 replies; 13+ messages in thread From: Neil Horman @ 2008-09-09 10:56 UTC (permalink / raw) To: Evgeniy Polyakov; +Cc: David Miller, rdenis, akpm, netdev, bugme-daemon On Tue, Sep 09, 2008 at 12:34:18AM +0400, Evgeniy Polyakov wrote: > Hi David. > > On Mon, Sep 08, 2008 at 01:15:47PM -0700, David Miller (davem@davemloft.net) wrote: > > But I don't like this patch for several reasons: > > > > 1) Slapping on a NULL check in response to a OOPS at that exact > > location is usually a very big red flag, and deserves high scrutiny > > instead of blind acceptance. > > > > 2) Looking at the indentation of this DAD code block (it's all one tab > > too much) it's obviously a very shitty cut and paste job. If the > > coding style was too difficult to get right, what does this say > > about that change that brought the code here, semantically? :-/ > > > > This means we should figure out how this code got to this place, > > and what kind of invariants existed at the old location that might > > make this dst->neighbour dereference valid, and what implications > > there are for the fact that it can now be NULL. > > > > Really, we really need to understand much more deeply this situation. > > Well, yes. The whole 'optimistic' dad looks a bit suspicious. I think > failed dst entry without neighbour is a result of the 'static' dst entry > returned by the above route lookup and previously neighbour was not > used at all. This patch fixes the opps, but may be just hiding a > problem, but reading how this optimistic duplicate address detection > works, I see no strict requirements that returned route entry has to > have neighbour, so this check actually can be a right fix. > > I've added Neil Horman, who created the patch 1.5 years ago, to the > copy list. > I agree, while I don't normally like to fix an oops with just a NULL check, since as dave mentions, thats a pretty big red flag about not having thought through the problem, I think in this case its probably the right fix. having a dst entry without a corresponding neightbor entry is going to be a rarity to say the least, but it can happen (Evgeniy's above example, for instance, or perhaps a situation in which the arp cache was explicitly purged at just the right time from userspace). Regardless, making sure we have one by explicitly checking the pointer seems like the right thing to do. As Evgeniy noted, the Optimistic DAD code provides no explicit guarantee that we have a neigh entry for this dst. We just want to be sure we forward packets to an incomplete neighbour that is marked as optimistic to the router. We already do this dst->neighbour NULL check in many other locations (giving us precident, see ip6_output_finish and ip6_forward as just a few examples). I think its right to do the same check in the Optimistic DAD code. I just never hit it in my testing since its an uncommon case. As for the indentation, I'm pretty sure this was the only point I wrote/used this code, so I'm loathe to believe that the indentation was a cut and paste error. Unfortunately, that just implies it was my own stupidity and poor eyesight that created that uglynesss. Apologies Dave, I'll submit a patch shortly to correct that. Thanks & Regards Neil > -- > Evgeniy Polyakov > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- /**************************************************** * Neil Horman <nhorman@tuxdriver.com> * Software Engineer, Red Hat ****************************************************/ ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 11469] New: TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash 2008-09-08 20:15 ` David Miller 2008-09-08 20:34 ` Evgeniy Polyakov @ 2008-09-09 11:32 ` Neil Horman 2008-09-09 14:31 ` Evgeniy Polyakov 1 sibling, 1 reply; 13+ messages in thread From: Neil Horman @ 2008-09-09 11:32 UTC (permalink / raw) To: David Miller; +Cc: rdenis, johnpol, akpm, netdev, bugme-daemon On Mon, Sep 08, 2008 at 01:15:47PM -0700, David Miller wrote: > From: Rémi Denis-Courmont <rdenis@simphalempin.com> > Date: Sun, 7 Sep 2008 21:19:49 +0300 > > > Le dimanche 7 septembre 2008 21:11:09 Evgeniy Polyakov, vous avez écrit : > > > Since dst entry is allowed not to have neighbour entry, flush it just > > > like with incomplete one. This drops performance of your application > > > with more than 1024 neighbours to 1024 messages, to fix it you should > > > tune ipv6 routing parameters (gc intervals, gc threshold, maximum number > > > of entries and so on). There may be another problem with perfomance > > > though, at least I was able to bump it 10 times with different settings, > > > but still two times smaller than with 4k neighbours. > > > > That looks like a trivial local DoS against the IPv6 stack though? > > > > Especially in the case that the interface has IFF_NOARP, that seems like a > > weird limitation. Oh well... > > It's less of a DoS than the NULL pointer deref we get now, isn't it? :-) > > But I don't like this patch for several reasons: > > 1) Slapping on a NULL check in response to a OOPS at that exact > location is usually a very big red flag, and deserves high scrutiny > instead of blind acceptance. > > 2) Looking at the indentation of this DAD code block (it's all one tab > too much) it's obviously a very shitty cut and paste job. If the > coding style was too difficult to get right, what does this say > about that change that brought the code here, semantically? :-/ > > This means we should figure out how this code got to this place, > and what kind of invariants existed at the old location that might > make this dst->neighbour dereference valid, and what implications > there are for the fact that it can now be NULL. > > Really, we really need to understand much more deeply this situation. > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Here you go, patch to correct the indentation in that optimistic dad clause. Also adds the NULL pointer check for dst->neighbour. Fixes the reported oops and my complete inabilty to determine proper code indentation :) Best Neil Signed-off-by: Neil Horman <nhorman@tuxdriver.com> ip6_output.c | 64 +++++++++++++++++++++++++++++------------------------------ 1 file changed, 32 insertions(+), 32 deletions(-) diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index a4402de..e6671bd 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -943,39 +943,39 @@ static int ip6_dst_lookup_tail(struct sock *sk, } #ifdef CONFIG_IPV6_OPTIMISTIC_DAD - /* - * Here if the dst entry we've looked up - * has a neighbour entry that is in the INCOMPLETE - * state and the src address from the flow is - * marked as OPTIMISTIC, we release the found - * dst entry and replace it instead with the - * dst entry of the nexthop router - */ - if (!((*dst)->neighbour->nud_state & NUD_VALID)) { - struct inet6_ifaddr *ifp; - struct flowi fl_gw; - int redirect; - - ifp = ipv6_get_ifaddr(net, &fl->fl6_src, - (*dst)->dev, 1); - - redirect = (ifp && ifp->flags & IFA_F_OPTIMISTIC); - if (ifp) - in6_ifa_put(ifp); - - if (redirect) { - /* - * We need to get the dst entry for the - * default router instead - */ - dst_release(*dst); - memcpy(&fl_gw, fl, sizeof(struct flowi)); - memset(&fl_gw.fl6_dst, 0, sizeof(struct in6_addr)); - *dst = ip6_route_output(net, sk, &fl_gw); - if ((err = (*dst)->error)) - goto out_err_release; - } + /* + * Here if the dst entry we've looked up + * has a neighbour entry that is in the INCOMPLETE + * state and the src address from the flow is + * marked as OPTIMISTIC, we release the found + * dst entry and replace it instead with the + * dst entry of the nexthop router + */ + if (dst->neighbour && !((*dst)->neighbour->nud_state & NUD_VALID)) { + struct inet6_ifaddr *ifp; + struct flowi fl_gw; + int redirect; + + ifp = ipv6_get_ifaddr(net, &fl->fl6_src, + (*dst)->dev, 1); + + redirect = (ifp && ifp->flags & IFA_F_OPTIMISTIC); + if (ifp) + in6_ifa_put(ifp); + + if (redirect) { + /* + * We need to get the dst entry for the + * default router instead + */ + dst_release(*dst); + memcpy(&fl_gw, fl, sizeof(struct flowi)); + memset(&fl_gw.fl6_dst, 0, sizeof(struct in6_addr)); + *dst = ip6_route_output(net, sk, &fl_gw); + if ((err = (*dst)->error)) + goto out_err_release; } + } #endif return 0; -- /**************************************************** * Neil Horman <nhorman@tuxdriver.com> * Software Engineer, Red Hat ****************************************************/ ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 11469] New: TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash 2008-09-09 11:32 ` Neil Horman @ 2008-09-09 14:31 ` Evgeniy Polyakov 2008-09-09 15:39 ` Neil Horman 0 siblings, 1 reply; 13+ messages in thread From: Evgeniy Polyakov @ 2008-09-09 14:31 UTC (permalink / raw) To: Neil Horman; +Cc: David Miller, rdenis, akpm, netdev, bugme-daemon Hi. On Tue, Sep 09, 2008 at 07:32:15AM -0400, Neil Horman (nhorman@tuxdriver.com) wrote: > + /* > + * Here if the dst entry we've looked up > + * has a neighbour entry that is in the INCOMPLETE > + * state and the src address from the flow is > + * marked as OPTIMISTIC, we release the found > + * dst entry and replace it instead with the > + * dst entry of the nexthop router > + */ > + if (dst->neighbour && !((*dst)->neighbour->nud_state & NUD_VALID)) { ^^^ This should be (*dst)->neighbour, isn't it? -- Evgeniy Polyakov ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 11469] New: TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash 2008-09-09 14:31 ` Evgeniy Polyakov @ 2008-09-09 15:39 ` Neil Horman 2008-09-09 20:52 ` David Miller 0 siblings, 1 reply; 13+ messages in thread From: Neil Horman @ 2008-09-09 15:39 UTC (permalink / raw) To: Evgeniy Polyakov; +Cc: David Miller, rdenis, akpm, netdev, bugme-daemon On Tue, Sep 09, 2008 at 06:31:05PM +0400, Evgeniy Polyakov wrote: > Hi. > > On Tue, Sep 09, 2008 at 07:32:15AM -0400, Neil Horman (nhorman@tuxdriver.com) wrote: > > + /* > > + * Here if the dst entry we've looked up > > + * has a neighbour entry that is in the INCOMPLETE > > + * state and the src address from the flow is > > + * marked as OPTIMISTIC, we release the found > > + * dst entry and replace it instead with the > > + * dst entry of the nexthop router > > + */ > > + if (dst->neighbour && !((*dst)->neighbour->nud_state & NUD_VALID)) { > ^^^ > This should be (*dst)->neighbour, isn't it? > Dang it, yes it should. Sorry. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> ip6_output.c | 64 +++++++++++++++++++++++++++++------------------------------ 1 file changed, 32 insertions(+), 32 deletions(-) diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index a4402de..e6671bd 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -943,39 +943,39 @@ static int ip6_dst_lookup_tail(struct sock *sk, } #ifdef CONFIG_IPV6_OPTIMISTIC_DAD - /* - * Here if the dst entry we've looked up - * has a neighbour entry that is in the INCOMPLETE - * state and the src address from the flow is - * marked as OPTIMISTIC, we release the found - * dst entry and replace it instead with the - * dst entry of the nexthop router - */ - if (!((*dst)->neighbour->nud_state & NUD_VALID)) { - struct inet6_ifaddr *ifp; - struct flowi fl_gw; - int redirect; - - ifp = ipv6_get_ifaddr(net, &fl->fl6_src, - (*dst)->dev, 1); - - redirect = (ifp && ifp->flags & IFA_F_OPTIMISTIC); - if (ifp) - in6_ifa_put(ifp); - - if (redirect) { - /* - * We need to get the dst entry for the - * default router instead - */ - dst_release(*dst); - memcpy(&fl_gw, fl, sizeof(struct flowi)); - memset(&fl_gw.fl6_dst, 0, sizeof(struct in6_addr)); - *dst = ip6_route_output(net, sk, &fl_gw); - if ((err = (*dst)->error)) - goto out_err_release; - } + /* + * Here if the dst entry we've looked up + * has a neighbour entry that is in the INCOMPLETE + * state and the src address from the flow is + * marked as OPTIMISTIC, we release the found + * dst entry and replace it instead with the + * dst entry of the nexthop router + */ + if ((*dst)->neighbour && !((*dst)->neighbour->nud_state & NUD_VALID)) { + struct inet6_ifaddr *ifp; + struct flowi fl_gw; + int redirect; + + ifp = ipv6_get_ifaddr(net, &fl->fl6_src, + (*dst)->dev, 1); + + redirect = (ifp && ifp->flags & IFA_F_OPTIMISTIC); + if (ifp) + in6_ifa_put(ifp); + + if (redirect) { + /* + * We need to get the dst entry for the + * default router instead + */ + dst_release(*dst); + memcpy(&fl_gw, fl, sizeof(struct flowi)); + memset(&fl_gw.fl6_dst, 0, sizeof(struct in6_addr)); + *dst = ip6_route_output(net, sk, &fl_gw); + if ((err = (*dst)->error)) + goto out_err_release; } + } #endif return 0; > -- > Evgeniy Polyakov > -- /**************************************************** * Neil Horman <nhorman@tuxdriver.com> * Software Engineer, Red Hat ****************************************************/ ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 11469] New: TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash 2008-09-09 15:39 ` Neil Horman @ 2008-09-09 20:52 ` David Miller 0 siblings, 0 replies; 13+ messages in thread From: David Miller @ 2008-09-09 20:52 UTC (permalink / raw) To: nhorman; +Cc: johnpol, rdenis, akpm, netdev, bugme-daemon From: Neil Horman <nhorman@tuxdriver.com> Date: Tue, 9 Sep 2008 11:39:55 -0400 > On Tue, Sep 09, 2008 at 06:31:05PM +0400, Evgeniy Polyakov wrote: > > This should be (*dst)->neighbour, isn't it? > > > Dang it, yes it should. Sorry. > > Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Applied with the following commit message. I'll queue this up for -stable as well. Thanks! ipv6: Fix OOPS in ip6_dst_lookup_tail(). This fixes kernel bugzilla 11469: "TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash" dst->neighbour is not necessarily hooked up at this point in the processing path, so blindly dereferencing it is the wrong thing to do. This NULL check exists in other similar paths and this case was just an oversight. Also fix the completely wrong and confusing indentation here while we're at it. Based upon a patch by Evgeniy Polyakov. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2008-09-09 20:52 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <bug-11469-10286@http.bugzilla.kernel.org/>
2008-08-31 18:13 ` [Bugme-new] [Bug 11469] New: TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash Andrew Morton
2008-09-05 11:41 ` Evgeniy Polyakov
2008-09-05 16:03 ` Rémi Denis-Courmont
2008-09-05 16:37 ` Evgeniy Polyakov
2008-09-07 18:11 ` Evgeniy Polyakov
2008-09-07 18:19 ` Rémi Denis-Courmont
2008-09-08 20:15 ` David Miller
2008-09-08 20:34 ` Evgeniy Polyakov
2008-09-09 10:56 ` Neil Horman
2008-09-09 11:32 ` Neil Horman
2008-09-09 14:31 ` Evgeniy Polyakov
2008-09-09 15:39 ` Neil Horman
2008-09-09 20:52 ` David Miller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).