From: "Eduard Guzovsky"
Subject: Re: [IPv6] "sendmsg: invalid argument" to multicast group after some time
Date: Sat, 27 Dec 2008 23:47:40 -0500
Message-ID: <1fd9f5a40812272047w464119f7l8796c7aa7a93576b@mail.gmail.com>
To: netdev@vger.kernel.org

> I even get the same error when doing a multicast ping6:
> miredo:~# ping6 -I eth0 ff02::9
> PING ff02::9(ff02::9) from fe80::216:3eff:feb9:29f5 eth0: 56 data bytes
> ping: sendmsg: Invalid argument

We had a similar problem in our lab network. I tracked down the source
of the "Invalid argument" error to ip6_output_finish(). Here is the
stack

-----edg ip6_output_finish: failed to find neighbour
[] show_trace_log_lvl+0x1a/0x30
[] show_trace+0x12/0x20
[] dump_stack+0x19/0x20
[] ip6_output2+0x279/0x290 [ipv6]
[] ip6_output+0x2df/0x830 [ipv6]
[] ip6_push_pending_frames+0x247/0x420 [ipv6]
[] udp_v6_push_pending_frames+0x13f/0x1f0 [ipv6]
[] udpv6_sendmsg+0x7ae/0xa60 [ipv6]
[] inet_sendmsg+0x34/0x60
[] sock_sendmsg+0xfc/0x120
[] sys_sendto+0xbf/0xe0
[] sys_socketcall+0x187/0x260
[] syscall_call+0x7/0xb
=======================

ip6_output_finish() returns EINVAL because the route cache entry has
NULL as a "neighbour" pointer.

These invalid route cache entries are created when the IPv6 neighbour
table fills up (one potential reason for that is a combination of a lot
of multicast traffic, "ff02:...", and Xen hosts with interfaces in
promiscuous mode). In this case ndisc_get_neigh() returns NULL, but in
at least two places the routing code in net/ipv6/route.c ignores that
and inserts invalid entries into the cache anyway.

This is especially bad for frequently used multicast addresses. The
garbage collector does not remove them from the cache, probably because
of the frequent updates of the "__use" count. You need to flush the
cache to get rid of them.

One way to work around the problem is to increase "gc_thresh3" for the
IPv6 neighbour table. That still leaves you open to DoS attacks. Another
way is to create permanent entries in the neighbour/routing tables.

In any case, the routing cache pollution problem has to be fixed. I
suggest the following patch. I do not know this code and would
appreciate it if the code maintainers could comment on it.

Thanks,

-Ed

--- a/net/ipv6/route.c	2008-12-26 14:56:50.000000000 -0500
+++ b/net/ipv6/route.c	2008-12-26 14:57:19.000000000 -0500
@@ -638,6 +638,11 @@

 		rt->rt6i_nexthop = ndisc_get_neigh(rt->rt6i_dev, &rt->rt6i_gateway);

+		if (rt->rt6i_nexthop == NULL) {
+			dst_free((struct dst_entry *)rt);
+			rt = NULL;
+		}
+
 	}

 	return rt;
@@ -991,9 +996,18 @@
 	dev_hold(dev);
 	if (neigh)
 		neigh_hold(neigh);
-	else
+	else {
 		neigh = ndisc_get_neigh(dev, addr);

+		if (neigh == NULL) {
+			dev_put(dev);
+			in6_dev_put(idev);
+			dst_free((struct dst_entry *)rt);
+			rt = NULL;
+			goto out;
+		}
+	}
+
 	rt->rt6i_dev = dev;
 	rt->rt6i_idev = idev;
 	rt->rt6i_nexthop = neigh;
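
For readers who want to see the failure mode in isolation, here is a minimal userspace sketch of the behaviour described above. It is not kernel code: every name in it (model_neigh, model_route, model_get_neigh, model_route_alloc_old, model_route_alloc_new, model_output, MODEL_NEIGH_MAX) is hypothetical, and the fixed-size table only stands in for a neighbour table that has hit its limit. It assumes, as the message says, that the neighbour lookup returns NULL when the table is full, and it contrasts handing back a route with a NULL next hop against freeing it, which is the idea behind the proposed patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define MODEL_NEIGH_MAX 2	/* stands in for a full neighbour table limit */

struct model_neigh { int addr; };
struct model_route { int dst; struct model_neigh *nexthop; };

static struct model_neigh neigh_table[MODEL_NEIGH_MAX];
static int neigh_used;

/* Models the lookup described above: returns NULL once the table is full. */
static struct model_neigh *model_get_neigh(int addr)
{
	for (int i = 0; i < neigh_used; i++)
		if (neigh_table[i].addr == addr)
			return &neigh_table[i];
	if (neigh_used == MODEL_NEIGH_MAX)
		return NULL;
	neigh_table[neigh_used].addr = addr;
	return &neigh_table[neigh_used++];
}

/* Unpatched behaviour: the route is returned even without a neighbour. */
static struct model_route *model_route_alloc_old(int dst)
{
	struct model_route *rt = malloc(sizeof(*rt));

	if (rt == NULL)
		return NULL;
	rt->dst = dst;
	rt->nexthop = model_get_neigh(dst);	/* may be NULL */
	return rt;
}

/* Patched behaviour: free the route and report failure instead. */
static struct model_route *model_route_alloc_new(int dst)
{
	struct model_route *rt = model_route_alloc_old(dst);

	if (rt != NULL && rt->nexthop == NULL) {
		free(rt);
		rt = NULL;
	}
	return rt;
}

/* Output step: a route with no neighbour cannot be used to send. */
static int model_output(const struct model_route *rt)
{
	assert(rt != NULL);
	return rt->nexthop ? 0 : -EINVAL;
}
```

In this model, a route obtained from model_route_alloc_old() after the table is full fails in model_output() with -EINVAL every time it is reused, while model_route_alloc_new() returns NULL up front, so the caller sees the failure once and nothing invalid is kept around.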