From: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
To: Arnaud Ebalard <arno@natisbad.org>
Cc: Brian Haley <brian.haley@hp.com>,
David Miller <davem@davemloft.net>, Jiri Olsa <jolsa@redhat.com>,
Scott Otto <scott.otto@alcatel-lucent.com>,
netdev@vger.kernel.org,
YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Subject: Re: [REGRESSION,BISECTED] MIPv6 support broken by f4f914b58019f0
Date: Sat, 29 May 2010 03:40:55 +0900 [thread overview]
Message-ID: <4C000E37.7080306@linux-ipv6.org> (raw)
In-Reply-To: <87sk5d2h4u.fsf@small.ssi.corp>
Hello.
(2010/05/28 6:01), Arnaud Ebalard wrote:
> Hi,
>
> Brian Haley<brian.haley@hp.com> writes:
>
>>> In previous debug outputs, the content of the fl->oif is ok, i.e. it is
>>> set to the interface on which the CoA is configured, i.e. the output
>>> interface. But the commit results in flags |= RT6_LOOKUP_F_IFACE.
>>> Later, in rt6_score_route(), the call to rt6_check_dev() returns 0
>>> (dev->ifindex is ip6tnl1 but oif is wlan0). Because of the change to flags
>>> flags, we quickly return -1 in rt6_score_route():
>>
>> Ok, so the call to ip6_route_output() was from the tunnel code, which is
>> using it's cached flowi, which has oif set to the tunnel. The XFRM code
>> swaps the addresses, which should invalidate the oif, but it doesn't.
>>
>>> static int rt6_score_route(struct rt6_info *rt, int oif,
>>> int strict)
>>> {
>>> int m, n;
>>>
>>> m = rt6_check_dev(rt, oif);
>>> if (!m&& (strict& RT6_LOOKUP_F_IFACE))
>>> return -1;
>>> ...
>>>
>>> Now, I wonder if the following is correct. Don't hesitate to correct me
>>> if I am wrong:
>>>
>>> Initially (before f4f914b58019f0), the purpose of the test using
>>> rt6_need_strict() in ip6_route_output() (introduced by c71099ac) was to
>>> allow the multiple routing table logic to be applied to all global
>>> addresses but to preserve the addresses for which it would not make
>>> sense (link-local, multicast, ). The change introduced by f4f914b58019f0
>>> basically reduces the ability to route traffic as you want and forces
>>> the traffic to leave the device by the interface on which it is
>>> configured (if fl->oif is set).
>>
>> The problem is we assumed the caller's would only set fl->oif if they
>> wanted it enforced (multicast, link-local, SO_BINDTODEVICE), but it
>> didn't take into account the tunnel code. I guess the easy answer
>> would be to revert this until we can fix it correctly.
>
> Nothing against it but maybe Jiri or David have other ideas.
>
>
>>> From my (very limited and possibly wrong) understanding, the change
>>> introduced by f4f914b58019f0 looks like a workaround for the
>>> SO_BINDTODEVICE issue. Looking at the code, there is something I don't
>>> understand: if SO_BINDTODEVICE has been used on a socket, the socket
>>> should have its sk_bound_dev_if attribute set to the correct ifindex
>>> value. Hence the following (naive) question: why is that information not
>>> used to inflect the selection of the route cached for the socket? And
>>> why would the fix be at the adress level instead of being at the
>>> interface level (ifindex)?
>>
>> I guess I always believed setting SO_BINDTODEVICE should always force
>> traffic out that interface, but from Yoshifuji's email it seems that
>> maybe wasn't the intention, at least for things that don't meet
>> the rt_need_strict() criteria like globals. I don't know the history
>> behind the setsockopt.
>
> The behavior I would expect from a combination of RFC 4191 and
> SO_BINDTODEVICE sockopt would be the use of the interface as outgoing
> interface and then the use of the best router (using router preference
> info, reachability, ...) available on the subnet. IIRC, the router
> preference info is per default router list in the RFC, i.e. per
> interface.
Good point.
Whatever our original intention/thought was,
current RFC says that we should honor outgoing interface
specified by user (by IPV6_PKTINFO etc.), as we do for
SO_BINDTODEVICE in IPv4 as well.
In this sense, checking sk->sk_bound_dev_if in
ip6_route_output() is not enough because we need to
take outgoing interface specified in ancillary data
into account, which is set to fl->oif.
How about adding additional "flags" parameter
for ip6_route_output()?
Thoughts?
--yoshfuji
next prev parent reply other threads:[~2010-05-28 18:41 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-05-26 17:01 [REGRESSION,BISECTED] MIPv6 support broken by f4f914b58019f0 Arnaud Ebalard
2010-05-27 0:48 ` Brian Haley
2010-05-27 15:14 ` Arnaud Ebalard
2010-05-27 19:39 ` Brian Haley
2010-05-27 21:01 ` Arnaud Ebalard
2010-05-28 18:40 ` YOSHIFUJI Hideaki [this message]
2010-05-28 21:15 ` Arnaud Ebalard
2010-05-27 21:31 ` Scott C Otto
2010-05-28 8:51 ` Arnaud Ebalard
2010-05-28 17:59 ` Brian Haley
2010-05-28 18:17 ` [PATCH] IPv6: fix Mobile IPv6 regression Brian Haley
2010-05-29 6:03 ` David Miller
2010-05-31 8:46 ` Jiri Olsa
2010-05-31 12:49 ` Jiri Olsa
2010-05-27 17:39 ` [REGRESSION,BISECTED] MIPv6 support broken by f4f914b58019f0 YOSHIFUJI Hideaki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4C000E37.7080306@linux-ipv6.org \
--to=yoshfuji@linux-ipv6.org \
--cc=arno@natisbad.org \
--cc=brian.haley@hp.com \
--cc=davem@davemloft.net \
--cc=jolsa@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=scott.otto@alcatel-lucent.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.