* on-link assumption in ipv4 routing cache
From: Vlad Yasevich @ 2008-10-21 1:02 UTC
To: netdev
Hi All
Something that came as a surprise to me is that ipv4 implementation
seems to assume that a destination is on-link when there is
no explicit route to it.
A customer submitted an interesting problem where their SCTP associations
kept getting restarted. The configuration was as follows:
host A: eth0: 17.17.17.17/24
  routing:
    17.17.17.0/24 dev eth0 on-link
    default dev eth1 10.0.0.1
host B: eth0: 18.18.18.18/24
  routing:
    18.18.18.0/24 dev eth0 on-link
    default dev eth1 10.0.0.1
There were no routes to the "other" subnet on either host.
The application running on both hosts performed a bind to the specific
address as well as SO_BINDTODEVICE.
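For reference, a socket set up this way could look roughly like the sketch below. It is not the customer's application: the helper name open_bound_socket is hypothetical, and TCP is used instead of SCTP only to keep the example short. SO_BINDTODEVICE normally requires CAP_NET_RAW, so the device binding is optional here.

```python
import socket

# Linux value of SO_BINDTODEVICE, used only if this Python build lacks the constant.
SO_BINDTODEVICE = getattr(socket, "SO_BINDTODEVICE", 25)


def open_bound_socket(local_ip, local_port=0, ifname=None):
    """Create a TCP socket bound to a local address and, optionally, a device.

    Hypothetical helper for illustration; the original application is not shown
    in this thread.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if ifname is not None:
            # Restrict the socket to one interface (needs CAP_NET_RAW on Linux).
            s.setsockopt(socket.SOL_SOCKET, SO_BINDTODEVICE,
                         ifname.encode() + b"\0")
        # Bind to the specific local address, as the application did.
        s.bind((local_ip, local_port))
    except OSError:
        s.close()
        raise
    return s
```

On host A this would be called as open_bound_socket("17.17.17.17", ifname="eth0"), followed by a connect() to 18.18.18.18 on whatever port the peer listens on.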
The result was that both hosts assumed that the peer was on-link, issued
ARP request/replies and successfully connected. tcpdump showed only packets
on eth0.
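The observed behavior can be modeled with a toy route lookup. This is a simplified sketch, not the kernel's code: the names lookup, NoRoute and HOST_A are made up for illustration, and it assumes the on-link fallback applies only when an output device was requested, which matches the SO_BINDTODEVICE setup above but is not spelled out in this message.

```python
import ipaddress


class NoRoute(Exception):
    """Stands in for the unreachable error one might expect instead."""


def lookup(routes, dst, oif=None):
    """Toy route lookup; routes is a list of (prefix, device, gateway_or_None).

    Returns (device, next_hop). Only routes through oif are considered when
    oif is given (as with SO_BINDTODEVICE).
    """
    dst = ipaddress.ip_address(dst)
    candidates = [
        (ipaddress.ip_network(prefix), dev, gw)
        for prefix, dev, gw in routes
        if oif is None or dev == oif
    ]
    matches = [r for r in candidates if dst in r[0]]
    if matches:
        # Longest prefix wins; a route without a gateway is directly connected.
        net, dev, gw = max(matches, key=lambda r: r[0].prefixlen)
        return dev, (gw or str(dst))
    if oif is not None:
        # Assumed fallback: no route through the requested device, so treat
        # the destination as on-link and resolve it directly (ARP) on oif.
        return oif, str(dst)
    raise NoRoute(str(dst))


# Host A's table from the description above.
HOST_A = [
    ("17.17.17.0/24", "eth0", None),
    ("0.0.0.0/0", "eth1", "10.0.0.1"),
]
```

With this table, lookup(HOST_A, "18.18.18.18", oif="eth0") yields ("eth0", "18.18.18.18"), i.e. the peer is treated as a neighbor on eth0, while the same lookup without oif follows the default route through eth1.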
This was somewhat of a surprise, since I expected an EHOSTUNREACH error:
there were no routes to the destination and SO_DONTROUTE was not set.
I am really curious as to the reason for this behavior.
Thanks
-vlad
p.s. BTW, the solution to the association restart appeared to be
29e75252da20f3ab9e132c68c9aed156b87beae6 ([IPV4] route cache: Introduce rt_genid
for smooth cache invalidation). There used to be some kind of a
race between cache flushing and SCTP bottom half attempting to recreate
a cache entry. My guess is that there were RCU issues, since any
cache updates triggered by the user application seem to have worked correctly.
Regardless, it appears to have been fixed in 2.6.25.
* Re: on-link assumption in ipv4 routing cache
From: David Miller @ 2008-10-21 5:13 UTC
To: vladislav.yasevich; +Cc: netdev
From: Vlad Yasevich <vladislav.yasevich@hp.com>
Date: Mon, 20 Oct 2008 21:02:44 -0400
> This was somewhat of a surprise, since I expected an EHOSTUNREACH error:
> there were no routes to the destination and SO_DONTROUTE was not set.
>
> I am really curious as to the reason for this behavior.
In general the Linux ipv4 stack tries to do things that make it more
likely for successful communication between two nodes.
This is one such example, another is the choice of using the host
based addressing model rather than the interface based addressing
model.
Alexey Kuznetsov is responsible for most of these decisions, he is
a genius.
* Re: on-link assumption in ipv4 routing cache
From: Vlad Yasevich @ 2008-10-23 17:47 UTC
To: David Miller; +Cc: netdev
David Miller wrote:
> From: Vlad Yasevich <vladislav.yasevich@hp.com>
> Date: Mon, 20 Oct 2008 21:02:44 -0400
>
>> This was somewhat of a surprise, since I expected an EHOSTUNREACH error:
>> there were no routes to the destination and SO_DONTROUTE was not set.
>>
>> I am really curious as to the reason for this behavior.
>
> In general the Linux ipv4 stack tries to do things that make it more
> likely for successful communication between two nodes.
>
> This is one such example, another is the choice of using the host
> based addressing model rather than the interface based addressing
> model.
>
> Alexey Kuznetsov is responsible for most of these decisions, he is
> a genius.
>
Hi David
Ok, I've found the code and the explanation, but I think there is
a small bug here that's been around a very long time. There is absolutely
no checking for the interface state. This means that if the interface
is brought down, we are still going to attempt to route through it.
That seems broken, since the interface was administratively brought down.
I've actually tried to do this with my test app. I've set it to connect
over a given interface, brought the interface down, and then issued the
connect(). The result was that the app hung until tcp_syn_retries SYNs had
been issued and then returned an error.
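To give a feel for how long such a connect() blocks, here is a rough estimate of the total wait. It is only a sketch: the function name connect_timeout is made up, and the initial retransmission timeout, the doubling backoff and the cap are parameters whose real values depend on the kernel version.

```python
def connect_timeout(syn_retries, initial_rto=1.0, max_rto=120.0):
    """Estimate seconds until connect() gives up.

    Assumes the initial SYN plus syn_retries retransmissions, each followed by
    a wait of one RTO that doubles every time, capped at max_rto. All three
    parameters are assumptions for illustration.
    """
    total = 0.0
    rto = initial_rto
    for _ in range(syn_retries + 1):
        total += rto
        rto = min(rto * 2, max_rto)
    return total
```

For example, connect_timeout(6, initial_rto=1.0) gives 127.0 seconds, and connect_timeout(5, initial_rto=3.0) gives 189.0 seconds.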
I've got no issues against on-link assumption as long as there are some smarts
behind it. The least we could do is use a running interface.
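As a user-space illustration of the kind of check meant here (the real check would live in the kernel's route lookup, and this is not a proposed patch), the sketch below reads an interface's flags with the SIOCGIFFLAGS ioctl and accepts it only if it is both up and running. The function names are hypothetical.

```python
import fcntl
import socket
import struct

SIOCGIFFLAGS = 0x8913   # Linux ioctl: get interface flags
IFF_UP = 0x1            # interface is administratively up
IFF_RUNNING = 0x40      # interface is operational


def interface_flags(ifname):
    """Return the flags of ifname as reported by SIOCGIFFLAGS (Linux only)."""
    # struct ifreq: 16-byte name, then a union starting with the short flags;
    # padded here to 40 bytes.
    ifreq = struct.pack("16sH22x", ifname.encode(), 0)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        res = fcntl.ioctl(s.fileno(), SIOCGIFFLAGS, ifreq)
    return struct.unpack("16sH22x", res)[1]


def usable_for_onlink(flags):
    """True only if the interface is both up and running."""
    wanted = IFF_UP | IFF_RUNNING
    return (flags & wanted) == wanted
```

A caller would use usable_for_onlink(interface_flags("eth0")) and refuse the on-link assumption when it returns False.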
-vlad
* Re: on-link assumption in ipv4 routing cache
From: David Miller @ 2008-11-04 1:32 UTC
To: vladislav.yasevich; +Cc: netdev
From: Vlad Yasevich <vladislav.yasevich@hp.com>
Date: Thu, 23 Oct 2008 13:47:44 -0400
> Ok, I've found the code and the explanation, but I think there is
> a small bug here that's been around a very long time. There is absolutely
> no checking for the interface state. This means that if the interface
> is brought down, we are still going to attempt to route through it.
> That seems broken, since the interface was administratively brought down.
>
> I've actually tried to do this with my test app. I've set it to connect
> over a given interface, brought the interface down, and then issued the
> connect(). The result was that the app hung until tcp_syn_retries SYNs had
> been issued and then returned an error.
>
> I've got no issues against on-link assumption as long as there are some smarts
> behind it. The least we could do is use a running interface.
I agree with you that this behavior is at best sub-optimal.
I'm happy to entertain patches that try to make this
area more sane.