* Re: Fw: Re: [PATCH] ipv4 tcp autobind problem [not found] <20030930024320.5aca8ddc.davem@redhat.com> @ 2003-09-30 12:37 ` kuznet 2003-09-30 12:37 ` David S. Miller 2003-09-30 14:19 ` Kovacs Krisztian 0 siblings, 2 replies; 5+ messages in thread From: kuznet @ 2003-09-30 12:37 UTC (permalink / raw) To: David S. Miller; +Cc: jmorris, hidden, netdev, linux-net Hello! > in inet_sendmsg() is that when an RST is received, sk->num is set to zero, Yes, I remember this. This funny thing was added to avoid using reserved ports obtained from accept() to do connect(). Before that sockets were never unbound after they bound once exactly to avoid weirdness of the kind descibed in your mail, but this happened to be insecure. >From this mail I still do not see why autobinding of void socket is so bad thing, that it requires marginal fixing at the place which is already marginal. What is the real problem? So, bad sendmsg() selects some port as a side effect. It makes it on udp and tcp. What is the deal? If it is disaster for tcp, why it is not bad for udp? > local port (sk->sport) remains unchanged until the socket is closed. Socket is _closed_. Local port is reset only after socket is closed, unless PORT_USERLOCK is set. And sk->sport remains unchanged even after socket is closed, btw, so... I do recognize that current behaviour is weird, but I still want to know how this marginal weirdness escaped to be seen in reality. Alexey ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Fw: Re: [PATCH] ipv4 tcp autobind problem 2003-09-30 12:37 ` Fw: Re: [PATCH] ipv4 tcp autobind problem kuznet @ 2003-09-30 12:37 ` David S. Miller 2003-09-30 14:19 ` Kovacs Krisztian 1 sibling, 0 replies; 5+ messages in thread From: David S. Miller @ 2003-09-30 12:37 UTC (permalink / raw) To: kuznet; +Cc: jmorris, hidden, netdev, linux-net On Tue, 30 Sep 2003 16:37:48 +0400 (MSD) kuznet@ms2.inr.ac.ru wrote: > From this mail I still do not see why autobinding of void socket is so > bad thing, that it requires marginal fixing at the place which is already > marginal. What is the real problem? I think it is some detail about their transparent proxy kernel hacks, remember that awful stuff? :) ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Fw: Re: [PATCH] ipv4 tcp autobind problem 2003-09-30 12:37 ` Fw: Re: [PATCH] ipv4 tcp autobind problem kuznet 2003-09-30 12:37 ` David S. Miller @ 2003-09-30 14:19 ` Kovacs Krisztian 2003-10-07 11:56 ` kuznet 1 sibling, 1 reply; 5+ messages in thread From: Kovacs Krisztian @ 2003-09-30 14:19 UTC (permalink / raw) To: kuznet; +Cc: David S. Miller, jmorris, netdev, linux-net, Balazs Scheidler Hi, kuznet@ms2.inr.ac.ru wrote: >>in inet_sendmsg() is that when an RST is received, sk->num is set to zero, > > Yes, I remember this. This funny thing was added to avoid using reserved > ports obtained from accept() to do connect(). Before that sockets were never > unbound after they bound once exactly to avoid weirdness of the kind > descibed in your mail, but this happened to be insecure. > > From this mail I still do not see why autobinding of void socket is so > bad thing, that it requires marginal fixing at the place which is already > marginal. What is the real problem? So, bad sendmsg() selects some port as > a side effect. It makes it on udp and tcp. What is the deal? If it is disaster > for tcp, why it is not bad for udp? > >>local port (sk->sport) remains unchanged until the socket is closed. > > Socket is _closed_. Local port is reset only after socket is closed, > unless PORT_USERLOCK is set. And sk->sport remains unchanged even > after socket is closed, btw, so... No, sk->sport does not remain unchanged. Imagine the following situation: the TCP stack receives an RST, tcp_reset() gets called. It calls tcp_done() -> tcp_set_state(TCP_CLOSE) -> tcp_put_port() -> __tcp_put_port(). __tcp_put_port() sets sk->num to zero. So, when you call send() from userspace, the mentioned part of inet_sendmsg() calls inet_autobind(), which changes sk->sport, too... > I do recognize that current behaviour is weird, but I still want to know > how this marginal weirdness escaped to be seen in reality. Yes, it is our "transparent proxy kernel hack" ((C) davem :) that have problem with this behaviour. Transparent proxying works the following way: - You have to assign a foreign address to a bound socket using a specific setsockopt call. This creates a new entry in the tproxy local IP hash table, and the hash key is based on the local ip:port. - tproxy registers its Netfilter hooks, and if a new connection comes in, it looks up its hash tables, to see if it is a transparent proxied connection (it uses Netfilter's connection tracking subsystem). If it has to be proxyed, the appropriate NAT mappings are applied to the connection. - When the socket is closed (inet_release() is called, tproxy has its "callback" function here), the corresponding entries are deleted from tproxy's hash tables, again, based on the local ip:port pair. So, the weirdness causes the following problem: since sk->sport changes after the corresponding entry was inserted into the hash table, it changes, so the entry cannot be deleted when userspace calls close() on the socket. Eventually, it should be enough if inet_sendmsg() did not call inet_autobind() if the socket is in an errorneous state, but I had no idea how this could be implemented. (sk->err gets zeroed by sock_error(), so it cannot be checked twice.) In the case of UDP, you cannot get an RST-like thing, so AFAIK there is no such situation in which sk->num of a bound socket could be reset to zero. This is why inet_sendmsg()'s mentioned part did not cause problems for us. -- Regards, Krisztian KOVACS ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Fw: Re: [PATCH] ipv4 tcp autobind problem 2003-09-30 14:19 ` Kovacs Krisztian @ 2003-10-07 11:56 ` kuznet 2003-10-07 12:35 ` Kovacs Krisztian 0 siblings, 1 reply; 5+ messages in thread From: kuznet @ 2003-10-07 11:56 UTC (permalink / raw) To: Kovacs Krisztian Cc: David S. Miller, jmorris, netdev, linux-net, Balazs Scheidler Hello! > - When the socket is closed (inet_release() is called, tproxy has its > "callback" function here), the corresponding entries are deleted from > tproxy's hash tables, again, based on the local ip:port pair. The mistake is here. inet_release() has nothing to do to connection state. Socket is not closed at this point, it goes to FIN-WAIT*, LAST-ACK or something like this and can exist for long time after this and proxying is to be continued all this time. Right place to finish tracking is when socket is removed from TCP hash tables, and to start tracking is when the socket is inserted to TCP hash tables. BTW you would not see the problem with binding if it was made right. Alexey ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] ipv4 tcp autobind problem 2003-10-07 11:56 ` kuznet @ 2003-10-07 12:35 ` Kovacs Krisztian 0 siblings, 0 replies; 5+ messages in thread From: Kovacs Krisztian @ 2003-10-07 12:35 UTC (permalink / raw) To: kuznet; +Cc: David S. Miller, jmorris, netdev, linux-net, Balazs Scheidler Hi, kuznet@ms2.inr.ac.ru wrote: > Right place to finish tracking is when socket is removed from TCP > hash tables, and to start tracking is when the socket is inserted > to TCP hash tables. BTW you would not see the problem with binding > if it was made right. Thanks for the reply. In the meantime, although because of slightly different reasons, we also came to the same conclusion. Right now I'm testing a version in which our "unassign" hook is removed from inet_release(), and added to udp_v4_unhash() and tcp_put_port(). However, I'm not quite sure there won't be locking problems when calling our hook from these places. -- Regards, Krisztian KOVACS ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2003-10-07 12:35 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20030930024320.5aca8ddc.davem@redhat.com>
2003-09-30 12:37 ` Fw: Re: [PATCH] ipv4 tcp autobind problem kuznet
2003-09-30 12:37 ` David S. Miller
2003-09-30 14:19 ` Kovacs Krisztian
2003-10-07 11:56 ` kuznet
2003-10-07 12:35 ` Kovacs Krisztian
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).