* [PATCH] ipv4: fix TCP early demux
From: Eric Dumazet @ 2012-07-27 16:23 UTC
To: David Miller; +Cc: netdev
From: Eric Dumazet <edumazet@google.com>
commit 92101b3b2e317 (ipv4: Prepare for change of rt->rt_iif encoding.)
invalidated TCP early demux, because rx_dst_ifindex is not properly
initialized and checked.
Also remove the use of inet_iif(skb) in favor of skb->skb_iif.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/ipv4/tcp_input.c | 1 +
net/ipv4/tcp_ipv4.c | 14 ++++++--------
net/ipv4/tcp_minisocks.c | 1 +
3 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 3e07a64..aa659e8 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5603,6 +5603,7 @@ void tcp_finish_connect(struct sock *sk, struct sk_buff *skb)
 	if (skb != NULL) {
 		sk->sk_rx_dst = dst_clone(skb_dst(skb));
+		inet_sk(sk)->rx_dst_ifindex = skb->skb_iif;
 		security_inet_conn_established(sk, skb);
 	}
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index b6b07c9..2fbd992 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1620,17 +1620,15 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
 		sock_rps_save_rxhash(sk, skb);
 		if (sk->sk_rx_dst) {
 			struct dst_entry *dst = sk->sk_rx_dst;
-			if (dst->ops->check(dst, 0) == NULL) {
+			if (inet_sk(sk)->rx_dst_ifindex != skb->skb_iif ||
+			    dst->ops->check(dst, 0) == NULL) {
 				dst_release(dst);
 				sk->sk_rx_dst = NULL;
 			}
 		}
 		if (unlikely(sk->sk_rx_dst == NULL)) {
-			struct inet_sock *icsk = inet_sk(sk);
-			struct rtable *rt = skb_rtable(skb);
-
-			sk->sk_rx_dst = dst_clone(&rt->dst);
-			icsk->rx_dst_ifindex = inet_iif(skb);
+			sk->sk_rx_dst = dst_clone(skb_dst(skb));
+			inet_sk(sk)->rx_dst_ifindex = skb->skb_iif;
 		}
 		if (tcp_rcv_established(sk, skb, tcp_hdr(skb), skb->len)) {
 			rsk = sk;
@@ -1709,11 +1707,11 @@ void tcp_v4_early_demux(struct sk_buff *skb)
 		skb->destructor = sock_edemux;
 		if (sk->sk_state != TCP_TIME_WAIT) {
 			struct dst_entry *dst = sk->sk_rx_dst;
-			struct inet_sock *icsk = inet_sk(sk);
+
 			if (dst)
 				dst = dst_check(dst, 0);
 			if (dst &&
-			    icsk->rx_dst_ifindex == skb->skb_iif)
+			    inet_sk(sk)->rx_dst_ifindex == skb->skb_iif)
 				skb_dst_set_noref(skb, dst);
 		}
 	}
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 5912ac3..3f1cc20 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -388,6 +388,7 @@ struct sock *tcp_create_openreq_child(struct sock *sk, struct request_sock *req,
 		struct tcp_cookie_values *oldcvp = oldtp->cookie_values;
 		newsk->sk_rx_dst = dst_clone(skb_dst(skb));
+		inet_sk(newsk)->rx_dst_ifindex = skb->skb_iif;
 		/* TCP Cookie Transactions require space for the cookie pair,
 		 * as it differs for each connection. There is no need to
* Re: [PATCH] ipv4: fix TCP early demux
From: David Miller @ 2012-07-27 20:47 UTC
To: eric.dumazet; +Cc: netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 27 Jul 2012 18:23:40 +0200
> From: Eric Dumazet <edumazet@google.com>
>
> commit 92101b3b2e317 (ipv4: Prepare for change of rt->rt_iif encoding.)
> invalidated TCP early demux, because rx_dst_ifindex is not properly
> initialized and checked.
>
> Also remove the use of inet_iif(skb) in favor of skb->skb_iif.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Applied.
* Re: [PATCH] ipv4: fix TCP early demux
From: Eric Dumazet @ 2012-07-27 21:34 UTC
To: David Miller; +Cc: netdev
On Fri, 2012-07-27 at 13:47 -0700, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Fri, 27 Jul 2012 18:23:40 +0200
>
> > From: Eric Dumazet <edumazet@google.com>
> >
> > commit 92101b3b2e317 (ipv4: Prepare for change of rt->rt_iif encoding.)
> > invalidated TCP early demux, because rx_dst_ifindex is not properly
> > initialized and checked.
> >
> > Also remove the use of inet_iif(skb) in favor of skb->skb_iif.
> >
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
>
> Applied.
Thanks David
IPv6 part is screwed because of the bogus dst_check(dst, 0)
(and missing code that was moved out from tcp_rcv_established() to
tcp_v4_do_rcv(): I was wondering if we could make it generic to move it
back to tcp_rcv_established()):

	if (sk->sk_rx_dst) {
		struct dst_entry *dst = sk->sk_rx_dst;
		if (dst->ops->check(dst, 0) == NULL) {
			dst_release(dst);
			sk->sk_rx_dst = NULL;
		}
	}
	if (unlikely(sk->sk_rx_dst == NULL)) {
		sk->sk_rx_dst = dst_clone(skb_dst(skb));
		inet_sk(sk)->rx_dst_ifindex = inet_iif(skb);
	}

IPv6 wants a cookie here, not 0.

I wonder why the cookie is not stored in dst, and must be stored outside
of it?

We could then use:

	if (sk->sk_rx_dst) {
		struct dst_entry *dst = sk->sk_rx_dst;
		if (dst->ops->check(dst) == NULL) {
			dst_release(dst);
			sk->sk_rx_dst = NULL;
		}
	}
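
For readers following along without a kernel tree, the check-then-recache
pattern that the patch earlier in this thread applies in tcp_v4_do_rcv()
can be modelled in plain userspace C. This is only a sketch under stated
assumptions: the *_model types, model_check() and rx_dst_revalidate() are
made-up names, reference counting is reduced to a bare integer, and
nothing here is kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: these structs mimic, but are not, the kernel's
 * struct dst_entry, struct sock and struct sk_buff. */
struct dst_model;

struct dst_ops_model {
	/* Returns dst if it is still usable, NULL if it is obsolete. */
	struct dst_model *(*check)(struct dst_model *dst, unsigned int cookie);
};

struct dst_model {
	const struct dst_ops_model *ops;
	int obsolete;	/* set when the route has been invalidated */
	int refcnt;	/* stand-in for real reference counting */
};

struct sock_model {
	struct dst_model *rx_dst;	/* cached input route */
	int rx_dst_ifindex;		/* device the cached route belongs to */
};

struct skb_model {
	struct dst_model *dst;	/* route found by the normal lookup */
	int skb_iif;		/* device the packet arrived on */
};

static struct dst_model *model_check(struct dst_model *dst, unsigned int cookie)
{
	(void)cookie;	/* this model ignores the cookie, like the IPv4 case */
	return dst->obsolete ? NULL : dst;
}

static const struct dst_ops_model model_ops = { .check = model_check };

/* Drop the cached route if the input device changed or the route is
 * obsolete, then cache the packet's route and ifindex together. */
static void rx_dst_revalidate(struct sock_model *sk, const struct skb_model *skb)
{
	assert(skb->dst != NULL);	/* the packet must already carry a route */

	if (sk->rx_dst) {
		struct dst_model *dst = sk->rx_dst;

		if (sk->rx_dst_ifindex != skb->skb_iif ||
		    dst->ops->check(dst, 0) == NULL) {
			dst->refcnt--;		/* models dst_release() */
			sk->rx_dst = NULL;
		}
	}
	if (sk->rx_dst == NULL) {
		skb->dst->refcnt++;		/* models dst_clone() */
		sk->rx_dst = skb->dst;
		sk->rx_dst_ifindex = skb->skb_iif;
	}
}
```

The point of the model: the ifindex is compared before the route's own
check, and it is always rewritten together with the cached route, so the
two cannot get out of step.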
* Re: [PATCH] ipv4: fix TCP early demux
From: David Miller @ 2012-07-27 23:00 UTC
To: eric.dumazet; +Cc: netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 27 Jul 2012 23:34:33 +0200
> I wonder why cookie is not stored in dst, and must be stored outside
> of it ?
Because in ipv6, cloned routes are never invalidated on a route
insertion or deletion. We keep them around.

The fn_sernum tracks path reachability, rather than validation of
dsts.

So when we delete or insert a new ipv6 route, we update the serial
number of all the FIB nodes on the way down to the insert/delete
point.

If, afterwards, we can still reach a route cloned from one of those
entries successfully, it is still valid.

If we were to store the serial number on the dst, it would invalidate
the cloned route, which the ipv6 code is largely not designed for.
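
The serial number scheme described above can be sketched as a toy
userspace model. Everything below is hypothetical (node_model,
route_model, holder_model and the helper names are not kernel
identifiers), and the validity test is an assumption made for
illustration: a cached route is accepted only while the serial number of
its node still equals the cookie the holder saved.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_DEPTH 8

/* Hypothetical stand-in for a FIB node; only the serial number matters. */
struct node_model {
	struct node_model *parent;
	unsigned int sernum;	/* plays the role of fn_sernum */
};

/* A cloned route stays attached to its node and carries no serial
 * number of its own. */
struct route_model {
	struct node_model *node;
};

/* The holder (for example a socket) keeps the cookie outside the route. */
struct holder_model {
	struct route_model *rt;
	unsigned int cookie;
};

/* An insert or delete at 'where' stamps a new serial number on every
 * node between the root and that point. */
static void tree_changed_at(struct node_model *where, unsigned int new_sernum)
{
	struct node_model *n;
	int depth = 0;

	for (n = where; n != NULL; n = n->parent) {
		depth++;
		assert(depth <= MODEL_MAX_DEPTH);	/* guard against a parent loop */
		n->sernum = new_sernum;
	}
}

/* The route is accepted only while its node's serial number still
 * matches the cookie saved by the holder. */
static struct route_model *route_check(struct route_model *rt, unsigned int cookie)
{
	return (rt->node && rt->node->sernum == cookie) ? rt : NULL;
}

/* Remember a route together with the current serial number of its node. */
static void holder_cache(struct holder_model *h, struct route_model *rt)
{
	h->rt = rt;
	h->cookie = rt->node->sernum;
}
```

After a change below the route's node the holder's cookie goes stale, but
the route object itself is untouched; the holder just looks it up again
and saves the new serial number.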

To be honest, the whole route validation scheme in ipv6 was
jackhammered into place. It was basically two years of Alexey
repairing the largely broken scheme that Pedro had put into place.
This is ~1997 legacy stuff.

It deserves a complete rewrite, but we are too busy with ipv4 at the
moment. And frankly if nobody other than myself was concerned enough
to delete the ipv4 routing cache (code people actually use) the
likelihood of anyone embarking on a task of similar size for ipv6 is
basically zero.