netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 net 0/4] ipv6: datagram: Update dst cache of a connected udp sk during pmtu update
@ 2016-04-11 22:29 Martin KaFai Lau
  2016-04-11 22:29 ` [PATCH v2 net 1/4] ipv6: datagram: Refactor flowi6 init codes to a new function Martin KaFai Lau
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Martin KaFai Lau @ 2016-04-11 22:29 UTC (permalink / raw)
  To: netdev; +Cc: Cong Wang, Eric Dumazet, Wei Wang, Kernel Team

v2:
~ Protect __sk_dst_get() operations with rcu_read_lock in
  release_cb() because another thread may do ip6_dst_store()
  for a udp sk without taking the sk lock (e.g. in sendmsg).
~ Do a ipv6_addr_v4mapped(&sk->sk_v6_daddr) check before
  calling ip6_datagram_dst_update() in patch 3 and 4.  It is
  similar to how __ip6_datagram_connect handles it.
~ One fix in ip6_datagram_dst_update() in patch 2.  It needs
  to check (np->flow_label & IPV6_FLOWLABEL_MASK) before
  doing fl6_sock_lookup.  I was confused with the naming
  of IPV6_FLOWLABEL_MASK and IPV6_FLOWINFO_MASK.
~ Check dst->obsolete just on the safe side, although I think it
  should at least have DST_OBSOLETE_FORCE_CHK by now.
~ Add Fixes tag to patch 3 and 4
~ Add some points from the previous discussion about holding
  sk lock to the commit message in patch 3.

v1:
There is a case in connected UDP socket such that
getsockopt(IPV6_MTU) will return a stale MTU value. The reproducible
sequence could be the following:
1. Create a connected UDP socket
2. Send some datagrams out
3. Receive a ICMPV6_PKT_TOOBIG
4. No new outgoing datagrams to trigger the sk_dst_check()
   logic to update the sk->sk_dst_cache.
5. getsockopt(IPV6_MTU) returns the mtu from the invalid
   sk->sk_dst_cache instead of the newly created RTF_CACHE clone.

Patch 1 and 2 are the prep work.
Patch 3 and 4 are the fixes.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 net 1/4] ipv6: datagram: Refactor flowi6 init codes to a new function
  2016-04-11 22:29 [PATCH v2 net 0/4] ipv6: datagram: Update dst cache of a connected udp sk during pmtu update Martin KaFai Lau
@ 2016-04-11 22:29 ` Martin KaFai Lau
  2016-04-11 22:29 ` [PATCH v2 net 2/4] ipv6: datagram: Refactor dst lookup and update " Martin KaFai Lau
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Martin KaFai Lau @ 2016-04-11 22:29 UTC (permalink / raw)
  To: netdev; +Cc: Cong Wang, Eric Dumazet, Wei Wang, Kernel Team

Move flowi6 init codes for connected datagram sk to a newly created
function ip6_datagram_flow_key_init().

Notes:
1. fl6_flowlabel is used instead of fl6.flowlabel in __ip6_datagram_connect
2. ipv6_addr_is_multicast(&fl6->daddr) is used instead of
   (addr_type & IPV6_ADDR_MULTICAST) in ip6_datagram_flow_key_init()

This new function will be reused during pmtu update in the later patch.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Wei Wang <weiwan@google.com>
---
 net/ipv6/datagram.c | 50 ++++++++++++++++++++++++++++++--------------------
 1 file changed, 30 insertions(+), 20 deletions(-)

diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 4281621..f07c1dd 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -40,6 +40,30 @@ static bool ipv6_mapped_addr_any(const struct in6_addr *a)
 	return ipv6_addr_v4mapped(a) && (a->s6_addr32[3] == 0);
 }
 
+static void ip6_datagram_flow_key_init(struct flowi6 *fl6, struct sock *sk)
+{
+	struct inet_sock *inet = inet_sk(sk);
+	struct ipv6_pinfo *np = inet6_sk(sk);
+
+	memset(fl6, 0, sizeof(*fl6));
+	fl6->flowi6_proto = sk->sk_protocol;
+	fl6->daddr = sk->sk_v6_daddr;
+	fl6->saddr = np->saddr;
+	fl6->flowi6_oif = sk->sk_bound_dev_if;
+	fl6->flowi6_mark = sk->sk_mark;
+	fl6->fl6_dport = inet->inet_dport;
+	fl6->fl6_sport = inet->inet_sport;
+	fl6->flowlabel = np->flow_label;
+
+	if (!fl6->flowi6_oif)
+		fl6->flowi6_oif = np->sticky_pktinfo.ipi6_ifindex;
+
+	if (!fl6->flowi6_oif && ipv6_addr_is_multicast(&fl6->daddr))
+		fl6->flowi6_oif = np->mcast_oif;
+
+	security_sk_classify_flow(sk, flowi6_to_flowi(fl6));
+}
+
 static int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 {
 	struct sockaddr_in6	*usin = (struct sockaddr_in6 *) uaddr;
@@ -52,6 +76,7 @@ static int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int a
 	struct ipv6_txoptions	*opt;
 	int			addr_type;
 	int			err;
+	__be32			fl6_flowlabel = 0;
 
 	if (usin->sin6_family == AF_INET) {
 		if (__ipv6_only_sock(sk))
@@ -66,11 +91,10 @@ static int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int a
 	if (usin->sin6_family != AF_INET6)
 		return -EAFNOSUPPORT;
 
-	memset(&fl6, 0, sizeof(fl6));
 	if (np->sndflow) {
-		fl6.flowlabel = usin->sin6_flowinfo&IPV6_FLOWINFO_MASK;
-		if (fl6.flowlabel&IPV6_FLOWLABEL_MASK) {
-			flowlabel = fl6_sock_lookup(sk, fl6.flowlabel);
+		fl6_flowlabel = usin->sin6_flowinfo & IPV6_FLOWINFO_MASK;
+		if (fl6_flowlabel & IPV6_FLOWLABEL_MASK) {
+			flowlabel = fl6_sock_lookup(sk, fl6_flowlabel);
 			if (!flowlabel)
 				return -EINVAL;
 		}
@@ -145,7 +169,7 @@ ipv4_connected:
 	}
 
 	sk->sk_v6_daddr = *daddr;
-	np->flow_label = fl6.flowlabel;
+	np->flow_label = fl6_flowlabel;
 
 	inet->inet_dport = usin->sin6_port;
 
@@ -154,21 +178,7 @@ ipv4_connected:
 	 *	destination cache for it.
 	 */
 
-	fl6.flowi6_proto = sk->sk_protocol;
-	fl6.daddr = sk->sk_v6_daddr;
-	fl6.saddr = np->saddr;
-	fl6.flowi6_oif = sk->sk_bound_dev_if;
-	fl6.flowi6_mark = sk->sk_mark;
-	fl6.fl6_dport = inet->inet_dport;
-	fl6.fl6_sport = inet->inet_sport;
-
-	if (!fl6.flowi6_oif)
-		fl6.flowi6_oif = np->sticky_pktinfo.ipi6_ifindex;
-
-	if (!fl6.flowi6_oif && (addr_type&IPV6_ADDR_MULTICAST))
-		fl6.flowi6_oif = np->mcast_oif;
-
-	security_sk_classify_flow(sk, flowi6_to_flowi(&fl6));
+	ip6_datagram_flow_key_init(&fl6, sk);
 
 	rcu_read_lock();
 	opt = flowlabel ? flowlabel->opt : rcu_dereference(np->opt);
-- 
2.5.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 net 2/4] ipv6: datagram: Refactor dst lookup and update codes to a new function
  2016-04-11 22:29 [PATCH v2 net 0/4] ipv6: datagram: Update dst cache of a connected udp sk during pmtu update Martin KaFai Lau
  2016-04-11 22:29 ` [PATCH v2 net 1/4] ipv6: datagram: Refactor flowi6 init codes to a new function Martin KaFai Lau
@ 2016-04-11 22:29 ` Martin KaFai Lau
  2016-04-11 22:29 ` [PATCH v2 net 3/4] ipv6: datagram: Update dst cache of a connected datagram sk during pmtu update Martin KaFai Lau
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Martin KaFai Lau @ 2016-04-11 22:29 UTC (permalink / raw)
  To: netdev; +Cc: Cong Wang, Eric Dumazet, Wei Wang, Kernel Team

This patch moves the route lookup and update codes for connected
datagram sk to a newly created function ip6_datagram_dst_update()

It will be reused during the pmtu update in the later patch.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Wei Wang <weiwan@google.com>
---
 net/ipv6/datagram.c | 103 +++++++++++++++++++++++++++++-----------------------
 1 file changed, 57 insertions(+), 46 deletions(-)

diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index f07c1dd..669585e 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -64,16 +64,65 @@ static void ip6_datagram_flow_key_init(struct flowi6 *fl6, struct sock *sk)
 	security_sk_classify_flow(sk, flowi6_to_flowi(fl6));
 }
 
+static int ip6_datagram_dst_update(struct sock *sk)
+{
+	struct ip6_flowlabel *flowlabel = NULL;
+	struct in6_addr *final_p, final;
+	struct ipv6_txoptions *opt;
+	struct dst_entry *dst;
+	struct inet_sock *inet = inet_sk(sk);
+	struct ipv6_pinfo *np = inet6_sk(sk);
+	struct flowi6 fl6;
+	int err = 0;
+
+	if (np->sndflow && (np->flow_label & IPV6_FLOWLABEL_MASK)) {
+		flowlabel = fl6_sock_lookup(sk, np->flow_label);
+		if (!flowlabel)
+			return -EINVAL;
+	}
+	ip6_datagram_flow_key_init(&fl6, sk);
+
+	rcu_read_lock();
+	opt = flowlabel ? flowlabel->opt : rcu_dereference(np->opt);
+	final_p = fl6_update_dst(&fl6, opt, &final);
+	rcu_read_unlock();
+
+	dst = ip6_dst_lookup_flow(sk, &fl6, final_p);
+	if (IS_ERR(dst)) {
+		err = PTR_ERR(dst);
+		goto out;
+	}
+
+	if (ipv6_addr_any(&np->saddr))
+		np->saddr = fl6.saddr;
+
+	if (ipv6_addr_any(&sk->sk_v6_rcv_saddr)) {
+		sk->sk_v6_rcv_saddr = fl6.saddr;
+		inet->inet_rcv_saddr = LOOPBACK4_IPV6;
+		if (sk->sk_prot->rehash)
+			sk->sk_prot->rehash(sk);
+	}
+
+	ip6_dst_store(sk, dst,
+		      ipv6_addr_equal(&fl6.daddr, &sk->sk_v6_daddr) ?
+		      &sk->sk_v6_daddr : NULL,
+#ifdef CONFIG_IPV6_SUBTREES
+		      ipv6_addr_equal(&fl6.saddr, &np->saddr) ?
+		      &np->saddr :
+#endif
+		      NULL);
+
+out:
+	fl6_sock_release(flowlabel);
+	return err;
+}
+
 static int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 {
 	struct sockaddr_in6	*usin = (struct sockaddr_in6 *) uaddr;
 	struct inet_sock	*inet = inet_sk(sk);
 	struct ipv6_pinfo	*np = inet6_sk(sk);
-	struct in6_addr	*daddr, *final_p, final;
-	struct dst_entry	*dst;
-	struct flowi6		fl6;
-	struct ip6_flowlabel	*flowlabel = NULL;
-	struct ipv6_txoptions	*opt;
+	struct in6_addr		*daddr;
 	int			addr_type;
 	int			err;
 	__be32			fl6_flowlabel = 0;
@@ -91,14 +140,8 @@ static int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int a
 	if (usin->sin6_family != AF_INET6)
 		return -EAFNOSUPPORT;
 
-	if (np->sndflow) {
+	if (np->sndflow)
 		fl6_flowlabel = usin->sin6_flowinfo & IPV6_FLOWINFO_MASK;
-		if (fl6_flowlabel & IPV6_FLOWLABEL_MASK) {
-			flowlabel = fl6_sock_lookup(sk, fl6_flowlabel);
-			if (!flowlabel)
-				return -EINVAL;
-		}
-	}
 
 	addr_type = ipv6_addr_type(&usin->sin6_addr);
 
@@ -178,45 +221,13 @@ ipv4_connected:
 	 *	destination cache for it.
 	 */
 
-	ip6_datagram_flow_key_init(&fl6, sk);
-
-	rcu_read_lock();
-	opt = flowlabel ? flowlabel->opt : rcu_dereference(np->opt);
-	final_p = fl6_update_dst(&fl6, opt, &final);
-	rcu_read_unlock();
-
-	dst = ip6_dst_lookup_flow(sk, &fl6, final_p);
-	err = 0;
-	if (IS_ERR(dst)) {
-		err = PTR_ERR(dst);
+	err = ip6_datagram_dst_update(sk);
+	if (err)
 		goto out;
-	}
-
-	/* source address lookup done in ip6_dst_lookup */
-
-	if (ipv6_addr_any(&np->saddr))
-		np->saddr = fl6.saddr;
-
-	if (ipv6_addr_any(&sk->sk_v6_rcv_saddr)) {
-		sk->sk_v6_rcv_saddr = fl6.saddr;
-		inet->inet_rcv_saddr = LOOPBACK4_IPV6;
-		if (sk->sk_prot->rehash)
-			sk->sk_prot->rehash(sk);
-	}
-
-	ip6_dst_store(sk, dst,
-		      ipv6_addr_equal(&fl6.daddr, &sk->sk_v6_daddr) ?
-		      &sk->sk_v6_daddr : NULL,
-#ifdef CONFIG_IPV6_SUBTREES
-		      ipv6_addr_equal(&fl6.saddr, &np->saddr) ?
-		      &np->saddr :
-#endif
-		      NULL);
 
 	sk->sk_state = TCP_ESTABLISHED;
 	sk_set_txhash(sk);
 out:
-	fl6_sock_release(flowlabel);
 	return err;
 }
 
-- 
2.5.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 net 3/4] ipv6: datagram: Update dst cache of a connected datagram sk during pmtu update
  2016-04-11 22:29 [PATCH v2 net 0/4] ipv6: datagram: Update dst cache of a connected udp sk during pmtu update Martin KaFai Lau
  2016-04-11 22:29 ` [PATCH v2 net 1/4] ipv6: datagram: Refactor flowi6 init codes to a new function Martin KaFai Lau
  2016-04-11 22:29 ` [PATCH v2 net 2/4] ipv6: datagram: Refactor dst lookup and update " Martin KaFai Lau
@ 2016-04-11 22:29 ` Martin KaFai Lau
  2016-04-11 22:29 ` [PATCH v2 net 4/4] ipv6: udp: Do a route lookup and update during release_cb Martin KaFai Lau
  2016-04-14 20:45 ` [PATCH v2 net 0/4] ipv6: datagram: Update dst cache of a connected udp sk during pmtu update David Miller
  4 siblings, 0 replies; 6+ messages in thread
From: Martin KaFai Lau @ 2016-04-11 22:29 UTC (permalink / raw)
  To: netdev; +Cc: Cong Wang, Eric Dumazet, Wei Wang, Kernel Team

There is a case in connected UDP socket such that
getsockopt(IPV6_MTU) will return a stale MTU value. The reproducible
sequence could be the following:
1. Create a connected UDP socket
2. Send some datagrams out
3. Receive a ICMPV6_PKT_TOOBIG
4. No new outgoing datagrams to trigger the sk_dst_check()
   logic to update the sk->sk_dst_cache.
5. getsockopt(IPV6_MTU) returns the mtu from the invalid
   sk->sk_dst_cache instead of the newly created RTF_CACHE clone.

This patch updates the sk->sk_dst_cache for a connected datagram sk
during pmtu-update code path.

Note that the sk->sk_v6_daddr is used to do the route lookup
instead of skb->data (i.e. iph).  It is because a UDP socket can become
connected after sending out some datagrams in un-connected state.  or
It can be connected multiple times to different destinations.  Hence,
iph may not be related to where sk is currently connected to.

It is done under '!sock_owned_by_user(sk)' condition because
the user may make another ip6_datagram_connect()  (i.e changing
the sk->sk_v6_daddr) while dst lookup is happening in the pmtu-update
code path.

For the sock_owned_by_user(sk) == true case, the next patch will
introduce a release_cb() which will update the sk->sk_dst_cache.

Test:

Server (Connected UDP Socket):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Route Details:
[root@arch-fb-vm1 ~]# ip -6 r show | egrep '2fac'
2fac::/64 dev eth0  proto kernel  metric 256  pref medium
2fac:face::/64 via 2fac::face dev eth0  metric 1024  pref medium

A simple python code to create a connected UDP socket:

import socket
import errno

HOST = '2fac::1'
PORT = 8080

s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
s.bind((HOST, PORT))
s.connect(('2fac:face::face', 53))
print("connected")
while True:
    try:
	data = s.recv(1024)
    except socket.error as se:
	if se.errno == errno.EMSGSIZE:
		pmtu = s.getsockopt(41, 24)
		print("PMTU:%d" % pmtu)
		break
s.close()

Python program output after getting a ICMPV6_PKT_TOOBIG:
[root@arch-fb-vm1 ~]# python2 ~/devshare/kernel/tasks/fib6/udp-connect-53-8080.py
connected
PMTU:1300

Cache routes after recieving TOOBIG:
[root@arch-fb-vm1 ~]# ip -6 r show table cache
2fac:face::face via 2fac::face dev eth0  metric 0
    cache  expires 463sec mtu 1300 pref medium

Client (Send the ICMPV6_PKT_TOOBIG):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
scapy is used to generate the TOOBIG message.  Here is the scapy script I have
used:

>>> p=Ether(src='da:75:4d:36:ac:32', dst='52:54:00:12:34:66', type=0x86dd)/IPv6(src='2fac::face', dst='2fac::1')/ICMPv6PacketTooBig(mtu=1300)/IPv6(src='2fac::
1',dst='2fac:face::face', nh='UDP')/UDP(sport=8080,dport=53)
>>> sendp(p, iface='qemubr0')

Fixes: 45e4fd26683c ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Reported-by: Wei Wang <weiwan@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Wei Wang <weiwan@google.com>
---
 include/net/ipv6.h  |  1 +
 net/ipv6/datagram.c | 20 +++++++++++---------
 net/ipv6/route.c    | 12 ++++++++++++
 3 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index d0aeb97..fd02e90 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -959,6 +959,7 @@ int compat_ipv6_getsockopt(struct sock *sk, int level, int optname,
 int ip6_datagram_connect(struct sock *sk, struct sockaddr *addr, int addr_len);
 int ip6_datagram_connect_v6_only(struct sock *sk, struct sockaddr *addr,
 				 int addr_len);
+int ip6_datagram_dst_update(struct sock *sk, bool fix_sk_saddr);
 
 int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len,
 		    int *addr_len);
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 669585e..59e01f27 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -64,7 +64,7 @@ static void ip6_datagram_flow_key_init(struct flowi6 *fl6, struct sock *sk)
 	security_sk_classify_flow(sk, flowi6_to_flowi(fl6));
 }
 
-static int ip6_datagram_dst_update(struct sock *sk)
+int ip6_datagram_dst_update(struct sock *sk, bool fix_sk_saddr)
 {
 	struct ip6_flowlabel *flowlabel = NULL;
 	struct in6_addr *final_p, final;
@@ -93,14 +93,16 @@ static int ip6_datagram_dst_update(struct sock *sk)
 		goto out;
 	}
 
-	if (ipv6_addr_any(&np->saddr))
-		np->saddr = fl6.saddr;
+	if (fix_sk_saddr) {
+		if (ipv6_addr_any(&np->saddr))
+			np->saddr = fl6.saddr;
 
-	if (ipv6_addr_any(&sk->sk_v6_rcv_saddr)) {
-		sk->sk_v6_rcv_saddr = fl6.saddr;
-		inet->inet_rcv_saddr = LOOPBACK4_IPV6;
-		if (sk->sk_prot->rehash)
-			sk->sk_prot->rehash(sk);
+		if (ipv6_addr_any(&sk->sk_v6_rcv_saddr)) {
+			sk->sk_v6_rcv_saddr = fl6.saddr;
+			inet->inet_rcv_saddr = LOOPBACK4_IPV6;
+			if (sk->sk_prot->rehash)
+				sk->sk_prot->rehash(sk);
+		}
 	}
 
 	ip6_dst_store(sk, dst,
@@ -221,7 +223,7 @@ ipv4_connected:
 	 *	destination cache for it.
 	 */
 
-	err = ip6_datagram_dst_update(sk);
+	err = ip6_datagram_dst_update(sk, true);
 	if (err)
 		goto out;
 
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 1d8871a..d916d6a 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1418,8 +1418,20 @@ EXPORT_SYMBOL_GPL(ip6_update_pmtu);
 
 void ip6_sk_update_pmtu(struct sk_buff *skb, struct sock *sk, __be32 mtu)
 {
+	struct dst_entry *dst;
+
 	ip6_update_pmtu(skb, sock_net(sk), mtu,
 			sk->sk_bound_dev_if, sk->sk_mark);
+
+	dst = __sk_dst_get(sk);
+	if (!dst || !dst->obsolete ||
+	    dst->ops->check(dst, inet6_sk(sk)->dst_cookie))
+		return;
+
+	bh_lock_sock(sk);
+	if (!sock_owned_by_user(sk) && !ipv6_addr_v4mapped(&sk->sk_v6_daddr))
+		ip6_datagram_dst_update(sk, false);
+	bh_unlock_sock(sk);
 }
 EXPORT_SYMBOL_GPL(ip6_sk_update_pmtu);
 
-- 
2.5.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 net 4/4] ipv6: udp: Do a route lookup and update during release_cb
  2016-04-11 22:29 [PATCH v2 net 0/4] ipv6: datagram: Update dst cache of a connected udp sk during pmtu update Martin KaFai Lau
                   ` (2 preceding siblings ...)
  2016-04-11 22:29 ` [PATCH v2 net 3/4] ipv6: datagram: Update dst cache of a connected datagram sk during pmtu update Martin KaFai Lau
@ 2016-04-11 22:29 ` Martin KaFai Lau
  2016-04-14 20:45 ` [PATCH v2 net 0/4] ipv6: datagram: Update dst cache of a connected udp sk during pmtu update David Miller
  4 siblings, 0 replies; 6+ messages in thread
From: Martin KaFai Lau @ 2016-04-11 22:29 UTC (permalink / raw)
  To: netdev; +Cc: Cong Wang, Eric Dumazet, Wei Wang, Kernel Team

This patch adds a release_cb for UDPv6.  It does a route lookup
and updates sk->sk_dst_cache if it is needed.  It picks up the
left-over job from ip6_sk_update_pmtu() if the sk was owned
by user during the pmtu update.

It takes a rcu_read_lock to protect the __sk_dst_get() operations
because another thread may do ip6_dst_store() without taking the
sk lock (e.g. sendmsg).

Fixes: 45e4fd26683c ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Reported-by: Wei Wang <weiwan@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Wei Wang <weiwan@google.com>
---
 include/net/ipv6.h  |  1 +
 net/ipv6/datagram.c | 20 ++++++++++++++++++++
 net/ipv6/udp.c      |  1 +
 3 files changed, 22 insertions(+)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index fd02e90..1be050a 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -960,6 +960,7 @@ int ip6_datagram_connect(struct sock *sk, struct sockaddr *addr, int addr_len);
 int ip6_datagram_connect_v6_only(struct sock *sk, struct sockaddr *addr,
 				 int addr_len);
 int ip6_datagram_dst_update(struct sock *sk, bool fix_sk_saddr);
+void ip6_datagram_release_cb(struct sock *sk);
 
 int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len,
 		    int *addr_len);
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 59e01f27..9dd3882 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -119,6 +119,26 @@ out:
 	return err;
 }
 
+void ip6_datagram_release_cb(struct sock *sk)
+{
+	struct dst_entry *dst;
+
+	if (ipv6_addr_v4mapped(&sk->sk_v6_daddr))
+		return;
+
+	rcu_read_lock();
+	dst = __sk_dst_get(sk);
+	if (!dst || !dst->obsolete ||
+	    dst->ops->check(dst, inet6_sk(sk)->dst_cookie)) {
+		rcu_read_unlock();
+		return;
+	}
+	rcu_read_unlock();
+
+	ip6_datagram_dst_update(sk, false);
+}
+EXPORT_SYMBOL_GPL(ip6_datagram_release_cb);
+
 static int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 {
 	struct sockaddr_in6	*usin = (struct sockaddr_in6 *) uaddr;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 8125931..6bc5c66 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1539,6 +1539,7 @@ struct proto udpv6_prot = {
 	.sendmsg	   = udpv6_sendmsg,
 	.recvmsg	   = udpv6_recvmsg,
 	.backlog_rcv	   = __udpv6_queue_rcv_skb,
+	.release_cb	   = ip6_datagram_release_cb,
 	.hash		   = udp_lib_hash,
 	.unhash		   = udp_lib_unhash,
 	.rehash		   = udp_v6_rehash,
-- 
2.5.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 net 0/4] ipv6: datagram: Update dst cache of a connected udp sk during pmtu update
  2016-04-11 22:29 [PATCH v2 net 0/4] ipv6: datagram: Update dst cache of a connected udp sk during pmtu update Martin KaFai Lau
                   ` (3 preceding siblings ...)
  2016-04-11 22:29 ` [PATCH v2 net 4/4] ipv6: udp: Do a route lookup and update during release_cb Martin KaFai Lau
@ 2016-04-14 20:45 ` David Miller
  4 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2016-04-14 20:45 UTC (permalink / raw)
  To: kafai; +Cc: netdev, xiyou.wangcong, edumazet, weiwan, kernel-team

From: Martin KaFai Lau <kafai@fb.com>
Date: Mon, 11 Apr 2016 15:29:33 -0700

> v2:
> ~ Protect __sk_dst_get() operations with rcu_read_lock in
>   release_cb() because another thread may do ip6_dst_store()
>   for a udp sk without taking the sk lock (e.g. in sendmsg).
> ~ Do a ipv6_addr_v4mapped(&sk->sk_v6_daddr) check before
>   calling ip6_datagram_dst_update() in patch 3 and 4.  It is
>   similar to how __ip6_datagram_connect handles it.
> ~ One fix in ip6_datagram_dst_update() in patch 2.  It needs
>   to check (np->flow_label & IPV6_FLOWLABEL_MASK) before
>   doing fl6_sock_lookup.  I was confused with the naming
>   of IPV6_FLOWLABEL_MASK and IPV6_FLOWINFO_MASK.
> ~ Check dst->obsolete just on the safe side, although I think it
>   should at least have DST_OBSOLETE_FORCE_CHK by now.
> ~ Add Fixes tag to patch 3 and 4
> ~ Add some points from the previous discussion about holding
>   sk lock to the commit message in patch 3.
> 
> v1:
> There is a case in connected UDP socket such that
> getsockopt(IPV6_MTU) will return a stale MTU value. The reproducible
> sequence could be the following:
> 1. Create a connected UDP socket
> 2. Send some datagrams out
> 3. Receive a ICMPV6_PKT_TOOBIG
> 4. No new outgoing datagrams to trigger the sk_dst_check()
>    logic to update the sk->sk_dst_cache.
> 5. getsockopt(IPV6_MTU) returns the mtu from the invalid
>    sk->sk_dst_cache instead of the newly created RTF_CACHE clone.
> 
> Patch 1 and 2 are the prep work.
> Patch 3 and 4 are the fixes.

Series applied, thanks Martin.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-04-14 20:45 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-04-11 22:29 [PATCH v2 net 0/4] ipv6: datagram: Update dst cache of a connected udp sk during pmtu update Martin KaFai Lau
2016-04-11 22:29 ` [PATCH v2 net 1/4] ipv6: datagram: Refactor flowi6 init codes to a new function Martin KaFai Lau
2016-04-11 22:29 ` [PATCH v2 net 2/4] ipv6: datagram: Refactor dst lookup and update " Martin KaFai Lau
2016-04-11 22:29 ` [PATCH v2 net 3/4] ipv6: datagram: Update dst cache of a connected datagram sk during pmtu update Martin KaFai Lau
2016-04-11 22:29 ` [PATCH v2 net 4/4] ipv6: udp: Do a route lookup and update during release_cb Martin KaFai Lau
2016-04-14 20:45 ` [PATCH v2 net 0/4] ipv6: datagram: Update dst cache of a connected udp sk during pmtu update David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).