netdev.vger.kernel.org archive mirror
* [PATCH net-next 1/5] net: Rename FLOWI_FLAG_VRFSRC to FLOWI_FLAG_L3MDEV_SRC
  2015-10-01 15:27 [PATCH net-next 0/5] " David Ahern
@ 2015-10-01 15:27 ` David Ahern
  0 siblings, 0 replies; 8+ messages in thread
From: David Ahern @ 2015-10-01 15:27 UTC (permalink / raw)
  To: netdev; +Cc: dsahern, David Ahern

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 drivers/net/vrf.c   | 4 ++--
 include/net/flow.h  | 2 +-
 include/net/route.h | 2 +-
 net/ipv4/udp.c      | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index 64f2ab663ffe..f2f9c7091130 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -208,7 +208,7 @@ static netdev_tx_t vrf_process_v4_outbound(struct sk_buff *skb,
 		.flowi4_oif = vrf_dev->ifindex,
 		.flowi4_iif = LOOPBACK_IFINDEX,
 		.flowi4_tos = RT_TOS(ip4h->tos),
-		.flowi4_flags = FLOWI_FLAG_ANYSRC | FLOWI_FLAG_VRFSRC |
+		.flowi4_flags = FLOWI_FLAG_ANYSRC | FLOWI_FLAG_L3MDEV_SRC |
 				FLOWI_FLAG_SKIP_NH_OIF,
 		.daddr = ip4h->daddr,
 	};
@@ -545,7 +545,7 @@ static struct rtable *vrf_get_rtable(const struct net_device *dev,
 {
 	struct rtable *rth = NULL;
 
-	if (!(fl4->flowi4_flags & FLOWI_FLAG_VRFSRC)) {
+	if (!(fl4->flowi4_flags & FLOWI_FLAG_L3MDEV_SRC)) {
 		struct net_vrf *vrf = netdev_priv(dev);
 
 		rth = vrf->rth;
diff --git a/include/net/flow.h b/include/net/flow.h
index 9b85db85f13c..83969eebebf3 100644
--- a/include/net/flow.h
+++ b/include/net/flow.h
@@ -34,7 +34,7 @@ struct flowi_common {
 	__u8	flowic_flags;
 #define FLOWI_FLAG_ANYSRC		0x01
 #define FLOWI_FLAG_KNOWN_NH		0x02
-#define FLOWI_FLAG_VRFSRC		0x04
+#define FLOWI_FLAG_L3MDEV_SRC		0x04
 #define FLOWI_FLAG_SKIP_NH_OIF		0x08
 	__u32	flowic_secid;
 	struct flowi_tunnel flowic_tun_key;
diff --git a/include/net/route.h b/include/net/route.h
index e211dc167db1..7929c9c33587 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -258,7 +258,7 @@ static inline void ip_route_connect_init(struct flowi4 *fl4, __be32 dst, __be32
 		flow_flags |= FLOWI_FLAG_ANYSRC;
 
 	if (netif_index_is_l3_master(sock_net(sk), oif))
-		flow_flags |= FLOWI_FLAG_VRFSRC | FLOWI_FLAG_SKIP_NH_OIF;
+		flow_flags |= FLOWI_FLAG_L3MDEV_SRC | FLOWI_FLAG_SKIP_NH_OIF;
 
 	flowi4_init_output(fl4, oif, sk->sk_mark, tos, RT_SCOPE_UNIVERSE,
 			   protocol, flow_flags, dst, src, dport, sport);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 156ba75b6000..b2882cfd3136 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1024,7 +1024,7 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		if (netif_index_is_l3_master(net, ipc.oif)) {
 			flowi4_init_output(fl4, ipc.oif, sk->sk_mark, tos,
 					   RT_SCOPE_UNIVERSE, sk->sk_protocol,
-					   (flow_flags | FLOWI_FLAG_VRFSRC |
+					   (flow_flags | FLOWI_FLAG_L3MDEV_SRC |
 					    FLOWI_FLAG_SKIP_NH_OIF),
 					   faddr, saddr, dport,
 					   inet->inet_sport);
-- 
1.9.1


* [PATCH net-next 0/5 v2] net: Add saddr op to l3mdev and vrf
@ 2015-10-05 15:51 David Ahern
  2015-10-05 15:51 ` [PATCH net-next 1/5] net: Rename FLOWI_FLAG_VRFSRC to FLOWI_FLAG_L3MDEV_SRC David Ahern
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: David Ahern @ 2015-10-05 15:51 UTC (permalink / raw)
  To: netdev; +Cc: dsahern, David Ahern

First 2 patches are re-sends of patches that got lost in the ethosphere
Tuesday; they were part of the first round of l3mdev conversions.
Next 3 handle the source address lookup for raw and datagram sockets
bound to a VRF device.

The conversion to the get_saddr op also fixes locally originated TCP
packets showing up at the VRF device. The use of the FLOWI_FLAG_L3MDEV_SRC
flag in ip_route_connect_init was causing locally generated packets
to skip the VRF device.

v2
- rebased to top of net-next per device delete fix and hash based
  multipath patches

David Ahern (5):
  net: Rename FLOWI_FLAG_VRFSRC to FLOWI_FLAG_L3MDEV_SRC
  net: Add netif_is_l3_slave
  net: Refactor path selection in __ip_route_output_key_hash
  net: Add source address lookup op for VRF
  net: Add l3mdev saddr lookup to raw_sendmsg

 drivers/net/vrf.c         | 49 +++++++++++++++++++++++++++++++++++++++--------
 include/linux/netdevice.h |  7 +++++++
 include/net/flow.h        |  2 +-
 include/net/ip_fib.h      |  2 ++
 include/net/l3mdev.h      | 27 ++++++++++++++++++++++++++
 include/net/route.h       |  7 ++++---
 net/ipv4/fib_semantics.c  | 21 ++++++++++++++++++++
 net/ipv4/raw.c            |  8 ++++++--
 net/ipv4/route.c          | 16 +---------------
 net/ipv4/udp.c            | 22 +++------------------
 net/l3mdev/l3mdev.c       |  8 ++++----
 11 files changed, 117 insertions(+), 52 deletions(-)

-- 
1.9.1


* [PATCH net-next 1/5] net: Rename FLOWI_FLAG_VRFSRC to FLOWI_FLAG_L3MDEV_SRC
  2015-10-05 15:51 [PATCH net-next 0/5 v2] net: Add saddr op to l3mdev and vrf David Ahern
@ 2015-10-05 15:51 ` David Ahern
  2015-10-05 15:51 ` [PATCH net-next 2/5] net: Add netif_is_l3_slave David Ahern
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: David Ahern @ 2015-10-05 15:51 UTC (permalink / raw)
  To: netdev; +Cc: dsahern, David Ahern

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 drivers/net/vrf.c   | 4 ++--
 include/net/flow.h  | 2 +-
 include/net/route.h | 2 +-
 net/ipv4/udp.c      | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index 474396353e7f..4fd5af1acff0 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -208,7 +208,7 @@ static netdev_tx_t vrf_process_v4_outbound(struct sk_buff *skb,
 		.flowi4_oif = vrf_dev->ifindex,
 		.flowi4_iif = LOOPBACK_IFINDEX,
 		.flowi4_tos = RT_TOS(ip4h->tos),
-		.flowi4_flags = FLOWI_FLAG_ANYSRC | FLOWI_FLAG_VRFSRC |
+		.flowi4_flags = FLOWI_FLAG_ANYSRC | FLOWI_FLAG_L3MDEV_SRC |
 				FLOWI_FLAG_SKIP_NH_OIF,
 		.daddr = ip4h->daddr,
 	};
@@ -545,7 +545,7 @@ static struct rtable *vrf_get_rtable(const struct net_device *dev,
 {
 	struct rtable *rth = NULL;
 
-	if (!(fl4->flowi4_flags & FLOWI_FLAG_VRFSRC)) {
+	if (!(fl4->flowi4_flags & FLOWI_FLAG_L3MDEV_SRC)) {
 		struct net_vrf *vrf = netdev_priv(dev);
 
 		rth = vrf->rth;
diff --git a/include/net/flow.h b/include/net/flow.h
index 9b85db85f13c..83969eebebf3 100644
--- a/include/net/flow.h
+++ b/include/net/flow.h
@@ -34,7 +34,7 @@ struct flowi_common {
 	__u8	flowic_flags;
 #define FLOWI_FLAG_ANYSRC		0x01
 #define FLOWI_FLAG_KNOWN_NH		0x02
-#define FLOWI_FLAG_VRFSRC		0x04
+#define FLOWI_FLAG_L3MDEV_SRC		0x04
 #define FLOWI_FLAG_SKIP_NH_OIF		0x08
 	__u32	flowic_secid;
 	struct flowi_tunnel flowic_tun_key;
diff --git a/include/net/route.h b/include/net/route.h
index d32cb76f5302..3e18d90b3f4e 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -267,7 +267,7 @@ static inline void ip_route_connect_init(struct flowi4 *fl4, __be32 dst, __be32
 		flow_flags |= FLOWI_FLAG_ANYSRC;
 
 	if (netif_index_is_l3_master(sock_net(sk), oif))
-		flow_flags |= FLOWI_FLAG_VRFSRC | FLOWI_FLAG_SKIP_NH_OIF;
+		flow_flags |= FLOWI_FLAG_L3MDEV_SRC | FLOWI_FLAG_SKIP_NH_OIF;
 
 	flowi4_init_output(fl4, oif, sk->sk_mark, tos, RT_SCOPE_UNIVERSE,
 			   protocol, flow_flags, dst, src, dport, sport);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 156ba75b6000..b2882cfd3136 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1024,7 +1024,7 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		if (netif_index_is_l3_master(net, ipc.oif)) {
 			flowi4_init_output(fl4, ipc.oif, sk->sk_mark, tos,
 					   RT_SCOPE_UNIVERSE, sk->sk_protocol,
-					   (flow_flags | FLOWI_FLAG_VRFSRC |
+					   (flow_flags | FLOWI_FLAG_L3MDEV_SRC |
 					    FLOWI_FLAG_SKIP_NH_OIF),
 					   faddr, saddr, dport,
 					   inet->inet_sport);
-- 
1.9.1


* [PATCH net-next 2/5] net: Add netif_is_l3_slave
  2015-10-05 15:51 [PATCH net-next 0/5 v2] net: Add saddr op to l3mdev and vrf David Ahern
  2015-10-05 15:51 ` [PATCH net-next 1/5] net: Rename FLOWI_FLAG_VRFSRC to FLOWI_FLAG_L3MDEV_SRC David Ahern
@ 2015-10-05 15:51 ` David Ahern
  2015-10-05 15:51 ` [PATCH net-next 3/5] net: Refactor path selection in __ip_route_output_key_hash David Ahern
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: David Ahern @ 2015-10-05 15:51 UTC (permalink / raw)
  To: netdev; +Cc: dsahern, David Ahern

IPv6 addrconf keys off of IFF_SLAVE, so that flag cannot be used to mark
L3 slaves. Add a new private flag, IFF_L3MDEV_SLAVE, and a
netif_is_l3_slave() helper for checking it.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 drivers/net/vrf.c         | 10 ++++------
 include/linux/netdevice.h |  7 +++++++
 net/l3mdev/l3mdev.c       |  8 ++++----
 3 files changed, 15 insertions(+), 10 deletions(-)
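
For readers following along, here is a minimal userspace sketch of the idea,
not kernel code. It assumes a two-word flag layout like dev->flags and
dev->priv_flags. The model_* names and struct model_netdev are hypothetical.
The private bit values match the patch; the IFF_SLAVE value is only there to
show that the user-visible word is left alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; only the two private bit positions come from
 * the patch. The user-visible value is illustrative for this model. */
#define MODEL_IFF_SLAVE          (1u << 11)
#define MODEL_IFF_L3MDEV_MASTER  (1u << 20)
#define MODEL_IFF_L3MDEV_SLAVE   (1u << 23)

struct model_netdev {
	unsigned int flags;       /* models dev->flags (visible to userspace) */
	unsigned int priv_flags;  /* models dev->priv_flags (kernel internal) */
};

static inline bool model_is_l3_master(const struct model_netdev *dev)
{
	assert(dev != NULL);
	return (dev->priv_flags & MODEL_IFF_L3MDEV_MASTER) != 0;
}

static inline bool model_is_l3_slave(const struct model_netdev *dev)
{
	assert(dev != NULL);
	return (dev->priv_flags & MODEL_IFF_L3MDEV_SLAVE) != 0;
}

/* Enslaving touches only the private word, so code that keys off
 * the user-visible slave flag sees no change. */
static inline void model_enslave(struct model_netdev *port)
{
	assert(port != NULL);
	port->priv_flags |= MODEL_IFF_L3MDEV_SLAVE;
}

static inline void model_release(struct model_netdev *port)
{
	assert(port != NULL);
	port->priv_flags &= ~MODEL_IFF_L3MDEV_SLAVE;
}
```

The point of the sketch is that the slave test and the master test read the
same private word, and the user-visible word is never modified.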

diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index 4fd5af1acff0..8713317eed86 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -39,8 +39,6 @@
 #define DRV_NAME	"vrf"
 #define DRV_VERSION	"1.0"
 
-#define vrf_is_slave(dev)   ((dev)->flags & IFF_SLAVE)
-
 #define vrf_master_get_rcu(dev) \
 	((struct net_device *)rcu_dereference(dev->rx_handler_data))
 
@@ -433,7 +431,7 @@ static int do_vrf_add_slave(struct net_device *dev, struct net_device *port_dev)
 	if (ret < 0)
 		goto out_unregister;
 
-	port_dev->flags |= IFF_SLAVE;
+	port_dev->priv_flags |= IFF_L3MDEV_SLAVE;
 	__vrf_insert_slave(queue, slave);
 	cycle_netdev(port_dev);
 
@@ -448,7 +446,7 @@ static int do_vrf_add_slave(struct net_device *dev, struct net_device *port_dev)
 
 static int vrf_add_slave(struct net_device *dev, struct net_device *port_dev)
 {
-	if (netif_is_l3_master(port_dev) || vrf_is_slave(port_dev))
+	if (netif_is_l3_master(port_dev) || netif_is_l3_slave(port_dev))
 		return -EINVAL;
 
 	return do_vrf_add_slave(dev, port_dev);
@@ -462,7 +460,7 @@ static int do_vrf_del_slave(struct net_device *dev, struct net_device *port_dev)
 	struct slave *slave;
 
 	netdev_upper_dev_unlink(port_dev, dev);
-	port_dev->flags &= ~IFF_SLAVE;
+	port_dev->priv_flags &= ~IFF_L3MDEV_SLAVE;
 
 	netdev_rx_handler_unregister(port_dev);
 
@@ -672,7 +670,7 @@ static int vrf_device_event(struct notifier_block *unused,
 	if (event == NETDEV_UNREGISTER) {
 		struct net_device *vrf_dev;
 
-		if (!vrf_is_slave(dev) || netif_is_l3_master(dev))
+		if (!netif_is_l3_slave(dev))
 			goto out;
 
 		vrf_dev = netdev_master_upper_dev_get(dev);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index b9450784ae06..b3374402c1ea 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1261,6 +1261,7 @@ struct net_device_ops {
  * @IFF_L3MDEV_MASTER: device is an L3 master device
  * @IFF_NO_QUEUE: device can run without qdisc attached
  * @IFF_OPENVSWITCH: device is a Open vSwitch master
+ * @IFF_L3MDEV_SLAVE: device is enslaved to an L3 master device
  */
 enum netdev_priv_flags {
 	IFF_802_1Q_VLAN			= 1<<0,
@@ -1286,6 +1287,7 @@ enum netdev_priv_flags {
 	IFF_L3MDEV_MASTER		= 1<<20,
 	IFF_NO_QUEUE			= 1<<21,
 	IFF_OPENVSWITCH			= 1<<22,
+	IFF_L3MDEV_SLAVE		= 1<<23,
 };
 
 #define IFF_802_1Q_VLAN			IFF_802_1Q_VLAN
@@ -3830,6 +3832,11 @@ static inline bool netif_is_l3_master(const struct net_device *dev)
 	return dev->priv_flags & IFF_L3MDEV_MASTER;
 }
 
+static inline bool netif_is_l3_slave(const struct net_device *dev)
+{
+	return dev->priv_flags & IFF_L3MDEV_SLAVE;
+}
+
 static inline bool netif_is_bridge_master(const struct net_device *dev)
 {
 	return dev->priv_flags & IFF_EBRIDGE;
diff --git a/net/l3mdev/l3mdev.c b/net/l3mdev/l3mdev.c
index ddf75ad41713..8e5ead366e7f 100644
--- a/net/l3mdev/l3mdev.c
+++ b/net/l3mdev/l3mdev.c
@@ -26,11 +26,11 @@ int l3mdev_master_ifindex_rcu(struct net_device *dev)
 
 	if (netif_is_l3_master(dev)) {
 		ifindex = dev->ifindex;
-	} else if (dev->flags & IFF_SLAVE) {
+	} else if (netif_is_l3_slave(dev)) {
 		struct net_device *master;
 
 		master = netdev_master_upper_dev_get_rcu(dev);
-		if (master && netif_is_l3_master(master))
+		if (master)
 			ifindex = master->ifindex;
 	}
 
@@ -54,7 +54,7 @@ u32 l3mdev_fib_table_rcu(const struct net_device *dev)
 	if (netif_is_l3_master(dev)) {
 		if (dev->l3mdev_ops->l3mdev_fib_table)
 			tb_id = dev->l3mdev_ops->l3mdev_fib_table(dev);
-	} else if (dev->flags & IFF_SLAVE) {
+	} else if (netif_is_l3_slave(dev)) {
 		/* Users of netdev_master_upper_dev_get_rcu need non-const,
 		 * but current inet_*type functions take a const
 		 */
@@ -62,7 +62,7 @@ u32 l3mdev_fib_table_rcu(const struct net_device *dev)
 		const struct net_device *master;
 
 		master = netdev_master_upper_dev_get_rcu(_dev);
-		if (master && netif_is_l3_master(master) &&
+		if (master &&
 		    master->l3mdev_ops->l3mdev_fib_table)
 			tb_id = master->l3mdev_ops->l3mdev_fib_table(master);
 	}
-- 
1.9.1


* [PATCH net-next 3/5] net: Refactor path selection in __ip_route_output_key_hash
  2015-10-05 15:51 [PATCH net-next 0/5 v2] net: Add saddr op to l3mdev and vrf David Ahern
  2015-10-05 15:51 ` [PATCH net-next 1/5] net: Rename FLOWI_FLAG_VRFSRC to FLOWI_FLAG_L3MDEV_SRC David Ahern
  2015-10-05 15:51 ` [PATCH net-next 2/5] net: Add netif_is_l3_slave David Ahern
@ 2015-10-05 15:51 ` David Ahern
  2015-10-05 15:51 ` [PATCH net-next 4/5] net: Add source address lookup op for VRF David Ahern
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: David Ahern @ 2015-10-05 15:51 UTC (permalink / raw)
  To: netdev; +Cc: dsahern, David Ahern

The VRF device needs the same path selection after the FIB lookup in
order to set the source address. Rather than duplicating the code, move
the existing code into a function that is exported to modules.

Code move only; no functional change.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 include/net/ip_fib.h     |  2 ++
 net/ipv4/fib_semantics.c | 21 +++++++++++++++++++++
 net/ipv4/route.c         | 16 +---------------
 3 files changed, 24 insertions(+), 15 deletions(-)
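
The decision order in the moved code can be sketched in userspace as below.
This is a simplified model under stated assumptions, not the kernel function:
multipath support is assumed to be compiled in, the model_* types are
hypothetical, and the preferred source is a single field rather than being
re-derived from the next hop that was selected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum model_choice { MODEL_KEEP, MODEL_MULTIPATH, MODEL_DEFAULT };

struct model_result {
	int nhs;           /* models res->fi->fib_nhs */
	int prefixlen;     /* models res->prefixlen */
	int num_default;   /* models res->table->tb_num_default */
	int is_unicast;    /* models res->type == RTN_UNICAST */
	uint32_t prefsrc;  /* models FIB_RES_PREFSRC(net, *res) */
};

struct model_flow {
	int oif;           /* models fl4->flowi4_oif */
	uint32_t saddr;    /* models fl4->saddr, 0 means unset */
};

/* Reports which selection step would run, then fills in the source
 * address only if the caller left it unset. */
static inline enum model_choice
model_select_path(const struct model_result *res, struct model_flow *fl)
{
	enum model_choice choice = MODEL_KEEP;

	assert(res != NULL && fl != NULL);

	if (res->nhs > 1 && fl->oif == 0)
		choice = MODEL_MULTIPATH;
	else if (res->prefixlen == 0 && res->num_default > 1 &&
		 res->is_unicast && fl->oif == 0)
		choice = MODEL_DEFAULT;

	if (fl->saddr == 0)
		fl->saddr = res->prefsrc;

	return choice;
}
```

Both selection steps are skipped when an output interface is already set,
but the source address fallback still applies in that case.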

diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 7a51fd8d99e4..ac5c6e80586a 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -329,6 +329,8 @@ static inline int fib_multipath_hash(__be32 saddr, __be32 daddr)
 }
 
 void fib_select_multipath(struct fib_result *res, int hash);
+void fib_select_path(struct net *net, struct fib_result *res,
+		     struct flowi4 *fl4, int mp_hash);
 
 /* Exported by fib_trie.c */
 void fib_trie_init(void);
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 0c49d2f3bbc0..caf994c66d59 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -1557,3 +1557,24 @@ void fib_select_multipath(struct fib_result *res, int hash)
 	res->nh_sel = 0;
 }
 #endif
+
+void fib_select_path(struct net *net, struct fib_result *res,
+		     struct flowi4 *fl4, int mp_hash)
+{
+#ifdef CONFIG_IP_ROUTE_MULTIPATH
+	if (res->fi->fib_nhs > 1 && fl4->flowi4_oif == 0) {
+		if (mp_hash < 0)
+			mp_hash = fib_multipath_hash(fl4->saddr, fl4->daddr);
+		fib_select_multipath(res, mp_hash);
+	}
+	else
+#endif
+	if (!res->prefixlen &&
+	    res->table->tb_num_default > 1 &&
+	    res->type == RTN_UNICAST && !fl4->flowi4_oif)
+		fib_select_default(fl4, res);
+
+	if (!fl4->saddr)
+		fl4->saddr = FIB_RES_PREFSRC(net, *res);
+}
+EXPORT_SYMBOL_GPL(fib_select_path);
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 54297d3a0559..54e6f456a760 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2238,21 +2238,7 @@ struct rtable *__ip_route_output_key_hash(struct net *net, struct flowi4 *fl4,
 		goto make_route;
 	}
 
-#ifdef CONFIG_IP_ROUTE_MULTIPATH
-	if (res.fi->fib_nhs > 1 && fl4->flowi4_oif == 0) {
-		if (mp_hash < 0)
-			mp_hash = fib_multipath_hash(fl4->saddr, fl4->daddr);
-		fib_select_multipath(&res, mp_hash);
-	}
-	else
-#endif
-	if (!res.prefixlen &&
-	    res.table->tb_num_default > 1 &&
-	    res.type == RTN_UNICAST && !fl4->flowi4_oif)
-		fib_select_default(fl4, &res);
-
-	if (!fl4->saddr)
-		fl4->saddr = FIB_RES_PREFSRC(net, res);
+	fib_select_path(net, &res, fl4, mp_hash);
 
 	dev_out = FIB_RES_DEV(res);
 	fl4->flowi4_oif = dev_out->ifindex;
-- 
1.9.1


* [PATCH net-next 4/5] net: Add source address lookup op for VRF
  2015-10-05 15:51 [PATCH net-next 0/5 v2] net: Add saddr op to l3mdev and vrf David Ahern
                   ` (2 preceding siblings ...)
  2015-10-05 15:51 ` [PATCH net-next 3/5] net: Refactor path selection in __ip_route_output_key_hash David Ahern
@ 2015-10-05 15:51 ` David Ahern
  2015-10-05 15:51 ` [PATCH net-next 5/5] net: Add l3mdev saddr lookup to raw_sendmsg David Ahern
  2015-10-07 11:28 ` [PATCH net-next 0/5 v2] net: Add saddr op to l3mdev and vrf David Miller
  5 siblings, 0 replies; 8+ messages in thread
From: David Ahern @ 2015-10-05 15:51 UTC (permalink / raw)
  To: netdev; +Cc: dsahern, David Ahern

Add an operation to l3mdev to look up the source address for a given
flow. Add support for the operation to the VRF driver and convert the
existing IPv4 hooks to use the new lookup.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 drivers/net/vrf.c    | 35 +++++++++++++++++++++++++++++++++++
 include/net/l3mdev.h | 27 +++++++++++++++++++++++++++
 include/net/route.h  |  7 ++++---
 net/ipv4/udp.c       | 22 +++-------------------
 4 files changed, 69 insertions(+), 22 deletions(-)
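
The dispatch pattern behind the new op can be sketched in userspace as
follows. This is an illustration under assumptions, not the kernel API: the
model_* names are hypothetical, device lookup is a linear scan over an array
instead of dev_get_by_index_rcu(), there is no RCU locking, and the model
adds a NULL check on the ops pointer that the patch does not have.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_flow {
	uint32_t daddr;
	uint32_t saddr;   /* 0 means unset */
};

struct model_dev;

struct model_l3mdev_ops {
	/* Optional op: may be NULL if the driver does not implement it. */
	void (*get_saddr)(struct model_dev *dev, struct model_flow *fl);
};

struct model_dev {
	int ifindex;
	bool is_l3_master;
	uint32_t addr;    /* hypothetical: the address this model hands out */
	const struct model_l3mdev_ops *ops;
};

/* Stand-in for a driver callback; like the patch, it does nothing
 * when the destination is empty. */
static inline void model_vrf_get_saddr(struct model_dev *dev,
				       struct model_flow *fl)
{
	if (fl->daddr == 0)
		return;
	fl->saddr = dev->addr;
}

static inline struct model_dev *
model_dev_by_index(struct model_dev *devs, size_t n, int ifindex)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (devs[i].ifindex == ifindex)
			return &devs[i];
	return NULL;
}

/* Calls the op only for an L3 master device that implements it. */
static inline void model_get_saddr(struct model_dev *devs, size_t n,
				   int ifindex, struct model_flow *fl)
{
	struct model_dev *dev;

	assert(fl != NULL);
	if (!ifindex)
		return;

	dev = model_dev_by_index(devs, n, ifindex);
	if (dev && dev->is_l3_master && dev->ops && dev->ops->get_saddr)
		dev->ops->get_saddr(dev, fl);
}
```

An ifindex that is zero, unknown, or not an L3 master leaves the flow
untouched, which is what lets callers invoke the helper unconditionally.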

diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index 8713317eed86..64499766e00f 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -36,6 +36,9 @@
 #include <net/addrconf.h>
 #include <net/l3mdev.h>
 
+#define RT_FL_TOS(oldflp4) \
+	((oldflp4)->flowi4_tos & (IPTOS_RT_MASK | RTO_ONLINK))
+
 #define DRV_NAME	"vrf"
 #define DRV_VERSION	"1.0"
 
@@ -553,9 +556,41 @@ static struct rtable *vrf_get_rtable(const struct net_device *dev,
 	return rth;
 }
 
+/* called under rcu_read_lock */
+static void vrf_get_saddr(struct net_device *dev, struct flowi4 *fl4)
+{
+	struct fib_result res = { .tclassid = 0 };
+	struct net *net = dev_net(dev);
+	u32 orig_tos = fl4->flowi4_tos;
+	u8 flags = fl4->flowi4_flags;
+	u8 scope = fl4->flowi4_scope;
+	u8 tos = RT_FL_TOS(fl4);
+
+	if (unlikely(!fl4->daddr))
+		return;
+
+	fl4->flowi4_flags |= FLOWI_FLAG_SKIP_NH_OIF;
+	fl4->flowi4_iif = LOOPBACK_IFINDEX;
+	fl4->flowi4_tos = tos & IPTOS_RT_MASK;
+	fl4->flowi4_scope = ((tos & RTO_ONLINK) ?
+			     RT_SCOPE_LINK : RT_SCOPE_UNIVERSE);
+
+	if (!fib_lookup(net, fl4, &res, 0)) {
+		if (res.type == RTN_LOCAL)
+			fl4->saddr = res.fi->fib_prefsrc ? : fl4->daddr;
+		else
+			fib_select_path(net, &res, fl4, -1);
+	}
+
+	fl4->flowi4_flags = flags;
+	fl4->flowi4_tos = orig_tos;
+	fl4->flowi4_scope = scope;
+}
+
 static const struct l3mdev_ops vrf_l3mdev_ops = {
 	.l3mdev_fib_table	= vrf_fib_table,
 	.l3mdev_get_rtable	= vrf_get_rtable,
+	.l3mdev_get_saddr	= vrf_get_saddr,
 };
 
 static void vrf_get_drvinfo(struct net_device *dev,
diff --git a/include/net/l3mdev.h b/include/net/l3mdev.h
index 87cee05a0a17..44a19a171104 100644
--- a/include/net/l3mdev.h
+++ b/include/net/l3mdev.h
@@ -17,12 +17,16 @@
  * @l3mdev_fib_table: Get FIB table id to use for lookups
  *
  * @l3mdev_get_rtable: Get cached IPv4 rtable (dst_entry) for device
+ *
+ * @l3mdev_get_saddr: Get source address for a flow
  */
 
 struct l3mdev_ops {
 	u32		(*l3mdev_fib_table)(const struct net_device *dev);
 	struct rtable *	(*l3mdev_get_rtable)(const struct net_device *dev,
 					     const struct flowi4 *fl4);
+	void		(*l3mdev_get_saddr)(struct net_device *dev,
+					    struct flowi4 *fl4);
 };
 
 #ifdef CONFIG_NET_L3_MASTER_DEV
@@ -100,6 +104,25 @@ static inline bool netif_index_is_l3_master(struct net *net, int ifindex)
 	return rc;
 }
 
+static inline void l3mdev_get_saddr(struct net *net, int ifindex,
+				    struct flowi4 *fl4)
+{
+	struct net_device *dev;
+
+	if (ifindex) {
+
+		rcu_read_lock();
+
+		dev = dev_get_by_index_rcu(net, ifindex);
+		if (dev && netif_is_l3_master(dev) &&
+		    dev->l3mdev_ops->l3mdev_get_saddr) {
+			dev->l3mdev_ops->l3mdev_get_saddr(dev, fl4);
+		}
+
+		rcu_read_unlock();
+	}
+}
+
 #else
 
 static inline int l3mdev_master_ifindex_rcu(struct net_device *dev)
@@ -144,6 +167,10 @@ static inline bool netif_index_is_l3_master(struct net *net, int ifindex)
 	return false;
 }
 
+static inline void l3mdev_get_saddr(struct net *net, int ifindex,
+				    struct flowi4 *fl4)
+{
+}
 #endif
 
 #endif /* _NET_L3MDEV_H_ */
diff --git a/include/net/route.h b/include/net/route.h
index 3e18d90b3f4e..ee81307863d5 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -266,9 +266,6 @@ static inline void ip_route_connect_init(struct flowi4 *fl4, __be32 dst, __be32
 	if (inet_sk(sk)->transparent)
 		flow_flags |= FLOWI_FLAG_ANYSRC;
 
-	if (netif_index_is_l3_master(sock_net(sk), oif))
-		flow_flags |= FLOWI_FLAG_L3MDEV_SRC | FLOWI_FLAG_SKIP_NH_OIF;
-
 	flowi4_init_output(fl4, oif, sk->sk_mark, tos, RT_SCOPE_UNIVERSE,
 			   protocol, flow_flags, dst, src, dport, sport);
 }
@@ -285,6 +282,10 @@ static inline struct rtable *ip_route_connect(struct flowi4 *fl4,
 	ip_route_connect_init(fl4, dst, src, tos, oif, protocol,
 			      sport, dport, sk);
 
+	if (!src && oif) {
+		l3mdev_get_saddr(net, oif, fl4);
+		src = fl4->saddr;
+	}
 	if (!dst || !src) {
 		rt = __ip_route_output_key(net, fl4);
 		if (IS_ERR(rt))
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index b2882cfd3136..e1fc129099ea 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1017,30 +1017,14 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 
 		fl4 = &fl4_stack;
 
-		/* unconnected socket. If output device is enslaved to a VRF
-		 * device lookup source address from VRF table. This mimics
-		 * behavior of ip_route_connect{_init}.
-		 */
-		if (netif_index_is_l3_master(net, ipc.oif)) {
-			flowi4_init_output(fl4, ipc.oif, sk->sk_mark, tos,
-					   RT_SCOPE_UNIVERSE, sk->sk_protocol,
-					   (flow_flags | FLOWI_FLAG_L3MDEV_SRC |
-					    FLOWI_FLAG_SKIP_NH_OIF),
-					   faddr, saddr, dport,
-					   inet->inet_sport);
-
-			rt = ip_route_output_flow(net, fl4, sk);
-			if (!IS_ERR(rt)) {
-				saddr = fl4->saddr;
-				ip_rt_put(rt);
-			}
-		}
-
 		flowi4_init_output(fl4, ipc.oif, sk->sk_mark, tos,
 				   RT_SCOPE_UNIVERSE, sk->sk_protocol,
 				   flow_flags,
 				   faddr, saddr, dport, inet->inet_sport);
 
+		if (!saddr && ipc.oif)
+			l3mdev_get_saddr(net, ipc.oif, fl4);
+
 		security_sk_classify_flow(sk, flowi4_to_flowi(fl4));
 		rt = ip_route_output_flow(net, fl4, sk);
 		if (IS_ERR(rt)) {
-- 
1.9.1


* [PATCH net-next 5/5] net: Add l3mdev saddr lookup to raw_sendmsg
  2015-10-05 15:51 [PATCH net-next 0/5 v2] net: Add saddr op to l3mdev and vrf David Ahern
                   ` (3 preceding siblings ...)
  2015-10-05 15:51 ` [PATCH net-next 4/5] net: Add source address lookup op for VRF David Ahern
@ 2015-10-05 15:51 ` David Ahern
  2015-10-07 11:28 ` [PATCH net-next 0/5 v2] net: Add saddr op to l3mdev and vrf David Miller
  5 siblings, 0 replies; 8+ messages in thread
From: David Ahern @ 2015-10-05 15:51 UTC (permalink / raw)
  To: netdev; +Cc: dsahern, David Ahern

A ping originated on the box through a VRF device shows up in tcpdump
without a source address:
    $ tcpdump -n -i vrf-blue
    08:58:33.311303 IP 0.0.0.0 > 10.2.2.254: ICMP echo request, id 2834, seq 1, length 64
    08:58:33.311562 IP 10.2.2.254 > 10.2.2.2: ICMP echo reply, id 2834, seq 1, length 64

Add the call to l3mdev_get_saddr to raw_sendmsg.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 net/ipv4/raw.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)
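
The caller-side guard used here can be modelled in userspace roughly as
below. It is only a sketch: model_init_flow and the lookup hook are
hypothetical names, and the hook stands in for l3mdev_get_saddr() without
any of its device checks. The addresses are the ones from the tcpdump
output above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct model_flow {
	int oif;
	uint32_t daddr;
	uint32_t saddr;   /* 0 means unset */
};

/* Hypothetical lookup hook standing in for l3mdev_get_saddr(). */
typedef void (*model_saddr_lookup)(int oif, struct model_flow *fl);

/* Initialise the flow from the caller's values first, then consult the
 * lookup only when no source was given and the socket is bound to a
 * device. An explicit source address is never overwritten. */
static inline void model_init_flow(struct model_flow *fl, int oif,
				   uint32_t daddr, uint32_t saddr,
				   model_saddr_lookup lookup)
{
	assert(fl != NULL && lookup != NULL);

	fl->oif = oif;
	fl->daddr = daddr;
	fl->saddr = saddr;

	if (!saddr && oif)
		lookup(oif, fl);
}

/* Example hook: pretends every device owns 10.2.2.2. */
static inline void model_lookup_fixed(int oif, struct model_flow *fl)
{
	(void)oif;
	fl->saddr = 0x0a020202u;
}
```

With the guard in place, the 0.0.0.0 source in the capture above would be
replaced before the route lookup, while unbound sockets keep the existing
behaviour.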

diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 28ef8a913130..09a07e8b2f35 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -484,6 +484,7 @@ static int raw_getfrag(void *from, char *to, int offset, int len, int odd,
 static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 {
 	struct inet_sock *inet = inet_sk(sk);
+	struct net *net = sock_net(sk);
 	struct ipcm_cookie ipc;
 	struct rtable *rt = NULL;
 	struct flowi4 fl4;
@@ -543,7 +544,7 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	ipc.oif = sk->sk_bound_dev_if;
 
 	if (msg->msg_controllen) {
-		err = ip_cmsg_send(sock_net(sk), msg, &ipc, false);
+		err = ip_cmsg_send(net, msg, &ipc, false);
 		if (err)
 			goto out;
 		if (ipc.opt)
@@ -598,6 +599,9 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 			    (inet->hdrincl ? FLOWI_FLAG_KNOWN_NH : 0),
 			   daddr, saddr, 0, 0);
 
+	if (!saddr && ipc.oif)
+		l3mdev_get_saddr(net, ipc.oif, &fl4);
+
 	if (!inet->hdrincl) {
 		rfv.msg = msg;
 		rfv.hlen = 0;
@@ -608,7 +612,7 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	}
 
 	security_sk_classify_flow(sk, flowi4_to_flowi(&fl4));
-	rt = ip_route_output_flow(sock_net(sk), &fl4, sk);
+	rt = ip_route_output_flow(net, &fl4, sk);
 	if (IS_ERR(rt)) {
 		err = PTR_ERR(rt);
 		rt = NULL;
-- 
1.9.1


* Re: [PATCH net-next 0/5 v2] net: Add saddr op to l3mdev and vrf
  2015-10-05 15:51 [PATCH net-next 0/5 v2] net: Add saddr op to l3mdev and vrf David Ahern
                   ` (4 preceding siblings ...)
  2015-10-05 15:51 ` [PATCH net-next 5/5] net: Add l3mdev saddr lookup to raw_sendmsg David Ahern
@ 2015-10-07 11:28 ` David Miller
  5 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2015-10-07 11:28 UTC (permalink / raw)
  To: dsa; +Cc: netdev, dsahern

From: David Ahern <dsa@cumulusnetworks.com>
Date: Mon,  5 Oct 2015 08:51:22 -0700

> First 2 patches are re-sends of patches that got lost in the ethosphere
> Tuesday; they were part of the first round of l3mdev conversions.
> Next 3 handle the source address lookup for raw and datagram sockets
> bound to a VRF device.
> 
> The conversion to the get_saddr op also fixes locally originated TCP
> packets showing up at the VRF device. The use of the FLOWI_FLAG_L3MDEV_SRC
> flag in ip_route_connect_init was causing locally generated packets
> to skip the VRF device.
> 
> v2
> - rebased to top of net-next per device delete fix and hash based
>   multipath patches

Series applied, thanks.

