Netdev List
* [PATCH net-next v3 1/4] bpf, mlx5: fix mlx5e_create_rq taking reference on prog
From: Daniel Borkmann @ 2016-11-19  0:45 UTC
  To: davem
  Cc: alexei.starovoitov, bblanco, zhiyisun, ranas, saeedm, netdev,
	Daniel Borkmann
In-Reply-To: <cover.1479514784.git.daniel@iogearbox.net>

In mlx5e_create_rq(), when creating a new queue, we call bpf_prog_add()
without checking the return value. bpf_prog_add() can fail since 92117d8443bc
("bpf: fix refcnt overflow"), so we really must check it. Take the reference
right when we assign it to the rq from priv->xdp_prog, and just drop the
reference on the error path. Destruction in mlx5e_destroy_rq() looks good,
though.
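
The pattern can be sketched in plain userspace C. This is only an
illustration under stated assumptions, not the driver code: the toy_* names
and TOY_MAX_REFS are made up, and failure is signalled with NULL instead of
the kernel's ERR_PTR() convention.

```c
#include <stddef.h>

#define TOY_MAX_REFS 4	/* hypothetical stand-in for the overflow limit */

struct toy_prog {
	int refcnt;
};

struct toy_rq {
	struct toy_prog *prog;
};

/* Take one reference; return NULL if the counter would overflow. */
static struct toy_prog *toy_prog_inc(struct toy_prog *prog)
{
	if (prog->refcnt >= TOY_MAX_REFS)
		return NULL;
	prog->refcnt++;
	return prog;
}

/* Drop one reference. */
static void toy_prog_put(struct toy_prog *prog)
{
	prog->refcnt--;
}

/* Take the reference at assignment time, check it, and drop it again
 * if a later setup step fails.
 */
static int toy_create_rq(struct toy_rq *rq, struct toy_prog *prog,
			 int later_step_fails)
{
	rq->prog = NULL;
	if (prog) {
		rq->prog = toy_prog_inc(prog);
		if (!rq->prog)
			return -1;	/* could not take the reference */
	}

	if (later_step_fails)
		goto err_put;

	return 0;

err_put:
	if (rq->prog) {
		toy_prog_put(rq->prog);
		rq->prog = NULL;
	}
	return -1;
}
```

In this sketch a failed toy_create_rq() leaves the counter exactly where it
was before the call, which is the property the fix is after.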

Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 13 +++++++++----
 kernel/bpf/syscall.c                              |  1 +
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index bd0732d..54bae79 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -513,7 +513,13 @@ static int mlx5e_create_rq(struct mlx5e_channel *c,
 	rq->channel = c;
 	rq->ix      = c->ix;
 	rq->priv    = c->priv;
-	rq->xdp_prog = priv->xdp_prog;
+
+	rq->xdp_prog = priv->xdp_prog ? bpf_prog_inc(priv->xdp_prog) : NULL;
+	if (IS_ERR(rq->xdp_prog)) {
+		err = PTR_ERR(rq->xdp_prog);
+		rq->xdp_prog = NULL;
+		goto err_rq_wq_destroy;
+	}
 
 	rq->buff.map_dir = DMA_FROM_DEVICE;
 	if (rq->xdp_prog)
@@ -590,12 +596,11 @@ static int mlx5e_create_rq(struct mlx5e_channel *c,
 	rq->page_cache.head = 0;
 	rq->page_cache.tail = 0;
 
-	if (rq->xdp_prog)
-		bpf_prog_add(rq->xdp_prog, 1);
-
 	return 0;
 
 err_rq_wq_destroy:
+	if (rq->xdp_prog)
+		bpf_prog_put(rq->xdp_prog);
 	mlx5_wq_destroy(&rq->wq_ctrl);
 
 	return err;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index ce1b7de..eb15498 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -696,6 +696,7 @@ struct bpf_prog *bpf_prog_inc(struct bpf_prog *prog)
 {
 	return bpf_prog_add(prog, 1);
 }
+EXPORT_SYMBOL_GPL(bpf_prog_inc);
 
 static struct bpf_prog *__bpf_prog_get(u32 ufd, enum bpf_prog_type *type)
 {
-- 
1.9.3

* [PATCH net-next v3 3/4] bpf, mlx5: drop priv->xdp_prog reference on netdev cleanup
From: Daniel Borkmann @ 2016-11-19  0:45 UTC
  To: davem
  Cc: alexei.starovoitov, bblanco, zhiyisun, ranas, saeedm, netdev,
	Daniel Borkmann
In-Reply-To: <cover.1479514784.git.daniel@iogearbox.net>

mlx5e_xdp_set() is currently the only place where we drop the reference on
the prog sitting in priv->xdp_prog, namely when it's exchanged for a new one.
We also need to make sure that we eventually release that reference, for
example, in case the netdev is dismantled, otherwise we leak the program.

Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 491cff9..6957608 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3705,6 +3705,9 @@ static void mlx5e_nic_cleanup(struct mlx5e_priv *priv)
 
 	if (MLX5_CAP_GEN(mdev, vport_group_manager))
 		mlx5_eswitch_unregister_vport_rep(esw, 0);
+
+	if (priv->xdp_prog)
+		bpf_prog_put(priv->xdp_prog);
 }
 
 static int mlx5e_init_nic_rx(struct mlx5e_priv *priv)
-- 
1.9.3

* [PATCH net-next v3 0/4] Couple of BPF refcount fixes for mlx5
From: Daniel Borkmann @ 2016-11-19  0:44 UTC
  To: davem
  Cc: alexei.starovoitov, bblanco, zhiyisun, ranas, saeedm, netdev,
	Daniel Borkmann

Various mlx5 bugs on eBPF refcount handling found during review.
Last patch in series adds a __must_check to BPF helpers to make
sure we won't run into it again w/o compiler complaining first.

v2 -> v3:

 - Just reworked patch 2/4 so we don't need bpf_prog_sub().
 - Rebased, rest as is.

v1 -> v2:

 - After discussion with Alexei, we agreed upon rebasing the
   patches against net-next.
 - Since net-next, I've also added the __must_check to enforce
   future users to check for errors.
 - Fixed up commit message #2.
 - Simplify assignment from patch #1 based on Saeed's feedback
   on previous set.

Thanks a lot!

Daniel Borkmann (4):
  bpf, mlx5: fix mlx5e_create_rq taking reference on prog
  bpf, mlx5: fix various refcount issues in mlx5e_xdp_set
  bpf, mlx5: drop priv->xdp_prog reference on netdev cleanup
  bpf: add __must_check attributes to refcount manipulating helpers

 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 33 +++++++++++++++++------
 include/linux/bpf.h                               | 12 +++++----
 kernel/bpf/syscall.c                              |  1 +
 3 files changed, 33 insertions(+), 13 deletions(-)

-- 
1.9.3

* [PATCH net-next v3 2/4] bpf, mlx5: fix various refcount issues in mlx5e_xdp_set
From: Daniel Borkmann @ 2016-11-19  0:45 UTC
  To: davem
  Cc: alexei.starovoitov, bblanco, zhiyisun, ranas, saeedm, netdev,
	Daniel Borkmann
In-Reply-To: <cover.1479514784.git.daniel@iogearbox.net>

There are multiple issues in mlx5e_xdp_set():

1) The batched bpf_prog_add() is currently not checked for errors. When
   doing so, it should be done at an earlier point in time to make sure
   that we cannot fail anymore at the time we want to set the program for
   each channel. The batched refs short-cut can only be performed when we
   don't need to perform a reset for changing the rq type and the device
   was in opened state. In case the device was not in opened state, then
   the next mlx5e_open_locked() will acquire the refs from the control prog
   via mlx5e_create_rq(), same when we need to perform a reset.

2) When swapping the priv->xdp_prog, no extra reference count must be
   taken since we got that from the call path via dev_change_xdp_fd()
   already. Otherwise, we'd never be able to release the program. Also, the
   bpf_prog_add() there could fail, and its return code was not checked.
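
Both points can be modelled with a small userspace sketch. It is a
simplified illustration, not the mlx5 code: the demo_* names and
DEMO_MAX_REFS are hypothetical, locking and the reset path are left out,
and errors are plain -1 instead of ERR_PTR() values.

```c
#include <stddef.h>

#define DEMO_MAX_REFS 32768	/* hypothetical overflow limit */

struct demo_prog {
	int refcnt;
};

/* Add n references at once; fail without side effects on overflow. */
static int demo_prog_add(struct demo_prog *prog, int n)
{
	if (prog->refcnt + n > DEMO_MAX_REFS)
		return -1;
	prog->refcnt += n;
	return 0;
}

/* Drop one reference. */
static void demo_prog_put(struct demo_prog *prog)
{
	prog->refcnt--;
}

/* Install prog as the control program and on every channel.
 * The caller passes in one reference, which the control slot consumes.
 */
static int demo_xdp_set(struct demo_prog **ctrl, struct demo_prog **chan,
			int num_channels, struct demo_prog *prog)
{
	struct demo_prog *old;
	int i;

	/* 1) The only step that can fail runs before any state changes. */
	if (prog && demo_prog_add(prog, num_channels))
		return -1;

	/* 2) Exchange the control program, no extra reference taken. */
	old = *ctrl;
	*ctrl = prog;
	if (old)
		demo_prog_put(old);

	/* Per-channel swap, covered by the batched references above. */
	for (i = 0; i < num_channels; i++) {
		old = chan[i];
		chan[i] = prog;
		if (old)
			demo_prog_put(old);
	}
	return 0;
}
```

If demo_prog_add() fails, nothing has been swapped yet and the caller still
owns the reference it passed in, so it can simply drop it.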

Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 54bae79..491cff9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3144,11 +3144,21 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog)
 
 	if (was_opened && reset)
 		mlx5e_close_locked(netdev);
+	if (was_opened && !reset) {
+		/* num_channels is invariant here, so we can take the
+		 * batched reference right upfront.
+		 */
+		prog = bpf_prog_add(prog, priv->params.num_channels);
+		if (IS_ERR(prog)) {
+			err = PTR_ERR(prog);
+			goto unlock;
+		}
+	}
 
-	/* exchange programs */
+	/* exchange programs, extra prog reference we got from caller
+	 * as long as we don't fail from this point onwards.
+	 */
 	old_prog = xchg(&priv->xdp_prog, prog);
-	if (prog)
-		bpf_prog_add(prog, 1);
 	if (old_prog)
 		bpf_prog_put(old_prog);
 
@@ -3164,7 +3174,6 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog)
 	/* exchanging programs w/o reset, we update ref counts on behalf
 	 * of the channels RQs here.
 	 */
-	bpf_prog_add(prog, priv->params.num_channels);
 	for (i = 0; i < priv->params.num_channels; i++) {
 		struct mlx5e_channel *c = priv->channel[i];
 
-- 
1.9.3

* [PATCH net-next v3 4/4] bpf: add __must_check attributes to refcount manipulating helpers
From: Daniel Borkmann @ 2016-11-19  0:45 UTC
  To: davem
  Cc: alexei.starovoitov, bblanco, zhiyisun, ranas, saeedm, netdev,
	Daniel Borkmann
In-Reply-To: <cover.1479514784.git.daniel@iogearbox.net>

Helpers like bpf_prog_add(), bpf_prog_inc(), bpf_map_inc() can fail
with an error, so make sure the caller properly checks their return
value and not just ignores it, which could worst-case lead to use
after free.
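
Outside the kernel, a similar effect can be had with the GCC/Clang
warn_unused_result attribute. The sketch below assumes one of those
compilers; COUNTED_MUST_CHECK, struct counted and counted_inc() are
hypothetical names, and failure is reported as NULL for brevity.

```c
#include <stddef.h>

/* Assumption: GCC or Clang. Other compilers get an empty macro. */
#if defined(__GNUC__) || defined(__clang__)
#define COUNTED_MUST_CHECK __attribute__((warn_unused_result))
#else
#define COUNTED_MUST_CHECK
#endif

#define COUNTED_MAX 8	/* hypothetical overflow limit */

struct counted {
	int refcnt;
};

/* Take one reference; callers that discard the result get a warning. */
COUNTED_MUST_CHECK
static struct counted *counted_inc(struct counted *obj)
{
	if (obj->refcnt >= COUNTED_MAX)
		return NULL;
	obj->refcnt++;
	return obj;
}
```

With this in place, a bare "counted_inc(obj);" statement is expected to
produce an unused-result warning, while assigning and testing the return
value compiles cleanly.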

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpf.h | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 01c1487..69d0a7f 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -233,14 +233,14 @@ u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
 
 struct bpf_prog *bpf_prog_get(u32 ufd);
 struct bpf_prog *bpf_prog_get_type(u32 ufd, enum bpf_prog_type type);
-struct bpf_prog *bpf_prog_add(struct bpf_prog *prog, int i);
+struct bpf_prog * __must_check bpf_prog_add(struct bpf_prog *prog, int i);
 void bpf_prog_sub(struct bpf_prog *prog, int i);
-struct bpf_prog *bpf_prog_inc(struct bpf_prog *prog);
+struct bpf_prog * __must_check bpf_prog_inc(struct bpf_prog *prog);
 void bpf_prog_put(struct bpf_prog *prog);
 
 struct bpf_map *bpf_map_get_with_uref(u32 ufd);
 struct bpf_map *__bpf_map_get(struct fd f);
-struct bpf_map *bpf_map_inc(struct bpf_map *map, bool uref);
+struct bpf_map * __must_check bpf_map_inc(struct bpf_map *map, bool uref);
 void bpf_map_put_with_uref(struct bpf_map *map);
 void bpf_map_put(struct bpf_map *map);
 int bpf_map_precharge_memlock(u32 pages);
@@ -299,7 +299,8 @@ static inline struct bpf_prog *bpf_prog_get_type(u32 ufd,
 {
 	return ERR_PTR(-EOPNOTSUPP);
 }
-static inline struct bpf_prog *bpf_prog_add(struct bpf_prog *prog, int i)
+static inline struct bpf_prog * __must_check bpf_prog_add(struct bpf_prog *prog,
+							  int i)
 {
 	return ERR_PTR(-EOPNOTSUPP);
 }
@@ -311,7 +312,8 @@ static inline void bpf_prog_sub(struct bpf_prog *prog, int i)
 static inline void bpf_prog_put(struct bpf_prog *prog)
 {
 }
-static inline struct bpf_prog *bpf_prog_inc(struct bpf_prog *prog)
+
+static inline struct bpf_prog * __must_check bpf_prog_inc(struct bpf_prog *prog)
 {
 	return ERR_PTR(-EOPNOTSUPP);
 }
-- 
1.9.3

* Re: [PATCH 3/5] virtio_net: Add XDP support
From: John Fastabend @ 2016-11-19  1:02 UTC
  To: Eric Dumazet
  Cc: tgraf, shm, alexei.starovoitov, daniel, davem, john.r.fastabend,
	netdev, bblanco, brouer
In-Reply-To: <1479511422.8455.305.camel@edumazet-glaptop3.roam.corp.google.com>

On 16-11-18 03:23 PM, Eric Dumazet wrote:
> On Fri, 2016-11-18 at 11:00 -0800, John Fastabend wrote:
>> From: Shrijeet Mukherjee <shrijeet@gmail.com>
> 
> 
>>  #include <linux/slab.h>
>> @@ -81,6 +82,8 @@ struct receive_queue {
>>  
>>  	struct napi_struct napi;
>>  
>> +	struct bpf_prog *xdp_prog;
> 
> Please add proper sparse annotation, as in 
> 
> 	struct bpf_prog __rcu *xdp_prog;
> 
> And run sparse ;)
> 
> CONFIG_SPARSE_RCU_POINTER=y
> 
> make C=2 drivers/net/virtio_net.o
> 
> 
> 
> 

Yep will do thanks! And I will fix the other comment as well.

* [PATCH net-next] udp: avoid one cache line miss in recvmsg()
From: Eric Dumazet @ 2016-11-19  1:18 UTC
  To: David Miller; +Cc: netdev

From: Eric Dumazet <edumazet@google.com>

UDP_SKB_CB(skb)->partial_cov is located at offset 66 in skb,
requiring a cold cache line to be read into the CPU cache.

We can avoid this cache line miss for UDP sockets,
as partial_cov has a meaning only for UDPLite.
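
The idea relies on C's short-circuit evaluation: the field is only
dereferenced when the cheaper tests have not already decided the result.
A minimal userspace sketch, with the hypothetical names struct demo_cb and
demo_needs_early_csum() standing in for the skb control block and the
condition in recvmsg():

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the control block that lives in a cold cache line. */
struct demo_cb {
	bool partial_cov;
};

/* cb is dereferenced only for UDP-Lite, and only if neither of the
 * two cheaper conditions is already true.
 */
static bool demo_needs_early_csum(size_t copied, size_t ulen, bool peeking,
				  bool is_udplite, const struct demo_cb *cb)
{
	return copied < ulen || peeking ||
	       (is_udplite && cb->partial_cov);
}
```

Passing a NULL cb with is_udplite set to false is safe in this sketch,
which shows that the plain UDP path never touches the cold data.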

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/udp.c |    3 ++-
 net/ipv6/udp.c |    3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index e1fc0116e8d59d8185670c6e55d1219bde55610d..b949770fdc08398a10f3974505a50b2b4f4b2cf3 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1389,7 +1389,8 @@ int udp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int noblock,
 	 * coverage checksum (UDP-Lite), do it before the copy.
 	 */
 
-	if (copied < ulen || UDP_SKB_CB(skb)->partial_cov || peeking) {
+	if (copied < ulen || peeking ||
+	    (is_udplite && UDP_SKB_CB(skb)->partial_cov)) {
 		checksum_valid = !udp_lib_checksum_complete(skb);
 		if (!checksum_valid)
 			goto csum_copy_err;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 4f99417d9b401f2a65c7828e7d6b86d1d6161794..8fd4d89380b86c8630f7fd27ce4e9958497a2b89 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -363,7 +363,8 @@ int udpv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
 	 * coverage checksum (UDP-Lite), do it before the copy.
 	 */
 
-	if (copied < ulen || UDP_SKB_CB(skb)->partial_cov || peeking) {
+	if (copied < ulen || peeking ||
+	    (is_udplite && UDP_SKB_CB(skb)->partial_cov)) {
 		checksum_valid = !udp_lib_checksum_complete(skb);
 		if (!checksum_valid)
 			goto csum_copy_err;

* [PATCH net-next v2 0/4] geneve: Use LWT more effectively.
From: Pravin B Shelar @ 2016-11-19  2:10 UTC
  To: netdev; +Cc: Pravin B Shelar

The following patch series makes use of the geneve LWT code path
for the geneve netdev type of device.
This allows us to simplify the geneve module.

v1-v2:
Fix warning reported by kbuild test robot.

Pravin B Shelar (4):
  geneve: Unify LWT and netdev handling.
  geneve: Merge ipv4 and ipv6 geneve_build_skb()
  geneve: Remove redundant socket checks.
  geneve: Optimize geneve device lookup.

 drivers/net/geneve.c | 679 +++++++++++++++++++++------------------------------
 1 file changed, 274 insertions(+), 405 deletions(-)

-- 
1.8.3.1

* [PATCH net-next v2 2/4] geneve: Merge ipv4 and ipv6 geneve_build_skb()
From: Pravin B Shelar @ 2016-11-19  2:10 UTC
  To: netdev; +Cc: Pravin B Shelar
In-Reply-To: <1479521411-53012-1-git-send-email-pshelar@ovn.org>

There is minimal difference in building the Geneve header
between ipv4 and ipv6 geneve tunnels. The following patch
refactors the code to unify it.
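
The unified helper takes the outer IP header length as a parameter, so
each caller is expected to pass the size matching its address family.
A small sketch of the headroom arithmetic, assuming the fixed header sizes
of 20 bytes for IPv4, 40 for IPv6, 8 for UDP and 8 for the Geneve base
header. The DEMO_* constants and demo_min_headroom() are hypothetical names:

```c
/* Fixed header sizes in bytes, assumed for illustration. */
enum {
	DEMO_IPV4_HLEN   = 20,
	DEMO_IPV6_HLEN   = 40,
	DEMO_UDP_HLEN    = 8,
	DEMO_GENEVE_HLEN = 8,
};

/* Headroom needed in front of the payload for one encapsulation. */
static int demo_min_headroom(int ll_reserved, int dst_header_len,
			     int options_len, int ip_hdr_len)
{
	return ll_reserved + dst_header_len +
	       DEMO_UDP_HLEN + DEMO_GENEVE_HLEN +
	       options_len + ip_hdr_len;
}
```

For the same link layer and options, the IPv6 case needs 20 bytes more
than the IPv4 case in this model.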

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
 drivers/net/geneve.c | 100 ++++++++++++++-------------------------------------
 1 file changed, 26 insertions(+), 74 deletions(-)

diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index b5e65cd..d1759aa 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -630,67 +630,34 @@ static int geneve_stop(struct net_device *dev)
 }
 
 static void geneve_build_header(struct genevehdr *geneveh,
-				__be16 tun_flags, u8 vni[3],
-				u8 options_len, u8 *options)
+				const struct ip_tunnel_info *info)
 {
 	geneveh->ver = GENEVE_VER;
-	geneveh->opt_len = options_len / 4;
-	geneveh->oam = !!(tun_flags & TUNNEL_OAM);
-	geneveh->critical = !!(tun_flags & TUNNEL_CRIT_OPT);
+	geneveh->opt_len = info->options_len / 4;
+	geneveh->oam = !!(info->key.tun_flags & TUNNEL_OAM);
+	geneveh->critical = !!(info->key.tun_flags & TUNNEL_CRIT_OPT);
 	geneveh->rsvd1 = 0;
-	memcpy(geneveh->vni, vni, 3);
+	tunnel_id_to_vni(info->key.tun_id, geneveh->vni);
 	geneveh->proto_type = htons(ETH_P_TEB);
 	geneveh->rsvd2 = 0;
 
-	memcpy(geneveh->options, options, options_len);
+	ip_tunnel_info_opts_get(geneveh->options, info);
 }
 
-static int geneve_build_skb(struct rtable *rt, struct sk_buff *skb,
-			    __be16 tun_flags, u8 vni[3], u8 opt_len, u8 *opt,
-			    bool xnet)
-{
-	bool udp_sum = !!(tun_flags & TUNNEL_CSUM);
-	struct genevehdr *gnvh;
-	int min_headroom;
-	int err;
-
-	skb_scrub_packet(skb, xnet);
-
-	min_headroom = LL_RESERVED_SPACE(rt->dst.dev) + rt->dst.header_len
-			+ GENEVE_BASE_HLEN + opt_len + sizeof(struct iphdr);
-	err = skb_cow_head(skb, min_headroom);
-	if (unlikely(err))
-		goto free_rt;
-
-	err = udp_tunnel_handle_offloads(skb, udp_sum);
-	if (err)
-		goto free_rt;
-
-	gnvh = (struct genevehdr *)__skb_push(skb, sizeof(*gnvh) + opt_len);
-	geneve_build_header(gnvh, tun_flags, vni, opt_len, opt);
-
-	skb_set_inner_protocol(skb, htons(ETH_P_TEB));
-	return 0;
-
-free_rt:
-	ip_rt_put(rt);
-	return err;
-}
-
-#if IS_ENABLED(CONFIG_IPV6)
-static int geneve6_build_skb(struct dst_entry *dst, struct sk_buff *skb,
-			     __be16 tun_flags, u8 vni[3], u8 opt_len, u8 *opt,
-			     bool xnet)
+static int geneve_build_skb(struct dst_entry *dst, struct sk_buff *skb,
+			    const struct ip_tunnel_info *info,
+			    bool xnet, int ip_hdr_len)
 {
-	bool udp_sum = !!(tun_flags & TUNNEL_CSUM);
+	bool udp_sum = !!(info->key.tun_flags & TUNNEL_CSUM);
 	struct genevehdr *gnvh;
 	int min_headroom;
 	int err;
 
+	skb_reset_mac_header(skb);
 	skb_scrub_packet(skb, xnet);
 
-	min_headroom = LL_RESERVED_SPACE(dst->dev) + dst->header_len
-			+ GENEVE_BASE_HLEN + opt_len + sizeof(struct ipv6hdr);
+	min_headroom = LL_RESERVED_SPACE(dst->dev) + dst->header_len +
+		       GENEVE_BASE_HLEN + info->options_len + ip_hdr_len;
 	err = skb_cow_head(skb, min_headroom);
 	if (unlikely(err))
 		goto free_dst;
@@ -699,9 +666,9 @@ static int geneve6_build_skb(struct dst_entry *dst, struct sk_buff *skb,
 	if (err)
 		goto free_dst;
 
-	gnvh = (struct genevehdr *)__skb_push(skb, sizeof(*gnvh) + opt_len);
-	geneve_build_header(gnvh, tun_flags, vni, opt_len, opt);
-
+	gnvh = (struct genevehdr *)__skb_push(skb, sizeof(*gnvh) +
+						   info->options_len);
+	geneve_build_header(gnvh, info);
 	skb_set_inner_protocol(skb, htons(ETH_P_TEB));
 	return 0;
 
@@ -709,12 +676,11 @@ static int geneve6_build_skb(struct dst_entry *dst, struct sk_buff *skb,
 	dst_release(dst);
 	return err;
 }
-#endif
 
 static struct rtable *geneve_get_v4_rt(struct sk_buff *skb,
 				       struct net_device *dev,
 				       struct flowi4 *fl4,
-				       struct ip_tunnel_info *info)
+				       const struct ip_tunnel_info *info)
 {
 	bool use_cache = ip_tunnel_dst_cache_usable(skb, info);
 	struct geneve_dev *geneve = netdev_priv(dev);
@@ -738,7 +704,7 @@ static struct rtable *geneve_get_v4_rt(struct sk_buff *skb,
 	}
 	fl4->flowi4_tos = RT_TOS(tos);
 
-	dst_cache = &info->dst_cache;
+	dst_cache = (struct dst_cache *)&info->dst_cache;
 	if (use_cache) {
 		rt = dst_cache_get_ip4(dst_cache, &fl4->saddr);
 		if (rt)
@@ -763,7 +729,7 @@ static struct rtable *geneve_get_v4_rt(struct sk_buff *skb,
 static struct dst_entry *geneve_get_v6_dst(struct sk_buff *skb,
 					   struct net_device *dev,
 					   struct flowi6 *fl6,
-					   struct ip_tunnel_info *info)
+					   const struct ip_tunnel_info *info)
 {
 	bool use_cache = ip_tunnel_dst_cache_usable(skb, info);
 	struct geneve_dev *geneve = netdev_priv(dev);
@@ -789,7 +755,7 @@ static struct dst_entry *geneve_get_v6_dst(struct sk_buff *skb,
 
 	fl6->flowlabel = ip6_make_flowinfo(RT_TOS(prio),
 					   info->key.label);
-	dst_cache = &info->dst_cache;
+	dst_cache = (struct dst_cache *)&info->dst_cache;
 	if (use_cache) {
 		dst = dst_cache_get_ip6(dst_cache, &fl6->saddr);
 		if (dst)
@@ -812,7 +778,8 @@ static struct dst_entry *geneve_get_v6_dst(struct sk_buff *skb,
 #endif
 
 static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
-			   struct geneve_dev *geneve, struct ip_tunnel_info *info)
+			   struct geneve_dev *geneve,
+			   const struct ip_tunnel_info *info)
 {
 	bool xnet = !net_eq(geneve->net, dev_net(geneve->dev));
 	struct geneve_sock *gs4 = rcu_dereference(geneve->sock4);
@@ -820,11 +787,9 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
 	struct rtable *rt;
 	int err = -EINVAL;
 	struct flowi4 fl4;
-	u8 *opts = NULL;
 	__u8 tos, ttl;
 	__be16 sport;
 	__be16 df;
-	u8 vni[3];
 
 	if (!gs4)
 		return err;
@@ -843,13 +808,7 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
 	}
 	df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? htons(IP_DF) : 0;
 
-	tunnel_id_to_vni(key->tun_id, vni);
-	if (info->options_len)
-		opts = ip_tunnel_info_opts(info);
-
-	skb_reset_mac_header(skb);
-	err = geneve_build_skb(rt, skb, key->tun_flags, vni,
-			       info->options_len, opts, xnet);
+	err = geneve_build_skb(&rt->dst, skb, info, xnet, sizeof(struct iphdr));
 	if (unlikely(err))
 		return err;
 
@@ -862,7 +821,8 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
 
 #if IS_ENABLED(CONFIG_IPV6)
 static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
-			    struct geneve_dev *geneve, struct ip_tunnel_info *info)
+			    struct geneve_dev *geneve,
+			    const struct ip_tunnel_info *info)
 {
 	bool xnet = !net_eq(geneve->net, dev_net(geneve->dev));
 	struct geneve_sock *gs6 = rcu_dereference(geneve->sock6);
@@ -870,10 +830,8 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
 	struct dst_entry *dst = NULL;
 	int err = -EINVAL;
 	struct flowi6 fl6;
-	u8 *opts = NULL;
 	__u8 prio, ttl;
 	__be16 sport;
-	u8 vni[3];
 
 	if (!gs6)
 		return err;
@@ -891,13 +849,7 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
 					   ip_hdr(skb), skb);
 		ttl = key->ttl ? : ip6_dst_hoplimit(dst);
 	}
-	tunnel_id_to_vni(key->tun_id, vni);
-	if (info->options_len)
-		opts = ip_tunnel_info_opts(info);
-
-	skb_reset_mac_header(skb);
-	err = geneve6_build_skb(dst, skb, key->tun_flags, vni,
-				info->options_len, opts, xnet);
+	err = geneve_build_skb(dst, skb, info, xnet, sizeof(struct ipv6hdr));
 	if (unlikely(err))
 		return err;
 
-- 
1.8.3.1

* [PATCH net-next v2 1/4] geneve: Unify LWT and netdev handling.
From: Pravin B Shelar @ 2016-11-19  2:10 UTC
  To: netdev; +Cc: Pravin B Shelar
In-Reply-To: <1479521411-53012-1-git-send-email-pshelar@ovn.org>

The current geneve implementation has two separate cases to handle:
1. netdev xmit
2. LWT xmit

In the netdev case, geneve configuration is stored in various
struct geneve_dev members, for example geneve_addr, ttl, tos,
label, flags, dst_cache, etc. For LWT, the configuration is passed
to the device in ip_tunnel_info.

The following patch uses the ip_tunnel_info struct to store almost
all of the configuration of a geneve netdevice. This allows us to
unify most of the geneve driver code around the ip_tunnel_info struct.
This dramatically simplifies the geneve code, since it no longer
needs to handle two different configuration cases. It removes
duplicate code, as a single code path can handle either type
of geneve device.
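
One visible consequence is that the 24 bit VNI is now kept in the 64 bit
tunnel id, so the driver converts between the two. A portable userspace
sketch of that mapping, treating the id as 8 bytes in network byte order
with the VNI in the last three bytes. The demo_* helpers are hypothetical
and avoid the endian-specific shifts used in the driver:

```c
#include <stdint.h>
#include <string.h>

/* Place a 24 bit VNI into the low 3 bytes of a big-endian 64 bit id. */
static void demo_vni_to_tunnel_id(const uint8_t vni[3], uint8_t id[8])
{
	memset(id, 0, 8);
	memcpy(id + 5, vni, 3);
}

/* Extract the 24 bit VNI from a big-endian 64 bit id. */
static void demo_tunnel_id_to_vni(const uint8_t id[8], uint8_t vni[3])
{
	memcpy(vni, id + 5, 3);
}
```

Converting a VNI to an id and back yields the original three bytes in
this model.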

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
 drivers/net/geneve.c | 612 ++++++++++++++++++++++-----------------------------
 1 file changed, 263 insertions(+), 349 deletions(-)

diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 85a423a..b5e65cd 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -45,41 +45,22 @@ struct geneve_net {
 
 static int geneve_net_id;
 
-union geneve_addr {
-	struct sockaddr_in sin;
-	struct sockaddr_in6 sin6;
-	struct sockaddr sa;
-};
-
-static union geneve_addr geneve_remote_unspec = { .sa.sa_family = AF_UNSPEC, };
-
 /* Pseudo network device */
 struct geneve_dev {
 	struct hlist_node  hlist;	/* vni hash table */
 	struct net	   *net;	/* netns for packet i/o */
 	struct net_device  *dev;	/* netdev for geneve tunnel */
+	struct ip_tunnel_info info;
 	struct geneve_sock __rcu *sock4;	/* IPv4 socket used for geneve tunnel */
 #if IS_ENABLED(CONFIG_IPV6)
 	struct geneve_sock __rcu *sock6;	/* IPv6 socket used for geneve tunnel */
 #endif
-	u8                 vni[3];	/* virtual network ID for tunnel */
-	u8                 ttl;		/* TTL override */
-	u8                 tos;		/* TOS override */
-	union geneve_addr  remote;	/* IP address for link partner */
 	struct list_head   next;	/* geneve's per namespace list */
-	__be32		   label;	/* IPv6 flowlabel override */
-	__be16		   dst_port;
-	bool		   collect_md;
 	struct gro_cells   gro_cells;
-	u32		   flags;
-	struct dst_cache   dst_cache;
+	bool		   collect_md;
+	bool		   use_udp6_rx_checksums;
 };
 
-/* Geneve device flags */
-#define GENEVE_F_UDP_ZERO_CSUM_TX	BIT(0)
-#define GENEVE_F_UDP_ZERO_CSUM6_TX	BIT(1)
-#define GENEVE_F_UDP_ZERO_CSUM6_RX	BIT(2)
-
 struct geneve_sock {
 	bool			collect_md;
 	struct list_head	list;
@@ -87,7 +68,6 @@ struct geneve_sock {
 	struct rcu_head		rcu;
 	int			refcnt;
 	struct hlist_head	vni_list[VNI_HASH_SIZE];
-	u32			flags;
 };
 
 static inline __u32 geneve_net_vni_hash(u8 vni[3])
@@ -109,6 +89,20 @@ static __be64 vni_to_tunnel_id(const __u8 *vni)
 #endif
 }
 
+/* Convert 64 bit tunnel ID to 24 bit VNI. */
+static void tunnel_id_to_vni(__be64 tun_id, __u8 *vni)
+{
+#ifdef __BIG_ENDIAN
+	vni[0] = (__force __u8)(tun_id >> 16);
+	vni[1] = (__force __u8)(tun_id >> 8);
+	vni[2] = (__force __u8)tun_id;
+#else
+	vni[0] = (__force __u8)((__force u64)tun_id >> 40);
+	vni[1] = (__force __u8)((__force u64)tun_id >> 48);
+	vni[2] = (__force __u8)((__force u64)tun_id >> 56);
+#endif
+}
+
 static sa_family_t geneve_get_sk_family(struct geneve_sock *gs)
 {
 	return gs->sock->sk->sk_family;
@@ -117,6 +111,7 @@ static sa_family_t geneve_get_sk_family(struct geneve_sock *gs)
 static struct geneve_dev *geneve_lookup(struct geneve_sock *gs,
 					__be32 addr, u8 vni[])
 {
+	__be64 id = vni_to_tunnel_id(vni);
 	struct hlist_head *vni_list_head;
 	struct geneve_dev *geneve;
 	__u32 hash;
@@ -125,8 +120,8 @@ static struct geneve_dev *geneve_lookup(struct geneve_sock *gs,
 	hash = geneve_net_vni_hash(vni);
 	vni_list_head = &gs->vni_list[hash];
 	hlist_for_each_entry_rcu(geneve, vni_list_head, hlist) {
-		if (!memcmp(vni, geneve->vni, sizeof(geneve->vni)) &&
-		    addr == geneve->remote.sin.sin_addr.s_addr)
+		if (!memcmp(&id, &geneve->info.key.tun_id, sizeof(id)) &&
+		    addr == geneve->info.key.u.ipv4.dst)
 			return geneve;
 	}
 	return NULL;
@@ -136,6 +131,7 @@ static struct geneve_dev *geneve_lookup(struct geneve_sock *gs,
 static struct geneve_dev *geneve6_lookup(struct geneve_sock *gs,
 					 struct in6_addr addr6, u8 vni[])
 {
+	__be64 id = vni_to_tunnel_id(vni);
 	struct hlist_head *vni_list_head;
 	struct geneve_dev *geneve;
 	__u32 hash;
@@ -144,8 +140,8 @@ static struct geneve_dev *geneve6_lookup(struct geneve_sock *gs,
 	hash = geneve_net_vni_hash(vni);
 	vni_list_head = &gs->vni_list[hash];
 	hlist_for_each_entry_rcu(geneve, vni_list_head, hlist) {
-		if (!memcmp(vni, geneve->vni, sizeof(geneve->vni)) &&
-		    ipv6_addr_equal(&addr6, &geneve->remote.sin6.sin6_addr))
+		if (!memcmp(&id, &geneve->info.key.tun_id, sizeof(id)) &&
+		    ipv6_addr_equal(&addr6, &geneve->info.key.u.ipv6.dst))
 			return geneve;
 	}
 	return NULL;
@@ -160,15 +156,12 @@ static inline struct genevehdr *geneve_hdr(const struct sk_buff *skb)
 static struct geneve_dev *geneve_lookup_skb(struct geneve_sock *gs,
 					    struct sk_buff *skb)
 {
-	u8 *vni;
-	__be32 addr;
 	static u8 zero_vni[3];
-#if IS_ENABLED(CONFIG_IPV6)
-	static struct in6_addr zero_addr6;
-#endif
+	u8 *vni;
 
 	if (geneve_get_sk_family(gs) == AF_INET) {
 		struct iphdr *iph;
+		__be32 addr;
 
 		iph = ip_hdr(skb); /* outer IP header... */
 
@@ -183,6 +176,7 @@ static struct geneve_dev *geneve_lookup_skb(struct geneve_sock *gs,
 		return geneve_lookup(gs, addr, vni);
 #if IS_ENABLED(CONFIG_IPV6)
 	} else if (geneve_get_sk_family(gs) == AF_INET6) {
+		static struct in6_addr zero_addr6;
 		struct ipv6hdr *ip6h;
 		struct in6_addr addr6;
 
@@ -305,13 +299,12 @@ static int geneve_init(struct net_device *dev)
 		return err;
 	}
 
-	err = dst_cache_init(&geneve->dst_cache, GFP_KERNEL);
+	err = dst_cache_init(&geneve->info.dst_cache, GFP_KERNEL);
 	if (err) {
 		free_percpu(dev->tstats);
 		gro_cells_destroy(&geneve->gro_cells);
 		return err;
 	}
-
 	return 0;
 }
 
@@ -319,7 +312,7 @@ static void geneve_uninit(struct net_device *dev)
 {
 	struct geneve_dev *geneve = netdev_priv(dev);
 
-	dst_cache_destroy(&geneve->dst_cache);
+	dst_cache_destroy(&geneve->info.dst_cache);
 	gro_cells_destroy(&geneve->gro_cells);
 	free_percpu(dev->tstats);
 }
@@ -368,7 +361,7 @@ static int geneve_udp_encap_recv(struct sock *sk, struct sk_buff *skb)
 }
 
 static struct socket *geneve_create_sock(struct net *net, bool ipv6,
-					 __be16 port, u32 flags)
+					 __be16 port, bool ipv6_rx_csum)
 {
 	struct socket *sock;
 	struct udp_port_cfg udp_conf;
@@ -379,8 +372,7 @@ static struct socket *geneve_create_sock(struct net *net, bool ipv6,
 	if (ipv6) {
 		udp_conf.family = AF_INET6;
 		udp_conf.ipv6_v6only = 1;
-		udp_conf.use_udp6_rx_checksums =
-		    !(flags & GENEVE_F_UDP_ZERO_CSUM6_RX);
+		udp_conf.use_udp6_rx_checksums = ipv6_rx_csum;
 	} else {
 		udp_conf.family = AF_INET;
 		udp_conf.local_ip.s_addr = htonl(INADDR_ANY);
@@ -491,7 +483,7 @@ static int geneve_gro_complete(struct sock *sk, struct sk_buff *skb,
 
 /* Create new listen socket if needed */
 static struct geneve_sock *geneve_socket_create(struct net *net, __be16 port,
-						bool ipv6, u32 flags)
+						bool ipv6, bool ipv6_rx_csum)
 {
 	struct geneve_net *gn = net_generic(net, geneve_net_id);
 	struct geneve_sock *gs;
@@ -503,7 +495,7 @@ static struct geneve_sock *geneve_socket_create(struct net *net, __be16 port,
 	if (!gs)
 		return ERR_PTR(-ENOMEM);
 
-	sock = geneve_create_sock(net, ipv6, port, flags);
+	sock = geneve_create_sock(net, ipv6, port, ipv6_rx_csum);
 	if (IS_ERR(sock)) {
 		kfree(gs);
 		return ERR_CAST(sock);
@@ -579,21 +571,22 @@ static int geneve_sock_add(struct geneve_dev *geneve, bool ipv6)
 	struct net *net = geneve->net;
 	struct geneve_net *gn = net_generic(net, geneve_net_id);
 	struct geneve_sock *gs;
+	__u8 vni[3];
 	__u32 hash;
 
-	gs = geneve_find_sock(gn, ipv6 ? AF_INET6 : AF_INET, geneve->dst_port);
+	gs = geneve_find_sock(gn, ipv6 ? AF_INET6 : AF_INET, geneve->info.key.tp_dst);
 	if (gs) {
 		gs->refcnt++;
 		goto out;
 	}
 
-	gs = geneve_socket_create(net, geneve->dst_port, ipv6, geneve->flags);
+	gs = geneve_socket_create(net, geneve->info.key.tp_dst, ipv6,
+				  geneve->use_udp6_rx_checksums);
 	if (IS_ERR(gs))
 		return PTR_ERR(gs);
 
 out:
 	gs->collect_md = geneve->collect_md;
-	gs->flags = geneve->flags;
 #if IS_ENABLED(CONFIG_IPV6)
 	if (ipv6)
 		rcu_assign_pointer(geneve->sock6, gs);
@@ -601,7 +594,8 @@ static int geneve_sock_add(struct geneve_dev *geneve, bool ipv6)
 #endif
 		rcu_assign_pointer(geneve->sock4, gs);
 
-	hash = geneve_net_vni_hash(geneve->vni);
+	tunnel_id_to_vni(geneve->info.key.tun_id, vni);
+	hash = geneve_net_vni_hash(vni);
 	hlist_add_head_rcu(&geneve->hlist, &gs->vni_list[hash]);
 	return 0;
 }
@@ -609,7 +603,7 @@ static int geneve_sock_add(struct geneve_dev *geneve, bool ipv6)
 static int geneve_open(struct net_device *dev)
 {
 	struct geneve_dev *geneve = netdev_priv(dev);
-	bool ipv6 = geneve->remote.sa.sa_family == AF_INET6;
+	bool ipv6 = !!(geneve->info.mode & IP_TUNNEL_INFO_IPV6);
 	bool metadata = geneve->collect_md;
 	int ret = 0;
 
@@ -653,12 +647,12 @@ static void geneve_build_header(struct genevehdr *geneveh,
 
 static int geneve_build_skb(struct rtable *rt, struct sk_buff *skb,
 			    __be16 tun_flags, u8 vni[3], u8 opt_len, u8 *opt,
-			    u32 flags, bool xnet)
+			    bool xnet)
 {
+	bool udp_sum = !!(tun_flags & TUNNEL_CSUM);
 	struct genevehdr *gnvh;
 	int min_headroom;
 	int err;
-	bool udp_sum = !(flags & GENEVE_F_UDP_ZERO_CSUM_TX);
 
 	skb_scrub_packet(skb, xnet);
 
@@ -686,12 +680,12 @@ static int geneve_build_skb(struct rtable *rt, struct sk_buff *skb,
 #if IS_ENABLED(CONFIG_IPV6)
 static int geneve6_build_skb(struct dst_entry *dst, struct sk_buff *skb,
 			     __be16 tun_flags, u8 vni[3], u8 opt_len, u8 *opt,
-			     u32 flags, bool xnet)
+			     bool xnet)
 {
+	bool udp_sum = !!(tun_flags & TUNNEL_CSUM);
 	struct genevehdr *gnvh;
 	int min_headroom;
 	int err;
-	bool udp_sum = !(flags & GENEVE_F_UDP_ZERO_CSUM6_TX);
 
 	skb_scrub_packet(skb, xnet);
 
@@ -734,32 +728,22 @@ static struct rtable *geneve_get_v4_rt(struct sk_buff *skb,
 	memset(fl4, 0, sizeof(*fl4));
 	fl4->flowi4_mark = skb->mark;
 	fl4->flowi4_proto = IPPROTO_UDP;
+	fl4->daddr = info->key.u.ipv4.dst;
+	fl4->saddr = info->key.u.ipv4.src;
 
-	if (info) {
-		fl4->daddr = info->key.u.ipv4.dst;
-		fl4->saddr = info->key.u.ipv4.src;
-		fl4->flowi4_tos = RT_TOS(info->key.tos);
-		dst_cache = &info->dst_cache;
-	} else {
-		tos = geneve->tos;
-		if (tos == 1) {
-			const struct iphdr *iip = ip_hdr(skb);
-
-			tos = ip_tunnel_get_dsfield(iip, skb);
-			use_cache = false;
-		}
-
-		fl4->flowi4_tos = RT_TOS(tos);
-		fl4->daddr = geneve->remote.sin.sin_addr.s_addr;
-		dst_cache = &geneve->dst_cache;
+	tos = info->key.tos;
+	if (!geneve->collect_md && (tos == 1)) {
+		tos = ip_tunnel_get_dsfield(ip_hdr(skb), skb);
+		use_cache = false;
 	}
+	fl4->flowi4_tos = RT_TOS(tos);
 
+	dst_cache = &info->dst_cache;
 	if (use_cache) {
 		rt = dst_cache_get_ip4(dst_cache, &fl4->saddr);
 		if (rt)
 			return rt;
 	}
-
 	rt = ip_route_output_key(geneve->net, fl4);
 	if (IS_ERR(rt)) {
 		netdev_dbg(dev, "no route to %pI4\n", &fl4->daddr);
@@ -795,34 +779,22 @@ static struct dst_entry *geneve_get_v6_dst(struct sk_buff *skb,
 	memset(fl6, 0, sizeof(*fl6));
 	fl6->flowi6_mark = skb->mark;
 	fl6->flowi6_proto = IPPROTO_UDP;
-
-	if (info) {
-		fl6->daddr = info->key.u.ipv6.dst;
-		fl6->saddr = info->key.u.ipv6.src;
-		fl6->flowlabel = ip6_make_flowinfo(RT_TOS(info->key.tos),
-						   info->key.label);
-		dst_cache = &info->dst_cache;
-	} else {
-		prio = geneve->tos;
-		if (prio == 1) {
-			const struct iphdr *iip = ip_hdr(skb);
-
-			prio = ip_tunnel_get_dsfield(iip, skb);
-			use_cache = false;
-		}
-
-		fl6->flowlabel = ip6_make_flowinfo(RT_TOS(prio),
-						   geneve->label);
-		fl6->daddr = geneve->remote.sin6.sin6_addr;
-		dst_cache = &geneve->dst_cache;
+	fl6->daddr = info->key.u.ipv6.dst;
+	fl6->saddr = info->key.u.ipv6.src;
+	prio = info->key.tos;
+	if (!geneve->collect_md && (prio == 1)) {
+		prio = ip_tunnel_get_dsfield(ip_hdr(skb), skb);
+		use_cache = false;
 	}
 
+	fl6->flowlabel = ip6_make_flowinfo(RT_TOS(prio),
+					   info->key.label);
+	dst_cache = &info->dst_cache;
 	if (use_cache) {
 		dst = dst_cache_get_ip6(dst_cache, &fl6->saddr);
 		if (dst)
 			return dst;
 	}
-
 	if (ipv6_stub->ipv6_dst_lookup(geneve->net, gs6->sock->sk, &dst, fl6)) {
 		netdev_dbg(dev, "no route to %pI6\n", &fl6->daddr);
 		return ERR_PTR(-ENETUNREACH);
@@ -839,195 +811,130 @@ static struct dst_entry *geneve_get_v6_dst(struct sk_buff *skb,
 }
 #endif
 
-/* Convert 64 bit tunnel ID to 24 bit VNI. */
-static void tunnel_id_to_vni(__be64 tun_id, __u8 *vni)
-{
-#ifdef __BIG_ENDIAN
-	vni[0] = (__force __u8)(tun_id >> 16);
-	vni[1] = (__force __u8)(tun_id >> 8);
-	vni[2] = (__force __u8)tun_id;
-#else
-	vni[0] = (__force __u8)((__force u64)tun_id >> 40);
-	vni[1] = (__force __u8)((__force u64)tun_id >> 48);
-	vni[2] = (__force __u8)((__force u64)tun_id >> 56);
-#endif
-}
-
-static netdev_tx_t geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
-				   struct ip_tunnel_info *info)
+static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
+			   struct geneve_dev *geneve, struct ip_tunnel_info *info)
 {
-	struct geneve_dev *geneve = netdev_priv(dev);
-	struct geneve_sock *gs4;
-	struct rtable *rt = NULL;
-	const struct iphdr *iip; /* interior IP header */
+	bool xnet = !net_eq(geneve->net, dev_net(geneve->dev));
+	struct geneve_sock *gs4 = rcu_dereference(geneve->sock4);
+	const struct ip_tunnel_key *key = &info->key;
+	struct rtable *rt;
 	int err = -EINVAL;
 	struct flowi4 fl4;
+	u8 *opts = NULL;
 	__u8 tos, ttl;
 	__be16 sport;
 	__be16 df;
-	bool xnet = !net_eq(geneve->net, dev_net(geneve->dev));
-	u32 flags = geneve->flags;
+	u8 vni[3];
 
-	gs4 = rcu_dereference(geneve->sock4);
 	if (!gs4)
-		goto tx_error;
-
-	if (geneve->collect_md) {
-		if (unlikely(!info || !(info->mode & IP_TUNNEL_INFO_TX))) {
-			netdev_dbg(dev, "no tunnel metadata\n");
-			goto tx_error;
-		}
-		if (info && ip_tunnel_info_af(info) != AF_INET)
-			goto tx_error;
-	}
+		return err;
 
 	rt = geneve_get_v4_rt(skb, dev, &fl4, info);
-	if (IS_ERR(rt)) {
-		err = PTR_ERR(rt);
-		goto tx_error;
-	}
+	if (IS_ERR(rt))
+		return PTR_ERR(rt);
 
 	sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
-	skb_reset_mac_header(skb);
-
-	iip = ip_hdr(skb);
-
-	if (info) {
-		const struct ip_tunnel_key *key = &info->key;
-		u8 *opts = NULL;
-		u8 vni[3];
-
-		tunnel_id_to_vni(key->tun_id, vni);
-		if (info->options_len)
-			opts = ip_tunnel_info_opts(info);
-
-		if (key->tun_flags & TUNNEL_CSUM)
-			flags &= ~GENEVE_F_UDP_ZERO_CSUM_TX;
-		else
-			flags |= GENEVE_F_UDP_ZERO_CSUM_TX;
-
-		err = geneve_build_skb(rt, skb, key->tun_flags, vni,
-				       info->options_len, opts, flags, xnet);
-		if (unlikely(err))
-			goto tx_error;
-
-		tos = ip_tunnel_ecn_encap(key->tos, iip, skb);
+	if (geneve->collect_md) {
+		tos = ip_tunnel_ecn_encap(key->tos, ip_hdr(skb), skb);
 		ttl = key->ttl;
-		df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? htons(IP_DF) : 0;
 	} else {
-		err = geneve_build_skb(rt, skb, 0, geneve->vni,
-				       0, NULL, flags, xnet);
-		if (unlikely(err))
-			goto tx_error;
-
-		tos = ip_tunnel_ecn_encap(fl4.flowi4_tos, iip, skb);
-		ttl = geneve->ttl;
-		if (!ttl && IN_MULTICAST(ntohl(fl4.daddr)))
-			ttl = 1;
-		ttl = ttl ? : ip4_dst_hoplimit(&rt->dst);
-		df = 0;
+		tos = ip_tunnel_ecn_encap(fl4.flowi4_tos, ip_hdr(skb), skb);
+		ttl = key->ttl ? : ip4_dst_hoplimit(&rt->dst);
 	}
-	udp_tunnel_xmit_skb(rt, gs4->sock->sk, skb, fl4.saddr, fl4.daddr,
-			    tos, ttl, df, sport, geneve->dst_port,
-			    !net_eq(geneve->net, dev_net(geneve->dev)),
-			    !!(flags & GENEVE_F_UDP_ZERO_CSUM_TX));
+	df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? htons(IP_DF) : 0;
 
-	return NETDEV_TX_OK;
-
-tx_error:
-	dev_kfree_skb(skb);
+	tunnel_id_to_vni(key->tun_id, vni);
+	if (info->options_len)
+		opts = ip_tunnel_info_opts(info);
 
-	if (err == -ELOOP)
-		dev->stats.collisions++;
-	else if (err == -ENETUNREACH)
-		dev->stats.tx_carrier_errors++;
+	skb_reset_mac_header(skb);
+	err = geneve_build_skb(rt, skb, key->tun_flags, vni,
+			       info->options_len, opts, xnet);
+	if (unlikely(err))
+		return err;
 
-	dev->stats.tx_errors++;
-	return NETDEV_TX_OK;
+	udp_tunnel_xmit_skb(rt, gs4->sock->sk, skb, fl4.saddr, fl4.daddr,
+			    tos, ttl, df, sport, geneve->info.key.tp_dst,
+			    !net_eq(geneve->net, dev_net(geneve->dev)),
+			    !(info->key.tun_flags & TUNNEL_CSUM));
+	return 0;
 }
 
 #if IS_ENABLED(CONFIG_IPV6)
-static netdev_tx_t geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
-				    struct ip_tunnel_info *info)
+static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
+			    struct geneve_dev *geneve, struct ip_tunnel_info *info)
 {
-	struct geneve_dev *geneve = netdev_priv(dev);
+	bool xnet = !net_eq(geneve->net, dev_net(geneve->dev));
+	struct geneve_sock *gs6 = rcu_dereference(geneve->sock6);
+	const struct ip_tunnel_key *key = &info->key;
 	struct dst_entry *dst = NULL;
-	const struct iphdr *iip; /* interior IP header */
-	struct geneve_sock *gs6;
 	int err = -EINVAL;
 	struct flowi6 fl6;
+	u8 *opts = NULL;
 	__u8 prio, ttl;
 	__be16 sport;
-	__be32 label;
-	bool xnet = !net_eq(geneve->net, dev_net(geneve->dev));
-	u32 flags = geneve->flags;
+	u8 vni[3];
 
-	gs6 = rcu_dereference(geneve->sock6);
 	if (!gs6)
-		goto tx_error;
-
-	if (geneve->collect_md) {
-		if (unlikely(!info || !(info->mode & IP_TUNNEL_INFO_TX))) {
-			netdev_dbg(dev, "no tunnel metadata\n");
-			goto tx_error;
-		}
-	}
+		return err;
 
 	dst = geneve_get_v6_dst(skb, dev, &fl6, info);
-	if (IS_ERR(dst)) {
-		err = PTR_ERR(dst);
-		goto tx_error;
-	}
+	if (IS_ERR(dst))
+		return PTR_ERR(dst);
 
 	sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
-	skb_reset_mac_header(skb);
-
-	iip = ip_hdr(skb);
+	if (geneve->collect_md) {
+		prio = ip_tunnel_ecn_encap(key->tos, ip_hdr(skb), skb);
+		ttl = key->ttl;
+	} else {
+		prio = ip_tunnel_ecn_encap(ip6_tclass(fl6.flowlabel),
+					   ip_hdr(skb), skb);
+		ttl = key->ttl ? : ip6_dst_hoplimit(dst);
+	}
+	tunnel_id_to_vni(key->tun_id, vni);
+	if (info->options_len)
+		opts = ip_tunnel_info_opts(info);
 
-	if (info) {
-		const struct ip_tunnel_key *key = &info->key;
-		u8 *opts = NULL;
-		u8 vni[3];
+	skb_reset_mac_header(skb);
+	err = geneve6_build_skb(dst, skb, key->tun_flags, vni,
+				info->options_len, opts, xnet);
+	if (unlikely(err))
+		return err;
 
-		tunnel_id_to_vni(key->tun_id, vni);
-		if (info->options_len)
-			opts = ip_tunnel_info_opts(info);
+	udp_tunnel6_xmit_skb(dst, gs6->sock->sk, skb, dev,
+			     &fl6.saddr, &fl6.daddr, prio, ttl,
+			     info->key.label, sport, geneve->info.key.tp_dst,
+			     !(info->key.tun_flags & TUNNEL_CSUM));
+	return 0;
+}
+#endif
 
-		if (key->tun_flags & TUNNEL_CSUM)
-			flags &= ~GENEVE_F_UDP_ZERO_CSUM6_TX;
-		else
-			flags |= GENEVE_F_UDP_ZERO_CSUM6_TX;
+static netdev_tx_t geneve_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+	struct geneve_dev *geneve = netdev_priv(dev);
+	struct ip_tunnel_info *info = NULL;
+	int err;
 
-		err = geneve6_build_skb(dst, skb, key->tun_flags, vni,
-					info->options_len, opts,
-					flags, xnet);
-		if (unlikely(err))
+	if (geneve->collect_md) {
+		info = skb_tunnel_info(skb);
+		if (unlikely(!info || !(info->mode & IP_TUNNEL_INFO_TX))) {
+			err = -EINVAL;
+			netdev_dbg(dev, "no tunnel metadata\n");
 			goto tx_error;
-
-		prio = ip_tunnel_ecn_encap(key->tos, iip, skb);
-		ttl = key->ttl;
-		label = info->key.label;
+		}
 	} else {
-		err = geneve6_build_skb(dst, skb, 0, geneve->vni,
-					0, NULL, flags, xnet);
-		if (unlikely(err))
-			goto tx_error;
-
-		prio = ip_tunnel_ecn_encap(ip6_tclass(fl6.flowlabel),
-					   iip, skb);
-		ttl = geneve->ttl;
-		if (!ttl && ipv6_addr_is_multicast(&fl6.daddr))
-			ttl = 1;
-		ttl = ttl ? : ip6_dst_hoplimit(dst);
-		label = geneve->label;
+		info = &geneve->info;
 	}
 
-	udp_tunnel6_xmit_skb(dst, gs6->sock->sk, skb, dev,
-			     &fl6.saddr, &fl6.daddr, prio, ttl, label,
-			     sport, geneve->dst_port,
-			     !!(flags & GENEVE_F_UDP_ZERO_CSUM6_TX));
-	return NETDEV_TX_OK;
+#if IS_ENABLED(CONFIG_IPV6)
+	if (info->mode & IP_TUNNEL_INFO_IPV6)
+		err = geneve6_xmit_skb(skb, dev, geneve, info);
+	else
+#endif
+		err = geneve_xmit_skb(skb, dev, geneve, info);
 
+	if (likely(!err))
+		return NETDEV_TX_OK;
 tx_error:
 	dev_kfree_skb(skb);
 
@@ -1039,23 +946,6 @@ static netdev_tx_t geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
 	dev->stats.tx_errors++;
 	return NETDEV_TX_OK;
 }
-#endif
-
-static netdev_tx_t geneve_xmit(struct sk_buff *skb, struct net_device *dev)
-{
-	struct geneve_dev *geneve = netdev_priv(dev);
-	struct ip_tunnel_info *info = NULL;
-
-	if (geneve->collect_md)
-		info = skb_tunnel_info(skb);
-
-#if IS_ENABLED(CONFIG_IPV6)
-	if ((info && ip_tunnel_info_af(info) == AF_INET6) ||
-	    (!info && geneve->remote.sa.sa_family == AF_INET6))
-		return geneve6_xmit_skb(skb, dev, info);
-#endif
-	return geneve_xmit_skb(skb, dev, info);
-}
 
 static int geneve_change_mtu(struct net_device *dev, int new_mtu)
 {
@@ -1073,14 +963,11 @@ static int geneve_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb)
 {
 	struct ip_tunnel_info *info = skb_tunnel_info(skb);
 	struct geneve_dev *geneve = netdev_priv(dev);
-	struct rtable *rt;
-	struct flowi4 fl4;
-#if IS_ENABLED(CONFIG_IPV6)
-	struct dst_entry *dst;
-	struct flowi6 fl6;
-#endif
 
 	if (ip_tunnel_info_af(info) == AF_INET) {
+		struct rtable *rt;
+		struct flowi4 fl4;
+
 		rt = geneve_get_v4_rt(skb, dev, &fl4, info);
 		if (IS_ERR(rt))
 			return PTR_ERR(rt);
@@ -1089,6 +976,9 @@ static int geneve_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb)
 		info->key.u.ipv4.src = fl4.saddr;
 #if IS_ENABLED(CONFIG_IPV6)
 	} else if (ip_tunnel_info_af(info) == AF_INET6) {
+		struct dst_entry *dst;
+		struct flowi6 fl6;
+
 		dst = geneve_get_v6_dst(skb, dev, &fl6, info);
 		if (IS_ERR(dst))
 			return PTR_ERR(dst);
@@ -1102,7 +992,7 @@ static int geneve_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb)
 
 	info->key.tp_src = udp_flow_src_port(geneve->net, skb,
 					     1, USHRT_MAX, true);
-	info->key.tp_dst = geneve->dst_port;
+	info->key.tp_dst = geneve->info.key.tp_dst;
 	return 0;
 }
 
@@ -1224,78 +1114,69 @@ static int geneve_validate(struct nlattr *tb[], struct nlattr *data[])
 }
 
 static struct geneve_dev *geneve_find_dev(struct geneve_net *gn,
-					  __be16 dst_port,
-					  union geneve_addr *remote,
-					  u8 vni[],
+					  const struct ip_tunnel_info *info,
 					  bool *tun_on_same_port,
 					  bool *tun_collect_md)
 {
-	struct geneve_dev *geneve, *t;
+	struct geneve_dev *geneve, *t = NULL;
 
 	*tun_on_same_port = false;
 	*tun_collect_md = false;
-	t = NULL;
 	list_for_each_entry(geneve, &gn->geneve_list, next) {
-		if (geneve->dst_port == dst_port) {
+		if (info->key.tp_dst == geneve->info.key.tp_dst) {
 			*tun_collect_md = geneve->collect_md;
 			*tun_on_same_port = true;
 		}
-		if (!memcmp(vni, geneve->vni, sizeof(geneve->vni)) &&
-		    !memcmp(remote, &geneve->remote, sizeof(geneve->remote)) &&
-		    dst_port == geneve->dst_port)
+		if (info->key.tun_id == geneve->info.key.tun_id &&
+		    info->key.tp_dst == geneve->info.key.tp_dst &&
+		    !memcmp(&info->key.u, &geneve->info.key.u, sizeof(info->key.u)))
 			t = geneve;
 	}
 	return t;
 }
 
+static bool is_all_zero(const u8 *fp, size_t size)
+{
+	int i;
+
+	for (i = 0; i < size; i++)
+		if (fp[i])
+			return false;
+	return true;
+}
+
+static bool is_tnl_info_zero(const struct ip_tunnel_info *info)
+{
+	if (info->key.tun_id || info->key.tun_flags || info->key.tos ||
+	    info->key.ttl || info->key.label || info->key.tp_src ||
+	    !is_all_zero((const u8 *)&info->key.u, sizeof(info->key.u)))
+		return false;
+	else
+		return true;
+}
+
 static int geneve_configure(struct net *net, struct net_device *dev,
-			    union geneve_addr *remote,
-			    __u32 vni, __u8 ttl, __u8 tos, __be32 label,
-			    __be16 dst_port, bool metadata, u32 flags)
+			    const struct ip_tunnel_info *info,
+			    bool metadata, bool ipv6_rx_csum)
 {
 	struct geneve_net *gn = net_generic(net, geneve_net_id);
 	struct geneve_dev *t, *geneve = netdev_priv(dev);
 	bool tun_collect_md, tun_on_same_port;
 	int err, encap_len;
 
-	if (!remote)
-		return -EINVAL;
-	if (metadata &&
-	    (remote->sa.sa_family != AF_UNSPEC || vni || tos || ttl || label))
+	if (metadata && !is_tnl_info_zero(info))
 		return -EINVAL;
 
 	geneve->net = net;
 	geneve->dev = dev;
 
-	geneve->vni[0] = (vni & 0x00ff0000) >> 16;
-	geneve->vni[1] = (vni & 0x0000ff00) >> 8;
-	geneve->vni[2] =  vni & 0x000000ff;
-
-	if ((remote->sa.sa_family == AF_INET &&
-	     IN_MULTICAST(ntohl(remote->sin.sin_addr.s_addr))) ||
-	    (remote->sa.sa_family == AF_INET6 &&
-	     ipv6_addr_is_multicast(&remote->sin6.sin6_addr)))
-		return -EINVAL;
-	if (label && remote->sa.sa_family != AF_INET6)
-		return -EINVAL;
-
-	geneve->remote = *remote;
-
-	geneve->ttl = ttl;
-	geneve->tos = tos;
-	geneve->label = label;
-	geneve->dst_port = dst_port;
-	geneve->collect_md = metadata;
-	geneve->flags = flags;
-
-	t = geneve_find_dev(gn, dst_port, remote, geneve->vni,
-			    &tun_on_same_port, &tun_collect_md);
+	t = geneve_find_dev(gn, info, &tun_on_same_port, &tun_collect_md);
 	if (t)
 		return -EBUSY;
 
 	/* make enough headroom for basic scenario */
 	encap_len = GENEVE_BASE_HLEN + ETH_HLEN;
-	if (remote->sa.sa_family == AF_INET) {
+	if (ip_tunnel_info_af(info) == AF_INET) {
 		encap_len += sizeof(struct iphdr);
 		dev->max_mtu -= sizeof(struct iphdr);
 	} else {
@@ -1312,7 +1193,10 @@ static int geneve_configure(struct net *net, struct net_device *dev,
 			return -EPERM;
 	}
 
-	dst_cache_reset(&geneve->dst_cache);
+	dst_cache_reset(&geneve->info.dst_cache);
+	geneve->info = *info;
+	geneve->collect_md = metadata;
+	geneve->use_udp6_rx_checksums = ipv6_rx_csum;
 
 	err = register_netdevice(dev);
 	if (err)
@@ -1322,74 +1206,99 @@ static int geneve_configure(struct net *net, struct net_device *dev,
 	return 0;
 }
 
+static void init_tnl_info(struct ip_tunnel_info *info, __u16 dst_port)
+{
+	memset(info, 0, sizeof(*info));
+	info->key.tp_dst = htons(dst_port);
+}
+
 static int geneve_newlink(struct net *net, struct net_device *dev,
 			  struct nlattr *tb[], struct nlattr *data[])
 {
-	__be16 dst_port = htons(GENEVE_UDP_PORT);
-	__u8 ttl = 0, tos = 0;
+	bool use_udp6_rx_checksums = false;
+	struct ip_tunnel_info info;
 	bool metadata = false;
-	union geneve_addr remote = geneve_remote_unspec;
-	__be32 label = 0;
-	__u32 vni = 0;
-	u32 flags = 0;
+
+	init_tnl_info(&info, GENEVE_UDP_PORT);
 
 	if (data[IFLA_GENEVE_REMOTE] && data[IFLA_GENEVE_REMOTE6])
 		return -EINVAL;
 
 	if (data[IFLA_GENEVE_REMOTE]) {
-		remote.sa.sa_family = AF_INET;
-		remote.sin.sin_addr.s_addr =
+		info.key.u.ipv4.dst =
 			nla_get_in_addr(data[IFLA_GENEVE_REMOTE]);
+
+		if (IN_MULTICAST(ntohl(info.key.u.ipv4.dst))) {
+			netdev_dbg(dev, "multicast remote is unsupported\n");
+			return -EINVAL;
+		}
 	}
 
 	if (data[IFLA_GENEVE_REMOTE6]) {
-		if (!IS_ENABLED(CONFIG_IPV6))
-			return -EPFNOSUPPORT;
-
-		remote.sa.sa_family = AF_INET6;
-		remote.sin6.sin6_addr =
+#if IS_ENABLED(CONFIG_IPV6)
+		info.mode = IP_TUNNEL_INFO_IPV6;
+		info.key.u.ipv6.dst =
 			nla_get_in6_addr(data[IFLA_GENEVE_REMOTE6]);
 
-		if (ipv6_addr_type(&remote.sin6.sin6_addr) &
+		if (ipv6_addr_type(&info.key.u.ipv6.dst) &
 		    IPV6_ADDR_LINKLOCAL) {
 			netdev_dbg(dev, "link-local remote is unsupported\n");
 			return -EINVAL;
 		}
+		if (ipv6_addr_is_multicast(&info.key.u.ipv6.dst)) {
+			netdev_dbg(dev, "multicast remote is unsupported\n");
+			return -EINVAL;
+		}
+		info.key.tun_flags |= TUNNEL_CSUM;
+		use_udp6_rx_checksums = true;
+#else
+		return -EPFNOSUPPORT;
+#endif
 	}
 
-	if (data[IFLA_GENEVE_ID])
+	if (data[IFLA_GENEVE_ID]) {
+		__u32 vni;
+		__u8 tvni[3];
+
 		vni = nla_get_u32(data[IFLA_GENEVE_ID]);
+		tvni[0] = (vni & 0x00ff0000) >> 16;
+		tvni[1] = (vni & 0x0000ff00) >> 8;
+		tvni[2] =  vni & 0x000000ff;
 
+		info.key.tun_id = vni_to_tunnel_id(tvni);
+	}
 	if (data[IFLA_GENEVE_TTL])
-		ttl = nla_get_u8(data[IFLA_GENEVE_TTL]);
+		info.key.ttl = nla_get_u8(data[IFLA_GENEVE_TTL]);
 
 	if (data[IFLA_GENEVE_TOS])
-		tos = nla_get_u8(data[IFLA_GENEVE_TOS]);
+		info.key.tos = nla_get_u8(data[IFLA_GENEVE_TOS]);
 
-	if (data[IFLA_GENEVE_LABEL])
-		label = nla_get_be32(data[IFLA_GENEVE_LABEL]) &
-			IPV6_FLOWLABEL_MASK;
+	if (data[IFLA_GENEVE_LABEL]) {
+		info.key.label = nla_get_be32(data[IFLA_GENEVE_LABEL]) &
+				  IPV6_FLOWLABEL_MASK;
+		if (info.key.label && (!(info.mode & IP_TUNNEL_INFO_IPV6)))
+			return -EINVAL;
+	}
 
 	if (data[IFLA_GENEVE_PORT])
-		dst_port = nla_get_be16(data[IFLA_GENEVE_PORT]);
+		info.key.tp_dst = nla_get_be16(data[IFLA_GENEVE_PORT]);
 
 	if (data[IFLA_GENEVE_COLLECT_METADATA])
 		metadata = true;
 
 	if (data[IFLA_GENEVE_UDP_CSUM] &&
 	    !nla_get_u8(data[IFLA_GENEVE_UDP_CSUM]))
-		flags |= GENEVE_F_UDP_ZERO_CSUM_TX;
+		info.key.tun_flags |= TUNNEL_CSUM;
 
 	if (data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX] &&
 	    nla_get_u8(data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX]))
-		flags |= GENEVE_F_UDP_ZERO_CSUM6_TX;
+		info.key.tun_flags &= ~TUNNEL_CSUM;
 
 	if (data[IFLA_GENEVE_UDP_ZERO_CSUM6_RX] &&
 	    nla_get_u8(data[IFLA_GENEVE_UDP_ZERO_CSUM6_RX]))
-		flags |= GENEVE_F_UDP_ZERO_CSUM6_RX;
+		use_udp6_rx_checksums = false;
 
-	return geneve_configure(net, dev, &remote, vni, ttl, tos, label,
-				dst_port, metadata, flags);
+	return geneve_configure(net, dev, &info, metadata, use_udp6_rx_checksums);
 }
 
 static void geneve_dellink(struct net_device *dev, struct list_head *head)
@@ -1418,45 +1327,52 @@ static size_t geneve_get_size(const struct net_device *dev)
 static int geneve_fill_info(struct sk_buff *skb, const struct net_device *dev)
 {
 	struct geneve_dev *geneve = netdev_priv(dev);
+	struct ip_tunnel_info *info = &geneve->info;
+	__u8 tmp_vni[3];
 	__u32 vni;
 
-	vni = (geneve->vni[0] << 16) | (geneve->vni[1] << 8) | geneve->vni[2];
+	tunnel_id_to_vni(info->key.tun_id, tmp_vni);
+	vni = (tmp_vni[0] << 16) | (tmp_vni[1] << 8) | tmp_vni[2];
 	if (nla_put_u32(skb, IFLA_GENEVE_ID, vni))
 		goto nla_put_failure;
 
-	if (geneve->remote.sa.sa_family == AF_INET) {
+	if (ip_tunnel_info_af(info) == AF_INET) {
 		if (nla_put_in_addr(skb, IFLA_GENEVE_REMOTE,
-				    geneve->remote.sin.sin_addr.s_addr))
+				    info->key.u.ipv4.dst))
+			goto nla_put_failure;
+
+		if (nla_put_u8(skb, IFLA_GENEVE_UDP_CSUM,
+			       !!(info->key.tun_flags & TUNNEL_CSUM)))
 			goto nla_put_failure;
+
 #if IS_ENABLED(CONFIG_IPV6)
 	} else {
 		if (nla_put_in6_addr(skb, IFLA_GENEVE_REMOTE6,
-				     &geneve->remote.sin6.sin6_addr))
+				     &info->key.u.ipv6.dst))
+			goto nla_put_failure;
+
+		if (nla_put_u8(skb, IFLA_GENEVE_UDP_ZERO_CSUM6_TX,
+			       !(info->key.tun_flags & TUNNEL_CSUM)))
+			goto nla_put_failure;
+
+		if (nla_put_u8(skb, IFLA_GENEVE_UDP_ZERO_CSUM6_RX,
+			       !geneve->use_udp6_rx_checksums))
 			goto nla_put_failure;
 #endif
 	}
 
-	if (nla_put_u8(skb, IFLA_GENEVE_TTL, geneve->ttl) ||
-	    nla_put_u8(skb, IFLA_GENEVE_TOS, geneve->tos) ||
-	    nla_put_be32(skb, IFLA_GENEVE_LABEL, geneve->label))
+	if (nla_put_u8(skb, IFLA_GENEVE_TTL, info->key.ttl) ||
+	    nla_put_u8(skb, IFLA_GENEVE_TOS, info->key.tos) ||
+	    nla_put_be32(skb, IFLA_GENEVE_LABEL, info->key.label))
 		goto nla_put_failure;
 
-	if (nla_put_be16(skb, IFLA_GENEVE_PORT, geneve->dst_port))
+	if (nla_put_be16(skb, IFLA_GENEVE_PORT, info->key.tp_dst))
 		goto nla_put_failure;
 
 	if (geneve->collect_md) {
 		if (nla_put_flag(skb, IFLA_GENEVE_COLLECT_METADATA))
 			goto nla_put_failure;
 	}
-
-	if (nla_put_u8(skb, IFLA_GENEVE_UDP_CSUM,
-		       !(geneve->flags & GENEVE_F_UDP_ZERO_CSUM_TX)) ||
-	    nla_put_u8(skb, IFLA_GENEVE_UDP_ZERO_CSUM6_TX,
-		       !!(geneve->flags & GENEVE_F_UDP_ZERO_CSUM6_TX)) ||
-	    nla_put_u8(skb, IFLA_GENEVE_UDP_ZERO_CSUM6_RX,
-		       !!(geneve->flags & GENEVE_F_UDP_ZERO_CSUM6_RX)))
-		goto nla_put_failure;
-
 	return 0;
 
 nla_put_failure:
@@ -1480,6 +1396,7 @@ struct net_device *geneve_dev_create_fb(struct net *net, const char *name,
 					u8 name_assign_type, u16 dst_port)
 {
 	struct nlattr *tb[IFLA_MAX + 1];
+	struct ip_tunnel_info info;
 	struct net_device *dev;
 	LIST_HEAD(list_kill);
 	int err;
@@ -1490,9 +1407,8 @@ struct net_device *geneve_dev_create_fb(struct net *net, const char *name,
 	if (IS_ERR(dev))
 		return dev;
 
-	err = geneve_configure(net, dev, &geneve_remote_unspec,
-			       0, 0, 0, 0, htons(dst_port), true,
-			       GENEVE_F_UDP_ZERO_CSUM6_RX);
+	init_tnl_info(&info, dst_port);
+	err = geneve_configure(net, dev, &info, true, true);
 	if (err) {
 		free_netdev(dev);
 		return ERR_PTR(err);
@@ -1510,8 +1426,7 @@ struct net_device *geneve_dev_create_fb(struct net *net, const char *name,
 		goto err;
 
 	return dev;
-
- err:
+err:
 	geneve_dellink(dev, &list_kill);
 	unregister_netdevice_many(&list_kill);
 	return ERR_PTR(err);
@@ -1594,7 +1509,6 @@ static int __init geneve_init_module(void)
 		goto out3;
 
 	return 0;
-
 out3:
 	unregister_netdevice_notifier(&geneve_notifier_block);
 out2:
-- 
1.8.3.1

* [PATCH net-next v2 4/4] geneve: Optimize geneve device lookup.
From: Pravin B Shelar @ 2016-11-19  2:10 UTC (permalink / raw)
  To: netdev; +Cc: Pravin B Shelar
In-Reply-To: <1479521411-53012-1-git-send-email-pshelar@ovn.org>

Rather than comparing the 64-bit tunnel id, compare the tunnel VNI,
which is a 24-bit id. This also saves the conversion from VNI
to tunnel id on each received tunnel packet.
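
For illustration only, the idea can be modelled in userspace C. This is
a sketch, not the driver code: struct tun_id_be64 and the three helpers
are hypothetical names, and the tunnel id is represented as 8 bytes in
network order so the layout does not depend on host endianness. It
assumes the upper five bytes of a configured tunnel id are zero, which
is what makes the 3-byte compare equivalent to the 8-byte one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace model: the tunnel id is kept as 8 bytes in
 * network (big-endian) order, so the 24-bit VNI occupies bytes 5..7.
 */
struct tun_id_be64 {
	uint8_t b[8];
};

/* Widen a 24-bit VNI into a 64-bit big-endian tunnel id. */
static struct tun_id_be64 vni_to_tun_id(const uint8_t vni[3])
{
	struct tun_id_be64 id;

	assert(vni != NULL);
	memset(&id, 0, sizeof(id));
	memcpy(&id.b[5], vni, 3);
	return id;
}

/* Old-style match: convert the VNI first, then compare all 8 bytes. */
static bool match_by_tun_id(const struct tun_id_be64 *stored,
			    const uint8_t vni[3])
{
	struct tun_id_be64 id = vni_to_tun_id(vni);

	return !memcmp(&id, stored, sizeof(id));
}

/* New-style match: compare only the 3 VNI bytes, no conversion. */
static bool match_by_vni(const struct tun_id_be64 *stored,
			 const uint8_t vni[3])
{
	return !memcmp(vni, &stored->b[5], 3);
}
```

In this model both match functions agree for any stored id built by
vni_to_tun_id(), but match_by_vni() skips the per-packet widening step.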

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
 drivers/net/geneve.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index f2912ca..930b1b0 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -103,6 +103,17 @@ static void tunnel_id_to_vni(__be64 tun_id, __u8 *vni)
 #endif
 }
 
+static bool eq_tun_id_and_vni(u8 *tun_id, u8 *vni)
+{
+#ifdef __BIG_ENDIAN
+	return (vni[0] == tun_id[2]) &&
+	       (vni[1] == tun_id[1]) &&
+	       (vni[2] == tun_id[0]);
+#else
+	return !memcmp(vni, &tun_id[5], 3);
+#endif
+}
+
 static sa_family_t geneve_get_sk_family(struct geneve_sock *gs)
 {
 	return gs->sock->sk->sk_family;
@@ -111,7 +122,6 @@ static sa_family_t geneve_get_sk_family(struct geneve_sock *gs)
 static struct geneve_dev *geneve_lookup(struct geneve_sock *gs,
 					__be32 addr, u8 vni[])
 {
-	__be64 id = vni_to_tunnel_id(vni);
 	struct hlist_head *vni_list_head;
 	struct geneve_dev *geneve;
 	__u32 hash;
@@ -120,7 +130,7 @@ static struct geneve_dev *geneve_lookup(struct geneve_sock *gs,
 	hash = geneve_net_vni_hash(vni);
 	vni_list_head = &gs->vni_list[hash];
 	hlist_for_each_entry_rcu(geneve, vni_list_head, hlist) {
-		if (!memcmp(&id, &geneve->info.key.tun_id, sizeof(id)) &&
+		if (eq_tun_id_and_vni((u8 *)&geneve->info.key.tun_id, vni) &&
 		    addr == geneve->info.key.u.ipv4.dst)
 			return geneve;
 	}
@@ -131,7 +141,6 @@ static struct geneve_dev *geneve_lookup(struct geneve_sock *gs,
 static struct geneve_dev *geneve6_lookup(struct geneve_sock *gs,
 					 struct in6_addr addr6, u8 vni[])
 {
-	__be64 id = vni_to_tunnel_id(vni);
 	struct hlist_head *vni_list_head;
 	struct geneve_dev *geneve;
 	__u32 hash;
@@ -140,7 +149,7 @@ static struct geneve_dev *geneve6_lookup(struct geneve_sock *gs,
 	hash = geneve_net_vni_hash(vni);
 	vni_list_head = &gs->vni_list[hash];
 	hlist_for_each_entry_rcu(geneve, vni_list_head, hlist) {
-		if (!memcmp(&id, &geneve->info.key.tun_id, sizeof(id)) &&
+		if (eq_tun_id_and_vni((u8 *)&geneve->info.key.tun_id, vni) &&
 		    ipv6_addr_equal(&addr6, &geneve->info.key.u.ipv6.dst))
 			return geneve;
 	}
-- 
1.8.3.1

* [PATCH net-next v2 3/4] geneve: Remove redundant socket checks.
From: Pravin B Shelar @ 2016-11-19  2:10 UTC (permalink / raw)
  To: netdev; +Cc: Pravin B Shelar
In-Reply-To: <1479521411-53012-1-git-send-email-pshelar@ovn.org>

Geneve already checks for the device socket in the route
lookup function, so there is no need to check it again in
the xmit function.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
 drivers/net/geneve.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index d1759aa..f2912ca 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -785,14 +785,11 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
 	struct geneve_sock *gs4 = rcu_dereference(geneve->sock4);
 	const struct ip_tunnel_key *key = &info->key;
 	struct rtable *rt;
-	int err = -EINVAL;
 	struct flowi4 fl4;
 	__u8 tos, ttl;
 	__be16 sport;
 	__be16 df;
-
-	if (!gs4)
-		return err;
+	int err;
 
 	rt = geneve_get_v4_rt(skb, dev, &fl4, info);
 	if (IS_ERR(rt))
@@ -828,13 +825,10 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
 	struct geneve_sock *gs6 = rcu_dereference(geneve->sock6);
 	const struct ip_tunnel_key *key = &info->key;
 	struct dst_entry *dst = NULL;
-	int err = -EINVAL;
 	struct flowi6 fl6;
 	__u8 prio, ttl;
 	__be16 sport;
-
-	if (!gs6)
-		return err;
+	int err;
 
 	dst = geneve_get_v6_dst(skb, dev, &fl6, info);
 	if (IS_ERR(dst))
-- 
1.8.3.1

* Re: [PATCH 4/5] virtio_net: add dedicated XDP transmit queues
From: Jakub Kicinski @ 2016-11-19  2:10 UTC (permalink / raw)
  To: John Fastabend
  Cc: tgraf, shm, alexei.starovoitov, daniel, davem, john.r.fastabend,
	netdev, bblanco, brouer
In-Reply-To: <20161118130953.7bf2d71d@laptop>

On Fri, 18 Nov 2016 13:09:53 -0800, Jakub Kicinski wrote:
> Looks very cool! :)
> 
> On Fri, 18 Nov 2016 11:00:41 -0800, John Fastabend wrote:
> > @@ -1542,12 +1546,34 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
> >  		return -EINVAL;
> >  	}
> >  
> > +	curr_qp = vi->curr_queue_pairs - vi->xdp_queue_pairs;
> > +	if (prog)
> > +		xdp_qp = num_online_cpus();  
> 
> Is num_online_cpus() correct here?

Sorry, I don't know the virtio_net code, so I'm probably wrong.  I was
concerned that the number of CPUs can change, and also that the CPU
mask may be sparse, so offsetting by smp_processor_id() into the
queue table below could cause trouble.
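
To make the concern concrete, here is a small userspace model. It is a
sketch, not virtio_net code: the 64-bit mask and the helper names are
assumptions made for illustration. The queue table is sized from the
number of online CPUs, while the index is computed from the raw CPU id
as in the hunk below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a CPU mask: bit n set means CPU n is online. */
static unsigned int count_cpus(uint64_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)	/* clear the lowest set bit */
		n++;
	return n;
}

/* Queue index as in the quoted hunk: first XDP queue plus CPU id. */
static unsigned int xdp_queue_index(unsigned int first_xdp_qp,
				    unsigned int cpu)
{
	assert(cpu < 64);
	return first_xdp_qp + cpu;
}

/* True if every online CPU's index stays inside a queue table that
 * holds first_xdp_qp regular queues plus one XDP queue per online CPU.
 */
static bool all_cpus_in_range(uint64_t online_mask,
			      unsigned int first_xdp_qp)
{
	unsigned int total = first_xdp_qp + count_cpus(online_mask);
	unsigned int cpu;

	for (cpu = 0; cpu < 64; cpu++) {
		if (!(online_mask & (UINT64_C(1) << cpu)))
			continue;
		if (xdp_queue_index(first_xdp_qp, cpu) >= total)
			return false;
	}
	return true;
}
```

With a dense mask such as 0x0F (CPUs 0-3) every index fits. With a
sparse mask such as 0x25 (CPUs 0, 2 and 5) and four regular queues, the
table has 7 entries but CPU 5 would index entry 9.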

@@ -353,9 +381,15 @@ static u32 do_xdp_prog(struct virtnet_info *vi,
 	switch (act) {
 	case XDP_PASS:
 		return XDP_PASS;
+	case XDP_TX:
+		qp = vi->curr_queue_pairs -
+			vi->xdp_queue_pairs +
+			smp_processor_id();
+		xdp.data = buf + (vi->mergeable_rx_bufs ? 0 : 4);
+		virtnet_xdp_xmit(vi, qp, &xdp);
+		return XDP_TX;
 	default:
 		bpf_warn_invalid_xdp_action(act);
-	case XDP_TX:
 	case XDP_ABORTED:
 	case XDP_DROP:
 		return XDP_DROP;

* Re: [PATCH 3/5] virtio_net: Add XDP support
From: John Fastabend @ 2016-11-19  2:16 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: tgraf, shm, alexei.starovoitov, daniel, davem, john.r.fastabend,
	netdev, bblanco, brouer
In-Reply-To: <1479511316.8455.303.camel@edumazet-glaptop3.roam.corp.google.com>

On 16-11-18 03:21 PM, Eric Dumazet wrote:
> On Fri, 2016-11-18 at 11:00 -0800, John Fastabend wrote:
> 
> 
>>  static void free_receive_bufs(struct virtnet_info *vi)
>>  {
>> +	struct bpf_prog *old_prog;
>>  	int i;
>>  
>>  	for (i = 0; i < vi->max_queue_pairs; i++) {
>>  		while (vi->rq[i].pages)
>>  			__free_pages(get_a_page(&vi->rq[i], GFP_KERNEL), 0);
>> +
>> +		old_prog = rcu_dereference(vi->rq[i].xdp_prog);
> 
> Seems wrong to me.
> 

Yep, it is wrong. It should be rtnl_dereference() here, and the
rcu_dereference() calls earlier in the patch need to be the _bh()
variant.

> Are you sure lockdep (with CONFIG_PROVE_RCU=y) was happy with this ?

Oops, you are right, it was missing.

> 
>> +		RCU_INIT_POINTER(vi->rq[i].xdp_prog, NULL);
>> +		if (old_prog)
>> +			bpf_prog_put(old_prog);

bpf_prog_put() waits for a grace period if the ref count drops to zero.
That said, on driver unload we need to protect the bpf_prog_put() with
RTNL_LOCK() as well.

I'll send out a v2 in a bit.

Thanks a lot.

>>  	}
>>  }
>>  
>>
> 
> 

* Re: [PATCH 4/5] virtio_net: add dedicated XDP transmit queues
From: John Fastabend @ 2016-11-19  2:43 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: tgraf, shm, alexei.starovoitov, daniel, davem, john.r.fastabend,
	netdev, bblanco, brouer
In-Reply-To: <20161118181004.4c15a6a1@laptop>

On 16-11-18 06:10 PM, Jakub Kicinski wrote:
> On Fri, 18 Nov 2016 13:09:53 -0800, Jakub Kicinski wrote:
>> Looks very cool! :)
>>
>> On Fri, 18 Nov 2016 11:00:41 -0800, John Fastabend wrote:
>>> @@ -1542,12 +1546,34 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
>>>  		return -EINVAL;
>>>  	}
>>>  
>>> +	curr_qp = vi->curr_queue_pairs - vi->xdp_queue_pairs;
>>> +	if (prog)
>>> +		xdp_qp = num_online_cpus();  
>>
>> Is num_online_cpus() correct here?
> 
> Sorry, I don't know the virtio_net code, so I'm probably wrong.  I was
> concerned that the number of CPUs can change, and also that the CPU
> mask may be sparse, so offsetting by smp_processor_id() into the
> queue table below could cause trouble.
> 

Seems like a valid concern to me. How about num_possible_cpus() instead?

> @@ -353,9 +381,15 @@ static u32 do_xdp_prog(struct virtnet_info *vi,
>  	switch (act) {
>  	case XDP_PASS:
>  		return XDP_PASS;
> +	case XDP_TX:
> +		qp = vi->curr_queue_pairs -
> +			vi->xdp_queue_pairs +
> +			smp_processor_id();
> +		xdp.data = buf + (vi->mergeable_rx_bufs ? 0 : 4);
> +		virtnet_xdp_xmit(vi, qp, &xdp);
> +		return XDP_TX;
>  	default:
>  		bpf_warn_invalid_xdp_action(act);
> -	case XDP_TX:
>  	case XDP_ABORTED:
>  	case XDP_DROP:
>  		return XDP_DROP;
> 

* Re: [PATCH 4/5] virtio_net: add dedicated XDP transmit queues
From: Jakub Kicinski @ 2016-11-19  2:57 UTC (permalink / raw)
  To: John Fastabend
  Cc: tgraf, shm, alexei.starovoitov, daniel, davem, john.r.fastabend,
	netdev, bblanco, brouer, Eric Dumazet
In-Reply-To: <582FBC6B.8000902@gmail.com>

On Fri, 18 Nov 2016 18:43:55 -0800, John Fastabend wrote:
> On 16-11-18 06:10 PM, Jakub Kicinski wrote:
> > On Fri, 18 Nov 2016 13:09:53 -0800, Jakub Kicinski wrote:  
> >> Looks very cool! :)
> >>
> >> On Fri, 18 Nov 2016 11:00:41 -0800, John Fastabend wrote:  
>  [...]  
> >>
> >> Is num_online_cpus() correct here?  
> > 
> > Sorry, I don't know the virtio_net code, so I'm probably wrong.  I was
> > concerned whether the number of cpus can change but also that the cpu
> > mask may be sparse and therefore offsetting by smp_processor_id()
> > into the queue table below could bring trouble.
> >   
> 
> Seems like a valid concern to me. How about num_possible_cpus() instead?

That would solve problem 1, but could cpu_possible_mask still be sparse
on strange setups?  Let me try to dig into this, I recall someone
(Eric?) was fixing similar problems some time ago.

> > @@ -353,9 +381,15 @@ static u32 do_xdp_prog(struct virtnet_info *vi,
> >  	switch (act) {
> >  	case XDP_PASS:
> >  		return XDP_PASS;
> > +	case XDP_TX:
> > +		qp = vi->curr_queue_pairs -
> > +			vi->xdp_queue_pairs +
> > +			smp_processor_id();
> > +		xdp.data = buf + (vi->mergeable_rx_bufs ? 0 : 4);
> > +		virtnet_xdp_xmit(vi, qp, &xdp);
> > +		return XDP_TX;
> >  	default:
> >  		bpf_warn_invalid_xdp_action(act);
> > -	case XDP_TX:
> >  	case XDP_ABORTED:
> >  	case XDP_DROP:
> >  		return XDP_DROP;
> >   

^ permalink raw reply

* Re: [PATCH 4/5] virtio_net: add dedicated XDP transmit queues
From: Eric Dumazet @ 2016-11-19  3:20 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: John Fastabend, tgraf, shm, alexei.starovoitov, daniel, davem,
	john.r.fastabend, netdev, bblanco, brouer
In-Reply-To: <20161118185728.67e2b53a@laptop>

On Fri, 2016-11-18 at 18:57 -0800, Jakub Kicinski wrote:
> On Fri, 18 Nov 2016 18:43:55 -0800, John Fastabend wrote:
> > On 16-11-18 06:10 PM, Jakub Kicinski wrote:
> > > On Fri, 18 Nov 2016 13:09:53 -0800, Jakub Kicinski wrote:  
> > >> Looks very cool! :)
> > >>
> > >> On Fri, 18 Nov 2016 11:00:41 -0800, John Fastabend wrote:  
> >  [...]  
> > >>
> > >> Is num_online_cpus() correct here?  
> > > 
> > > Sorry, I don't know the virtio_net code, so I'm probably wrong.  I was
> > > concerned whether the number of cpus can change but also that the cpu
> > > mask may be sparse and therefore offsetting by smp_processor_id()
> > > into the queue table below could bring trouble.
> > >   
> > 
> > Seems like a valid concern to me. How about num_possible_cpus() instead?
> 
> That would solve problem 1, but could cpu_possible_mask still be sparse
> on strange setups?  Let me try to dig into this, I recall someone
> (Eric?) was fixing similar problems some time ago.

nr_cpu_ids is probably what you want ;)
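
To illustrate the sparse-mask concern discussed above, here is a minimal
userspace sketch, not kernel code. It assumes a hypothetical possible-CPU
mask with ids 0, 2 and 5, and models num_possible_cpus() as the number of
set bits and nr_cpu_ids as the highest set bit plus one. The names
cpu_mask_t, mask_weight(), mask_nr_ids() and xdp_tx_queue() are made up for
this example; only the index arithmetic mirrors the quoted XDP_TX hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model of a CPU mask: bit n set means CPU id n may exist. */
typedef uint64_t cpu_mask_t;

/* Analogue of num_possible_cpus(): count the set bits in the mask. */
static unsigned int mask_weight(cpu_mask_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Analogue of nr_cpu_ids: highest set bit + 1 (0 for an empty mask). */
static unsigned int mask_nr_ids(cpu_mask_t mask)
{
	unsigned int n = 0;

	while (mask) {
		n++;
		mask >>= 1;
	}
	return n;
}

/*
 * Queue selection as in the quoted hunk: the XDP TX queues sit at the end
 * of the queue table and are indexed by CPU id. The assert documents the
 * requirement that the CPU id is smaller than the number of XDP queues.
 */
static unsigned int xdp_tx_queue(unsigned int curr_queue_pairs,
				 unsigned int xdp_queue_pairs,
				 unsigned int cpu)
{
	assert(cpu < xdp_queue_pairs);
	return curr_queue_pairs - xdp_queue_pairs + cpu;
}
```

With the mask 0x25 (ids 0, 2, 5), sizing by the bit count gives 3 XDP
queues, so CPU id 5 would index past the end of the table. Sizing by the
highest id plus one gives 6 queues, so every CPU id maps to a valid slot at
the cost of some unused queues.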

^ permalink raw reply

* Re: [PATCH 4/5] virtio_net: add dedicated XDP transmit queues
From: Jakub Kicinski @ 2016-11-19  3:23 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: John Fastabend, tgraf, shm, alexei.starovoitov, daniel, davem,
	john.r.fastabend, netdev, bblanco, brouer
In-Reply-To: <1479525658.8455.313.camel@edumazet-glaptop3.roam.corp.google.com>

On Fri, 18 Nov 2016 19:20:58 -0800, Eric Dumazet wrote:
> On Fri, 2016-11-18 at 18:57 -0800, Jakub Kicinski wrote:
> > On Fri, 18 Nov 2016 18:43:55 -0800, John Fastabend wrote:  
> > > On 16-11-18 06:10 PM, Jakub Kicinski wrote:  
>  [...]  
> > > 
> > > Seems like a valid concern to me. How about num_possible_cpus() instead?  
> > 
> > That would solve problem 1, but could cpu_possible_mask still be sparse
> > on strange setups?  Let me try to dig into this, I recall someone
> > (Eric?) was fixing similar problems some time ago.  
> 
> nr_cpu_ids is probably what you want ;)

Thank you :)

^ permalink raw reply

* [PATCH 0/2] pull request for net: batman-adv 2016-11-19
From: Simon Wunderlich @ 2016-11-19 11:06 UTC (permalink / raw)
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r

Hi David,

here are some bugfix patches which we would like to have integrated
into net.

Please pull or let me know of any problem!

Thank you,
      Simon

The following changes since commit 9799c50372b23ed774791bdb87d700f1286ee8a9:

  batman-adv: fix splat on disabling an interface (2016-10-21 14:47:02 +0200)

are available in the git repository at:

  git://git.open-mesh.org/linux-merge.git tags/batadv-net-for-davem-20161119

for you to fetch changes up to e13258f38e927b61cdb5f4ad25309450d3b127d1:

  batman-adv: Detect missing primaryif during tp_send as error (2016-11-04 12:27:39 +0100)

----------------------------------------------------------------
Here are two batman-adv bugfix patches:

 - Revert a fix for a splat on disabling an interface, which created another problem,
   by Sven Eckelmann

 - Fix error handling when the primary interface disappears during a
   throughput meter test, by Sven Eckelmann

----------------------------------------------------------------
Sven Eckelmann (2):
      batman-adv: Revert "fix splat on disabling an interface"
      batman-adv: Detect missing primaryif during tp_send as error

 net/batman-adv/hard-interface.c | 1 +
 net/batman-adv/tp_meter.c       | 1 +
 2 files changed, 2 insertions(+)

^ permalink raw reply

* [PATCH 1/2] batman-adv: Revert "fix splat on disabling an interface"
From: Simon Wunderlich @ 2016-11-19 11:06 UTC (permalink / raw)
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r
In-Reply-To: <20161119110614.23325-1-sw-2YrNx6rUIHYiY0qSoAWiAoQuADTiUCJX@public.gmane.org>

From: Sven Eckelmann <sven-KaDOiPu9UxWEi8DpZVb4nw@public.gmane.org>

The commit 9799c50372b2 ("batman-adv: fix splat on disabling an interface")
fixed a warning but at the same time broke the rtnl function add_slave for
devices which were temporarily removed.

batadv_softif_slave_add requires the soft_iface of a hard_iface to be NULL
before it is allowed to be enslaved. But this resetting of soft_iface to
NULL in batadv_hardif_disable_interface was removed with the aforementioned
commit.

Reported-by: Julian Labus <julian-OytpCXbLH6+xrkVEDGYtmQ@public.gmane.org>
Signed-off-by: Sven Eckelmann <sven-KaDOiPu9UxWEi8DpZVb4nw@public.gmane.org>
Acked-by: Linus Lüssing <linus.luessing-djzkFPsfvsizQB+pC5nmwQ@public.gmane.org>
Signed-off-by: Simon Wunderlich <sw-2YrNx6rUIHYiY0qSoAWiAoQuADTiUCJX@public.gmane.org>
---
 net/batman-adv/hard-interface.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/batman-adv/hard-interface.c b/net/batman-adv/hard-interface.c
index e034afb..08ce361 100644
--- a/net/batman-adv/hard-interface.c
+++ b/net/batman-adv/hard-interface.c
@@ -652,6 +652,7 @@ void batadv_hardif_disable_interface(struct batadv_hard_iface *hard_iface,
 			batadv_softif_destroy_sysfs(hard_iface->soft_iface);
 	}
 
+	hard_iface->soft_iface = NULL;
 	batadv_hardif_put(hard_iface);
 
 out:
-- 
2.10.2

^ permalink raw reply related

* [PATCH 2/2] batman-adv: Detect missing primaryif during tp_send as error
From: Simon Wunderlich @ 2016-11-19 11:06 UTC (permalink / raw)
  To: davem; +Cc: netdev, b.a.t.m.a.n, Sven Eckelmann, Simon Wunderlich
In-Reply-To: <20161119110614.23325-1-sw@simonwunderlich.de>

From: Sven Eckelmann <sven@narfation.org>

The throughput meter detects different situations as problems for the
current test. It stops the test after these and reports it to userspace.
This also has to be done when the primary interface disappeared during the
test.

Fixes: 33a3bb4a3345 ("batman-adv: throughput meter implementation")
Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
---
 net/batman-adv/tp_meter.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/batman-adv/tp_meter.c b/net/batman-adv/tp_meter.c
index 2333777..8af1611 100644
--- a/net/batman-adv/tp_meter.c
+++ b/net/batman-adv/tp_meter.c
@@ -837,6 +837,7 @@ static int batadv_tp_send(void *arg)
 	primary_if = batadv_primary_if_get_selected(bat_priv);
 	if (unlikely(!primary_if)) {
 		err = BATADV_TP_REASON_DST_UNREACHABLE;
+		tp_vars->reason = err;
 		goto out;
 	}
 
-- 
2.10.2

^ permalink raw reply related

* [PATCH 0/8] pull request for net-next: batman-adv 2016-11-19
From: Simon Wunderlich @ 2016-11-19 13:35 UTC (permalink / raw)
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r

Hi David,

this should be our last feature pull request for batman-adv in this round.

Please pull or let me know of any problem!

Thank you,
      Simon

The following changes since commit 93bbaab455f30fd43911e0881a02107a17150a62:

  batman-adv: Reject unicast packet with zero/mcast dst address (2016-11-08 19:02:36 +0100)

are available in the git repository at:

  git://git.open-mesh.org/linux-merge.git tags/batadv-next-for-davem-20161119

for you to fetch changes up to 9b4aec647a92a2464337db10507348aecf0f0fd7:

  batman-adv: fix rare race conditions on interface removal (2016-11-08 19:02:39 +0100)

----------------------------------------------------------------
This feature patchset includes the following changes:

 - 6 patches adding functionality to detect a WiFi interface under
   other virtual interfaces, like VLANs. They introduce a cache for
   the detected WiFi configuration to avoid RTNL locking in
   critical sections. Patches have been prepared by Marek Lindner
   and Sven Eckelmann

 - Enable automatic module loading for genl requests, by Sven Eckelmann

 - Fix a potential race condition on interface removal. This is not
   happening very often in practice, but requires bigger changes to fix,
   so we are sending this to net-next. By Linus Luessing

----------------------------------------------------------------
Linus Lüssing (1):
      batman-adv: fix rare race conditions on interface removal

Marek Lindner (3):
      batman-adv: refactor wifi interface detection
      batman-adv: additional checks for virtual interfaces on top of WiFi
      batman-adv: retrieve B.A.T.M.A.N. V WiFi neighbor stats from real interface

Sven Eckelmann (4):
      batman-adv: Return non-const ptr in batadv_getlink_net
      batman-adv: Cache the type of wifi device for each hardif
      batman-adv: Update wifi flags on upper link change
      batman-adv: Add module alias for batadv netlink family

 net/batman-adv/bat_iv_ogm.c        |  21 +--
 net/batman-adv/bat_v_elp.c         |  41 +++---
 net/batman-adv/hard-interface.c    | 173 +++++++++++++++++++++--
 net/batman-adv/hard-interface.h    |   5 +-
 net/batman-adv/main.c              |   3 +
 net/batman-adv/send.c              | 283 ++++++++++++++++++++++++++++---------
 net/batman-adv/send.h              |   5 +
 net/batman-adv/translation-table.c |  10 +-
 net/batman-adv/types.h             |  19 +++
 9 files changed, 445 insertions(+), 115 deletions(-)

^ permalink raw reply

* [PATCH 1/8] batman-adv: Return non-const ptr in batadv_getlink_net
From: Simon Wunderlich @ 2016-11-19 13:35 UTC (permalink / raw)
  To: davem; +Cc: netdev, b.a.t.m.a.n, Sven Eckelmann, Simon Wunderlich
In-Reply-To: <20161119133528.1379-1-sw@simonwunderlich.de>

From: Sven Eckelmann <sven.eckelmann@open-mesh.com>

The returned net_namespace of batadv_getlink_net may be used with functions
that potentially modify the struct. Thus it must return the pointer as
non-const like rtnl_link_ops::get_link_net does.

Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
---
 net/batman-adv/hard-interface.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/batman-adv/hard-interface.c b/net/batman-adv/hard-interface.c
index a7a462e..fdbb47e 100644
--- a/net/batman-adv/hard-interface.c
+++ b/net/batman-adv/hard-interface.c
@@ -92,8 +92,8 @@ batadv_hardif_get_by_netdev(const struct net_device *net_dev)
  *
  * Return: result of rtnl_link_ops->get_link_net or @fallback_net
  */
-static const struct net *batadv_getlink_net(const struct net_device *netdev,
-					    const struct net *fallback_net)
+static struct net *batadv_getlink_net(const struct net_device *netdev,
+				      struct net *fallback_net)
 {
 	if (!netdev->rtnl_link_ops)
 		return fallback_net;
@@ -116,9 +116,9 @@ static const struct net *batadv_getlink_net(const struct net_device *netdev,
  * Return: true if the devices are each others parent, otherwise false
  */
 static bool batadv_mutual_parents(const struct net_device *dev1,
-				  const struct net *net1,
+				  struct net *net1,
 				  const struct net_device *dev2,
-				  const struct net *net2)
+				  struct net *net2)
 {
 	int dev1_parent_iflink = dev_get_iflink(dev1);
 	int dev2_parent_iflink = dev_get_iflink(dev2);
@@ -154,7 +154,7 @@ static bool batadv_is_on_batman_iface(const struct net_device *net_dev)
 {
 	struct net *net = dev_net(net_dev);
 	struct net_device *parent_dev;
-	const struct net *parent_net;
+	struct net *parent_net;
 	bool ret;
 
 	/* check if this is a batman-adv mesh interface */
-- 
2.10.2

^ permalink raw reply related

* [PATCH 2/8] batman-adv: refactor wifi interface detection
From: Simon Wunderlich @ 2016-11-19 13:35 UTC (permalink / raw)
  To: davem; +Cc: netdev, b.a.t.m.a.n, Marek Lindner, Sven Eckelmann,
	Simon Wunderlich
In-Reply-To: <20161119133528.1379-1-sw@simonwunderlich.de>

From: Marek Lindner <mareklindner@neomailbox.ch>

The ELP protocol requires cfg80211 to auto-detect the WiFi throughput
to a given neighbor. Use batadv_is_cfg80211_netdev() to determine
whether or not an interface is eligible.

Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
---
 net/batman-adv/bat_v_elp.c      | 29 ++++++++++++++---------------
 net/batman-adv/hard-interface.c | 26 +++++++++++++++++++++-----
 net/batman-adv/hard-interface.h |  1 +
 3 files changed, 36 insertions(+), 20 deletions(-)

diff --git a/net/batman-adv/bat_v_elp.c b/net/batman-adv/bat_v_elp.c
index 54bdd41..e601def 100644
--- a/net/batman-adv/bat_v_elp.c
+++ b/net/batman-adv/bat_v_elp.c
@@ -90,22 +90,21 @@ static u32 batadv_v_elp_get_throughput(struct batadv_hardif_neigh_node *neigh)
 	 * cfg80211 API
 	 */
 	if (batadv_is_wifi_netdev(hard_iface->net_dev)) {
-		if (hard_iface->net_dev->ieee80211_ptr) {
-			ret = cfg80211_get_station(hard_iface->net_dev,
-						   neigh->addr, &sinfo);
-			if (ret == -ENOENT) {
-				/* Node is not associated anymore! It would be
-				 * possible to delete this neighbor. For now set
-				 * the throughput metric to 0.
-				 */
-				return 0;
-			}
-			if (!ret)
-				return sinfo.expected_throughput / 100;
+		if (!batadv_is_cfg80211_netdev(hard_iface->net_dev))
+			/* unsupported WiFi driver version */
+			goto default_throughput;
+
+		ret = cfg80211_get_station(hard_iface->net_dev,
+					   neigh->addr, &sinfo);
+		if (ret == -ENOENT) {
+			/* Node is not associated anymore! It would be
+			 * possible to delete this neighbor. For now set
+			 * the throughput metric to 0.
+			 */
+			return 0;
 		}
-
-		/* unsupported WiFi driver version */
-		goto default_throughput;
+		if (!ret)
+			return sinfo.expected_throughput / 100;
 	}
 
 	/* if not a wifi interface, check if this device provides data via
diff --git a/net/batman-adv/hard-interface.c b/net/batman-adv/hard-interface.c
index fdbb47e..c3fbc1b 100644
--- a/net/batman-adv/hard-interface.c
+++ b/net/batman-adv/hard-interface.c
@@ -202,6 +202,26 @@ static bool batadv_is_valid_iface(const struct net_device *net_dev)
 }
 
 /**
+ * batadv_is_cfg80211_netdev - check if the given net_device struct is a
+ *  cfg80211 wifi interface
+ * @net_device: the device to check
+ *
+ * Return: true if the net device is a cfg80211 wireless device, false
+ *  otherwise.
+ */
+bool batadv_is_cfg80211_netdev(struct net_device *net_device)
+{
+	if (!net_device)
+		return false;
+
+	/* cfg80211 drivers have to set ieee80211_ptr */
+	if (net_device->ieee80211_ptr)
+		return true;
+
+	return false;
+}
+
+/**
  * batadv_is_wifi_netdev - check if the given net_device struct is a wifi
  *  interface
  * @net_device: the device to check
@@ -221,11 +241,7 @@ bool batadv_is_wifi_netdev(struct net_device *net_device)
 		return true;
 #endif
 
-	/* cfg80211 drivers have to set ieee80211_ptr */
-	if (net_device->ieee80211_ptr)
-		return true;
-
-	return false;
+	return batadv_is_cfg80211_netdev(net_device);
 }
 
 /**
diff --git a/net/batman-adv/hard-interface.h b/net/batman-adv/hard-interface.h
index a043182..7f9acab 100644
--- a/net/batman-adv/hard-interface.h
+++ b/net/batman-adv/hard-interface.h
@@ -65,6 +65,7 @@ enum batadv_hard_if_cleanup {
 
 extern struct notifier_block batadv_hard_if_notifier;
 
+bool batadv_is_cfg80211_netdev(struct net_device *net_device);
 bool batadv_is_wifi_netdev(struct net_device *net_device);
 bool batadv_is_wifi_iface(int ifindex);
 struct batadv_hard_iface*
-- 
2.10.2

^ permalink raw reply related
