Netdev List

* [PATCH 2/3] mlx4: Don't bother using skb_tx_hash in mlx4_en_select_queue
From: Alexander Duyck @ 2018-04-27 18:06 UTC
  To: netdev, davem
  Cc: linux-rdma, dennis.dalessandro, niranjana.vishwanathapura, tariqt
In-Reply-To: <20180427180142.4883.96259.stgit@ahduyck-green-test.jf.intel.com>

The code in the fallback path has supported XPS in conjunction with the Tx
traffic classification for TCs for over a year now. So instead of just
calling skb_tx_hash for every packet we are better off using the fallback
since that will record the Tx queue to the socket and then that can be used
instead of having to recompute the hash every time.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 6b68537..0227786 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -694,7 +694,7 @@ u16 mlx4_en_select_queue(struct net_device *dev, struct sk_buff *skb,
 	u16 rings_p_up = priv->num_tx_rings_p_up;
 
 	if (netdev_get_num_tc(dev))
-		return skb_tx_hash(dev, skb);
+		return fallback(dev, skb);
 
 	return fallback(dev, skb) % rings_p_up;
 }
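
Editorial sketch, not part of the patch: the commit message argues that a
queue choice recorded on the socket can be reused instead of rehashing every
packet. The user-space model below illustrates that idea only. All names
(struct fake_sock, flow_hash, pick_tx_queue) are hypothetical, and the code
does not reproduce what the kernel's fallback actually does.

```c
#include <assert.h>
#include <stdint.h>

#define NO_QUEUE (-1)

/* Hypothetical stand-in for a socket that remembers its Tx queue. */
struct fake_sock {
	int cached_queue;        /* NO_QUEUE until the first transmit */
	unsigned int hash_calls; /* counts how often we had to hash */
};

/* Toy flow hash; any 32-bit mixing function would do for this model. */
static uint32_t flow_hash(uint32_t flow_id)
{
	flow_id ^= flow_id >> 16;
	flow_id *= 0x45d9f3bu;
	flow_id ^= flow_id >> 16;
	return flow_id;
}

/* Pick a queue, hashing only when the socket has no usable cached queue. */
static int pick_tx_queue(struct fake_sock *sk, uint32_t flow_id,
			 unsigned int num_queues)
{
	assert(num_queues > 0);

	if (sk->cached_queue == NO_QUEUE ||
	    (unsigned int)sk->cached_queue >= num_queues) {
		sk->hash_calls++;
		sk->cached_queue = (int)(flow_hash(flow_id) % num_queues);
	}
	return sk->cached_queue;
}
```

In this model a second packet on the same socket returns the cached queue
without calling flow_hash() again.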


* [PATCH 1/3] opa_vnic: Just use skb_get_hash instead of skb_tx_hash
From: Alexander Duyck @ 2018-04-27 18:06 UTC
  To: netdev, davem
  Cc: linux-rdma, dennis.dalessandro, niranjana.vishwanathapura, tariqt
In-Reply-To: <20180427180142.4883.96259.stgit@ahduyck-green-test.jf.intel.com>

This patch is meant to clean up how the opa_vnic is obtaining entropy from
Tx packets.

The code as it was written was claiming to get 16 bits of hash, but from
what I can tell it was only ever actually getting 15 bits as it was limited
to 0 - (2^15 - 1). It then was folding the result to get an 8 bit value for
entropy.

Instead of throwing away all that input I am cutting out the middle man and
instead having the code call skb_get_hash directly and then folding the 32
bit value into an 8 bit value using a pair of shifts and XOR operations.

Execution wise this new approach should provide more entropy and be faster
since we are bypassing the reciprocal multiplication to reduce the 32b
value to 16b and instead just using a shift/XOR combination.

In addition we can drop the unneeded adapter value from the call to get the
entropy since the netdev itself isn't even needed.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.c   |   21 ++++++++++----------
 .../infiniband/ulp/opa_vnic/opa_vnic_internal.h    |    2 +-
 drivers/infiniband/ulp/opa_vnic/opa_vnic_netdev.c  |    2 +-
 3 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.c b/drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.c
index 4be3aef..267da82 100644
--- a/drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.c
+++ b/drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.c
@@ -443,17 +443,16 @@ static u8 opa_vnic_get_rc(struct __opa_veswport_info *info,
 }
 
 /* opa_vnic_calc_entropy - calculate the packet entropy */
-u8 opa_vnic_calc_entropy(struct opa_vnic_adapter *adapter, struct sk_buff *skb)
+u8 opa_vnic_calc_entropy(struct sk_buff *skb)
 {
-	u16 hash16;
-
-	/*
-	 * Get flow based 16-bit hash and then XOR the upper and lower bytes
-	 * to get the entropy.
-	 * __skb_tx_hash limits qcount to 16 bits. Hence, get 15-bit hash.
-	 */
-	hash16 = __skb_tx_hash(adapter->netdev, skb, BIT(15));
-	return (u8)((hash16 >> 8) ^ (hash16 & 0xff));
+	u32 hash = skb_get_hash(skb);
+
+	/* store XOR of all bytes in lower 8 bits */
+	hash ^= hash >> 8;
+	hash ^= hash >> 16;
+
+	/* return lower 8 bits as entropy */
+	return (u8)(hash & 0xFF);
 }
 
 /* opa_vnic_get_def_port - get default port based on entropy */
@@ -490,7 +489,7 @@ void opa_vnic_encap_skb(struct opa_vnic_adapter *adapter, struct sk_buff *skb)
 
 	hdr = skb_push(skb, OPA_VNIC_HDR_LEN);
 
-	entropy = opa_vnic_calc_entropy(adapter, skb);
+	entropy = opa_vnic_calc_entropy(skb);
 	def_port = opa_vnic_get_def_port(adapter, entropy);
 	len = opa_vnic_wire_length(skb);
 	dlid = opa_vnic_get_dlid(adapter, skb, def_port);
diff --git a/drivers/infiniband/ulp/opa_vnic/opa_vnic_internal.h b/drivers/infiniband/ulp/opa_vnic/opa_vnic_internal.h
index afd95f4..43ac61f 100644
--- a/drivers/infiniband/ulp/opa_vnic/opa_vnic_internal.h
+++ b/drivers/infiniband/ulp/opa_vnic/opa_vnic_internal.h
@@ -299,7 +299,7 @@ struct opa_vnic_adapter *opa_vnic_add_netdev(struct ib_device *ibdev,
 void opa_vnic_rem_netdev(struct opa_vnic_adapter *adapter);
 void opa_vnic_encap_skb(struct opa_vnic_adapter *adapter, struct sk_buff *skb);
 u8 opa_vnic_get_vl(struct opa_vnic_adapter *adapter, struct sk_buff *skb);
-u8 opa_vnic_calc_entropy(struct opa_vnic_adapter *adapter, struct sk_buff *skb);
+u8 opa_vnic_calc_entropy(struct sk_buff *skb);
 void opa_vnic_process_vema_config(struct opa_vnic_adapter *adapter);
 void opa_vnic_release_mac_tbl(struct opa_vnic_adapter *adapter);
 void opa_vnic_query_mac_tbl(struct opa_vnic_adapter *adapter,
diff --git a/drivers/infiniband/ulp/opa_vnic/opa_vnic_netdev.c b/drivers/infiniband/ulp/opa_vnic/opa_vnic_netdev.c
index ce57e0f..0c8aec6 100644
--- a/drivers/infiniband/ulp/opa_vnic/opa_vnic_netdev.c
+++ b/drivers/infiniband/ulp/opa_vnic/opa_vnic_netdev.c
@@ -104,7 +104,7 @@ static u16 opa_vnic_select_queue(struct net_device *netdev, struct sk_buff *skb,
 
 	/* pass entropy and vl as metadata in skb */
 	mdata = skb_push(skb, sizeof(*mdata));
-	mdata->entropy =  opa_vnic_calc_entropy(adapter, skb);
+	mdata->entropy = opa_vnic_calc_entropy(skb);
 	mdata->vl = opa_vnic_get_vl(adapter, skb);
 	rc = adapter->rn_ops->ndo_select_queue(netdev, skb,
 					       accel_priv, fallback);
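
Editorial sketch, not part of the patch: the two entropy folds discussed in
the commit message can be compared in user space. scale_to_range() uses the
same multiply-and-shift arithmetic as the kernel's reciprocal_scale();
entropy_old() and entropy_new() are hypothetical names, and entropy_old() is
a simplified model that assumes the old path reduced the flow hash straight
to the range [0, 2^15).

```c
#include <assert.h>
#include <stdint.h>

/* Map a 32-bit value into [0, range) without a division. */
static uint32_t scale_to_range(uint32_t val, uint32_t range)
{
	assert(range > 0);
	return (uint32_t)(((uint64_t)val * range) >> 32);
}

/* Old approach (modelled): reduce to 15 bits, then XOR the two bytes. */
static uint8_t entropy_old(uint32_t hash)
{
	uint16_t hash16 = (uint16_t)scale_to_range(hash, 1u << 15);

	return (uint8_t)((hash16 >> 8) ^ (hash16 & 0xff));
}

/* New approach: XOR all four bytes of the 32-bit hash together. */
static uint8_t entropy_new(uint32_t hash)
{
	hash ^= hash >> 8;
	hash ^= hash >> 16;
	return (uint8_t)(hash & 0xff);
}
```

In this model, scaling to 2^15 keeps only the top 15 bits of the hash, so
entropy_old() never sees the low 17 bits, while entropy_new() mixes in every
byte.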


* [PATCH 0/3] Clean up users of skb_tx_hash and __skb_tx_hash
From: Alexander Duyck @ 2018-04-27 18:06 UTC
  To: netdev, davem
  Cc: linux-rdma, dennis.dalessandro, niranjana.vishwanathapura, tariqt

I am in the process of doing some work to try and enable macvlan Tx queue
selection without using ndo_select_queue. As a part of that I will likely
need to make changes to skb_tx_hash. As such this is a clean up or refactor
of the two spots where the function has been used. In both cases it didn't
really seem like the function was being used correctly so I have updated
both code paths to not make use of the function.

My current development environment doesn't have an mlx4 or OPA vnic
available so the changes to those have been build tested only.

---

Alexander Duyck (3):
      opa_vnic: Just use skb_get_hash instead of skb_tx_hash
      mlx4: Don't bother using skb_tx_hash in mlx4_en_select_queue
      net: Revoke export for __skb_tx_hash, update it to just be static skb_tx_hash

 drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.c   |   21 ++++++++++----------
 .../infiniband/ulp/opa_vnic/opa_vnic_internal.h    |    2 +-
 drivers/infiniband/ulp/opa_vnic/opa_vnic_netdev.c  |    2 +-
 drivers/net/ethernet/mellanox/mlx4/en_tx.c         |    2 +-
 include/linux/netdevice.h                          |   13 ------------
 net/core/dev.c                                     |   10 ++++------
 6 files changed, 17 insertions(+), 33 deletions(-)


* Re: [PATCH bpf-next v2 00/15] Introducing AF_XDP support
From: Björn Töpel @ 2018-04-27 18:12 UTC
  To: Willem de Bruijn
  Cc: Karlsson, Magnus, Alexander Duyck, Alexander Duyck,
	John Fastabend, Alexei Starovoitov, Jesper Dangaard Brouer,
	Daniel Borkmann, Michael S. Tsirkin, Network Development,
	Björn Töpel, michael.lundkvist, Brandeburg, Jesse,
	Singhai, Anjali, Zhang, Qi Z
In-Reply-To: <CAF=yD-JUezRSGP_6f=WDHqcznFEW4K95hd6qVK8BaAr6Q5VR5Q@mail.gmail.com>

2018-04-27 19:16 GMT+02:00 Willem de Bruijn <willemdebruijn.kernel@gmail.com>:
> On Fri, Apr 27, 2018 at 8:17 AM, Björn Töpel <bjorn.topel@gmail.com> wrote:
>> From: Björn Töpel <bjorn.topel@intel.com>
>>
>> This patch set introduces a new address family called AF_XDP that is
>> optimized for high performance packet processing and, in upcoming
>> patch sets, zero-copy semantics. In this v2 version, we have removed
>> all zero-copy related code in order to make it smaller, simpler and
>> hopefully more review friendly. This patch set only supports copy-mode
>> for the generic XDP path (XDP_SKB) for both RX and TX and copy-mode
>> for RX using the XDP_DRV path. Zero-copy support requires XDP and
>> driver changes that Jesper Dangaard Brouer is working on. Some of his
>> work has already been accepted. We will publish our zero-copy support
>> for RX and TX on top of his patch sets at a later point in time.
>
>> Changes from V1:
>>
>> * Fixes to bugs spotted by Will in his review
>> * Implemented the performance optimization to BPF_MAP_TYPE_XSKMAP
>>   suggested by Will
>
> An xsk may only exist in one map at a time. Is this somehow assured?
>

Actually this is *not* the case. An xsk may reside in many maps, and
multiple times in the same map. So it's not assured at all. :-)

The restriction for an xsk is per netdev/queue/umem, and the napi
context guarantees the SPSC constraint.

For the record, your XSKMAP suggestion gave ~100kpps in the ingress
path! Very nice!

>> * Refactored packet_direct_xmit to become a common function
>>   in core/dev.c as suggested by Will
>> * Added documentation as suggested by Jesper
>> * Proper page unpinning as suggested by MST
>> * Some minor code cleanups
>
> Everything else looks great to me. If the above is correct (or corrected)
>
> Acked-by: Willem de Bruijn <willemb@google.com>
>

Thanks for the in-depth review, Will! Very much appreciated! (bow)


Björn

> I did not read everything again, but applied both patchsets on top of
> bpf-next to do a diff of diffs. In case others find it useful:
>
>   https://github.com/wdebruij/linux/tree/bpf-next-afxdp-v1
>   https://github.com/wdebruij/linux/tree/bpf-next-afxdp-v2


* DSA
From: Dave Richards @ 2018-04-27 18:10 UTC
  To: netdev@vger.kernel.org

Hello,

I am building a prototype for a new product based on a Lanner, Inc.
embedded PC.  It is an Intel Celeron-based system with two host I210 GbE
chips connected to two MV88E6172 chips (one NIC to one switch).  Everything
appears to show up hardware-wise.

My question is, what is the next step?  How does DSA know which NICs are
intended to be masters?  Is this supposed to be auto-detected, or is this
knowledge supposed to be communicated explicitly?  Reading through the DSA
driver code I see that there is a check of the OF property list for the
device for a "label"/"cpu" property/value pair that needs to be present.
Who sets this and when?

I'm sorry for this basic question, but Google has not enlightened me.

	Thanks!

	Dave

Dave Richards
VP Software Engineering
Impinj, Inc
400 Fairview Ave N. #1200
Seattle, WA 
O: (206) 812-9863


* Re: Request for stable 4.14.x inclusion: net: don't call update_pmtu unconditionally
From: Thomas Deutschmann @ 2018-04-27 18:07 UTC
  To: Eddie Chapman, Greg KH; +Cc: stable, davem, nicolas.dichtel, netdev
In-Reply-To: <ae1401af-a400-f6de-658e-bae0b29c52e4@ehuk.net>

Hi Greg,

first, we need to cherry-pick another patch:
 
> From 52a589d51f1008f62569bf89e95b26221ee76690 Mon Sep 17 00:00:00 2001
> From: Xin Long <lucien.xin@gmail.com>
> Date: Mon, 25 Dec 2017 14:43:58 +0800
> Subject: [PATCH] geneve: update skb dst pmtu on tx path
> 
> Commit a93bf0ff4490 ("vxlan: update skb dst pmtu on tx path") has fixed
> a performance issue caused by the change of lower dev's mtu for vxlan.
> 
> The same thing needs to be done for geneve as well.
> 
> Note that geneve cannot adjust its mtu according to lower dev's mtu
> when creating it. The performance is very low later when netperfing
> over it without fixing the mtu manually. This patch could also avoid
> this issue.
> 
> Signed-off-by: Xin Long <lucien.xin@gmail.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>

Then you can apply the following backport. A backport is required
because v4.15 has commit 77552cfa39c48e695c39d0553afc8c6018e411ce
which rewrote

> skb_dst(skb2)->ops->update_pmtu(skb_dst(skb2), NULL, skb2, rel_info);

into

> 		skb_dst(skb2)->ops->update_pmtu(skb_dst(skb2), NULL, skb2,
> 						rel_info);

in net/ipv6/ip6_tunnel.c, and that commit is missing from 4.14.x:

From b2fb9a8178660f92c6ab29d3171bc44e2cb1b618 Mon Sep 17 00:00:00 2001
From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Thu, 25 Jan 2018 19:03:03 +0100
Subject: net: don't call update_pmtu unconditionally

commit f15ca723c1ebe6c1a06bc95fda6b62cd87b44559 upstream.

Some dst_ops (e.g. md_dst_ops) don't set this handler. This may result in:
"BUG: unable to handle kernel NULL pointer dereference at           (null)"

Let's add a helper to check if update_pmtu is available before calling it.

Fixes: 52a589d51f10 ("geneve: update skb dst pmtu on tx path")
Fixes: a93bf0ff4490 ("vxlan: update skb dst pmtu on tx path")
CC: Roman Kapl <code@rkapl.cz>
CC: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/infiniband/ulp/ipoib/ipoib_cm.c | 3 +--
 drivers/net/geneve.c                    | 4 ++--
 drivers/net/vxlan.c                     | 6 ++----
 include/net/dst.h                       | 8 ++++++++
 net/ipv4/ip_tunnel.c                    | 3 +--
 net/ipv4/ip_vti.c                       | 2 +-
 net/ipv6/ip6_tunnel.c                   | 5 ++---
 net/ipv6/ip6_vti.c                      | 2 +-
 net/ipv6/sit.c                          | 4 ++--
 9 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 7774654c2ccb..7a5ed5a5391e 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -1447,8 +1447,7 @@ void ipoib_cm_skb_too_long(struct net_device *dev, struct sk_buff *skb,
 	struct ipoib_dev_priv *priv = ipoib_priv(dev);
 	int e = skb_queue_empty(&priv->cm.skb_queue);
 
-	if (skb_dst(skb))
-		skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL, skb, mtu);
+	skb_dst_update_pmtu(skb, mtu);
 
 	skb_queue_tail(&priv->cm.skb_queue, skb);
 	if (e)
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 1b0fcf0b2afa..fbc825ac97ab 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -829,7 +829,7 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
 		int mtu = dst_mtu(&rt->dst) - sizeof(struct iphdr) -
 			  GENEVE_BASE_HLEN - info->options_len - 14;
 
-		skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL, skb, mtu);
+		skb_dst_update_pmtu(skb, mtu);
 	}
 
 	sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
@@ -875,7 +875,7 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
 		int mtu = dst_mtu(dst) - sizeof(struct ipv6hdr) -
 			  GENEVE_BASE_HLEN - info->options_len - 14;
 
-		skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL, skb, mtu);
+		skb_dst_update_pmtu(skb, mtu);
 	}
 
 	sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index bb44f0c6891f..3d9c5b35a4a7 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2158,8 +2158,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 		if (skb_dst(skb)) {
 			int mtu = dst_mtu(ndst) - VXLAN_HEADROOM;
 
-			skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL,
-						       skb, mtu);
+			skb_dst_update_pmtu(skb, mtu);
 		}
 
 		tos = ip_tunnel_ecn_encap(tos, old_iph, skb);
@@ -2200,8 +2199,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 		if (skb_dst(skb)) {
 			int mtu = dst_mtu(ndst) - VXLAN6_HEADROOM;
 
-			skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL,
-						       skb, mtu);
+			skb_dst_update_pmtu(skb, mtu);
 		}
 
 		tos = ip_tunnel_ecn_encap(tos, old_iph, skb);
diff --git a/include/net/dst.h b/include/net/dst.h
index 694c2e6ae618..ebfb4328fdb1 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -520,4 +520,12 @@ static inline struct xfrm_state *dst_xfrm(const struct dst_entry *dst)
 }
 #endif
 
+static inline void skb_dst_update_pmtu(struct sk_buff *skb, u32 mtu)
+{
+	struct dst_entry *dst = skb_dst(skb);
+
+	if (dst && dst->ops->update_pmtu)
+		dst->ops->update_pmtu(dst, NULL, skb, mtu);
+}
+
 #endif /* _NET_DST_H */
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index 13f7bbc0168d..a2fcc20774a6 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -521,8 +521,7 @@ static int tnl_update_pmtu(struct net_device *dev, struct sk_buff *skb,
 	else
 		mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;
 
-	if (skb_dst(skb))
-		skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL, skb, mtu);
+	skb_dst_update_pmtu(skb, mtu);
 
 	if (skb->protocol == htons(ETH_P_IP)) {
 		if (!skb_is_gso(skb) &&
diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index 89453cf62158..c9cd891f69c2 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -209,7 +209,7 @@ static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev,
 
 	mtu = dst_mtu(dst);
 	if (skb->len > mtu) {
-		skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL, skb, mtu);
+		skb_dst_update_pmtu(skb, mtu);
 		if (skb->protocol == htons(ETH_P_IP)) {
 			icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
 				  htonl(mtu));
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 7e11f6a811f5..d61a82fd4b60 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -652,7 +652,7 @@ ip4ip6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 		if (rel_info > dst_mtu(skb_dst(skb2)))
 			goto out;
 
-		skb_dst(skb2)->ops->update_pmtu(skb_dst(skb2), NULL, skb2, rel_info);
+		skb_dst_update_pmtu(skb2, rel_info);
 	}
 	if (rel_type == ICMP_REDIRECT)
 		skb_dst(skb2)->ops->redirect(skb_dst(skb2), NULL, skb2);
@@ -1141,8 +1141,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
 		mtu = 576;
 	}
 
-	if (skb_dst(skb) && !t->parms.collect_md)
-		skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL, skb, mtu);
+	skb_dst_update_pmtu(skb, mtu);
 	if (skb->len - t->tun_hlen - eth_hlen > mtu && !skb_is_gso(skb)) {
 		*pmtu = mtu;
 		err = -EMSGSIZE;
diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index 7c0f647b5195..2493a40bc4b1 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -486,7 +486,7 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
 
 	mtu = dst_mtu(dst);
 	if (!skb->ignore_df && skb->len > mtu) {
-		skb_dst(skb)->ops->update_pmtu(dst, NULL, skb, mtu);
+		skb_dst_update_pmtu(skb, mtu);
 
 		if (skb->protocol == htons(ETH_P_IPV6)) {
 			if (mtu < IPV6_MIN_MTU)
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index f03c1a562135..b35d8905794c 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -925,8 +925,8 @@ static netdev_tx_t ipip6_tunnel_xmit(struct sk_buff *skb,
 			df = 0;
 		}
 
-		if (tunnel->parms.iph.daddr && skb_dst(skb))
-			skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL, skb, mtu);
+		if (tunnel->parms.iph.daddr)
+			skb_dst_update_pmtu(skb, mtu);
 
 		if (skb->len > mtu && !skb_is_gso(skb)) {
 			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
-- 
2.17.0
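
Editorial sketch, not part of the patch: the helper added to
include/net/dst.h is an instance of guarding an optional callback in an ops
table. The user-space model below shows the same pattern with hypothetical
types (struct fake_dst, struct fake_dst_ops) and a hypothetical helper name;
it mirrors the shape of skb_dst_update_pmtu() but is not kernel code.

```c
#include <assert.h>
#include <stddef.h>

struct fake_dst;

/* Hypothetical ops table; update_pmtu may legitimately be NULL. */
struct fake_dst_ops {
	void (*update_pmtu)(struct fake_dst *dst, unsigned int mtu);
};

struct fake_dst {
	const struct fake_dst_ops *ops;
	unsigned int pmtu;
};

/* Example handler: simply records the new path MTU. */
static void record_pmtu(struct fake_dst *dst, unsigned int mtu)
{
	dst->pmtu = mtu;
}

/* Call the handler only when both the dst and the handler exist. */
static void dst_update_pmtu_checked(struct fake_dst *dst, unsigned int mtu)
{
	/* In this model a non-NULL dst always carries an ops table. */
	assert(!dst || dst->ops);

	if (dst && dst->ops->update_pmtu)
		dst->ops->update_pmtu(dst, mtu);
}
```

Callers can then invoke dst_update_pmtu_checked() unconditionally: a NULL
dst or an ops table without a handler becomes a no-op instead of a NULL
pointer dereference.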


* Re: [PATCH net-next] tcp: remove mss check in tcp_select_initial_window()
From: David Miller @ 2018-04-27 18:05 UTC
  To: weiwan; +Cc: netdev, ycheng, edumazet, soheil
In-Reply-To: <20180426165810.164524-1-tracywwnj@gmail.com>

From: Wei Wang <weiwan@google.com>
Date: Thu, 26 Apr 2018 09:58:10 -0700

> From: Wei Wang <weiwan@google.com>
> 
> In tcp_select_initial_window(), we only set rcv_wnd to
> tcp_default_init_rwnd() if current mss > (1 << wscale). Otherwise,
> rcv_wnd is kept at the full receive space of the socket which is a
> value way larger than tcp_default_init_rwnd().
> With larger initial rcv_wnd value, receive buffer autotuning logic
> takes longer to kick in and increase the receive buffer.
> 
> In a TCP throughput test where receiver has rmem[2] set to 125MB
> (wscale is 11), we see the connection gets recvbuf limited at the
> beginning of the connection and gets less throughput overall.
> 
> Signed-off-by: Wei Wang <weiwan@google.com>
> Acked-by: Eric Dumazet <edumazet@google.com>
> Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
> Acked-by: Yuchung Cheng <ycheng@google.com>

Very nice commit message.

Applied.
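
Editorial sketch, not part of the thread: the arithmetic behind the quoted
commit message can be worked through in user space. With wscale 11 the old
condition compares the MSS against 1 << 11 = 2048, so a typical 1460-byte
MSS never triggers the cap. The function names, the full_space parameter and
the default_init_rwnd() model (twice an assumed initial cwnd of 10 segments,
scaled down for MSS above 1460) are assumptions made for illustration, not a
copy of the kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define INIT_CWND_SEGMENTS 10u /* assumed initial congestion window */

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Simplified default initial receive window, in segments. */
static uint32_t default_init_rwnd(uint32_t mss)
{
	uint32_t init_rwnd = INIT_CWND_SEGMENTS * 2;

	if (mss > 1460) {
		init_rwnd = (1460 * init_rwnd) / mss;
		if (init_rwnd < 2)
			init_rwnd = 2;
	}
	return init_rwnd;
}

/* Before the patch: the cap applies only when mss > (1 << wscale). */
static uint32_t init_window_old(uint32_t full_space, uint32_t mss,
				unsigned int wscale)
{
	assert(wscale <= 14);

	if (mss > (1u << wscale))
		return min_u32(full_space, default_init_rwnd(mss) * mss);
	return full_space;
}

/* After the patch (as the subject suggests): the cap always applies. */
static uint32_t init_window_new(uint32_t full_space, uint32_t mss)
{
	return min_u32(full_space, default_init_rwnd(mss) * mss);
}
```

Under these assumptions, a 1460-byte MSS with wscale 11 leaves the old
window at the full receive space, whereas the unconditional cap yields
20 * 1460 = 29200 bytes.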


* Re: [PATCH net-next v3 0/4] fixes from 2018-04-17 - v3
From: David Miller @ 2018-04-27 18:03 UTC
  To: ubraun; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun
In-Reply-To: <20180426151823.78967-1-ubraun@linux.ibm.com>

From: Ursula Braun <ubraun@linux.ibm.com>
Date: Thu, 26 Apr 2018 17:18:19 +0200

> Version 3 changes
>    * no deferring of setsockopts TCP_NODELAY and TCP_CORK anymore
>    * allow fallback for some sockopts eliminating SMC usage
>    * when setting TCP_NODELAY always enforce data transmission
>      (not only together with corked data)

This looks a lot better than what you were doing before.

Series applied, thanks.


* Re: [net-next] net: intel: Cleanup the copyright/license headers
From: David Miller @ 2018-04-27 18:00 UTC
  To: jeffrey.t.kirsher; +Cc: netdev, nhorman, sassmann, jogreene
In-Reply-To: <20180426150809.11482-1-jeffrey.t.kirsher@intel.com>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Thu, 26 Apr 2018 08:08:09 -0700

> After many years of having a ~30 line copyright and license header to our
> source files, we are finally able to reduce that to one line with the
> advent of the SPDX identifier.
> 
> Also caught a few files missing the SPDX license identifier, so fixed
> them up.
> 
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Acked-by: Shannon Nelson <shannon.nelson@oracle.com>
> Acked-by: Richard Cochran <richardcochran@gmail.com>
> Tested-by: Andrew Bowers <andrewx.bowers@intel.com>

Applied, thanks.


* Re: [net-next] ipv6: sr: Extract the right key values for "seg6_make_flowlabel"
From: David Miller @ 2018-04-27 17:59 UTC
  To: amsalam20; +Cc: dav.lebrun, netdev, linux-kernel
In-Reply-To: <1524751871-1353-1-git-send-email-amsalam20@gmail.com>

From: Ahmed Abdelsalam <amsalam20@gmail.com>
Date: Thu, 26 Apr 2018 16:11:11 +0200

> @@ -119,6 +119,9 @@ int seg6_do_srh_encap(struct sk_buff *skb, struct ipv6_sr_hdr *osrh, int proto)
>  	int hdrlen, tot_len, err;
>  	__be32 flowlabel;
>  
> +	inner_hdr = ipv6_hdr(skb);

You have to make this assignment after, not before, the skb_cow_header()
call.  Otherwise this pointer can end up pointing to freed memory.
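
Editorial sketch, not part of the review: the hazard is the general one of
caching a pointer into a buffer that a later call may reallocate. The
user-space model below uses hypothetical names (struct fake_buf,
buf_cow_head, buf_hdr_after_cow) and plain malloc/free rather than the
kernel's skb API; it only shows the safe ordering, which is to grow the
buffer first and derive the header pointer afterwards.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packet buffer: the header starts at head + headroom. */
struct fake_buf {
	unsigned char *head; /* start of the allocation */
	size_t headroom;     /* bytes in front of the header */
	size_t len;          /* bytes of packet data */
};

static unsigned char *buf_hdr(const struct fake_buf *b)
{
	return b->head + b->headroom;
}

/* Ensure at least `need` bytes of headroom. This may move the allocation,
 * which invalidates every pointer previously derived from b->head. */
static int buf_cow_head(struct fake_buf *b, size_t need)
{
	unsigned char *fresh;

	assert(b && b->head);

	if (b->headroom >= need)
		return 0;

	fresh = malloc(need + b->len);
	if (!fresh)
		return -1;
	memcpy(fresh + need, buf_hdr(b), b->len);
	free(b->head);
	b->head = fresh;
	b->headroom = need;
	return 0;
}

/* Correct ordering: grow first, then take the header pointer. */
static unsigned char *buf_hdr_after_cow(struct fake_buf *b, size_t need)
{
	if (buf_cow_head(b, need))
		return NULL;
	return buf_hdr(b);
}
```

A pointer obtained from buf_hdr() before buf_cow_head() would still refer to
the old, freed allocation whenever the buffer had to be moved.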


* Re: [PATCH 2/2] bpf: btf: remove a couple conditions
From: Martin KaFai Lau @ 2018-04-27 17:55 UTC
  To: Dan Carpenter
  Cc: Alexei Starovoitov, Daniel Borkmann, netdev, linux-kernel,
	kernel-janitors
In-Reply-To: <20180427172023.6japncdd3nbqauzn@kafai-mbp>

On Fri, Apr 27, 2018 at 10:20:25AM -0700, Martin KaFai Lau wrote:
> On Fri, Apr 27, 2018 at 05:04:59PM +0300, Dan Carpenter wrote:
> > We know "err" is zero so we can remove these and pull the code in one
> > indent level.
> > 
> > Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> Thanks for the simplification!
> 
> Acked-by: Martin KaFai Lau <kafai@fb.com>
btw, it should be for bpf-next.  Please tag the subject with bpf-next when
you respin. Thanks!

> 
> > ---
> > This applies to the BPF tree (linux-next)
> > 
> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > index e631b6fd60d3..7cb0905f37c2 100644
> > --- a/kernel/bpf/btf.c
> > +++ b/kernel/bpf/btf.c
> > @@ -1973,16 +1973,14 @@ static struct btf *btf_parse(void __user *btf_data, u32 btf_data_size,
> >  	if (err)
> >  		goto errout;
> >  
> > -	if (!err && log->level && bpf_verifier_log_full(log)) {
> > +	if (log->level && bpf_verifier_log_full(log)) {
> >  		err = -ENOSPC;
> >  		goto errout;
> >  	}
> >  
> > -	if (!err) {
> > -		btf_verifier_env_free(env);
> > -		btf_get(btf);
> > -		return btf;
> > -	}
> > +	btf_verifier_env_free(env);
> > +	btf_get(btf);
> > +	return btf;
> >  
> >  errout:
> >  	btf_verifier_env_free(env);


* Re: [PATCH net-next v9 2/4] net: Introduce generic failover module
From: Jiri Pirko @ 2018-04-27 17:53 UTC
  To: Sridhar Samudrala
  Cc: mst, stephen, davem, netdev, virtualization, virtio-dev,
	jesse.brandeburg, alexander.h.duyck, kubakici, jasowang,
	loseweigh, aaron.f.brown
In-Reply-To: <1524848820-42258-3-git-send-email-sridhar.samudrala@intel.com>

Fri, Apr 27, 2018 at 07:06:58PM CEST, sridhar.samudrala@intel.com wrote:
>This provides a generic interface for paravirtual drivers to listen
>for netdev register/unregister/link change events from pci ethernet
>devices with the same MAC and takeover their datapath. The notifier and
>event handling code is based on the existing netvsc implementation.
>
>It exposes 2 sets of interfaces to the paravirtual drivers.
>1. For paravirtual drivers like virtio_net that use 3 netdev model, the
>   failover module provides interfaces to create/destroy additional
>   master netdev and all the slave events are managed internally.
>        net_failover_create()
>        net_failover_destroy()
>   A failover netdev is created that acts as a master device and controls 2
>   slave devices. The original virtio_net netdev is registered as 'standby'
>   netdev and a passthru/vf device with the same MAC gets registered as
>   'primary' netdev. Both 'standby' and 'primary' netdevs are associated
>   with the same 'pci' device.  The user accesses the network interface via
>   'failover' netdev. The 'failover' netdev chooses 'primary' netdev as
>   default for transmits when it is available with link up and running.
>2. For existing netvsc driver that uses 2 netdev model, no master netdev
>   is created. The paravirtual driver registers each instance of netvsc
>   as a 'failover' netdev  along with a set of ops to manage the slave
>   events. There is no 'standby' netdev in this model. A passthru/vf device
>   with the same MAC gets registered as 'primary' netdev.
>        net_failover_register()
>        net_failover_unregister()
>
>Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>---
> include/linux/netdevice.h  |  16 +
> include/net/net_failover.h |  62 ++++
> net/Kconfig                |  10 +
> net/core/Makefile          |   1 +
> net/core/net_failover.c    | 892 +++++++++++++++++++++++++++++++++++++++++++++
> 5 files changed, 981 insertions(+)
> create mode 100644 include/net/net_failover.h
> create mode 100644 net/core/net_failover.c

checkpatch says:

_exportax/0002-net-Introduce-generic-failover-module.patch
----------------------------------------------------------
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#92: 
new file mode 100644

Please add an entry to the MAINTAINERS file.


* Re: [PATCH net-next] net: Fix coccinelle warning
From: David Miller @ 2018-04-27 17:53 UTC
  To: ktkhai; +Cc: netdev, lkp
In-Reply-To: <152474505955.21078.9976470400033894421.stgit@localhost.localdomain>

From: Kirill Tkhai <ktkhai@virtuozzo.com>
Date: Thu, 26 Apr 2018 15:18:38 +0300

> kbuild test robot says:
> 
>   >coccinelle warnings: (new ones prefixed by >>)
>   >>> net/core/dev.c:1588:2-3: Unneeded semicolon
> 
> So, let's remove it.
> 
> Reported-by: kbuild test robot <lkp@intel.com>
> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>

Applied, thank you.


* Re: [PATCH net-next v9 0/4] Enable virtio_net to act as a standby for a passthru device
From: Samudrala, Sridhar @ 2018-04-27 17:53 UTC
  To: Jiri Pirko
  Cc: mst, stephen, davem, netdev, virtualization, virtio-dev,
	jesse.brandeburg, alexander.h.duyck, kubakici, jasowang,
	loseweigh, aaron.f.brown
In-Reply-To: <20180427174523.GE5632@nanopsycho.orion>

On 4/27/2018 10:45 AM, Jiri Pirko wrote:
> Fri, Apr 27, 2018 at 07:06:56PM CEST, sridhar.samudrala@intel.com wrote:
>> v9:
>> Select NET_FAILOVER automatically when VIRTIO_NET/HYPERV_NET
>> are enabled. (stephen)
>>
>> Tested live migration with virtio-net/AVF(i40evf) configured in
>> failover mode while running iperf in background.
>> Build tested netvsc module.
>>
>> The main motivation for this patch is to enable cloud service providers
>> to provide an accelerated datapath to virtio-net enabled VMs in a
>> transparent manner with no/minimal guest userspace changes. This also
>> enables hypervisor controlled live migration to be supported with VMs that
>> have direct attached SR-IOV VF devices.
>>
>> Patch 1 introduces a new feature bit VIRTIO_NET_F_STANDBY that can be
>> used by hypervisor to indicate that virtio_net interface should act as
>> a standby for another device with the same MAC address.
>>
>> Patch 2 introduces a failover module that provides a generic interface for
>> paravirtual drivers to listen for netdev register/unregister/link change
>> events from pci ethernet devices with the same MAC and takeover their
>> datapath. The notifier and event handling code is based on the existing
>> netvsc implementation. It provides 2 sets of interfaces to paravirtual
>> drivers to support 2-netdev(netvsc) and 3-netdev(virtio_net) models.
>>
>> Patch 3 extends virtio_net to use alternate datapath when available and
>> registered. When STANDBY feature is enabled, virtio_net driver creates
>> an additional 'failover' netdev that acts as a master device and controls
>> 2 slave devices.  The original virtio_net netdev is registered as
>> 'standby' netdev and a passthru/vf device with the same MAC gets
>> registered as 'primary' netdev. Both 'standby' and 'primary' netdevs are
>> associated with the same 'pci' device.  The user accesses the network
>> interface via 'failover' netdev. The 'failover' netdev chooses 'primary'
>> netdev as default for transmits when it is available with link up and
>> running.
>>
>> Patch 4 refactors netvsc to use the registration/notification framework
>> supported by failover module.
>>
>> As this patch series is initially focusing on usecases where hypervisor
>> fully controls the VM networking and the guest is not expected to directly
>> configure any hardware settings, it doesn't expose all the ndo/ethtool ops
>> that are supported by virtio_net at this time. To support additional usecases,
>> it should be possible to enable additional ops later by caching the state
>> in virtio netdev and replaying when the 'primary' netdev gets registered.
>>
>> The hypervisor needs to enable only one datapath at any time so that packets
>> don't get looped back to the VM over the other datapath. When a VF is
>> plugged, the virtio datapath link state can be marked as down.
>> At the time of live migration, the hypervisor needs to unplug the VF device
>> from the guest on the source host and reset the MAC filter of the VF to
>> initiate failover of datapath to virtio before starting the migration. After
>> the migration is completed, the destination hypervisor sets the MAC filter
>> on the VF and plugs it back to the guest to switch over to VF datapath.
>>
>> This patch is based on the discussion initiated by Jesse on this thread.
>> https://marc.info/?l=linux-virtualization&m=151189725224231&w=2
>
> No changes in v9?

I listed v9 updates at the start of the message.

v9:
Select NET_FAILOVER automatically when VIRTIO_NET/HYPERV_NET
are enabled. (stephen)

Tested live migration with virtio-net/AVF(i40evf) configured in
failover mode while running iperf in background.
Build tested netvsc module.



>
>> v8:
>> - Made the failover management routines more robust by updating the feature
>>   bits/other fields in the failover netdev when slave netdevs are
>>   registered/unregistered. (mst)
>> - added support for handling vlans.
>> - Limited the changes in netvsc to only use the notifier/event/lookups
>>   from the failover module. The slave register/unregister/link-change
>>   handlers are only updated to use the getbymac routine to get the
>>   upper netdev. There is no change in their functionality. (stephen)
>> - renamed structs/function/file names to use net_failover prefix. (mst)
>>
>> v7
>> - Rename 'bypass/active/backup' terminology with 'failover/primary/standby'
>>   (jiri, mst)
>> - re-arranged dev_open() and dev_set_mtu() calls in the register routines
>>   so that they don't get called for 2-netdev model. (stephen)
>> - fixed select_queue() routine to do queue selection based on VF if it is
>>   registered as primary. (stephen)
>> -  minor bugfixes
>>
>> v6 RFC:
>>   Simplified virtio_net changes by moving all the ndo_ops of the
>>   bypass_netdev and create/destroy of bypass_netdev to 'bypass' module.
>>   avoided 2 phase registration(driver + instances).
>>   introduced IFF_BYPASS/IFF_BYPASS_SLAVE dev->priv_flags
>>   replaced mutex with a spinlock
>>
>> v5 RFC:
>>   Based on Jiri's comments, moved the common functionality to a 'bypass'
>>   module so that the same notifier and event handlers to handle child
>>   register/unregister/link change events can be shared between virtio_net
>>   and netvsc.
>>   Improved error handling based on Siwei's comments.
>> v4:
>> - Based on the review comments on the v3 version of the RFC patch and
>>   Jakub's suggestion for the naming issue with 3 netdev solution,
>>   proposed 3 netdev in-driver bonding solution for virtio-net.
>> v3 RFC:
>> - Introduced 3 netdev model and pointed out a couple of issues with
>>   that model and proposed 2 netdev model to avoid these issues.
>> - Removed broadcast/multicast optimization and only use virtio as
>>   backup path when VF is unplugged.
>> v2 RFC:
>> - Changed VIRTIO_NET_F_MASTER to VIRTIO_NET_F_BACKUP (mst)
>> - made a small change to the virtio-net xmit path to only use VF datapath
>>   for unicasts. Broadcasts/multicasts use virtio datapath. This avoids
>>   east-west broadcasts to go over the PCI link.
>> - added support for the feature bit in qemu
>>
>> Sridhar Samudrala (4):
>>   virtio_net: Introduce VIRTIO_NET_F_STANDBY feature bit
>>   net: Introduce generic failover module
>>   virtio_net: Extend virtio to use VF datapath when available
>>   netvsc: refactor notifier/event handling code to use the failover
>>     framework
>>
>> drivers/net/Kconfig             |   1 +
>> drivers/net/hyperv/Kconfig      |   1 +
>> drivers/net/hyperv/hyperv_net.h |   2 +
>> drivers/net/hyperv/netvsc_drv.c | 134 ++----
>> drivers/net/virtio_net.c        |  37 +-
>> include/linux/netdevice.h       |  16 +
>> include/net/net_failover.h      |  62 +++
>> include/uapi/linux/virtio_net.h |   3 +
>> net/Kconfig                     |  10 +
>> net/core/Makefile               |   1 +
>> net/core/net_failover.c         | 892 ++++++++++++++++++++++++++++++++++++++++
>> 11 files changed, 1046 insertions(+), 113 deletions(-)
>> create mode 100644 include/net/net_failover.h
>> create mode 100644 net/core/net_failover.c
>>
>> -- 
>> 2.14.3

^ permalink raw reply

* Re: [PATCH net-next] geneve: fix build with modular IPV6
From: David Miller @ 2018-04-27 17:52 UTC (permalink / raw)
  To: tobias.regnery; +Cc: netdev, linux-kernel, alexey.kodanev
In-Reply-To: <20180426103636.16113-1-tobias.regnery@gmail.com>

From: Tobias Regnery <tobias.regnery@gmail.com>
Date: Thu, 26 Apr 2018 12:36:36 +0200

> Commit c40e89fd358e ("geneve: configure MTU based on a lower device") added
> an IS_ENABLED(CONFIG_IPV6) to geneve, leading to the following link error
> with CONFIG_GENEVE=y and CONFIG_IPV6=m:
> 
> drivers/net/geneve.o: In function `geneve_link_config':
> geneve.c:(.text+0x14c): undefined reference to `rt6_lookup'
> 
> Fix this by adding a Kconfig dependency and forcing GENEVE to be a module
> when IPV6 is a module.
> 
> Fixes: c40e89fd358e ("geneve: configure MTU based on a lower device")
> Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com>

Applied, thank you.

^ permalink raw reply

* Re: pull-request: wireless-drivers 2018-04-26
From: David Miller @ 2018-04-27 17:50 UTC (permalink / raw)
  To: kvalo; +Cc: linux-wireless, netdev, linux-kernel
In-Reply-To: <87h8ny6ztl.fsf@kamboji.qca.qualcomm.com>

From: Kalle Valo <kvalo@codeaurora.org>
Date: Thu, 26 Apr 2018 13:12:54 +0300

> here's a pull request to net tree, more info below. Please let me know
> if you have any problems.

Pulled, thanks Kalle.

^ permalink raw reply

* Re: [PATCH net-next v9 0/4] Enable virtio_net to act as a standby for a passthru device
From: Jiri Pirko @ 2018-04-27 17:45 UTC (permalink / raw)
  To: Sridhar Samudrala
  Cc: mst, stephen, davem, netdev, virtualization, virtio-dev,
	jesse.brandeburg, alexander.h.duyck, kubakici, jasowang,
	loseweigh, aaron.f.brown
In-Reply-To: <1524848820-42258-1-git-send-email-sridhar.samudrala@intel.com>

Fri, Apr 27, 2018 at 07:06:56PM CEST, sridhar.samudrala@intel.com wrote:
>v9:
>Select NET_FAILOVER automatically when VIRTIO_NET/HYPERV_NET 
>are enabled. (stephen)
>
>Tested live migration with virtio-net/AVF(i40evf) configured in 
>failover mode while running iperf in background.
>Build tested netvsc module.
>
>The main motivation for this patch is to enable cloud service providers
>to provide an accelerated datapath to virtio-net enabled VMs in a 
>transparent manner with no/minimal guest userspace changes. This also
>enables hypervisor controlled live migration to be supported with VMs that
>have direct attached SR-IOV VF devices.
>
>Patch 1 introduces a new feature bit VIRTIO_NET_F_STANDBY that can be
>used by hypervisor to indicate that virtio_net interface should act as
>a standby for another device with the same MAC address.
>
>Patch 2 introduces a failover module that provides a generic interface for 
>paravirtual drivers to listen for netdev register/unregister/link change
>events from pci ethernet devices with the same MAC and takeover their
>datapath. The notifier and event handling code is based on the existing
>netvsc implementation. It provides 2 sets of interfaces to paravirtual 
>drivers to support 2-netdev(netvsc) and 3-netdev(virtio_net) models.
>
>Patch 3 extends virtio_net to use alternate datapath when available and
>registered. When STANDBY feature is enabled, virtio_net driver creates
>an additional 'failover' netdev that acts as a master device and controls
>2 slave devices.  The original virtio_net netdev is registered as
>'standby' netdev and a passthru/vf device with the same MAC gets
>registered as 'primary' netdev. Both 'standby' and 'primary' netdevs are
>associated with the same 'pci' device.  The user accesses the network
>interface via 'failover' netdev. The 'failover' netdev chooses 'primary'
>netdev as default for transmits when it is available with link up and
>running.
>
>Patch 4 refactors netvsc to use the registration/notification framework
>supported by failover module.
>
>As this patch series is initially focusing on usecases where hypervisor 
>fully controls the VM networking and the guest is not expected to directly 
>configure any hardware settings, it doesn't expose all the ndo/ethtool ops
>that are supported by virtio_net at this time. To support additional usecases,
>it should be possible to enable additional ops later by caching the state
>in virtio netdev and replaying when the 'primary' netdev gets registered. 
> 
>The hypervisor needs to enable only one datapath at any time so that packets
>don't get looped back to the VM over the other datapath. When a VF is
>plugged, the virtio datapath link state can be marked as down.
>At the time of live migration, the hypervisor needs to unplug the VF device
>from the guest on the source host and reset the MAC filter of the VF to
>initiate failover of datapath to virtio before starting the migration. After
>the migration is completed, the destination hypervisor sets the MAC filter
>on the VF and plugs it back to the guest to switch over to VF datapath.
>
>This patch is based on the discussion initiated by Jesse on this thread.
>https://marc.info/?l=linux-virtualization&m=151189725224231&w=2


No changes in v9?


>
>v8:
>- Made the failover management routines more robust by updating the feature
>  bits/other fields in the failover netdev when slave netdevs are 
>  registered/unregistered. (mst)
>- added support for handling vlans.
>- Limited the changes in netvsc to only use the notifier/event/lookups
>  from the failover module. The slave register/unregister/link-change 
>  handlers are only updated to use the getbymac routine to get the 
>  upper netdev. There is no change in their functionality. (stephen)
>- renamed structs/function/file names to use net_failover prefix. (mst)
>
>v7
>- Rename 'bypass/active/backup' terminology with 'failover/primary/standby'
>  (jiri, mst)
>- re-arranged dev_open() and dev_set_mtu() calls in the register routines
>  so that they don't get called for 2-netdev model. (stephen)
>- fixed select_queue() routine to do queue selection based on VF if it is
>  registered as primary. (stephen)
>-  minor bugfixes
>
>v6 RFC:
>  Simplified virtio_net changes by moving all the ndo_ops of the 
>  bypass_netdev and create/destroy of bypass_netdev to 'bypass' module.
>  avoided 2 phase registration(driver + instances).
>  introduced IFF_BYPASS/IFF_BYPASS_SLAVE dev->priv_flags 
>  replaced mutex with a spinlock
>
>v5 RFC:
>  Based on Jiri's comments, moved the common functionality to a 'bypass'
>  module so that the same notifier and event handlers to handle child
>  register/unregister/link change events can be shared between virtio_net
>  and netvsc.
>  Improved error handling based on Siwei's comments.
>v4:
>- Based on the review comments on the v3 version of the RFC patch and
>  Jakub's suggestion for the naming issue with 3 netdev solution,
>  proposed 3 netdev in-driver bonding solution for virtio-net.
>v3 RFC:
>- Introduced 3 netdev model and pointed out a couple of issues with
>  that model and proposed 2 netdev model to avoid these issues.
>- Removed broadcast/multicast optimization and only use virtio as
>  backup path when VF is unplugged.
>v2 RFC:
>- Changed VIRTIO_NET_F_MASTER to VIRTIO_NET_F_BACKUP (mst)
>- made a small change to the virtio-net xmit path to only use VF datapath
>  for unicasts. Broadcasts/multicasts use virtio datapath. This avoids
>  east-west broadcasts to go over the PCI link.
>- added support for the feature bit in qemu
>
>Sridhar Samudrala (4):
>  virtio_net: Introduce VIRTIO_NET_F_STANDBY feature bit
>  net: Introduce generic failover module
>  virtio_net: Extend virtio to use VF datapath when available
>  netvsc: refactor notifier/event handling code to use the failover
>    framework
>
> drivers/net/Kconfig             |   1 +
> drivers/net/hyperv/Kconfig      |   1 +
> drivers/net/hyperv/hyperv_net.h |   2 +
> drivers/net/hyperv/netvsc_drv.c | 134 ++----
> drivers/net/virtio_net.c        |  37 +-
> include/linux/netdevice.h       |  16 +
> include/net/net_failover.h      |  62 +++
> include/uapi/linux/virtio_net.h |   3 +
> net/Kconfig                     |  10 +
> net/core/Makefile               |   1 +
> net/core/net_failover.c         | 892 ++++++++++++++++++++++++++++++++++++++++
> 11 files changed, 1046 insertions(+), 113 deletions(-)
> create mode 100644 include/net/net_failover.h
> create mode 100644 net/core/net_failover.c
>
>-- 
>2.14.3
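
The transmit rule described in the cover letter (the 'failover' netdev prefers the 'primary' netdev when it is available with link up and running) can be sketched in a few lines of C. This is only an illustration: struct slave_dev, slave_usable() and failover_pick_xmit_dev() are hypothetical names, not code from the series, and falling back to 'standby' when 'primary' is unusable is an assumption drawn from the naming.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a slave netdev; not the kernel's net_device. */
struct slave_dev {
	const char *name;
	bool running;    /* device is up and running */
	bool carrier_ok; /* link is up */
};

/* A slave can carry traffic only if it exists, is running and has link. */
static bool slave_usable(const struct slave_dev *dev)
{
	return dev && dev->running && dev->carrier_ok;
}

/*
 * Prefer 'primary' (the VF) whenever it is usable, otherwise fall back
 * to 'standby' (virtio). Returns NULL when neither slave is usable.
 */
static const struct slave_dev *
failover_pick_xmit_dev(const struct slave_dev *primary,
		       const struct slave_dev *standby)
{
	if (slave_usable(primary))
		return primary;
	if (slave_usable(standby))
		return standby;
	return NULL;
}
```

With this shape, unplugging the VF (primary becomes NULL) or dropping its link moves transmits to the standby device without the caller having to know which slave is active.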

^ permalink raw reply

* Re: [PATCH net] mlxsw: spectrum_switchdev: Do not remove mrouter port from MDB's ports list
From: David Miller @ 2018-04-27 17:45 UTC (permalink / raw)
  To: idosch; +Cc: netdev, jiri, nogahf, colin.king, mlxsw
In-Reply-To: <20180426084629.20978-1-idosch@mellanox.com>

From: Ido Schimmel <idosch@mellanox.com>
Date: Thu, 26 Apr 2018 11:46:29 +0300

> When IGMP snooping is enabled on a bridge, traffic forwarded by an MDB
> entry should be sent both to the member ports in the MDB's ports list and
> to mrouter ports.
> 
> In case a port needs to be removed from an MDB's ports list, but this
> port is also configured as an mrouter port, then do not update the
> device so that it will continue to forward traffic through that port.
> 
> Fix a copy-paste error that checked that IGMP snooping is enabled twice
> instead of checking the port's mrouter state.
> 
> Fixes: ded711c87a04 ("mlxsw: spectrum_switchdev: Consider mrouter status for mdb changes")
> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> Reported-by: Colin King <colin.king@canonical.com>
> Reviewed-by: Nogah Frankel <nogahf@mellanox.com>

Applied and queued up for -stable, thanks.
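
For readers skimming the archive, the class of bug described above (the same condition tested twice where the second test should have looked at the mrouter state) looks roughly like the sketch below. struct port_state and both helpers are hypothetical and only mirror the commit message; the driver's real structures and function names differ.

```c
#include <stdbool.h>

/* Hypothetical per-port state; the driver's real structures differ. */
struct port_state {
	bool snooping_enabled; /* IGMP snooping is enabled on the bridge */
	bool is_mrouter;       /* port is configured as an mrouter port */
};

/*
 * Copy-paste shape: snooping is checked twice, so the mrouter state is
 * never consulted and the answer depends on snooping alone.
 */
static bool keep_forwarding_buggy(const struct port_state *p)
{
	return p->snooping_enabled && p->snooping_enabled;
}

/*
 * Intended shape: leave the device untouched (keep forwarding) only when
 * snooping is enabled AND the port is also an mrouter port.
 */
static bool keep_forwarding_fixed(const struct port_state *p)
{
	return p->snooping_enabled && p->is_mrouter;
}
```

The two helpers disagree exactly for a port with snooping enabled that is not an mrouter port, which is the case the duplicated check gets wrong.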

^ permalink raw reply

* Re: [PATCH net-next 00/13] s390/net: updates 2018-04-26
From: David Miller @ 2018-04-27 17:41 UTC (permalink / raw)
  To: jwi; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun
In-Reply-To: <20180426074224.100432-1-jwi@linux.ibm.com>

From: Julian Wiedmann <jwi@linux.ibm.com>
Date: Thu, 26 Apr 2018 09:42:11 +0200

> please apply the following patches to net-next. There's the usual
> cleanups & small improvements, and Kittipon adds HW offload support
> for IPv6 checksumming.

Series applied, thanks Julian.

^ permalink raw reply

* Re: [PATCH net] nfp: don't depend on eth_tbl being available
From: Jakub Kicinski @ 2018-04-27 17:38 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, oss-drivers
In-Reply-To: <20180427.111556.1931814913979522341.davem@davemloft.net>

On Fri, 27 Apr 2018 11:15:56 -0400 (EDT), David Miller wrote:
> From: Jakub Kicinski <jakub.kicinski@netronome.com>
> Date: Wed, 25 Apr 2018 11:21:08 -0700
> 
> > For a very old generation of the management FW, the Ethernet port
> > information table may theoretically not be available.  This in
> > turn will cause the nfp_port structures to not be allocated.
> > 
> > Make sure we don't crash the kernel when there is no eth_tbl:
> > 
> > RIP: 0010:nfp_net_pci_probe+0xf2/0xb40 [nfp]
> > ...
> > Call Trace:
> >   nfp_pci_probe+0x6de/0xab0 [nfp]
> >   local_pci_probe+0x47/0xa0
> >   work_for_cpu_fn+0x1a/0x30
> >   process_one_work+0x1de/0x3e0
> > 
> > Found while working with broken/development version of management FW.
> > 
> > Fixes: a5950182c00e ("nfp: map mac_stats and vf_cfg BARs")
> > Fixes: 93da7d9660ee ("nfp: provide nfp_port to of nfp_net_get_mac_addr()")
> > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> > Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>  
> 
> Applied, thanks Jakub.

Thank you!

> Do you want this queued up for -stable?  It seems borderline, at best, to me.

Agreed, I think we don't need -stable for now.  This should never happen
(tm) in production and there has been some churn around this code.  I
don't think it's worth a backport.

^ permalink raw reply

* Re: [PATCH V8 net-next 00/14] TLS offload, netdev & MLX5 support
From: David Miller @ 2018-04-27 17:37 UTC (permalink / raw)
  To: borisp; +Cc: netdev, saeedm, davejwatson, ktkhai
In-Reply-To: <20180427.133412.16008743935367755.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Fri, 27 Apr 2018 13:34:12 -0400 (EDT)

> From: David Miller <davem@davemloft.net>
> Date: Fri, 27 Apr 2018 13:31:47 -0400 (EDT)
> 
>> From: Boris Pismenny <borisp@mellanox.com>
>> Date: Thu, 26 Apr 2018 09:37:08 +0300
>> 
>>> This series adds a generic infrastructure to offload TLS crypto to
>>> network devices. It enables the kernel TLS socket to skip encryption and
>>> authentication operations on the transmit side of the data path, leaving
>>> those computationally expensive operations to the NIC.
>>  ...
>> 
>> Series applied.
>> 
>> Assuming no build test issues I will push it out to net-next.
> 
> No dice, I'm reverting.
> 
> net/tls/tls_sw.c: In function ‘tls_set_sw_offload’:
> net/tls/tls_sw.c:1192:22: warning: ‘sw_ctx_rx’ may be used uninitialized in this function [-Wmaybe-uninitialized]
>    sw_ctx_rx->sk_poll = sk->sk_socket->ops->poll;
>    ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~

Also in one of the later patches, there are new files added with empty
lines at the end.  GIT warns about this and I have to fix it up each
and every time.

So if you could fix that too I would appreciate it.

Thanks.

^ permalink raw reply

* Re: [PATCH net] sctp: clear the new asoc's stream outcnt in sctp_stream_update
From: David Miller @ 2018-04-27 17:35 UTC (permalink / raw)
  To: lucien.xin; +Cc: netdev, linux-sctp, marcelo.leitner, nhorman
In-Reply-To: <7a1180e29789ab0aa339ae8b456a100520ffcdc5.1524727304.git.lucien.xin@gmail.com>

From: Xin Long <lucien.xin@gmail.com>
Date: Thu, 26 Apr 2018 15:21:44 +0800

> When processing a duplicate cookie-echo chunk, sctp moves the new
> temp asoc's stream out/in into the old asoc, and later frees this
> new temp asoc.
> 
> But now after this move, the new temp asoc's stream->outcnt is not
> cleared while stream->out is set to NULL, which would cause the same
> crash as the one fixed in commit 79d0895140e9 ("sctp: fix error
> path in sctp_stream_init") when freeing this asoc later.
> 
> This fix is to clear this outcnt in sctp_stream_update.
> 
> Fixes: f952be79cebd ("sctp: introduce struct sctp_stream_out_ext")
> Reported-by: Jianwen Ji <jiji@redhat.com>
> Signed-off-by: Xin Long <lucien.xin@gmail.com>

Applied and queued up for -stable.
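
The invariant behind this fix is that a count must never outlive the array it describes. The sketch below is a minimal userspace model of that idea, not the sctp code: struct stream, stream_update() and stream_free() are hypothetical stand-ins, and the teardown loop is only an assumed example of code that trusts outcnt.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's sctp_stream structures. */
struct stream_out {
	int ext; /* placeholder for per-stream state */
};

struct stream {
	struct stream_out *out; /* array of outcnt entries, or NULL */
	unsigned short outcnt;
};

/* Move ownership of the 'out' array from src (temp) to dst (old). */
static void stream_update(struct stream *dst, struct stream *src)
{
	free(dst->out);
	dst->out = src->out;
	dst->outcnt = src->outcnt;

	src->out = NULL;
	src->outcnt = 0; /* the point of the fix: count follows the pointer */
}

/*
 * Teardown walks outcnt entries. With out == NULL and a stale non-zero
 * outcnt this loop would dereference NULL. Returns entries visited.
 */
static int stream_free(struct stream *s)
{
	int visited = 0;

	for (unsigned short i = 0; i < s->outcnt; i++) {
		s->out[i].ext = 0;
		visited++;
	}
	free(s->out);
	s->out = NULL;
	s->outcnt = 0;
	return visited;
}
```

After stream_update() the temporary stream is safe to free because its count is zero, while the destination owns the array together with the matching count.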

^ permalink raw reply

* Re: [PATCH V8 net-next 00/14] TLS offload, netdev & MLX5 support
From: David Miller @ 2018-04-27 17:34 UTC (permalink / raw)
  To: borisp; +Cc: netdev, saeedm, davejwatson, ktkhai
In-Reply-To: <20180427.133147.399393882476929920.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Fri, 27 Apr 2018 13:31:47 -0400 (EDT)

> From: Boris Pismenny <borisp@mellanox.com>
> Date: Thu, 26 Apr 2018 09:37:08 +0300
> 
>> This series adds a generic infrastructure to offload TLS crypto to
>> network devices. It enables the kernel TLS socket to skip encryption and
>> authentication operations on the transmit side of the data path, leaving
>> those computationally expensive operations to the NIC.
>  ...
> 
> Series applied.
> 
> Assuming no build test issues I will push it out to net-next.

No dice, I'm reverting.

net/tls/tls_sw.c: In function ‘tls_set_sw_offload’:
net/tls/tls_sw.c:1192:22: warning: ‘sw_ctx_rx’ may be used uninitialized in this function [-Wmaybe-uninitialized]
   sw_ctx_rx->sk_poll = sk->sk_socket->ops->poll;
   ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ./include/linux/dma-mapping.h:11:0,
                 from ./include/linux/skbuff.h:34,
                 from ./include/net/strparser.h:14,
                 from net/tls/tls_sw.c:41:
./include/linux/scatterlist.h:183:31: warning: ‘sw_ctx_tx’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  prv[prv_nents - 1].page_link = ((unsigned long) sgl | SG_CHAIN)
                               ^
net/tls/tls_sw.c:1051:28: note: ‘sw_ctx_tx’ was declared here
  struct tls_sw_context_tx *sw_ctx_tx;
                            ^~~~~~~~~
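
As background on the warning class: -Wmaybe-uninitialized typically fires when a pointer is assigned on only one branch and used later under a condition the compiler cannot tie back to that branch. The sketch below shows one common way to avoid it, under assumptions: struct sw_ctx and set_offload() are hypothetical, and this is not necessarily how the series itself should be or was fixed.

```c
#include <stddef.h>

/* Hypothetical context type; not the kernel's tls_sw_context_{tx,rx}. */
struct sw_ctx {
	int configured;
};

/*
 * Configure either the tx or the rx context and return the one touched.
 * Both locals start as NULL, so every path reaches the later uses with a
 * defined value and the compiler has nothing left to guess about.
 */
static struct sw_ctx *set_offload(struct sw_ctx *tx_slot,
				  struct sw_ctx *rx_slot, int is_tx)
{
	struct sw_ctx *ctx_tx = NULL;
	struct sw_ctx *ctx_rx = NULL;

	if (is_tx)
		ctx_tx = tx_slot;
	else
		ctx_rx = rx_slot;

	if (ctx_tx) {
		ctx_tx->configured = 1;
		return ctx_tx;
	}
	if (ctx_rx) {
		ctx_rx->configured = 1;
		return ctx_rx;
	}
	return NULL;
}
```

Testing the pointers themselves, rather than repeating the is_tx condition at each use, keeps the guard and the use visibly tied together.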

^ permalink raw reply

* Re: [PATCH net-next 0/8] net: Extend availability of PHY statistics
From: David Miller @ 2018-04-27 17:33 UTC (permalink / raw)
  To: f.fainelli; +Cc: netdev, andrew, vivien.didelot, cphealy, nikita.yoush
In-Reply-To: <98af5159-e53d-72a3-c5ec-6c99c347d4ea@gmail.com>

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Fri, 27 Apr 2018 10:27:38 -0700

> Still not enough coffee, what I am suggesting is actually to make the
> body of the functions be static inline functions within
> include/net/phy.h that way they get inlined into net/core/ethtool.c and
> we can keep PHYLIB modular, does that look acceptable?

That should be fine.
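
A rough illustration of why the static inline approach keeps PHYLIB modular: the wrapper body is compiled into each caller and reaches the driver only through function pointers, so built-in code does not need a symbol exported by a module at link time. Everything below is a hypothetical sketch (demo_phy_dev, demo_phy_ops, demo_phy_stats_count() and the sample callback are invented for illustration), not the code from the series.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical driver ops table; the real phy_driver has other members. */
struct demo_phy_ops {
	int (*get_sset_count)(void *phy);
};

/* Hypothetical PHY device carrying a pointer to its driver's ops. */
struct demo_phy_dev {
	const struct demo_phy_ops *drv;
};

/*
 * Would live in a header. Because the body is inlined into the caller,
 * the only run-time dependency on the (possibly modular) driver is the
 * function pointer, which is NULL-checked before use.
 */
static inline int demo_phy_stats_count(struct demo_phy_dev *phy)
{
	if (!phy || !phy->drv || !phy->drv->get_sset_count)
		return -EOPNOTSUPP;
	return phy->drv->get_sset_count(phy);
}

/* Sample driver callback used only to exercise the wrapper. */
static int demo_get_sset_count(void *phy)
{
	(void)phy;
	return 3;
}
```

A caller with no PHY attached, or with a driver that does not implement the callback, gets -EOPNOTSUPP instead of a link error or a NULL dereference.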

^ permalink raw reply

* Re: [PATCH V8 net-next 00/14] TLS offload, netdev & MLX5 support
From: David Miller @ 2018-04-27 17:31 UTC (permalink / raw)
  To: borisp; +Cc: netdev, saeedm, davejwatson, ktkhai
In-Reply-To: <1524724642-119397-1-git-send-email-borisp@mellanox.com>

From: Boris Pismenny <borisp@mellanox.com>
Date: Thu, 26 Apr 2018 09:37:08 +0300

> This series adds a generic infrastructure to offload TLS crypto to
> network devices. It enables the kernel TLS socket to skip encryption and
> authentication operations on the transmit side of the data path, leaving
> those computationally expensive operations to the NIC.
 ...

Series applied.

Assuming no build test issues I will push it out to net-next.

Thanks.

^ permalink raw reply
