Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH,net-next,0/2] Improve code coverage of syzkaller
From: David Miller @ 2017-09-20 16:36 UTC (permalink / raw)
  To: willemdebruijn.kernel; +Cc: peterpenkov96, netdev
In-Reply-To: <CAF=yD-J_qL=hkgnbgPmHW3o_iC6WKmqFUZmmzQKs-+AUhEWY-Q@mail.gmail.com>

From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: Wed, 20 Sep 2017 11:38:45 -0400

> I think that the compile time option was chosen because of the ns_capable
> check, so that with user namespaces unprivileged processes can control this
> path. Perhaps we can require capable() only to set IFF_NAPI_FRAGS.
> 
> Then we can convert the napi_gro_receive path to be conditional on a new
> IFF_NAPI flag instead of this compile time option.

That works for me.

^ permalink raw reply

* [PATCH net-next] net: avoid a full fib lookup when rp_filter is disabled.
From: Paolo Abeni @ 2017-09-20 16:26 UTC (permalink / raw)
  To: netdev; +Cc: David S. Miller, Eric Dumazet, Hannes Frederic Sowa

Since commit 1dced6a85482 ("ipv4: Restore accept_local behaviour
in fib_validate_source()") a full fib lookup is needed even if
the rp_filter is disabled, if accept_local is false - which is
the default.

What we really need in the above scenario is just checking
that the source IP address is not local, and in most case we
can do that is a cheaper way looking up the ifaddr hash table.

This commit adds a helper for such lookup, and uses it to
validate the src address when rp_filter is disabled and no
'local' routes are created by the user space in the relevant
namespace.

A new ipv4 netns flag is added to account for such routes.
We need that to preserve the same behavior we had before this
patch.

It also drops the checks to bail early from __fib_validate_source,
added by the commit 1dced6a85482 ("ipv4: Restore accept_local
behaviour in fib_validate_source()") they do not give any
measurable performance improvement: if we do the lookup with are
on a slower path.

This improves UDP performances for unconnected sockets
when rp_filter is disabled by 5% and also gives small but
measurable performance improvement for TCP flood scenarios.

v1 -> v2:
 - use the ifaddr lookup helper in __ip_dev_find(), as suggested
   by Eric
 - fall-back to full lookup if custom local routes are present

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/linux/inetdevice.h |  1 +
 include/net/netns/ipv4.h   |  1 +
 net/ipv4/devinet.c         | 30 ++++++++++++++++++------------
 net/ipv4/fib_frontend.c    | 22 +++++++++++++++++-----
 4 files changed, 37 insertions(+), 17 deletions(-)

diff --git a/include/linux/inetdevice.h b/include/linux/inetdevice.h
index fb3f809e34e4..751d051f0bc7 100644
--- a/include/linux/inetdevice.h
+++ b/include/linux/inetdevice.h
@@ -179,6 +179,7 @@ __be32 inet_confirm_addr(struct net *net, struct in_device *in_dev, __be32 dst,
 			 __be32 local, int scope);
 struct in_ifaddr *inet_ifa_byprefix(struct in_device *in_dev, __be32 prefix,
 				    __be32 mask);
+struct in_ifaddr *inet_lookup_ifaddr_rcu(struct net *net, __be32 addr);
 static __inline__ bool inet_ifa_match(__be32 addr, struct in_ifaddr *ifa)
 {
 	return !((addr^ifa->ifa_address)&ifa->ifa_mask);
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 20d061c805e3..20720721da4b 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -49,6 +49,7 @@ struct netns_ipv4 {
 #ifdef CONFIG_IP_MULTIPLE_TABLES
 	struct fib_rules_ops	*rules_ops;
 	bool			fib_has_custom_rules;
+	bool			fib_has_custom_local_routes;
 	struct fib_table __rcu	*fib_main;
 	struct fib_table __rcu	*fib_default;
 #endif
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index d7adc0616599..7ce22a2c07ce 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -137,22 +137,12 @@ static void inet_hash_remove(struct in_ifaddr *ifa)
  */
 struct net_device *__ip_dev_find(struct net *net, __be32 addr, bool devref)
 {
-	u32 hash = inet_addr_hash(net, addr);
 	struct net_device *result = NULL;
 	struct in_ifaddr *ifa;
 
 	rcu_read_lock();
-	hlist_for_each_entry_rcu(ifa, &inet_addr_lst[hash], hash) {
-		if (ifa->ifa_local == addr) {
-			struct net_device *dev = ifa->ifa_dev->dev;
-
-			if (!net_eq(dev_net(dev), net))
-				continue;
-			result = dev;
-			break;
-		}
-	}
-	if (!result) {
+	ifa = inet_lookup_ifaddr_rcu(net, addr);
+	if (!ifa) {
 		struct flowi4 fl4 = { .daddr = addr };
 		struct fib_result res = { 0 };
 		struct fib_table *local;
@@ -165,6 +155,8 @@ struct net_device *__ip_dev_find(struct net *net, __be32 addr, bool devref)
 		    !fib_table_lookup(local, &fl4, &res, FIB_LOOKUP_NOREF) &&
 		    res.type == RTN_LOCAL)
 			result = FIB_RES_DEV(res);
+	} else {
+		result = ifa->ifa_dev->dev;
 	}
 	if (result && devref)
 		dev_hold(result);
@@ -173,6 +165,20 @@ struct net_device *__ip_dev_find(struct net *net, __be32 addr, bool devref)
 }
 EXPORT_SYMBOL(__ip_dev_find);
 
+/* called under RCU lock */
+struct in_ifaddr *inet_lookup_ifaddr_rcu(struct net *net, __be32 addr)
+{
+	u32 hash = inet_addr_hash(net, addr);
+	struct in_ifaddr *ifa;
+
+	hlist_for_each_entry_rcu(ifa, &inet_addr_lst[hash], hash)
+		if (ifa->ifa_local == addr &&
+		    net_eq(dev_net(ifa->ifa_dev->dev), net))
+			return ifa;
+
+	return NULL;
+}
+
 static void rtmsg_ifa(int event, struct in_ifaddr *, struct nlmsghdr *, u32);
 
 static BLOCKING_NOTIFIER_HEAD(inetaddr_chain);
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 37819ab4cc74..f02819134ba2 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -345,9 +345,6 @@ static int __fib_validate_source(struct sk_buff *skb, __be32 src, __be32 dst,
 	if (res.type != RTN_UNICAST &&
 	    (res.type != RTN_LOCAL || !IN_DEV_ACCEPT_LOCAL(idev)))
 		goto e_inval;
-	if (!rpf && !fib_num_tclassid_users(net) &&
-	    (dev->ifindex != oif || !IN_DEV_TX_REDIRECTS(idev)))
-		goto last_resort;
 	fib_combine_itag(itag, &res);
 	dev_match = false;
 
@@ -402,13 +399,26 @@ int fib_validate_source(struct sk_buff *skb, __be32 src, __be32 dst,
 			struct in_device *idev, u32 *itag)
 {
 	int r = secpath_exists(skb) ? 0 : IN_DEV_RPFILTER(idev);
+	struct net *net = dev_net(dev);
 
-	if (!r && !fib_num_tclassid_users(dev_net(dev)) &&
-	    IN_DEV_ACCEPT_LOCAL(idev) &&
+	if (!r && !fib_num_tclassid_users(net) &&
 	    (dev->ifindex != oif || !IN_DEV_TX_REDIRECTS(idev))) {
+		if (IN_DEV_ACCEPT_LOCAL(idev))
+			goto ok;
+		/* if no local routes are added from user space we can check
+		 * for local addresses looking-up the ifaddr table
+		 */
+		if (net->ipv4.fib_has_custom_local_routes)
+			goto full_check;
+		if (inet_lookup_ifaddr_rcu(net, src))
+			return -EINVAL;
+
+ok:
 		*itag = 0;
 		return 0;
 	}
+
+full_check:
 	return __fib_validate_source(skb, src, dst, tos, oif, dev, r, idev, itag);
 }
 
@@ -759,6 +769,8 @@ static int inet_rtm_newroute(struct sk_buff *skb, struct nlmsghdr *nlh,
 	}
 
 	err = fib_table_insert(net, tb, &cfg, extack);
+	if (!err && cfg.fc_type == RTN_LOCAL)
+		net->ipv4.fib_has_custom_local_routes = true;
 errout:
 	return err;
 }
-- 
2.13.5

^ permalink raw reply related

* [PATCH net-next] net: dsa: use dedicated CPU port
From: Vivien Didelot @ 2017-09-20 16:28 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, kernel, David S. Miller, Florian Fainelli,
	Andrew Lunn, Vivien Didelot

Each port in DSA has its own dedicated CPU port currently available in
its parent switch's ds->ports[port].cpu_dp. Use it instead of getting
the unique tree CPU port, which will be deprecated soon.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
---
 drivers/net/dsa/b53/b53_common.c | 4 ++--
 drivers/net/dsa/bcm_sf2.c        | 6 +++---
 drivers/net/dsa/mt7530.c         | 4 ++--
 drivers/net/dsa/mv88e6060.c      | 2 +-
 drivers/net/dsa/qca8k.c          | 2 +-
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index a9f2a5b55a5e..d4ce092def83 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1336,7 +1336,7 @@ EXPORT_SYMBOL(b53_fdb_dump);
 int b53_br_join(struct dsa_switch *ds, int port, struct net_device *br)
 {
 	struct b53_device *dev = ds->priv;
-	s8 cpu_port = ds->dst->cpu_dp->index;
+	s8 cpu_port = ds->ports[port].cpu_dp->index;
 	u16 pvlan, reg;
 	unsigned int i;
 
@@ -1382,7 +1382,7 @@ void b53_br_leave(struct dsa_switch *ds, int port, struct net_device *br)
 {
 	struct b53_device *dev = ds->priv;
 	struct b53_vlan *vl = &dev->vlans[0];
-	s8 cpu_port = ds->dst->cpu_dp->index;
+	s8 cpu_port = ds->ports[port].cpu_dp->index;
 	unsigned int i;
 	u16 pvlan, reg, pvid;
 
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 0072a959db5b..898d5642b516 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -661,7 +661,7 @@ static int bcm_sf2_sw_resume(struct dsa_switch *ds)
 static void bcm_sf2_sw_get_wol(struct dsa_switch *ds, int port,
 			       struct ethtool_wolinfo *wol)
 {
-	struct net_device *p = ds->dst->cpu_dp->netdev;
+	struct net_device *p = ds->ports[port].cpu_dp->netdev;
 	struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
 	struct ethtool_wolinfo pwol;
 
@@ -684,9 +684,9 @@ static void bcm_sf2_sw_get_wol(struct dsa_switch *ds, int port,
 static int bcm_sf2_sw_set_wol(struct dsa_switch *ds, int port,
 			      struct ethtool_wolinfo *wol)
 {
-	struct net_device *p = ds->dst->cpu_dp->netdev;
+	struct net_device *p = ds->ports[port].cpu_dp->netdev;
 	struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
-	s8 cpu_port = ds->dst->cpu_dp->index;
+	s8 cpu_port = ds->ports[port].cpu_dp->index;
 	struct ethtool_wolinfo pwol;
 
 	p->ethtool_ops->get_wol(p, &pwol);
diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index c142b97add2c..faa3b88d2206 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -928,11 +928,11 @@ mt7530_setup(struct dsa_switch *ds)
 	struct device_node *dn;
 	struct mt7530_dummy_poll p;
 
-	/* The parent node of cpu_dp->netdev which holds the common system
+	/* The parent node of master netdev which holds the common system
 	 * controller also is the container for two GMACs nodes representing
 	 * as two netdev instances.
 	 */
-	dn = ds->dst->cpu_dp->netdev->dev.of_node->parent;
+	dn = ds->ports[MT7530_CPU_PORT].netdev->dev.of_node->parent;
 	priv->ethernet = syscon_node_to_regmap(dn);
 	if (IS_ERR(priv->ethernet))
 		return PTR_ERR(priv->ethernet);
diff --git a/drivers/net/dsa/mv88e6060.c b/drivers/net/dsa/mv88e6060.c
index dce7fa57eb55..621cdc46ad81 100644
--- a/drivers/net/dsa/mv88e6060.c
+++ b/drivers/net/dsa/mv88e6060.c
@@ -176,7 +176,7 @@ static int mv88e6060_setup_port(struct dsa_switch *ds, int p)
 		  ((p & 0xf) << PORT_VLAN_MAP_DBNUM_SHIFT) |
 		   (dsa_is_cpu_port(ds, p) ?
 			ds->enabled_port_mask :
-			BIT(ds->dst->cpu_dp->index)));
+			BIT(ds->ports[p].cpu_dp->index)));
 
 	/* Port Association Vector: when learning source addresses
 	 * of packets, add the address to the address database using
diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index 5ada7a41449c..82f09711ac1a 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -506,7 +506,7 @@ qca8k_setup(struct dsa_switch *ds)
 		pr_warn("regmap initialization failed");
 
 	/* Initialize CPU port pad mode (xMII type, delays...) */
-	phy_mode = of_get_phy_mode(ds->dst->cpu_dp->dn);
+	phy_mode = of_get_phy_mode(ds->ports[QCA8K_CPU_PORT].dn);
 	if (phy_mode < 0) {
 		pr_err("Can't find phy-mode for master device\n");
 		return phy_mode;
-- 
2.14.1

^ permalink raw reply related

* Re: [PATCH net-next 09/14] gtp: Allow configuring GTP interface as standalone
From: Tom Herbert @ 2017-09-20 16:24 UTC (permalink / raw)
  To: Andreas Schultz
  Cc: Tom Herbert, David S. Miller, Linux Kernel Network Developers,
	Pablo Neira Ayuso, Harald Welte, Rohit Seth
In-Reply-To: <56e339a8-fa2d-6f8d-a89c-c0f3f242763a@tpip.net>

On Wed, Sep 20, 2017 at 9:07 AM, Andreas Schultz <aschultz@tpip.net> wrote:
>
>
> On 20/09/17 17:57, Tom Herbert wrote:
>>
>> On Wed, Sep 20, 2017 at 8:27 AM, Andreas Schultz <aschultz@tpip.net>
>> wrote:
>>>
>>> On 19/09/17 02:38, Tom Herbert wrote:
>>>>
>>>>
>>>> Add new configuration of GTP interfaces that allow specifying a port to
>>>> listen on (as opposed to having to get sockets from a userspace control
>>>> plane). This allows GTP interfaces to be configured and the data path
>>>> tested without requiring a GTP-C daemon.
>>>
>>>
>>>
>>> This would imply that you can have multiple independent GTP sockets on
>>> the
>>> same IP address.That is not permitted by the GTP specifications. 3GPP TS
>>> 29.281, section 4.3 states clearly that there is "only" one GTP entity
>>> per
>>> IP address.A PDP context is defined by the destination IP and the TEID.
>>> The
>>> destination port is not part of the identity of a PDP context.
>>>
>> We are in no way trying change GTP, if someone runs this in a real GTP
>> network then they need to abide by the specification. However, there
>> is nothing inconsistent and it breaks nothing if someone wishes to use
>> different port numbers in their own private network for testing or
>> development purposes. Every other UDP application that has assigned
>> port number allows configurable ports, I don't see that GTP is so
>> special that it should be an exception.
>
>
> GTP isn't special, I just don't like to have testing only features in there
> when the same goal can be reached without having to add extra stuff. Adding
> code that is not going to be useful in real production setups (or in this
> case would even break production setups when enabled accidentally) makes the
> implementation more complex than it needs to be.

Well, you could make the same argument that allowing GTP to configured
as standalone interface is a problem since GTP is only allowed to be
with used with GTP-C. But, then we have something in the kernel that
the community is expected to support, but requires jumping through a
whole bunch of hoops just to run a simple netperf. The more that
patches and features look like other things in the kernel that are
already well established, the better the chances we can accept them
and support them. It's probably a natural consequence of any large
open source project, so sometimes it's worth the effort to add a few
lines of complexity to get the benefits of community contribution and
support.

Tom

^ permalink raw reply

* [PATCH net-next] bpf: Optimize lpm trie delete
From: Craig Gallek @ 2017-09-20 16:22 UTC (permalink / raw)
  To: Daniel Mack, Alexei Starovoitov, Daniel Borkmann,
	David S . Miller; +Cc: netdev

From: Craig Gallek <kraig@google.com>

Before the delete operator was added, this datastructure maintained
an invariant that intermediate nodes were only present when necessary
to build the tree.  This patch updates the delete operation to reinstate
that invariant by removing unnecessary intermediate nodes after a node is
removed and thus keeping the tree structure at a minimal size.

Suggested-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: Craig Gallek <kraig@google.com>
---
 kernel/bpf/lpm_trie.c | 55 +++++++++++++++++++++++++++------------------------
 1 file changed, 29 insertions(+), 26 deletions(-)

diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
index 9d58a576b2ae..b5a7d70ec8b5 100644
--- a/kernel/bpf/lpm_trie.c
+++ b/kernel/bpf/lpm_trie.c
@@ -397,7 +397,7 @@ static int trie_delete_elem(struct bpf_map *map, void *_key)
 	struct lpm_trie_node __rcu **trim;
 	struct lpm_trie_node *node;
 	unsigned long irq_flags;
-	unsigned int next_bit;
+	unsigned int next_bit = 0;
 	size_t matchlen = 0;
 	int ret = 0;
 
@@ -408,14 +408,12 @@ static int trie_delete_elem(struct bpf_map *map, void *_key)
 
 	/* Walk the tree looking for an exact key/length match and keeping
 	 * track of where we could begin trimming the tree.  The trim-point
-	 * is the sub-tree along the walk consisting of only single-child
-	 * intermediate nodes and ending at a leaf node that we want to
-	 * remove.
+	 * is the location of the pointer where we will remove a node from the
+	 * tree.
 	 */
 	trim = &trie->root;
-	node = rcu_dereference_protected(
-		trie->root, lockdep_is_held(&trie->lock));
-	while (node) {
+	while ((node = rcu_dereference_protected(
+		       *trim, lockdep_is_held(&trie->lock)))) {
 		matchlen = longest_prefix_match(trie, node, key);
 
 		if (node->prefixlen != matchlen ||
@@ -423,15 +421,7 @@ static int trie_delete_elem(struct bpf_map *map, void *_key)
 			break;
 
 		next_bit = extract_bit(key->data, node->prefixlen);
-		/* If we hit a node that has more than one child or is a valid
-		 * prefix itself, do not remove it. Reset the root of the trim
-		 * path to its descendant on our path.
-		 */
-		if (!(node->flags & LPM_TREE_NODE_FLAG_IM) ||
-		    (node->child[0] && node->child[1]))
-			trim = &node->child[next_bit];
-		node = rcu_dereference_protected(
-			node->child[next_bit], lockdep_is_held(&trie->lock));
+		trim = &node->child[next_bit];
 	}
 
 	if (!node || node->prefixlen != key->prefixlen ||
@@ -442,25 +432,38 @@ static int trie_delete_elem(struct bpf_map *map, void *_key)
 
 	trie->n_entries--;
 
-	/* If the node we are removing is not a leaf node, simply mark it
+	/* If the node we are removing has two children, simply mark it
 	 * as intermediate and we are done.
 	 */
-	if (rcu_access_pointer(node->child[0]) ||
+	if (rcu_access_pointer(node->child[0]) &&
 	    rcu_access_pointer(node->child[1])) {
 		node->flags |= LPM_TREE_NODE_FLAG_IM;
 		goto out;
 	}
 
-	/* trim should now point to the slot holding the start of a path from
-	 * zero or more intermediate nodes to our leaf node for deletion.
-	 */
-	while ((node = rcu_dereference_protected(
-			*trim, lockdep_is_held(&trie->lock)))) {
+	/* If the node has no children, it can be completely removed */
+	if (!rcu_access_pointer(node->child[0]) &&
+	    !rcu_access_pointer(node->child[1])) {
 		RCU_INIT_POINTER(*trim, NULL);
-		trim = rcu_access_pointer(node->child[0]) ?
-			&node->child[0] :
-			&node->child[1];
 		kfree_rcu(node, rcu);
+		goto out;
+	}
+
+	/* If the node has one child, we may be able to collapse the tree
+	 * while removing this node if the node's child is in the same
+	 * 'next bit' slot as this node was in its parent or if the node
+	 * itself is the root.
+	 */
+	if (trim == &trie->root) {
+		next_bit = node->child[0] ? 0 : 1;
+		rcu_assign_pointer(trie->root, node->child[next_bit]);
+		kfree_rcu(node, rcu);
+	} else if (rcu_access_pointer(node->child[next_bit])) {
+		rcu_assign_pointer(*trim, node->child[next_bit]);
+		kfree_rcu(node, rcu);
+	} else {
+		/* If we can't collapse, just mark this node as intermediate */
+		node->flags |= LPM_TREE_NODE_FLAG_IM;
 	}
 
 out:
-- 
2.14.1.821.g8fa685d3b7-goog

^ permalink raw reply related

* [Patch net] net_sched: remove cls_flower idr on failure
From: Cong Wang @ 2017-09-20 16:18 UTC (permalink / raw)
  To: netdev; +Cc: Cong Wang, Chris Mi, Jiri Pirko

Fixes: c15ab236d69d ("net/sched: Change cls_flower to use IDR")
Cc: Chris Mi <chrism@mellanox.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/sched/cls_flower.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 1a267e77c6de..d230cb4c8094 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -922,28 +922,28 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 
 		if (!tc_flags_valid(fnew->flags)) {
 			err = -EINVAL;
-			goto errout;
+			goto errout_idr;
 		}
 	}
 
 	err = fl_set_parms(net, tp, fnew, &mask, base, tb, tca[TCA_RATE], ovr);
 	if (err)
-		goto errout;
+		goto errout_idr;
 
 	err = fl_check_assign_mask(head, &mask);
 	if (err)
-		goto errout;
+		goto errout_idr;
 
 	if (!tc_skip_sw(fnew->flags)) {
 		if (!fold && fl_lookup(head, &fnew->mkey)) {
 			err = -EEXIST;
-			goto errout;
+			goto errout_idr;
 		}
 
 		err = rhashtable_insert_fast(&head->ht, &fnew->ht_node,
 					     head->ht_params);
 		if (err)
-			goto errout;
+			goto errout_idr;
 	}
 
 	if (!tc_skip_hw(fnew->flags)) {
@@ -952,7 +952,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 					   &mask.key,
 					   fnew);
 		if (err)
-			goto errout;
+			goto errout_idr;
 	}
 
 	if (!tc_in_hw(fnew->flags))
@@ -981,6 +981,9 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 	kfree(tb);
 	return 0;
 
+errout_idr:
+	if (fnew->handle)
+		idr_remove_ext(&head->handle_idr, fnew->handle);
 errout:
 	tcf_exts_destroy(&fnew->exts);
 	kfree(fnew);
-- 
2.13.0

^ permalink raw reply related

* [PATCH] ipv6: Use ipv6_authlen for len in ipv6_skip_exthdr
From: Xiang Gao @ 2017-09-20 16:18 UTC (permalink / raw)
  To: trivial, netdev, davem, kuznet, yoshfuji; +Cc: qasdfgtyuiop

In ipv6_skip_exthdr, the lengh of AH header is computed manually
as (hp->hdrlen+2)<<2. However, in include/linux/ipv6.h, a macro
named ipv6_authlen is already defined for exactly the same job. This
commit replaces the manual computation code with the macro.

Signed-off-by: Xiang Gao <qasdfgtyuiop@gmail.com>
---
 net/ipv6/exthdrs_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/exthdrs_core.c b/net/ipv6/exthdrs_core.c
index 305e2ed730bf..115d60919f72 100644
--- a/net/ipv6/exthdrs_core.c
+++ b/net/ipv6/exthdrs_core.c
@@ -99,7 +99,7 @@ int ipv6_skip_exthdr(const struct sk_buff *skb, int start, u8 *nexthdrp,
 				break;
 			hdrlen = 8;
 		} else if (nexthdr == NEXTHDR_AUTH)
-			hdrlen = (hp->hdrlen+2)<<2;
+			hdrlen = ipv6_authlen(hp);
 		else
 			hdrlen = ipv6_optlen(hp);
 
-- 
2.14.1

^ permalink raw reply related

* Re: [PATCH] ipv6_skip_exthdr: use ipv6_authlen for AH hdrlen
From: Xiang Gao @ 2017-09-20 16:17 UTC (permalink / raw)
  To: David Miller; +Cc: trivial, netdev, Alexey Kuznetsov, Hideaki YOSHIFUJI
In-Reply-To: <20170919.153208.1575676406897804623.davem@davemloft.net>

Hi David,

Thanks for your time and all your suggestions. I will resend a new patch soon.

Xiang Gao
Xiang Gao


2017-09-19 18:32 GMT-04:00 David Miller <davem@davemloft.net>:
> From: Xiang Gao <qasdfgtyuiop@gmail.com>
> Date: Tue, 19 Sep 2017 08:59:50 -0400
>
>> In ipv6_skip_exthdr, the lengh of AH header is computed manually
>> as (hp->hdrlen+2)<<2. However, in include/linux/ipv6.h, a macro
>> named ipv6_authlen is already defined for exactly the same job. This
>> commit replaces the manual computation code with the macro.
>
> All patch submissions must have a proper signoff.
>
> Also, please use a proper subsystem prefix in your Subject
> line "[PATCH] ipv6: Use ipv6_authlen for AH hdrlen in ipv6_skip_exthdr()"
> would have been much better as "ipv6: " is the appropriate
> subsystem prefix to use here.
>
> Thanks.

^ permalink raw reply

* [PATCH v4 4/4] samples/bpf: Add documentation on cross compilation
From: Joel Fernandes @ 2017-09-20 16:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: netdev, alison, juri.lelli, fengc, daniel, davem, ast,
	kernel-team, Joel Fernandes
In-Reply-To: <20170920161159.25747-1-joelaf@google.com>

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Joel Fernandes <joelaf@google.com>
---
 samples/bpf/README.rst | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/samples/bpf/README.rst b/samples/bpf/README.rst
index 79f9a58f1872..2b906127ef54 100644
--- a/samples/bpf/README.rst
+++ b/samples/bpf/README.rst
@@ -64,3 +64,13 @@ It is also possible to point make to the newly compiled 'llc' or
 'clang' command via redefining LLC or CLANG on the make command line::
 
  make samples/bpf/ LLC=~/git/llvm/build/bin/llc CLANG=~/git/llvm/build/bin/clang
+
+Cross compiling samples
+-----------------------
+Inorder to cross-compile, say for arm64 targets, export CROSS_COMPILE and ARCH
+environment variables before calling make. This will direct make to build
+samples for the cross target.
+
+export ARCH=arm64
+export CROSS_COMPILE="aarch64-linux-gnu-"
+make samples/bpf/ LLC=~/git/llvm/build/bin/llc CLANG=~/git/llvm/build/bin/clang
-- 
2.14.1.821.g8fa685d3b7-goog

^ permalink raw reply related

* [PATCH v4 3/4] samples/bpf: Fix pt_regs issues when cross-compiling
From: Joel Fernandes @ 2017-09-20 16:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: netdev, alison, juri.lelli, fengc, daniel, davem, ast,
	kernel-team, Joel Fernandes
In-Reply-To: <20170920161159.25747-1-joelaf@google.com>

BPF samples fail to build when cross-compiling for ARM64 because of incorrect
pt_regs param selection. This is because clang defines __x86_64__ and
bpf_headers thinks we're building for x86. Since clang is building for the BPF
target, it shouldn't make assumptions about what target the BPF program is
going to run on. To fix this, lets pass ARCH so the header knows which target
the BPF program is being compiled for and can use the correct pt_regs code.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Joel Fernandes <joelaf@google.com>
---
 samples/bpf/Makefile                      |  2 +-
 tools/testing/selftests/bpf/bpf_helpers.h | 56 +++++++++++++++++++++++++++----
 2 files changed, 50 insertions(+), 8 deletions(-)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 13f74b67ca44..ebc2ad69b62c 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -230,7 +230,7 @@ $(obj)/%.o: $(src)/%.c
 	$(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) -I$(obj) \
 		-I$(srctree)/tools/testing/selftests/bpf/ \
 		-D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
-		-Wno-compare-distinct-pointer-types \
+		-D__TARGET_ARCH_$(ARCH) -Wno-compare-distinct-pointer-types \
 		-Wno-gnu-variable-sized-type-not-at-end \
 		-Wno-address-of-packed-member -Wno-tautological-compare \
 		-Wno-unknown-warning-option \
diff --git a/tools/testing/selftests/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h
index 36fb9161b34a..4875395b0b52 100644
--- a/tools/testing/selftests/bpf/bpf_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_helpers.h
@@ -109,7 +109,47 @@ static int (*bpf_skb_under_cgroup)(void *ctx, void *map, int index) =
 static int (*bpf_skb_change_head)(void *, int len, int flags) =
 	(void *) BPF_FUNC_skb_change_head;
 
+/* Scan the ARCH passed in from ARCH env variable (see Makefile) */
+#if defined(__TARGET_ARCH_x86)
+	#define bpf_target_x86
+	#define bpf_target_defined
+#elif defined(__TARGET_ARCH_s930x)
+	#define bpf_target_s930x
+	#define bpf_target_defined
+#elif defined(__TARGET_ARCH_arm64)
+	#define bpf_target_arm64
+	#define bpf_target_defined
+#elif defined(__TARGET_ARCH_mips)
+	#define bpf_target_mips
+	#define bpf_target_defined
+#elif defined(__TARGET_ARCH_powerpc)
+	#define bpf_target_powerpc
+	#define bpf_target_defined
+#elif defined(__TARGET_ARCH_sparc)
+	#define bpf_target_sparc
+	#define bpf_target_defined
+#else
+	#undef bpf_target_defined
+#endif
+
+/* Fall back to what the compiler says */
+#ifndef bpf_target_defined
 #if defined(__x86_64__)
+	#define bpf_target_x86
+#elif defined(__s390x__)
+	#define bpf_target_s930x
+#elif defined(__aarch64__)
+	#define bpf_target_arm64
+#elif defined(__mips__)
+	#define bpf_target_mips
+#elif defined(__powerpc__)
+	#define bpf_target_powerpc
+#elif defined(__sparc__)
+	#define bpf_target_sparc
+#endif
+#endif
+
+#if defined(bpf_target_x86)
 
 #define PT_REGS_PARM1(x) ((x)->di)
 #define PT_REGS_PARM2(x) ((x)->si)
@@ -122,7 +162,7 @@ static int (*bpf_skb_change_head)(void *, int len, int flags) =
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->ip)
 
-#elif defined(__s390x__)
+#elif defined(bpf_target_s390x)
 
 #define PT_REGS_PARM1(x) ((x)->gprs[2])
 #define PT_REGS_PARM2(x) ((x)->gprs[3])
@@ -135,7 +175,7 @@ static int (*bpf_skb_change_head)(void *, int len, int flags) =
 #define PT_REGS_SP(x) ((x)->gprs[15])
 #define PT_REGS_IP(x) ((x)->psw.addr)
 
-#elif defined(__aarch64__)
+#elif defined(bpf_target_arm64)
 
 #define PT_REGS_PARM1(x) ((x)->regs[0])
 #define PT_REGS_PARM2(x) ((x)->regs[1])
@@ -148,7 +188,7 @@ static int (*bpf_skb_change_head)(void *, int len, int flags) =
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->pc)
 
-#elif defined(__mips__)
+#elif defined(bpf_target_mips)
 
 #define PT_REGS_PARM1(x) ((x)->regs[4])
 #define PT_REGS_PARM2(x) ((x)->regs[5])
@@ -161,7 +201,7 @@ static int (*bpf_skb_change_head)(void *, int len, int flags) =
 #define PT_REGS_SP(x) ((x)->regs[29])
 #define PT_REGS_IP(x) ((x)->cp0_epc)
 
-#elif defined(__powerpc__)
+#elif defined(bpf_target_powerpc)
 
 #define PT_REGS_PARM1(x) ((x)->gpr[3])
 #define PT_REGS_PARM2(x) ((x)->gpr[4])
@@ -172,7 +212,7 @@ static int (*bpf_skb_change_head)(void *, int len, int flags) =
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->nip)
 
-#elif defined(__sparc__)
+#elif defined(bpf_target_sparc)
 
 #define PT_REGS_PARM1(x) ((x)->u_regs[UREG_I0])
 #define PT_REGS_PARM2(x) ((x)->u_regs[UREG_I1])
@@ -182,6 +222,8 @@ static int (*bpf_skb_change_head)(void *, int len, int flags) =
 #define PT_REGS_RET(x) ((x)->u_regs[UREG_I7])
 #define PT_REGS_RC(x) ((x)->u_regs[UREG_I0])
 #define PT_REGS_SP(x) ((x)->u_regs[UREG_FP])
+
+/* Should this also be a bpf_target check for the sparc case? */
 #if defined(__arch64__)
 #define PT_REGS_IP(x) ((x)->tpc)
 #else
@@ -190,10 +232,10 @@ static int (*bpf_skb_change_head)(void *, int len, int flags) =
 
 #endif
 
-#ifdef __powerpc__
+#ifdef bpf_target_powerpc
 #define BPF_KPROBE_READ_RET_IP(ip, ctx)		({ (ip) = (ctx)->link; })
 #define BPF_KRETPROBE_READ_RET_IP		BPF_KPROBE_READ_RET_IP
-#elif defined(__sparc__)
+#elif bpf_target_sparc
 #define BPF_KPROBE_READ_RET_IP(ip, ctx)		({ (ip) = PT_REGS_RET(ctx); })
 #define BPF_KRETPROBE_READ_RET_IP		BPF_KPROBE_READ_RET_IP
 #else
-- 
2.14.1.821.g8fa685d3b7-goog

^ permalink raw reply related

* [PATCH v4 2/4] samples/bpf: Enable cross compiler support
From: Joel Fernandes @ 2017-09-20 16:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: netdev, alison, juri.lelli, fengc, daniel, davem, ast,
	kernel-team, Joel Fernandes
In-Reply-To: <20170920161159.25747-1-joelaf@google.com>

When cross compiling, bpf samples use HOSTCC for compiling the non-BPF part of
the sample, however what we really want is to use the cross compiler to build
for the cross target since that is what will load and run the BPF sample.
Detect this and compile samples correctly.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Joel Fernandes <joelaf@google.com>
---
 samples/bpf/Makefile | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index cf17c7932a6e..13f74b67ca44 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -177,6 +177,11 @@ HOSTLOADLIBES_syscall_tp += -lelf
 LLC ?= llc
 CLANG ?= clang
 
+# Detect that we're cross compiling and use the cross compiler
+ifdef CROSS_COMPILE
+HOSTCC = $(CROSS_COMPILE)gcc
+endif
+
 # Trick to allow make to be run from this directory
 all:
 	$(MAKE) -C ../../ $(CURDIR)/
-- 
2.14.1.821.g8fa685d3b7-goog

^ permalink raw reply related

* [PATCH v4 1/4] samples/bpf: Use getppid instead of getpgrp for array map stress
From: Joel Fernandes @ 2017-09-20 16:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: netdev, alison, juri.lelli, fengc, daniel, davem, ast,
	kernel-team, Joel Fernandes

When cross-compiling the bpf sample map_perf_test for aarch64, I find that
__NR_getpgrp is undefined. This causes build errors. This syscall is deprecated
and requires defining __ARCH_WANT_SYSCALL_DEPRECATED. To avoid having to define
that, just use a different syscall (getppid) for the array map stress test.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Joel Fernandes <joelaf@google.com>
---
 samples/bpf/map_perf_test_kern.c | 2 +-
 samples/bpf/map_perf_test_user.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/samples/bpf/map_perf_test_kern.c b/samples/bpf/map_perf_test_kern.c
index 098c857f1eda..2b2ffb97018b 100644
--- a/samples/bpf/map_perf_test_kern.c
+++ b/samples/bpf/map_perf_test_kern.c
@@ -266,7 +266,7 @@ int stress_hash_map_lookup(struct pt_regs *ctx)
 	return 0;
 }
 
-SEC("kprobe/sys_getpgrp")
+SEC("kprobe/sys_getppid")
 int stress_array_map_lookup(struct pt_regs *ctx)
 {
 	u32 key = 1, i;
diff --git a/samples/bpf/map_perf_test_user.c b/samples/bpf/map_perf_test_user.c
index f388254896f6..a0310fc70057 100644
--- a/samples/bpf/map_perf_test_user.c
+++ b/samples/bpf/map_perf_test_user.c
@@ -282,7 +282,7 @@ static void test_array_lookup(int cpu)
 
 	start_time = time_get_ns();
 	for (i = 0; i < max_cnt; i++)
-		syscall(__NR_getpgrp, 0);
+		syscall(__NR_getppid, 0);
 	printf("%d:array_lookup %lld lookups per sec\n",
 	       cpu, max_cnt * 1000000000ll * 64 / (time_get_ns() - start_time));
 }
-- 
2.14.1.821.g8fa685d3b7-goog

^ permalink raw reply related

* re: mac80211: avoid allocating TXQs that won't be used
From: Colin Ian King @ 2017-09-20 16:08 UTC (permalink / raw)
  To: Johannes Berg
  Cc: David S. Miller, linux-wireless@vger.kernel.org,
	netdev@vger.kernel.org

Johannes,

Static analysis with CoverityScan on linux-next today detected a null
pointer dereference issue on commit:

>From 0fc4b3403d215ecd3c05505ec1f0028a227ed319 Mon Sep 17 00:00:00 2001
From: Johannes Berg <johannes.berg@intel.com>
Date: Thu, 22 Jun 2017 12:20:29 +0200
Subject: [PATCH] mac80211: avoid allocating TXQs that won't be used

Issue: sdata is null when the sdata is dereferenced by:

                   sdata->vif.type != NL80211_IFTYPE_AP_VLAN &&
                   sdata->vif.type != NL80211_IFTYPE_MONITOR)

note that sdata is assigned a non-null much later with the statement
sdata = netdev_priv(ndev).

Detected by CoverityScan CID#1456974 ("Explicit null dereferenced")

Colin

^ permalink raw reply

* Re: [PATCH net-next 09/14] gtp: Allow configuring GTP interface as standalone
From: Andreas Schultz @ 2017-09-20 16:07 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Tom Herbert, David S. Miller, Linux Kernel Network Developers,
	Pablo Neira Ayuso, Harald Welte, Rohit Seth
In-Reply-To: <CALx6S3771X4QUm2mJG=aJJ0f4Tm98NpGA7060vF1P8iuCPgicQ@mail.gmail.com>



On 20/09/17 17:57, Tom Herbert wrote:
> On Wed, Sep 20, 2017 at 8:27 AM, Andreas Schultz <aschultz@tpip.net> wrote:
>> On 19/09/17 02:38, Tom Herbert wrote:
>>>
>>> Add new configuration of GTP interfaces that allow specifying a port to
>>> listen on (as opposed to having to get sockets from a userspace control
>>> plane). This allows GTP interfaces to be configured and the data path
>>> tested without requiring a GTP-C daemon.
>>
>>
>> This would imply that you can have multiple independent GTP sockets on the
>> same IP address.That is not permitted by the GTP specifications. 3GPP TS
>> 29.281, section 4.3 states clearly that there is "only" one GTP entity per
>> IP address.A PDP context is defined by the destination IP and the TEID. The
>> destination port is not part of the identity of a PDP context.
>>
> We are in no way trying change GTP, if someone runs this in a real GTP
> network then they need to abide by the specification. However, there
> is nothing inconsistent and it breaks nothing if someone wishes to use
> different port numbers in their own private network for testing or
> development purposes. Every other UDP application that has assigned
> port number allows configurable ports, I don't see that GTP is so
> special that it should be an exception.

GTP isn't special, I just don't like to have testing only features in 
there when the same goal can be reached without having to add extra 
stuff. Adding code that is not going to be useful in real production 
setups (or in this case would even break production setups when enabled 
accidentally) makes the implementation more complex than it needs to be.

You can always add multiple IP's to your test system and have the same 
effect without having to change the ports.

Regards
Andreas

> 
> Tom
> 

^ permalink raw reply

* [PATCH v4 0/4] Add cross-compilation support to eBPF samples
From: Joel Fernandes @ 2017-09-20 16:04 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: alison, juri.lelli, fengc, daniel, davem, ast, kernel-team,
	Joel Fernandes

These patches fix issues seen when cross-compiling eBPF samples on arm64.
Compared to [1], I dropped the controversial inline-asm patch and exploring
other options to fix it. However these patches are a step in the right
direction and I look forward to getting them into -next and the merge window.

Changes since v3:
- just a repost with acks

[1] https://lkml.org/lkml/2017/8/7/417

Joel Fernandes (4):
  samples/bpf: Use getppid instead of getpgrp for array map stress
  samples/bpf: Enable cross compiler support
  samples/bpf: Fix pt_regs issues when cross-compiling
  samples/bpf: Add documentation on cross compilation

 samples/bpf/Makefile                      |  7 +++-
 samples/bpf/README.rst                    | 10 ++++++
 samples/bpf/map_perf_test_kern.c          |  2 +-
 samples/bpf/map_perf_test_user.c          |  2 +-
 tools/testing/selftests/bpf/bpf_helpers.h | 56 +++++++++++++++++++++++++++----
 5 files changed, 67 insertions(+), 10 deletions(-)

-- 
2.14.1.821.g8fa685d3b7-goog

^ permalink raw reply

* Re: [PATCH net-next 09/14] gtp: Allow configuring GTP interface as standalone
From: Tom Herbert @ 2017-09-20 15:57 UTC (permalink / raw)
  To: Andreas Schultz
  Cc: Tom Herbert, David S. Miller, Linux Kernel Network Developers,
	Pablo Neira Ayuso, Harald Welte, Rohit Seth
In-Reply-To: <a6d1dad0-e922-2913-d4c8-592d403ee1cb@tpip.net>

On Wed, Sep 20, 2017 at 8:27 AM, Andreas Schultz <aschultz@tpip.net> wrote:
> On 19/09/17 02:38, Tom Herbert wrote:
>>
>> Add new configuration of GTP interfaces that allow specifying a port to
>> listen on (as opposed to having to get sockets from a userspace control
>> plane). This allows GTP interfaces to be configured and the data path
>> tested without requiring a GTP-C daemon.
>
>
> This would imply that you can have multiple independent GTP sockets on the
> same IP address.That is not permitted by the GTP specifications. 3GPP TS
> 29.281, section 4.3 states clearly that there is "only" one GTP entity per
> IP address.A PDP context is defined by the destination IP and the TEID. The
> destination port is not part of the identity of a PDP context.
>
We are in no way trying change GTP, if someone runs this in a real GTP
network then they need to abide by the specification. However, there
is nothing inconsistent and it breaks nothing if someone wishes to use
different port numbers in their own private network for testing or
development purposes. Every other UDP application that has assigned
port number allows configurable ports, I don't see that GTP is so
special that it should be an exception.

Tom

^ permalink raw reply

* [PATCH v5 net 3/3] lan78xx: Use default values loaded from EEPROM/OTP after reset
From: Nisar Sayed @ 2017-09-20 21:06 UTC (permalink / raw)
  To: davem; +Cc: UNGLinuxDriver, netdev
In-Reply-To: <20170920210638.11150-1-Nisar.Sayed@microchip.com>

Use default value of auto duplex and auto speed values loaded
from EEPROM/OTP after reset. The LAN78xx allows platform
configurations to be loaded from EEPROM/OTP.
Ex: When external phy is connected, the MAC can be configured to
have correct auto speed, auto duplex, auto polarity configured
from the EEPROM/OTP.

Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Nisar Sayed <Nisar.Sayed@microchip.com>
---
 drivers/net/usb/lan78xx.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index f8c63eec8353..0161f77641fa 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -2449,7 +2449,6 @@ static int lan78xx_reset(struct lan78xx_net *dev)
 	/* LAN7801 only has RGMII mode */
 	if (dev->chipid == ID_REV_CHIP_ID_7801_)
 		buf &= ~MAC_CR_GMII_EN_;
-	buf |= MAC_CR_AUTO_DUPLEX_ | MAC_CR_AUTO_SPEED_;
 	ret = lan78xx_write_reg(dev, MAC_CR, buf);
 
 	ret = lan78xx_read_reg(dev, MAC_TX, &buf);
-- 
2.14.1

^ permalink raw reply related

* [PATCH v5 net 2/3] lan78xx: Allow EEPROM write for less than MAX_EEPROM_SIZE
From: Nisar Sayed @ 2017-09-20 21:06 UTC (permalink / raw)
  To: davem; +Cc: UNGLinuxDriver, netdev
In-Reply-To: <20170920210638.11150-1-Nisar.Sayed@microchip.com>

Allow EEPROM write for less than MAX_EEPROM_SIZE

Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Nisar Sayed <Nisar.Sayed@microchip.com>
---
 drivers/net/usb/lan78xx.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index fcf85ae37435..f8c63eec8353 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -1290,11 +1290,10 @@ static int lan78xx_ethtool_set_eeprom(struct net_device *netdev,
 	if (ret)
 		return ret;
 
-	/* Allow entire eeprom update only */
-	if ((ee->magic == LAN78XX_EEPROM_MAGIC) &&
-	    (ee->offset == 0) &&
-	    (ee->len == 512) &&
-	    (data[0] == EEPROM_INDICATOR))
+	/* Invalid EEPROM_INDICATOR at offset zero will result in a failure
+	 * to load data from EEPROM
+	 */
+	if (ee->magic == LAN78XX_EEPROM_MAGIC)
 		ret = lan78xx_write_raw_eeprom(dev, ee->offset, ee->len, data);
 	else if ((ee->magic == LAN78XX_OTP_MAGIC) &&
 		 (ee->offset == 0) &&
-- 
2.14.1

^ permalink raw reply related

* [PATCH v5 net 1/3] lan78xx: Fix for eeprom read/write when device auto suspend
From: Nisar Sayed @ 2017-09-20 21:06 UTC (permalink / raw)
  To: davem; +Cc: UNGLinuxDriver, netdev
In-Reply-To: <20170920210638.11150-1-Nisar.Sayed@microchip.com>

Fix for eeprom read/write when device auto suspend

Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Nisar Sayed <Nisar.Sayed@microchip.com>
---
 drivers/net/usb/lan78xx.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index b99a7fb09f8e..fcf85ae37435 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -1265,30 +1265,46 @@ static int lan78xx_ethtool_get_eeprom(struct net_device *netdev,
 				      struct ethtool_eeprom *ee, u8 *data)
 {
 	struct lan78xx_net *dev = netdev_priv(netdev);
+	int ret;
+
+	ret = usb_autopm_get_interface(dev->intf);
+	if (ret)
+		return ret;
 
 	ee->magic = LAN78XX_EEPROM_MAGIC;
 
-	return lan78xx_read_raw_eeprom(dev, ee->offset, ee->len, data);
+	ret = lan78xx_read_raw_eeprom(dev, ee->offset, ee->len, data);
+
+	usb_autopm_put_interface(dev->intf);
+
+	return ret;
 }
 
 static int lan78xx_ethtool_set_eeprom(struct net_device *netdev,
 				      struct ethtool_eeprom *ee, u8 *data)
 {
 	struct lan78xx_net *dev = netdev_priv(netdev);
+	int ret;
+
+	ret = usb_autopm_get_interface(dev->intf);
+	if (ret)
+		return ret;
 
 	/* Allow entire eeprom update only */
 	if ((ee->magic == LAN78XX_EEPROM_MAGIC) &&
 	    (ee->offset == 0) &&
 	    (ee->len == 512) &&
 	    (data[0] == EEPROM_INDICATOR))
-		return lan78xx_write_raw_eeprom(dev, ee->offset, ee->len, data);
+		ret = lan78xx_write_raw_eeprom(dev, ee->offset, ee->len, data);
 	else if ((ee->magic == LAN78XX_OTP_MAGIC) &&
 		 (ee->offset == 0) &&
 		 (ee->len == 512) &&
 		 (data[0] == OTP_INDICATOR_1))
-		return lan78xx_write_raw_otp(dev, ee->offset, ee->len, data);
+		ret = lan78xx_write_raw_otp(dev, ee->offset, ee->len, data);
 
-	return -EINVAL;
+	usb_autopm_put_interface(dev->intf);
+
+	return ret;
 }
 
 static void lan78xx_get_strings(struct net_device *netdev, u32 stringset,
-- 
2.14.1

^ permalink raw reply related

* [PATCH v5 net 0/3] lan78xx: This series of patches are for lan78xx driver.
From: Nisar Sayed @ 2017-09-20 21:06 UTC (permalink / raw)
  To: davem; +Cc: UNGLinuxDriver, netdev

This series of patches are for lan78xx driver.

These patches fixes potential issues associated with lan78xx driver.

v5
- Updated changes as per comments

v4
- Updated changes to handle return values as per comments
- Updated EEPROM write handling as per comments

v3
- Updated chagnes as per comments

v2
- Added patch version information
- Added fixes tag
- Updated patch description
- Updated chagnes as per comments

v1
- Splitted patches as per comments
- Dropped "fixed_phy device support" and "Fix for system suspend" changes

Nisar Sayed (3):
  lan78xx: Fix for eeprom read/write when device auto suspend
  lan78xx: Allow EEPROM write for less than MAX_EEPROM_SIZE
  lan78xx: Use default values loaded from EEPROM/OTP after reset

 drivers/net/usb/lan78xx.c | 34 ++++++++++++++++++++++++----------
 1 file changed, 24 insertions(+), 10 deletions(-)

-- 
2.14.1

^ permalink raw reply

* Re: [PATCH,net-next,0/2] Improve code coverage of syzkaller
From: Willem de Bruijn @ 2017-09-20 15:38 UTC (permalink / raw)
  To: David Miller; +Cc: peterpenkov96, Network Development
In-Reply-To: <20170919.230853.540610280718334640.davem@davemloft.net>

On Wed, Sep 20, 2017 at 2:08 AM, David Miller <davem@davemloft.net> wrote:
> From: Petar Penkov <peterpenkov96@gmail.com>
> Date: Tue, 19 Sep 2017 21:26:14 -0700
>
>> Furthermore, in a way testing already requires specific kernel
>> configuration.  In this particular example, syzkaller prefers
>> synchronous operation and therefore needs 4KSTACKS disabled. Other
>> features that require rebuilding are KASAN and dbx. From this point
>> of view, I still think that having the TUN_NAPI flag has value.
>
> Then I think this path could be enabled/disabled with a runtime flag
> just as easily, no?

I think that the compile time option was chosen because of the ns_capable
check, so that with user namespaces unprivileged processes can control this
path. Perhaps we can require capable() only to set IFF_NAPI_FRAGS.

Then we can convert the napi_gro_receive path to be conditional on a new
IFF_NAPI flag instead of this compile time option.

^ permalink raw reply

* Re: [PATCH net-next 03/14] gtp: Call common functions to get tunnel routes and add dst_cache
From: Andreas Schultz @ 2017-09-20 15:37 UTC (permalink / raw)
  To: Harald Welte, David Miller; +Cc: tom, netdev, pablo, rohit
In-Reply-To: <20170919120942.dpy5kmkhzws7pqd5@nataraja>



On 19/09/17 14:09, Harald Welte wrote:
> Hi Dave,
> 
> On Mon, Sep 18, 2017 at 09:17:51PM -0700, David Miller wrote:
>> This and the new dst caching code ignores any source address selection
>> done by ip_route_output_key() or the new tunnel route lookup helpers.
>>
>> Either source address selection should be respected, or if saddr will
>> never be modified by a route lookup for some specific reason here,
>> that should be documented.
> 
> The IP source address is fixed by signaling on the GTP-C control plane
> and nothing that the kernel can unilaterally decide to change.  Such a
> change of address would have to be decided by and first be signaled on
> GTP-C to the peer by the userspace daemon, which would then update the
> PDP context in the kernel.

I think we had this discussion before. The sending IP and port are not 
part of the identity of the PDP context. So IMHO the sender is permitted
to change the source IP at random.

Regards
Andreas

> 
> So I guess you're asking us to document that rationale as form of a
> source code comment ?
> 

^ permalink raw reply

* Re: [PATCH net-next 09/14] gtp: Allow configuring GTP interface as standalone
From: Andreas Schultz @ 2017-09-20 15:27 UTC (permalink / raw)
  To: Tom Herbert, davem; +Cc: netdev, pablo, laforge, rohit
In-Reply-To: <20170919003904.5124-10-tom@quantonium.net>

On 19/09/17 02:38, Tom Herbert wrote:
> Add new configuration of GTP interfaces that allow specifying a port to
> listen on (as opposed to having to get sockets from a userspace control
> plane). This allows GTP interfaces to be configured and the data path
> tested without requiring a GTP-C daemon.

This would imply that you can have multiple independent GTP sockets on 
the same IP address.That is not permitted by the GTP specifications. 
3GPP TS 29.281, section 4.3 states clearly that there is "only" one GTP 
entity per IP address.A PDP context is defined by the destination IP and 
the TEID. The destination port is not part of the identity of a PDP context.

Even the source IP and source port are not part of the tunnel identity. 
This makes is possible to send traffic from a new SGSN/SGW during 
handover before the control protocol has announced the handover.

At this point the usual response is: THAT IS NOT SAFE. Yes, GTP has been 
designed for cooperative networks only and should not be used on 
hostile/unsecured networks.

On the sending side, using multiple ports is permitted as long as the 
default GTP port is always able to receive incoming messages.

Andreas

[...]

^ permalink raw reply

* Re: [PATCH net-next 00/14] gtp: Additional feature support
From: Andreas Schultz @ 2017-09-20 15:34 UTC (permalink / raw)
  To: Harald Welte, Tom Herbert
  Cc: Tom Herbert, David S. Miller, Linux Kernel Network Developers,
	Pablo Neira Ayuso, Rohit Seth
In-Reply-To: <20170919231947.bt4am3els3p26l6p@nataraja>

Hi Harald,

On 20/09/17 01:19, Harald Welte wrote:
> Hi Tom,
> 
> On Tue, Sep 19, 2017 at 08:59:28AM -0700, Tom Herbert wrote:
>> On Tue, Sep 19, 2017 at 5:43 AM, Harald Welte <laforge@gnumonks.org>
>> wrote:
>>> On Mon, Sep 18, 2017 at 05:38:50PM -0700, Tom Herbert wrote:
>>>>    - IPv6 support
>>>
>>> see my detailed comments in other mails.  It's unfortunately only
>>> support for the already "deprecated" IPv6-only PDP contexts, not the
>>> more modern v4v6 type.  In order to interoperate with old and new
>>> approach, all three cases (v4, v6 and v4v6) should be supported from
>>> one code base.
>>>
>> It sounds like something that can be subsequently added.
> 
> Not entirely, at least on the netlink (and any other configuration
> interface) you will have to reflect this from the very beginning.  You
> have to have an explicit PDP type and cannot rely on the address type to
> specify the type of PDP context.  Whatever interfaces are introduced
> now will have to remain compatible to any future change.
> 
> My strategy to avoid any such possible 'road blocks' from being
> introduced would be to simply add v4v6 and v6 support in one go.  The
> differences are marginal (having both an IPv6 prefix and a v4 address in
> parallel, rather than mutually exclusive only).
> 
>> Do you have a reference to the spec?
> 
> See http://osmocom.org/issues/2418#note-7 which lists Section 11.2.1.3.2
> of 3GPP TS 29.061 in combination with RFC3314, RFC7066, RFC6459 and
> 3GPP TS 23.060 9.2.1 as well as a summary of my understanding of it some
> months ago.
> 
>>>>    - Configurable networking interfaces so that GTP kernel can be
>>>>    used and tested without needing GSN network emulation (i.e. no
>>>>    user space daemon needed).
>>>
>>> We have some pretty decent userspace utilities for configuring the
>>> GTP interfaces and tunnels in the libgtpnl repository, but if it
>>> helps people to have another way of configuration, I won't be
>>> against it.
>>>
>> AFAIK those userspace utilities don't support IPv6.
> 
> Of course not [yet]. libgtpnl and the command line tools have been
> implemented specifically for the in-kernel GTP driver, and you have to
> make sure to add related support on both the kernel and the userspace
> side (libgtpnl). So there's little point in adding features on either
> side before the other side.  There would be no way to test...
> 
>> Being able to configure GTP like any other encapsulation will
>> facilitate development of IPv6 and other features.
> 
> That may very well be the case, but adding "IPv6 support" to kernel GTP
> in a way that is not in line with the existing userspace libraries and
> control-plane implementations means that you're developing those
> features in an artificial environment that doesn't resemble real 3GPP
> interoperable networks out there.
> 
> As indicated, I'm not against adding additional interfaces, but we have
> to make sure that we add IPv6 support (or any new feature support) to at
> least libgtpnl, and to make sure we test interoperability with existing
> 3GPP network equipment such as real IPv6 capable phones and SGSNs.
> 
>>> I'm not sure if this is a useful feature.  GTP is used only in
>>> operator-controlled networks and only on standard ports.  It's not
>>> possible to negotiate any non-standard ports on the signaling plane
>>> either.
>>>
>> Bear in mind that we're not required to do everything the GTP spec
>> says.
> 
> Yes, we are, at least as long as it affects interoperability with other
> implemetations out there.
> 
> GTP uses well-known port numbers on *both* sides of the tunnel, and you
> cannot deviate from that.

Actually, the well-known port is only mandatory for the receiving side.
The sending side can choose any port it wishes as long as it is prepared
to receive possible error indication on the well-known port.

Of course, it makes the implementation simple to use only one port, but 
for scalability it might be a good idea to support per PDP context 
sending ports.

Regards
Andreas

> There's no point in having all kinds of feetures in the GTP user plane
> which are not interoperable with other implementations, and which are
> completely outside of the information model / architecture of GTP.
> 
> In the real world, GTP-U is only used in combination with GTP-C.  And in
> GTP-C you can only negotiate the IP address of both sides of GTP-U, and
> not the port number information.  As a result, the port numbers are
> static on both sides.
> 
>> My impression is GTP designers probably didn't think in terms of
>> getting best performance. But we can ;-)
> 
> I think it's wasted efforts if it's about "random udp ports" as no
> standards-compliant implementation out there with which you will have to
> interoperate will be able to support it.
> 
> GTP is used between home and roaming operator.  If you want to introduce
> changes to how it works, you will have to have control over both sides
> of the implementation of both the GTP-C and the GTP-u plane, which is
> very unlikely and rather the exception in the hundreds of operators you
> interoperate with.  Also keep in mind that there often are various
> "middleboxes" that will suddenly have to reflect your changes.  That
> starts from packet filters at various locations in the operator networks
> and/or roaming hubs, down to GTP hubs and others.
> 
> My opinion is: Non-standard GTP ports are not going to happen.
> 
>> I also brought up open_ggsn. ggsn to sgsn.
> 
> That's good to hear.  For both v4 and v6 PDP contexts?  Whcih phones
> did you use for testing?  Particularly given how convolved the address
> allocation is (see below), I'm surprised it would work.
> 
>>> For IPv6 (and v4v6) PDP contexts there is quite a bit of extra headache
>>> related to the way how router solicitation/advertisements are modified
>>> in the 3GPP world.
>>>
>>> The address allocation in v4 is simple:
>>> * MS/UE requests dynamic or fixed IPv4 address via EUA IE of PDP context
>>>    activation
>>> * GGSN responds with IPv4 address in EUA of Activate PDP context
>>>    response (and then uses netlink to tell the kernel about that
>>>    IPv4 address)
>>>
>>> In v6 or the v6 portion of v4v6 it works differently:
>>> * MS/UE requests dynamic or fixed IPv4 address in EUA IE of PDP context
>>>    activation
>>> * GGSN responds with an IPv6 address, but that address is *not* used
>>>    for communication, but simply used as an "interface identifier" to
>>>    build a link-local address.
>>> * MS then uses router solicitation using that link-local address
>>> * GGSN responds with router advertisement, allocating a single /64
>>>    prefix, from which the MS then generates a fully-qualified IPv6
>>>    source address for communication.
>>>
>>> How did you envision this to be done with the v6 support you just added?
>>> At the very least, the /64 prefix matching would have to be implemented
>>> so that in fact all addresses within that /64 prefix are matched +
>>> encapsulated for a given PDP context in the downlink (to phone)
>>> direction.
>>>
>>> [...]
>> I would hope all the above you're describing is mostly control plane
>> matters.
> 
> It is not.  The control plane is GTP-C and runs on different UDP ports
> (at least for GTPv1/v2).  The user plane is GTP-U and is what's done in
> the kernel.  And by its very nature, IPv6 router
> solicitations/advertisements (as well as neighbor
> solicitations/advertisements) are part of the user plane and thus
> handled in GTP-U.
> 
>> At least a good design decouples data palne and control
>> plane. I know that GTP is a bit convoluted in this regard.
> 
> The problem is that IPv6 has never been specified properly for
> point-to-point links.  There's no decent PPP specs for IPv6.  So the
> 3GPP folks had to try to be as close as possible to the existing
> (broadcast) link layer model to facilitate existing IPv6 implemetations
> to work over 3GPP bearers.  That's why they kept whatever possible to
> re-use in terms of neighbor/router discovery.
> 
> So the problem is now: Unless you handle GTP-U *entirely* in the kernel
> (including router + neighbor advertisement/solicitation), you will have
> a "split GTP-U" plane between kernel and userspace.  And in that context
> the question is who owns the sequence numbers, how will you avoid race
> conditions, ... - my simple suggestion is thus to keep with the current
> split and do everything GTP-U related inside the kernel and everything
> GTP-C related in userspace.
> 
> I think there has to be a clear plan/architecture on how to implement
> those bits in terms of the kernel/userspace split, and at least a proof
> of concept implementation that we can show works with some real phones
> out there - otherwise there's no point in having IPv6 support that works
> well with some custom tools.
> 
> Regards,
> 	Harald
> 

^ permalink raw reply

* Re: [RFC PATCH] net: Introduce a socket option to enable picking tx queue based on rx queue.
From: Hannes Frederic Sowa @ 2017-09-20 15:30 UTC (permalink / raw)
  To: Sridhar Samudrala; +Cc: alexander.h.duyck, netdev
In-Reply-To: <1504222032-6337-1-git-send-email-sridhar.samudrala@intel.com>

Sridhar Samudrala <sridhar.samudrala@intel.com> writes:

> This patch introduces a new socket option SO_SYMMETRIC_QUEUES that can be used
> to enable symmetric tx and rx queues on a socket.
>
> This option is specifically useful for epoll based multi threaded workloads
> where each thread handles packets received on a single RX queue . In this model,
> we have noticed that it helps to send the packets on the same TX queue
> corresponding to the queue-pair associated with the RX queue specifically when
> busy poll is enabled with epoll().
>
> Two new fields are added to struct sock_common to cache the last rx ifindex and
> the rx queue in the receive path of an SKB. __netdev_pick_tx() returns the cached
> rx queue when this option is enabled and the TX is happening on the same device.

Would it help to make the rx and tx skb hashes symmetric
(skb_get_hash_symmetric) on request?

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox